I would like to introduce David Baker, who is today's lecturer. We are the biotech center here in Taiwan; it's a much smaller room, and the lecture will also be made available on the biotech center website. David Baker is a professor of biochemistry at the University of Washington, and his focus is on proteins. He was a Young Investigator of the National Science Foundation and a fellow of the Packard Foundation. He graduated from the University of California, and then did postdoctoral work at UCSF with David Agard, where he became interested in computation, working on methods for solving the phase problem in crystallography. He then moved to the University of Washington to start his lab. He is well known for the Rosetta series of algorithms, which include methods for predicting protein structure from sequence, for designing new structures, and for determining structures from experimental data such as NMR. I also think we are going to see today how the same algorithms and tools can be used to bring the public into science and help people understand how proteins matter in the world. I won't take any more of his time, so here is the speaker. Thank you all for that great introduction and for setting up the gear for this. So, broadly speaking, I work on two types of problems. The first is prediction problems, and here the problems are presented to us by biological systems: we have genomes, and genomes encode genes, and the general prediction problem is of course to figure out what the structures and functions and interactions of all these encoded molecules are.
The inverse problem is the design problem, and this comes about because, while we have a huge wealth of molecules in nature that do all kinds of wonderful things, there is an even larger class of possible molecules, molecules that don't exist in nature but clearly could exist, and we could make these molecules to benefit many different areas of the modern world. So the general problem here is to work backwards: rather than starting from the sequence and going toward the structure, we start from the functions we'd like to have but don't have yet and work backwards to determine what the sequences of these molecules should be. Today I'm going to talk about the first part, prediction problems, and then tomorrow I'll talk about design. These problems are closely related. Given a model of the energetics of the interactions within and between macromolecules, a prediction problem is normally: given, for example, the sequence of a protein, find its lowest energy state. So in prediction problems you know what the chemical composition is, you know what the sequence is, whatever you see in the biological system, and the problem is to find, as I said, the lowest energy state of the chain, or of two proteins together. In a design problem we instead know what structure we'd like to have, and we have to find the lowest energy sequence for it. So in both cases we can use the same model of the energetics; it's just that in one case we fix the sequence and find the lowest energy structure, and in the design case we start with the structure and work backwards toward the sequence. Now, in principle structural biology should be computable because, of course, protein structures are determined by their amino acid sequences. This has been known for many, many years now, and it's very likely that protein structures, complexes, and folded RNA molecules as well correspond in general to global energy minima.
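The forward/backward relationship just described can be made concrete with a small sketch: with a single energy function E(sequence, structure), prediction fixes the sequence and searches over structures, while design fixes the structure and searches over sequences. The toy energy model and the tiny discrete search spaces below are invented purely for illustration and have nothing to do with the real Rosetta energetics:

```python
from itertools import product

# Toy energy model (invented): "structure" is a tuple of backbone torsion
# bins, "sequence" a string over a hydrophobic/polar two-letter alphabet.
def energy(sequence, structure):
    # reward hydrophobic residues ('H') at buried positions (bin 0)
    return -sum(1 for aa, b in zip(sequence, structure) if aa == "H" and b == 0)

STRUCTURES = list(product([0, 1], repeat=4))   # all 4-residue torsion-bin states
SEQUENCES = ["".join(s) for s in product("HP", repeat=4)]

def predict(sequence):
    """Prediction: fixed sequence, search for the lowest energy structure."""
    return min(STRUCTURES, key=lambda st: energy(sequence, st))

def design(structure):
    """Design: fixed structure, search for the lowest energy sequence."""
    return min(SEQUENCES, key=lambda sq: energy(sq, structure))
```

Both directions call the same `energy` function, which is the point being made in the talk: one model of the energetics serves both problems.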
And where they aren't actually the deep global energy minimum, they're probably among the very lowest-lying possible states. And why do we want to compute structural biology? Well, if we could do it, it would obviously be very useful, and it's also a real test of our understanding of what goes on in molecular biology. So I'm going to begin by talking about the ab initio structure prediction problem, where you start with an extended chain and the problem is to predict the three-dimensional structure of the protein. Now the problem, of course, is that even a small protein chain has a very large number of possible configurations. So the outstanding problem is how to sample through this space of possible configurations, and at the beginning one has to sample as broadly as possible, which means the calculations have to be doable very quickly. So at this level we use a simplified representation of the chain, where the side chains are represented just by single points, and the primary driving forces are the burial of hydrophobic residues in the core and the pairing of the beta strands. Now, the problem is that at this level, while we can sample very quickly (the amount of time it takes to fold up a protein chain at this level of representation is about the length of the movie you just saw), we can't calculate energies very accurately, because we've simplified the model too much. So in practice we carry out many thousands or tens of thousands of independent folding runs like this, and each one ends up in a slightly different, often a very different, structure, and the challenge is to pick out which one of those is actually the lowest energy structure of the protein. We need to put in all the atomic detail to start answering the question of which is the lowest energy structure, and the type of detail is exactly what you'd expect.
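The sampling strategy described above, many independent trajectories with a fast low-resolution stage followed by local refinement, then keeping the lowest energy endpoint, can be caricatured on a one-dimensional toy landscape. Everything here (the landscape, the move sizes, the acceptance rule) is invented for illustration; the real search runs in a conformational space of thousands of dimensions:

```python
import math
import random

# A hypothetical 1-D "energy landscape": bumpy, with one deepest minimum.
# It stands in for the vastly higher-dimensional conformational space.
def energy(x):
    return 0.05 * (x - 3.0) ** 2 + math.sin(5.0 * x)

def coarse_search(rng, steps=200):
    """Stage 1: large moves, crude acceptance; fast but inaccurate,
    like the simplified-representation folding runs."""
    x = rng.uniform(-10.0, 10.0)
    for _ in range(steps):
        trial = x + rng.uniform(-2.0, 2.0)
        # always accept downhill moves, sometimes accept uphill ones
        if energy(trial) < energy(x) or rng.random() < 0.3:
            x = trial
    return x

def refine(x, rng, steps=400, step=0.01):
    """Stage 2: tiny moves, strictly downhill; finds the local minimum
    in the neighborhood of the stage-1 endpoint."""
    for _ in range(steps):
        trial = x + rng.choice([-step, step])
        if energy(trial) < energy(x):
            x = trial
    return x

def fold(n_trajectories=300, seed=1):
    """Many independent two-stage trajectories; different seeds end in
    different local minima, and we keep the lowest energy endpoint."""
    rng = random.Random(seed)
    endpoints = [refine(coarse_search(rng), rng) for _ in range(n_trajectories)]
    return min(endpoints, key=energy)
```

Plotting `energy(endpoint)` against each endpoint's distance from the true minimum reproduces the qualitative picture in the talk: a scatter of trajectories stuck in different local minima, with only some finding the deep basin.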
You look at protein structures and you see they're very closely packed; you don't have holes, so packing is very important, and we describe that with a van der Waals packing potential. Hydrogen bonding is very important: you look at macromolecular structures and all the polar groups are generally making hydrogen bonds, and these hydrogen bonds have an angular dependence, which we model. And then of course solvation: you've got polar atoms on the outside, non-polar atoms on the inside, and so forth. Now, when we put all that detail in, the calculation becomes much slower, and you can see that now the backbone is moving only very little. What's happening here is that we're starting from the endpoint of the previous calculation and simply searching for the lowest energy structure in the neighborhood, with everything packed and hydrogen bonded.

So the results, and this is the point I want to emphasize in this first part of my talk, look pretty much the same no matter what problem we're studying. On this axis is the energy; on this axis is a measure of structural similarity, in this case, since this is a protein where we know the structure, similarity to the native structure. Each of these black points is the endpoint of a separate pair of trajectories: we start with the first calculation, where things are rapidly moving around, we take the endpoint of that into the second calculation, where everything is just moving a little bit, and we take the lowest energy structure sampled in that second stage. You can see that different trajectories end up with very different energies and very different structures. For example, one ended up here, quite high in energy; another ended up down here, relatively low in energy; but there's another that's almost as low in energy which is far out here. This simply says that the landscape being searched is filled with hills and valleys; it's essentially random at this point, and the only difference between trajectories is the random number seed that starts the first one. The second thing you see is that if you start at the native structure, you're generally quite a bit lower in energy than all of these structures you sample starting from the extended chain, so the native structure sits in this quite deep minimum. And the other thing you see is that if you can get close enough to the native structure, the energy starts dropping quite a bit, but you have to get quite close, and the reason you have to get quite close is illustrated here. This is a comparison of the native structure, the protein shown on the previous slide, with that low energy structure that had the arrow next to it: one color is the native structure and the other is the low energy model, and you see that in the core of the protein the side chains are pretty close to where they are in the native structure. It's like a jigsaw puzzle: you've got it roughly right, all the pieces are fitting together roughly the right way, and this is what makes the energy drop, because it's very hard for a random protein configuration to have all the side chains pack so perfectly.

Now, we've extended this over the years to many other types of problems. Here we have a membrane protein; here we have a homodimer, and this works as well for any type of homo-oligomer. What you saw playing there, and I guess I should play it again, is exactly like the first movie I showed except that two chains are folding up at the same time, and it's done symmetrically: every time we make a move in this chain, we make the same move in the other chain too, and we're also searching over the rigid body orientation at the same time. We then take the lowest energy structures into the high resolution stage to put in all the details, and when the system isn't too big, the lowest energy structures we get can be quite close to the native structure. This also holds for RNA; this is the low resolution part of the search for RNA, where again we simplify things a lot, and these are the same type of plots for RNA molecules, again energy versus RMSD. Again you see that the native structure is in general lower in energy than the structures we sample starting from the extended chain, and the lowest energy structures can often be pretty close to the native structure; here are the comparisons. Now, you'll notice these are really small RNA molecules, and that's a general characteristic of the systems where we're actually able to find the lowest energy structure without any experimental information. If I showed you the calculations for larger RNA molecules, we'd see all these points way out here, and we simply wouldn't have any of these low energy structures close to the native structure.

OK, so summarizing on this slide the general features you see in all these plots: we see an energy gap between the native structure and the structures we generate starting from the extended chain, which are non-native; for monomers, dimers, RNAs, membrane proteins, the picture always looks the same. And why is this? It's not because we're able to compute energies to some super high accuracy. If you had a random heteropolymer of 200 residues, you'd expect some random ensemble of states; the really striking thing about molecular biology is that proteins and RNA molecules, these biological entities, fold up into very precise structures, and that's only possible if the free energies of those structures are very much lower than those of the alternative possible states. So we're able to recognize the native structure because the magnitude of these gaps, which have to exist for proteins and RNAs to be stable, is larger than the errors in our energy calculations. What this means, then, is that the real challenge is how to sample close enough to the native structure to get into this narrow basin where all the precise packing occurs and the energy really drops. So we really have a search problem, and this is much of what we've been working on over the years. One approach is to be smarter about the search. Imagine trying to find the lowest elevation point on some new planet you've discovered: you can send out many, many explorers, have each parachute down to some random point on the planet's surface and report back the lowest elevation point they found, and then send out new explorers to search the areas where the lowest elevation points were found in the first round. These are the kinds of things we've been playing around with. The second approach is to simply recruit more explorers, and so a number of years ago we started a distributed computing project, Rosetta@home: when we're trying to predict the structure of a protein, we send the sequence out not to our own little cluster but to everyone in the world who's participating; they each run the two stages you saw in the movies and send back to us the lowest energy structure, so we have a lot more sampling, something like a hundred times more explorers in the landscape exploration analogy. We can also start closer: we don't have to start with an extended chain if we have more information. So often the sequence you're interested in is related to the sequence of a protein of known
structure, and it's the general observation that similar sequences almost always fold into similar structures, so we can start from the structure of the homolog; that's the comparative modeling version of the problem. You could say that's cheating, because we're not starting from the fully extended chain, but really the best way to cheat is to use experimental data. In the explorer analogy, that's like being told that the lowest elevation point on Earth isn't in North America, look in the Middle East; that's a big clue, and in the same way any experimental data is a huge asset in finding the correct structure, so I'll spend some time talking about this. And finally, we've been doing this for some time, and as I said, people would watch the calculation on their screens, and they started writing in saying that the computer was doing really dumb things; so, to try to let people interact with the course of this search, we developed a game called Foldit, and something quite neat ultimately came from that, which is what I'll talk about at the end of my talk.

So first, just to illustrate what happens with Rosetta@home: here's a slightly bigger protein. If we just run it in-house, we get these red points here; this is again energy versus RMSD. If we send it out to all the participants of Rosetta@home, we get these sometimes rather unsettling plots, again energy versus RMSD, and you see there's one lucky person, one winner of the lottery: this point is lower in energy than everybody else's, and the lowest energy point is in this case also pretty close to the native structure. So we acknowledge these people on our website and we acknowledge them in our papers, since these are pretty important contributions, but you can see this is really not where you'd like to be, because if we didn't have this one point, this other one might be the lowest energy structure, and that is not good enough for confident structure prediction. This matters when people come to me and say, "I've been working on this protein, I don't know what the structure is, can you predict it for me?" We can try, but you see the problem: you don't know which minimum to trust, and this is why ab initio structure prediction is not yet of great practical utility.

All right, so what about comparative modeling? This is from the CASP structure prediction experiment, where structure prediction methods are tested blind. In this case we started with a sequence that was homologous to a known structure; we built a model, shown in red here, and then we started many, many trajectories focusing mostly on the high resolution stage, like that second movie where things weren't moving as much, and the lowest energy structure we got out is the green one shown here. In blue is the actual native structure, and you can see that the green structure has moved quite a bit closer to blue than red was, and if you look in the core, again the side chains are coming together in a very similar way in the prediction, which is in green, and the native structure, which is in blue. This also works for membrane proteins: this is from a challenge to predict the structure of the adenosine receptor. Here we started with the beta-2 adrenergic receptor, which is shown in gray, and again doing this refinement, now in a force field that reproduces some of the characteristics of the lipid environment of the membrane, we end up with the structure shown in pink, which is close to the X-ray structure that was subsequently released, shown in purple. Now, I should emphasize that what I'm showing you is the part in the membrane; this protein also has loops, and in the loop regions our model is basically wrong. The problem with loops is that this deep well we've been talking about, the one the native structure sits in, really applies to the core, but loops are
in much shallower wells, and sometimes they're quite flexible and don't really have a well-defined structure; those are much more challenging, because the modeling has to deal with much finer energy differences.

OK, but the practical utility of this methodology, I think, has turned out to be in combination with experimental data, and as I said, when you have experimental data it really limits the search enormously. I'm now going to illustrate different ways in which this basic procedure I've described, the same two-stage protocol of starting with the extended chain, folding up at the low resolution level, then refining at high resolution, can incorporate experimental data of different types. In some cases (this is another prediction from a CASP blind structure prediction) the predictions are actually accurate enough that you can solve the X-ray crystallographic phase problem with them, but this is really the exception rather than the rule, so it's not yet a practically useful method, unfortunately. This just shows such a case: this is actually the native structure in the electron density that was computed using phases from a model, and you can see the map is essentially perfect and can be retraced easily by auto-building. The problem, of course, is that molecular replacement has a pretty narrow radius of convergence, so a model has to be really good before one can use it this way. Now, Frank DiMaio, who was a graduate student here, has implemented something that is actually quite a bit more useful: he has put a term reflecting the agreement of a model with the electron density into both stages of the search process. The first application he looked at was fitting a model into density, in collaboration with an experimental group. Using data at pretty high resolution, starting with a C-alpha trace in the density and running through this procedure, you can see the model here has gotten quite a bit better, starting in red and moving to green, with blue being the known crystal structure.

Sorry, this is a busy slide, but this is more of Frank's work, so I wanted to highlight it here. What Frank is looking at now is trying to increase the radius of convergence of molecular replacement. The neat thing is that if you have a starting model, or a series of possible models, you can use a program like Phaser to try to solve the phase problem: Phaser asks, is there a placement of my model in the unit cell that recapitulates the diffraction amplitudes? In difficult cases there is not; the best solution isn't much better, or may even be worse, than the correct solution buried in the noise. So what Frank has been doing is taking many possible solutions generated this way; this axis is the Phaser score, basically how well Phaser thinks the model explains the data, and this axis is how good the models actually are, and you can see there really isn't much correlation at this stage. Then Frank takes these structures and refines them in Rosetta, and in this case the correct solution now achieves a much better score. This is still early days, but there is an exciting possibility of really speeding up X-ray structure determination using molecular replacement, that is, of achieving good solutions to the phase problem with molecular replacement models in cases where you currently can't.

But probably the most immediately useful application of this methodology has been NMR structure determination. In that first movie, when the protein is folding up in that very rapid way, something that is very important is the set of hypotheses about what the local structure is in different regions, and if you just have the sequence alone you can't be sure, so you have to model each chunk of the sequence with a variety of possible local building blocks. But if you have NMR data bearing on the local structure, you can use it to narrow the set of possibilities that needs to be sampled during that initial structure build-up. With Ad Bax a few years ago, we showed that chemical shift information, which one gets at the very beginning of an NMR structure determination, was enough to focus the search enormously, so that one could reproducibly get good structures; all the chemical shift data were doing was constraining the space of local possibilities in that initial search. The chemical shifts go in, that first rapid large-scale search homes in a little more easily on the native structure, and then the subsequent high resolution search is really just the same as what I showed, except that you now have so much more sampling density in the region around the native structure that you find it very, very much more frequently. So whereas before that initial stage was the bottleneck, with chemical shifts this is really very useful for proteins of over 100 amino acids, one can become confident about the structure calculation, and this is now being used quite frequently. I got a paper to review a year ago, and this is when I knew that things were really starting to work: someone had downloaded the program, they had chemical shifts, they calculated the structure, and then, since they had crystals, they solved the phase problem with the model and determined the structure; I hadn't heard about it at all, so then I knew it was definitely useful, and I wish that happened more often. So what we've done since then is add more experimental data. If we just have the chemical shifts we can get up to about 120 amino acids, but beyond that, the
chemical shift information isn't enough to help us find the native structure, and the next piece of information that turns out to be very, very powerful is residual dipolar couplings, which provide information on the orientations of the secondary structure elements relative to each other. Now, this sort of information is usually used at the very end of a conventional NMR structure determination, or not used at all, but it turns out to have enormous power in guiding the search. We put the residual dipolar coupling information in during that initial low resolution search: every time we make a move (and you remember it was very jerky) we ask, do we fit the residual dipolar coupling data better or worse than before, and we accept the move if the data are better fit. That really helps converge on the native structure, so we're able to go to larger proteins, up to about 150 amino acids. And what's neat about both chemical shift data and residual dipolar coupling data is that they only involve backbone atoms. A lot of the difficulty in conventional NMR structure determination is the side chains: assigning all the NOEs between the side chains is difficult, the spectra are very crowded, and mistakes can happen, whereas the sorts of data we need here are much less error-prone. This has been a collaboration, first with Ad Bax and more recently with many NMR groups, who have been very generous in providing data for us to test these methods. This just shows that if you have chemical shift data, one can also combine it with the symmetric folding I showed earlier to determine oligomeric structures, which is a nice way of getting structures that are hard to build by standard methods. So when we go to still larger proteins we need more information, and now we're throwing in distances, backbone NOEs, just enough to roughly determine what the beta strand topology is, so this gives distance information between the beta strands. And it turns out that in this case doing many, many independent simulations is not efficient; instead, an iterative approach, where one has a population and successively enriches the best structures in the population while maintaining diversity, turns out to be more effective. This just shows different iterations, taking the lowest energy structures; this is the ensemble shown here, and this is the eventually determined structure, solved independently by an NMR group using much more information, including selective amino acid labeling to get additional NOEs. So again, Rosetta has a lot of the information in it already, and we're supplementing it with a little bit of information, or at least in this case a fair amount of information, to get into the right ballpark. This is another case showing that even though we're putting in information only on the backbone, the side chains are getting determined properly too; the positions of the side chains are determined by Rosetta, just as in calculations where you have no experimental information at all.

OK, so I want to emphasize a point which gets at how this approach differs from conventional structure determination. As I've said, the energy landscape of our model is very bumpy; it would be this black line here, and so the result we end up with depends on where we start. If we start over here, we'll end up in this well, far from the native minimum. Now, what happens when we add experimental data? Let's say the experimental data are sparse, so they don't by themselves rigorously determine the structure: the native structure fits the experimental data well, but so does this wrong structure over here, and of course in reality this landscape would be very much larger, with many more dimensions. So when we put in the experimental data
as a bias during the calculation, then starting from here we no longer end up in this minimum; we might get pushed up over this barrier, because the data term provides a driving force in that direction. And if we start here, then downhill leads to this deeper minimum that we previously would have been stopped short of, and now we might actually find it. So if we look over at this plot, we'd expect to see different effects of turning on the experimental data far from the native structure compared to close to the native structure. Far from the native structure, out here, which is like the first scenario, the energy can well get worse, because fitting the experimental data is at odds with the energy function there. However, if we're close to the native structure, like here, then adding the experimental data can actually give us lower energies. And this is exactly what we see. This calculation is from the larger protein I just showed you: the gray dots are what we get from many independent calculations in the absence of experimental data, and the red dots are what we get when we turn on the experimental data. The energy plotted here doesn't include anything about the experimental data; it's just that when we turn the data on, we can now access this whole region that we weren't accessing before, and so we find these very much lower energy structures. Now, this is not at all intuitive: in general you'd think that doing a constrained optimization of a function, you'd do worse than doing a free optimization. And in fact, when we don't get close enough (this is a still larger protein, a case where we fail), turning on the data doesn't give access to lower energy points; the energies actually move up a little bit. So this result, getting lower energies in the presence of the experimental data, is a real hallmark that one is on the right track, and it's one way of knowing at the end that one has obtained the right structure. One more point here: this also emphasizes the difference between this approach of using data to guide folding and conventional structure determination, where you rely solely on the data to give you your final model. With data like these, if you just relied on the data you might end up over here; it's really the combination of the two, with the data guiding you into the right region and the physical chemistry taking over at that point, that makes this work well. This shows that one can put in other types of data too: these are disulfides, from a collaboration with Tim Springer, and the disulfide crosslinks provide some constraints on the topology. This is the transmembrane region of the integrin receptor, and this is a Rosetta calculation compared to an NMR model, not from us. OK, so in summary, structures can be determined with limited experimental data. I've taken you through the different types of data one can use, and I think there's an exciting opportunity to use sparse NMR data as well as other types of data too. We're really eager to work with people using this methodology, which of course we give out freely, to try to solve structures, so please either get our program or, if you don't want to run it yourself, we're happy to help.

All right, so something that's come up more recently that I'm quite excited about is whether we can start thinking now about excited states of proteins. This has become interesting recently because methods, NMR methods and others, have come out with which you can start looking at not only the ground states of proteins but also activated states, which may play
different variables and protein functions so we came at this or we came on this sort of an indirect way we were testing to test the the force field that we used we just took a large number of proteins I think there's 117 shown on this slide and for each of those we carried out a large amount of sampling using a zeta at home but we spiked in some information about the location of the native structure so we we pretty sure the sample close enough I told you when most protein we simply don't sample very close to the native structure without experimental data and so you can see in all of these plots the region around the native structure is lower in energy than than regions are higher these are the same kind of thoughts energy versus pharmacy and you blow up just take one of these blow it up you see again the lowest energy structure of the native structure you superimpose them on the native they look very close however if you look closely you start to see cases where like this one here the lowest energy structure is close to the native structure but it's not exactly there there's another case so all these cases start appearing appearing where here's for example low structure but here's here's something that's higher arm is to be lower in energy so of course our initial reaction was they thought this must be barriers and our ability to consume energies and that's still certainly likely to be the case for a number of these but when we look closer we start to think that there might be a little bit more going on so this is an example here's energy versus RMSE here here we have some structure down the floor and some RMSE which are lower in energy than the ones that are much closer in so what's going on well this protein this is a crystal structure shown in red and it has this loop that sticks out all of these structures here tuck the loop in in this way and the native structure is in blue of a new tendency solid crystal form and you can see that probably what's happening 
What's happening here is that in our simulations the protein wants to tuck this loop in, because there are no crystal packing interactions to stabilize the loop in the out position. To test this, we took this whole set of proteins and did the same folding simulations in the context of the crystal, and when you fold in the crystal, the lowest energy structure that you find is very close to the native structure, as shown here. So we think that at least some of what we're seeing, these variations in the loops, comes from crystal packing interactions that are distorting the structure, and this just shows that in general we see much closer recapitulation of the crystal structure when we do the folding simulations in the context of the crystal than when we do them outside of it.

This is just another example: the crystal structure is shown here, we got this structure here in green, and it turned out there was an independent structure, shown in blue, and that's probably the same kind of case. And this one gets at the excited states idea: here is an RNA-binding protein that binds to RNA, but we didn't have the RNA in our calculation; we just folded the monomer, and we found these two structures, one shown in red and one shown in blue, and it turns out one of them resembles the apo structure and one of them resembles the bound structure. So this is why we started thinking that these alternative structures we see, these alternative minima, correspond to things that really exist: either crystal packing interactions are at play, or maybe they are excited states, slightly higher-energy states that play a role in the protein's function. We're very excited now about trying to map these landscapes in more detail, to actually compute the free energy landscapes right around the native structure and to start comparing with experimental data.
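The "folding in the context of the crystal" idea amounts to adding an interaction term between the molecule and its lattice neighbors. Here is a toy sketch of that context term, assuming an orthorhombic cell and a user-supplied pair potential; it is not the actual symmetry machinery used in Rosetta:

```python
import itertools
import math

def lattice_images(coords, cell, reps=1):
    """Translated copies of `coords` for every nonzero lattice offset
    within +/-`reps` cells along each axis (orthorhombic cell)."""
    a, b, c = cell
    images = []
    for i, j, k in itertools.product(range(-reps, reps + 1), repeat=3):
        if (i, j, k) == (0, 0, 0):
            continue  # skip the central molecule itself
        images.append([(x + i * a, y + j * b, z + k * c)
                       for x, y, z in coords])
    return images

def packing_energy(coords, cell, pair_energy, reps=1):
    """Interaction energy between the central molecule and its crystal
    neighbors: the extra term present when folding 'in the crystal'."""
    total = 0.0
    for image in lattice_images(coords, cell, reps):
        for p in coords:
            for q in image:
                total += pair_energy(math.dist(p, q))
    return total
```

When this term is included, a loop held "out" by contacts with neighboring molecules can become lower in energy than the tucked-in conformation, which is the effect described above.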
Alright, so at the end I just wanted to show you a little of what we've been doing with Foldit, going from distributed computing to distributed thinking, where we're now trying to get the whole world to fold proteins for us. I was initially pretty sure this was going to be most useful for education; I'd taught at my kids' high school a couple of times, and the kids would get into it and end up playing around with macromolecules. But more recently I've started believing that these people can actually do amazing things, and I'll show you some of the problems they've been able to solve.

So when you play Foldit, you see a version of the protein like this, and you can move it around like this, and the things that you do are scored; one of the most important decisions we made was to flip the energy into a score, so that lower energy means a higher score. These are all the other people who are playing right now, and this is our score here, so we'll see if we can catch up. There are simple things you can do, like I said, just pulling on things, but you're not going to get very far just pulling things around; you want to take advantage of the fact that people can do higher-level things. So I can say, well, I've tweaked things around a bit, now I want to optimize the positions of the side chains, and that's all taken care of, and you see my score is getting a lot better. I can also say that now I want everything to move continuously to minimize the energy, and first just the side chains; let's see, I think I'm already better than I was. Now, sophisticated users can do all kinds of things, but I'm not a sophisticated user, I just fool around. One more thing I'll show you: you can tell the program not to touch certain regions, because you've already got those the way you want them.
So that's basically the game. You might think, okay, you get to play around with these cool tools and that's kind of fun, but can anyone do anything interesting with this? That's what I'm going to show you now. I took you through some of the various tools: you can freeze regions, you can put in rubber-band constraints to pull things together, and you can create recipes, algorithms that put together different combinations of these operations.

Basically, Foldit is a layer sitting on top of Rosetta. When Rosetta does high-resolution search, it makes a random move on the backbone, repacks the side chains, minimizes everything, and then decides whether or not to accept. Here, instead of a random move, a person makes a perturbation like I showed; the repacking is called shaking the side chains; the minimizing is called wiggling; and whether or not to accept is something the person gets to decide. This is where people can be much smarter than computers: they can be more patient, and in some of the best results you see people pursuing avenues where the score gets worse and worse and worse, they keep going, and then it pays off at the end. That look-ahead ability is something computers really aren't very good at. People also form teams, and you can get an idea of how they're doing; now let me show you a result.

This shows a classic example of something people are really good at: we gave them cases where we had models that were close but wrong, and people are really good at this problem. They can look at a model and say, there's a hole here, but you can't fix it just by repacking the side chains; you have to pull this part down and rotate it in. In this case here as well, to bury the side chain you have to pull the whole loop in, which means moving the backbone and then repacking the side chains.
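The move/shake/wiggle/accept cycle described here is Monte Carlo with minimization. Below is a generic sketch of that loop; the function names, parameters, and the toy quadratic "energy" are my own illustrative choices, not the Rosetta or Foldit API:

```python
import math
import random

def metropolis_refine(x0, energy, perturb, local_min, steps=1000, kT=1.0, seed=0):
    """Monte Carlo-plus-minimization: move, relax, then Metropolis accept.

    `perturb` makes a random move (Rosetta's backbone move, or a player's
    tweak), `local_min` relaxes it (shake/wiggle), and the Metropolis
    criterion decides acceptance. All callables are user-supplied.
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        trial = local_min(perturb(x, rng))      # move, then relax
        e_trial = energy(trial)
        # Accept downhill moves always; uphill with Boltzmann probability.
        if e_trial <= e or rng.random() < math.exp((e - e_trial) / kT):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

# Toy usage: minimize a 1-D quadratic "energy" starting far from the optimum.
best, best_e = metropolis_refine(
    10.0,
    energy=lambda x: (x - 2.0) ** 2,
    perturb=lambda x, rng: x + rng.gauss(0.0, 0.5),
    local_min=lambda x: x,   # identity stands in for a real minimizer
    steps=2000,
)
```

The human advantage described in the talk lives in the accept step: a player can deliberately accept a run of worsening scores in pursuit of a rearrangement, which a fixed-temperature Metropolis rule only does by chance.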
That's again the kind of move a computer finds very hard: it can't see that it has to get over there and then do it. What you're seeing here: red is the starting puzzle, blue is the actual structure, and green is the best Foldit solution, the highest scoring one, and green on blue is pretty impressive here. We recently started giving them puzzles with a sequence alignment aspect, so they can change the alignment of the sequence to a homolog of known structure, and this is a result from a few days ago which I thought was really stunning: they started off with the red model, the best score was the green one, they changed the alignment and wiggled some other things along the way, and the blue is the actual structure. So people are able to solve these things.

And this is an interactive game. Each of these lines here shows a different team; this is time, and this is energy, or rather an inverted sort of energy, so lower is better, and people can see how well each other are doing. If you look, there's this blue team which comes from behind here and crosses the purple team, and that inspired all these other people to think there was more to be done, and you can see there's this huge amount of activity where everyone's racing to get ahead, and at the end they've all improved a lot. Within a team they're trading information as they find different things out, and of course at the end they all start sharing, and they specialize: some people work on the opening moves, some people on the endgame. I showed at the beginning an email; I've started emailing players to find out how they do it, and people have developed algorithms, you can just read this one here. Our initial hope was that we could encode what people are doing as algorithms and automatically capture their sophistication.
But people really do use a lot of human intelligence: a lot of "we try this, and then we backtrack if it doesn't work," and then people write little recipes, "if the search stalls, then we use this quick recipe," whatever it is; I don't even know what some of these things are. So that's been really remarkable: these people have no formal scientific training, none of them are chemists or biologists or scientists, and yet when I go through their strategies, a lot of them make total sense. I hadn't thought of them, but I can really see why they'd be good things to do.

Okay, so that's it. For the structure determination work I talked about, the credit goes to Rhiju Das, and on the NMR side to Oliver and to Ad Bax's group; the work I showed on the alternative minima and on mapping out the landscapes was done in the lab as well; and the Foldit work is a collaboration with our computer science colleagues. Tomorrow I'll talk about how these methods make design possible.

So, you said that the amino acid polymer takes time to be synthesized, and I've heard of rare tRNAs that come in to slow things down and give the upstream part more time to fold; do you consider things like that? Yeah, well, the relevant point is at the C-terminus, right; you really don't want a protein domain to fold before it can be completely made, because otherwise you could end up with something that's not the correct structure at all. I think the way the ribosome is set up really prevents folding from occurring until a protein domain is nearly completely synthesized, so I think it's a reasonable model to just take the protein sequence and fold it as a whole.
So, a lot of the potentials, the basis of a lot of our potentials, are pairwise additive, and maybe they're lacking some of the more specific, many-body interactions. Yeah, so I think what you're asking is: we see these funnels which are very narrow, and maybe the true ones are broader, and maybe if we had a better model they would be broader, bringing down the low-RMSD structures. Well, I think that's certainly possible, and it's almost certainly true to some extent. But on the other hand, the jigsaw-puzzle-like packing is something where, you know, if you take the core of a structure, or take a jigsaw puzzle and only slightly change the outline, at some point things just don't fit the same way. So I think it's certainly true that nature's true potential function is probably more forgiving, but there's also something about packing problems, jigsaw-puzzle problems, where they have this property that nothing fits until you're very close. So I'd say it's probably both.

And then the second thing I was going to ask about is sampling issues; over and over we see that the limitation is sampling, as in the last example. What are your impressions of the best way to make progress, maybe approaching it through domains? I mean, domains are how proteins do it, right? That's why we're all here: evolution was able to take domains at a modular level and mix and match them to create all these marvelous proteins.
Well, the simplest thing to do, which of course everyone does, is to parse the sequence into putative domains and do the prediction on the domains separately; and of course a little extra experimental information helps there. As for how much data you need: well, it really depends on the complexity of the system. I think the important thing is the number of degrees of freedom in your problem, the number of residues, for example, divided by the number of constraints. So it's all a function of that: with a small enough system you don't need anything, and with a larger one you need more data to define it. I don't think there's an absolute answer to that question. Clearly, having backbone NMR data is a huge boost. Although, you know, I should qualify that, because I may have painted too rosy a picture: X-ray data is still very, very important. For example, in places where there are buried polar interactions, that detailed structure is extremely hard to compute, so X-ray is very important if you want high resolution in places like that, and that's where a lot of the important stuff in proteins happens, obviously, if you want to look at how ligands bind and so forth. My comments about the success of prediction really all concerned hydrophobic cores, where you have the packing; but a lot of the working parts of proteins are outside the hydrophobic core, where you have loops coming together in particular ways, and I think getting the loops right will always require more data.

So, experimentally we see that the conformation can depend on things like the charge states of the histidines; can you handle that? And can you handle cis-trans proline? Yeah, so cis-trans proline is handled.
We just have a distribution, you know, for each proline in the protein we consider both the cis and trans possibilities. As far as pKa's and pH dependence: for the design work I'll be talking about tomorrow we do model that, but at the level of resolution we're describing here we don't model it for the whole system; everything is kept at the standard charge states for pH 7, with fixed protonation. That's obviously an approximation, and it would become important and would limit the accuracy of these types of calculations in some cases, but treating it properly requires a higher level of resolution than we have.

So, what's the limit to the kind of chemical functionality you can actually put into one of these sequences? I mean, are you limited to the natural amino acids, or can you include modified residues or unnatural ones? Right. The way we've set this up is very modular, so we can easily go between RNA, protein, and DNA residues, and we're doing a lot of design work with unnatural amino acids now; really, all you have to do is describe to the program what the unnatural amino acid is. And as far as post-translational modifications, again, it would be straightforward. Now, how accurately we'd capture the subtle energetic effects of those is another question, but from a straight modeling point of view it's not a problem.

So, a question about your plots of RMSD versus the energy.
What conclusion can you draw from those about RMSD as a measure of success? People tend to use the RMSD of their submitted models as a measure of how good the structures might be, but I'm going to suggest that RMSD is very crude, and that you can't have any confidence that you've found a low energy structure just because it has a low RMSD. Yeah, well, that's a good point. Why is it that in those plots there were often low-RMSD structures with very high energy? It's because, you know, you can be very near the native structure, but if two parts of the chain lie slightly on top of each other the energy will skyrocket, because the energy is very sensitive to atomic overlap. Or maybe something is stuck between two positions, so it's not in quite the right place. So you can have very high energies close to the native structure; in fact, a lot of the older NMR structures have very, very high energies because of exactly those kinds of effects.

Since a lot of these structures aren't minimized, would it be interesting to look at the same distribution with the structures minimized first, instead of penalizing them for small clashes? Yeah, that's right: a structure can be very close and still score terribly for reasons that don't really matter. You could also use the number of native contacts. There are alternative measures, but qualitatively you get the same picture; the details will be different, but things that are close stay close, and low-energy things that are far apart stay far apart.
And what you do see in that type of plot is sometimes alternative minima that aren't apparent from the RMSD alone, but the overall picture is really the same. Alright, thanks.
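As an aside on the native-contact measure that came up in this last exchange, here is a generic sketch of the fraction-of-native-contacts metric. The 8 Angstrom cutoff and the sequence-separation filter are common illustrative choices, not necessarily the ones used by the speaker:

```python
import numpy as np

def native_contacts(native, cutoff=8.0):
    """Residue pairs whose reference atoms (e.g. Calphas) lie within
    `cutoff` Angstroms in the native structure, excluding pairs that
    are close in sequence (|i - j| < 3)."""
    native = np.asarray(native, float)
    contacts = []
    for i in range(len(native)):
        for j in range(i + 3, len(native)):
            if np.linalg.norm(native[i] - native[j]) < cutoff:
                contacts.append((i, j))
    return contacts

def fraction_native(model, contacts, cutoff=8.0):
    """Fraction of native contacts also present in the model.

    Unlike RMSD, this needs no superposition and is insensitive to
    small clashes that make the energy skyrocket."""
    model = np.asarray(model, float)
    if not contacts:
        return 0.0
    kept = sum(np.linalg.norm(model[i] - model[j]) < cutoff
               for i, j in contacts)
    return kept / len(contacts)
```

A model that preserves the native topology scores near 1 even if individual atoms clash, which illustrates the answer above: qualitatively the same picture as RMSD, but with different sensitivities.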