Yeah, it's all right. I doubt anyone will ever want to listen to it again, but that's all right. We're good. Okay. It's fine. There's already water here. No, it's all right. I don't need it. Oh, it's all right. Is it far away here? Okay. Hey. Wow, you guys are better than my Gen Chem students at quieting down when I start talking; you're going to attest to that.
So welcome, everyone, to what I believe is now our eighth more-or-less annual Borden Lecture. We missed one year, for obvious reasons, a few years back. And it's really a great, great pleasure to first of all acknowledge Wes and Sheila Borden, who made this lecture possible. Wes was a member of our faculty for many years before moving on to the University of North Texas. For those of you who are fortunate to sit in CHP, he was the one who really spearheaded that building, and I'm sure that some of my colleagues who have been on campus longer than I have can speak to other things that he's done on campus, but Wes was just a great citizen and member of the department. He retired, moved to Vashon Island, and we cannot entice him to leave.
So this year's Borden Lecturer is Greg Voth, who's at the University of Chicago. Greg started his academic career getting his bachelor's at the University of Kansas, and I'll leave off years. He then got his PhD at Caltech and did a postdoc at Berkeley, and now I'm trying to remember who you worked with. Chandler and Miller. I was going to say Miller, and obviously Chandler and Miller. He joined the faculty at the University of Pennsylvania in 1989, when our own Bill Reinhardt was also there, or shortly before you got there, but roughly the same timeframe. He then moved to the University of Utah in 1994, where he was a distinguished professor from 1997 to 2010. And then most recently, about 14 years ago, he moved to the University of Chicago.
Greg is someone who has done a lot of really great work, a lot of it around liquid water, as you can see, and also on various types of multiscale simulations. I have great memories of when you gave a seminar back at Ohio State on some of the multiscale modeling of various biological systems. You probably don't remember it, but I was young and impressionable, and I was really, really impressed by the work. Greg has received a number of awards: he is in the inaugural class of ACS Fellows and is a fellow of the American Physical Society. He received the Physical Division Award in Theoretical Chemistry, the Hildebrand Award from the American Chemical Society in 2019, and most recently the Biophysical Society Innovation Award. And so it really is my pleasure to welcome you, and I've got a new wood block (thank you, Diana, for getting these happening) to thank you for coming and giving the Borden Lecture this year. You guys can read as well as I can. We look forward to it.
All right, thank you. Okay. Well, it's a pleasure to be here. Can you hear me okay? I think so, all right. You have about 10 different choices there. Hope you got the right one. Okay, so it's a pleasure to be here. I've only been here one other time, for some big conference, and Anne and I were trying to remember what it was, but I've never visited the department. It's really a fantastic place. I must say that Eddie, who went to get me water, is probably the single best organizer that I've had in 35 years. I wish she, or he, or they, could have heard that. All right.
So what am I interested in? You know, I'm interested in hard problems, and some years ago we decided that this was definitely something worth dealing with. Most problems that are traditional in theoretical chemistry are sort of two-scale problems. For example, you may want to deal with the electronic structure, and that tells you about how the nuclei move.
Or, if you know something about the interactions, you may want to do statistical mechanics, which tells you about thermodynamics, right? So those are two scales. Most systems that we're interested in anymore are multiple-scale. This is a little image from some proteins. This is a protein, actin, that hydrolyzes ATP. So in the actin protein there's a hydrolysis reaction, which is quantum mechanics. Oh, thank you so much. I paid you a really nice compliment, but you missed it. Someone can tell you about it later. I'll email it to you also, just in case you need it. Thank you. So. And then it turns out this hydrolysis is coupled very much to the polymerization of this filament and how it behaves, so you have at least three scales there. And of course this thing doesn't want to work now, so I don't know. We've really had quite an electronic experience today.
Where we stand right now is, you know, we want to push up in scale, and many of you have seen diagrams like this. We can start with ab initio molecular dynamics, where you're trying to solve the electronic structure as the nuclei move. That's a very valuable technique, but demanding. Then more conventional molecular dynamics. All right. I'm going to unplug it. Okay, so when all else fails, unplug it. Let me start all over. So, conventional MD: we look at scalable simulations, where we want to go to very big core counts. Specialized MD now is all about GPU computing, where for modest-size systems we can really crank up the speed. And if we throw in free energy sampling, this is something where we enhance what the simulations do for us; that tells us things about going over barriers and things like that. And that's what we used in this direction. But none of that is good enough. What I'm focusing on very much is the idea of coarse-grained modeling. Coarse-grained molecular dynamics is an attempt to take a fully atomistically resolved system and to write it in some simpler way.
So I'm going to spend a lot of time talking about that today and show you mostly some applications of this. Let's see if this will stop evolving. I think if I turn it off, it'll stop. Okay. That's really a mystery to me. I'm sorry. All right.
So what is coarse-graining? In our hands, we wanted to do coarse-graining in a fundamental way, and what I mean by that is starting with some sort of a theory. Coarse-graining typically involves taking a molecular structure and simplifying it into something that looks kind of spindly. We call it a lot of beads, for example; it looks sort of like a necklace. But the idea is to represent collections of atoms as particles, right? You could call them quasiparticles if you wanted. And the hard questions in this field, which exist to this day, are: what is the best sort of mapping that you can do here? I'll come back to this issue quite a bit. Is this enough or too much? What if you don't know the answer you're looking for? Do you worry about going too far, eliminating too much information? And then a harder aspect of the problem, once you've figured out this mapping, is: what is this new interaction? This is not the potential energy function for the atoms anymore. And I'm going to tell you a little bit about that. So what we wanted to do was to not do what the vast majority of the people working in this field do, which is start here and parameterize it somehow: use some experimental data or ad hoc data or whatever, and make a toy model. I don't want to sound overly critical, but the question is, can you relate it to the real molecular structure? So your favorite theory for doing something like this is statistical mechanics. What you do in molecular dynamics is you try to sample all of the degrees of freedom. That's a very high-dimensional vector, all of the particles in three dimensions. And you're trying to sample the Boltzmann distribution. If the system is out of equilibrium, then you've got to do something else.
But if it's in equilibrium, that's basically what you're trying to do. The hard part is that these force fields, for a nontrivial biomolecular system, for example, are very, very rough. They're exponentially complicated themselves, inside of an exponentially complicated sampling problem. This is actually surprisingly difficult. One way to think about this is: suppose I just threw a computer blindly at doing this integral, just chewing through, generating configuration after configuration. All it takes is one single particle trying to overlap with another single particle, which gives you a huge repulsion, which will kill that exponential. So this sampling problem is incredibly difficult. With coarse-graining, the idea is to try to simplify that sampling problem by taking away the parts of the problem that aren't as important. They are still important. You don't want to throw them away. And I'm going to show you how we did that, and then I'm also going to show you how it's really not good enough for where we want to end up.
So just a little math, and I promise no more math. How you do this mathematically is you need some mapping functions. The mapping functions take a collection of atoms and, for example, map them to their center of mass, right? It's a simple linear function. It maps them here. Those are the M's. The way you introduce those kinds of functions, which are called collective variables, into a calculation like this is you put in an integral over those functions with a delta function. If you integrate a delta function over all space, it's equal to one, so you can stick it in here. And the idea is, of course, that you stick in not just one of them but all of them. And then you do what makes mathematicians unhappy: you switch the order of integration. You subtract one side from the other. And what you show is that this integral over all possible configurations of my coarse-grained system must always be zero.
And the only way that can always be zero is for what's inside these brackets to always be zero, right? Because otherwise you may have gotten that to happen accidentally, and that can't happen. So what that proves for you is what you'd get if you could do this calculation, and I'll give you the hint: you can't. I'm just showing you if you could, right? This thing is now what we call a potential of mean force. So the most rigorous, what we call bottom-up, definition of this coarse-grained interaction is, if you could do it, you would integrate over all the small degrees of freedom, all of them. This delta function would say, okay, I only register that integral when the coarse-grained configuration matches the map, right? And then I'd weight that. So it actually makes this problem that I said is so hard even harder, right? So this is hopeless. It's not conceptually hopeless, though, because this thing is not a potential energy function. It's a free energy function, because I have averaged over lots of motion here, so entropy's in it, right? And if entropy's in it, this is a free energy function.
So what we did some years ago, which had a lot of notoriety, I think (I guess this paper has been cited over a thousand times), is we showed how to do this calculation. And this is an idea that's actually an early example of machine learning. Now that machine learning is everywhere, I can claim that we did that. Twenty years ago, we did it before anyone thought about it. The idea is, okay, in machine learning you need a cost function. You need something to fit your data to, all right? And what we were doing at the time was fitting forces that came from quantum calculations to some simpler model, right? They call this force matching. So this is a least-squares fit, by the way, if you hadn't noticed, right? It's trying to fit this to that, or I should say this minus that, squared so it's positive, summed over all of the points, and maybe averaged in some way.
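For readers without the slides, the argument just described can be written out roughly as follows. The notation is an assumption, since the slide symbols are not in the transcript: $\mathbf{r}^n$ are the atomistic coordinates with potential $u$, $\mathbf{R}^N$ the coarse-grained coordinates, $M_I$ the mapping functions (here a center-of-mass map), $V$ the coarse-grained interaction, $\beta = 1/k_B T$, and $C$ a configuration-independent constant.

```latex
\begin{align}
M_I(\mathbf{r}^n) &= \frac{\sum_{i \in I} m_i \mathbf{r}_i}{\sum_{i \in I} m_i}
  && \text{(center-of-mass mapping)} \\
1 &= \int d\mathbf{R}^N \prod_I \delta\!\big(M_I(\mathbf{r}^n) - \mathbf{R}_I\big)
  && \text{(insert the delta functions)} \\
\int d\mathbf{r}^n \, e^{-\beta u(\mathbf{r}^n)}
  &= \int d\mathbf{R}^N \int d\mathbf{r}^n \, e^{-\beta u(\mathbf{r}^n)}
     \prod_I \delta\!\big(M_I(\mathbf{r}^n) - \mathbf{R}_I\big)
  && \text{(switch the order of integration)} \\
0 &= \int d\mathbf{R}^N \Big[ e^{-\beta V(\mathbf{R}^N)}
     - C \int d\mathbf{r}^n \, e^{-\beta u(\mathbf{r}^n)}
       \prod_I \delta\!\big(M_I(\mathbf{r}^n) - \mathbf{R}_I\big) \Big]
  && \text{(subtract one side from the other)} \\
V(\mathbf{R}^N) &= -k_B T \ln \int d\mathbf{r}^n \, e^{-\beta u(\mathbf{r}^n)}
     \prod_I \delta\!\big(M_I(\mathbf{r}^n) - \mathbf{R}_I\big) + \text{const}
  && \text{(potential of mean force)}
\end{align}
```

In the same assumed notation, the least-squares (force-matching) residual has the generic form

```latex
\chi^2 = \frac{1}{3N} \Big\langle \sum_{I=1}^{N}
  \big| \mathbf{f}_I(\mathbf{r}^n) - \mathbf{F}_I\big(M(\mathbf{r}^n)\big) \big|^2 \Big\rangle ,
\qquad \mathbf{F}_I = -\frac{\partial V}{\partial \mathbf{R}_I},
```

where $\mathbf{f}_I$ is the atomistic force mapped onto coarse-grained site $I$ and the average runs over sampled atomistic configurations.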
That's a least-squares fit, all right? You know, every now and then you have a stroke of an idea one day that really can pay off. I told Sergei, who was a postdoc, I said, you know, this coarse-graining stuff that's going on seems like it's really ad hoc. What if we did this force matching, but we changed resolution also, right? We took all the atomistic forces, and then we projected them onto coarse-grained particles, right? And did force matching. Well, it turns out that that was a very, very powerful approach. You can prove mathematically that if you have infinite knowledge of how to represent this coarse-grained interaction, which you don't, that's the dirty secret, right? You don't. But if you did, it would be exact. It's so-called variational, which means if you work harder and harder to improve that, you get a better answer, right? I will tell you that what a lot of people are doing now is using deep neural nets to fit this. We used numerical functions at the time, spline functions, but it was machine learning, by God, I'll claim it. Okay. Actually, maybe I won't want to claim it one day, because I'm not so keen on machine learning, but what the heck.
I have to mention this; I won't go into detail, but Scott Shell is a wonderful young theorist, and he also came up with an alternative way. It's called relative entropy minimization. The idea is that if you have a coarse-grained model and an atomistic model projected onto that, you should minimize the relative entropy between them. And this is yet another way to do this bottom-up coarse-graining. So I think I'm done with equations, so you can relax now. I don't want to treat you like kids. Maybe you really like equations, but that's it for the equations.
I didn't believe that this would work. Sergei, to his credit, in that first paper did a lipid bilayer. He did cholesterol in a lipid bilayer. No one had ever done cholesterol.
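The force-matching idea described above (project atomistic forces onto coarse-grained sites, then least-squares fit a simpler pairwise force) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the group's actual software: the function names are hypothetical, the pair force is represented by piecewise-constant values on distance bins rather than the spline functions mentioned in the talk, and periodic boundaries are ignored.

```python
import numpy as np

def map_forces(atom_forces, bead_of_atom, n_beads):
    """Sum atomistic forces over the atoms assigned to each bead.

    For a center-of-mass mapping, the force on a bead is the sum of the
    forces on its atoms (assumption: each atom belongs to exactly one bead).
    """
    cg_forces = np.zeros((n_beads, atom_forces.shape[1]))
    np.add.at(cg_forces, bead_of_atom, atom_forces)
    return cg_forces

def design_matrix(positions, edges):
    """Linear map from per-bin pair-force values to forces on every bead.

    Row block i holds the force components on bead i; column k collects the
    unit vectors to all neighbors whose distance falls in bin k.
    """
    n, dim = positions.shape
    n_bins = len(edges) - 1
    A = np.zeros((n * dim, n_bins))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rij = positions[i] - positions[j]
            r = np.linalg.norm(rij)
            k = np.searchsorted(edges, r, side="right") - 1
            if 0 <= k < n_bins:
                A[i * dim:(i + 1) * dim, k] += rij / r
    return A

def force_match(frames, edges):
    """Least-squares fit of the binned pair force to mapped reference forces.

    frames is a list of (bead_positions, mapped_bead_forces) pairs.
    """
    A = np.vstack([design_matrix(pos, edges) for pos, _ in frames])
    b = np.concatenate([forces.ravel() for _, forces in frames])
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs
```

Because the model force is linear in the fitted coefficients, the fit reduces to one call to `numpy.linalg.lstsq`; a richer basis (splines, or a neural network with a nonlinear optimizer) changes the representation but not the cost function.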
It was pretty impressive, but I said, you know what? Let's strip away all this biological complexity. Let's do water. Okay. If you can do water, I'm going to believe this. And I don't mean a single molecule of water. I mean liquid water. Okay. So what we want to do is take water, with all of its, you know, Anne will tell you, all this crazy perversity. Sotiris will tell you: hydrogen bonds, it's fourfold coordinated, people argue about it forever, it's got weird phase transitions. I want to turn it into a fluid of single particles, right? The coarse-grained particles. And I want those to not act like argon, right? I want them to act like water. So the simplest thing is simple fluids, argon or what we call Lennard-Jonesian fluids. They have 12 neighbors and it's just packing. This is the famous van der Waals picture: it's just 12 neighbors, and it's dominated by packing. So one thing I'm going to have to figure out is whether all the physics of water translates into single coarse-grained particles. It's got to defeat the van der Waals picture. It's got to have four neighbors, not 12. All right, plus some other stuff. Well, he did this and, lo and behold (oh, this doesn't work anymore, so sorry about that), this is the radial distribution function, right? If you sit on one particle, what's the probability of finding another one some distance away from you? It was exceptional. If you integrate this, you get four. The question is, how did that happen? The reason I'm showing you this is, trust me, if you came here with more interest in biological science, I'm going to get into some very heavy biological problems here in a minute. I show this because, if you do this right, in my opinion, and you want to translate the physics to another scale, this is a great example. So this is what happens when you do it.
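The remark that integrating the radial distribution function gives four refers to the coordination number, the standard integral $n_c = 4\pi\rho \int_0^{r_{\min}} g(r)\, r^2\, dr$ taken out to the first minimum of $g(r)$. A minimal sketch of that integral follows; the function name and the choice of a trapezoidal rule are assumptions for illustration, and the tabulated $g(r)$, number density, and cutoff would come from the simulation.

```python
import math

def coordination_number(r, g, density, r_cut):
    """Trapezoidal estimate of 4*pi*rho * integral_0^r_cut g(r) r^2 dr.

    r and g are equal-length lists tabulating the radial distribution
    function; r_cut is normally the position of the first minimum of g(r).
    """
    total = 0.0
    for k in range(len(r) - 1):
        if r[k + 1] > r_cut:
            break  # stop at the end of the first coordination shell
        f0 = g[k] * r[k] ** 2
        f1 = g[k + 1] * r[k + 1] ** 2
        total += 0.5 * (f0 + f1) * (r[k + 1] - r[k])
    return 4.0 * math.pi * density * total
```

For a Lennard-Jones-like liquid this integral comes out near 12; the point made in the talk is that the coarse-grained water model gives about four.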
This is when I have these coarse-grained particles, and they interact through pairwise interactions, right, and they're single particles. This, for comparison, is a Lennard-Jones potential. And there's that hard wall, right, one over r to the 12th, which leads to packing. So what happens when I take the physics of water (there's tetrahedral coordination, it's expanded, there's tons of entropy) and I do this coarse-graining? In Leo Kadanoff's language, he would call it renormalization, right? In fact, when I gave this talk when I first moved to Chicago, Leo was in the front, and he said, this is just RG, RG. So he was happy about that. This is what happens. You get this kink right there, and it destroys the ability of this fluid to pack into 12 neighbors. It can only pack into four. There's that kink, and then this deep minimum is an additional solvation shell, so there's more structure in water than there is in a simple fluid. So what I'm trying to tell you is that the process of rigorous coarse-graining takes the physics of water and puts it into new objects, but the physics didn't leave. The physics popped up here. Now, some of you that aren't asleep will say, well, what about the dipole moment? You can't have a dipole moment for single particles. Well, that's an issue called representability. So you will lose information, but I was just interested in the structure of the fluid. So that convinced me that we were off and running.
Okay, quickly. Oh, I lied about the equations. I'm sorry. So if it's a free energy, even if I call it V, that means it's got an energetic or enthalpic part and an entropic part. And so you can actually decompose these interactions with a finite-difference derivative. You can get at the entropy. And I will tell you, without showing you the results, this is very much due to the entropic part. That's because water is an expanded liquid. It's fourfold coordinated and expanded, meaning a lot of free space, a lot of entropy, right? It's got this fourfold coordination.
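One way to write the decomposition just mentioned, as a sketch under the assumption that the coarse-grained interaction $V$ is treated as a free energy at fixed coarse-grained configuration $\mathbf{R}$ and that it has been computed at two nearby temperatures $T \pm \Delta T$ (the symbols $E$, $S$, and $\Delta T$ are illustrative, not taken from the slides):

```latex
\begin{align}
V(\mathbf{R}; T) &= E(\mathbf{R}; T) - T\, S(\mathbf{R}; T), \\
S(\mathbf{R}; T) &= -\frac{\partial V(\mathbf{R}; T)}{\partial T}
  \approx -\frac{V(\mathbf{R}; T + \Delta T) - V(\mathbf{R}; T - \Delta T)}{2\,\Delta T}, \\
E(\mathbf{R}; T) &= V(\mathbf{R}; T) + T\, S(\mathbf{R}; T).
\end{align}
```

A model whose interactions carry no temperature dependence has $\partial V / \partial T = 0$ and therefore no entropic part in this sense.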
That entropy had to show up somewhere, and it showed up there by saying, you are not allowed to pack me into a low-entropy state, because that's there for you now. All right. Okay, so let's move forward.
Oh, I should say an alternative is doing it the top-down way. For example, the most popular model is one called Martini. It stands for something that I don't remember. Actually, later at dinner, I'll probably try one of these martinis. But the idea with the Martini model is to not do any of this stuff, but to parameterize this. And what we were able to show is, first of all, that you can't capture that entropy. You need a temperature derivative of your interactions at the coarse-grained level to tell you that you are looking at entropy, and these kinds of models cannot capture that. By the way, I published this paper just to tell some of the young people how not to make friends. It analyzes what I just said, toward better understanding the origins of a Martini hangover. In that paper, we showed that their model was lacking the entropy. So they're still friends. I mean, you know, they're okay. Okay, and this is the last of the... If any of you are interested, this is a review article, written at great labor mainly by the other group members. A very comprehensive review article. We also have software now that people can use, and this was published.
Okay, so you would think that this is a solved problem, and you would think, you know, he just told me you can do this, you can coarse-grain anything, you do it rigorously, blah, blah, blah. Here's the problem. We want to go to really large scales, right? So I mentioned actin filaments. This is an image from a cell. I forgot what kind of cell, but you can see these actin filaments in here, right? So actin filaments, molecular motors, and the cytoskeleton. I think this is the nucleus.
We want to go all the way, eventually, from these monomers, which are the actin protein, all the way up to this kind of scale, which is thousands and thousands of proteins. I don't know. So here's the coarse-graining, okay? Let me point out that what I just got done telling you about so proudly is not good enough here, all right? Because what we need to do is get very aggressive in how we coarse-grain these proteins. Why? Because computationally, you know, this already, at the fully atomistic level, is a million atoms, right? This is 13 proteins in sort of the natural helical twist of one of these filaments. You add solvent, and you have a million atoms right there. So if you coarse-grained it at a pretty high level of resolution still, yeah, it'll help, but it won't be anywhere near enough. So we need to be very aggressive about how we represent these proteins. And this is an example. The actin monomer is 375 amino acids plus the bound nucleotide, and we're representing it by, I think it's 13, of these coarse-grained objects, okay? So that's more like 30 to 40 amino acids per coarse-grained particle, not three coarse-grained particles per amino acid. This is very coarse-grained. And you have to ask yourself, does it make any sense at all to treat this anymore as some Newtonian object, right? I just plug it into a molecular dynamics code. I integrate F equals ma. Maybe I add some dissipation to kind of thermostat the system. That's crazy. It doesn't make any sense, because these particles are representing all kinds of stuff going on in that protein, you know? So, an example. Well, these are other kinds of examples. I'll just skip through them. Huge changes in membranes. They all involve hundreds or thousands of proteins. This is the actin filament, right? This kind of cycle. Okay. I'm going to talk about viruses in a bit.
So the question is, how can you represent this in a new way that captures the physics and still allows you to have a very coarse-grained model that will be so computationally efficient that you have a hope of getting to this scale, all right? That's a fundamental question. And I was lucky, or unlucky: I started my career doing quantum dynamics, okay? Not quantum electronic structure, but quantum dynamics, so nuclear quantum motion. All of you that have done that are familiar with a nonadiabatic transition, right? You're moving along on some electronic state, your nuclei feel those interactions, and then you make a transition to a different one, right? And there are some rules for how that goes. So it occurred to me that really the right way to represent this kind of very coarse object is not with Newtonian mechanics, but with what we call an isomorphism to quantum dynamics, okay? To illustrate that: for example, this copper ball is representing the nucleotide, and we know that the nucleotide can be in three different states. It can be ATP, it can have undergone hydrolysis with the phosphate group still bound there, or the phosphate group can have dissociated, so it has three realities. So that little copper ball is better represented by an object that can move through space but also has three embedded states in it, like quantum states. They're not quantum states, but I'm using quantum mechanics. Oh, and I should also say that the kinetics, and I'll mention this in a minute, depends very much on what these other balls are doing, especially these four big balls, these domains. They accelerate the hydrolysis by a factor of 40,000. And when you undergo this hydrolysis, it affects other parts of the protein. For example, this thing called the D-loop, which I'm representing by one ball here, can be folded or unfolded.
And that is tied to what nucleotide state it is in, and whether it's folded or unfolded plays into whether the filament is stable or not. So it's like a big coupled quantum problem, except there are good things. There's so much stochasticity that you don't have to worry too much about coherence. You know, it's not a quantum computer, thank God. Okay, it's just something where I'm using what quantum mechanics does to represent what this thing does. And I will show you a couple of examples where this has been a huge payoff for us. For us, it's also very important that this was not just some ad hoc description, that there was actually real theory behind it. So now, geez, more than 10 years ago, we published literally how you would do what I talked about in the early part of this talk, how you would formally project all of that out from atomistic interactions. You may not want to do that, right? You may want to short-circuit it somehow, but we showed that it can be done rigorously.
Okay, so in a nutshell, if you want a take-home lesson from this, what we have sort of done is merge two main areas of modeling. One is particle-based: coarse-grained things move through space. They need to move through space. They're going to aggregate. They're going to do all kinds of things. And embedded in that are what we often call Markov state models, or one way you can think of it is quantum dynamics but without a lot of coherence, right? So like what we call a master equation. So basically, this kind of dynamics is living in these particles, and what happens with this dynamics matters for how those particles move, and how those particles move matters for how this dynamics goes. So it's a very cool sort of coupled problem. So, bottom line, all that quantum mechanics I learned early in my career paid off. And I'll show you a couple of examples of where it is used. Okay, so this can really open the door to all kinds of things we can describe. For example, I mentioned hydrolysis, or electron transfer, or protonation.
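The "master equation living inside a particle" picture can be illustrated with the three nucleotide states described earlier in the talk (ATP, hydrolyzed with phosphate still bound, phosphate released). This is only a sketch under loud assumptions: the function names are hypothetical, the two steps are treated as irreversible, the rate constants are illustrative placeholders, and the only number taken from the talk is the 40,000-fold acceleration of hydrolysis by the surrounding domains.

```python
# Speed-up of hydrolysis quoted in the talk (a factor of 40,000).
FILAMENT_BOOST = 4.0e4

def hydrolysis_rate(k_free, in_filament):
    """Hydrolysis rate of the nucleotide bead.

    k_free is a hypothetical free-monomer rate; the rate depends on the
    state of the surrounding coarse-grained beads (here a single flag).
    """
    return k_free * FILAMENT_BOOST if in_filament else k_free

def propagate(populations, k_hyd, k_release, dt, steps):
    """Explicit-Euler master equation for ATP -> ADP-Pi -> ADP.

    populations is (p_ATP, p_ADP_Pi, p_ADP); both steps are assumed
    irreversible, so total probability is conserved by construction.
    """
    atp, adp_pi, adp = populations
    for _ in range(steps):
        hydrolyzed = k_hyd * atp * dt        # ATP -> ADP-Pi
        released = k_release * adp_pi * dt   # ADP-Pi -> ADP + Pi
        atp -= hydrolyzed
        adp_pi += hydrolyzed - released
        adp += released
    return atp, adp_pi, adp

if __name__ == "__main__":
    # Illustrative numbers only; neither rate constant comes from the talk.
    k_hyd = hydrolysis_rate(k_free=7.5e-6, in_filament=True)
    print(propagate((1.0, 0.0, 0.0), k_hyd, k_release=0.05, dt=0.01, steps=1000))
```

In a full model the internal-state rates would be updated from the bead coordinates at each step, and the internal state would in turn change the interactions the beads feel, which is the two-way coupling described above.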
You know, if amino acids get protonated and they're inside these coarse-grained balls, that is a huge change. That's a big electrostatic change. So this can capture that, and stuff like this, like ligand binding. All these things can be described within these coarse-grained models in an implicit way, as opposed to just trying to tweak some particle-like model to capture all that, right? It's very hard to do that. It was impossible to do that.
Okay, one quick example, as I've mentioned actin several times. Actin filaments, you know, they polymerize, right? And when they polymerize, the hydrolysis gets going and stuff happens. For example, is the hydrolysis between different actin units cooperative or random? These kinds of questions are very interesting. So, for example, actually, from quantum mechanics calculations, we can calculate the dependence of the barrier to nucleotide hydrolysis on these coarse-grained beads. This angle phi has to do with the conformations of those coarse-grained beads. We can plug it into this algorithm I told you about, and we can do things like answer questions. These are the two limits. This one is called vectorial: as the actin filament polymerizes, does the ATP-to-ADP conversion sort of go like a wave behind it? Or is there a lag time, where the polymerization goes on and then the conversion occurs more randomly? I'll tell you the answer. It's neither one of these limits. I'm not showing you, but it's kind of in between them. It turns out that actin polymerization is pretty robust to things that modulate this hydrolysis. And it needs to be that way, because if it were really sensitive to that, your cells couldn't work. But that's an example.
Let me show you another example. I'll spend the rest of the talk talking about this. This is a problem, obviously, that a lot of people are still greatly concerned with. If you're not a biologist, don't freak out at this moment. I'm going to explain how this virus works. It's relatively simple.
And we're going to target one part of it. So the HIV virus generates bad proteins called Gag proteins. And of course it has its RNA. Every virus has its genetic material. So what happens with the Gag proteins is they accumulate on an infected cell. They target a lipid called PIP2, and they accumulate. They bind with the RNA of the virus, forming what's called a bud. This is a cell membrane. It's obviously a schematic. It keeps going. Some of the forces that keep it going are still not completely understood, but it keeps going. Some other proteins, which I'm not showing, come along and cut that neck. And it goes free. This is called a virion. Many of you have been looking at SARS coronavirus virions until you're ready to vomit, this thing with the spikes. So this is a virion. HIV does have envelope proteins, which I'm not showing. It's a symbol. This is called an immature virion. So then what happens is an enzyme comes along, which I'm not showing. It's actually part of this process. HIV protease starts cutting up that Gag. Chews it up. And it cuts it in certain places. A couple of the important places free up the green part, called the capsid domain. And that's called maturation. So the capsid domain assembles around the RNA to form the virus capsid. I'll show you a really good picture of this in a moment. This is now an infectious, mature virus. Many, many anti-HIV drugs target that protease to stop this process. And the problem is the virus mutates, and so on and so forth. And of course then it keeps going. This is just showing it with the same cell; it fuses, and so on and so forth.
So we are interested in this process of assembly. And one of the good things, useful things, is that you can assemble these capsid structures without all this biology. You can take that protein and put it in a beaker at the proper concentration. You may add some of what are called crowding agents to sort of facilitate things. And you will form these capsids without all the...
You don't even need the RNA. So from the point of view of doing careful controlled experiments, and simulations that can be compared to things that don't have all the full glory of biology, that's really good, all right? This is what the... oh, we published this paper a few years ago now, the first ever. There are 1200 proteins in here, right? So it's a very complicated process. This is what the thing looks like. There are some really fascinating mysteries about it. So this is the HIV capsid. What happens is this is the Gag. The HIV protease chops it up, actually in more than three places, but right there it frees up this capsid domain. It dimerizes. The C-terminus has a very strong binding with another one. It kind of looks like wings here. And then it forms this capsid. This looks at first like a fairly simple structure, but it's really not. You see the hexamers there? Okay. One half of one dimer is in one hexamer. The other half is in the other hexamer next to it. And then all the C-termini are in a shell below. And if that's not bad enough, Euler's theorem, a mathematical theorem, tells me I need 12 pentameric defects to close this thing. All right. Euler's theorem is actually really complicated. I thought it must be simple. Apparently it's 12. All right. And these are not different proteins. The red things are pentamers, not hexamers. And you close the surface. So there are huge questions: how does it grow? Why does it grow? Why does it look like a cone? Okay. Does it grow from one side or the other? When do the pentamers come into play?
So in typical fashion, I had a really great postdoc, John Grime. I called him into my office. I said, you know, John, we've been doing all this coarse-graining. We ought to be able to do this. Would you please go do it? Okay. And a few years later, he won't talk to me anymore. But I really like John. This was the outcome of the first simulation. Okay. This was really bad. John almost jumped off the Searle roof. Okay.
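The count of 12 pentamers follows from Euler's polyhedron formula in a few lines, under the idealization (an assumption made here for the sketch) that the closed capsid is tiled only by pentagons and hexagons with three tiles meeting at every vertex. With $P$ pentamers, $H$ hexamers, and $V$, $E$, $F$ the numbers of vertices, edges, and faces:

```latex
\begin{align}
F &= P + H, \qquad
E = \frac{5P + 6H}{2}, \qquad
V = \frac{5P + 6H}{3}, \\
V - E + F &= \frac{5P + 6H}{3} - \frac{5P + 6H}{2} + P + H
  = -\frac{5P + 6H}{6} + P + H = \frac{P}{6}, \\
V - E + F &= 2 \;\;\Longrightarrow\;\; P = 12 .
\end{align}
```

The number of hexamers $H$ drops out, which is why the same 12 defects close cones, spheres, and other shapes of different sizes.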
So here are some simulations. These are reproducible; there are hundreds of them. Along here, I plot the concentration of the protein. Here is the crowding agent concentration, for some reason unknown to me in different units. Okay. What we discovered first was something really important. And that is, down in this quadrant, the nucleators. All growth of a lattice requires a nucleator, right? So these are these little trimers. What that is, is three of these, right? So you can kind of see it. There's the red, the green, and the blue in a very nice binding interface to form a trimeric structure. These come and go all the time. They're actually seen in light scattering. So the good news here was we identified the initial nucleator, what grows the lattice. Okay. The really bad news was that when you go over this boundary, you start growing lattice, but it's totally out of control. And you can do these simulations over and over again. You get crazy stuff. Every now and then you get some hopeful thing here, but some monster grows off of it, and it never works out. So John and I were like, oh gosh, what did we do? And we had done something stupid, it turns out, in hindsight. So take a look at this protein. You know, if you look at that protein, you've got these nice helical bundles, but it's really floppy, right? You see all those linkers? Okay. So this protein, and these are NMR experiments, which we in our wisdom had missed, okay, is flopping all over the place. And only about, I say 5%, it's more like about 10% of the time, are you in the right conformation to add to the lattice. The rest of the time, you'll crawl around, you won't add, you'll pop off, okay? So the problem is that to explicitly sample all those conformations would take all your computer time for a single protein, right? Let alone 1200 of them. So this is where the ultra-coarse-graining came in.
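One hedged way to picture the problem just described: instead of sampling the floppy conformations explicitly, each coarse-grained protein carries a two-state flag that switches stochastically, with switching probabilities chosen so that roughly 10% of proteins are assembly-competent at any time. The sketch below is illustrative only; the function names and per-step probabilities are hypothetical, and the published model handles details (such as replenishing the depleted state) that are not represented here.

```python
import random

def stationary_active_fraction(p_activate, p_deactivate):
    """Long-time fraction of proteins in the assembly-competent state."""
    return p_activate / (p_activate + p_deactivate)

def simulate_flags(n_proteins, n_steps, p_activate, p_deactivate, seed=0):
    """Flip a per-protein boolean flag with fixed per-step probabilities.

    Returns the fraction of proteins that end up assembly-competent.
    """
    rng = random.Random(seed)
    active = [False] * n_proteins
    for _ in range(n_steps):
        for i in range(n_proteins):
            if active[i]:
                if rng.random() < p_deactivate:
                    active[i] = False
            elif rng.random() < p_activate:
                active[i] = True
    return sum(active) / n_proteins

def can_bind(is_active, at_lattice_edge):
    """Only an assembly-competent protein at a growing edge may attach."""
    return is_active and at_lattice_edge
```

With per-step probabilities of 0.01 (activate) and 0.09 (deactivate), `stationary_active_fraction` gives 0.1, matching the roughly 10% figure quoted from the NMR experiments; the time scale of the switching would be set by the same experiments.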
We realized, look, we really understand there's this conformational space, this kind of granularity where you can say it's folded, it's in the right conformation or not. There are time scales that all came really from NMR experiments. We didn't even need to do the calculations. And you can add this to an ultra-coarse-grained model. All you need is a variable that travels along with your coarse-grained particle. It says, I'm in the right conformation or not in the right conformation. It's a little more subtle than that, because as you deplete this, by Le Chatelier's principle, you have to replenish it, you have to do some things. But this basically adds, implicitly, to this coarse-grained model the conformational selection that goes on. That's really important because it is a missing entropy, right? So when you grow an object like that, you have all these proteins and they're very happy with all that entropy, floating around free. When they go into a lattice, they lose all that entropy. Delta S goes in the wrong direction. That's bad news. It's compensated by all those contacts that they're forming in the lattice, right? Delta H. But what we were missing was the lost entropy of, basically, the conformations of this protein. Those also go in the wrong direction, right? So when they're out in solution, they're really happy flopping around. When they go into that lattice, you lose that delta S, and we could include that now in this ultra-coarse-grained model, okay? So this is what we got. This is what the simulation actually looks like. There are all the crowding agents, all the extra proteins. I start growing capsids. That's within here. I'm just showing the growing capsids. I'm going to run this movie again for you in a second. So you can see, we didn't run it all the way here because it slows down critically, and eventually it closes. Let me run the movie again because I want to describe something to you. Okay. Run, please. There are three kinetic processes here.
One is the growth of the lattice, and that's templated by these trimers, right? The blue is showing these trimers. They're growing the lattice. Then as the lattice forms, it curves. Naturally, from how the proteins pack, it curves. Then you'll see the pentamers come and go, come and go, but kinetically, at some rate, they're being trapped, mostly in the high-curvature region. So this formation, and I want you to remember when I use the word kinetic, it's a kinetic funnel, not a thermodynamic funnel like so much of protein folding. It is three kinetic processes that need to be synchronized properly: growth of lattice, curvature of lattice, trapping of pentamers to allow closure, all right? And if you throw any of those out of whack, which is what we did when we ignored this conformational entropy, you get junk, okay? However, there are two things that were troubling about this. Is there a clock here? Okay, we're in good shape. One is that these simulations always seem to grow from the small end, the most highly curved end. Just thinking about that, there are two things that make you nervous about it. One is it seems strange that you would grow from the highest-curvature region first, right? Secondly, if you want, in the actual virus, to encase the RNA, you would think you'd want to grow from the big cap first, to kind of hug the RNA and then close, right? Just the opposite. The third was what this absolutely wonderful postdoc didn't tell me: most of the time you get tubules here, you don't get these capsids. This isn't some kind of rare event, right? So many, many times, these proteins like to be in this nice hexagonal lattice. If you don't get adequate pentamers, you just grow a big long tube, all right? Okay, so that came after publishing the paper, and we mulled over that for a long time. Along come my structural biology friends, and they get better every year, right? Cryo-EM, cryo-electron tomography.
They discover something remarkable, and that is, in the middle of all these hexamers, and pentamers too, but I'll talk about that, there's an anion. This is called inositol hexaphosphate. So it's a ubiquitous anion in your body. It's a, you know, phosphate. It's got a lot of negative charge. It loves being in the middle of those hexamers because there are these arginine groups that are positively charged. It just loves it there. And you can see from these, this is just junk. This is if you mutate the arginines, you get junk. But if you take a quick look, this is called IP6. So with no IP6 in vitro, you just get tubes, which is what John was getting most of the time, right? Tubes. You add a little bit of IP6, you start getting cigars. You start closing the ends. If you add a lot more, if you look carefully, I get lots of good HIV cores, some spherical particles, but clearly, in vitro, it makes a huge difference if you have this IP6. In fact, now all the HIV researchers who used to struggle to grow these cones just put some IP6 in and they get them right and left, all right? Okay, so what's really interesting about this is that nature, with this complex virus trying to assemble 1,200 proteins, plus or minus, you know, 50 or whatever, has utilized an anion that's ubiquitous in your body to make that happen well, all right? And the question is, can we recapitulate and understand what's going on? Well, go back to the atomistic case. We want to put, you know, good atomistic data into a coarse-grained model. So what we first did was calculate the binding of IP6 to hexamers and pentamers. It comes from two directions. You can go from the bottom or the top, okay? And if you look closely, there's strong binding, as you'd expect, right? This negative anion loves it around those arginines. Pentamers like IP6 more, or IP6 likes pentamers more. It's kind of a mutual love affair there, right? And by about five kilocalories per mole more.
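To give a feel for what a difference of about five kilocalories per mole means, the sketch below converts a binding free-energy difference into an equilibrium preference ratio using a Boltzmann factor. This is back-of-the-envelope arithmetic added for illustration, not a calculation from the talk: the function name, the 300 K temperature, and the assumption of simple equilibrium two-site competition are all choices made here, and the talk itself stresses that the assembly is kinetically controlled.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def boltzmann_preference(delta_delta_g_kcal, temperature_k=300.0):
    """Equilibrium preference ratio for the tighter site.

    delta_delta_g_kcal: how much more favorable binding is at the
    tighter site, in kcal/mol (a positive number).
    """
    return math.exp(delta_delta_g_kcal / (R_KCAL * temperature_k))


if __name__ == "__main__":
    ratio = boltzmann_preference(5.0)
    print(f"preference ratio at 300 K: {ratio:.0f}")
```

At room temperature, a gap of this size corresponds to a preference of several thousand to one, which is consistent with the picture of IP6 strongly favoring the pentamers it binds.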
And that's a little surprising, because if it's these arginines, you'd think the hexamers, with six of them, would bind more strongly, but the pentamers are tighter. So the IP6 can really bind in there and really rub up against these arginines. Okay, so to make a long story short, we translate this into our coarse-grained model. You can see IP6 binding, one in a pentamer, one in a hexamer. And I'm going to show you some very interesting results now. It completely changes this assembly, the kinetics of it. So what I'm going to show you is, again, this is what the actual simulation looks like. This is a bad representation of the coarse-grained proteins, but those are the coarse-grained proteins. And we get, just completely reproducibly, cores, right? So HIV capsids. I will show you in a minute. These are exactly like when you do cryo-electron tomography of actual HIV viruses. They don't all look like this perfect cone. Sometimes they even look like a Tylenol pill, right? This is the more perfect one. Sometimes they're the fat guys. Sometimes this one's a little bit screwed up. So we can line them up, right and left. We recapitulate this completely, okay? And so here's the cool part, kinetically. Remember the word kinetics, okay? What I'm going to show you is how this works. This is time, okay, in the coarse-grained simulation. This is how many proteins are assembling, up to 1,200 here. And this, I'll show another axis here in a minute, is the pentamers. So what you'll see is an initial seed. And then a slow growth phase where it's trying to grow a lattice. And then what happens is a few pentamers form. You can't see them. They're actually on the back side of the shell. You'll see it in the movie. And when those pentamers form, just a couple, IP6 nails them and stabilizes them. It does not let them heal, right? They're defects. So they're defects in the hexagonal lattice. They would like to heal. IP6 binds them and that's it. And then it's like game over.
There's an explosive growth phase where you grow the capsids, the cores, et cetera. So this is the movie you're going to see, okay? All right. So you see it initially trying to grow. And then, if you look carefully, you'll see one or two pentamers, right? And the IP6, the orange ball, stabilizes them. Once you get that, it's game over, right? Get the cap, boom. It grows so fast I don't resolve it in this movie. And the reason for that is that you see the IP6, and there's some in the hexamers, okay? It's stabilized a few. It's got a cap. And then the hexamers want to start growing. They want to be a tubule, right? They say, I want to be a tubule, please. But they're screwed up because they have this cap, and they grow a distorted tubule in the wrong way. And then this end closes and more IP6 stabilizes that. As you keep running the simulation, more and more IP6 will intercalate into the hexamers. But what's important about this is it's an early-stage kinetic process, all right? It's the Achilles' heel of this capsid formation. And I'm going to show you pharmaceutical efforts, so you can imagine, okay, if this IP6 is so important, let's come up with a drug that interferes with it, right? Because these are critical. This is kinetically critical. We want to interfere with that process in there, all right? So this is a... I just did that. This is a graph showing the pentamers, right? This is what I showed you before. Look at the slow phase. No pentamers, maybe one, maybe two, maybe three. Just a few to stabilize this highly curved thing, and then boom, the growth explodes. More pentamers coming up. You get the 12, Euler's happy, okay? And then I grow these capsids over and over and over again. This is totally reproducible. This is a control. So I showed you, if you don't have IP6, you'd better get tubules. You cannot parameterize the coarse-grained model per se to give this result, but you have to test it, right? You need to test it.
You'll see a lot early on, but the pentamers come and they try to stay there, but there's no IP6 to stabilize them, so they get healed, and then you'll grow a tubule, because the hexamers like being hexamers, and you grow tubules over and over again. That's a control simulation here, right? So this is, I'm not sure what that is. Actually it looks like a superposition of many simulations. I hope it's not a single simulation, because that's crazy. Okay. One more control, just for the fun of it. If you really crank up the IP6, you don't even get hexamers anymore. You get pentamers everywhere. You get these so-called T=1 particles, highly symmetric. You see the IP6 in all these pentamers. Okay. So there are some really interesting questions about this. One is, these are literal reconstructions from cryo-electron tomography experiments by my collaborator, John Briggs. These are our simulations, and of course, we lined up five with six, and I've cherry-picked a little bit, but I will tell you, look, I mean, this compares to that. That compares to that. It's amazing. So what's also interesting about this is that for that kind of morphology to occur, it's a kinetically trapped object. These things kinetically go to something like a glass, right? They quench into a glass. The only way, thermodynamically, that it could be like this is if you have a highly degenerate thermodynamic minimum that can have all these structures. That's very improbable. So we think the HIV capsid is a very interesting kinetically trapped object in this virus. And then lastly, and then I'll finish up with just some very recent stuff, there is a drug. It's called lenacapavir. I wish I had designed the drug, and I'd be able to retire soon. But there's a drug, and they didn't know how it worked, but it's called lenacapavir. And look at this. So lenacapavir competes with IP6.
So if IP6 is so important in this growth, right, you could either try to replace IP6 with something that wasn't important, or you could send in another compound that competes with it. So how does it compete with it? Lenacapavir stabilizes hexamers more than IP6 stabilizes pentamers. And you can see this in the experiments. If I have sort of an equal ratio here, I get tubules. That means lenacapavir is winning. I'm getting tubules, right? If I crank up IP6, you see a mix. And if you crank up IP6 enough, you win, right? There's enough IP6 to beat the lenacapavir. This is a very, very promising drug. I think it's in its final stages of approval. It's got like a lifetime of three months. One injection will last for three months. So you think about, especially, third world communities, for example, where, you know, with daily cocktails, it was hard to get people to take them. So this is going to be very valuable. And it may be all about how it competes with IP6. Okay. How much time do I have? Another? So five minutes is enough? Okay. Yeah, so there's time for questions. Let me just tell you about one last thing. It has turned out that, you know, this HIV capsid is carrying the genetic material of the virus. It needs to get into the nucleus to replicate, right? And everyone thought for the longest time that when that capsid went into a newly infected cell, it would fall apart. Somehow the genome would find its way into the nucleus. And, you know, along comes cryo-electron tomography, as well as some fluorescence experiments, and shows this capsid going into what's called the nuclear pore complex. These are these big pores in the nucleus of your cell that can let stuff in and out. And they're very complicated, thousands of proteins. So to show you, you know, I don't want to brag, but to show you the power of the methods we use, we have developed a coarse-grained model of the nuclear pore complex, all right?
There are good structural models, atomistic structural models. This is called integrative bottom-up coarse-graining. I won't go through all these equations, but we can build these models. And we can start studying HIV capsids going into the pore, right? And it's not clear what it takes to go in there happily or not. And I'll just give you the punchline. What happens, if this movie will play for us, is the nose of the capsid finds its way there, and then it's kind of like an electrostatic ratchet. I have to tell you, this didn't have to happen, okay? It could have spit it out, it could have been wrong, the model could have been wrong. But because we build these models from the real interactions, what happens is the tip of this thing finds its way there, and then the interaction with these nuclear pore proteins wraps it in and docks it in. And so that's very, very, very cool. I should say that the pore also has to dilate some. There are different things like that. And we can actually determine what's successful and what's not. So the green ones are successful. So what I just showed you, if the narrow end goes first, and also this pill guy, I'll come back to this, the pill, as you might have guessed, can do pretty well. But if you're coming in with the wrong orientation, or you have too much girth, you get spit out, all right? It won't make it. So this is very interesting, because another target for impeding HIV would be to stop that from happening, right? And quickly, we can look at how the capsid is stressed or strained as it goes through. You need mathematical ways of describing how this lattice distorts itself. And what you find is, with the perfect HIV capsid, it doesn't melt and the proteins don't fall apart. It distorts. So it's like what we call an entropic spring, right? By increasing its entropy as it's getting squeezed, going into this nuclear pore, it can drop the free energy, right? Delta S goes in the right direction and the free energy goes down. So it can handle this kind of traversal.
These are the stress-strain striations. What's interesting is the pill, even though it seems like it should go through here easily, develops very severe strain in sort of sharply curved areas, whereas the cone shape seems to be able to tolerate that, okay? So it disperses these strains in a better way. I'll show you another picture. The pill even starts splitting, starts fracturing, right? Starts breaking. So what we think is, I mean, this is really going out on a limb, you know, maybe completely wrong. We think the reason that the HIV capsid is a cone is, in part, due to the assembly mechanism you have, but in part because it's designed that way, if you can use the word designed, to survive this traversal into the nucleus, right? So it, we think, is designed this way so it can withstand the pressure of going through. And then, to be fair, we're not simulating the remaining process. There's a thing called the nuclear basket here. It's kind of like a catcher's glove that helps it go all the way into the nucleus. So this is what we're studying now. And also, with lenacapavir bound to this, does it fracture it? We think it fractures it. Okay, so that's it. By the way, this is a remarkable simulation, due to the postdoc I'm about ready to give thanks to. Thousands of proteins, bottom-up coarse-graining, real physics emerging from it. I don't think anything else like it has been done, frankly. So I'm going to skip this part and just give thanks. This is Arpa, who has just done the nuclear pore work. Manish and Arpa did the assembly work that I told you about. And these other students, these are Patrick and Jay Hook, and Alex, who is a faculty member now, did early work. So that's it. Thank you very much for your attention, and I'm happy to answer your questions. I think I remember, at Ohio State, it must have been around 2005, when you were first starting this out. Just starting out, yeah. Yeah. Not as exciting as this. Well, you can kind of see the, yeah. Yeah.
And Greg, if you don't mind repeating the questions. Oh yeah. So that they can hear. Right. Yeah. Students post the hot questions. Yeah. Garrett, I know it's not on you. Yeah. Garrett's good. Yeah. Yeah, this is all for coarse-graining. It's a similar thing. It's, um, actually it's subtle. Yeah. That's a good question. It's a multi-res... Huh? Oh, I'm sorry. You're correct. So Garrett asked me how many of these coarse-grained objects, or beads, are required for the HIV proteins. It's subtle. It's actually a multi-resolution model. If you're going to pack into a lattice, you actually need pretty high resolution. So there's one bead per alpha carbon with short-range forces. Okay. So that's one alpha carbon per amino acid. And then the long-range, more attractive forces that are these binding interfaces are on the order of, like for actin, some 10 or 20 or 30 amino acids per bead. So there are two kinds of beads, attractive and repulsive. Yeah. So that's a good question though. So they are roughly the same size, and also it's only plus or minus 50 proteins or so, right? So you could argue the reason there are different morphologies is that they have different numbers of proteins, but you can't make that correlation. Well, I think there's a natural, it's again these kinetics. Well, if you don't have IP6, it'll grow to very long tubules. Yes. Yeah, but with the capsid, what happens is that the combination of IP6 with the pentamers creates curvature, natural curvature, you know, and it's got a kinetic process for curving, and that just stops it from growing beyond a certain size. Same size. Well, yeah, I wouldn't call them, they're roughly the same size, right? I mean, yeah. So I think that kinetic process of curvature plus lattice growth is clocked in such a way that it shuts it off, you know, it won't go past a certain size, right? Oh, it's 12. Once you get 12, that's it. So maybe that's it, like if you count the pentamers. Yeah. Huh?
It's, you know, actually, Euler's theorem is a lot more complicated than that, but 12 was the number. I can tell you that. I tried to understand it, but it was a lot of math. Yeah. Yeah. Well, yeah, it's buckyballs and whatever. It's 12. Yeah. So any honeycomb lattice requires 12 pentamers to close it, actually. That's what I think Euler's theorem says. So a soccer ball is an example. Yeah. It does, doesn't it? Yeah. So this question was that lenacapavir, the drug, does not look anything like IP6, the anion. And he even mentioned that it doesn't actually dissolve in water. First of all, let me comment. I have no idea how these medicinal chemists figured out that structure. It's not a trivial molecule, right? It's a complicated molecule, but it binds totally differently than IP6. It binds to an interface between two hexamers and it fits really well. If you look at crystal structures, cryo-EM, despite how complicated it is, it really fits into this binding site. It's not binding in the same place that IP6 is at all. IP6 is within these hexamers, in this pore-like region. So by that binding it is stabilizing the hexamer formation more than IP6 stabilizes the pentamer formation, unless you jack up the IP6 concentration a lot. Yeah. And then it's a concentration battle. If the binding event is the limiting step that you care about, then you would need a multi-resolution model. Let me back up on that. You could implicitly include that in your coarse-grained model. So for example, you could imagine a bead. It describes some amino acids. The binding site is in those amino acids, and you would have a kinetic process for binding or not binding. And how that bead interacts with the other ones depends very much on whether you're bound or not. So you could build that in, in an implicit way. If you wanted to do it explicitly, you'd need what we call a multi-resolution model. You'd need some part very highly resolved, almost molecularly resolved, and then the rest of it could be coarse-grained.
And I've got to tell you that that's a frontier for the future. Really doing everything I talked about in a rigorous way, where some part of it is totally resolved and some part of it is not, it's very coarse-grained, that's hard to do. That's a good question. So you could do it implicitly, but it might not be satisfying to you. It might be that the question you're asking is, how does it bind? And what I'm saying is that it binds or it doesn't bind. And when it binds, it changes what everything else does. So you could imagine a, I don't know, GPCR or something, where you bind and it causes some big conformational changes. So that was the slide that kept changing on its own. I don't know why it did that. But we have an interesting philosophy. With a lot of coarse-grained modeling, the philosophy is, I want something really fast and I run it on a few processors. We want to really push to, you know, thousands of proteins. So we coarse-grain the model and then we run on, like, Frontera at TACC over, you know, many, many cores. Or we recently juiced up the LAMMPS code to do these ultra-coarse-grained models, and two NVIDIA GPUs are equivalent to 510 CPU cores on Frontera, to give you an idea. So with GPU computing, you can do a lot. But our goal is not to just have simple toy models. These were really hard calculations, really big, right? But they're impossible otherwise. You know, there are billions of atoms otherwise. They're totally impossible to do otherwise. Yeah. So we get big computer allocations, if that's, yeah. Yeah. So, yeah. So David is asking about the initial conditions, how you start these simulations, do you sort of stack the deck? Let me do this. The actual nuclear pore has some filaments that come off. And it's not completely understood what they bind with here, but they are believed to help guide it in.
So we cheat about that, in that we don't just float this thing around in space and wait, you know, while we chew up the computer allocation, for it to dock there. We do start it there, but we do it because we know that those filaments are there. There's biochemical evidence that they're important. So what we're looking at is imagining that these are defining sort of a cone to get in. Then we imagine you go in with your fat guy or your skinny one or the pill, or you're oriented wrong. Those are the things, but we do cheat. I mean, I don't know if you'd call it cheating or not. We know those filaments are there. So, yeah. And then David, to answer the rest of your question, everything I showed before was vertical, and this is coming from the side. Over in here are other proteins. They happen to be intrinsically disordered proteins. So no one knows their structure, but as this thing comes through, they presumably grab it like a catcher's mitt and help it along more. And then another protein that they do know about, called CPSF6, displaces it. So there's a switch-off, and CPSF6 helps it go the rest of the way. And we're working now to do that. The problem with intrinsically disordered proteins is you have no idea, you can't get structure. So we'll have to start with some toy models, I guess. It's also amazing that these scientists determined these structures. It's really remarkable structural biology. We're really lucky they did that. The idea of what we do is, you know, experimental data is always nice to have, right? So I'm sorry, the question is, could he use my software and just go back to his lab, or does he need a lot more experimental data? It's always nice to have some experimental data, but our goal is, given a force field, right, so CHARMM or whatever, you can develop coarse-grained models. I will say that, you know, smallish proteins and lipids are pretty easy with this.
Really big proteins, where you get very aggressive with the coarse-graining, are pretty easy, if anything we do is easy. In the intermediate regime, it's harder. But yeah, you could use the software. And there are different options in the software for different things you can do. Yeah. Yeah. Yeah. Yeah. But just on the last two, and as someone who doesn't think about anything bigger than about 20 atoms, the last two systems look pretty similar. So, oh, you mean, is that transferable? Some day before I die, I'm going to write an essay for one of these fancy journals, if they'll let me, and I'm going to call it the myth of transferability. In other words, there is no transferability of anything except the Schrödinger equation. Okay? So, yeah, but it's worse than that, right? Because with the coarse-graining, we define it based on some statistical ensemble, right? For a certain number of particles of a certain kind. Okay? The notion that you can parameterize coarse-grained force fields and just pick them up and, you know, use them elsewhere, there's no fundamental basis for that. Yeah. So our philosophy is you take the system you want to study, coarse-grain it, okay, based on our codes or whatever, and you simulate that. But be very careful if you want to use that for something else. I'm not saying there's no transferability, but there are real issues with that. Yeah. That's an excellent question. So the question is, how do you decide on the level of coarse-graining? And that's an open question. We have algorithms where, if we say I want 25 coarse-grained beads, it'll tell you where they're best placed. But it won't tell you that you should have 25. And that was kind of one of my first slides there: there's really a deep question of how coarse you can go before your information content falls, you know, crashes, right?
And we do think that some machine learning ideas, these kinds of things, for identifying important correlations and when you lose those correlations, could help, right? But it's an excellent question and that problem is not solved. If you could solve it, that would be great. But I have a feeling it's problem-specific, just like this question about the binding, right? I mean, you know, it depends on what you're asking. But we do that by intuition now, pretty much. Yeah.