So I will talk today about the work I've been doing with the Open Force Field Initiative. Specifically, I'll talk about the torsion fitting pipeline, particularly generating data for torsion scans. So first, a little bit about the initiative. The Open Force Field Initiative is a collaboration between several labs and industry partners to improve force fields. We also have software developers and software engineers, who really help out with the software aspects of force field development. The goal of the initiative is to develop open, scalable toolkits that automatically parameterize force fields, to generate and curate open data sets that can be used for producing small-molecule force fields and for other applications, and to systematically improve force fields. So I will talk about generating data for torsion parameters. The pipeline includes several different aspects, but I'll focus only on fragmentation and indexing the database. So first of all, what is the torsion potential? What is the torsion energy, and why do we need to generate data for it? The torsion energy of a molecule is the energy of the molecule as it rotates about its central bond.
In most force fields, this is given as a sum of cosines, a truncated Fourier series, where K, the force constant, determines the barrier height, N is the multiplicity, and then you have the phase angle. The way the data is usually generated for these parameters is that you hold the dihedral angle, which is the angle between the two planes formed by atoms 1, 2, 3 and 2, 3, 4, at a certain value, then do a geometry optimization to allow the other degrees of freedom to relax, and calculate the energy. But it's important to note that the Fourier series is not fitted directly to the torsion scan energies. It's actually fitted to the residuals, what is left after the other force field terms are accounted for. So some people see this more as a dumping ground for all the errors of the 1-4 interactions. Anyway, that's what the torsion potential is. Now, QC torsion scans are quite expensive to generate because you need to do several geometry optimizations. Okay, so what does the torsion fitting pipeline include? First the molecule comes in and we need to enumerate the ionization states, protonation states, and tautomers, which is not that easy, but it's beyond the scope of this talk, so I will not talk about it right now. Once we have all the states of the molecule, the molecule needs to be fragmented in such a way that you don't destroy the chemistry the fragment has inside its parent molecule. Once you have the fragments, you have to run torsion scans, and sometimes you need to run multi-dimensional torsion scans, which means scanning several torsions together, which is even more expensive. This is currently done using the really amazing QCArchive project, which automates the torsion scans and then stores all the data and annotates it, and it's an open database that everyone can access. Now, once we have the data, the torsion parameters are fit to this data, either using ForceBalance or, in the future, using Bayesian inference.
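As a minimal sketch of the functional form described above, the truncated Fourier series can be written as E(phi) = sum over terms of k * (1 + cos(n * phi - gamma)). The function name and the parameter values here are made up for illustration and are not taken from any force field.

```python
import math


def torsion_energy(phi_deg, terms):
    """Evaluate a truncated Fourier series torsion potential.

    terms is a list of (k, n, gamma_deg) tuples: force constant k,
    multiplicity n, and phase angle gamma in degrees.
    """
    phi = math.radians(phi_deg)
    return sum(
        k * (1.0 + math.cos(n * phi - math.radians(gamma_deg)))
        for k, n, gamma_deg in terms
    )


# Hypothetical parameters (k in kcal/mol), chosen only to show the shape.
example_terms = [(1.5, 2, 180.0), (0.3, 3, 0.0)]

# Energy profile on a 15-degree grid, like the grid of a torsion scan.
profile = [(angle, torsion_energy(angle, example_terms)) for angle in range(-180, 180, 15)]
```

In a real fit the target for this function would be the residual energy along the scan, not the raw QC energy, as noted in the talk.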
So the full stack of the software involved in this pipeline has been developed by many different people. Daniel Smith, Doaa, and Levi have been working on the MolSSI QCArchive project, Lee-Ping and Yudong have worked on the geometry optimization and multi-dimensional torsion scans, and of course there's also Psi4, which does the QC calculations. Okay, so first I will talk about indexing the molecules for the quantum chemistry database. Why is this important? Different communities in computational chemistry have different representations of molecules. When we initially started talking about QCArchive, about an open database for QC molecules, we realized that different communities think about it differently. In the quantum chemistry community, molecules are represented as XYZ coordinates and elements, and that's it. There's no connectivity. That's just the way they think about molecules, and each conformation is in general considered another molecule. In the molecular mechanics community, we think of molecules as a conformational distribution, right? All the different conformations are the same molecule. And then in cheminformatics, it's just a graph. But it turns out that the cheminformatics syntax works really well for combining these different representations of molecules. The very popular ones are SMILES and InChI. SMILES stands for simplified molecular-input line-entry system, and it's basically a syntax for representing the graph of the molecule. InChI is from IUPAC. It's the IUPAC International Chemical Identifier, a standardized way of representing molecules. Now, these indices are great, but there are some issues. SMILES is really good because you can use it for substructure searching, and it's also a little more human-readable than InChI. But the problem is that there is no standardization for SMILES.
Most cheminformatics toolkits have their own canonicalization algorithm. Even though it's called canonical, it turns out, after working on this for a while, that it's really only canonical within a certain version of a toolkit. I've been maintaining cmiles now for a little under a year, and there were several times where updates in upstream software changed some of the SMILES in my tests. It's around 1%, so it's not that much, and most of it involved stereochemistry, which apparently is pretty complicated. That's why it changes. But if you want to make sure that your database is sustainable and keeps its data integrity, you need to always be submitting the molecules with the same SMILES, because otherwise how are you going to find them? These are strings, and the strings have to match. If they're different, you're either going to have redundancy or you're not going to be able to find the molecule. So what cmiles does is use different toolkits to generate these indices. It's not in production yet, it's still pre-alpha, but once it goes into production we are going to be pinning the toolkit versions so that we know which version of the toolkit generated the SMILES. That way we can always make sure that we're using the same SMILES when we're searching and when we're depositing. The other issue that cmiles addresses is that in these SMILES strings there is no order to the atoms in the molecule. The indices on the atoms are arbitrary. You can number them whatever way you want. The problem is that for the QC data, you have XYZ coordinates with symbols, and the order matters. When you submit the SMILES string and your XYZ coordinates, and then later want to come back and recreate the graph, you have a loss of information.
As in, if you generated the XYZ coordinates from a graph that was in one order, and then you create a new graph that's in a different order, it becomes hard to map the information you have in the database back to the indices in your graph. To get around that, we are using SMILES with tags. These are map indices, and the tagged SMILES can then be used as a SMARTS string to do a substructure search. That gives you back a mapping from the current indices in the graph to where those atoms are in the XYZ coordinates. Really, if you're always using the same toolkit, you don't need this, because most toolkits will generate the graph the same way. But the problem is that people use different toolkits, and most cheminformatics toolkits have a way to use SMARTS for substructure matching. So whatever toolkit you use, you can get the mapping of the coordinates to your graph. Does this mean that you need to have the same tags to be able to match something in the database? No. The SMILES with the tags are not used for searching. For searching we're going to be using just regular SMILES, and you can do substructure searching on that. The tagged SMILES are there if you want to take this data and put it into, say, OpenEye molecules or RDKit molecules, and you want to make sure you get the coordinates right. So it's like a canonical naming of the atoms, and not just the... Yeah. And since the ACS talk was about software sustainability, I spoke a little bit about testing and test coverage. I just want to give a shout-out to the MolSSI cookiecutter, which was started with Levi here in the lab. It's really great and makes it very easy to get all the hookups to the different tags.
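To illustrate the tagged-SMILES idea described earlier in this passage, here is a minimal sketch that reads the map indices out of a tagged SMILES string and uses them to reorder stored coordinates into the order of the atoms in the string. This is not how cmiles works: it stands in for the toolkit substructure match with a plain regular expression, and it assumes every atom (hydrogens included) is written as a bracket atom with a map index. The function names are hypothetical.

```python
import re

# Matches a bracket atom carrying a map index, e.g. "[C:1]" or "[N+:12]".
_TAGGED_ATOM = re.compile(r"\[[^\]:]*:(\d+)\]")


def map_indices(tagged_smiles):
    """Return the map index of each tagged atom, in string order."""
    return [int(index) for index in _TAGGED_ATOM.findall(tagged_smiles)]


def coordinates_in_graph_order(tagged_smiles, stored_xyz):
    """Reorder stored coordinates to follow the atom order of the string.

    Assumes map index i (1-based) refers to row i - 1 of stored_xyz.
    """
    indices = map_indices(tagged_smiles)
    if sorted(indices) != list(range(1, len(stored_xyz) + 1)):
        raise ValueError("map indices must cover every stored atom exactly once")
    return [stored_xyz[i - 1] for i in indices]


# Methanol, with atoms written in a different order than they were stored.
tagged = "[H:4][C:1]([H:5])([H:6])[O:2][H:3]"
stored = ["C", "O", "H(O)", "H(C)a", "H(C)b", "H(C)c"]  # labels stand in for xyz rows
reordered = coordinates_in_graph_order(tagged, stored)
```

With a real toolkit the same permutation would come from matching the tagged string against the molecule as a substructure query, which also handles atoms that are not written explicitly.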
I also want to talk a little bit about how I test cmiles. Given that version updates can change the canonicalization of SMILES, I have around 2000 molecules from DrugBank that I test every time to make sure that nothing changed. I have different environments that test the current version of the software and the older version, and that tells you when something changed: if you have tests passing in the older version but not in the current version, it helps you figure out what has changed and whether it's important enough to go and update the indices that you already have in the database. So while we're pinning the version, if the changes are big enough, then maybe we'll go ahead and update all the indices. This allows us to maintain the integrity of the data and decide if we do need to update the indices. Recently there was a change that I thought was pretty significant, where RDKit started considering nitrogen in a three-membered ring as stereogenic, and before that it hadn't. That's because it follows the IUPAC standard, and I think that would be something we would maybe update the database for. We're not fully in production yet, but that would be a change that's significant. So now we can talk about science. I'm going to talk about the other part that I've been working on, which is fragmenting molecules for quantum chemical torsion scans. Now, why do we need to fragment molecules for torsion scans? Number one is the expense. That's the obvious reason: torsion scans are expensive, and we need to run many geometry optimizations. And then look at the distribution of drug-like molecules in DrugBank.
These calculations grow, depending on the level of theory and method you're using, anywhere from like N cubed to like N to the sixth. So you want to minimize the number of heavy atoms you have in your molecule. You also want to minimize the number of rotatable bonds, because otherwise you'll have an explosion of torsions that you need to drive. That's the obvious reason. But another reason for wanting to keep your fragments small is that currently, in the force field, the torsion parameters do not incorporate correlations between different torsions. So you want to make sure to isolate the torsions and not convolute them with intramolecular interactions, and the larger your molecule becomes, the higher the probability you'll have those interactions, and it just becomes harder to keep it clean. So ideally you want the smallest fragment you can possibly have that is still representative of the chemistry of the parent molecule. So what are the problems when you're fragmenting molecules for torsion scans? Let's take a look at this biphenyl here, and we're going to look at the central bond. This is the corresponding torsion scan. If you look at it, it looks like a freely rotating bond. There are some barriers, but you can freely rotate this. But then, if all you do is protonate the nitrogen on this ring, your barrier heights go up. And if you deprotonate the oxygen, your barriers go up even higher. And then we have this zwitterion, where you end up with a scan that looks closer to a double bond than a single bond. So what is going on here? If you draw the resonance structure of this zwitterion, you realize that the central bond is part of the conjugated system, and therefore it makes sense that the barrier heights are this high.
But there are several problems here. Number one is that most cheminformatics toolkits won't consider this a double bond. It will be considered a single bond. That's one issue. Another issue: say you have the torsion scan of this biphenyl without the oxygen there. Can you use that torsion scan? Can you just fragment it, can you just cut that off? You probably shouldn't. And the other problem, which is more about chemical perception, is whether you can use the same torsion parameters for the neutral form and the zwitterion form. That's more of a question for chemical perception, which I'm not going to talk about right now, but I've done a little work on it. So the question is, can we predict when a bond will be conjugated? It turns out there is a measure that we can calculate. It's called the Wiberg bond order, and it's a measure of the electron population overlap between two atoms. You can calculate it from a semi-empirical QC calculation: it's basically the sum of the squares of the density matrix elements, built from the occupied orbitals, over the atomic orbitals on atoms A and B that are part of the bond. When you calculate these values, you get a value that pretty strongly correlates with what chemists think about when they think about multiplicities of bonds. In this case, the neutral molecule has a bond order around 1 and the zwitterion has a bond order around 1.5, which is aromatic. Now, if you take the barrier heights of these different molecules and plot them against the Wiberg bond order, you actually see that the relationship is linear, which I think was pretty striking. And the question is, how can we use this, and is this general?
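The definition given above can be written as W_AB = sum over orbitals mu on atom A and nu on atom B of P[mu][nu] squared, where P is the density matrix in an orthogonal basis (as in semi-empirical methods). The sketch below is a toy illustration with plain Python lists. The function name and the two-orbital H2 example are assumptions for illustration, not output from any QC package.

```python
def wiberg_bond_order(density, orbitals_a, orbitals_b):
    """Sum of squared density matrix elements between two atoms.

    density is a square matrix (list of lists) in an orthogonal basis;
    orbitals_a and orbitals_b list the basis-function indices that
    belong to atom A and atom B.
    """
    return sum(density[mu][nu] ** 2 for mu in orbitals_a for nu in orbitals_b)


# Toy closed-shell H2 in an orthogonalized minimal basis: one doubly
# occupied bonding orbital with coefficients 1/sqrt(2) on each atom,
# so every density matrix element is 2 * (1/sqrt(2)) * (1/sqrt(2)) = 1.
h2_density = [
    [1.0, 1.0],
    [1.0, 1.0],
]

h2_bond_order = wiberg_bond_order(h2_density, orbitals_a=[0], orbitals_b=[1])
```

For this toy density the result is a bond order of 1, the single bond a chemist would draw. Real molecules have several basis functions per atom, which is why the sum runs over both index lists.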
You can think about it this way: if it is general, can you extrapolate torsion parameters without having to run all the scans for all the different tautomers? Okay, so that was the motivating example that got me started on this project, but then I wanted to see what the Wiberg bond order looks like for drug-like molecules in general. So I looked at the distribution of the Wiberg bond orders for the FDA-approved molecules in DrugBank, and I separated the bonds that were in rings from the bonds that weren't in rings. I'm not going to be fragmenting rings, because we just know from intuition that we shouldn't be doing that, so I'm not really concerned about those, but it's interesting to see where the peaks are. If you look at the orange distribution, which is the bonds in rings, you've got peaks at 1 and 1.5, which is what you would expect, right? And then for the blue one, the bonds not in rings, you have a very high peak at 1, and then a peak at 2 and a peak at 3. These Wiberg bond orders were calculated using AM1, and there they tend to be slightly underestimated, so it makes sense that it's shifted a little bit to the right. What is it using? What goes into the input of the bond order, was it the charge? It's the density matrix. Okay. So now, the bonds that we are concerned about are the bonds that have Wiberg bond orders that fall between 1 and 1.5. And yeah, there is some density there, and these are the bonds that we're going to be focusing on. So what can Wiberg bond orders teach us about the chemical environments of bonds? As I said, the Wiberg bond order is a function of the electronic density, which as we know is a function of conformation.
So do the Wiberg bond orders change with conformation? Well, they do, but I also wanted to see how they change with conformation, so I generated a whole bunch of conformers for kinase inhibitors. I will be going through gefitinib as a representative example of what the data looks like. I'm only looking at the bonds that are not in rings. I have 230 conformers, and I calculated the Wiberg bond order for each one of them. If we look at the single carbon-carbon bonds, you can see that the Wiberg bond orders are pretty tightly distributed around 1. But when we look at the bonds involved in the ethers, you start seeing multimodality. And where it gets interesting is the bonds between the quinazoline and the benzene: you have the yellow bond and the purple bond. What you see is that the purple bond is bimodal. There are two peaks and a higher variance. But then you have that yellow distribution, which looks pretty tight. Why is that? Well, both of these bonds are conjugated. But if you draw the resonance structures for those bonds, what you find is that in the resonance structure where the double bond is between the nitrogen and the quinazoline, the negative charge is on a nitrogen, which is more stable. In the one where the double bond is between the nitrogen and the benzene, you've got a negative charge on a carbon. So it turns out that the purple bond is more strongly coupled with the quinazoline than the other bond is with the benzene, and it shows up in the distribution of the Wiberg bond order. Now, when I looked at it more closely, at the conformations of the molecule about that bond at the different modes, what I found is that in the mode with the higher Wiberg bond order, the conformation was planar, so that it can conjugate.
And where the Wiberg bond order was lower, it was out of plane, and that's where it can't conjugate. So the variance of the Wiberg bond order really tells us a lot about the chemical environment of the bond. Now, besides... yes? Has the whole second part of the molecule moved because it's favorable to be there, or was it driven there? So these conformations were generated by Omega, and they're generated using torsion libraries. So I didn't drive it. I just generated a whole bunch of conformers and looked at what the values are for the different bonds. But actually I'll show later that it's anti-correlated with torsion scans: the higher the barrier is, the more variance there is, the more modalities. I'll show it later. It's at the end. Okay, so besides the variance telling us about the chemical environment, we can also look at the correlations of the Wiberg bond orders, and the correlations tell us where the conjugated systems are. What we're looking at here on your left is the correlation map of all bonds against all bonds. The first thing that pops out is the benzene ring, where the alternating bonds are anti-correlated with each other across conformations, which makes sense when you think about aromaticity. Then we have the fused ring system, which also forms this block of correlation. And then we have this other ring, where the bonds are also correlated with each other. It's not as strong as in an aromatic ring, because it's not aromatic. But where it gets very interesting is, again, the purple and the yellow bond: the purple bond, which as we showed is more strongly coupled with the quinazoline, is also more correlated with the fused ring than the yellow bond is. Okay, that's just showing that.
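A correlation map like the one described above can be sketched as the Pearson correlation between every pair of bonds, computed over the per-conformer Wiberg bond orders. The code below is a minimal pure-Python illustration. The bond names and values are invented, and this is not the analysis code used for the slides.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlation_map(wbo_by_bond):
    """Map every (bond, bond) pair to the correlation across conformers."""
    names = list(wbo_by_bond)
    return {
        (p, q): pearson(wbo_by_bond[p], wbo_by_bond[q])
        for p in names
        for q in names
    }


# Invented bond orders for four conformers. The two ring bonds trade
# electron density, so one goes up when the other goes down.
wbo_by_bond = {
    "ring_bond_1": [1.40, 1.45, 1.50, 1.42],
    "ring_bond_2": [1.50, 1.45, 1.40, 1.48],
    "single_bond": [1.00, 1.01, 0.99, 1.02],
}
corr = correlation_map(wbo_by_bond)
```

In this toy data the two ring bonds come out anti-correlated, the same kind of alternating pattern the talk points out for the benzene ring.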
In the fused ring, is it possible to draw a border between the two rings by looking at the correlations, or can you not tell? Looking at the colors, you can't tell. Well, I actually looked at the Wiberg bond orders of these fused ring systems, where things start getting weird, because they're not equally shared with each other, and it does show up. But I haven't spent the time to draw it out and figure out whether the correlation matches up with that. I'm assuming it probably does. Like a three-ring system? I've done some, just to check what happens. If you look at a regular benzene ring, it's pretty much 1.5 all around. But with naphthalene and other ones, it's not that way: some are higher and some are lower, depending on whether they're in the center or on the outside. So it does follow that roughly. And then if you look at the ethers, the bonds in the ethers are very strongly correlated with each other, and they're also slightly correlated with the fused ring. You see that the red and grayish ones are more strongly correlated with the fused ring than the others. I'm not going to show it here, but it turns out that if you draw the resonance structures, that one is more stable when the other bond, between the nitrogen and the quinazoline, is a double bond in the resonance structure. So it shows up like that. And then the single bonds are not really correlated with anything. I've done this for all the kinase inhibitors that were FDA-approved as of last year, and they mostly follow similar patterns. So the Wiberg bond order tells us a lot about chemical environments through its dependence on conformation, right?
But if it depends on conformation, can we actually use it as a meaningful indicator of the chemical environment? To answer that question, I first looked at the standard deviation of the Wiberg bond orders with respect to conformation. What I'm showing here is the distribution of those standard deviations over different molecules. You see that most of the standard deviations with respect to conformation fall below 0.02. So the differences are not that big. To look at it even further, I did a brute-force combinatorial fragmentation of molecules. That means I took the kinase inhibitor set, fragmented everywhere, and then combinatorially rebuilt fragments. For each one of these fragments, I generated a whole bunch of conformers and calculated the Wiberg bond orders for all the bonds, and then looked at the distribution of the Wiberg bond order of each bond in different chemical environments. I'm just going to take you through one example of what this data looks like. Here we're looking at dabrafenib, and at the bond between the benzene and the sulfur. This is what the distribution of the Wiberg bond order looks like when this bond is in the parent molecule. Now, when I look at all the fragments that have this bond in them and look at the distributions of the Wiberg bond orders, this is what I get. You see that they obviously fall into three separate bins. Actually four, there's one on its own up there. The red one, I'm pointing to the red one, is the Wiberg bond order of that bond in the parent molecule. When I looked at the different fragments in these bins, what turned out to be different between them was the removal of the fluorines on the benzene ring.
In the yellow bin, all the fragments had two fluorines on the benzene. All the fragments in the light blue bin had only one fluorine, and all the fragments in the dark blue bin had no fluorines. But it gets even more complicated, or interesting, whichever way you want to look at it, depending on whether you have the ring attached to the nitrogen. If you don't have the ring, that decreases the Wiberg bond order. So this fragment that has two fluorines, where all the other ones with two fluorines are in this bin, ends up shifted to a smaller Wiberg bond order. And this molecule that has only one fluorine, where all the other ones fall in here, ends up in the yellow bin because it doesn't have the ring. And here you have the fragment without any fluorines, and that's also slightly shifted. So the Wiberg bond order changes with conformation, but it also changes with the chemical environment it's in. And the question is, can we use these changes as an indicator, how much does it change relative to the change with conformation, and can we decouple the steric effects from the resonance effects? With this set it's interesting: when you remove a fluorine, the electronics change. But the fluorine is also, it's not in the 1-4 interaction, but it's in the 1-5 interaction, and probably if you're going to be driving the torsion you would keep those two fluorines there, you wouldn't get rid of them, because there are also some sterics. So I wanted to generate a set where all the changes would be only resonance and inductive, not steric. So I generated a set of phenyls with different functional groups. I generated a combinatorial set.
So for example, I used phenoxide as R1 for this phenyl, and for R2 I used all the other functional groups, and the same for each one of them. I ended up with around 96 molecules for each R group. Then for each one of them, I calculated the Wiberg bond order for that R1 group, and as you can see, these functional groups cover electron-donating and electron-withdrawing groups, so I get to see the different effects. Did you also do the ortho position on the ring, or just...? No, I just did meta and para, because ortho involves 1-5 interactions and I wanted to not have any of the sterics. I wanted to make sure that it's only resonance and inductive effects. Okay, so this is what the data looks like. Up until now, whenever I calculated the Wiberg bond order, I took one conformation at a time and calculated the Wiberg bond order for it. But if we're going to be using it as a surrogate, we don't want it to be so dependent on conformation. We want to be able to estimate the Wiberg bond order of a certain molecule whatever its conformation is. For that, OpenEye has ELF, which they call electrostatically least-interacting functional groups. I think Chris Bayly was the one who did that. It basically looks for conformers that are stable for AM1-BCC charges, so that the charges don't fluctuate a lot. The basic idea is that you don't want strongly electrostatically interacting groups to be close together, to avoid intramolecular interactions. So it generates conformers, finds the ones that are lower in energy, that have fewer of these polar groups interacting with each other, and takes an average over 10 of those conformers from the lowest 2% in energy. And it turns out, when I look at it, it's very reproducible, it's very stable.
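The averaging step described above can be sketched roughly as follows: rank the conformers by energy, keep the lowest 2%, take at most 10 of those, and average the bond order over them. This is only a simplified reading of the description in the talk, not OpenEye's ELF implementation. The function name, the energies, and the bond orders are invented, and the energy is whatever value the caller supplies for ranking.

```python
import math


def elf_style_average(conformers, fraction=0.02, max_conformers=10):
    """Average a bond order over a few low-energy conformers.

    conformers is a list of (energy, bond_order) pairs. The lowest
    `fraction` of conformers by energy are kept (at least one), capped
    at `max_conformers`, and their bond orders are averaged.
    """
    if not conformers:
        raise ValueError("need at least one conformer")
    ranked = sorted(conformers, key=lambda pair: pair[0])
    n_low = max(1, math.ceil(fraction * len(ranked)))
    selected = ranked[: min(n_low, max_conformers)]
    return sum(bond_order for _, bond_order in selected) / len(selected)


# Invented data: 500 conformers whose bond order drifts with energy.
toy_conformers = [(float(e), 1.0 + 0.001 * e) for e in range(500)]
averaged = elf_style_average(toy_conformers)
```

Because the selection depends only on the ranking, the result does not depend on the order in which the conformers were generated, which is consistent with the reproducibility mentioned in the talk.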
I recalculated it many times and I always get the exact same value. So you get an estimate that doesn't have the variance due to conformation. So this is using that Wiberg bond order. Here we're looking at the Wiberg bond order where phenoxide is R1, with all the other functional groups as R2. What's interesting is that for each one of them you get a distribution of bond orders, but what's also interesting is the trend that you see here, as the Wiberg bond orders become smaller with more electron-withdrawing groups. I'm just going to look closely at this nitrogen group here, just to show you what the data looks like. In this case I have these molecules with their Wiberg bond orders. These just have the nitrogen group, but this one also has other functional groups, and it turns out that the Wiberg bond order decreases with more electron withdrawing groups, which actually makes sense when you think about it. The nitrogen is electron-donating, so if you have a phenoxide, which is electron-donating, it's pushing electrons in, and then the nitrogen won't be pushing that many electrons back. But if you have the positively charged nitrogen, it's pulling electrons, and that allows the nitrogen to donate its electrons, so the Wiberg bond order increases. And that one there is meta? Yeah, the patterns are similar, as you see. It also increases the strength. Yeah. And depending on the functional group, for some of them the para effect is stronger than the meta, depending on whether it's donating or withdrawing. Is it true that the variance in meta is smaller than in para, because of the conjugation? I think so, but I don't remember exactly.
Yeah, you see the strongest effects with para, and you also see it in the difference between the phenoxide and the positively charged nitrogen. That's where you see the biggest difference. Okay, so then I looked at the distribution of the standard deviation for this set, and it turns out that the standard deviation is a lot bigger. It's definitely bigger than the changes that you see with conformation. What that's telling us is that yes, we could use changes in the Wiberg bond order as an indicator that the chemical environment has changed, because the changes that come with a change of chemical environment are statistically bigger than the changes that come with conformation. So for that set I showed before, I then generated QC torsion scans to see if we still have that trend I showed earlier with the biphenyl groups, where an increase in Wiberg bond order also increases the torsion barrier heights. Here we're going to look at this nitro group. The reason I'm looking at the nitro group is that in general it had a pretty low Wiberg bond order, so I was curious whether the trend is still there for the lower Wiberg bond orders. And it turns out it is. So this is what the torsion scan looks like for this bond. All of this was run using QCArchive, and the data is available in QCArchive. You can look at it and play with it. I used Psi4 to do the torsion scans, and Psi4 can also calculate the Wiberg bond orders. The stars are the Wiberg bond order for each conformation in the torsion scan. And as you can see, they're perfectly anti-correlated.
And it turns out that's the reason why you see the multimodality, and it makes sense that your variance is going to be higher with higher barriers: if your barriers are low, your distribution is going to be pretty tight, but if your barriers are high and the Wiberg bond orders are anti-correlated with the torsion scan, then that's what you're going to get. Here I'm looking at different molecules with different functional groups, with decreasing Wiberg bond orders, and you see the same trend. And when you plot it on a line, the way I did with the biphenyl, again you see a very similar trend where the relationship is linear. So here we're looking at more functional groups and more molecules. I'm just showing a few of them because later I'll show all of them, and that's a lot of data, so it's hard to look at; it's easier with less data on the plot. Most of them are following the same trend. And then this is plotting all the data, and again, most functional groups do follow the same trend. So it seems to be somewhat general. I wouldn't say yet that it's fully general, because it's just this phenyl with functional groups. There's still chemical diversity there, but it's pretty limited, and the trends are still there. What about the one group that is anti-correlated? No, there's actually a future slide; I'm addressing it. The plots where you're showing both curves. Which one? Yeah, here. So I knew that the tops of the barriers are correlated to the bond order. But could you actually reproduce these entire torsion scans by multiplying the Wiberg bond order by some constant? It looks like you could, right? Maybe. I guess you'd just look at correlations of different points on this curve.
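To make the linear relationship between Wiberg bond order and barrier height concrete, here is a small sketch of an ordinary least-squares line fit. The five data points are made up for illustration and are not values from the talk; the function name `fit_line` is likewise just a placeholder.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up series: one (bond order, barrier height) pair per molecule.
wbos = [1.00, 1.05, 1.10, 1.15, 1.20]
barriers = [10.0, 19.0, 29.0, 38.0, 48.0]  # kJ/mol, illustrative only

slope, intercept = fit_line(wbos, barriers)
print(f"slope = {slope:.1f} kJ/mol per unit bond order, intercept = {intercept:.1f} kJ/mol")
```

The fitted slope is the quantity that matters for the fragmentation threshold: it converts a change in bond order into an approximate change in barrier height.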
You could also plot the torsions as expected populations instead of as energies, and then they'd look the same. Yeah, yeah. Well, they're somewhat different. It's e to the negative U, right? It's not. Yeah, but at least the peaks will be at the same spot. Yeah, the peaks will be at the same spot. Yeah, but I mean, that would be interesting, but how would it be useful? Well, then you just need the bond orders. Yeah, but the problem is that if I want to get the torsion scan from the Wiberg bond orders of a torsion scan, I'm still going to have to do the torsion scan. Yeah, you get the bond orders from the torsion scan. So what I'm showing here is that I have a torsion scan, and because I'm using Psi4, which can also calculate the bond order, I already calculated it for those conformations. So I'm looking at specific conformations. Okay, got it. Did you show a slide where the dependence of the Wiberg bond order on conformation was quite small, so you could maybe compute the bond order for just one conformation? And then I spoke about that. Can you repeat that again? Yeah, so, definitely yes. But you showed... Oh yeah, the blue and red. Right. So there I was just generating conformers that are lower-energy conformers. Usually when you do conformer generation, you don't get all the high-energy ones, right? It's the conformational distribution that you would probably find. Yes. And that one is doing the full torsion scan. So does that answer your question? So, following up on Rafal's suggestion, where you compute a scalar from something that's cheap: can you compute a bond order from just one of your conformations and then use it? Well, we might be able to, especially if you have a conformer that's a good representative of the conformations, something like using ELF.
So the relationship that I'm showing here is using these ELF-estimated Wiberg bond orders, and that relationship is still there. So yeah, it definitely could be possible to do something like that. Are these means over a conformational sample, or what? Each Wiberg bond order is from the ELF conformers, as I explained. Basically it does an average over 10 conformers: it generates a whole population of conformers, then takes the 20% of the population with the least strongly interacting electrostatic groups, and then takes an average of 10 of those. I don't think there's a paper published on it. They write in the documentation that it seems to be pretty stable, and I'm also seeing that it's very stable. I do find that sometimes it underestimates the Wiberg bond orders, but I think that might be an AM1 thing; I don't know if it's an OpenEye thing or an AM1 thing. Okay, so again, this is looking at all the data. And yeah, there is an anti-correlated one. So I went and looked at all the ones that are outliers and not following the trend. I looked at all the scans, and I also looked at the scans that took a lot of time. It turned out there were two issues, which just shows how these things are never straightforward. In some cases where you have a trivalent nitrogen, it can be either pyramidal or planar. For the cases where you have these outliers, a couple of times I saw that for all the other molecules in that series the nitrogen was planar, and for the outliers it was pyramidal. All of them also had a phenoxide as the other group, and it might be because I wasn't using an augmented basis set with diffuse functions for the anions, and that might have created this issue. So we'll rerun it with diffuse basis sets.
The other thing I saw is something that TorsionDrive is supposed to address. When you rotate around the bond, there are other dihedrals, and if another dihedral is not in its lowest-energy conformation, then you might not have the lowest-energy conformation overall. So there were some cases where another dihedral was rotated, and for that barrier you end up not having the lowest-energy conformer, so you get torsion scans that are not symmetric: a very high barrier and then a lower barrier that is more aligned with the trend. There were around 10 scans like that which I have to rerun to make sure I get the lowest-energy conformer at each point in the scan. I hate taking out data, but I have to rerun them. So this is what the trends look like without them, and they're pretty consistent, I would say. The distribution of slopes? Oh, yeah, let's first go through this, but I have that. Okay, so for the fragmentation scheme: using all of this understanding of Wiberg bond orders, what is our fragmentation scheme? You take a molecule and calculate the Wiberg bond orders using an AM1 calculation. The AM1 calculations, for which I'm using OpenEye, are pretty fast: tens of seconds, so not that fast, but fast in relative terms. Then we find the rotatable bonds. I was using OpenEye's definition of whether a bond is rotatable, but then I realized it was missing some bonds in these systems, so I now have a SMARTS pattern that matches more of the bonds than what OpenEye was finding. Then for each of these bonds, I build out one bond in every direction. If a bond I built out to is part of a functional group or a ring, I keep that whole group or ring. And that's my minimal fragment.
Then you calculate the Wiberg bond order for those fragments. If the Wiberg bond order is within a certain threshold of the parent's, a threshold chosen by the user, then that is your fragment. Otherwise the difference is larger than the threshold. In this case I'm showing that the green ones are pretty close to what the parent molecule has. The red one has a difference of 0.1, which is actually pretty significant for Wiberg bond orders. In that case I add on different functional groups and recalculate until I'm within whatever threshold was chosen. Now the question is what threshold you should use. For that, I looked at the distribution of slopes. This is what the distribution looks like: the mean slope is about 188 kilojoules and the median is about 180. So if you do a rough back-of-the-envelope calculation, it turns out a change of 0.1 in the Wiberg bond order is roughly 7 kT in energy, which is pretty big. Using this, you can decide what your threshold should be, depending on how accurate you want to be. Okay. The next thing I'm going to show is newer stuff, and I haven't finished thinking it through yet. So how do you continue growing out the fragments? There are many different paths you can take. Initially I thought I was going to find different paths where some converge faster than others to the value in the parent molecule, and then I actually did it. I'm using different heuristics for the paths, for the case where I have a fragment I need to grow out and two or three bonds that I can go with.
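The back-of-the-envelope estimate quoted above (a 0.1 change in Wiberg bond order being roughly 7 kT) can be checked with a few lines of arithmetic. This sketch assumes the slopes are in kJ/mol per unit of bond order and that kT is evaluated at 298.15 K; neither the units nor the temperature is stated explicitly in the talk, and the function names are placeholders.

```python
GAS_CONSTANT = 8.314462618e-3  # kJ/(mol*K)

def energy_change_in_kt(delta_wbo, slope_kj_per_mol, temperature_k=298.15):
    """Convert a bond order change into an energy change in units of kT.

    Assumes barrier height varies linearly with bond order, with the given
    slope in kJ/mol per unit bond order.
    """
    return delta_wbo * slope_kj_per_mol / (GAS_CONSTANT * temperature_k)

def wbo_threshold(max_error_kt, slope_kj_per_mol, temperature_k=298.15):
    """Bond order change that corresponds to a tolerated energy error in kT."""
    return max_error_kt * GAS_CONSTANT * temperature_k / slope_kj_per_mol

mean_estimate = energy_change_in_kt(0.1, 188.0)    # using the mean slope
median_estimate = energy_change_in_kt(0.1, 180.0)  # using the median slope
threshold_for_1kt = wbo_threshold(1.0, 188.0)

print(f"0.1 change in bond order ~ {mean_estimate:.1f} kT (mean slope)")
print(f"0.1 change in bond order ~ {median_estimate:.1f} kT (median slope)")
print(f"bond order threshold for a 1 kT error ~ {threshold_for_1kt:.3f}")
```

Under these assumptions both slopes give a value between 7 and 8 kT, consistent with the rough figure in the talk, and inverting the relation gives a way to pick a threshold from a tolerated energy error.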
One heuristic is to look at the Wiberg bond order of those bonds in the parent molecule and go with the higher ones, because those are more conjugated to the system. Another heuristic is the shortest path to the bond that I'm trying to build the fragment for. So those are the two heuristics I'm using, and these are the two different paths. We're going to be looking at this molecule and that bond. It turns out the data looks more complicated than I thought it would, which also makes it a little more interesting. So we're looking at that bond, and in the parent molecule it has a bond order of around 1.08. We start out with the minimal fragment, which you get from just keeping everything around the bond along with the rings and functional groups. That's where you start. You might decide that's not good enough and you want to grow it out, so you add that nitrogen there. Okay, you're getting a little closer. But now there's more than one way to continue growing: you can either add that ring next to the nitrogen or add that other group (I don't know what it's called). Anyway, which one do you choose? Well, depending on your heuristic, you're going to add either the ring or the other group, but you see how different the Wiberg bond orders become now. And then you can continue growing depending on your heuristic. It turns out that here is probably where you should stop: this path is converging and it's pretty close to the parent. The other heuristic, using the Wiberg bond order, is still oscillating a bit; it goes up a bit before it gets closer. And then it gets even more complicated, because larger fragments are not necessarily better.
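The growing procedure described above can be sketched as a greedy loop with a pluggable ranking heuristic. This is only an illustration under stated assumptions, not the actual fragmentation code: the molecule is reduced to a toy graph of named groups, `wbo_of` stands in for an AM1 bond order calculation on a fragment, and every number except the parent value of about 1.08 is made up. The names `grow_fragment`, `NEIGHBORS`, and `TOY_WBO` are hypothetical.

```python
def grow_fragment(minimal, neighbors, parent_wbo, wbo_of, rank, threshold):
    """Greedily add neighboring groups until the fragment's bond order is
    within `threshold` of the parent's. Returns the (fragment, wbo) steps."""
    fragment = frozenset(minimal)
    steps = [(fragment, wbo_of(fragment))]
    while abs(steps[-1][1] - parent_wbo) > threshold:
        # Groups adjacent to the current fragment that are not yet included.
        candidates = {n for g in fragment for n in neighbors[g]} - fragment
        if not candidates:
            break  # the fragment is already the whole parent
        # The heuristic decides which group to add next.
        fragment = fragment | {max(candidates, key=rank)}
        steps.append((fragment, wbo_of(fragment)))
    return steps

# Toy parent molecule: a core around the target bond, a nitrogen, and two
# further groups attached beyond the nitrogen (all hypothetical).
NEIGHBORS = {"core": ["N"], "N": ["core", "ring", "X"], "ring": ["N"], "X": ["N"]}

# Stand-in for an AM1 calculation: made-up bond orders of the target bond.
TOY_WBO = {
    frozenset({"core"}): 1.00,
    frozenset({"core", "N"}): 1.04,
    frozenset({"core", "N", "ring"}): 1.07,
    frozenset({"core", "N", "X"}): 1.12,
    frozenset({"core", "N", "ring", "X"}): 1.08,
}
PARENT_WBO = 1.08

# Heuristic 1: prefer the connecting bond with the highest parent bond order.
by_parent_wbo = {"N": 1.20, "ring": 1.05, "X": 1.30}.get
# Heuristic 2: prefer the shortest path to the target bond (negated length).
by_short_path = {"N": -1, "ring": -2, "X": -3}.get

path_wbo = grow_fragment({"core"}, NEIGHBORS, PARENT_WBO, TOY_WBO.__getitem__,
                         by_parent_wbo, threshold=0.02)
path_short = grow_fragment({"core"}, NEIGHBORS, PARENT_WBO, TOY_WBO.__getitem__,
                           by_short_path, threshold=0.02)

for name, path in [("highest bond order", path_wbo), ("shortest path", path_short)]:
    print(name, [(sorted(f), w) for f, w in path])
```

In this toy setup the two heuristics take different paths and stop at different fragment sizes, which mirrors the point that the choice of heuristic and threshold both affect the result.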
So, say you have this bond right here. That's the fragment I'm looking at, and that's the parent. I add on the ether and I'm still good, pretty close to where I want to get. But then I grow it out a little bit more, I add this ring, and something changed in the electronics. So, yeah, these effects are pretty long range. Currently, for some of these where the differences are pretty big, I'm running torsion scans to see how different the torsion scans are. But I would say that if you're going to fragment, and you have a certain threshold that you're okay with, you try to go with whatever gives you the smallest fragment for that threshold. All right, so in summary, what we discussed: we use SMILES to index molecules, Wiberg bond orders can inform us about the chemical environment, and when we fragment we use the Wiberg bond order to tell us whether or not we destroyed the chemical environment. For future work, we've got to integrate all of this with QCArchive. Also for future work is to have some way to score the fragmentation, because there's not one clear-cut way of doing this: there are different heuristics for fragmenting and different choices for where your threshold should be. So I'm thinking about having some way to score how well it does overall if you have a set of molecules and you're fragmenting it using some parameters or some scheme. And with that, I'd like to thank the many people that have been very helpful. I want to thank John and everyone in the group, and Josh and Rafal for always listening. And then, of course, Chris Bayly and OpenEye have been really great. And those who have been active:
Lee-Ping, Yudong, David, and Caitlin, and everyone at MolSSI who created this QCArchive project, which is really, really nice. And with that, I will take questions. Is there anyone still online? Questions from online first. Yeah, let's do the online questions first. Anyone online have a question? Any questions? When you showed the distributions, you said they clustered into three or four groups. Which one? A nice plot of distributions, early on. Which one, this one? Yeah. So do you just cluster by bond order, or do you end up with some other ordering? So, okay, I did it two ways. First I sorted it by the ELF estimate, just so that I could see. I had so much data, and I just sorted it. And for the colors I used the score of the difference between the distribution and the parent molecule's. So I think these are color coded by the MMD, the maximum mean discrepancy, to the parent distribution. But yeah, this was a lot of data to look at. So there's nothing else online. Thank you for the talk, and thank you for organizing. Yeah.