It's about time to get started with our talks again. And sorry about the display. I was trying to hide the Windows taskbar, but I haven't used Windows for so many years that I don't remember how anymore, so that bottom strip of the screen seems to be blocked. For those of you whom I'm meeting for the first time, I'm Lee-Ping Wang. I'm an assistant professor at UC Davis and one of the academic collaborators in the Open Force Field Initiative, and today I would like to tell you about the strategy that we used to optimize the parameters in our first release of the optimized force field. Before I begin, I want to say that this was really made possible with effort and assistance from everyone in the collaboration, but in particular the three individuals that I'm showing on the screen here: Yudong Qiu from my group; Simon Boothroyd, a postdoctoral fellow supported by XtalPi in John Chodera's lab; and Daniel Smith, a software scientist with MolSSI. They not only went above and beyond in terms of the effort they contributed, but they also came up with many original and very innovative ideas that I think really made the difference between this working and not working. So without any further ado, we can start. Some of this will partially overlap with the introductory talk that David gave, which laid the foundation for everything we're going to be talking about in the rest of the day. Basically, as of this week, our first major round of parameter optimization is complete. That's not to say we're not going to do any more optimization and corrections, but we've now reached a point where we can start doing some really extensive testing and benchmarking and getting community feedback on what we're doing. So the current version of the optimized force field available online is what we're going to call our first release candidate.
The previous rounds of optimized results are also online; we're just not encouraging that you use the early ones. Use this one. There's a numerical versioning scheme that Jeff referred to: x is the major release, y is a minor release, and z is a bug fix release. And then we're also giving these releases code names. One reason why we're calling the first release Parsley is that we're hoping we can make four major releases, and then we could have Parsley, Sage, Rosemary, and Thyme. OK, so let me summarize, at a glance, the optimized parameters of the force field and the data that went into the optimization. First, which parameters were optimized? We looked at bond stretching and angle bending equilibrium parameters and force constants; the number of parameters is listed up there. We also looked at the barrier heights in the torsions. To be clear, we did not optimize the phases in the torsions, and we did not add any periodicities that were not already there, which means that if a term in the starting force field was one-fold, it stayed one-fold; we didn't add any two-fold or three-fold terms to a torsion term that was originally one-fold. There's also a total of 30 Lennard-Jones sigma and epsilon parameters, so 15 sigmas and 15 epsilons, that were optimized, for a total of 530 parameters. One of the arguments that we made from very early on for the advantages of the SMIRNOFF format is that it gives you a very compressed representation of a small molecule force field, in that the number of independent parameters is a lot smaller compared to the indirect chemical perception that uses atom types. Yes? Oh, you mean how many atom types do we have in here? No atom types. There are Lennard-Jones types, which I guess serve some of the same purposes as atom types, but they're not used to assign the valence parameters. All right. Yes? In the new optimization scheme, do you optimize everything at once?
Yes, in the protocol for optimization, do you optimize everything at the same time? For example, if the Lennard-Jones parameters are optimized separately, then you'd have 30 parameters and roughly 60 observables. That's an important question, and the answer is coming up. The short answer is that we did iterate back and forth between the bonded and non-bonded parameters, but just once; this time we did not fully co-optimize them, just because we ran out of time.

So again, at a glance, what data did we use to parameterize the force field? We used both data from ab initio calculations as well as experimental data, and a lot of effort went into generating and curating this data. Yudong primarily did the QM data generation, working closely with Daniel Smith, who created and maintains the QCArchive ecosystem, and Simon carried out the experimental data curation by pulling the information out of the ThermoML archive. The valence parameters are informed by ab initio optimized geometries and calculated vibrational frequencies; there's a total of almost 1,800 optimized structures and 900 sets of frequencies. The torsion parameters are informed by torsion drives, which are basically energy versus torsion angle profiles of constrained, optimized QM geometries; everybody here is more or less familiar with these. And then for the Lennard-Jones parameters, we look at the densities and heats of vaporization of molecular liquids. This is taking the historical cue from what was done to optimize the OPLS parameters back in the day, and we have a total of 39 liquid densities and heats of vaporization in use right now.

Okay, so, at a glance again, how were the parameters optimized? First, we have to describe the starting point. We started from SMIRNOFF99Frosst, the parameters that Christopher Bayly adapted for maximal closeness to AMBER parm99 and parm@Frosst. The parameters are optimized by a regularized nonlinear least squares procedure, which is implemented in the ForceBalance software developed in my group, and the parameters were optimized in three major stages. This answers the earlier question: we first fitted all of the valence and torsion parameters to the QM calculations, then we froze those parameters and Simon optimized the Lennard-Jones parameters to reproduce the thermodynamic properties, and then we froze those parameters and re-optimized the bonded parameters again. That gave us what we're calling the release candidate. Here is the current location of the optimized parameters, the fitting data, and the optimization output. We actually moved it from my group's repository to the more official force field repository, so this link needs to be updated now. The force field is provided in the OFFXML format, so you can use it for simulations right away. This repository includes really detailed release notes for each parameterization, which show our thought process from the beginning to the point where we are now. And the downloadable files include not only the parameters but a lot of really detailed information about the optimization: how well the current parameter set performs for each single torsion, each single optimized geometry, each parameter, you can find all of that in there. Okay. So this is a version of our software component and data flow diagram, and the parts in green are what I'm going to be talking about. The parameter optimizer takes quantum chemical information from QCArchive, and it also incorporates experimental properties and the differences between simulated and experimental properties, relying on Simon's property estimator toolkit to do that.
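As a cartoon of the three-stage protocol just described, here is a minimal sketch; the function and variable names and the numbers are hypothetical, not the actual ForceBalance interface. Each stage fits one set of parameters while the others stay frozen.

```python
# Toy sketch of the three-stage fitting protocol (hypothetical names;
# the real optimization is ForceBalance's regularized least squares).

def fit(params, targets, frozen=()):
    """Stand-in for one fitting stage: pretend each optimization moves
    every non-frozen parameter to its target value."""
    return {k: (v if k in frozen else targets[k]) for k, v in params.items()}

params = {"bond_k": 500.0, "torsion_k": 1.0, "lj_sigma": 3.0}
qm_fit = {"bond_k": 540.0, "torsion_k": 1.4, "lj_sigma": 3.0}
thermo_fit = {"bond_k": 540.0, "torsion_k": 1.4, "lj_sigma": 3.2}

# Stage 1: fit valence/torsion parameters to QM data, LJ frozen.
params = fit(params, qm_fit, frozen=("lj_sigma",))
# Stage 2: freeze valence parameters, fit LJ to thermodynamic properties.
params = fit(params, thermo_fit, frozen=("bond_k", "torsion_k"))
# Stage 3: freeze LJ, re-fit the valence parameters to QM data.
params = fit(params, qm_fit, frozen=("lj_sigma",))
print(params)  # the LJ values from stage 2 are preserved
```

The point of the sketch is only the freeze/re-fit bookkeeping: the Lennard-Jones values obtained against experiment in stage 2 survive the stage 3 re-fit of the bonded terms.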
After the optimization we end up with an optimized force field, and eventually we get to the new release. So at some point along the way we had to decide on a QM level of theory, and here we're mainly going to be talking about QM levels of theory for conformational energies, because arguably the torsion drives are the most important QM calculations that we are doing in this step, and an important component of a torsion drive, though not the only component, is the relative conformational energies. There are a few conformational energy benchmarks in the literature; there actually aren't that many, but here are two that we decided were done quite carefully, where they compared conformational energies from various DFT functionals against high level coupled cluster references. And there was a kind of surprising result: there exist small, double zeta basis sets published in the 90s that give accuracy in conformational energies very close to much larger triple zeta basis sets published much later. So we're going to be using a pretty well established global hybrid density functional, B3LYP, with Grimme's D3 correction, and for conformational energies we basically decided this was sufficient for the generation of our data set. In keeping with precedent, we are carrying out these calculations in the gas phase, although I think it's a very interesting and important discussion whether we want to do future calculations with implicit solvent. So which molecules do we use? David alluded to this. We started with 468 small molecules provided by Roche, and Xavier Lucas is our contact at Roche, so we're really thankful for those. From these molecules we identified 820 rotatable bonds involving four heavy atoms that are not in rings, so those are the most obvious candidates for torsion drives.
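A toy illustration of that rotatable-bond selection, on a hand-built molecular graph. The data layout here is invented for the sketch; the real pipeline used a cheminformatics toolkit on the actual Roche molecules.

```python
# Toy sketch of picking torsion-drive candidates: non-ring bonds between
# two heavy atoms that each have at least one other heavy neighbor, so
# that a four-heavy-atom torsion exists about the bond.

def rotatable_bonds(heavy_neighbors, ring_bonds):
    """heavy_neighbors: atom index -> set of neighboring heavy atoms."""
    bonds = set()
    for a, nbrs in heavy_neighbors.items():
        for b in nbrs:
            bond = frozenset((a, b))
            if bond in ring_bonds:
                continue  # torsions about ring bonds are excluded
            # each end needs another heavy neighbor to define the torsion
            if heavy_neighbors[a] - {b} and heavy_neighbors[b] - {a}:
                bonds.add(bond)
    return bonds

# butylbenzene-like toy: ring atoms 0-5, chain atoms 6-9
nbrs = {0: {1, 5, 6}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5},
        5: {4, 0}, 6: {0, 7}, 7: {6, 8}, 8: {7, 9}, 9: {8}}
rings = {frozenset(p) for p in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]}
print(sorted(tuple(sorted(b)) for b in rotatable_bonds(nbrs, rings)))
```

Note the terminal bond 8-9 is rejected because atom 9 has no further heavy neighbor, mirroring the four-heavy-atom requirement above.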
Because we're doing these in the gas phase, we don't want our fitting to be contaminated by the formation of strong intramolecular non-bonded interactions such as hydrogen bonds, so we filtered out the torsion drives where there were strong intramolecular hydrogen bonds, and we ended up with 669. Then there was another set of molecules, the coverage set. I didn't know as much about this set, so I don't have pictures of those molecules in this talk, but David did. This coverage set basically ensures almost full coverage of the SMIRNOFF parameters, and it leads to 417 more torsion drives; I think when we fully assessed the coverage, 481 out of 500 parameters were covered by the coverage set. We then performed energy minimizations of these to get local minima. There was actually a conformer generation step in here, although I'm not sure exactly what conformer generation procedure was used, so even though we don't have 1,785 molecules, that's how many local minima we have. And then we are going to tweak the parameters so that our MM-minimized structures are as close as possible to the QM local minima; that's actually precisely what Jeff did in his demo, so that is one of the things that we're doing here. At the lowest energy minimum, we do the frequency calculations, and we match the vibrational frequencies as well. I'll just briefly plug the QCArchive project, which Daniel is going to go into in a lot more detail. This is basically a quantum chemistry computation environment that will organize all of your calculations and also figure out where the cloud resources are to run them, and it's really great for organizing large data sets such as the ones we need for fitting our force field parameters. So the calculations are done on this ecosystem; QCArchive is going to be running our calculations.
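The geometry-matching idea just mentioned, tweaking parameters so MM minima land near QM minima, can be sketched as a toy objective over internal coordinates. The coordinate names and weights below are illustrative only, not the actual ForceBalance settings.

```python
# Minimal sketch of a geometry-matching objective: compare MM- and
# QM-minimized structures via internal coordinates (bonds, angles)
# rather than Cartesians. Illustrative labels and weights.

def geometry_objective(mm_ics, qm_ics, weights):
    """Weighted sum of squared internal-coordinate differences."""
    return sum(w * (mm_ics[k] - qm_ics[k]) ** 2
               for k, w in weights.items())

qm = {"r(C-O)": 1.43, "a(C-O-H)": 108.5}   # QM-minimized values
mm = {"r(C-O)": 1.41, "a(C-O-H)": 110.5}   # MM-minimized values
# e.g. weight bond errors (Angstrom) more heavily than angle errors (deg)
w = {"r(C-O)": 100.0, "a(C-O-H)": 0.01}
print(geometry_objective(mm, qm, w))
```

Minimizing a sum like this over many QM reference structures is what pulls the equilibrium bond and angle parameters toward the QM geometries.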
So what QCArchive actually executes, at a high level, is this TorsionDrive method. TorsionDrive does a recursive wavefront propagation of constrained energy minimizations. The reason we do this is that if you do a sequence of constrained energy minimizations, you can get hysteresis, where you fall onto a different branch of the potential energy surface with a reorganization of your orthogonal degrees of freedom. The thinking was that whenever you cross between these branches of the potential energy surface, you might be able to step back, do some constrained energy minimizations in the other direction, and end up with lower energy geometries. This was implemented in a recursive fashion, and that's essentially what TorsionDrive is. And then, because we need these constrained energy minimizations, TorsionDrive calls a software package called geomeTRIC, for geometry optimization. Every quantum chemistry package has an internal geometry optimizer; geomeTRIC is a little different in that it's an external geometry optimizer that calls any code you want for the energies and gradients, and it contains the internal coordinate system for the geometry optimization. We were pleasantly surprised that it's pretty robust in these constrained optimizations, because QCArchive has run more than 200,000 of these and we haven't had a single convergence error. QCArchive also implements the other types of calculations that we need, such as unconstrained geometry optimizations and Hessian calculations, which give us our bond and angle force constants. Once these calculations are finished on QCArchive, it's only a matter of some scripting to download all of them and convert them into a format that ForceBalance can read.
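Here is a deliberately simplified one-dimensional sketch of the wavefront idea behind TorsionDrive. The energy model and the "branch" labels are fabricated for illustration; the point is only the propagation logic, where each grid point keeps being re-seeded from its neighbors' structures until no constrained minimization finds a lower energy.

```python
# Toy 1-D wavefront propagation: each torsion grid point stores the best
# (energy, branch) found so far, and is repeatedly re-seeded from the
# structures at adjacent grid points.

def constrained_min(angle, start_branch):
    """Stand-in for a constrained optimization: the energy you land on
    depends on which conformational branch you started from."""
    penalty = 0.0 if start_branch == "low" else 1.0
    return (angle / 90.0) ** 2 + penalty

def wavefront_scan(grid, seeds):
    best = dict(seeds)  # angle -> (energy, branch)
    improved = True
    while improved:  # propagate until no grid point improves
        improved = False
        for a in grid:
            for b in grid:
                if abs(grid.index(a) - grid.index(b)) != 1:
                    continue  # only seed from adjacent grid points
                e = constrained_min(a, best[b][1])
                if e < best[a][0]:
                    best[a] = (e, best[b][1])  # found a lower branch
                    improved = True
    return {a: round(best[a][0], 3) for a in grid}

grid = [-90, 0, 90]
# a naive one-way scan got stuck on a high-energy branch at +90 degrees
seeds = {-90: (1.0, "low"), 0: (0.0, "low"), 90: (2.0, "high")}
print(wavefront_scan(grid, seeds))
```

In the sketch, the +90 point is rescued by re-seeding from the low-energy branch at 0 degrees, which is the hysteresis fix described above.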
Now let's talk about the selection of experimental data. This is what Simon did; I just took all of his SMILES strings and pasted them into ChemDraw, which is how I made this figure. These are the molecules for which we identified thermodynamic property data in ThermoML. The black ones have both density and heat of vaporization, the red ones have density only, and the blue ones have heat of vaporization only. I should point out that ThermoML only covers data published in the last 10 years, so a lot of relevant data is actually not in there, but we think this problem will become less serious when we move to thermodynamic properties of mixtures, where ThermoML really has very copious data. For now we're going to focus on small compounds with good parameter coverage, and our selected set of molecules covers 15 out of 35 non-bonded types. Even though this doesn't seem like much, it's actually quite difficult to do better, because a lot of the types we don't cover are things like ions, or HCl and HBr, and so on. So the 15 out of 35 types that we're optimizing actually cover a fairly good swath of chemical space. To do this optimization we use ForceBalance, and because I've given these slides before, maybe I'll go through this a little more quickly. This is basically a Python toolkit that carries out the nonlinear optimization of the parameters for you. It handles the construction of the objective function, which represents the amount of disagreement between your force field predictions and the reference data, and it automates the execution of the molecular mechanics code to calculate your energies and minimized geometries, or to run your simulations.
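The kind of regularized least-squares objective ForceBalance minimizes can be sketched like this. The toy prediction model, the prior widths, and all the numbers are made up for illustration; the real implementation handles many targets, weights, and parameter transformations.

```python
# Sketch of a regularized least-squares objective: weighted squared
# residuals against reference data, plus a penalty on how far each
# parameter moves from its starting value (the "prior width" sets how
# strongly that deviation is discouraged).

def objective(params, params0, prior_widths, predict, reference, weights):
    resid = sum(w * (predict(params, k) - reference[k]) ** 2
                for k, w in weights.items())
    # regularization: penalize deviation from the starting force field
    penalty = sum(((params[p] - params0[p]) / prior_widths[p]) ** 2
                  for p in params)
    return resid + penalty

params0 = {"k_torsion": 1.0}          # starting force field value
reference = {"barrier": 3.0}          # reference observable
weights = {"barrier": 1.0}
predict = lambda p, key: 2.0 * p["k_torsion"]  # toy model: barrier = 2k

trial = {"k_torsion": 1.4}
print(objective(trial, params0, {"k_torsion": 2.0},
                predict, reference, weights))
```

A wide prior width (weak regularization) lets a parameter roam to fit the data; a narrow one keeps it close to its starting value, which matters later when we look at how much each parameter class actually moved.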
Well, now it calls Simon's property estimator for that. One feature of ForceBalance is that it can handle force field parameters that have to obey relationships: you might want some parameters to sum to zero, or you might want other parameters to never change sign, because they represent physical quantities and can't just be anything. So ForceBalance implements a mapping between the unconstrained optimization variables and the physical force field parameters, and because of this mapping you can basically make the optimization obey any kind of constraint you want in your parameter space. Because time is limited I'm not going to go into this in too much detail, but if you want to learn how the objective function is constructed, I've laid it out pretty carefully.

Okay, so what did we do that was new in the past few months? In order to enable this parameter optimization we had to implement new terms in the objective function. The optimized geometry target is new. The way it works is that it matches the molecular mechanics optimized structure to the QM optimized structure, and to do this matching it first decomposes the structure into internal coordinates; the objective function is calculated from the sum of squared differences in the internal coordinates. In particular we focus on bonds, angles, and improper torsions, because the proper torsion degrees of freedom are what the torsion terms are for. Then we have a new torsion profile target. What we were doing before was fitting MM single point energies to these QM torsion energy profiles, but Christopher Bayly pointed out during one of our meetings that when you actually run a simulation, the molecular mechanics structure will largely relax, and that will give you in practice a very different profile than the single point profile that you fitted. So this new torsion profile target carries out a restrained molecular mechanics minimization prior to comparing the MM energies to the QM energies, and that's how we are going to fit all of our torsions. And in the third part, there was an existing vibrational frequency target, but we weren't able to calculate the frequencies in OpenMM before; this is implemented now. So we basically divided these tasks among the three of us, and now we can do all of this. Yes, the torsion atoms are frozen. It is in principle possible to use a package like geomeTRIC to freeze just the torsion angle and relax everything else, but that optimization is much more expensive, and we're doing millions of these minimizations in every cycle of the parameter optimization, so there is an approximate component to this freezing of the torsion atoms.

All right, and as for the thermodynamic properties, we decided to use the property estimator toolkit that Simon developed; he's going to tell you more about that. This is a toolkit for simulating physical properties, and we basically use it as a drop-in replacement for the code that was native to ForceBalance for calculating properties. The advantage is that it gives us a platform for improved performance and improved methods in the future, because there are lots of innovative statistical mechanical methods for estimating these properties more rapidly as we go toward future releases. So Yudong and Simon worked together to create the ForceBalance and property estimator interface: ForceBalance asks the property estimator for the properties, and also for the gradients of the properties with respect to the parameters; the property estimator hands these back, and ForceBalance builds the objective function and its associated gradients and runs the optimization that way. And so the
basic workflow of using ForceBalance is that you first mark the parameters that you want to optimize in the force field file, you set up your target folders containing the data that you want the force field to be fitted to, you create an input file that specifies your calculation settings, and then you run and wait for the results.

So now we can start talking a little bit about the results. I said that the optimization was run in three main stages, and here's the convergence behavior of the three stages. In the first stage we are fitting to only QM properties, and you can see that most of the decrease in the objective function happens in the first iteration; by the fifth iteration things are only changing by one or two percent, and our convergence criteria are met by the 14th iteration or so. That's when we went to the second stage. In the second stage, because you are calculating thermodynamic properties, there's some amount of statistical noise, which means the objective function can't converge as precisely as when you were fitting to the QM properties. So in the second stage you see things go up and down a bit, and you basically have to terminate that manually, because it's hard to define the perfect convergence criteria for it. Then, after we freeze the Lennard-Jones parameters from there, we fit to the QM properties again; that gives us the blue points here, and we end up with an objective function that is just very slightly higher than the one we got from stage one.

Okay, so here is a graph that describes the quality of fit for the torsion profiles; let me explain a little what this graph really means. Each point on the x-axis is one torsion profile; the gray curve is the starting value of the objective function, the red curve is the optimized value, and I've sorted them in decreasing order of the optimized objective function. The red curve is generally lower than the gray curve, which is expected because we're optimizing the objective function, but sometimes the gray curve is below the red curve, which means that the optimization does not uniformly improve every single torsion profile: most of them get better, but some of them get worse. Furthermore, the starting value of the objective function is not a very good predictor of the final value, and I think there are probably a lot of fine details going on here as to why certain torsions are easier to fit and others are harder. If we take a closer look at individual torsion profiles, you can see that some of them are fitted very well, like the molecule on the left, whereas the molecule on the right is not looking as good. Sometimes this has to do with a particular torsion type being applied to multiple molecules whose torsion profiles are very different, so you might need to add an extra parameter type. The cases that are not looking as good are things that we will continue to study in the future. Yes? Sorry, I'm losing my voice right now. Are you guys doing any train test splits for these torsional profiles?
We're not doing train test splits right now; I guess that's kind of what the validation is for. Maybe one reason why we're not doing the splits right now is that our set of molecules does not cover our parameters that many times; if our molecules covered each parameter a hundred times, then I'd be more comfortable doing the split. I would also say that we have lots of additional calculations that we haven't used for fitting yet that we're going to be testing on. It seems to me you're doing yourself a bit of a disservice, because the scale on the right plot is larger than the left one, so the errors aren't as bad as they look; but I'm curious, it seems that the fit got worse on the right? I think it got slightly worse. How could that happen? Well, it could happen if the same torsion type is assigned to two different molecules. What optimization procedure did you use? A trust radius Newton-Raphson. Thanks. I haven't looked at this much, so maybe I'm missing something, but there's an arbitrary vertical offset that one could apply; presumably the most dominant state would be the global energy minimum, and if you aligned those, one might get a different impression, because it looks like the peaks, or those two subsidiary peaks, are almost aligned. Yeah, that's totally true; you have to make a choice in how you align things, and we align all of the curves to the lowest energy structure in the QM. The reason we did that is that if you end up with an MM curve whose lowest energy structure differs from the QM one, then it will show up as having a negative energy on that y-axis, and that is an indicator that when you run the simulation, you could end up with an incorrect equilibrium. Sort of a corollary is that you've got them aligned on the QM energy minimum, but the MM one is shifted off to the right, so it's almost as though if you allowed a little bit of phase shifting, a little slop, then suddenly it might all come together better and might even optimize better. Yeah, that's true. The way we've aligned things right now is also the way the optimizer sees it, so if in this alignment MM is lower than QM anywhere and the QM energy is low, then it receives basically the maximum weight. But you're right: if the green curve were shifted upward, then we would see that the difference between the green and the blue is that the minima seem to be shifted inward a bit, which might be preferable to this. Yeah, I guess that's true. The way of aligning these curves is definitely not settled; there could be different ways to do it, but this way of using the lowest energy structure in the QM is how I've done it for at least a couple of years. I think that for the purpose of an MD simulation, you actually care about the heights of the barriers relative to the lowest states, right? So the optimized... oh wait, never mind, I'm reading it wrong. So you actually flipped... sorry, that's actually going to be quite bad, because the minimum actually changed to a different phase. Yeah, so the barriers actually got quite a bit higher; I just misread the plot, I'm sorry. Yeah, I think that we just have a few hundred torsion profiles; some of them get better, some of them get worse. I could pick maybe a hundred plots that look like the one on the left, and I could also pick maybe ten that look like the one on the right, and the ones on the right are always going to make us kind of uncomfortable, right? But I think every time the optimization actually makes an individual profile worse, that probably just means that there might be a certain
periodicity that we're not including in the parameter type, or that one parameter type might need to be split into two. I know we have discussed this point before, but since it's not possible to do PMFs with a quantum chemistry description of the molecule, it's too expensive, it would be nice to have a set of charges that you felt reflected the gas phase, so that you could fit the torsion profiles in the gas phase and thereby mimic more closely what the quantum chemistry calculations are. We looked at this a little bit: we determined some solution phase charges through quantum chemistry calculations, and then went back and determined gas phase charges that were used in the fitting of the torsion profiles. So that's just a comment; I'm still not really happy with it. I'd really like to do PMFs with the quantum chemistry, but maybe it's something to think about. Yeah, so the IPolQ scheme definitely has that physically grounded idea that in the gas phase your charges are going to be different from what you actually run the simulation with. Another possibility I've been mulling over is whether it might be possible to do both the QM and MM calculations with a solvent model, and I think some people have started to do this. Yes, and I'm going to make one more comment: I don't think we should rely on an implicit solvent model for water, because of water hydrogen bonds, right? I'm okay with using it for organic solvents, benzene, dichloromethane, things like that, but I am very concerned about using it for water. I am still a little bit confused about the right hand side plot: what are you aligning for the relative energies? I really don't get that, sorry. Here, this is the lowest energy structure in the QM, and every curve has that energy subtracted from each of its points; that's where all of the three curves are crossing. So for all three curves you subtract the energy of the lowest energy structure in the QM; is that really a relative energy? Well, one way I interpret it is this: if you could do sampling in the QM, you're probably going to get an equilibrium distribution that's centered around minus 75 degrees and maybe plus 75 degrees, whereas if you do the sampling in the MM, it might actually predict parts of your equilibrium distribution to be around negative 165 degrees. And I think that when we look at force field errors, the zeroth order error is the equilibrium distribution, and only after we get that right should we pay a lot of attention to getting the barriers right. But this is just really off the cuff and could be totally wrong. Another way of comparing would be to compute something like a Jensen-Shannon divergence between the probability density functions that correspond to these curves; that would allow for some lateral shift, and in some sense it would give you what you really want, which is the distribution. Yeah, I haven't thought about that before; I think that has a pretty good chance of working. I guess when it comes to relative probabilities of structures, once you get to maybe five or six kT it might be necessary to use an elevated temperature, but I think that sounds good. So I guess, is it okay to keep going? I know we're a little over. Let's see: we have torsion profiles, optimized geometries, vibrational frequencies, thermodynamic properties, and changes in parameters. So, the optimized geometries and vibrational frequencies we can go over kind of quickly. Here again is the plot showing that for the vast majority of cases, fitting to the optimized geometries really does improve the objective function, and here is the more detailed analysis of exactly what happens. Let me explain how to read these plots, the scatter plots that you
see there you see an orange one and a blue one these are these are mm minimized internal these are mm minimized like bond lengths versus qm minimized bond lengths for the initial parameters in orange and the optimized parameters in blue if the force field was perfectly accurate then everything would be on the dotted line and because the force field only has one equilibrium parameter because we're grouping by bond type that's represented by the vertical line here so on the left panel you can see that the original equilibrium bond length parameter is in orange and the new equilibrium bond length parameter is a lot larger and then after the optimization you can see that the points start to cluster a lot closer to the diagonal line which means that this is a positive result but you can also see that by optimizing the valence parameters we are not really able to control the distribution in the energy minimized bond lengths of the same bond type because we don't see these we don't see these actually collapsing to the diagonal line it's more like the cluster moves over to the diagonal line and then there are also cases where the spread in the qm optimized bond lengths is a lot larger meaning and that's when our optimization of the valence parameters isn't able to improve things by all that much those are cases where we might want to rely on maybe automatically identifying new bond types or maybe using bond orders to interpolate between bond types that's some really promising stuff and here you can see an example of an angle type that was assigned to several angles and molecules where some of the qm and mm minimized angles are around 130 but other ones are around 110 or 115 so this is kind of an example where especially when it comes to the case of angles there can be a lot of there can actually be a lot of frustration in the angle terms of the minimized structure in the sense that the minimized value of the angle might actually be pretty far from the equilibrium angle 
parameter, and that's definitely what some people would call frustration.

Vibrational frequencies I think we can go through a little more quickly. Here you can see the objective function decreases even more dramatically, and moreover the initial value of the objective function doesn't seem to vary too much across the different molecules. If you look at the computed MM versus QM vibrational modes, you can see that most of the error in the objective function comes from a few outliers in the high range, around 1000 wavenumbers and above, and after optimizing the valence parameters things fall much closer to the diagonal line. So we think we're doing okay on the vibrational frequencies. I should mention that we still want to move to internal coordinate Hessians; that just didn't quite make the cut this year, and we're going to do it soon. Yeah, I think those should be OH stretches. You mean, so, an OH stretch would be up here, and I think some of those improved, but maybe the biggest improvement is this one. I don't know exactly what that is; it could be an angle, and when I was putting this plot together I didn't have the machinery to interrogate it. Maybe Trevor can help me with that later.

All right, thermodynamic properties. This is perhaps the most challenging case, because to estimate a thermodynamic property you need to run an entire simulation. So here we turn to the property estimation toolkit to run the simulations and give us the estimated properties as well as the gradients. On the left are the densities, on the right the heats of vaporization; yellow is pre-optimization and blue is post-optimization. When you see the decrease in the objective function you know things are getting better, but you don't really know how much more accurate things are getting, and this plot gives you a sense for that. So where was the original force field not performing as well? For example, these points here: a very dense liquid, experimentally around 2500 kg/m³, which the force field predicts to be about 1900. That's dibromomethane, and the optimization corrects the parameters for bromine in order to bring the density closer to experiment. As for the heats of vaporization, you also see that there are generally improvements, but there is an outlier on the high side where, even after optimization of the force field, we get a heat of vaporization that is too high. This turns out to be not only a pretty polar molecule but also a carboxylic acid, so it's possible there are sampling issues for that carboxylic acid in the gas phase, because the flipping of that dihedral angle is just so rare. Okay, so that's the basic performance for thermodynamic properties.

As for the changes in parameters, you can see there is some variation depending on the parameter type. Sometimes that's just a function of certain parameters being much better determined, like a bond length, and other times we see much larger variations, as in the angle force constants. That could be partially due to the relatively large prior width, that is, the relatively weak regularization, that we set, but there might also be larger intrinsic uncertainties in those parameters from when they were initially derived. Here are the changes in the van der Waals parameters, which David already showed you, as well as the proper torsion parameters. The proper torsion parameters are the ones that change the most, but even the larger changes are on the order of 3 kcal/mol.

Okay, so this brings me close to the end. In terms of outlook, the force field is ready for
benchmarking and testing, and I'm really looking forward to hearing about your results, because this is just a starting point, right? We would like to use it as a good starting point to do better and better.

In terms of near-term development goals, there are some obvious things we'd like to do. We'd like to use those internal coordinate Hessians that I said we would do in January; we haven't done them yet. For the torsion drives, the Roche set included a lot of flexible rings that didn't really make it into this data set; that requires a torsion drive with an energy upper limit, which is implemented now, so we can do those. We're also interested in co-optimization of the Lennard-Jones and the bonded parameters, to see if we can do even better in terms of our quality of fit. For our longer-term goals there are many to discuss, so this is not at all an inclusive list, but for example there might be valence degrees of freedom, say certain angles with very large deformations, that need to be explicitly scanned; there's a study on that which is currently ongoing. We're interested in improved charge models; one potential improved charge model is the RESP2 method that Michael Schauperl has developed. And I'm also very interested in including thermodynamic properties of mixtures in our training data set; we think that might be a lot better than using heats of vaporization, because we're never really simulating things that are purely in the gas phase.

And so with that, I did attempt to list all of the folks who contributed to the current work as well as the ongoing work, but it's also very likely that I inadvertently left somebody out of this list. It's been great working with everyone so far, and I'm interested in hearing from you over the next couple of years as we try to figure out what's going on.
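As a purely illustrative aside, the Jensen-Shannon comparison suggested in the discussion could be sketched as follows. Everything here is hypothetical and not part of any OpenFF tooling: the synthetic "QM" and "MM" torsion profiles, the 15-degree grid, and the elevated temperature are all made-up assumptions, chosen only to show the idea of Boltzmann-weighting two energy profiles and comparing the resulting probability distributions.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def boltzmann(energies, temperature=600.0):
    """Turn a torsion energy profile (kcal/mol) into normalized Boltzmann
    probabilities; an elevated temperature, as suggested in the discussion,
    keeps high-barrier regions from vanishing numerically."""
    w = np.exp(-(energies - energies.min()) / (KB_KCAL * temperature))
    return w / w.sum()

def js_divergence(p, q):
    """Jensen-Shannon divergence (in nats) between two discrete
    distributions defined on the same grid of dihedral angles."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical QM-like and MM-like torsion profiles on a common
# 15-degree grid; the "MM" profile has its minima laterally shifted
# relative to the "QM" one, mimicking the mismatch described in the talk.
phi = np.deg2rad(np.arange(-180, 180, 15))
e_qm = 2.0 * (1.0 + np.cos(2.0 * phi))        # minima near +/-90 degrees
e_mm = 2.0 * (1.0 + np.cos(2.0 * phi - 1.5))  # shifted minima

jsd = js_divergence(boltzmann(e_qm), boltzmann(e_mm))
# jsd is 0 for identical profiles and bounded above by ln(2)
```

Because the divergence depends only on the probability densities, a rigid lateral shift of the whole profile changes the score smoothly rather than blowing up the way a pointwise energy error can, which is the "allow for some lateral shift" property mentioned in the exchange.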