Okay, so it's great to be here. We're doing a tag team to talk about Open Force Field: what it's up to and where it's headed. And it's great to be here in person after so long. Just before this, some of us came from what's called the Alchemistry meeting, which is a meeting on binding free energy calculations. For those who don't know, these are starting to be a key tool in the pharmaceutical industry to help guide early-stage drug discovery, and force fields are a key ingredient in those calculations. So that's part of what Open Force Field is about. At the last Alchemistry meeting, which happened five years ago because the pandemic intervened, Open Force Field had not yet launched as a funded project. It started in October 2018. At this meeting, it was the most widely used public force field, at least by count of mentions in talks. So that was a big, exciting change for us.

A brief timeline. In January 2016, Chris Bayly arrived at UC Irvine in my lab on sabbatical to start essentially getting things going for Open Force Field. And I know the exact date because at the moment I greeted him my wife was in labor, and our last daughter was born that day. So Open Force Field is exactly that old. Then in 2018 it launched as a funded project, with pharma folks joining on, coordinated initially through MolSSI at Virginia Tech. OMSF didn't exist at that point; it was created later to help facilitate this. So that's how old my daughter was at that point. And here we are today. We have our first post-COVID meeting. Many of us are meeting each other for the first time, which is cool. And it seems Open Force Field is usually the most accurate public force field. And here's my daughter now. So she reads; she's making real strides. I'm just going to give some highlights of things you can hear more about or talk to us about.
And then Jeff and Lily will come up and tell you more about what's going on. One thing we've done is quite a bit of benchmarking of binding free energy calculations with Open Force Field across a wide range of targets. David Hahn and Vytas have done some really nice work on this, cross-comparing a bunch of different force fields across all of these different targets shown here, something like 1,100 binding free energy calculations. And it looks good in general. There's still room for improvement, but it's good, and as a result people are using it a lot.

At the same time, you can take that analysis and do a deep dive to see how generations of the force field are changing accuracy. There was actually quite a bit of this at the recent Alchemistry meeting. The big picture is that as we go between subsequent versions that are close to each other, people don't immediately see a huge impact on binding free energy calculations, because there are a lot of sources of noise in those. But you can dive in really deeply, as David has done, significance-test things, and ask: can you see an impact of changes in parameters that are used a lot in the free energy calculations? So this is after some nice significance testing; the bars here go up if going from 1.0 to 2.0 made the results worse and down if it made them better. You can see that almost all of the changes from our 1.0 to 2.0 version made the binding free energies more accurate. So we are making headway even in binding free energy calculations, where there's more noise, and we see a lot of headway with respect to quantum chemical geometries and energetics.

There are also some great tools that have come out that you can use, and you'll hear more about these. For example, BespokeFit: Danny Cole will be talking about this in the Open Force Field session tomorrow. This fits custom torsions for specific molecules or chemical series.
So on the top right you can see a graph with the original torsional profile for this molecule in orange, and then the quantum profile, and the profile refit with BespokeFit in blue. You can use a fast semiempirical quantum method, or your quantum method of choice, to refit the torsions, and it really does seem to improve accuracy. At the bottom right is the impact on binding free energy calculations for one particular system. Danny, I think, will tell us more tomorrow about what goes on here: there's a fragmentation of the molecules, and then fitting of torsions that will work along a chemical series. And it does improve accuracy. There are some numbers at the bottom right: RMS error is that right column, and you can see BespokeFit, at the bottom right, has far better RMS error on torsion profiles. So you can hear more about that if you want.

Also, the infrastructure we built as part of the consortium ends up being used for interesting things far beyond what we're doing. As an example, Danny Cole's group, with their own funding, has been able to build a general force field with a double exponential functional form. So that's a major change in functional form: fit with the same infrastructure, tested with the same infrastructure, full water model and everything. This is something that might have taken 10 years of work for somebody to do before, and they've done it in a short period of time. There's a simulation with it at top right, and at the bottom left is a bunch of solvation free energy calculations done with the force field. So now you can use it to do solvation free energy calculations and cross-compare with more conventional functional forms.
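As a rough illustration of what a change in functional form means here, the sketch below compares the familiar Lennard-Jones 12-6 form with one commonly cited double-exponential form. The talk does not give the expression or the steepness values that were actually used, so the formula, the function names, and the default `alpha` and `beta` values are assumptions for illustration only.

```python
import math


def lennard_jones(r, epsilon, r_min):
    """Lennard-Jones 12-6 energy, written so the minimum is -epsilon at r = r_min."""
    x6 = (r_min / r) ** 6
    return epsilon * (x6 * x6 - 2.0 * x6)


def double_exponential(r, epsilon, r_min, alpha=16.0, beta=5.0):
    """One commonly cited double-exponential form (assumed here, not taken from the talk).

    alpha controls the steepness of the repulsive wall and beta the decay of the
    attractive tail; the defaults are placeholders, not fitted values.
    The prefactors are chosen so the minimum is -epsilon at r = r_min.
    """
    x = r / r_min
    repulsion = beta * math.exp(alpha) / (alpha - beta) * math.exp(-alpha * x)
    attraction = alpha * math.exp(beta) / (alpha - beta) * math.exp(-beta * x)
    return epsilon * (repulsion - attraction)


if __name__ == "__main__":
    # Illustrative well depth (kcal/mol) and minimum-energy distance (angstrom).
    eps, rm = 0.2, 3.5
    for r in (3.0, 3.5, 4.0, 5.0):
        print(r, lennard_jones(r, eps, rm), double_exponential(r, eps, rm))
```

Both functions share the same well depth and minimum position, so the two parameters per atom type keep their meaning; the difference is in the shape of the wall and the tail, and in the fact that the double-exponential form stays finite at zero separation.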
There's also a group [name and institution unclear] looking at a polarizable exponential-6-8-10 force field, and you'll be able to hear more from them about that in a publication soon, I expect. They've been able to make a general force field that does that, and tested it on a number of things, including enthalpies of vaporization on the right, where it looks better than Sage. This is not something that our mainline Open Force Field is expecting to go towards soon, but it's cool, because people are starting to use this infrastructure to accelerate force field science in a way we haven't seen before. And then Willa Wang in Mike Gilson's group will be talking tomorrow about adding polarization, simply direct polarization, to our model and then refitting force fields with that, again using the same infrastructure. On the right is a graph of dielectric constants calculated from this: Sage is on the left and the polarizable model is on the right, so it improves dielectric constants a lot.

We recently put together a mission statement, and here it is. You don't have to read the whole thing, but part of the idea, at least for the industry-funded part, the consortium, is to keep making steady progress in improving the accuracy of force fields. We've run into problems occasionally where we have an idea and pursue it a whole lot, put a lot of effort into it, and in the end maybe it doesn't work out. So we're trying to focus a lot in the coming year on benchmarking: your ideas have to clear the hurdles, and the ideas that don't clear the hurdles are going to get neglected or dropped, even if they seemed to be really good ideas before you tried the benchmarking. That's critical, so that we make data-driven choices rather than philosophy-driven choices.
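One way to make "clearing the hurdle" concrete is a significance test of the kind mentioned earlier for comparing force field versions. The sketch below is a paired bootstrap on the difference in RMSE between two versions evaluated on the same systems. It is a generic illustration, not the analysis actually used for the results shown in the talk; the function names and the example numbers are hypothetical.

```python
import math
import random


def rmse(errors):
    """Root-mean-squared value of a list of signed errors (calculated minus experiment)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def paired_bootstrap_rmse_diff(errors_old, errors_new, n_boot=2000, seed=0):
    """Return (observed, ci_low, ci_high) for RMSE(new) - RMSE(old).

    The same resampled systems are used for both versions (paired resampling),
    and the interval is a simple 95% percentile interval.
    """
    if len(errors_old) != len(errors_new):
        raise ValueError("both versions must be evaluated on the same systems")
    rng = random.Random(seed)
    n = len(errors_old)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(rmse([errors_new[i] for i in idx]) - rmse([errors_old[i] for i in idx]))
    diffs.sort()
    ci_low = diffs[int(0.025 * n_boot)]
    ci_high = diffs[int(0.975 * n_boot) - 1]
    observed = rmse(errors_new) - rmse(errors_old)
    return observed, ci_low, ci_high


if __name__ == "__main__":
    # Hypothetical signed errors in kcal/mol for the same ten ligands.
    old = [1.8, -2.1, 0.9, -1.4, 2.5, -0.7, 1.2, -1.9, 0.4, 1.6]
    new = [1.1, -1.5, 0.8, -0.9, 1.7, -0.6, 0.9, -1.2, 0.5, 1.0]
    print(paired_bootstrap_rmse_diff(old, new))
```

If the whole interval lies below zero, the new version is more accurate on this set by more than resampling noise would explain; an interval that straddles zero means the change has not cleared the hurdle on this benchmark.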
Now, we do continue to make steady progress in force field accuracy, and that benchmarking is going to help us systematize things like this. This is a comparison we often run, looking at the cumulative distribution function for small-molecule RMSDs, or geometry errors, relative to quantum chemistry, so lower is better. You can see that as we go across subsequent versions of the force field we're making smaller and smaller geometry errors, and the same thing tends to hold true for other metrics.

You'll also hear, in this talk and in what's coming, about other cool things you can do or directions we're headed. For example, we can now handle proteins within our infrastructure, which makes it easier to simulate proteins in complexes, and that's part of heading towards easy handling of covalent modifications of biopolymers. Lily is building out a graph network charge model that could do really fast assignment of charges even for large molecules, which will have some important applications. So you'll get to hear more about the roadmap to come. I think we're really at the point with the force field where we've made clear progress. It's exciting, and people are really starting to use it, but there's so much still to do. At some level we feel like we're still just getting going, as much as I'd like to be done working on force fields and just go and use them. So you're going to hear more about this, but over to Jeff to talk about what we've done so far.

Thank you. Could I check, how is my audio over there in the back of the room? Everybody hear okay? Am I clearly being amplified by the microphone?
Okay, then I'll use my outdoor voice as well. Great. Hi, I'm Jeff Wagner, the infrastructure lead at Open Force Field, and in this part of the talk I'm going to try to convince you of three things: that our models work, that people are actually using them in the real world, and that you should be using them too.

So I'm going to start by explaining how we make our force fields. When we go to fit a force field, we start with an initial guess, shown on the left. In this case, this is a figure from the Sage paper explaining that we began with OpenFF 1.3.0 Parsley and then did a refit to two sources of data. One source was condensed-phase information: densities of mixtures and pure compounds, as well as enthalpies of mixing. We take our initial guess of a force field, pack boxes to match the experimental solutions, simulate these boxes to see what density we get and whether that agrees with experiment, and then try varying the nonbonded parameters up and down a little bit, redoing this, seeing if things get better or worse, and improving the force field by a sort of gradient descent method. Once we finished that nonbonded refitting, we went to a large dataset of QM-optimized molecules and their associated energies, and we tested whether the MM force field could match the correct ranking of conformers, the energy differences between them, and their geometries. We vary all the parameters up and down a little bit (these would be the bonds, the angles, the torsions, and the impropers) and use this again to do a gradient descent for the valence terms. Finally, we get a release candidate of a force field out of this optimization process. With this release candidate we go and benchmark it: we do benchmarks similar to how we did the QM training and the condensed-phase training, but
just on data that wasn't used in the training, and we also do benchmarking using free energy calculations.

Here's an example of what the condensed-phase fitting looks like. These are two types of properties that we fit to: on the left we have enthalpies of mixing, and on the right we have densities. The blue series is the initial force field, Parsley 1.3.0, and the orange series is Sage 2.0.0. You can see that before the refitting, for example on the right, densities were being systematically underestimated up at the high range, and by the end of the refitting everything was pretty much on the center line. The enthalpies of mixing are a little bit, well, more mixed, but statistically the training did improve them. So we're not aiming here to fit protein-ligand binding free energies; we're just taking these small, carefully measured interactions and training our force field to get them right.

Here's an example that Mobley showed of the QM benchmarking. This uses the industry benchmarking set that many of you participated in and contributed molecules to. We're showing a cumulative distribution function: we start with a QM-optimized geometry and feed that into an MM optimization using a number of different force fields. What happens during that MM optimization? Does the MM force field recognize that it's at a minimum and stay there, in which case the minimization would result in an RMSD of zero? Or does it not think it's at a minimum, because the MM potential energy surface has a minimum somewhere else, and wander off? A perfect force field here would have a vertical line at zero and then a horizontal line at y equals one. That would mean everything sticks right at the QM minimum when you re-minimize it using an MM force field. We don't have perfect force fields, but what you can see is that through successive generations of training our force
fields we do manage to capture more area under the curve. That means that larger and larger fractions of the 70,000-conformer dataset are remaining close to their QM minima.

And, well, great: we can get some densities right, we can get some enthalpies of mixing right, we can get conformers right, and maybe that's useful for some users who really want to use OpenFF for those applications. But I think the majority of people are interested in protein-ligand binding free energies. This is another figure from the Sage paper showing performance on a large set of protein-ligand binding free energies, I believe almost 600 ligands and 20 different proteins. It shows that Sage is an improvement over Parsley, that we're now among the best of the openly available force fields, and that we're approaching the performance of long-lived commercial force fields as well. So this is exciting. On the left is root-mean-squared error; you want that to be small. On the right is Kendall's tau, which is how well you rank the ligands in agreement with their experimental ranking; you want that to be large. So you can see things descending on the left and ascending on the right. Importantly, it shows that in successive iterations of our force field fitting we are seeing improvements even in things that we didn't fit to. We didn't fit to protein-ligand binding free energies, we fit to boring things, but it turns out that we're getting protein-ligand binding free energies more correct.

You'll notice that the numbers on some of these plots have been jumping around: sometimes I'm talking about Sage 2.0.0 and sometimes 2.1.0. That's because 2.1.0 just came out last week, so we don't have all of the benchmarks run on it yet. But Sage 2.1.0 is really exciting because, instead of taking the things that we're already doing quite well on and trying to get them 0.1 angstroms better, our fitting team for the last two years has been
getting reports like, oh, sulfonamides are tricky, or phosphorus in certain situations has trouble, or I'm seeing problems with these specific chemistries. We've had a lot of really helpful feedback, and using that we've expanded our training set for the force field so that it covers a bunch of chemistry. Some of our parameters we were already refitting, but for some of them we didn't have enough training data, so we weren't. Over the years we've been generating more of this training data and getting more user reports, so Sage 2.1.0 focuses on the outliers. It's the culmination of people identifying these chemistries and us smelling funny things in corners of chemical space. This is due to a huge amount of work, by the way, from the people on the right, Pavan, Trevor, Chapin, and Josh, who are all going to be giving talks later on. In particular, it's important to note that there are a lot more parameters included in the training, which means we're relying a lot less on just keeping whatever values were in the previous force field.

And on the bottom here we can see the sulfonamides. We've been haunted for a long time by problems with geometries of sulfonamides, and this cleans up a lot of those. Over here on the bottom right are plots of two angles, three-atom angles that get used in sulfonamides, before and after the changes that were made in OpenFF 2.1.0. You can see that things that were assigned a31 would sometimes deviate from their expected geometry, and things that were assigned a32 would deviate too, and they would deviate systematically in one direction or the other. It's not like, oh, chemistry is just complicated; no, things were systematically going to the wrong bin here. So Pavan and Trevor in particular did a lot of work to redefine the SMARTS patterns so that things are now falling into the correct bin. If you want to hear more
about this, Pavan's going to be giving a talk tomorrow in the Open Force Field room from 4:00 to 4:25 p.m.

And indeed this is cool. We said, oh, we're going to go outlier hunting, and we can make up all these narratives about what we're trying to do, but what happens when we actually look at the outcome? This is that experiment again where we take a QM-optimized conformer, re-minimize it in MM, and check how far it wanders away from the QM minimum. A perfect method, again, would have everything in the leftmost bin: all the RMSDs would be zero, and the MM stays right where the QM says a minimum is. That's not realistic. What's cool here, however, is that the leftmost bin, the things we were already doing well on and weren't focusing on, didn't get a whole bunch of new members. What we did get is an increase in the lower-range bin, which comes at the expense of taking a little bit off all of these large outliers at high RMSD. So the things that we were doing exceptionally poorly on before, we've now gone and wrangled in; we've brought the flock back into a lower bin. Again, if you want to hear more about this and the process behind it, definitely come to Pavan's talk tomorrow.
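The two views of the re-minimization experiment described above, the cumulative distribution and the binned histogram, can be sketched from a list of per-conformer RMSD values. This is a minimal illustration with hypothetical numbers and function names; it assumes the RMSDs between the QM-optimized and MM-re-minimized geometries have already been computed elsewhere.

```python
from bisect import bisect_right


def fraction_within(rmsds, thresholds):
    """Empirical CDF: fraction of conformers whose RMSD is <= each threshold."""
    ordered = sorted(rmsds)
    return [bisect_right(ordered, t) / len(ordered) for t in thresholds]


def bin_counts(rmsds, edges):
    """Histogram counts for half-open bins [edges[i], edges[i+1]); the last bin is closed."""
    counts = [0] * (len(edges) - 1)
    for value in rmsds:
        # Index of the bin whose left edge is the largest edge <= value.
        i = bisect_right(edges, value) - 1
        if i == len(edges) - 1 and value == edges[-1]:
            i -= 1  # a value exactly on the final edge belongs to the last bin
        if 0 <= i < len(counts):
            counts[i] += 1  # values outside the edges are ignored
    return counts


if __name__ == "__main__":
    # Hypothetical RMSDs in angstroms for the same five conformers under two force fields.
    older = [0.1, 0.2, 0.6, 1.5, 2.4]
    newer = [0.1, 0.2, 0.4, 0.7, 0.9]
    edges = [0.0, 0.5, 1.0, 3.0]
    print(fraction_within(older, [0.5, 1.0]), fraction_within(newer, [0.5, 1.0]))
    print(bin_counts(older, edges), bin_counts(newer, edges))
```

In the hypothetical data the high-RMSD bin empties out while the middle bin grows, which is the pattern the talk describes: the improvement shows up as outliers being pulled in, not as more conformers in the already-good leftmost bin.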
David Mobley also mentioned BespokeFit. This is for when you want to spend a little bit of extra computational time to get an extra-accurate result from your calculation. You can use OpenFF BespokeFit, which will take a molecule of interest and add terms on top of the general force field. So an input for BespokeFit would be Sage 2.1.0 and your molecule of interest, and BespokeFit will produce a new force field that's Sage 2.1.0 with additional torsion terms on the end of it for the torsions in that molecule. It will identify the torsions, fragment the molecule, do QM drives of those torsions to get a really accurate surface, and then train the new terms that get added to the force field based on those QM results.

And does it work? Well, yeah. Again, you don't just have to believe that if the QM geometry matching gets better and the QM energy matching gets better, then probably something good happens downstream. Josh and Danny and Simon really went out and tested this, I believe with the help of David Hahn, and the free energy results get better. On the right are the results of a version of Parsley that was available at the time, extended with bespoke terms, and on the left is the same version of Parsley without the bespoke terms. You do indeed see that basically all of the quality metrics get better. So if you have the time to do bespoke fitting, you can check that out. I believe we have an industry speaker this afternoon, William Lou, who's going to talk about some of his experiences doing that.

So I hope you now believe that our models work, and now I'm going to try to convince you that people are actually using them. It's not that we have some model that only we can run because it's so Byzantine. I'm going to show a couple of use cases. One is from a talk we had in an all-hands meeting, which came out of the blue for me, from Tobias Hüfner. He talked about using different small-molecule force
fields to investigate small-molecule crystal structures. He tried a whole bunch of them, and Parsley and Sage ended up working really well in this situation. Again, this is something that we hadn't trained the force field for, and it wasn't anything that we had identified as a big use case, but he used it, and it's a general force field, and it did a generally good job. If you want to come to Tobias's talk, it's going to be on a different topic, actually, but he's a really smart guy, and he's talking tomorrow in the Open Force Field room from 11:10 to 11:30.

You've also probably heard from us that we're coming out with this amazing Rosemary force field in the future. It's going to be a self-consistent protein and small-molecule force field, such that if you have some modification to your protein, the small-molecule parameters will just fit in there; the parameters will basically be indistinguishable. It's a huge amount of work, and you're going to hear about it in one of the talks tomorrow as well. But in the meantime we've been joking that if you wanted to do something in a pinch, you could put together a Franken-force-field: take things that weren't meant to be combined and just use the SMIRNOFF hierarchy rules to combine them. Josh Mitchell and I, on the bottom there, were having a check-in last year, just kind of joking around, and we ended up staying two or three hours past the end of our check-in and put together a little notebook: let's just Frankenstein together Amber's ff14SB and Sage and see what happens. We ended up with a notebook that can parameterize post-translationally modified proteins. During our workshop push we went ahead and ran a workshop, I think around October, which Rebecca Alfred attended, and she started using this workflow, and I'm really excited to hear where this has gone. Rebecca is at Janssen, I believe, and she's going to be talking about her experiences using this
for post-translationally modified protein modeling this afternoon.

Also, apparently people like free energy workflows. I had never run a free energy calculation in my life before I came to Open Force Field, and I still haven't, but apparently other people do, and we are being used in an enormous number of free energy calculations. This is just the stuff that came into my inbox in the last three weeks or so, by the way; there's so much of this happening. So if you're wondering whether there are free energy frameworks that can use Open Force Field, yes, they very much exist, and people have been using them. If you want to run free energy calculations with Open Force Field force fields, the tools are out there, and you'll be able to hear about that this afternoon from Richard Gowers and from William Woo, our industry speaker, as well.

And there are a million other places where our force fields are being used. We're getting integrated into commercial toolkits and into open source toolkits. There was the big COVID Moonshot consortium over the past few years, and they were using Open Force Field, so our force fields were on computers all over the world being used on huge datasets. There have been scientific papers. So in case you haven't started using our stuff yet: people out there are using it, it's not hard to use, and it's not hard to get started.

I wanted to give you an update, which is that in the past year we've added protein support to our ecosystem. Before, our force fields and our infrastructure were focused on small molecules, but in the run-up to our Rosemary force field we've been expanding the infrastructure. We've got this ff14SB port into SMIRNOFF format that you can use to parameterize proteins, and this simulation here was run using the Open Force Field Toolkit. It took 20 lines of code and
about a hundred seconds, and those 20 lines of code, if you don't believe me, are just there on the right. We'll be talking about this more tomorrow during the infrastructure talk, but this was the culmination of effort from a whole bunch of people, and you should look forward to setting up protein-ligand systems in Open Force Field in the future.

This is also going towards one of our big collaborative projects this year, which is alchemiscale. As I said, it's great that we can fit to QM and to condensed-phase data, but what a lot of people ultimately want to know is whether your force field works for free energy calculations. We've been in a working group with Open Free Energy, the Chodera lab, and other stakeholders in the past year, working on a large free energy calculation orchestrator called alchemiscale, which can dispatch bajillions of free energy calculations at a time. I won't go into a huge amount of detail about this because David Dotson's going to do that right before lunch, but this is a really exciting project, and it means we're going to be running a lot of free energy calculations as a routine thing: not even just when releasing force fields, but when making candidate force fields and doing experiments with different functional forms and different types of parameters. This is going to be a big help for us in expediting our force field release process.

And so now I want to convince you that you can too: that you can get into our ecosystem, learn to use our tools, make your own workflows, and get moving. Here I want to highlight that in 2020 we did a survey of the advisory board, and we asked them a whole bunch of things. We probably asked them too many questions, but one of the questions was, hey, how is the documentation? And the answer was: not very good. We got basically a 3.3 out of 5. We had a lot of comments specifically saying, oh, I had a problem and I
couldn't figure out how to solve it, and it probably would have helped to have this in the documentation. A lot of our users work at companies where they can't send us a bug report, so the more tools we can give them, and the easier we can make it to use those, the better the dividends for us. Around that time we hired Josh Mitchell. He's our scientific communicator, and he's got a real passion for clear, understandable, parsable documentation, and he's taken a huge amount of initiative to get the word out there and make our documentation easily approachable. Since he's been here, that same question, phrased the same way to the advisory board in 2023, gave us an average rating of 4.5 out of 5. We had a poll-type question asking about the quality, which went from bottom to top, bad to good, and everything was in the top two categories. All of the responses were, hey, these docs are hugely improved. So if you tried our docs in 2020 and found them lacking, or if you're thinking, oh Jesus, am I going to be wading through source code? You won't be wading through source code. The docs are there and you can get started.

On the far left are what we call wayfinding docs. Now that we have a number of packages, people don't know if they want an Interchange or a Molecule or what have you. Josh has put together this wayfinding figure where you can see the workflow: where the molecules go in, where the force fields go in, and what comes out in what format. You can click through these different colors to get to the documentation for that specific package or to the API for that specific thing. Josh has also added a lot of OpenEye-style theory documentation, which tells you what's going on in the tool not in terms of the commands that you put in the computer but in terms of the scientific theory, so you can understand if it's suitable for your use or
why it's taking so long, or anything like that. He's also updated our documentation theme. Things are clear now: you can easily pick out code blocks from other text, and you can see the arguments that go into functions. People have said really nice things about the parsability of our docs since that change went in.

And you should keep your ears open for more workshops that we'll be running later this year, like the post-translationally modified protein workshop. We found that to be really successful and a great way to engage with people, and we want to run a lot more of these micro-workshops: hey, here's a cool application, here are the Lego blocks to put it together. If you want to use it verbatim, go for it; if you want to understand the Lego bricks and put them together a different way, go for it. This seemed to be really popular, and it helped get us engagement, so keep your eyes open for more workshops. So I hope I've convinced you that our models work, that people are using them, and that you can use them too. Now I'm going to turn it over to Lily to talk about what we're going to be doing in the next year.

Cool, thanks, Jeff. So Jeff just gave a really cool overview of all the amazing things that we can do with all the work that we've done so far, and now I'm here to give a bit of an overview of some of the new stuff that's coming out: what we're working on in the upcoming year and a bit further into the future. These new projects are going to span both science and infrastructure needs, but I'm going to start off by highlighting those that contribute most to our core mission, which is bringing bigger and better force fields. As always, we will continue incrementally improving and updating what we already have, which so far has been a series of continuously improving small-molecule force fields. But over the next year we plan to expand the scope of our force fields
substantially, to cover both different domains of chemistry and increase the size of the molecules that they can apply to. First and foremost, what we want to do is move on from the small-molecule world and enter the vast region of biopolymers, starting with proteins. Our initial goal with the next major release that we're planning, Rosemary, is to fit a single self-consistent force field containing parameters for both small molecules and proteins. This has been one of the biggest projects we've been working on over the past few years, and it's been spearheaded by Chapin. He will talk about this in a lot more detail tomorrow afternoon, but I'm hoping here to give a bit of an idea of how much work we've already put into it and roughly where we are in terms of progress.

Using our infrastructure, Chapin has successfully generated a series of new quantum chemistry datasets to both train and benchmark a next-generation force field. These datasets have largely focused on profiling key torsions in proteins, both the backbones and the side chains. Now that they're generated, as part of our open philosophy, they're up online for everyone to use and work with, as Jeff and Josh will demonstrate in the infrastructure presentation tomorrow. Using this data, Chapin has been able to fit several force field candidates, which fall into a couple of different categories [unclear]. These are the null model, which is the force field fitted with this additional quantum chemistry data but without additional protein parameters, so using the same parameter specifications as in our most recent small-molecule force field, Sage 2.1; and the specific model, which does contain several additional protein-specific torsions. Each of these force field candidates is benchmarked on both small-molecule and peptide targets, with various metrics
based on QM targets like relative conformer energies, torsion profiles, and optimized geometries, which Jeff has already gone through, and we look at performance on both datasets. What's interesting is that the performance on the general small molecule QM targets is mostly comparable to Sage, which seems really good. Success, yay. But the torsion profiles for the peptides in particular can show large mismatches with QM. So a large part of the project so far has been fitting one of these protein force fields, benchmarking, looking at the problems, and then going back through the process: refitting and exploring different settings, like using different weights for the data, starting off with different priors, and possibly fitting to additional quantum chemistry data that we generate as needed. I'll let Chapin talk more about this tomorrow. I won't steal all his thunder, but just as a taster I'll give an idea of how the current protein force field candidates are doing.

These QM benchmarks are all very well and good, but mostly what people care about is simulation properties. So the other benchmarks that we use to evaluate the force fields include predicting NMR observables and protein stability over long simulations. On the left here I have a graph comparing the predicted NMR observables to experimental values for three different force fields. On the left we have Amber ff14SB, and in the middle we have our two different protein force field candidates, null and specific. Initially it looks like we have a long way to go: both of the OpenFF candidates perform worse than Amber on NMR observables. And the story actually gets worse if you look at the RMSD trajectories, because these upticks in RMSD... oh, can you see my mouse? I'm trying to move my mouse over this, but can you see it? Oh, that screen, the left of the screen. Okay, so is it visible over there? Okay, sorry. So these upticks
in RMSD, from a long-timescale simulation of a protein, show an alpha helix unwinding in one trajectory for both of the protein force field candidates that we're looking at. So yeah, the story's not great so far. But what's really interesting is that, whoops, if you replace the three-site water model, TIP3P, with a four-site water model, OPC, the OpenFF candidates become much more competitive. On the left, with the NMR observables, one of the force field candidates, the null model with OPC water, actually achieves comparable performance with ff14SB. Chapin has also conducted additional long-timescale simulations of proteins in OPC water and so far has not observed any helix unwinding. So this is maybe indicative that the problem is not so much with the parameters that we fitted, but potentially with the water model that we're using.

That brings me to our next project goal with Rosemary: to add self-consistency with water and ion parameters by optimizing our own water model and ions. Up until now we've been using the TIP3P water model inherited from Amber, but that has a lot of problems that we kind of already knew about, starting with the fact that it was developed with short-range electrostatic cutoffs and not for use with PME simulations. So we've been thinking for a while now that possibly we should be refitting our water model. For example, we saw that when we refit the van der Waals parameters in Sage, the enthalpies of mixing of aqueous mixtures improved slightly but remained systematically overestimated. That suggests that the common factor here, the water model, could be improved. Add to that the fact that Chapin's protein force field candidates get much better results with OPC, despite all the parameters being fit with TIP3P, and it suggests that this might be the next best direction to go for improvement. Right, so yeah, we're
refitting the water model because we think there's a lot of room for improvement. That should come out as part of the Rosemary line, hopefully with or right after the initial release of Rosemary.

After that, we're hoping to improve our force fields along an electrostatics pathway. We plan to expand our electrostatic treatment in two ways. The first is again within the Rosemary force field line, with the addition of neural network charges. Up until now we've been using the AM1-BCC charge model to assign electrostatic parameters to our molecules with our small molecule force fields, and that's worked pretty well. But AM1-BCC is a method that uses a semiempirical calculation to come up with the original AM1 charges, so realistically this limits the scope of the molecules that we can address to a couple of hundred atoms at best. If we want to work with proteins, we're going to need a way to assign charges efficiently, and a neural network model allows us to assign self-consistent charges to both small and large molecules. As you can see here with one of the prototype models that we have, we achieve efficiency scaling orders of magnitude better than a full OpenEye AM1-BCC calculation. Another advantage of using a neural network is that we can design the network so that it doesn't need a 3D geometry, so we avoid the dependence on conformer geometry inputs that you might get with AM1-BCC or other QM-based charge models.

Our initial goal, because we are releasing this as a drop-in replacement for Rosemary, is to fit our charges to reproduce AM1-BCC. That being said, we're not limited to only considering AM1-BCC. Once we release our initial product, this paves the way for improving the level of theory, possibly to a higher level of quantum chemistry theory. We've been working on this over the past year or two, and we have now
achieved models that perform similarly to our existing methods for calculating AM1-BCC charges, so we're quite confident that we will be able to release this as a minor update to Rosemary soon after the initial release. As Jeff mentioned, the infrastructure team has been hard at work putting together the ways to use graph charges, so we now have the infrastructure in place for people to play with. It's up online in the most recent OpenFF Toolkit release. There is a caveat: this is very much a prototype model, not the most up-to-date model that we have, and that's why it's wrapped in all these private modules with private names. But as will be demonstrated in the infrastructure talk tomorrow, you should be able to assign partial charges to proteins using this, in less than a minute in the case of the GB3 protein.

So that's Rosemary. I want to now talk about another really exciting direction that we're going in over the upcoming year, which is expanding electrostatics by adding virtual sites. Virtual sites are now one of the top priorities that we have for improving small molecule parameters. We know that it's difficult for classical fixed charges to properly represent the electrostatic surface around a molecule to the level of detail needed. For example, this is methyl bromide, and on the left here we have the QM electrostatic potential surface computed at HF/6-31G*. You can see a sigma hole here around the bromine. With classical AM1-BCC fixed charges, this sigma hole disappears. There's just no real way to represent it without adding a virtual off-site particle. And we know that this failure to represent the ESP properly does cause systematic errors in simulation properties like solvation free energies. So our goal is to create a virtual site force field. We will initially fit it to the electrostatic potential or the electric field at HF/6-31G*. This
is going to specifically target low-hanging fruit and known problems, such as the sigma holes that I just illustrated. We've already got a pilot study suggesting that this should improve our parameters and performance substantially. This will be a fairly large project: we will have to generate a swath of new QM data, and possibly more experimental physical property data, to both fit and validate against. Much like with the graph charges, we are initially fitting to HF/6-31G*, but we're definitely looking forward to investigating fitting to higher levels of theory once the initial product is out. The timeline for this is uncertain. We predict that virtual sites will come after Rosemary, but if a virtual site force field is ready before then, we are going to release it when we have it ready. So this timeline is more of a qualitative estimate than anything definite.

Beyond the virtual site force field, we have a lot of other exciting possibilities on the horizon. There are projects like automated chemical perception, fitting with surrogate modeling, moving towards a general neural network force field, incorporating polarizability, and exploring alternate functional forms. A lot of the talks tomorrow will cover these topics, so if you're interested, please do come to the OpenFF room and have a listen.

But the problem with having all these really cool things going on is that we have limited time and resources. So the question becomes: which features do we prioritize? At what point do we invest in major updates and changes to the infrastructure, and at what point does a feature make it into a force field release? At OpenFF, one way that we can classify these features is by putting them into three different bins. On the left we have the newest ideas and the newest science. These could be really cool, but they're still too new to know if they'll actually pay off. In
the middle, we have ideas that have been a bit more fleshed out, that the consortium or the staff members can start putting effort into. And then on the right-hand side we have the things that OpenFF has fully committed to both bringing and maintaining for its users, and that would include user-facing functionality and core deliverables. So what we need here is a way to assign features into these bins and a way for features to move between them, so that we can work out exactly how much effort to put in and how to prioritize new features coming up.

That's why one of the major projects in the infrastructure track coming up is bringing together this generalized benchmarking suite, for the OMSF staff members to start off with and then for general users. The goal of the generalized benchmarking suite is to make it easier to both implement and run benchmarks on your work. This package is going to be designed with a modular plugin architecture, incorporate multiple different data sources, be able to work with many different alternate functional forms, and work with a number of different analyses. We're already well into phase one, which will start by implementing this architecture and then follow up by including season-one-style industry benchmarks as well as the Sage release benchmarks. That's followed by phase two, where we may run a second season of benchmarking, but the main focus will be on expanding usability.

Usability is a major theme of the infrastructure focus for this year, so a lot of effort this year will center around making Interchange usable and expanding the range of importers to incorporate simple Amber, CHARMM, and GROMACS systems. We'll also look at improving the user experience in protein-ligand workflows, and the toolkit will expand its polymer loading infrastructure to support DNA, RNA, user-defined custom substructures, and so on. We'll
also support loading from formats like PDB and mmCIF, and some more formats. And finally, now that Josh has done such an amazing job with package-specific documentation, the main focus for improving the infrastructure there will be to centralize it all in one area and start developing centralized cross-package workflows. So all in all, we have a busy but exciting year ahead of us: breaking into new domains of chemistry, exploring more electrostatics, and making our products easier both to use and to evaluate. With that, I'll pass back to David to wrap up.

All right, so I'm just going to try to wrap up. To summarize, OpenFF seems to already be competitive, probably the best public small molecule force field, and we think 2.1, which we just released, is going to be better for some applications. Here are binding free energy calculations with it, with 2.0, and with some other methods. The 2.1 release, as you heard and can hear more about tomorrow, has made some improvements to the fitting process that actually improve accuracy relative to QM and, we think, make it a better force field in general. Some of the improvements are about how we fit, and a lot more are about details of chemistry and fixing problem cases. One of the interesting things is that we have better initial guesses for parameters now: they come straight from QM rather than starting from a previous force field.

In addition, independent assessment agrees that we're making headway. Essentially, we had previously built out a benchmarking infrastructure that allowed pharma and industry to run benchmarks on proprietary chemistry internally and then report back anonymized performance statistics. That showed more or less the same thing: performance is improving from one version to the next. On the top we're looking at some structural and energetic error metrics by bin for our force fields versus others, including GAFF
and OPLS4, which is the industry competitor. So our force fields are making progress there. The top one is for public data, and the bottom is for proprietary data, where they only reported back anonymized performance statistics. You can see that in the paper, but we're making clear headway, and this was an independent assessment. So we're excited about where we're going.

As I said earlier, I think at some level we still feel like we're just getting going, because there are things like virtual sites that we already know are going to improve force field accuracy, and we want to get those into a force field as soon as it makes sense to do so, although that will break some downstream things like binding free energy calculations with some tool sets. And as you heard, we're headed towards a consistent biopolymer force field.

We are pretty small. There's the initiative, which is everybody doing everything that relates, regardless of their funding source, and part of that is funded by the National Institutes of Health. But the consortium part, which is the industry-funded part, is roughly speaking these five people on the left, and some of them are fractional. I think we have really a small fraction of David Dotson, who you'll hear from later. So it would be great to have a little bit more, and we are continuing to actively recruit partners, if you know anybody who wants to join us. In the initiative, there are too many people doing related research to picture all of them.

There are a couple of changes. Pavan Behara, who is a new person on the quantum chemistry side, has moved to a permanent position and will still be giving us some advice for a while, so we have probably two new people coming in on the QM fitting and dataset side. We also have a new project manager, James Eastwood, who will be starting soon and who's going to be shared with Open Free Energy. Anyway, we want to thank everybody who supports Open Force Field: the
NIH and the consortium. We really appreciate all the pharma partners and industry partners, both past and present. There's a big list up on openforcefield.org. And the team and the alumni: it's a great community of people. Virginia Tech provided initial hosting through MolSSI, and then OMSF now. It's really been great working with MolSSI on this, and being part of helping to get OMSF up and going. We're excited about the ongoing interactions with Open Free Energy, and that's been great as well. And in some sense we stand on the groundwork laid by the Amber community in the past, in the 80s and 90s, so we appreciate that work as well. So thanks very much for your attention, and we probably have time for a couple of questions.