Is this working? Yes, okay, good. So I'll be talking about the overall roadmap. There'll be a lot of detail over the next couple of days to fill this in, so this will just be a very high-level view of what we want to do and what is going on overall. The Open Force Field Consortium is part of the more general Open Force Field Initiative. The overall goals, and we'll talk about the Consortium goals within that, are really to develop the highest possible accuracy classical atomistic force fields for biomolecular and biocompatible molecules. We really do want something that is completely consistent when you go from proteins to polymers to DNA. The fundamental chemistries are the same, and the force fields should reflect that. That's an overall goal. Next, develop an open, scalable, extensible toolkit for automatically parameterizing classical atomistic force fields against experimental data. In order to achieve a lot of the things we've been talking about, we really need a toolkit that people can plug components into, that everyone can collaborate on, that can run programs at scale. What we've seen a lot over the past 20 or 30 years is one-offs that are specific to individual force field efforts, which really limit your ability to do large-scale testing, benchmarking, and improvement, and limit the ability of people to collaborate. Also, generate and curate the open data sets necessary for producing high-accuracy biomolecular and biocompatible force fields. We want the data accessible so everyone can see what went into a given force field, how it was used, and what its provenance is, which requires real effort at maintaining those data sets well.
And finally, we want to answer important and unresolved scientific questions about the cost-benefit of adding different types of detail to the molecular energy function. What level? Polarizability or not? What type of polarizability? This has been a debated question in the field for at least 20 years, and it really hasn't been resolved. I think part of that is because the necessary experiments actually require efforts of this size, with very large data sets with provenance and a scalable toolkit, to be able to run them. So that's the overarching goal of the Initiative. These are not things that are going to be done in one year or even two years, and not all of them are necessarily of immediate interest to pharma. So I want to talk next about the Open Force Field Consortium. Coming back to this: provide consistency across different molecular and polymer classes. I mentioned that a little before, but really, a hydroxyl should only depend on its chemical environment, not on whether it's in a protein or PEG or DNA. It should be the same thing. If we ever want to be able to easily embed something in some biocompatible polymer or non-covalently attach something to a protein, this has to be the requirement long-term. So what are the goals of the Open Force Field Consortium? That is the particular academia-industry collaboration meeting here today, and it has a much smaller initial remit. Deliver small molecule force fields that have extremely high coverage of chemical space and are completely open.
Everyone can go in, see the whole architecture, and use all the tools; there's no IP associated with any of the parameters involved. They should be better than the existing art, at least where "better" and "existing art" can be defined. That's intentionally a little vague, but it should be something you can go in and use almost immediately, that at least won't be any worse, with a clear route to improvement very soon in well-definable ways. And they should be consistent with current biomolecular force fields. The overall goal includes improving proteins as well, but we don't want to break compatibility with what works now. So we'll talk a little bit about how we're going to try to ensure that, since we're meeting at a medical center: first, do no harm. And we want to deliver the initial pass at the tools, data sets, and benchmarking capability necessary to improve the force fields. Now, the Open Force Field Consortium operates in synergy with other efforts of the Open Force Field Initiative funded through different routes, and we'll talk about that; that's probably mostly later, since the rest of the talks will be focused more on the Consortium goals. So, an overview of the overall framework. I talked about the necessity of having such a framework. First, a definition of the force field: we need to be able to define the functional form, the charge calculation method, and the fixed and adjustable parameters. Defining that is where we need to start. From those definitions, and we'll be talking a lot about exactly how we view this, what we're going to do now and in the future, everything gets passed to some sort of parameter optimizer. In the short term, we'll be using regularized least-squares fitting.
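To make the regularized least-squares step concrete, here is a minimal numerical sketch, not the actual fitting machinery: parameters are pulled toward their starting (prior) values by an L2 penalty while minimizing the residual against reference data. All names and numbers here are invented for illustration.

```python
import numpy as np

def ridge_fit(X, y, theta0, lam):
    """Regularized least squares: minimize ||X @ theta - y||^2 + lam * ||theta - theta0||^2.

    theta0 holds the prior (starting) parameter values; lam controls how
    strongly the fit is pulled back toward them.
    """
    n = X.shape[1]
    # Normal equations of the penalized problem:
    # (X^T X + lam I) theta = X^T y + lam theta0
    A = X.T @ X + lam * np.eye(n)
    b = X.T @ y + lam * theta0
    return np.linalg.solve(A, b)

# Toy example: two "parameters" fit to noisy synthetic reference data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
true_theta = np.array([1.5, -0.7])
y = X @ true_theta + 0.01 * rng.normal(size=50)
theta0 = np.array([1.0, 0.0])           # prior parameter values
theta = ridge_fit(X, y, theta0, lam=0.1)
```

The regularization is what keeps a refit from wandering arbitrarily far from a trusted starting force field when the data only weakly constrain some parameters.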
Eventually we want to move to Bayesian sampling, which provides a much larger umbrella that allows us to answer a lot of the questions we really want to answer. Then there's the physical property calculator, which we're putting together with an API-first design: you're not just handing stuff in and reading whatever text file comes out, but using a well-defined API that will allow us to swap in and calculate things like liquid densities and heats of mixing, and more complex things as well, up to binding free energies. And for that we also need an experimental property database. We'll be talking about that a lot as well: both condensed-phase experiments, which are essential, and quantum mechanical calculations, all of which we'll be discussing. Out of all this you get an optimized force field. The way we envision this is that as you add more experimental data, or change the experimental data going in, this is a well-defined workflow: you get out an optimized force field consistent with that data, eventually, in the medium term, with good uncertainty estimates and uncertainty quantification. So that framework, and how it fits into the science, is what we'll be going over the next couple of days. So what's ready now? There are some things that are ready now that are making this possible. First, there's the SMIRNOFF99Frosst force field, which people have talked about a lot, based on the idea of direct chemical perception. It's much more generalizable to large chemical spaces because, once you define your chemical environments, everything can be parameterized that way. David and Caitlin will talk about that. In preliminary benchmarks of density, dielectric constants, and hydration free energies, it's comparable in accuracy to GAFF, but it is set up so that it can now be optimized.
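A minimal sketch of what a plug-in architecture for the physical property calculator could look like. This is hypothetical: the class and property names below are invented for illustration and are not the actual OpenFF API. The point is the pattern: calculators for different properties register themselves behind one interface, so new property types can be swapped in.

```python
from abc import ABC, abstractmethod

class PropertyCalculator(ABC):
    """Base class for property calculators; subclasses self-register by name."""
    registry = {}

    def __init_subclass__(cls, property_name=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if property_name is not None:
            PropertyCalculator.registry[property_name] = cls

    @abstractmethod
    def compute(self, force_field, system):
        """Return the estimated property value for `system` under `force_field`."""

class DensityCalculator(PropertyCalculator, property_name="density"):
    def compute(self, force_field, system):
        # Placeholder: a real implementation would run a condensed-phase
        # simulation; here we return a fixed stand-in value (g/mL).
        return 0.997

def calculate(property_name, force_field, system):
    # Dispatch through the registry rather than hard-coding calculator types.
    return PropertyCalculator.registry[property_name]().compute(force_field, system)
```

With this shape, adding a heat-of-mixing or binding free energy calculator is just another registered subclass; the fitting workflow never needs to know the details.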
So this is something that can be used now, without any real degradation in performance. And there's an open force field toolkit. This will expand in features as we go, but the idea is that right now, although there's a little bit of tweaking you need to do, it will be essentially off-the-shelf usable in the very near future. We'll talk more about this later, of course. It can parameterize a small molecule using any standard MOL2 or SDF input, combine it with a protein and solvent, and run in AMBER, CHARMM, NAMD, GROMACS, or Desmond. To be clear, it does not derive new parameters itself right now; it's essentially the parameter assignment engine, the analogue of the atom typer, though of course we're moving away from atom types. So this is something you can build with essentially right now, with some hand-holding, and it will be part of a workflow that can be downloaded and run out of the box in the very near future. It's set up such that whatever improvements we make can be easily incorporated into a workflow, either now, depending on how much hand-holding expertise there is, or essentially trivially in the very near future. So, planned deliverables and efforts for the first year. I'm distinguishing between the things we're actually planning on delivering that you can start using, and the things we'll be working on to set up future improvements. Deliverables: a fully open force field toolkit for parameter assignment with support for the major small-molecule input formats; a small-molecule force field starting from SMIRNOFF99Frosst, with refit selected torsions, Lennard-Jones parameters, and bond and angle terms for the existing SMIRNOFF types, using regularized least squares through ForceBalance, compatible with the AMBER biomolecular force fields.
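The direct chemical perception idea mentioned above can be illustrated with a toy version of SMIRNOFF's "last matching pattern wins" assignment logic. In the real toolkit, SMARTS patterns are matched against the molecular graph; here simple predicates on an element pair stand in for SMARTS matching, and the parameter values are invented for illustration.

```python
# Toy illustration of SMIRNOFF-style hierarchy-free parameter assignment:
# rules are checked in order, and the LAST matching pattern wins, so more
# specific patterns listed later override earlier generic ones.
rules = [
    # (human-readable pattern, predicate standing in for SMARTS, parameters)
    ("[*:1]~[*:2]   (generic bond)", lambda a, b: True,                 {"k": 300.0, "length": 1.50}),
    ("[#6:1]-[#6:2] (C-C bond)",     lambda a, b: {a, b} == {"C"},      {"k": 620.0, "length": 1.526}),
    ("[#6:1]-[#8:2] (C-O bond)",     lambda a, b: {a, b} == {"C", "O"}, {"k": 720.0, "length": 1.41}),
]

def assign_bond_params(elem1, elem2):
    """Return the parameters of the last rule whose pattern matches the bond."""
    params = None
    for pattern, matches, p in rules:
        if matches(elem1, elem2):
            params = p   # later, more specific matches override earlier ones
    return params
```

Because every bond matches the generic first rule, nothing is ever left untyped; coverage of chemical space comes from the fallback, and accuracy comes from adding more specific patterns below it.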
So: a first pass at something that is better than the current SMIRNOFF99Frosst, which we'll be talking about as we go through, and an automatic benchmark suite, such that it's easy to see exactly how it's performing, assessing condensed-phase properties and compatibility with biomolecular force fields. This, I think, is going to be an important part: we're not just doing the parameterization, we're doing the validation on an automated benchmarking suite that you can easily add more components to. Also, an open, validated database of the experimental and QM data used for refitting. Whatever goes into the refitting, all that data is available in a database; of course, more will be filled in after the first year. And there's going to be a lot of effort happening behind the scenes on a number of tools: fragmentation, torsion drives and torsion fitting in quantum mechanics, a plug-in architecture for physical property calculation, and hierarchical modeling for accelerating the condensed-phase parameter searches. Having to run new simulations every time you tweak a single parameter is not viable for large-scale optimization, so we'll be working on strategies for that. There'll be a force field strike team, so that if users discover some problem, we can go in rapidly and fix it in a one-off way outside of the automated parameterization. And there's development of the Bayesian inference infrastructure that's going to make possible some of the larger-scale changes. For the second year, deliverables will be not just a single force field but a regular process for releasing improvements to force fields, on a refitting, benchmarking, and release schedule, integrating additional experimental data each cycle.
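One standard trick in the direction of avoiding a new simulation for every parameter tweak is configuration reweighting. This is a sketch of the general idea, not necessarily the specific method the initiative will use: reuse configurations sampled under the old parameters, and reweight them by the Boltzmann factor of the energy difference to estimate a property under perturbed parameters.

```python
import numpy as np

def reweighted_average(obs, u_old, u_new, beta=1.0):
    """Zwanzig-style single-state reweighting of an observable.

    obs:   observable evaluated on configurations sampled under the old parameters
    u_old: potential energies of those configurations under the old parameters
    u_new: potential energies of the same configurations under the new parameters
    beta:  1/kT in the same energy units
    """
    du = beta * (np.asarray(u_new) - np.asarray(u_old))
    w = np.exp(-(du - du.min()))   # shift exponent for numerical stability
    w /= w.sum()                   # normalized importance weights
    return float(np.dot(w, obs))

# Toy sanity check: if the parameters don't change, u_new == u_old and
# reweighting must reduce to a plain average of the observable.
obs = np.array([1.0, 2.0, 3.0])
u = np.array([0.5, 0.2, 0.9])
```

The catch, which is why more sophisticated hierarchical schemes are needed, is that the weights degenerate when the new parameters move far from where the configurations were sampled.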
That means more complex mixture data: not just single simulations but things like partition coefficients, chemical potentials, and host-guest binding thermodynamics. It's going to be a while before we can actually put ligand binding free energies into the parameterization, because of the expense and because of the unreliability of some of the experimental data, but host-guest systems, which Mike Gilson will talk about, provide an intermediate step toward that. We'll also start using the Bayesian inference framework to do things like sample parameter space broadly, escaping local minima: not just minimizing, but being able to sample the probability distribution of the parameters. And refining and expanding atom types; this is really important. We want a way to pick atom types that is not purely expert-driven. Again, "atom types" is not quite the right word, since we'll use chemical patterns, but we want a way to do that on a statistical basis. John will be talking about that later today. Also, providing next-generation bond charge correction models: a first pass at improved electrostatics, fitting the bond charge corrections. So we'll start to use the inference frameworks to do those more complex things. Main efforts: investigate the effects of different functional forms using Bayesian inference. Are they worth it? That's something we really want to think about. It's not something we want to deliver soon, but we need to start thinking about it: when we start optimizing what we have, what are the limits we're going to run into after a couple of years of this? We want to have the tools together to say, great, we know the limits now; how can we improve them, and at minimum cost? So, year three and beyond: continued version improvements. At that point, we should have something that works relatively well.
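A minimal, deliberately simplified illustration of what Bayesian sampling of a force-field parameter means in practice (synthetic data, one parameter, random-walk Metropolis; none of this is the initiative's actual machinery): instead of a single optimized value, you draw samples from the posterior distribution over the parameter, which both quantifies uncertainty and lets the sampler cross barriers between local minima.

```python
import math
import random

def log_posterior(theta, data, sigma=0.5):
    # Gaussian likelihood of the observations around the model prediction
    # (here the prediction is just theta itself), with a flat prior.
    return -sum((d - theta) ** 2 for d in data) / (2.0 * sigma**2)

def metropolis(data, theta0=0.0, steps=20000, step_size=0.5, seed=1):
    """Random-walk Metropolis sampling of the posterior over theta."""
    rng = random.Random(seed)
    theta, logp = theta0, log_posterior(theta0, data)
    samples = []
    for _ in range(steps):
        prop = theta + rng.gauss(0.0, step_size)       # propose a move
        logp_prop = log_posterior(prop, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, logp_prop - logp)):
            theta, logp = prop, logp_prop
        samples.append(theta)
    return samples

data = [1.1, 0.9, 1.0, 1.2, 0.8]     # synthetic "experimental" observations
samples = metropolis(data)
posterior_mean = sum(samples[5000:]) / len(samples[5000:])  # discard burn-in
```

The histogram of `samples` is the payoff: the spread of the posterior is exactly the parameter uncertainty that a point estimate from least squares hides.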
Improve functional forms where the evidence supports it. Again, this is year three; we don't imagine really changing the existing functional form for a couple of years. Off-site charges, polarizability, better Lennard-Jones functions, combining rules. By this point we'll hopefully have an easy-to-use optimization workflow to incorporate your own data internally, and new potential energy models or components. Part of the idea of having this toolkit be open is that if you need to do something locally, if you want to say, no, we strongly feel that this type of experimental data needs to be used, you can generate your own versions using the same toolkit and the same data, plus your additions. Integrating additional experimental data each cycle: you can imagine things like NMR couplings, protein-ligand binding free energies, protein crystal data. It's still unclear how we can use these, but in a couple of years it might be clear, and we want to be able to incorporate them as well. The Bayesian inference framework with quantified parameter uncertainty will be coming along, but getting it to work as well as we want will probably take a little longer. And the Bayesian inference framework is how we're going to be able to provide statistical evidence for the accuracy-complexity trade-off. Beyond that, one of the reasons for having these workshops is to figure out what else we should be putting on the long-term schedule. We don't know everything, and you all have a lot of ideas, so in the discussions, some things might not be possible in the medium term, but we want to put them on the docket to think about longer term. So, planned Initiative efforts beyond the Consortium. Like I said, we want completely self-consistent biomolecular, small-molecule, and polymer force fields. They should all be chemically consistent.
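One concrete form that statistical evidence for the accuracy-complexity trade-off can take is an information criterion, which only rewards extra parameters if they buy enough accuracy. This is purely illustrative: the model names, parameter counts, and residuals below are invented numbers, and BIC is just one of several criteria a Bayesian framework could use.

```python
import math

def bic(n, k, rss):
    """Bayesian information criterion for a Gaussian least-squares model:
    n data points, k fitted parameters, residual sum of squares rss.
    Lower is better; the k*log(n) term penalizes complexity."""
    return n * math.log(rss / n) + k * math.log(n)

n = 100  # hypothetical number of condensed-phase observables in the fit

# Scenario A: adding polarizability barely improves the fit.
bic_fixed = bic(n, 10, 4.0)          # fixed-charge model, 10 parameters
bic_polar = bic(n, 14, 3.5)          # + polarizability, 4 extra parameters
worth_it_A = bic_polar < bic_fixed   # small gain does not pay the penalty

# Scenario B: adding polarizability halves the residual.
bic_polar_b = bic(n, 14, 2.0)
worth_it_B = bic_polar_b < bic_fixed  # large gain justifies the complexity
```

The same comparison generalizes: whether off-site charges or new combining rules are "worth it" becomes a number you can compute from the fit, not a matter of taste.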
That is beyond the remit of the Consortium right now, but it will allow things like easy treatment of covalent ligands, support for non-natural amino acids, similar support for biopolymers like peptoids, or even just whatever matrix you're putting your molecules in, and improved proteins, nucleic acids, and lipids, especially in heterogeneous systems. Next-generation force fields: how do we combine these efforts with machine learning and neural network potentials? Is there a way to combine the physical picture with the informatics picture to better include multi-body effects and reactive systems? I think the elephant in the room for anyone doing physics-based computational chemistry is: what is machine learning doing? Our strong feeling, from talking with everyone, is that chemical space is too rugged to really get enough experimental data to do everything with machine learning alone, but clearly there are going to be techniques and ideas that can be borrowed. If there's somewhere we can combine this, constraining your machine learning with the laws of physics, there must be some way to do better than running full quantum mechanics that uses machine learning effectively. So that's something we're going to be thinking a lot about in between trying to get this other stuff working. We have an NIH proposal that is in revision right now; some people on the panel were extremely excited about it, so we're working on that revision. There's also a DARPA proposal to do a sort of automated parameterization and synthesis; apparently DARPA is not affected by the shutdown, so we'll be submitting it in the next ten days or so. We'll continue to find other funding as well, to be able to pursue the things that pharma management is not as excited about funding in the short term. And we'll be collaborating with other force field developers on automated benchmarking, SMIRNOFF support, et cetera.
A lot of people have been working on force fields, and we don't want to go at it entirely alone; we want to be collaborating with other people. In a lot of ways these are closed shops. I don't think Schrodinger is going to be playing along anytime soon, but AMBER developers and CHARMM developers and other force field developers are interested, and we want to work with them. So automated benchmarking, making sure that everything is supportable in SMIRNOFF: these are things that are also important, that we want to be working on in this process, and it's better if we all go into this together. And again, another thing we want to be doing at this workshop is finding out whether there are other things you're interested in that management won't fund for now. If they're great ideas, we'll find ways to get them funded; if they're going to make this effort work better, we'll do that. So what we'll be talking about through the rest of the next couple of days is the subgroups that are working on these components in the map that I showed before. I won't run through everything, but most of the subgroups will be presenting over the next couple of days to talk about what they're doing, how it fits into the overall framework, what the main efforts are, and the specifics of the deliverables. I've given very high-level thoughts here, but there's a lot of science in this chart that you'll learn about over the next few days. And that's what I have. What questions are there? Or comments? Let's use the mic; there are some people on Zoom. On one of the slides, there was a note about RDKit. Yes, yes, that's almost working; Jason's going to talk about that. That's one of our number one priorities right now. Jeff Wagner is going to take you through what we have right now. Right, I gave you the wrong person. You stand up, Jeffrey. Jeffrey is our software scientist.
So he's going to be your primary means of support, and it would be good to get to know him and to get friendly with him. If you want to know exactly the state of things, you'll hear the infrastructure overview talk, which is sort of our roadmap there, and you'll have hands-on time with the toolkit tomorrow afternoon, so definitely tune in for that. But that is the number one priority right now. It's not just as easy as plugging RDKit in, because it doesn't have all the functionality we need, so there are other components we have to borrow from AmberTools. You had on your slides that non-natural amino acids are not part of the Consortium, but they're part of the Initiative, right? Yes. Okay, so what's the plan, just a rough plan, of the Initiative there? Would you mind sharing that? I think the general idea, and this is something we've talked a little about but not much, is that if we're redesigning everything consistently, so that a given carbon in a given chemical environment is treated the same whether it's in a small molecule or a protein, you've automatically parameterized non-natural amino acids. That's the short version: if we're lucky, we'll get those anyhow. And of course, if there's more funding and more resources, it could happen sooner. There's sort of a phase break when you go from existing protein force fields to everything self-consistent, and that's part of what the NIH proposal should help with. It's going to take getting a lot of things together before we can really say, yes, here's a protein force field and it's as good as what exists currently. So I think it's important to say that the level of funding of the total Initiative controls the throttle on how fast things get done. It'll happen; it just might take eight years, unless we find other sources of funding.
That's why we're cultivating sustainable federal funding through NIH grants and trying other things; DARPA has large pots of money that are synergistic with what we're trying to do for molecular design, and we have a small NSF proposal, which was the sum total of the funding, two graduate students, before this initiative started. More people are signing on board, so please encourage your other industry colleagues who are not part of this as well. Part of this workshop is to collect information: if something is important enough, then we'll get the resources for it, and it'll happen sooner rather than later. We seem to have quite a significant group here; can we help with the federal funding in any way? Yes, I think that sounds like it could be useful, and we'll talk about it subsequently, because I'm not sure how best to take advantage of that, but it's a helpful suggestion. Do you envision a particular strategy for the quantum mechanical data, like requiring a particular functional, or will it be more of a free-for-all? Yes, we do, and that will come up. You'll hear more about this in the torsion-fitting talk later today, and there's other material on the electrostatics as well, so torsions and electrostatics are going to come in. In the near future it probably won't affect Lennard-Jones parameters and things like that, but more info to come. Just one quick remark: is it fair to say that one could assume that the purpose of this whole effort is to create a framework rather than good parameters? I would say they go hand in hand: we need a framework in order to do the science, but we also need good parameters. I would assume that once you have the framework, you build on that. Yes; the parameters will follow.
And of course, you need the parameters to validate the framework. Right, I agree with that too. Maybe what I'd say is that they go hand in hand. The main efforts at the beginning are going to be focused on the framework, because the framework is really what's going to produce a force field process, not just one set of force field parameters. But we do mean to deliver improved parameters: in the first year, we intend to deliver parameters that we can show are better, that emerge from the workflow. The rationale is that good parameters should be a side effect of good infrastructure. Thank you. Sounds like we're in close agreement, and we're about out of time. The way that I would envision it is that what we're actually trying to do here is create something that the world can then start using, and together we create good parameters. We can't produce all the good parameters ourselves, maybe some good parameters; that's more how I see it. Absolutely.