Okay, so today I'll be talking about the host-guest binding calculations in the OpenFF Evaluator. This project was started by Simon Boothroyd and Dave Slochower, and I'll be giving an overview of our effort to integrate pAPRika into OpenFF Evaluator. For those who don't know what pAPRika is, I'll talk about it in this presentation.
So the question is: why do we care about host-guest systems when what we actually want is a force field that describes protein-ligand systems? Host-guest systems may include both hydrophilic and hydrophobic interactions. Upon binding of the guest molecule, the host may undergo conformational changes. The binding affinity is comparable to that of some protein-ligand systems, ranging from weak to intermediate. The protonation states of the host can be predicted with high confidence. These systems have also been used in the SAMPL challenges to test force fields and computational methods. And compared to protein-ligand systems they are much, much smaller, so we can run the simulations quicker and longer. So we use host-guest systems as a good model to test, and hopefully optimize, our force field.
In particular, the hosts we'll be focusing on are cyclodextrins. These are cyclic oligosaccharides consisting of glucose monomers. The naturally occurring cyclodextrins are alpha, beta, and gamma, consisting of six, seven, and eight glucose monomers, respectively. They have a hydrophobic interior and a hydrophilic exterior. They can bind small-molecule fragments and drug-like molecules, and experimental binding data are available for a wide range of guest molecules.
The host-guest systems are stored in a repository called Taproom, which is available on GitHub. The repository contains two host molecules, alpha-cyclodextrin and beta-cyclodextrin, with 33 guest molecules: 10 of them are amines, 7 are cyclic alcohols, and 16 are carboxylic acids.
Together they form 43 unique host-guest pairs. The repository also includes structure files of the host-guest systems, and the mol2 files contain AM1-BCC charges. There are also YAML files, which contain information on how to set up the calculations, which I'll talk about in the next slide, and the experimental data, obtained from the reference below.
Although there is more than one way to estimate the binding free energy, here we'll focus on the APR method, the attach-pull-release method, which was developed in the Gilson lab. Basically, we first define dummy atoms and anchoring atoms on the host and the guest, which define a reaction coordinate describing the path that the guest molecule takes upon binding or unbinding. We then split the simulation into three stages. In the first stage, the attach phase, we turn on restraints on the conformation of the host and on the translation and orientation of the guest. We do this over 15 windows, and the work, or free energy cost, of turning on the restraints is obtained by thermodynamic integration. In the pull phase, we pull the guest out of the host over 45 windows. And finally, in the release phase, we release the restraints we applied on the host in a similar manner to the attach phase, but in reverse. For the guest, we don't need to run a simulation for this, because we can calculate the release free energy analytically.
Now, because of the asymmetry of the cyclodextrin host, there are two possible binding poses, so we need to perform the APR calculation twice, once for each binding pose. To get the final binding free energy, we add the Boltzmann weights of the two binding free energies and convert the sum back into energy space. So as you can see, there are a lot of steps involved in performing a single APR calculation.
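
The last step, combining the two binding poses, can be written as ΔG = -RT ln(Σ exp(-ΔG_i / RT)). Below is a minimal Python sketch of that formula. It is an illustration rather than the code used in pAPRika or Evaluator: the function name `combine_binding_poses`, the default temperature of 298.15 K, and the example inputs are all assumptions made here.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 0.0019872041


def combine_binding_poses(delta_gs, temperature=298.15):
    """Combine per-pose binding free energies (kcal/mol) into one value.

    Implements dG = -RT * ln(sum_i exp(-dG_i / RT)).
    The function name and default temperature are illustrative assumptions.
    """
    if not delta_gs:
        raise ValueError("need at least one binding free energy")
    rt = R_KCAL_PER_MOL_K * temperature
    # Shift by the most favourable pose so the exponentials cannot overflow.
    dg_min = min(delta_gs)
    boltzmann_sum = sum(math.exp(-(dg - dg_min) / rt) for dg in delta_gs)
    return dg_min - rt * math.log(boltzmann_sum)


if __name__ == "__main__":
    # Illustrative inputs only, not results from any calculation.
    print(combine_binding_poses([-3.0, -2.0]))
```

Because the Boltzmann weights are summed, the combined value is always at least as favourable as the better of the two poses, and two equally favourable poses lower the result by RT ln 2.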
Thankfully, we have the pAPRika toolkit, which was developed by Dave Slochower and Niel Henriksen, who are now both alumni of the Gilson lab. The pAPRika toolkit is built on Python 3, and it automates a lot of the process. Right now it interfaces with the AMBER simulation program. We can create topologies and structures from a Python API with the tleap program. The host-guest restraints can be set up automatically, and the analysis of the MD trajectories can be done with either TI or MBAR. It is available on GitHub, and it was used recently to benchmark the 43 host-guest systems with the SMIRNOFF99Frosst force field, version 1.0.5.
Okay, so our goal is to take this toolkit, the APR calculation of the host-guest systems, and integrate it into Evaluator. Now, this is a very simple diagram of the infrastructure. If you look at David Mobley's presentation or some of Simon Boothroyd's presentations, they have a much more sophisticated diagram. Given a force field defined by the OpenFF toolkit, the OpenFF Evaluator will estimate physical properties through simulations, such as, but not limited to, densities, dielectric constants, and heats of vaporization. Then we can assess how well the force field matches the experimental data. So we want to include host-guest binding in this process. And this is not limited to just benchmarking force fields. Evaluator can also be used to optimize the force field parameters, and it does so by integrating with ForceBalance, which incorporates both QM data and experimental data. So we would like to include host-guest binding calculations as part of the optimization process to see if we can get a much better force field. Evaluator was designed and programmed from scratch by Simon Boothroyd, and it's available in the OpenFF GitHub repository.
So, okay, this is just to show a very simple diagram of the pAPRika workflow that we have implemented in Evaluator.
The first step is to generate the topologies and the structures, which is done by the OpenFF toolkit using the information given in Taproom. Next we set up the host-guest restraints, for which we call the pAPRika toolkit. The MD simulation is done with OpenMM, and then the analysis is done through pAPRika. So this is the workflow for doing one APR calculation. Evaluator uses Dask, which is a Python library for parallel computing, and Dask can run these in parallel on a local cluster or on an HPC. So we can run all 43 host-guest systems in parallel.
Just to show a snippet of what the user needs to do: they specify the force field, select the data set, in this case the Taproom data set, and choose the property to estimate. Here we want the host-guest binding affinity through the pAPRika workflow. The rest is starting up the server and client, where we define how much compute resource to use. This is the general template for setting up Evaluator, and it's not unique to host-guest binding. For other properties, the user just needs to change the workflow options and the data set. So as you can see, there is very minimal effort on the user end, as everything is automated at the back end.
Okay, so as I mentioned at the start of the presentation, this is still a work in progress. It is available on the GitHub repository of Evaluator under the pAPRika integration branch. Most of the workflow is implemented. We just have a few minor bugs and glitches that we need to fix. Right now, the SMIRNOFF99Frosst and Parsley force fields are supported through the OpenFF toolkit. We also have support for GAFF 1 and 2 through the AmberTools program. By default, the simulation will use OpenMM, but we also have the option to run it with Amber. As I mentioned on the previous slide, we can run it on a local cluster or on an HPC. And our training data set is limited to the 43 host-guest pairs.
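
The parallel fan-out mentioned above, where each host-guest pair is an independent APR calculation, can be sketched as follows. This is not Evaluator or Dask code: it uses Python's standard-library `concurrent.futures` as a stand-in so that the example runs without extra dependencies. The function names and the host and guest labels are hypothetical, and `run_apr_calculation` is a placeholder that does no simulation.

```python
from concurrent.futures import ThreadPoolExecutor


def run_apr_calculation(pair):
    """Placeholder for one full APR calculation on a (host, guest) pair.

    A real task would build the system, run the attach, pull, and release
    windows, and analyse them; this stand-in only returns a status record.
    """
    host, guest = pair
    return {"host": host, "guest": guest, "status": "finished"}


def run_all(pairs, max_workers=4):
    """Run every pair as an independent task and keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_apr_calculation, pairs))


if __name__ == "__main__":
    # Hypothetical labels, not actual Taproom identifiers.
    example_pairs = [
        ("host_A", "guest_1"),
        ("host_A", "guest_2"),
        ("host_B", "guest_1"),
    ]
    for record in run_all(example_pairs):
        print(record)
```

Since no pair depends on the result of another, the tasks can be handed to a scheduler in any order, which is what makes it possible to run all of the systems at once on a cluster.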
So what's next? In the not too distant future, we would like to benchmark the Parsley force field and see how well it performs with the host-guest systems. We need to implement the gradient calculation, which will be needed for the parameter optimization. We also want to include a workflow to calculate the binding enthalpy, because we have the experimental numbers for that as well, and to add support for implicit solvent as part of the host-guest calculation, since there is interest in this in the Gilson lab. And last but not least, we would like to expand our training data, since 43 is a bit small. We might include host-guest systems other than cyclodextrins.
I would like to acknowledge, obviously, Simon Boothroyd and Dave Slochower, and the OpenFF team, especially the software scientists Jeff Wagner and David Dotson, for technical support past, present, and future. And also the San Diego Supercomputer Center, where I've been running and testing the code. On that note, thank you for listening, and I apologize for my stuttering during the presentation.