Hello everyone, I'm Yuanqing, and today I'm going to talk a little bit about my project on fitting MM functional forms to QM energies and forces. A huge dilemma that the protein molecular simulation community has been facing for decades is that MM is very fast but not that reliable, while QM is considered the gold standard but is prohibitively slow. In recent years there has been a trend of using symmetry functions, like those of Behler and Parrinello, to summarize the local chemical environment of atoms and build so-called machine learning potentials, or machine learning force fields, which have an accuracy close to QM while being somewhat faster than QM, but which are still around 2,000 times slower than MM models, if not more, especially in solvated systems. So there is a huge design space between MM potentials and ML potentials. If you look at the pros and cons of MM potentials and ML potentials, I guess we can agree that whatever merits a molecular force field has come from its simplistic functional forms. For example, if all your terms, bonded and non-bonded, are basically harmonics or some other polynomials, then it is extremely fast to sample, and you can encode a lot of your beliefs in the functional form. For example, if you believe that when you pull the atoms of a non-bonded interaction far enough apart the energy should decay to zero, then you can encode that belief in your functional forms. Whereas all the benefits of ML potentials come, we believe, from their optimization strategy; to be more specific, you can use the optimization machinery of the machine learning community, namely backpropagation, to rapidly train your neural network and come up with more and more accurate models.
So is it possible to combine the merits of MM and ML potentials? Of course, this is hardly anything new; it is basically the workhorse of the entire Open Force Field consortium, namely ForceBalance, developed by Lee-Ping Wang. There is one problem, or maybe two, with the ForceBalance scheme: although you optimize the force field parameters very aggressively, you are not optimizing the typing scheme behind those parameters, which means you need to start with a decent enough force field. A lot of the force fields, especially the small-molecule force fields people are using right now, started from development efforts in the 1990s. So it is of interest to us to ask the question: is it possible to optimize atom typing and the parameters at the same time? Before we dig into the details, let's clarify the nomenclature for the different stages of a force field. The final energy, the quantity we care about, which gives us a lot of information about the behavior of the molecular system (and whose derivatives give the forces), comes from parameters as well as coordinates. The parameters come from bond, angle, torsion, and non-bonded types, which in turn come from atom types. We call the process of translating a molecular graph into these types "atom typing", and the rest "force field parameterization". In this project we are wondering whether we can replace the current atom typing scheme, which is basically based on looking things up in a table, with one based on graph neural networks. So there is a wish list that we hope to achieve by the end of the project.
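To make the legacy pipeline just described concrete, here is a minimal Python sketch of table-based typing followed by parameter lookup. All type names, rules, and numbers are invented for illustration; real tables like GAFF's are vastly larger and the rules far more intricate.

```python
# Toy version of the lookup-table pipeline: molecular graph -> atom types
# -> bond parameters. Type names and numbers are invented for illustration.

def assign_atom_type(element, num_neighbors):
    # A tiny stand-in for a typing table keyed on node attributes
    # and neighbor counts.
    table = {
        ("C", 4): "c3",   # sp3 carbon
        ("C", 3): "c2",   # sp2-like carbon
        ("O", 1): "o",    # carbonyl-like oxygen
        ("H", 1): "hc",   # hydrogen on carbon
    }
    return table[(element, num_neighbors)]

def lookup_bond_params(type_a, type_b):
    # Parameter table keyed on the *unordered* pair of atom types,
    # returning (force constant, equilibrium length) -- toy values.
    table = {
        ("c3", "c3"): (300.0, 1.53),
        ("c3", "hc"): (340.0, 1.09),
    }
    return table[tuple(sorted((type_a, type_b)))]

t1 = assign_atom_type("C", 4)
t2 = assign_atom_type("H", 1)
print(lookup_bond_params(t1, t2))  # (340.0, 1.09)
```

Replacing `assign_atom_type` with a learned model, while keeping the downstream energy terms, is exactly the substitution this project explores.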
So we want to know whether graph nets can fit atom types; whether they can fit atom, bond, angle, and torsion types and therefore parameters; and whether we can reproduce the same set of parameters given a sufficient MD trajectory. If we show that we can do well on MM energies and forces, then we can move on to QM, which is what we really care about, whereas MM is no more than a sanity check. Finally, such a neural parameterization scheme affords us the ability to expand the complexity of the functional form. For example, we can include terms that are used in MMFF or other class II force fields, which have been shown to improve the agreement between MM and QM but were too difficult to parameterize manually, that is to say, by building a huge table. So, speaking of huge tables, let's look at one of the most classic huge tables for atom typing. This is just a few columns of GAFF; this is GAFF 1, I believe, which was published in 2004, but the development effort goes well back into the 1990s, if not earlier. If you analyze the grammar of all these atom typing rules, you will realize that it is no more complex than combining node attributes, the number of neighbors, and neighbor attributes together. Interestingly, it is obvious to see that this grammar can be realized by a graph kernel called the Weisfeiler-Lehman test, which is basically a tree-expansion heuristic. For example, if you look at the cyan node here (I don't think you can see my cursor, but anyhow), it is expanded into a tree of green, green, green, and you further expand the green nodes into green-cyan-green, where the middle green node expands into three children whereas the side nodes expand into two.
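The tree-expansion heuristic just described can be sketched in a few lines of Python: each round replaces a node's label with a signature built from its own label and the sorted multiset of its neighbors' labels. Node names and the toy graph are illustrative, not from any force-field code.

```python
# Minimal sketch of Weisfeiler-Lehman label refinement on a toy
# molecular graph. Node labels start as element symbols; each round
# combines a node's label with its sorted neighbor labels.

def wl_refine(labels, adjacency, rounds=2):
    """labels: dict node -> label; adjacency: dict node -> neighbor list."""
    for _ in range(rounds):
        new_labels = {}
        for node, neighbors in adjacency.items():
            # Own label plus sorted multiset of neighbor labels,
            # exactly the WL refinement step.
            new_labels[node] = (
                labels[node],
                tuple(sorted(labels[n] for n in neighbors)),
            )
        labels = new_labels
    return labels

# Acetonitrile-like toy chain C-C#N, hydrogens omitted.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
labels = {0: "C", 1: "C", 2: "N"}
refined = wl_refine(labels, adjacency, rounds=1)
# After one round the two carbons get different labels, because their
# neighborhoods differ -- exactly the distinction atom typing needs.
print(refined[0] != refined[1])  # True
```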
So this procedure has been used to distinguish between different nodes in a graph, or between different graphs, for a very long time, and last year there was a wonderful paper by Xu et al. at MIT, accompanied by a GitHub repo with four lines of code, showing that the Weisfeiler-Lehman test can be approximated by graph nets. So what we have is this: graph nets can be as powerful as the Weisfeiler-Lehman test, and the Weisfeiler-Lehman test can almost perfectly reproduce the atom typing rules. Naturally, we can hypothesize that graph nets can approximate atom typing; and to put a finer point on it, there is a correspondence between a graph net with n layers, a Weisfeiler-Lehman test with n steps, and an atom typing scheme that looks at most n atoms away. So let's test this experimentally. We have a very toyish dataset here: a thousand molecules from the ZINC dataset, split 80-10-10 into training, validation, and test sets, and we used off-the-shelf graph neural network models. If you are willing to tolerate the roughly 1.5% error rate here, you can say that, okay, maybe graph nets can do the task of fitting atom types nearly perfectly. The next question we want to ask is: if you can fit the atom types, can you fit the bond types and therefore the bond parameters? This decomposes into two operations: an AND operation that takes the type at one end of the bond and the type at the other end and joins them together, followed by a dictionary lookup. The first half is the bond typing scheme, so to speak, and the second is the parameterization scheme. Any neural network, not even a graph convolutional one, even the simplest feed-forward neural network, can do logical operations, and any neural network can also look up a dictionary. So we hypothesized that graph nets, after doing message passing on the nodes, can also come up with bond types and therefore parameters.
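As a sketch of what "message passing on the nodes" means here, this toy update sums neighbor features and applies an update function, in the spirit of a GIN-style layer; `update` stands in for the learned feed-forward network, and everything is plain Python rather than a real graph library.

```python
def message_pass(features, adjacency, update):
    """One round of sum-aggregation message passing.

    features: dict node -> feature vector (list of floats)
    adjacency: dict node -> list of neighbor nodes
    update: stand-in for the learned per-node update network
    """
    new_features = {}
    for node, neighbors in adjacency.items():
        aggregated = [0.0] * len(features[node])
        for n in neighbors:
            aggregated = [a + f for a, f in zip(aggregated, features[n])]
        # Combine own features with the aggregated message, then update.
        new_features[node] = update(
            [s + a for s, a in zip(features[node], aggregated)]
        )
    return new_features

# Same toy chain as before: two carbons and a nitrogen, one-hot features.
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
adjacency = {0: [1], 1: [0, 2], 2: [1]}
out = message_pass(features, adjacency, update=lambda v: v)
print(out[0], out[1])  # the two carbons become distinguishable
```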
Now there's a slight problem here. After you do the message passing, you end up with a latent code for each node, but for bond, angle, and torsion parameters you necessarily need to combine the latent codes of individual atoms into one entity, and meanwhile we need to respect the symmetry. For example, the symmetry for a bond is that if you switch the order of the inputs, feeding in the blue atom first and the red atom next or the other way around, this should give you the exact same answer, because there is a topological symmetry here. One of the techniques we can use is called Janossy pooling. This huge chunk of math basically says: whatever symmetry you want to respect, one of the simplest pooling schemes that is still expressive is to just enumerate all the relevant permutations and sum them up. So this is what we did. We have a feed-forward neural network that operates on the concatenation of the atom embeddings under each permutation, and we add the results together to come up with a latent representation of bonds, angles, and torsions, from which we determine the parameters. It's not doing too badly; although there is still some error, the R-squared is generally over 0.9. There is another version of the scheme that we tested, which involves hierarchical message passing, that does slightly better, but it is ten times more expensive, so we did not proceed with it here. Okay, again, if you tolerate that 0.9 R-squared, you can say that fitting atom, bond, angle, and torsion parameters can be achieved by graph nets. Next we move on to whether we can reproduce this set of parameters given sufficient MD trajectories.
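A minimal sketch of that permutation-sum idea, with `phi` standing in for the feed-forward network operating on concatenated atom embeddings; for a torsion one would sum over the forward and reversed orderings of the four atoms in the same way. The details here are illustrative, not the exact architecture.

```python
def pool_symmetric(embeddings, phi):
    """Janossy-style pooling: apply phi to the concatenation of the
    embeddings under each symmetry-equivalent ordering and sum, so the
    result is invariant to that symmetry. For bonds and torsions the
    relevant orderings are the atom tuple and its reverse."""
    orderings = [embeddings, list(reversed(embeddings))]
    pooled = None
    for ordering in orderings:
        concat = [x for emb in ordering for x in emb]
        mapped = phi(concat)
        pooled = mapped if pooled is None else [
            p + m for p, m in zip(pooled, mapped)
        ]
    return pooled

phi = lambda v: [sum(v), v[0]]  # toy stand-in for the learned network
bond_ab = pool_symmetric([[1.0, 2.0], [3.0, 4.0]], phi)
bond_ba = pool_symmetric([[3.0, 4.0], [1.0, 2.0]], phi)
print(bond_ab == bond_ba)  # True: swapping the atoms changes nothing
```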
So this is again a sanity check before we go to fitting QM energies, because we know the answer here; we know that the model is expressive enough, so it is just a matter of the optimization process. Again, this is a toyish task: we selected a hundred molecules from the ANI dataset for training and 24 for test. We parameterized the systems with the Open Force Field toolkit using SMIRNOFF, and we drew samples from an MD simulation at 400 K. We fed those energies and snapshots into a graph net to see whether it can reproduce the energies first. It seems that the RMSE between the predicted and reference energies is around 0.4-0.5 kcal/mol for training and slightly larger for the test set. However, when you look at the parameters, it seems that the only parameter the model can reproduce with some confidence is the equilibrium bond length. This has something to do with the very rugged landscape of the optimization process, and we've shown that this is not an issue of expressiveness, because if you train directly on the parameters then you can get a perfect energy; it is the rugged landscape endowed by the complex functional form that is increasing the difficulty of training here. This is still a problem we are trying to overcome, but we are also trying to make do with the accuracy that we have and move on to QM. Now, if you train on a QM energy dataset, there are more challenges. If you have something like QM9 or the ANI dataset, the dynamic range of your model needs to be very, very large, because the energies can span from negative a few thousand to somewhere near zero. You want a very high R-squared as well as a very low RMSE, because in order to make the model useful in a local region, even if it gets the general trend right, it can still make errors at the very local scale.
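To make the parameters-to-energies step concrete, here is the harmonic bond term that predicted (k, r0) values feed into, with the force as its analytic derivative. Symbols and numbers are illustrative. It also hints at why the landscape is rugged: the energy depends on the parameters only through this nonlinear form, so rather different parameter sets can produce similar energies over a narrow range of sampled geometries.

```python
import math

def harmonic_bond_energy(r, k, r0):
    """E(r) = 0.5 * k * (r - r0)^2, the standard MM bond term."""
    return 0.5 * k * (r - r0) ** 2

def harmonic_bond_force(r, k, r0):
    """F(r) = -dE/dr = -k * (r - r0)."""
    return -k * (r - r0)

# Sanity check: the analytic force matches a central finite difference,
# which is how energy and force fitting targets stay consistent.
r, k, r0, h = 1.12, 600.0, 1.09, 1e-6
fd = -(harmonic_bond_energy(r + h, k, r0)
       - harmonic_bond_energy(r - h, k, r0)) / (2 * h)
print(math.isclose(harmonic_bond_force(r, k, r0), fd, rel_tol=1e-6))  # True
```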
So if you look at the scatterplot, this is just some preliminary experiment that I did on the QM9 dataset. If you give this plot to any machine learning person, they'll tell you that it looks perfect, but the RMSE of this specific experiment is still greater than 10 kcal/mol, which makes the potential energy model basically unusable. So I guess what we need to do here is more experiments on molecular mechanics to make sure the optimization process works smoothly. This is just another example showing that the loss landscape and training curve look very smooth, but in order to achieve an RMSE that is actually usable for a potential energy model, there is still some work to be done. Another thing worth mentioning is that when fitting an MM or MM-like model to QM energies, apart from predicting the bonded and non-bonded terms, we also necessarily need to predict an offset, because even after accounting for per-atom offsets there is still a part of the QM energy that is not explained, so you still have to predict a per-molecule offset for each molecule. It is fairly well known that you cannot perfectly reproduce QM energies using a traditional class I force field. So what we are doing now is also expanding the functional forms of the force field so that it achieves a better fit to QM energies and forces; again, this is made easier by the neural parameterization scheme. One of the things we can do is follow the formulation of a class II force field. If you look at such a formulation, you will realize that the first few terms are no more than higher-order polynomials for the bonded terms, and we can also add bond-bond, angle-angle, bond-angle, torsion-bond, torsion-angle, and angle-angle-torsion coupling. We can easily put in all these coupling terms; we can squeeze them into the hypernodes that we introduced before.
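As an example of how cheaply such cross terms slot in, here is a sketch of a class II-style bond-angle coupling term added on top of the diagonal harmonics; the coupling constant k_br would just be one more output of the neural parameterization. Names and numbers are illustrative.

```python
def bond_angle_coupling(r, theta, k_br, r0, theta0):
    """Class II-style cross term: E = k_br * (r - r0) * (theta - theta0).

    It vanishes whenever either the bond or the angle sits at its
    equilibrium value, and otherwise couples their displacements.
    """
    return k_br * (r - r0) * (theta - theta0)

# At equilibrium in either coordinate, the cross term contributes nothing.
print(bond_angle_coupling(1.09, 2.0, 25.0, 1.09, 1.91))  # 0.0
# Off equilibrium in both, it adds a signed correction to the energy.
print(bond_angle_coupling(1.19, 2.01, 25.0, 1.09, 1.91))
```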
So I have a TensorFlow implementation of this, and right now the ballpark RMSE is around 5 kcal/mol trained on QCArchive data, but this is still a work in progress where we try to lower that error further. With that, I'd like to thank everybody in the Chodera Lab who helped me with this project, and thank you for your time.