Okay, yeah, it was good. Okay, thanks. She will give us a presentation on interpretable embeddings from molecular simulations using Gaussian mixture variational autoencoders. The floor is yours.

Thank you very much. Hello everybody, I'm Yasemin Bozkurt Varolgüneş from the Max Planck Institute for Polymer Research in Mainz, where I'm a postdoc. So, let me try again... yes, it works now. As we have seen, molecular dynamics simulation is one of the areas that can benefit most from developments in unsupervised machine learning, I would say, and I would like to motivate this with the classical timescale problem. We are somewhere around here nowadays, and we always want to sample longer timescales. There are a bunch of methods that allow sampling this region more effectively, and one of them is collective variable (CV) biasing. The idea is to pick some direction, apply a potential along that direction, and accelerate the sampling this way. This requires a good selection of CVs beforehand. A good CV is one that allows crossing over the free-energy barriers and also separates the metastable states, and therefore characterizes the slow motions. Common choices are internal angles, pairwise distances, and coordination numbers, as well as non-generic, highly system-dependent, and highly complex descriptions of the system.

A good CV is helpful not only for altering the dynamics in a controlled manner, but also for interpretation purposes, for instance when building kinetic models such as Markov state models. In this free-energy plot, the darker colors are low free-energy, dense regions, and our aim is to properly locate the high-energy barriers in the low-dimensional space as well, so that we preserve the metastability structure. Then we take one step further and, hopefully, build a kinetic model that describes the rates of transition between these metastable states. But, as I said, this requires picking a good CV beforehand, which is a difficult task, and maybe data-driven techniques can help us choose one.

As we have seen, autoencoders, along with other dimensionality reduction techniques, offer an easier route for such tasks. Autoencoders are a special type of neural network: their bowtie shape, with fewer nodes in the bottleneck layer, forces the network to learn the essence of the data and discard the irrelevant information, and the loss function is the discrepancy between the input and its reconstructed, or decoded, version.

There is another flavor of autoencoders, called variational autoencoders, which put a probabilistic spin on standard autoencoders. Instead of just learning a mapping from the high-dimensional space to the low-dimensional space, the idea is to also infer the parameters of the distribution of the data in the latent space, so that we learn to model the data's structure, in a sense. Now we are interested in finding the posterior distribution, meaning the probability distribution of the latent variable z given the input x. This calculation is not easy, because of the intractability of the evidence, so what is done instead is what's called variational inference, hence the name variational autoencoders.
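For reference, the objective being alluded to here is the standard evidence lower bound (ELBO) from textbook variational inference; the sketch below is in the usual notation, with q_phi the encoder, p_theta the decoder, and p(z) the prior, and is not specific to this talk:

```latex
% Minimizing KL(q_phi(z|x) || p_theta(z|x)) over phi is equivalent to
% maximizing the evidence lower bound (ELBO) on the log-evidence:
\log p_\theta(x) \;\ge\;
\underbrace{\mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]}_{\text{reconstruction term}}
\;-\;
\underbrace{D_{\mathrm{KL}}\bigl(q_\phi(z \mid x)\,\big\|\,p(z)\bigr)}_{\text{regularization term}}
```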
So instead of calculating the true posterior, we approximate the posterior with easy-to-sample, or let's say tractable, distributions. This turns the problem into an optimization problem, where we now try to minimize the KL divergence between the true posterior and the approximate posterior, and with some manipulation this can be written as a difference of two terms: the first one pushes the latent space toward the chosen prior and acts as a regularization, and the other one tries to maximize the likelihood of the decoder, or in a sense minimize the reconstruction error. This nicely follows the neural network view as well; these components correspond to each other. And the nice thing about variational autoencoders is that when the prior distribution is chosen to be a Gaussian, the regularization term has an analytical solution, which makes things a lot easier.

As I showed, there are now two different terms in the loss function. The left plot you see is this handwritten digits example trained with the reconstruction loss only, which is just the standard autoencoder. The plot in the middle shows what happens when we only take into consideration the regularization term, which does what it is supposed to do: it pushes everything into each other and mixes everything. And when both terms are considered, in the last plot, we have some sort of separability and also some structure. But if our aim is to further cluster these data points in the latent space, this landscape is still not very good for clustering; the Gaussian prior actually has an anti-clustering effect.

We therefore suggest replacing the unimodal Gaussian prior with a multimodal Gaussian, a Gaussian mixture, in the latent space, which gives the latent space some room to expand. For that we introduce a categorical variable y, which can be considered the cluster ID. This allows dimensionality reduction and clustering together; it does them simultaneously. I should say that the number of clusters here is a hyperparameter, but as I will show, it acts like an upper bound. This will come later; I'll also show a rough sketch of this latent model in a moment.

But let's start easy. This is an example of a 1D four-well potential, without any dimensionality reduction, concentrating just on the clustering. This is the latent space that we obtained from the Gaussian mixture model, and it follows the probability distribution coming from the potential closely. These are the accuracy metrics obtained by taking the cluster IDs from the GMVAE method, and they are pretty high. As a comparison, we also looked at the variational autoencoder case, and it does what was expected of a variational autoencoder: it pushes things a little closer to each other, actually making the clustering difficult. So it seems like a good improvement, I would say.

The next example is alanine dipeptide, which is a benchmark system for conformational dynamics. It is very well known where the metastable states lie in the Ramachandran plot, and they have these special names. We used the pairwise distances as input features to the autoencoder, and we got this 2D landscape; for interpretation purposes we always kept the latent space two-dimensional. And these are the cluster IDs.
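To make the Gaussian-mixture latent model concrete, here is a minimal PyTorch sketch under my own assumptions about layer sizes and wiring; the paper's linked implementation is the authoritative version. In particular, a faithful GMVAE marginalizes over the categorical variable y, or relaxes it with a Gumbel-softmax, rather than taking the hard argmax shown here for illustration:

```python
# Minimal GMVAE-style sketch (hypothetical architecture, for illustration only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GMVAESketch(nn.Module):
    def __init__(self, x_dim, z_dim=2, n_clusters=10, hidden=64):
        super().__init__()
        self.n_clusters = n_clusters
        # q(y|x): categorical posterior over cluster IDs.
        self.cluster_head = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_clusters))
        # q(z|x,y): Gaussian posterior conditioned on x and a one-hot y.
        self.enc = nn.Sequential(nn.Linear(x_dim + n_clusters, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)
        # p(x|z): decoder back to the input features.
        self.dec = nn.Sequential(
            nn.Linear(z_dim, hidden), nn.ReLU(), nn.Linear(hidden, x_dim))

    def forward(self, x):
        q_y = F.softmax(self.cluster_head(x), dim=-1)  # soft cluster assignment
        y = F.one_hot(q_y.argmax(-1), self.n_clusters).float()  # hard ID (illustration)
        h = self.enc(torch.cat([x, y], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.dec(z), q_y, mu, logvar

# Toy usage: 8 frames of 30 hypothetical input features.
model = GMVAESketch(x_dim=30)
recon, q_y, mu, logvar = model(torch.randn(8, 30))
cluster_ids = q_y.argmax(-1)  # per-frame cluster IDs, as used in the examples
```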
At this point we noticed an interesting property of the method: although we trained with 10 clusters, we only got six, numbered zero to five, and the other clusters were empty. So it can be said that the method can discover the inherent clusters in the system, I would say. Next, we looked at where these clusters lie in the Ramachandran plot, and they match the metastable-state definitions pretty well. We then took the next step and built a Markov state model using only this coarse six-cluster description (a minimal sketch of this step appears below), and the model satisfies Markovianity and so on. So for this example at least, these clusters were appropriate for kinetic purposes, I would say.

The other example is a system for which we have no reference kinetic model, and we don't actually know the number of clusters that would be needed, but again we trained with 10 clusters. The system is a 15-residue-long peptide, a representative system for the helix-coil transition; in our simulations the coil states were sampled pretty well, while helical and hairpin structures were rare. Again we kept the latent dimension at two, and we obtained this landscape with these cluster IDs. As a qualitative analysis, we looked at the structures around the cluster centers, and they suggest that the structures become more and more extended as we go along this direction. We further validated this with a more quantitative analysis: this plot shows the average fraction of helicity, and as the color goes from blue to red the structures become more and more extended, in agreement with the qualitative analysis before. As a further step, we also looked at the RMSDs from a reference structure, which we chose to be this helical structure, and this is also in agreement with our previous conclusion.

We made another test case on polystyrene, the material we see every day, from computer packaging to food packaging. It has an interesting property: polymorphism, meaning that these molecules exist in these two structures with different arrangements, and hence different properties. Experimentally these are the known polymorphs, and simulating this transition is extremely difficult. Actually, a PhD student in our group spent a substantial amount of time iterating over off-the-shelf methods to simulate this transition, and finally could obtain a good landscape that characterizes all the crystalline phases. We instead decided to follow a purely data-driven approach: we used the SLATM descriptor as the input and tried to resolve the different crystalline phases with our method, and it seems to do a pretty good job of differentiating them.

As the next step: although my examples were all from molecular dynamics simulations, I hadn't really exploited the fact that our data is time-series data, connected in time. A trick one can apply is, instead of trying to reconstruct the original input, to predict a time-lagged version of the input, which brings a sort of smoothness and some dynamical connectivity to the latent space. We tried that with this model potential system. When we don't add any time lag, the cluster boundaries end up around this line, and when we add the time lag, it identifies the different clusters perfectly.
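Going back to the Markov state model step from the alanine dipeptide example: given the per-frame cluster IDs, estimating a transition matrix is a standard exercise. A minimal sketch with a hypothetical lag time (my own illustration, not the analysis pipeline used in the talk):

```python
import numpy as np

def transition_matrix(cluster_ids, n_states, lag=10):
    """Row-stochastic MSM transition matrix from a discrete trajectory.

    cluster_ids: 1D integer array of per-frame cluster assignments.
    lag: lag time in frames (hypothetical value; in practice chosen via
         an implied-timescale convergence test).
    """
    counts = np.zeros((n_states, n_states))
    for i, j in zip(cluster_ids[:-lag], cluster_ids[lag:]):
        counts[i, j] += 1
    counts = 0.5 * (counts + counts.T)  # symmetrize: simple detailed-balance fix
    return counts / counts.sum(axis=1, keepdims=True)

# Toy usage: a fake two-state discrete trajectory.
traj = np.array([0] * 50 + [1] * 50 + [0] * 50)
T = transition_matrix(traj, n_states=2, lag=5)
print(T)  # rows sum to 1; diagonal dominance reflects metastability
```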
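As for the time-lag trick just described, the change on the data side is tiny: the autoencoder's target becomes the frame tau steps ahead instead of the frame itself. A sketch, assuming the features are stored as a (frames x features) array from one continuous trajectory:

```python
import numpy as np

def time_lagged_pairs(features, tau=10):
    """(input, target) pairs for time-lagged reconstruction.

    features: (n_frames, n_features) array from a single continuous
    trajectory; tau is the lag in frames (hypothetical value).
    """
    if tau == 0:  # tau = 0 recovers the plain autoencoder setup
        return features, features
    return features[:-tau], features[tau:]

X = np.random.rand(1000, 3)  # stand-in for real trajectory features
inputs, targets = time_lagged_pairs(X, tau=10)
# Train so that decoder(encoder(inputs)) approximates targets.
```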
So, at least for this system, the time lag helps separate the clusters. This is something one can always incorporate when working with time-series data, I would say.

Overall, I introduced a method, the Gaussian mixture variational autoencoder, which helps promote the separation of metastable states, for our applications at least, and it performs dimensionality reduction and clustering simultaneously. It is a generic method, so feel free to adapt it to your own cases as well, although I focused only on molecular dynamics data. The paper is here, and our implementation is there as well. I would like to thank my collaborators Tristan Bereau, Joseph Rudzinski, and Adrian Banerjee from the Max Planck Institute for Polymer Research. Thank you for your attention, and I'm happy to take questions.

Thank you very much. We have time for a very quick question over there.

Thank you, very nice work. From what I understand, to get the collective variables this way you need a very long trajectory to begin with. Can you also somehow iteratively explore, that is, use that collective variable to sample more, or do you always need to have sampled exhaustively before you can get the variable?

We haven't specifically tried that, but that's a good idea; one could, of course.

Okay, thank you very much. Thank you very much.