Welcome to the second day of the ICTP "Hitchhiker's Guide to Condensed Matter and Statistical Physics" series, which is dedicated to machine learning in condensed matter. Before today's lecture by Bingqing Cheng from the University of Cambridge, I'd just like to remind you that we will have two more appointments on the next two Wednesdays. In particular, next Wednesday, the 27th of January, we will start earlier, at 12:30 European time, so please check the program to be sure you're here. The final lecture, on the 1st of February, will be by Juan Carrasquilla and will start at the regular time of 2 PM. That's it for me. Today I leave the floor to Alex Rodriguez, who will introduce Bingqing Cheng and moderate the discussion. Alex, please.

Good afternoon, everybody. First of all, thank you, Bingqing, for being here. Let me introduce her, and please, Bingqing, correct me if I get something wrong. She is now an early career fellow at Cambridge. She has done a lot of work on machine learning for atomistic systems, and today we are going to hear about some of that work, preceded by an introduction. As for the Q&A, Bingqing will stop every quarter of an hour or so to reply to your questions. Please write them in the Q&A box, because that way we can keep track of them. That's all from my side.

Thanks, Alex, for the introduction. Let me share my screen first. As Alex mentioned, I'll talk about machine learning, but more importantly about the application of machine learning to materials modeling. And as Alex also mentioned, I'll stop every 15 minutes to answer questions. The first hour of the lecture is about basic notions, the fundamentals; during the second hour I'll talk a little more about applications, which are basically my own work. I estimate the first part is a little longer than the second, but that's OK; we can shift into the second part as needed. The fundamentals are, in any case, more important than my recent work.

OK, so let's get started, from the very beginning: what are first-principles calculations, what are ab initio methods? In this context, "ab initio" means that we predict material properties, the motion of electrons and nuclei, starting from the Schrödinger equation. The significance of quantum mechanics and the Schrödinger equation is reflected in a quote from Paul Dirac: "the rest is chemistry." He did not mean anything dismissive towards chemistry; what he really meant is that the fundamental laws and equations necessary for predicting material properties are completely known. The difficulty is that these equations are far too complex to be solved exactly. So what is the solution? According to Dirac, we should develop approximate practical methods, so that the application of quantum mechanics becomes tractable.

With that in mind, let's look at the methods we have. The Schrödinger equation cannot be solved exactly except for the simplest systems, such as the hydrogen atom. Then we have reference methods, such as quantum Monte Carlo or coupled-cluster methods.
The workhorse of the field is the so-called density functional theory, itself with different levels of approximation. Typically we are able to model a system of hundreds of atoms on a timescale of 10^-12 seconds using density functional theory. Beyond that, we can model systems using empirical force fields, meaning we simply assume that two atoms interact with each other via some empirical or simple analytic functional form. But empirical methods lack quantitative accuracy.

With this preliminary picture in mind, here is what we will cover during the first hour of the lecture. We'll first talk a little about the fundamentals of statistical mechanics as well as atomistic modeling. Then we will change gears slightly and go to the machine learning part. David and co-workers gave a lecture about the basics of machine learning, so here we will focus on how we translate our physical systems, our materials and molecules, into a mathematical language that a machine learning model can use: how do we translate the physical problem into a design matrix? Finally, I'll talk about machine learning potentials: using machine learning methods to approximate the interactions between atoms.

So let's start with a little thermodynamics. From a thermodynamic point of view, the Gibbs free energy of a system has two contributions: an enthalpic contribution and an entropic contribution, G = H - TS. As the simplest example, take a solid and a liquid, whose free energies can both be expressed in these terms. When the temperature is low, the free energy of the solid is lower, so the system stays solid. As the temperature increases, the entropic contribution becomes more and more important, so the solid melts and the stable phase becomes the liquid phase.

From a statistical point of view, the free energy is actually a measure of probability: the free energy difference between the solid and the liquid can be expressed in terms of the log of the probability of observing the liquid divided by the probability of observing the solid, ΔG = -k_B T ln(P_liquid / P_solid). This is a very simple expression, but bear in mind that "the probability of observing the liquid" means the probability of observing any of the possible configurations that belong to the liquid state. There are many, many of them; they all look similar, but they all differ in where the atoms actually are.

Now, the founding father of statistical mechanics, Ludwig Boltzmann, has an expression for the entropy engraved on his grave, and it reads: the entropy of the system is Boltzmann's constant times the log of the number of microstates, S = k_B log W. A microstate is just a snapshot like the ones I have been showing: a specific realization of the coordinates and velocities of all the particles. This expression engraved on the tombstone is actually not quite correct in general, because in the correct expression we cannot just count the number of microstates; we also have to weigh them properly using the Boltzmann distribution, so the entropic term is expressed as a weighted sum.

By now you might have noticed that we already have a conflict. When we are computing the energy of the system, we have a whole spectrum of quantum mechanical methods with different levels of accuracy as well as cost. To compute the energy accurately, we want to go to the more expensive and more accurate methods; but to sample the microstates in a comprehensive manner, we want the methods to be cheap. So there is a conflict there, and during this talk I'll explain how machine learning can resolve it.
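To make the weighted-sum point concrete, here is a toy example, assuming a handful of hypothetical microstate energies and k_B = 1, that computes the Boltzmann weights, the Gibbs entropy, and the free energy (my illustration, not code from the lecture). Of course, for a real system we cannot enumerate the microstates, which is exactly the sampling problem we turn to next.

```python
# Boltzmann-weighted statistics for a toy set of microstate energies.
# Illustrative only: hypothetical energies, units with k_B = 1.
import numpy as np

energies = np.array([0.0, 0.1, 0.1, 0.5, 1.2])   # hypothetical microstates
beta = 1.0                                        # 1 / (k_B T)

weights = np.exp(-beta * energies)
Z = weights.sum()                                 # partition function
p = weights / Z                                   # Boltzmann probabilities

S = -(p * np.log(p)).sum()                        # Gibbs entropy (k_B = 1)
F = -np.log(Z) / beta                             # free energy
print(f"S = {S:.3f}, F = {F:.3f}")
# With equal probabilities, S reduces to log(number of microstates),
# which is the expression on Boltzmann's tombstone.
```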
Now, some very fundamental and very important information on how we actually sample the microstates. Under no circumstances do we want to enumerate the microstates, because with N atoms in the system the phase space is a 6N-dimensional object, and sampling it exhaustively is not tractable. There are two ways forward.

The first is Monte Carlo sampling. The end goal is always to sample from the Boltzmann distribution: ω is a microstate, and the microstates are populated according to the Boltzmann distribution, weighted by the Boltzmann factor of each microstate's energy. The fundamental principle of Monte Carlo importance sampling is that instead of generating uncorrelated microstates, we generate a sequence of correlated states: I start from a certain point in phase space, and I make a move from, say, ω to ω' with a certain probability. It can be shown that if the probability p(ω) is invariant under this move, then we eventually end up sampling the correct Boltzmann distribution. However, this invariance condition involves an integral over all states and is quite difficult to impose in practice. So in reality, when we do the sampling, we often impose a stronger condition called detailed balance. Detailed balance says that for two microstates ω and ω', the probability flows of the move from ω to ω' and of the move backwards must balance: p(ω) T(ω→ω') = p(ω') T(ω'→ω), i.e. the ratio of the forward and backward transition probabilities equals the ratio of the probabilities of ω' and ω themselves.

One possible sampling scheme satisfying detailed balance is the famous Metropolis sampling. With a symmetric proposal, it says: if the probability p(ω') is higher than p(ω), we always accept the move; otherwise, we accept the move with a probability equal to the ratio of the probabilities of the two microstates, min(1, p(ω')/p(ω)). To show this graphically, in a very simple 2D scenario with a limited number of particles: each time we propose a move, if the energy of the system goes down, meaning the probability goes up, we always accept the move; otherwise we accept it conditionally, depending on the energy increase of the move (see the sketch below). So this is one way of sampling our microstates. Honestly, Monte Carlo is not that common these days, but the underlying principle is very nice.

The other way to sample the system is molecular dynamics, which relies on Newtonian mechanics. The picture is very simple: we compute the force on each particle, and then we propagate the trajectory using the classical Newton's equations of motion. We also have to apply some additional tricks to control the system's temperature as well as its pressure. The Bible for this type of sampling is Daan Frenkel and Berend Smit's book, Understanding Molecular Simulation.
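Here is the promised minimal sketch of Metropolis sampling, assuming a hypothetical one-dimensional double-well potential and k_B = 1 (an illustration, not code from the lecture):

```python
# Minimal Metropolis Monte Carlo sketch with a symmetric Gaussian proposal.
import numpy as np

rng = np.random.default_rng(0)

def potential(x):
    return (x**2 - 1.0) ** 2   # hypothetical double well, minima at x = +/- 1

def metropolis(n_steps, beta=2.0, step=0.5, x0=-1.0):
    x, e = x0, potential(x0)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        x_new = x + step * rng.normal()       # symmetric proposal move
        e_new = potential(x_new)
        # Accept with probability min(1, exp(-beta * (E' - E))), which
        # satisfies detailed balance for the Boltzmann distribution.
        if e_new <= e or rng.random() < np.exp(-beta * (e_new - e)):
            x, e = x_new, e_new
        samples[i] = x
    return samples

samples = metropolis(100_000)
print("mean x:", samples.mean())   # ~0 by symmetry if both wells are visited
```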
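And on the molecular dynamics side, the standard propagation scheme is velocity Verlet. A toy NVE sketch for a harmonic oscillator, assuming unit mass and spring constant and no thermostat (my own illustration; real MD codes add neighbour lists, thermostats and barostats), looks like this:

```python
# Minimal NVE velocity-Verlet integrator for a 1D harmonic oscillator.
import numpy as np

def force(x, k=1.0):
    return -k * x

def velocity_verlet(x, v, dt, n_steps):
    f = force(x)
    traj = []
    for _ in range(n_steps):
        v += 0.5 * dt * f          # half kick
        x += dt * v                # drift
        f = force(x)               # recompute force at new position
        v += 0.5 * dt * f          # half kick
        traj.append((x, v))
    return np.array(traj)

traj = velocity_verlet(x=1.0, v=0.0, dt=0.01, n_steps=1000)
# Total energy 0.5*v**2 + 0.5*x**2 should be conserved to O(dt^2).
energy = 0.5 * (traj[:, 1] ** 2 + traj[:, 0] ** 2)
print("energy drift:", np.ptp(energy))
```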
However, that is not the whole picture. The reason is that when we do molecular dynamics simulations, the accessible timescale is fairly short, and the system may have two minima in its free energy profile. Back to the example of solid and liquid: they can be understood as two equilibrium states on a free energy profile. If we just run plain molecular dynamics starting from the liquid state, the system will remain in the liquid state; and if we start from the solid, the system stays trapped in the solid state. This is because the thermal fluctuations are usually not enough to overcome the very high activation barrier between them.

To overcome this, there is a method called metadynamics, developed by Alessandro Laio and Michele Parrinello, and I'd like to use a hiking analogy to explain how it works. Can you see which mountain this is? This is the Matterhorn, which can be viewed from both the Italian and the Swiss side. The profile of the mountain can be likened to a free energy profile: the valleys are the equilibrium states, and the peak is like the activation barrier. If we want to travel from A to B, we have to spend a lot of energy to climb up the peak, or wait for a very long time. But there is an alternative. What if, as I hike across this landscape, I deposit a heap of sand wherever I go? If you do this long enough, you eventually level out the free energy landscape and make it flat, so the system can go back and forth without much resistance. Another nice aspect of the method is that at the end of the simulation, just by checking how much sand was deposited at each spot and taking the negative of that, we recover the actual free energy profile. (A minimal code sketch of this idea appears below.)

There is an alternative method for performing free energy calculations, called thermodynamic integration. Let's first consider slightly easier approximations. At low temperature, we can neglect the entropic contribution altogether and just use the minimum potential energy at 0 K as a proxy for the free energy. Or we can assume our system behaves like a set of harmonic oscillators, take the harmonic approximation, and add the harmonic contribution to the free energy. And then there is the option of doing everything properly and taking anharmonicity into account as well; in that picture, we perform a thermodynamic integration. The idea, which is very general, is this: if we have two systems and we can somehow connect them by a reversible thermodynamic path, then the free energy difference between the two systems (careful with the sign here: it runs from B to A) can be expressed as the integral, along the path, of the ensemble-averaged derivative of the Hamiltonian with respect to the path parameter. The path can be anything: a path along a thermodynamic variable such as temperature or pressure, or a switching parameter between different Hamiltonians. In practice, we usually switch between a harmonic system and our actual system, because the harmonic system has an analytic free energy that we can write down very easily.
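To make the sand-deposition picture concrete, here is a minimal one-dimensional metadynamics sketch, assuming a hypothetical double-well free energy profile, overdamped Langevin dynamics, and k_B = 1 (an illustration of the idea, not code from the lecture):

```python
# Minimal 1D metadynamics: deposit Gaussian "sand" along the trajectory;
# the negative of the accumulated bias estimates the free energy profile.
import numpy as np

rng = np.random.default_rng(1)

def free_energy(x):
    return (x**2 - 1.0) ** 2           # hypothetical double well

centers, w, sigma = [], 0.02, 0.2      # deposited hill centers and shape

def bias(x):
    if not centers:
        return 0.0
    c = np.asarray(centers)
    return w * np.exp(-(x - c) ** 2 / (2 * sigma**2)).sum()

def grad(f, x, h=1e-4):                # simple numerical derivative
    return (f(x + h) - f(x - h)) / (2 * h)

x, beta, dt = -1.0, 5.0, 1e-3
for step in range(100_000):
    f_tot = -grad(free_energy, x) - grad(bias, x)
    x += dt * f_tot + np.sqrt(2 * dt / beta) * rng.normal()  # Langevin step
    if step % 250 == 0:
        centers.append(x)              # deposit a heap of sand here

# Negative of the bias recovers the free energy up to an additive constant.
xs = np.linspace(-2, 2, 50)
est = -np.array([bias(v) for v in xs])
print(np.round(est - est.min(), 2))
```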
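Thermodynamic integration itself can be sketched in the same toy spirit. The sketch below assumes a one-dimensional particle, a linear switching H(λ) = (1-λ) H_harm + λ H_target, and k_B = 1; the harmonic reference free energy is analytic, and the anharmonic correction is the integral of ⟨H_target - H_harm⟩_λ over λ (again my illustration, not the lecture's code):

```python
# Toy thermodynamic integration from a harmonic reference to an
# anharmonic target potential.
import numpy as np

rng = np.random.default_rng(2)
beta = 1.0

def u_harm(x):   return 0.5 * x**2
def u_target(x): return 0.5 * x**2 + 0.25 * x**4   # anharmonic target

def mean_du(lam, n=50_000, step=0.8):
    """Metropolis estimate of <u_target - u_harm> at coupling lam."""
    u = lambda y: (1 - lam) * u_harm(y) + lam * u_target(y)
    x, e, acc = 0.0, u(0.0), []
    for _ in range(n):
        xn = x + step * rng.normal()
        en = u(xn)
        if en <= e or rng.random() < np.exp(-beta * (en - e)):
            x, e = xn, en
        acc.append(u_target(x) - u_harm(x))
    return np.mean(acc[n // 10:])      # discard equilibration

lams = np.linspace(0, 1, 11)
integrand = [mean_du(l) for l in lams]
dA = np.trapz(integrand, lams)        # quadrature along the path
A_harm = -np.log(np.sqrt(2 * np.pi / beta)) / beta   # analytic reference
print("A_target ~", A_harm + dA)
```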
So in practice, we typically follow a recipe: we start from a harmonic reference, integrate over the switching parameter to the real crystal, and then we can choose to go up in temperature or between different pressures. This is the recipe I always use, with some justification. We have to do the switching from harmonic to anharmonic at relatively low temperature, because at high temperature this integral becomes divergent: diffusive behavior sets in and makes the system very non-harmonic. Another thing we often do: if we want to compute the Gibbs free energy, we first start from the NVT ensemble, the constant-volume, constant-temperature ensemble. This is because the pressure is not well defined for the reference harmonic system, so we cannot place a harmonic system in the NPT, constant-pressure ensemble. There are also a number of other tricks one can play, but in the interest of time I will not go through them. I just want to point out that some time ago we wrote a tutorial-style paper explaining the tricks as well as the fundamentals of computing Gibbs free energies using thermodynamic integration, accompanied by Python notebooks for data analysis and sample input files.

I'll quickly show a couple of examples of why accurate free energy estimation is important, and then I'll try to answer a couple of questions. This example shows the free energy of vacancy formation in BCC iron. The black line is the approximation using just the potential energy difference; then we have the harmonic approximation; and then the accurate estimate taking anharmonicity into account via thermodynamic integration. At low temperature they are all very similar, but as the temperature goes up, even the harmonic estimate is not sufficient to capture the vacancy formation free energy accurately. Here is a similar example computing the stacking fault free energy in FCC metals. Same idea: at high temperature, the harmonic and potential-energy approximations are not only quantitatively inaccurate; they can even get the overall trend, the sign, wrong.

OK, I think now is a good time to stop and answer some questions. There's a question from Christian. Christian, do you want to speak up and ask the question yourself? Or are the students able to speak with this Zoom setup? OK, so Christian asks: will we only be dealing with equilibrium systems? I think this is mostly the case, although strictly speaking, when we are using metadynamics, the system is in a quasi-equilibrium. But yes, this is mostly the case.

And Raj, do you want to speak up? (I can allow Raj to speak if you want. Yeah, I think that's better.) Hello, ma'am. I would like to ask: are detailed balance and Metropolis sampling linked to each other? Or is detailed balance applicable in all conditions? It has this condition that when we go from one microstate to another microstate, the flow is the same as when we move from the second one back to the first. But it seems that this condition is not favorable for every experimental condition. — You mentioned experimental conditions; can you elaborate on this a little more? What does this mean?
Like the probability of a material going from one phase to another: say a material has two phases, a low temperature phase and a high temperature phase, and when it moves from low temperature to high temperature a phase transition takes place. In that case there is a certain probability that it moves from one state to the other, but if the reverse move is not possible, then the detailed balance condition, the strong condition where we equate the two products of probabilities, might not hold. So I doubted whether it can apply in those situations.

I want to clarify two things. First, here we are talking about microstates. A microstate is basically a list of vectors: where the atoms are and what their velocities are. A phase is a different concept: a phase, say the liquid phase, contains many, many such microstates. The concept of detailed balance applies when we are talking about microstates. It is also worth pointing out that detailed balance is more like an imposed condition than something that happens in reality: it is an assumption that enables us to do sampling, in particular Monte Carlo sampling, in an easy way. The other thing: Metropolis sampling is one particular implementation of detailed balance. There are many ways of designing moves that satisfy detailed balance, and Metropolis sampling is a particularly simple one. — OK, thank you. Thank you.

OK, so back to where we were. So far I have talked about classical systems, the classical free energy. "Classical" here refers to the fact that we assume our particles, our nuclei, are classical point particles, fully characterized by their positions in space. In reality, however, for many nuclei, especially the light ones such as hydrogen, helium, and lithium, the classical-particle treatment breaks down. This is the domain of nuclear quantum effects. Nuclear quantum effects influence many aspects of a system: the particle momentum distribution; isotope fractionation, meaning, for example, that the equilibrium concentrations of deuterium or oxygen-18 differ between the gas, liquid, and ice phases of water; pH; heat capacity; diffusivity; and many other things. A nice, intuitive example of the differences: light water is perfectly drinkable, but heavy water is poisonous at high doses.

How do we take this into account in our simulations? We use the path integral formalism. I'll leave this slide here for reference (I'll also upload the slides to the website later) and just gloss over the details. The essential idea is that because the nuclei are not classical particles, the kinetic and potential energy operators do not commute. So in practice, to evaluate the quantum partition function, we decompose the system into separate replicas, each of which effectively lives at a much higher temperature, because in the high temperature limit the two operators effectively commute.
So in practice we use the ring polymer molecular dynamics formalism. As I mentioned, instead of treating each nucleus as one classical particle, we represent it as many replicas connected into a ring polymer by harmonic springs. The total Hamiltonian of the system is the Hamiltonian of the individual replicas plus the harmonic springs connecting them.

To give a more intuitive example: in the classical picture we have the equipartition theorem, which says that each degree of freedom carries a kinetic energy of k_B T/2. Because of the quantum mechanical nature of our nuclei, this is not always true. For example, in this water molecule the hydrogens participate in several modes: the O-H stretching vibration, the bending ("breathing") mode, and the out-of-plane vibration. Because of nuclear quantum effects, because of the zero-point energy and so on, each mode actually carries a kinetic energy that massively exceeds k_B T/2, and the amount of quantum-mechanical kinetic energy differs from mode to mode.

When we want to characterize the free energy difference between the classical and the quantum mechanical system, we can again use thermodynamic integration, performing a reversible switching between the classical and the quantum mechanical system. When we write the expression down, the integrand is a function of exactly this quantum-mechanical kinetic energy, which we can compute from ring polymer molecular dynamics.

OK, I see no questions in the Q&A, so I'll continue. So far we have talked about atomistic modeling as well as a little statistical mechanics. I have painted a pretty grim picture of atomistic modeling: there are many parts that need to be taken into consideration, and each step means a lot of computation. Thermodynamic integration is not cheap; metadynamics simulations are not cheap; and if we want to consider nuclear quantum effects on top, we need to simulate not just one system but many replicas of the same system, which increases the computational cost by another factor of around 20. So here comes the machine learning part: how can machine learning help?

First, let's talk about representations: how do we represent our molecules? We have many types of systems: a peptide or a protein; crystalline systems with different arrangements and symmetry groups; or a bulk system with certain defects, say a dislocation somewhere. Usually the starting point is that instead of looking at all the atoms of the whole system at once, we first divide the system into a set of atomic environments. An atomic environment is what I get if I sit on a central atom and cut out a sphere with a predefined cutoff radius. The reason we do this: imagine you want to compare two systems with different numbers of atoms; that is very difficult. By decomposing the system into a set of atomic environments, we can focus on representing the atomic environments instead of a whole system with a varying number of atoms (a small sketch of this decomposition follows below).
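Here is how one might extract atomic environments as lists of displacement vectors, assuming a tiny non-periodic cluster and a hypothetical cutoff (real codes use neighbour lists and periodic boundary conditions; this is only a sketch):

```python
# Sketch: atomic environments as lists of neighbour displacement vectors.
import numpy as np

def atomic_environments(positions, cutoff):
    """For each atom, return displacement vectors to neighbours within cutoff."""
    envs = []
    for center in positions:
        disp = positions - center                  # vectors to all atoms
        dist = np.linalg.norm(disp, axis=1)
        mask = (dist < cutoff) & (dist > 0)        # exclude the center itself
        envs.append(disp[mask])
    return envs

# Toy example: a water-like geometry (angstrom-ish numbers, illustrative only).
pos = np.array([[0.00, 0.00, 0.00],    # O
                [0.96, 0.00, 0.00],    # H
                [-0.24, 0.93, 0.00]])  # H
for i, env in enumerate(atomic_environments(pos, cutoff=1.2)):
    print(f"atom {i}: {len(env)} neighbours")
```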
There are many popular representations, some of which I will talk about. The idea is that we first do this decomposition, and then any observable of the system can be represented in terms of local contributions. Say φ is the descriptor, the representation of the local environment; we can then associate an observable with each local environment. For example, we can have an atomic energy characterized by the local environment, and the total energy of our system can be expressed as the sum of the local contributions.

Let's dig further into how to represent a local environment. This picture doesn't just apply to representing materials and molecules; it's a very general idea in machine learning. At the end of the day, we want a representation that tells us how similar our samples are. We can characterize similarity in terms of a kernel matrix K, or in terms of a distance measure: things that are similar have a kernel value close to 1 and are close in distance space, and vice versa. So the idea is to have such a kernel or distance metric for our atomic environments.

Now, look at these two atomic environments. How do we compare them? First of all, remember that we are sitting on a central atom, so we can characterize the local environment as a list of neighbors and the displacement vectors to those neighbors. But here is a question. Suppose the neighbors are all hydrogen atoms, and I swap two of them. The physical system doesn't change, because the hydrogen atoms are indistinguishable from each other; however, the list of displacement vectors changes when you permute two atoms, and we don't want that. So instead of keeping a list of displacement vectors, we apply a smearing: we put a three-dimensional Gaussian distribution on top of each neighboring atom. Now we have a density field, and the advantage is that it no longer matters if we swap two atoms, as long as they are of the same species. We can then overlap two such density fields and compute the degree of overlap to characterize how similar the two atomic environments are.

However, there's another problem. If I rotate one of the molecules, it is still the same molecule and the physics doesn't change, but the overlap integral would change. How do we overcome that, how do we incorporate rotational invariance? The trick is that we do not compute the degree of overlap for one particular orientation only: we rotate, and compute the average degree of overlap over all possible orientations. In this way we remove the rotational degree of freedom as well.
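To convey the idea numerically, here is a brute-force sketch of the rotationally averaged overlap kernel, assuming Gaussian smearing of width sigma and a Monte Carlo average over random rotations (the pairwise overlap of two Gaussian densities has a closed form). This is only to illustrate the concept; as described next, SOAP evaluates the average analytically.

```python
# Rotationally averaged density-overlap kernel, by brute force.
import numpy as np
from scipy.spatial.transform import Rotation

def overlap(env_a, env_b, sigma=0.5):
    """Overlap integral of two sums-of-Gaussians densities (up to a constant)."""
    d2 = ((env_a[:, None, :] - env_b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (4 * sigma**2)).sum()

def averaged_kernel(env_a, env_b, n_rot=2000, seed=0):
    """Average the overlap of env_a with randomly rotated copies of env_b."""
    rots = Rotation.random(n_rot, random_state=seed)
    return np.mean([overlap(env_a, r.apply(env_b)) for r in rots])

# Two hypothetical environments: the same geometry, one rotated by 90 degrees.
a = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
b = Rotation.from_euler("z", 90, degrees=True).apply(a)
print(averaged_kernel(a, a), averaged_kernel(a, b))  # should be ~equal
```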
As you might have guessed, this rotationally averaged integral is quite unpleasant to evaluate for each pair of atomic environments: with n environments, computing it for every pair scales quadratically, and that is not ideal. However, Albert Bartók, Gábor Csányi, and Risi Kondor have a very nice formalism (SOAP) showing that one can compute this similarity kernel very efficiently by expanding the density field in terms of spherical harmonics; computing K then amounts to simple operations on the spherical harmonics coefficients.

So far we have been talking about atomic environments, but eventually we are interested in bulk materials. There, we need to combine the atomic descriptors into a global descriptor, and there are many ways of doing this. The easiest is to take the global feature vector as the average of the individual contributions from each atomic environment. That's the simplest thing we can do; obviously, many other choices are available.

OK, now is a good time to stop and answer some questions. Oscar asks: is the atomic environment somehow an application of the renormalization group? I don't think so, but then I'm not an expert in the renormalization group; there could be a way of expressing atomic environments as such, I do not know. Then there's a question: does "local environment" mean up to nearest neighbors? Roughly, yes, depending on how you define nearest neighbors: typically we take a cutoff, and you can go to the first neighbor shell, or the second; it's really up to you. It doesn't have to be the first shell, but it is the nearest atoms within a certain cutoff.

There's also an anonymous question: how large should the variance of the Gaussian associated with each particle be? This is a fantastic question, because it is actually very deep. Intuitively, the sharper the Gaussians you use, the more sensitive your kernel measurement is to differences between atomic environments: if two environments differ only very slightly but you use very sharp Gaussians, the similarity you compute will still not be close to one. Intuitively that sounds like a good thing, because you want a measurement that is as sensitive as possible. In reality, though, this is not the whole story, because eventually we need to use our representations to do some machine learning, to do some regression, and for that we want to incorporate a little more smoothness into the representation. So it's a balance between the two.

There's another question: what about interactions between atomic environments? Does simple summing mean they do not interact? That would be a correct statement at this level. However, when we come to machine learning potentials, the architecture is actually more complicated, and we do consider interactions. OK, I'll stop the questions now in the interest of time and move forward; questions not addressed now will be answered during the last part of the talk.

So, we talked about representations. Now we can use these representations in many ways. We can build a low dimensional map using dimensionality reduction so we can visualize our system; we can do some pre-processing and sparsify the data set; we can do clustering; or we can do regression.
I'll talk about dimensionality reduction. At its core: I have high dimensional data, and I want to find a low dimensional representation that best preserves the relationships within the high dimensional data. Notice that the terminology I'm using is deliberately vague, because depending on exactly which relationships you want to preserve, you end up with different dimensionality reduction algorithms. The more popular ones are PCA (and, applied to the kernel matrix, kernel PCA), t-SNE, and UMAP, which is getting very popular these days. I'll just talk about PCA, principal component analysis, because it is really the mother of the other dimensionality reduction algorithms, and the principles are usually pretty similar.

So the question is: what is the relationship that is preserved during the PCA analysis? In a simple example, I have two dimensional data, I find principal components L1 and L2, and I project the data set onto the first principal component. In PCA we have the data X in the high dimension and its projection T in the low dimension, and it is all about the covariance of the data: we try to preserve the covariance, Xᵀ X in the high dimension and the same quantity, Tᵀ T, in the low dimension. Mathematically, one can show that this is achieved simply by finding the first d eigenvectors of the big covariance matrix C of the high dimensional data, sorted by descending eigenvalue: we keep the eigenvectors associated with the largest eigenvalues. This is just a little derivation, and of course the eigenvalues and eigenvectors can be found with simple linear algebra, by solving det(C - λI) = 0. Graphically, looking at this figure again: we are looking for L1 and L2 that preserve the covariance of our data. This can of course be done from higher dimensions too; here is a simple 3D illustration of taking the two principal components and projecting down. (A minimal code sketch of this pipeline follows after the examples.)

We have written a simple Python package, ASAP, that does these analyses automatically: with a single bash command you can generate a low dimensional map, as well as do some other analyses automatically; these are selected commands one might want to use. Here are some examples. Here is alanine dipeptide. Typically, people visualize this system using the Ramachandran plot, i.e. the two dihedral angles φ and ψ. We can do this type of analysis automatically using the ASAP package, and we come up with something quite similar to the Ramachandran plot, now with principal component 1 and principal component 2. But remember: we did not know a priori that the dihedral angles are important; you don't need prior knowledge to come up with such an automated map. Here's another example, showing that this kind of map can distinguish classical water from water simulated with nuclear quantum effects.
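Here is the promised minimal sketch of the descriptor-averaging plus PCA pipeline, assuming hypothetical random per-atom descriptors in place of real SOAP-like ones (ASAP automates all of this; this is my illustration of the math above):

```python
# Global descriptors by averaging, then PCA down to a 2D map.
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data set: 1000 structures, each with 20 atomic environments
# described by a 64-dimensional descriptor vector.
per_atom = rng.normal(size=(1000, 20, 64))

# Global descriptor: average the atomic contributions within each structure.
X = per_atom.mean(axis=1)                       # design matrix, shape (1000, 64)

# PCA: eigendecomposition of the covariance of the centered design matrix.
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / (len(Xc) - 1)                   # covariance matrix
eigval, eigvec = np.linalg.eigh(C)              # ascending eigenvalues
order = np.argsort(eigval)[::-1]                # sort descending
W = eigvec[:, order[:2]]                        # top-2 principal components

T = Xc @ W                                      # 2D map coordinates
print(T.shape)                                  # (1000, 2): one point per structure
```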
There is also a projection of the QM9 data set, and the map is able to distinguish small molecules with different compositions, branched molecules, long carbon chains, and so on.

OK, where are we? There is still the last part, on machine learning potentials, and as I expected, this first part has run a little long. So maybe I'll stop now and cover machine learning potentials during the second half, and for now answer the remaining questions on the first two parts.

Ebi asks a question. Ebi, do you want to ask it live? — Hi. I asked: which representation of molecules should we choose? On what basis do we choose the representation? — So the SOAP representation I introduced earlier has the advantage that it is completely general: all the maps I showed were generated with the SOAP representation, for very different classes of materials, so it is applicable to many, many things, and we don't have to think much. Of course, there is also the option of handcrafted representations, which is often done in the cheminformatics community. Handcrafted representations have the advantage that you can incorporate your prior physical understanding of the system. When we think about it, we can also regard the dihedral angles φ and ψ as representations of our system: by using φ and ψ, we incorporate our prior knowledge about peptides, namely that the dihedral angles are often the important coordinates for characterizing these systems, while, say, the positions of the side chains are probably not as important. Does that answer your question? — Yes, thank you.

Another participant also asks a question, which I'll read out instead: how do we deal with long-range interactions in the atomic environment picture, and how do we incorporate them? The answer is that typically people do not incorporate them. There is some ongoing work, from Michele Ceriotti and from Jörg Behler, on schemes for incorporating long-range interactions, but as of today this is not the norm, which is a little bit of a shame. Surprisingly, though, even without accounting for long-range interactions, the machine learning framework seems to be rather accurate for many things.

— Hello? Yes? Yeah. Actually, I had a question. While you were explaining PCA, about the data there, I was wondering whether that data has something to do with the information you explained in the earlier slide about atomic environments, where we have the displacement vectors. Is the PCA data set related to that, so that in the PCA analysis what we actually use is this information about the atomic environments? I just wanted to relate it to the previous slide. — Do you have a slide number we can refer to? — You were explaining PCA; maybe before this one. — This one? — Yes, here you were explaining these data points. I was wondering whether these data points are related to the information about the atomic environments from the previous slides. — Right, let me explain with this example. In this example, each point represents a particular alanine dipeptide configuration.
What actually happens is that we take each configuration of the molecule and compute the descriptor for each of its atomic environments. From those we compute the global descriptor for this small molecule, in this case by taking the average of the atomic contributions. Now I have one vector for each molecule in my data set. Say I have 10,000 molecules; then I have a design matrix of size 10,000 by the dimensionality of the descriptor. I project that matrix down to two dimensions, and that's what we see here: each point is the low dimensional representation, the 2D coordinates, of the descriptor vector of one small molecule. — OK, OK. OK, thanks. Thank you.

OK, I think we are a little over time, so let's take the break now. When we come back, I'll go through machine learning potentials as well as some applications. — Yes, let's take the break. How long a break do you want? — On the schedule it says a 15 minute break, right? Maybe we come back at 2:15. — Perfect, let's be back at 2:15. Thank you. Thank you.