Thank you, Matteo. Thanks everybody for showing up at the last lecture. Let me start sharing my presentation, and I will ask you to confirm that you are able to see it properly in full-screen mode. Yes, very good. Okay. So welcome back, everybody. After Alex's four lectures on methods related to data mining, in particular clustering, intrinsic dimension estimation, and dimensionality reduction, in this session I will merge what we have seen in the first two lectures, the basics of formulating quantum many-body problems in terms of data structures, with the data mining tools that Alex explained. So this is now more of a research line, and here you find the names of all the people that have contributed to the specific things I will be presenting today, which are these two papers here on the bottom left. Let me emphasize that we are of course not the only ones who have contributed to the general framework of utilizing data mining, or more generically unsupervised learning techniques, to understand features of many-body problems, in particular phase transitions. There have been a lot of works; some of them I will actually mention along the way, but most of them, unfortunately, I will not have time to cover. Let me, however, suggest two very good reviews. They touch on different aspects of what I will be telling you today; they are maybe more on the quantum side, but classical results are of course also included. One is by Juan Carrasquilla, and the second one is by several authors, published over the last two years. So in case you have interest beyond what I will be presenting today, I strongly suggest that you start from these two reviews. The idea of today is the following. At the beginning I will briefly recap what we said at the end of my second lecture, which was about this strange three-site Ising model. However, in repeating these concepts I will do it with a slightly different twist: instead of the Ising model, I will utilize the Heisenberg model, and it will become clear why I do that when we discuss data structures. Then we will move on and discuss the main results that come out of this merging of data mining and classical statistical mechanics. I will first present a series of numerical experiments done with Monte Carlo methods on second-order phase transitions, first-order phase transitions, and phase transitions of topological origin, in the sense that they are driven by topological defects; in particular I will be discussing the so-called Berezinskii-Kosterlitz-Thouless transition. While doing that, if time allows, I will also take a very short detour and discuss how spontaneous symmetry breaking, which is a common phenomenon in some of these transitions, is actually related to clustering in data structures; I will do that utilizing a tool called principal component analysis, which I am sure Alex mentioned. Then, after explaining this collection of numerical experiments, we will try to build up a theory for them. This theory will be analytical work under a certain number of approximations, and the main result is that it will be able to explain the main feature that we observe, namely that many-body problems and their data structures actually simplify at critical points. So there is a kind of emergent simplicity, and we will show how this actually stems from universality.
We will understand this analytically utilizing two methods. Then towards the very end, if time allows, I have a short teaser on how to extend some of these concepts to quantum many-body problems, in particular path integrals. There are serious issues that do not allow one to just take the classical results and apply them to the quantum case, and I would like to give you an idea of what these challenges are from the point of view of data mining rather than physics. And then we will show some other applications.

Okay, so let's go back to our example from last time, but instead of taking a three-site Ising model, today I want to take a three-site Heisenberg one. So imagine that you have three sites; here they are put on a line, but you could put them on a triangle, it does not matter: the important thing is that they all talk to each other. Now we have to define our problem, and it is very simple. We can label all the states in the configuration space of our partition function by just three variables, theta_1, theta_2, and theta_3, where theta is the angle of each spin with respect to a line, or with respect to a plane; it does not really matter, it is just a one-dimensional representation of the spin orientation. This implies that our configuration space is nothing but the set of points in this cube.

Sorry, I am only seeing the first slide, I think. Okay, so you do not see the slide where there are spins and configurations? No, only the title; I can see only the first page. Very good, let me try again, maybe there was a problem with my slide sharing. Can you see it now, the three-site Heisenberg model? Yes, the third page. And now what do you see? Now I can see the third page, but the brightness is too low, I think. Okay, no problem; if it is not an issue for you, I will just keep moving in this format. Yes, now I can see properly. Okay, thank you for pointing that out, because I was not reading the chat. And sorry, could we see the previous slide? We missed it. The previous slide is just the outline; I will send you the PDF at the end of the lecture, together with the updated notes and some exercises.

So I want to come back to this minimal classical many-body problem. We have this three-site Heisenberg model, and we label all the states with the values theta_1, theta_2, theta_3, so that each state is represented by a single point inside a cube. Now we want to sample our partition function. How do we do that? With respect to the statistical-mechanics weight e^{-E/T}, where the Hamiltonian is nothing but the Heisenberg Hamiltonian, just a spin-spin interaction. Now we have here two figures: the one on the left represents the data structure of our partition function at low temperature.
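Before describing the two figures, here is a minimal sketch of how such samples can be generated, assuming a planar version of the three-site model in which each spin is parametrized by a single angle theta, exactly as in the one-dimensional representation above; the temperatures and sample counts are illustrative, not the ones behind the actual figures.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(theta, J=1.0):
    # Ferromagnetic spin-spin coupling -J cos(theta_i - theta_j),
    # summed over the three pairs (all sites talk to each other).
    pairs = [(0, 1), (0, 2), (1, 2)]
    return -J * sum(np.cos(theta[i] - theta[j]) for i, j in pairs)

def metropolis_samples(T, n_samples=2000, n_skip=50):
    """Sample points (theta_1, theta_2, theta_3) in the cube [0, 2*pi)^3
    with the statistical-mechanics weight exp(-E / T)."""
    theta = rng.uniform(0.0, 2.0 * np.pi, size=3)
    samples = []
    for step in range(n_samples * n_skip):
        site = rng.integers(3)
        proposal = theta.copy()
        proposal[site] = rng.uniform(0.0, 2.0 * np.pi)
        dE = energy(proposal) - energy(theta)
        if dE <= 0.0 or rng.random() < np.exp(-dE / T):
            theta = proposal
        if step % n_skip == 0:
            samples.append(theta.copy())
    return np.array(samples)

low_T = metropolis_samples(T=0.05)   # concentrates on the line theta_1 = theta_2 = theta_3
high_T = metropolis_samples(T=50.0)  # fills the cube almost uniformly
```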
So what happens at low temperature? At low temperature the states that are sampled the most are the states with the lowest energy, and in a ferromagnet like this, the lowest-energy states are those where all the spins point in the same direction. So once I fix theta_1, I will have theta_2 equal to theta_1 and theta_3 equal to theta_1. This defines, in our three-dimensional space, just a line where all the thetas are equal. You can see here, even more clearly than in the Ising case, that the intrinsic dimension of the low-temperature representation of this simple three-site Heisenberg model is equal to 1: these are all points on a line. On the right we do the same thing but at very high temperature. What happens in this very high-temperature case? All states are equally probable, because the energy does not matter anymore, so during the sampling I am essentially filling up the full cube with points. I do not think I need to explain to you that the intrinsic dimension of this dataset is not one anymore, but really equal to three, because we are moving in the full-dimensional space. So this is a recap of what we did last time with the Ising model, but now with the Heisenberg one. Note that there is a difference here which is very important: in the Heisenberg model I will very rarely sample the same configuration twice, while in the three-site Ising model I will have the repetition problem that I think Alex discussed with you.

Okay, this is very qualitative, it is three spins, and of course we want to go beyond that. But just to summarize the message of our first example: very low intrinsic dimension in ordered phases, and as high as you can get, in principle, at high temperature. Now, remember what Alex told us: the intrinsic dimension (ID) is somehow the minimal number of variables required to describe a dataset. Our example fits this picture, because in a ferromagnet, if we know a single spin, then we know all the other spins automatically, so the ID is one; while in a paramagnet, since all spins fluctuate on their own, we really need a number of variables equal to the number of spins. So all of this makes perfect sense from a data science viewpoint.

The question is: what about transitions? Imagine that we draw a kind of phase diagram with temperature on the x axis: on the left we have our ferromagnetic state, with intrinsic dimension one in principle, and on the right we have this high-temperature paramagnetic state. What happens at criticality? Here we can follow two intuitions. The intuition on the left is a real-space intuition. What is the fundamental feature of criticality? The correlation length of your system diverges. This implies that spins which are very far from each other are still correlated, with correlations decaying only in a power-law manner, not exponentially as away from criticality. So one could expect that, in sampling the partition function, one finds states which are very different from each other, a lot of states become available, and the manifold could be very large. There is however a second intuition, based on a concept which goes under the name of the renormalization group. If you have never heard about it, do not worry; you will just have to trust me for thirty seconds.
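Before moving to the renormalization-group intuition, a quick sanity check of this "one variable versus N variables" picture on the two synthetic datasets generated above; this uses PCA-style variance counting only as a crude linear proxy for the intrinsic dimension (the proper nonlinear estimator comes later).

```python
import numpy as np

def linear_dims(samples, tol=0.05):
    """Crude linear proxy for the intrinsic dimension: the number of
    principal directions carrying more than a fraction `tol` of the
    total variance."""
    x = samples - samples.mean(axis=0)
    s = np.linalg.svd(x, compute_uv=False)
    var = s**2 / np.sum(s**2)
    return int(np.sum(var > tol))

print(linear_dims(low_T))   # ~1: the data lie along theta_1 = theta_2 = theta_3
print(linear_dims(high_T))  # ~3: the data fill the whole cube
```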
So this intuition from the renormalization group tells us that at critical points physics becomes universal. What does that mean? It means that in order to describe expectation values of observables, or low-order correlation functions, one does not need to know them at every distance; one only requires very few parameters to determine all the correlations. In particular, these parameters are the critical exponents that appear in the long-distance correlation functions; there are also other quantities called universal amplitudes, but let me forget about them for the moment. What does this imply? If I can determine correlations with very few variables, it implies that I need few variables to describe my system, and this is exactly the definition of intrinsic dimension. So even if the manifold is in principle huge, because many states are available, it is actually very constrained, because I can specify a small set of rules that describe what the correlations must be; so in principle the intrinsic dimension is minimal. And so there is this conundrum: which one of these intuitions is correct? This is exactly the question we posed a few years ago, and we did not know. The two intuitions really predict opposite things: one predicts a maximal intrinsic dimension at the critical point, the other predicts a minimum. So what is the truth? First of all, let me stop here: is it clear what the challenge is, what these two competing intuitions are, RG versus real space? Or do you have questions? Okay, there are no questions, so let's move on.

In order to settle this, we first did some simulations, and in explaining them let me start from the simplest model, the Ising model on the two-dimensional square lattice. The configuration space is defined by spin variables plus one and minus one, so configurations are strings of ones and minus ones, and the Hamiltonian is very simple: just the Ising model without fields, a nearest-neighbor ferromagnetic interaction. We know of course that this model is exactly solvable, we know the value of the critical temperature, which is 2.26 and so on, and the phase transition is conformal; that does not really matter for us, but what is important is that its critical exponent nu is equal to one. Remember, the critical exponent nu is the one related to the scaling of the correlation length. Now, what do we do in practice? We run Monte Carlo simulations, and in these Monte Carlo simulations we have huge Markov chains whose configurations have to be uncorrelated; that is why we need the cluster algorithms that I did not have time to fully explain at the end of the second lecture. Anyway, however we get them, we end up with very well decorrelated datasets, typically 10^4 to 10^5 configurations.
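Since I did not have time to explain the cluster algorithms, here is a minimal sketch of a Wolff single-cluster update for the 2D Ising model, one standard way to obtain nearly decorrelated configurations near criticality; the lattice size, temperature, burn-in, and thinning here are illustrative choices, not necessarily those used for the figures.

```python
import numpy as np

rng = np.random.default_rng(1)

def wolff_update(spins, T, J=1.0):
    """One Wolff single-cluster update on an L x L Ising lattice with
    periodic boundaries. Flipping a whole cluster at once decorrelates
    configurations quickly near the critical point."""
    L = spins.shape[0]
    p_add = 1.0 - np.exp(-2.0 * J / T)
    i, j = rng.integers(L), rng.integers(L)
    seed = spins[i, j]
    cluster = {(i, j)}
    stack = [(i, j)]
    while stack:
        x, y = stack.pop()
        for nx, ny in ((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L):
            if (nx, ny) not in cluster and spins[nx, ny] == seed and rng.random() < p_add:
                cluster.add((nx, ny))
                stack.append((nx, ny))
    for x, y in cluster:
        spins[x, y] *= -1
    return spins

L = 40
spins = rng.choice([-1, 1], size=(L, L))
configs = []
for sweep in range(10_000):
    spins = wolff_update(spins, T=2.3)
    if sweep > 1000 and sweep % 10 == 0:   # discard burn-in, thin the chain
        configs.append(spins.flatten().copy())
configs = np.array(configs)                # dataset of spin configurations
```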
Then we analyze these datasets: we extract the intrinsic dimension with the TWO-NN method that Alex mentioned in his lectures. For the TWO-NN method we need a lot of configurations, typically the 10^4 to 10^5 I mentioned. How do we extract the intrinsic dimension? We have the distribution of the variable mu, which is the ratio between the next-nearest-neighbor distance and the nearest-neighbor distance of each sampled configuration. To extract the ID, one plots this curve, which is typically called S, I do not know exactly for which reason: it is minus the logarithm of one minus the cumulative distribution of mu, plotted as a function of the logarithm of mu. This is a typical result for the Ising model, I do not remember at what temperature, and you see that this curve has the following feature: very close to the origin it is actually a perfect straight line, and then at some point, for very large values of log mu, it scatters and deviates from a line. The region we are interested in is the one at very small log mu, because that is the region which is physically important: it describes our dataset at short and intermediate distances. The points out here are typically related to outliers, configurations sampled rarely and not particularly important. The ID is then extracted as the slope of this curve, and in this example you can see that the slope is obtained with a linear fit, this gray line. We do that for various points in our phase diagram, so let me show you the results of this procedure.

In this plot, what you see is the intrinsic dimension on the y axis as a function of temperature for the Ising model, for different linear dimensions of the system: the red triangles are squares of 40 times 40, and then you have larger and larger sizes. You can see that these curves show the following: the ID decreases very close to the critical point, and actually has a minimum, denoted by these orange dots, and as we increase the system size the minimum approaches the correct transition point, which is marked by this dashed line. So these results clearly show that the intrinsic dimension does not grow at the critical point; on the contrary, the minimum of the intrinsic dimension shifts towards the transition point as the system size is increased. This is a signal that there really is emergent simplicity, and it is evidence that the second scenario, the RG-based understanding, is the correct one.

Marcello, sorry, it looks like the value of the ID at the minimum increases as you increase the system size? Yes, the value of the dimension increases. But what is important is not the value of the dimension itself, because then you could wonder: is it fair to compare the intrinsic dimension of the data space of a spin system which is 40 times 40 with that of a much larger spin system? And you could tell me: look, it is not fair to compare those dimensions directly, because the embedding spaces are very different. The only thing that is really important for our understanding is where the features of these curves lie, and the important point is that these features describe a minimum which shifts towards the transition point.
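For completeness, here is a minimal sketch of the TWO-NN estimator exactly as just described: compute mu = r_2 / r_1 for every point, plot minus log(1 - F(mu)) against log(mu), and fit the slope through the origin; the fraction of large-mu points discarded is a common but adjustable choice.

```python
import numpy as np
from scipy.spatial import cKDTree

def two_nn_id(data, discard_fraction=0.1):
    """TWO-NN intrinsic dimension estimate. Assumes no exact duplicates
    in `data` (duplicates give r_1 = 0; for Ising-like datasets remove
    them first, e.g. with np.unique(data, axis=0))."""
    tree = cKDTree(data)
    dists, _ = tree.query(data, k=3)          # self, first and second neighbor
    mu = np.sort(dists[:, 2] / dists[:, 1])
    n = len(mu)
    f_emp = np.arange(1, n + 1) / n           # empirical CDF F(mu)
    keep = int(n * (1.0 - discard_fraction))  # drop the scattered tail
    x = np.log(mu[:keep])
    y = -np.log(1.0 - f_emp[:keep])
    # Linear fit constrained through the origin: the slope is the ID.
    return np.sum(x * y) / np.sum(x * x)

print(two_nn_id(low_T), two_nn_id(high_T))    # roughly 1 and 3 for the toy model
```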
The value of this minimum depends on the system size, and actually it also depends on how many samples we take; this is another point that we are not touching in these lectures. There is an intrinsic dependence, in these classical partition functions, on the number of samples we take; this is non-trivial and we only partially understand it. Maybe I will comment on it at the very end, because it is too much for now. Really, the point here is that the minimum shifts towards the transition point as the system size increases.

Now, the next question is: if this is related to RG, it has to show some universality. (There is a question in the chat about what RG is: RG is the renormalization group.) So the question is whether this minimum is just something that signals where the transition is, or whether the data structure really becomes universal. In order to understand that, one has to do an analysis based on the finite-size scaling hypothesis. This hypothesis tells us that an observable displays universality if it is described by the equation here on the left, I_d(T, L) = L^zeta f(xi / L), where L is the system size, zeta is just a number, actually a combination of critical exponents, and the important object is this function f, which is a scaling function: it depends only on xi divided by L, where xi is the correlation length. This implies the following: if we take our values of the ID for the various temperatures and the various system sizes, and we rescale the temperature according to the known scaling of the correlation length, xi proportional to (T - T_c)^{-nu} with nu the critical exponent, and we also rescale the ID itself by L^{-zeta}, then all of the curves must collapse onto the very same function. This is shown in the plot on the left: on the x axis we have the rescaled temperature, T tilde, times the system size L to the power one over nu, where nu is one of our fit parameters, and on the y axis we have the ID times the rescaling factor L to the minus zeta. Here you can see all the curves that we had before: there were many system sizes, and tens if not hundreds of temperatures, and all of them fall onto this single curve. There are only three parameters that we used to fit: the critical temperature, the value of nu, and the value of zeta. This is what we call evidence of universal behavior, and it implies that what is happening in data space is really a structural transition which has the same universal properties as the transition in real space. If you want, it is like when you look at ice melting: we think of melting ice as a transition, and the point is that if you reformulate ice in data space, which is a very abstract object, ice also melts in data space, with the same critical exponents.
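In code, the collapse check is just a rescaling of the two axes; here is a minimal sketch, where `curves` would hold the measured ID(T) for each size L, and Tc, nu, and zeta are the three fit parameters mentioned above (names and data layout are mine, for illustration).

```python
import numpy as np
import matplotlib.pyplot as plt

def collapse_plot(curves, Tc, nu, zeta):
    """curves: dict mapping system size L to (temperatures, id_values).
    If the finite-size scaling hypothesis
    id(T, L) = L**zeta * f((T - Tc) * L**(1/nu)) holds, all rescaled
    curves fall onto a single function f."""
    for L, (T, id_vals) in curves.items():
        x = (np.asarray(T) - Tc) * L ** (1.0 / nu)
        y = np.asarray(id_vals) * L ** (-zeta)
        plt.plot(x, y, "o-", label=f"L={L}")
    plt.xlabel(r"$(T - T_c)\,L^{1/\nu}$")
    plt.ylabel(r"$I_d\,L^{-\zeta}$")
    plt.legend()
    plt.show()
```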
Can you briefly explain finite-size scaling? Okay, the finite-size scaling hypothesis, let me see if I can explain it briefly. It is essentially telling you the following: in the vicinity of a critical point, when the correlation length of your system is of the same order as the size of the system itself, because you are studying a finite size, in that specific regime all low-order observables, like two-point correlations and so on, everything which is governed by RG if you wish, behave in a manner where the system size plays the role of the correlation length, precisely at the critical point. Under this hypothesis the correlation functions behave universally, and this equation is one example. If you are interested, look at the review I mentioned previously, which is great.

(You are breaking up; the connection is fuzzy. Yes, the connection is a little bit fuzzy at times, I am sorry for that. Let me close this application; is it better now? I think so. Okay, let me go ahead, but let me know if there are further problems.)

So that is what this analysis is telling us. And there is an additional, more quantitative point, because fit parameters like the temperature and the critical exponent from the collapse are kind of okay, but we want to see if we can do better. For instance, we can plot the position of the minimum that we were observing before as a function of inverse system size, and this is what we get: a fit that predicts a critical temperature of 2.278 and so on, which is of the order of one percent, actually less than one percent, away from the expected critical temperature. So this understanding of the many-body problem in data space is not only useful to characterize the problem on its own; it is quantitatively accurate enough to determine properties of the transition. And notice: I have not told you what the transition is about, I did not tell you what the underlying problem is, I just gave you the data. You just had to analyze the data, and without knowing what the transition is, you are able to get the correct critical point with a precision better than a percent, which is non-trivial I would say, because it is essentially fully assumption-free.

Now, there is a very easy critique of everything I told you: the Ising model on the square lattice is a very simple model, it is exactly soluble. To answer this critique, we did the following: we studied a set of models known in statistical mechanics as q-state Potts models, in two dimensions on the square lattice. The idea of Potts models, for those of you who have not heard of them, is that they are a simple generalization of Ising in the following sense: instead of variables that can only take two values, we have variables sigma_j which can take q values, from 0 to q-1. This implies that our data space is made of q-valued rather than binary numbers, and the corresponding energy function does not just count whether two spins are both plus one or both minus one, but is a delta function rewarding neighboring spins that point in the same direction; this is the ferromagnetic formulation of the Potts model. Now, this Potts model is richer; it is not fully soluble in general, but the transition point can be determined analytically and follows this formula: T_c equals one over the logarithm of one plus the square root of q.
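The transition temperature formula is easy to evaluate; here is a small sketch (units of J, with k_B = 1; note, as a caution, that for q = 2 this delta-function convention differs from the plus/minus-one Ising convention by a factor of two in the coupling, so the number is not directly comparable to the 2.26 quoted earlier).

```python
import numpy as np

def potts_tc(q, J=1.0):
    """Exact transition temperature of the ferromagnetic q-state Potts
    model on the square lattice: Tc = J / log(1 + sqrt(q))."""
    return J / np.log(1.0 + np.sqrt(q))

for q in (2, 3, 4, 8):
    order = "second" if q <= 4 else "first"
    print(f"q = {q}: Tc = {potts_tc(q):.4f} ({order}-order transition)")
```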
What is interesting is that the nature of this transition changes as a function of the number of states q: in particular it is second order for q equal to three and four, and first order for q larger than four. So with these Potts models we can check two new scenarios: first, a second-order transition which is not the Ising one and has a different exponent nu; and second, a first-order transition, to see what happens there. Let me show you the results directly. These are the same plots that I showed you before, now for the three-state Potts model. In the first plot, the ID versus temperature again displays a feature, a minimum, which moves towards the transition, so the emergent simplicity is still there. The second plot is the collapse from finite-size scaling theory, and again the curves collapse very well; there are a few outliers on the left side of the transition, which I think are mostly due to the fact that we do not have accurate enough estimates of the ID. There is really no physics there; it is mostly a computational limitation. Then, out of this, we can again attempt the finite-size scaling estimate of the critical point position and of the critical exponent nu. The critical point position, as you can see in the plot on the right, we estimate with this linear fit, and the result is very close to the expected one; again the error is of the order of one percent, mostly limited by the fact that we cannot go to much larger systems: the Monte Carlo simulation is not particularly difficult, it is the determination of the ID that becomes demanding. At the same time we can estimate nu, and the value we get is 0.805, which tells us two things. First, we do not always get one, which is a good sanity check: for the Ising model we were not getting one just because that model is particularly simple, we were really getting the correct critical exponent. Second, it is rather close, within a few percent, to the expected value of 5/6.

So that is a confirmation of what we learned from the Ising model, but now we can move beyond second-order critical points and ask what happens at first order. This is an example taken from the eight-state Potts model, where the transition is actually a relatively strong first-order one, so the correlation length at the transition is not so large. On the left you see a plot of the intrinsic dimension versus temperature, and what you see is not what we had before. All the curves look like they want to develop a minimum, look at this yellow curve, but then the ID jumps very close to the critical point, and the larger the system size, the better you see this jump. So the qualitative description that emerges from our simulations is that the intrinsic dimension experiences a kind of discontinuity, in the form of a local maximum, when you approach the critical temperature. We try to estimate the position of this maximum with these yellow dots; it is not so easy, but that is the attempt. And why is this happening? Why is the intrinsic dimension of the dataset exploding?
Well, the reason, as I anticipated last time, is that at a first-order phase transition we have coexistence, we have metastability: there are two orders juxtaposed. What does this imply for the data structure? Suppose that you have the paramagnetic states here, and the ferromagnetic order at low temperature; they are typically described by very different states. There can of course be an overlap, but the states are very different. What happens at the transition temperature is that these two huge clusters merge, so the intrinsic dimension of the dataset locally explodes. Obviously it does not diverge, because it is still bounded by the larger of the two clusters, even though the merged set is not just a direct sum; but it is very large. This signal of metastability is exactly what is happening here: there really is a kind of increased complexity of the data structure at first-order phase transitions. Now, is this also related to universality? The answer is yes, but it is not the same universality that you have at critical points; it is what is called dimensional scaling, or mean-field-like scaling if you want, although it is not exactly mean-field scaling. What happens in practice is that the estimated transition temperature scales to its correct value in the thermodynamic limit not with a critical exponent, but with an exponent which is just d, the dimension of the system. That is exactly what we find in our numerics, depicted here: the blue line is the correct value in the thermodynamic limit, and the red line is our fit to the numerical data with the correct value of d, which is 2 in our case.

Is there an example in data science linking Griffiths phases and this structure of criticality? When I think about Griffiths phases, and maybe you have to correct me because you might mean something different, I mean phases characterized by non-trivial transport properties; but those are typically disordered systems, not immediately related to criticality. Or do you mean something different? Excuse me, I mean systems that have a multicritical point, not just one critical point. Ah, that is what you mean. We have not studied them; if the point is really critical, I do not see a fundamental reason why the conventional physics should not work, but since I have not tried explicitly, I cannot confirm it.

Marcello, your video is a little bit broken now, and also the audio. Let me do the following: I will share the PDF, because I think it is better than Keynote. Give me just a second; I stop sharing, because I think that was probably the main reason. I have just noticed that locally my slides are also slow, probably because there are a lot of data points.

So far, in both of these models, we have only dealt with phase transitions where there is an underlying symmetry that is broken: we move from a ferromagnet to a paramagnet. The natural question to ask next is whether we can find interesting features with universal behavior in the data structure also in cases where there is no symmetry breaking, particularly if the transitions are topological. The example we consider is the two-dimensional XY model. Matteo, is it still bad? Yes; maybe if you switch off the camera that would be better. Okay, let me try.
So, what happens in this XY model is that there is a transition as a function of temperature, but this transition is not signaled by any local order parameter, even though we have a (quasi-)ordered ferromagnetic phase at low temperature: the transition occurs through the proliferation of so-called topological defects, without symmetry breaking. For the two-dimensional XY model the transition lies in the so-called Berezinskii-Kosterlitz-Thouless (BKT) universality class, and it appears at a given value of the temperature, which is 0.89 and so on. So what happens to the data structure in this case? These are the results, again from numerical simulations, and on the left you find the intrinsic dimension versus temperature. At very small system sizes you actually see no feature: the intrinsic dimension for the smallest system size, 30 times 30, is this black line, just a monotonic curve. By increasing the system size you start noticing that there is indeed a minimum developing, marked by these yellow dots. However, the minimum is not very close to the transition point, which is marked by the dashed line. This is strange at first, but it is actually consistent with the finite-size scaling theory of BKT, described by this equation here: the finite-size estimate of the critical temperature approaches the correct BKT temperature not with a power law, as we saw for second-order transitions, but with one over log squared of the system size. So the approach to criticality is extremely slow. This is confirmed in the plot on the right, where on the y axis we have the critical temperature extracted from the minimum positions, these yellow dots, and on the x axis we have one over log squared L. All these points are close to each other: the system sizes are very different, but one over log squared makes the differences very small. And you can see that, apart from the first two points, which are a bit off, probably because the sizes are too small, the points lie nicely on a line, consistent with this scaling, and we get a critical temperature within about 5% of the expected value. However, in my opinion an even stronger statement can be made by studying the collapse. The collapse here is not against a power of the temperature and the system size, but has a different functional form, which I will not bother you with; it is very similar in spirit to what we did for the second-order transitions. Here are the results, plotted for very different system sizes over a pretty broad range of temperatures, and all of them collapse onto the same universal curve, which is the picture here. So we can say, out of these results, that even for topological transitions data structures have universal behavior, and this universal behavior reproduces the correct critical properties: the correct scaling and the correct critical temperature.
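The three extrapolation laws used in these fits, a power law for second order, dimensional scaling for first order, and the log-squared drift for BKT, can be summarized in a short sketch; the function names are mine, and a real analysis would also propagate error bars.

```python
import numpy as np

def fit_tc_power(Ls, T_star, p):
    """Fit T*(L) = Tc + a * L**(-p): use p = 1/nu at a second-order
    transition and p = d (the spatial dimension) at a first-order one."""
    x = np.asarray(Ls, dtype=float) ** (-p)
    a, Tc = np.polyfit(x, np.asarray(T_star, dtype=float), 1)
    return Tc, a

def fit_tc_bkt(Ls, T_star):
    """Fit T*(L) = T_BKT + b / log(L)**2, the much slower BKT drift."""
    x = 1.0 / np.log(np.asarray(Ls, dtype=float)) ** 2
    b, T_bkt = np.polyfit(x, np.asarray(T_star, dtype=float), 1)
    return T_bkt, b
```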
everything but I want to give you some ideas at least so the first I mean first we can decouple the question the first part is why is the first idea please please send does this idea that you found as a critical exponent have any other you know in a scaling relation with any other critical exponent the answer is no in the following sense we only know we get the critical exponent nu and this is the same one of the critical theory so id as the creates as the same critical exponent of the correlation length however we were not able to determine whether id can also be connected to other critical exponents like z for instance this we do not know and from the analytic explanation I will also tell you why we can probably get on you thank you cheers cheers okay so let me tell you a bit about the theory first I want to tell you why id is and then I will try to argue why this shell indeed exhibits universal scaling behavior okay well the why is the id singular well the idea is is very simple and as Alex told you id this is actually sensitive to scale so it really depends the specific sample that we're using is sampling a given length scale so that id is sensitive to changes of the data structure at the given scale how does it do that well it is very hard to understand in general but let me take the the a drastic simplification and instead of looking at really the interesting dimension let us perform dimensional reduction with a PCA and analyze what the corresponding figures of merits are okay and I want to mention that there is this beautiful paper by Leuang on which is the first analysis of PCA based model we are not doing the same thing but this is really popular was very important for us and I think it's very beautiful so on the right here you can find the the analysis of the two principal components or the two most important components the principal components at low temperatures on the left and at high temperature on the right and you can clearly see from these pictures that at now so in addition to the principal components here we are encoding the average magnetization in our points with colors so if the average magnetization is plus this is blue if it's minus this is red i think that's all the points of color and you can see that at very low temperature our data structure features two clusters they are made out of very similar states and they have distinct magnetization instead at high temperature essentially all states are equally important our principal components is just a single indicates that there is just a single huge cluster made of very different states so there is kind of a very fundamental change in the data structure even by just looking at cluster and in order to understand this a bit in a more quantitative manner one can compute the following one can compute a global id which is the one that we compute with the principal component by essentially doing the following by taking all states okay and this global id approaches one in the symmetry broken phase essentially that all the magnetization in the symmetry broken phase because at that point all the magnetization is really required um to describe the data set we have two very large separate cluster we just need to label them with the plus and one plus and minus and we are done and one can see that in this plot here as a gray line okay so this is essentially flat and I can tell you the value is one and oppositely in the high temperature this would be very large of course then we can define something a bit different which is not 
To understand this more quantitatively, one can compute the following. One can compute a global ID, obtained from the principal components by taking all the states together. This global ID approaches one in the symmetry-broken phase, because at that point only the magnetization is really required to describe the dataset: we have two large, well-separated clusters, we just need to label them with plus and minus, and we are done. One can see this in the plot here as the gray line, which is essentially flat, and I can tell you the value is one; at high temperature, instead, this global ID would be very large. Then we can define something a bit different, which is not the ID of all the states, but the ID of the states projected into a fixed magnetization sector: we select on purpose only states which have, say, positive magnetization. In the high-temperature phase this does not matter, and indeed the red curve and the green curve coincide at high temperature. But the low-temperature phase is very different: there, this ID is not probing the dichotomous plus/minus structure of the dataset, but the internal structure of each cluster. And this does not have to be small; in our case it is actually large, because it is diagnosing fluctuations. That is the reason why this local ID is very large at high temperature, is also very large in the symmetry-broken phase, and features a true minimum at the transition point, obtained with minimal assumptions.

Out of this simple PCA analysis one can draw the following universal picture: data structures across phase transitions have three regimes. There is a regime at very low temperature, where we have a number of clusters that correspond to the different quantum numbers of our theory: in the Ising model there are two clusters, in the three-state Potts model there are three, and in the XY model we would have a number of clusters labeled by the winding number. There, the intrinsic dimension is large because it sees fluctuations at the local scale inside each cluster. Oppositely, at very large temperature, essentially all quantum numbers appear together, and the intrinsic dimension is also large, because it sees a single, globally distributed cluster, which, as we saw in the first example, typically has a very large intrinsic dimension. What happens close to criticality is that these clusters, all of them, in a specific manner, start merging, because they start being composed of states which are very similar to each other. In doing so, the important degrees of freedom are no longer the local ones, which typically contribute a lot to the dimension, but the global one describing the change of quantum numbers. This implies that the intrinsic dimension close to criticality, where these clusters start merging, is somehow minimal: minimal in the sense that it now describes only how these quantum numbers change. This is the qualitative picture that we get, and it is important to note that it is irrespective of the nature of the transition: it only relies on the fact that we have a labeling of the excitations at low energy. It is applicable to second-order transitions and to topological ones; it does not matter how these quantum numbers are defined.
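Coming back for a second to the global versus sector-projected quantities, the projection is just a filter on the magnetization sign; here is a minimal sketch reusing the two_nn_id function from before (for Ising data, exact duplicates must be removed first, as noted there).

```python
import numpy as np

def sector_projected_id(configs):
    """Compare the ID of the full dataset with the ID restricted to the
    positive-magnetization sector."""
    data = np.unique(configs.astype(float), axis=0)  # drop repeated states
    m = data.mean(axis=1)
    return two_nn_id(data), two_nn_id(data[m > 0])
```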
Can we now go beyond this heuristic picture and do some analytics, deriving how the intrinsic dimension depends on the correlation length and on the critical exponents? The answer is yes, we can, but since analyzing the data structure is not an easy task, we have to make a certain number of assumptions. Let me explain these assumptions and then flesh out the argument. Remember how we compute the ID for a many-body problem: we do not look at all the states, we look at a finite sample, and we draw these distributions as a function of the variable mu, the ratio between the next-nearest-neighbor and nearest-neighbor distances.

Now, the first assumption I want to make is the following: since here we have a line, we assume that the slope of this line, which is the ID, only depends on the first point; we know that the line has to pass through the origin, so we say that the first point determines the slope. A brutal assumption, by the way. It implies that the ID is described by this simple formula, where r_1 and r_2 are nothing but the nearest-neighbor and next-nearest-neighbor distances corresponding to the smallest value of mu. The second assumption is that this first point of our line is not just a random state in configuration space: it is the state with the lowest energy. This is a rather reasonable assumption, because first, it is very likely that we have sampled this state, or a state very close to it; and moreover, it is also very reasonable that this state has the smallest ratio of next-nearest-neighbor to nearest-neighbor distance, because excitations on top of the ground state, if they are local, tend to have very low energy and to be similar to the ground state in configuration. Then there is a third assumption we have to make, because the ID does not depend on real-space correlation functions; it depends on correlations between the configurations that we have collected in our Markov chain. The third assumption is that correlations between configurations in the Markov chain are equivalent to real-space correlations at equilibrium. This assumption is very hard to prove; we have an analytical argument for why it should work, which I will not have time to present here, but I want to show you a numerical check. These quantities are computed for the Ising model for different system sizes, and also for the XY model for two different system sizes. Look at the XY model: the analytical prediction based on this assumption gives these green dashed lines, and you can see that they are in very good agreement with the numerics, these red points, red squares, and blue triangles. So this assumption, which at first sounds basically completely crazy, that correlations between configurations are equivalent to real-space correlations, actually holds true for these data structures, and it holds even beyond its expected regime of validity, which is in principle only very low temperature.

Based on these assumptions, the derivation has three steps. First, by the first assumption, the ID is determined by the first point alone. Second, we can write the quantities entering this first point as a series of correlation functions, which we call f_2, f_4, f_6, where f_p denotes p-point correlations between configurations. Third, we use the assumption that these are correlation functions at equilibrium. Now, the most slowly decaying correlation function at equilibrium is the two-point one, and if we express the ID in terms of this correlation function, which decays as the exponential of minus r divided by xi, this implies that, close to the critical point, the singular part of the ID is proportional to the inverse correlation length.
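Schematically, and suppressing all constants and subleading terms, the chain of steps can be written as follows; take this as a sketch of the logic, my reconstruction rather than the full derivation in the paper.

```latex
% Assumptions 1 and 2: the slope is fixed by the lowest-energy sample
I_d \;\approx\; \frac{\text{const}}{\log \mu_{\min}},
\qquad
\mu_{\min} = \left.\frac{r_2}{r_1}\right|_{\text{lowest-energy state}}
% Assumption 3: chain correlations reduce to equilibrium correlations,
% whose slowest-decaying piece is the two-point function
G_2(r) \;\sim\; e^{-r/\xi}
% Expanding log(mu_min) to leading order in G_2 then gives
I_d - I_d^{\min} \;\propto\; \xi^{-1} \;\sim\; |T - T_c|^{\nu}
```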
So this proves two things. First, the ID has to be minimal at the critical point, because that is where the correlation length is largest; it will not go to zero, because this is only the leading correlation, and of course there are other contributions that keep it finite, but it has to be minimal. Second, and most important, this predicts universality, because it implies that the data structure close to criticality is governed by the same critical exponent that governs the behavior of the correlation length, namely nu. One can derive the very same equations, or almost the same, also utilizing maximum likelihood; that route only uses one of the assumptions I told you about, but the derivation is a bit more messy. I think Alex presented maximum likelihood to you, so you can do it on your own, or you can find it in our paper; it is actually a few lines, but a bit tricky. And I think I will skip the quantum mechanical part: in principle one can apply what I told you about classical partition functions also to path integrals, but there are some unexpected challenges; this slide is just a demonstration that everything works in the end, but I will not go over it now.

So let me go to the conclusions. What I promised you at the beginning of the course is that there would be fruitful interactions between data science on one side and statistical mechanics on the other, in particular leveraging the concept of universality. I hope I have convinced you that the data structures of many-body problems are not just collections of random numbers, not just random ensembles: they have unique, characteristic properties, features of emergent simplicity. This suggests that there may be other ways of characterizing the complexity of many-body systems by leveraging these simple data science tools that we know how to apply to objects such as a partition function. But not only that: what I find particularly intriguing is that the data structures of many-body systems are universal. They are not messy; they are perfectly ordered. In the same way we think of many-body systems in terms of correlation functions and spectra, data structures also have this universality, I would say even more so to some extent. This might be useful in the future if we want to understand many-body systems, for instance by developing new algorithms that leverage the fact that we do not have to search for physical properties in a messy data space, but know how to constrain it to a universal one. And I think what is also interesting, maybe at a more qualitative level, and this still needs to be developed, especially in the quantum mechanical case, is that data structures carry qualitative signatures of physical behavior. We have seen one example, metastability at first-order phase transitions, and there may be others that we simply do not know, because we have not studied them so far. I think it will be interesting in the future to understand whether data structures can enable new insights into many-body problems that are very hard to capture with conventional tools. Okay, I think I am done, so thank you all, and I am happy to take questions.

I have a question related to the exam. Yes? Why is it on the 28th of April? Because it is the 28th of April, I do not know, why are you asking? No, I mean, it is almost a month and a half from the end of the lectures. Okay, I think we did that for several reasons, and one of the reasons is that you guys have time to absorb things and are not rushed.
We know that you have other exams, and we did not want to set a date which is too close, because that would not have been good for some of you; so we put a single date. Okay, it was just my curiosity, thank you.

So, questions? A lot of this is work in progress, so do not be shy. I have a question, but I think it is a very stupid question; I will ask it anyway. I really did not understand, at the very start of the lecture, what was the intuition that led us to think that if the correlation length diverges we should also expect the intrinsic dimension to become very large. Okay, yes, I did not explain that in detail, so it is not that your question is stupid at all; it is just that my explanation was very short. The idea was the following: if you have a very large correlation length, you naively think that states which are very different from each other are equally represented in the partition function. The reason you would say so is that when the correlation length is large, it is true that two spins at large distance are correlated, but they are correlated only in a power-law manner, so I can have clusters of very different shapes. To some extent, in the context of the Ising model, one can also see this at criticality: at the critical point the states that appear in the partition function are very different from each other, they contain clusters of spins arranged in all sorts of ways. So naively you would say: these are points in data space which are all very far from each other, since going from one to another flips many spins, so they are at large distance and will cover my full cube, if you want. The reality is that it is true that they span the full cube, but they do so in a very precise manner: I can have points very spread out in the embedding space that nonetheless define only, for instance, a cylinder. I see, thank you, thank you very much.

Excuse me, professor, can you go back to the slides where you connected the q-state Potts model and the Ising model, I guess slides 12 and 7, the plots actually? Yes, so you want this one? I think this one, okay, and what is your question? The previous one, I mean. I want to make sure that I saw this correctly: at the critical point the ID decreases as the system size grows, right? Well, at the critical point the ID develops a minimum, and the minimum moves towards the transition as the system size grows. Okay.

There is a question in the chat, a very good question actually: how general is your third assumption in the framework of Markov chains? We are trying to understand this; it is not clear. We believe it is possible that our assumption is actually true in general for Markov chains, because Markov chains are not just random data structures, they are data structures with a specific ordering and so on, so it could be that analytically one could prove that our third assumption is not an assumption but an actual property of Markov chains. We have tried, we have failed so far, but if you are an expert in Markov chains, I invite you to try; I am not an expert, and maybe that is why we have not managed. It could be that this is not an assumption; it is a very interesting point, by the way. Will you send the references on Matrix? Yes, I will send everything there: the slides, additional references, you will have everything. Okay, so if there are no further questions, then I think we can thank Marcello again and take a ten-minute break before the next lecture by Edgar.
Thank you very much, and thanks to all of you as well. Thank you all, bye.