Yesterday was a kind of teaser, so today we are going to learn about the spin glass cornucopia. Spin glasses were originally motivated by some anomalies in magnetic materials, in alloys, but the methods that were developed have since found applications in all kinds of fields, including my own field of string theory, but also in computer science and information theory. So today I will tell you some of the history of the statistical physics of these systems, and how it helped to create a set of tools and concepts that have become very useful when applied to various problems in computer science, information theory, and machine learning. I will summarize forty years of research in one slide, so I will have to be concise. The topic really started in the 70s with the experimental observation of some anomalies in the magnetic response of totally obscure and useless alloys. I insist on that, because pretty soon there was a lot of interest in spin glasses, and yet, in spite of forty years of research, these materials have always been useless; I actually don't know of anybody who got a grant on them. But they very rapidly triggered a lot of intellectual curiosity, and they turned out to open completely new fields that were unexpected. Some benchmarks: in 1975, Edwards and Anderson were the first to define a nice model of a spin glass, one on which one can actually work. They also had the idea of the basic order parameter, which is the overlap that I will describe, and they pointed out the existence of a new phase of matter, a glassy phase of spins. Starting in 1978 the theory developed mainly along two lines: the mean field theory built around the Sherrington-Kirkpatrick model, which was solved by Giorgio Parisi (we then elaborated on his work and finally understood the solution, which was initially extremely mysterious), and in
parallel, an alternative way of addressing the subject, which was the study of real-space droplets, developed by Fisher and Huse. Where do we stand today? Real spin glasses still have some open questions; I would not say that the subject is solved, and there are even some heated debates and controversies. I will show you a few of the experimental results, which are very interesting, but the thing that interests me most, and that is the whole content of this talk, is the fact that in trying to understand spin glasses we developed an incredibly sophisticated and very powerful corpus of conceptual and methodological approaches. They have been successful in analyzing the mean field approach, and in quite recent years this has been fully confirmed by mathematicians, who have rigorously demonstrated that all these constructions of physicists were correct. But most importantly, all this has been used in many different fields. It started probably in 1984, when the first application to optimization problems, a completely different branch of science, appeared with the works on random assignment and the random traveling salesman problem, and since then it has been expanding continuously; I will show some specific examples. This was described by Phil Anderson, who wrote a series of articles in the 80s in Physics Today, one of which had the title "Spin glass as cornucopia". At that time I didn't know what a cornucopia was, so I put a drawing so that everybody gets in phase with me about that. This is a spin glass: a real spin glass is a piece of copper with some impurities of manganese. It is something totally uninspiring; it looks like a piece of copper, it is not spectacular, so it is hard to catch the attention of people with spin glasses, but with a cornucopia it is easier. So why is spin glass theory so interesting? Let me go backwards a little
bit into what statistical physics is. Statistical physics was created more than a hundred years ago to understand the collective behavior of systems built from many molecules, let's say. In some sense, if you keep away from the tiny point in the phase diagram which is the phase transition point, you can get an idea of the collective behavior relatively easily by doing what is called mean field: you take a molecule, it has the environment of the other molecules, but the other molecules see the same environment as this molecule, so you do a self-consistent description. It is very easy, and roughly speaking you get the main phases where the system can be; of course there is the complication of a very large correlation length developing near the phase transition point, but that is a different story. So this is easy. If you want to go to dirty systems, you first add defects, dislocations, pinning sites, and so on. But the story of spin glasses was about going to the strongly disordered system, a system in which every spin really sees a very different environment, and when you get there, even the nature of the basic phases, even the phase diagram, is difficult to understand. The mean field theory turned out to be extremely complicated and sophisticated to develop. In some sense, think of an analogy with finance: if you have a financial market and all the agents playing on this market have the same strategy, then it is relatively easy to understand what takes place; you take the representative agent, and he plays his strategy in a self-consistent field created by the other agents, who do the same thing as he does. If you have heterogeneous agents, each one with his own strategy, then you need to develop a new tool: you cannot have just one effective agent, you need something much more sophisticated, which is a population of effective agents, and that is more
or less what is done in the cavity method. This is just to say that there have actually been many developments of these spin glass ideas in the field of economics and financial markets, and even very successful companies have been built elaborating upon that, but I will not tell you too much about it. So, the table of contents for today's talk: I will describe some basic ideas about spin glasses, about the dynamics and the mean field description, and then I want to have two chapters on different branches of science. One is about information theory, the phase transitions and glassy phases in some aspects of information theory, namely error correcting codes, and then some developments in computer science, in constraint satisfaction problems. In doing all this I shall also develop the methodology, in some sense: tell you about replicas and about the cavity method. These are the two main pillars of spin glass theory, and they are the ones that have been so successful, in particular the cavity method, which can be turned into very elaborate algorithms to handle important problems in information theory and computer science. I start very easily, just to recall where we are, so that everybody understands. If you take a simple magnet, you can describe it in some regime by Ising spins, just plus or minus one; here I take classical spins, I will not go into the quantum mechanical description. You have exchange interactions: neighboring spins have a lower energy when they point in the same direction, so this is just an Ising model. You look at the equilibrium configurations, and at low temperature the spins tend to align, and a global magnetization sets up, so you have two ordered states. If you want to represent it again with the representative agent that I was describing before, you look at one spin and you suppose it has a
spontaneous magnetization m. You look at how this magnetization is related to the values of the other spins: it should be the expectation value of the tanh of the local field, and, doing a mean field approximation, you replace it by the tanh of the expectation value of the field, so you get a self-consistent equation for the magnetization. This was done by Weiss roughly a hundred years ago, and very successfully applied to the phase diagram of ferromagnets: you get the two states and the spontaneous breakdown of the symmetry. That's easy; that was just one minute to set the stage. Now I do the same thing in a spin glass. In a spin glass you take copper-manganese, with one percent of manganese impurities in copper. It turns out that the manganese atoms carry a magnetic moment and the copper does not, so you look at the magnetism of the manganese. The magnetic moments of the manganese interact, but the interaction is mediated by the conduction electrons of the copper, and this interaction between i and j is an interaction energy minus J_ij s_i s_j, where the sign of J_ij depends on the distance; this is due to the Ruderman-Kittel oscillations of the conduction electrons, which I will not explain. The thing you have to remember is that, depending on the distance between two spins, they can either have a ferromagnetic interaction, where they want to point in the same direction, or an antiferromagnetic interaction, where they want to point in opposite directions at low temperature. And again you are interested in these systems at thermal equilibrium at a certain temperature T. So this is a spin glass: the coupling constant depends on the pair of spins, and there are two aspects to it. There is disorder, called quenched disorder: the manganese atoms don't move, so you can basically use a static
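As an aside of my own (not part of the lecture): the Weiss self-consistent equation m = tanh(beta J z m), for z neighbors with uniform coupling J, can be iterated numerically; here the combination beta*J*z is folded into a single parameter.

```python
import math

def weiss_magnetization(beta_J_z, m0=0.5, tol=1e-12, max_iter=100000):
    """Iterate the self-consistent equation m = tanh(beta * J * z * m)
    to a fixed point, starting from a small positive seed (this picks
    out the m >= 0 branch of the broken symmetry)."""
    m = m0
    for _ in range(max_iter):
        m_new = math.tanh(beta_J_z * m)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

print(weiss_magnetization(2.0))   # spontaneous magnetization (beta*J*z > 1)
print(weiss_magnetization(0.5))   # essentially zero (beta*J*z < 1)
```

Above the mean-field transition point beta*J*z = 1 a nonzero magnetization appears; below it the iteration collapses to m = 0.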
approximation, which is a very good approximation. So, a sample of a spin glass corresponds to giving the values of all the coupling constants J_ij, the interactions between all the spins; another sample, with another realization of the positions of the manganese atoms, will have another set of J_ij. So this is a disordered system. It also has frustration, in the sense that for specific values of the distances you can find triplets i, j, k where all three interactions are negative. Three negative interactions means that each pair of spins is coupled antiferromagnetically, so each pair wants to point in opposite directions, but you cannot have a triplet of spins where all pairs point in opposite directions. This is called frustration. Disorder and frustration are the two main building blocks of spin glasses, and the main question is what happens at low temperature. At low temperature, the first thing you see is that non-equilibrium properties set in. This is a plot of the magnetization, measured as the linear response to a small magnetic field, as a function of temperature. At high temperature you see the magnetization growing as you cool, as it would in a paramagnet, but then you see two curves. One curve is what you observe if you cool the sample in the small field: you have a cusp, and this is the field-cooled magnetization. The other curve is what happens if you cool the sample at zero field and then turn on the field: you see the magnetization which is the response to this field, and it does not reach the other one. So clearly the system here and the system there have the same temperature and the same magnetic field but not the same magnetization: the magnetization depends on the history, on how you have
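As a concrete aside (my own illustration, not from the lecture), one can enumerate the eight configurations of a triangle with three antiferromagnetic bonds and see that at least one bond is always unsatisfied:

```python
from itertools import product

# A triangle with three antiferromagnetic bonds: J_ij = -1 for every pair.
J = {(0, 1): -1, (1, 2): -1, (0, 2): -1}

def energy(s):
    """E(s) = -sum_{i<j} J_ij s_i s_j over the three bonds."""
    return -sum(Jij * s[i] * s[j] for (i, j), Jij in J.items())

best = min(energy(s) for s in product([-1, 1], repeat=3))
print(best)   # -1: at most two of the three bonds can be satisfied at once;
              # a fully satisfied triangle would have energy -3
```

The minimum is -1 rather than -3: whichever configuration you pick, one bond is frustrated, which is exactly the point made above.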
prepared the system. So it is out of equilibrium. There is a very rich phenomenology; I just want to show you one aspect of it which I find really nice. This is another measure of the susceptibility, the linear response to a magnetic field, in this case an out-of-phase susceptibility, but never mind. You have a spin glass and you have cooled it below its glass temperature, and you look at the susceptibility as a function of time: you have applied a very small magnetic field, you measure the magnetization, and you see how it relaxes. It turns out that it relaxes on time scales which are very, very long. You see here the susceptibility decreasing; you do nothing, the system sits at a given temperature, it is quiet, it still looks like a piece of copper, you don't see anything, but if you measure carefully, you see that after a few hundred minutes the susceptibility is not the same as before. Something is taking place: the system is actually aging, and it ages on very, very long time scales. Here, after four hundred minutes, our friends the experimentalists in Saclay had decided on a protocol in which they wait for four hundred minutes and then change the temperature a little: the system was at 12 Kelvin, and they bring it to 10 Kelvin. At 10 Kelvin, what happens is a kind of restart of the dynamics: a new dynamics sets in with a larger susceptibility, and it relaxes again. The system has completely understood, in some sense, that there was a change in temperature, and it has its new dynamics for another four hundred minutes. After eight hundred minutes total, they turned the temperature up again, back to 12 Kelvin: again there is a restart of the dynamics, and the system relaxes very slowly. Now the interesting thing is that if you take this curve and you cut out the central piece, you take
your scissors here and there and you glue the two pieces, this and that, you get this curve, which is the relaxation you would get by staying at 12 Kelvin. So what does it mean? A few qualitative lessons: this is a system which ages on extremely long time scales, actually as long as you can measure; its aging depends on the temperature; and if you cool it, it has a new dynamics and a new aging, but it keeps the memory of where it was at the higher temperature. There are various ways of analyzing that, but one way was using mean field theory. In mean field theory one understands that the system has an energy landscape which is very complicated, and basically, if you are at a certain temperature, you are in a certain basin, and the system is exploring this basin; it is not thermalized in this basin, but it is exploring it. Then maybe you lower the temperature and you reach a sub-basin, so the system starts exploring this sub-basin, and when you raise the temperature again, which means you raise the energy again, it restarts the exploration where it was. In some sense you quench into a certain sub-part of the phase space and then you raise it again. This is related to what is called the ultrametric structure of metastable states and pure states in spin glasses, which I will not elaborate upon, but it is what is obtained from mean field theory; there is an alternative description of that using the growth of droplets. So out-of-equilibrium effects are crucial, and you can either study the dynamics itself, or you can study the equilibrium and infer from it the out-of-equilibrium properties, as I was saying. So let me define the Sherrington-Kirkpatrick model, which is a mean field model. It is a description of an ensemble of spin glasses; an ensemble means that I describe the system by Ising spins that interact in pairs.
Each pair is coupled by a coupling J_ij, and the J_ij are identically distributed Gaussian random variables. I draw all the J_ij, this gives me one sample, and this sample I can study; then I can take another sample and study it, again and again. Most of the thermodynamic properties become sample-independent: in the large-size limit, the free energy, the magnetic susceptibility, and so on are independent of the sample, but there are some subtle properties which fluctuate from sample to sample. I will summarize a long and complicated history of spin glasses with a cartoon of the energy landscape. In the Ising model you had just two possible states: either the spins point up in majority, or they point down. In the spin glass you have a very complicated landscape with many valleys, actually an exponential number of valleys, and they are organized in an ultrametric structure, which means a hierarchical organization. All this takes place in the spin glass phase, below a certain critical temperature. The true ground state is very hard to find, and it is fragile to small perturbations: you take a spin glass sample, you perturb it a little bit, and the true ground state changes completely. Maybe I will skip this slide; it was just to say that there is a direct application to finance, and if some of you are interested in finance I will tell you later. Let us look at the order parameters. I go back to the ferromagnet: in a ferromagnet, if you want to define an order parameter you have to do something, because there is a spontaneous breaking of the symmetry. The real thing that you should do is apply a very small magnetic field, compute the magnetization, and take the limit where the field goes to zero.
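A minimal numerical illustration of this ensemble (my own sketch, not from the lecture): draw Gaussian couplings, compute the ground state by brute force for a small system, and compare a few samples. I use the common convention that each J_ij has variance 1/N.

```python
import itertools
import random

def sk_sample(N, rng):
    """One Sherrington-Kirkpatrick sample: i.i.d. Gaussian couplings J_ij,
    here with variance 1/N (a common normalization convention)."""
    return {(i, j): rng.gauss(0.0, (1.0 / N) ** 0.5)
            for i in range(N) for j in range(i + 1, N)}

def ground_state_energy_per_spin(J, N):
    """Brute-force minimum of E(s) = -sum_{i<j} J_ij s_i s_j over all
    2^N configurations; only feasible for small N."""
    best = min(-sum(Jij * s[i] * s[j] for (i, j), Jij in J.items())
               for s in itertools.product([-1, 1], repeat=N))
    return best / N

rng = random.Random(0)
N = 12
energies = [ground_state_energy_per_spin(sk_sample(N, rng), N)
            for _ in range(5)]
print(energies)   # fluctuates from sample to sample; the spread shrinks as N grows
```

Even at N = 12 you can see the point made in the talk: the ground state energy per spin is roughly the same from sample to sample, with fluctuations that die out in the large-size limit.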
If the field goes to zero through positive values, the spins align positively and you get a magnetization M; if it goes to zero through negative values, the spins point down and you get the magnetization minus M. That is the breaking of the symmetry. A spin glass is a bit like that, but you have many, many states. Basically, there are magnetic fields such that the spins point in random directions, and you don't really know them. If you knew them, what you would do is apply a very small magnetic field pointing in the right direction, corresponding to this minimum of the energy, and take it to zero; then you would get the spontaneous magnetization for this state of the spin glass. But there are many states, so this is a complicated story: in order to define the local magnetization m_i^alpha (i is the site index, alpha the state), you need a field that you don't know, and you would need to know the magnetization in order to compute the field. So the magic approach to spin glasses was really the idea of using replicas: I myself don't know the correct value of the local magnetic field, the one that will pick out one state. In this curve here there is one state, and it corresponds to a certain pattern of magnetizations, so a certain pattern of local fields; I don't know it, but the system knows it. So the idea is to take two copies of the spins: you imagine two spin systems, spin system number one and spin system number two, and you look at what we call the overlap between system number one and system number two. You ask: what is the probability of this overlap? That is, if I take one configuration at random with my Boltzmann probability, and another configuration at random with my Boltzmann probability, how do they compare? Are they close to each other, are they very
different from each other? The order parameter is the probability distribution of the overlap, P(q), and it was found that there are really two families of spin glasses. The first family has a continuous transition, with full replica symmetry breaking. At high temperature the overlap is peaked on only one value: P(q) is just a single delta function. If you take two configurations at random, they generically have a certain angle in large dimension; for instance, if there is no magnetic field, they will be orthogonal, so q will be equal to zero. That is the easiest case. Below a certain temperature, P(q) develops a structure: it becomes a function with a continuous part and two peaks, and this function develops continuously around the phase transition. Then there is another family of spin glasses with a discontinuous transition, so-called one-step replica symmetry breaking, where close to T_c, but below it, you develop two peaks, and only two peaks. It means the spin configurations are clustered: if two configurations are in the same cluster you have one overlap, and if they are in different clusters you have a different overlap, and that's it, a very simple organization. It develops gradually at the transition, but the two types of overlap are discontinuous. And you can probe this: if you take two replicas with a small attraction and send this attraction to zero, the overlap will go to this value; if you take two replicas with a small repulsion, it will go exactly to that value. So you have an order parameter that you can really measure, and it tells you the nature of the transition.
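To make the order parameter concrete (my own aside): the overlap between two spin configurations s^1 and s^2 is q = (1/N) sum_i s^1_i s^2_i, which is trivial to compute.

```python
import random

def overlap(s1, s2):
    """q = (1/N) sum_i s1_i * s2_i for two configurations of N Ising spins."""
    return sum(a * b for a, b in zip(s1, s2)) / len(s1)

rng = random.Random(1)
N = 10000
s1 = [rng.choice([-1, 1]) for _ in range(N)]
s2 = [rng.choice([-1, 1]) for _ in range(N)]
print(overlap(s1, s1))   # 1.0: a configuration with itself
print(overlap(s1, s2))   # near 0: two independent random configurations
                         # are nearly orthogonal in large dimension
```

In an actual measurement of P(q), the two configurations would be drawn from the Boltzmann distribution (for instance by running two independent Monte Carlo simulations of the same sample) rather than uniformly at random.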
Like yesterday, for the people who are happy programming, and programming fast, I like to give a small toy system in which you can see these things in a program. So I give you a very simple system which develops a spin glass phase and which you can simulate very easily. This is a system of Ising spins, s_i equal to plus or minus one, and they interact in triplets: instead of pair interactions, the interactions are of the form s_i s_j s_k. I will assume that the triplets i, j, k which appear in the energy are just randomly chosen triplets, so you have some kind of spin plaquettes if you want, and for each plaquette there is an interaction energy minus s_i s_j s_k. That's my system: I just need to tell you how many plaquettes there are, you choose them randomly, you generate a sample, and you can simulate it. Now you can think about the ground state energy. Look at a single plaquette: its energy is minimized either when all three spins point up, s_i = s_j = s_k = 1, giving energy minus one, or when one points up and two point down, since one times minus one times minus one also gives minus one. When you see that, you can start to understand the phenomenology. What is the ground state of this system? (Jean knows the answer, but you're cheating.) There is one solution which is obvious: you take all the spins up, and that minimizes the energy of every plaquette. So the ground state is easy: everybody points up. But you can see immediately that if, by accident, because of temperature or whatever, one of the spins points in the wrong direction, down, then immediately some of its neighbors in the plaquettes will need to point in the other direction too, and that is at the heart of the phenomenology. You can do a Monte Carlo simulation and look at the energy as a function of temperature.
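Here is a minimal Metropolis sketch of this toy model, my own implementation under the assumptions stated in the comments (four random triplets per spin, as quoted later in the talk, and single-spin-flip dynamics):

```python
import math
import random

def make_sample(N, rng, triangles_per_spin=4):
    """Random 3-spin model: about triangles_per_spin * N / 3 plaquettes,
    each a random triplet (i, j, k); the energy is E = -sum s_i s_j s_k."""
    M = triangles_per_spin * N // 3
    return [tuple(rng.sample(range(N), 3)) for _ in range(M)]

def energy(s, plaquettes):
    return -sum(s[i] * s[j] * s[k] for i, j, k in plaquettes)

def metropolis(s, plaquettes, beta, sweeps, rng):
    """Single-spin-flip Metropolis dynamics at inverse temperature beta."""
    N = len(s)
    touch = [[] for _ in range(N)]          # plaquettes containing each spin
    for p, (i, j, k) in enumerate(plaquettes):
        for x in (i, j, k):
            touch[x].append(p)
    for _ in range(sweeps * N):
        i = rng.randrange(N)
        # flipping s_i changes the sign of every plaquette term containing i
        dE = 2 * sum(s[a] * s[b] * s[c]
                     for p in touch[i]
                     for a, b, c in [plaquettes[p]])
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            s[i] = -s[i]
    return s

rng = random.Random(0)
N = 300
plq = make_sample(N, rng)
s = [rng.choice([-1, 1]) for _ in range(N)]
s = metropolis(s, plq, beta=2.0, sweeps=200, rng=rng)
print(energy(s, plq) / N)   # stuck well above the "crystal" value of -4/3
```

Quenching from a random start at low temperature leaves the energy per spin far from the all-spins-up value, which is the glassy behavior described next.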
You decrease the temperature, the energy follows this line here, and you find that it decreases and gets to about minus 1.16. That's what you get by doing what is called simulated annealing, decreasing the temperature slowly, with ten to the four Monte Carlo steps at each temperature. Now you say: okay, I'm not sure it was really well thermalized, let me go from ten to the four to ten to the five Monte Carlo steps. So I wait a bit more on my computer, I get the second curve, which is just there, and the energy decreases a little. I go to ten to the six, then ten to the seven, so here are four curves: ten to the four, ten to the five, ten to the six, ten to the seven Monte Carlo steps per spin. And actually I could prove to you that if you take the limit of a very large number of Monte Carlo steps per spin, very large but not diverging with the size of the system, then it saturates here. So that is what you get when you cool the system, and it is very far from the ground state, because this was a system with four triangles per spin, so the ground state energy per spin should be minus four thirds. In fact, if you take the configuration in which all the spins are up, which is the correct configuration at zero temperature, and you heat it up, you follow this other curve until you reach this temperature, and then it jumps back to the high temperature phase. So this is exactly the phenomenology that you have in glasses: you have a liquid phase; you have the equivalent of a crystal phase, where the spins point up with a small number of defects; there is a first order phase transition between the two; and here you have the supercooled liquid and the glass phase. So that is a very simple Hamiltonian that you can play with, very easy to program; by the way, in this case it was a simulation with a hundred thousand spins, so a very sizeable system. If you think of it in terms of optimization, the thing that you get by simulated annealing is very far from the real ground state energy, and this is an example of what is called a random first order phase transition; it is also an example of a discontinuous, one-step replica symmetry breaking problem. There have been an enormous number of applications of these kinds of ideas, going beyond spin glasses themselves; here is a small list of the various possible applications, but I want to take the time now to introduce two examples, one from information theory and one from computer science. So the first example is information theory. Just a reminder first of all: information theory is a beautiful branch of science. If one has to decide on a starting point, it is probably the paper of Shannon in 1948. During World War II, Shannon was working on communication, and he was interested in trying to define the information content of a message, how you can compress a message, and how you can send it over a noisy channel and correct the errors. After the war he wrote a beautiful paper which basically established the framework for all of this. From the beginning it was clear that information theory was deeply linked to statistical physics; Shannon himself decided to call his information measure an entropy, because he realized that the right measure of information for a probability law was minus the sum of p_i log p_i, which was just the entropy known to physics. Of course this branch of science evolved with its own dynamics, and it diverged from physics at the beginning, but in recent times there has been a convergence back, on various topics that I will describe.
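Shannon's entropy formula is simple enough to state in a few lines of code (my own illustration, in bits):

```python
import math

def shannon_entropy(p):
    """H(p) = -sum_i p_i log2 p_i, in bits (terms with p_i = 0 contribute 0)."""
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)

print(shannon_entropy([0.5, 0.5]))   # 1.0: one bit for a fair coin
print(shannon_entropy([1.0]))        # 0.0: a certain outcome carries no information
print(shannon_entropy([0.25] * 4))   # 2.0: two bits for four equally likely outcomes
```

Up to the choice of logarithm base (and Boltzmann's constant), this is the same functional as the entropy of statistical physics, which is the link made above.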
Here I want to tell you a little bit about information transmission and error correction. As soon as you transmit some information, you have the possibility of noise on the channel, and you should think of having an error correcting code. This is obvious if you want information coming from space; it also takes place every time you use your cell phone, there is an error correcting code; and it is also true that when you write to your hard disk and read from your hard disk, you use error correcting codes. The first would be a case where the channel has a lot of noise; the second a case with a very small amount of noise, but where you still don't want to make mistakes. And there are a lot of other processes, especially in living systems, in which you have to correct information: for instance in mitosis, where the information in the DNA is reproduced, you have error correcting mechanisms in order to correct for errors in the reproduction of the genetic code. The principle of error correction is the one that was proposed initially by Shannon: you have an original message of L bits, and you encode it into an N-bit message, with N larger than L. This is the idea of redundancy: the whole principle of error correction is that you send more bits than were really needed. We do that all day long; language is redundant, and if language were not redundant you would not understand me, because I am mispronouncing all the time, and the redundancy of the language corrects a few of my mistakes. So redundancy is something we use all the time; it corrects transmission errors. Then there is a transmission channel; you receive a message, let's say with the same number of
bits, and you want to decode. The simplest code is the repetition code: to send bit zero, I send it three times; to send bit one, I send it three times. This is called rate one third: the rate, the number of information bits divided by the total number of bits, is one third. So you send each bit three times, and it goes through a channel that flips each bit with probability p, maybe p equal to 0.1 or so, and some bit gets flipped. Then you decode. The idea is that we agree at the beginning on what the encoding and the decoding will be: the person who receives the message knows that it was created by repeating each bit three times, so he has a very simple decoding algorithm. He groups the received bits in packets of three and uses a majority rule: one one one, that was a one; zero zero zero, that was a zero; one one zero, well, it's likely that it was a one; and so on. In this way he corrects errors, and you can see that when the noise level p of the channel is small, the residual error probability is 3p squared to leading order. That's good: you started from an error probability p for a single bit, and you have reduced it to 3p squared. You can do better, with groups of five, groups of seven, and so on, except that you are adding more and more redundancy, so you pay a cost. Now, the thing that was discovered by Shannon is that there exist codes, more sophisticated than repetition, for which, at a given noise level, one can build an encoder and decoder that transmit with zero error, and that happens provided the rate is smaller than a certain threshold, the capacity of the channel. In Shannon's approach there are two ingredients which are very important, and they are extremely physical.
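The repetition code and its majority decoder take only a few lines; a quick simulation (my own sketch) checks the leading-order 3p^2 estimate:

```python
import random

def encode(bits):
    """Rate-1/3 repetition code: repeat each bit three times."""
    return [b for b in bits for _ in range(3)]

def channel(bits, p, rng):
    """Binary symmetric channel: each bit is flipped with probability p."""
    return [b ^ 1 if rng.random() < p else b for b in bits]

def decode(bits):
    """Majority vote inside each packet of three."""
    return [1 if sum(bits[i:i + 3]) >= 2 else 0
            for i in range(0, len(bits), 3)]

rng = random.Random(0)
p, L = 0.1, 200000
msg = [rng.choice([0, 1]) for _ in range(L)]
out = decode(channel(encode(msg), p, rng))
err = sum(a != b for a, b in zip(msg, out)) / L
print(err)   # close to 3 p^2 (1-p) + p^3 = 0.028, down from the raw p = 0.1
```

The decoder fails only when two or three of the repeated bits are flipped, which gives the exact error probability 3p^2(1-p) + p^3, hence 3p^2 at small p.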
One ingredient is the fact that he takes a thermodynamic limit: he looks at the limit in which you send very long messages, with the famous rate L over N kept finite. The second is that instead of describing one code, he describes a family of codes, an ensemble of codes: he tells you how to generate a code, and his generation of codes is very easy. The person who sends the message and the person who receives it agree on what they call a codebook. The codebook is a series of code words, and they agree that the sender can only send one of the code words of the book; that's it. In Shannon's construction the codebook is just made of random code words: you take the unit hypercube in dimension N, you pick a few vertices on it, these are the code words, and every message that will be sent is one of these special code words. Now imagine that the sender has chosen to send this code word. What happens to the receiver? We know that there is a certain probability p of flipping a bit, so he receives a message which has on average p times N flipped bits, that is, a message at distance p times N from the initial code word. So what he does is look at the received message, look on the sphere of radius p times N around it, and try to find a code word there; if there is one, he says it is likely that this was the message that was sent to him, and he has corrected the errors. The thing which is amusing is that you might think, sitting in front of this problem: I should find an arrangement of the code words on the hypercube that is optimal, with the best distances, far apart from each other; let me think about it and I will invent something. But the best code is the one
of Shannon: take them randomly, and it works. Shannon's code is provably optimal; it achieves the bound. The other amusing thing is to look at the probability of perfect decoding, because you see it is purely a problem of geometry in large dimension. You have these random points on the hypercube in large dimension, you pick one of them, you displace it by p times N, and you look on the sphere around what you have received: is there a single code word there, or many of them? It turns out that if the level of noise is below a certain threshold there is a single code word, and your process works perfectly; then there is a sharp phase transition, and beyond this critical value of the noise there are many code words, so you cannot tell which one is the correct one. That is the Shannon phase transition. It is extremely nice, except that it doesn't work in practice: it is a theorem, it is optimal, but the decoding takes an exponential time, because there is no structure in Shannon's codes. The only thing you can do is go through all the code words, and their number is exponential; if N becomes large you have an exponential number of code words to examine in order to find the correct one at the right distance. So it is useless in practice: nice to prove the theorem, useless in practice. And so, since 1948, a lot of scientists and engineers have worked on developing not so much more efficient as practical codes, ones with a structure that you can use in order to decode. I will describe very briefly one such family; it is not the only one, but it has such a nice correspondence with spin glasses that I cannot refrain from presenting it: the parity check codes.
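Before moving on to parity checks, the random-codebook picture can be sketched in a few lines of code. Everything here (block length, rate, noise level) is an illustrative toy choice, not anything from the talk, and the exhaustive nearest-codeword search makes the exponential decoding cost of the unstructured code visible.

```python
import random

def hamming(a, b):
    # Hamming distance between two equal-length bit tuples.
    return sum(x != y for x, y in zip(a, b))

def random_codebook(n, rate, rng):
    # Shannon-style ensemble: 2**(rate*n) random vertices of the n-hypercube.
    return [tuple(rng.randint(0, 1) for _ in range(n))
            for _ in range(2 ** int(rate * n))]

def transmit(word, p, rng):
    # Binary symmetric channel: each bit flips independently with probability p.
    return tuple(b ^ (rng.random() < p) for b in word)

def decode(received, codebook):
    # Exhaustive nearest-codeword search: this scan over all codewords
    # is exactly the exponential cost discussed above.
    return min(codebook, key=lambda c: hamming(c, received))

rng = random.Random(0)
book = random_codebook(n=30, rate=0.2, rng=rng)
sent = rng.choice(book)
print(decode(transmit(sent, p=0.03, rng=rng), book) == sent)
```

With the noise p well below threshold the received word almost always has a unique nearby codeword and the search recovers it; raising p makes decoding start to fail, which is the transition just described.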
The idea of a parity check code is that instead of taking random code words, the sender and the receiver agree that the sender will only use strings, in this example of seven bits, that satisfy a certain number of equations, and these are just parity check equations. For instance the first equation says that x1 plus x4 plus x5 plus x7 is even; then there is a second equation, and so on. So you have an algebraic structure; you are in the field GF(2). With seven variables and three equations you are left with a four-dimensional space, so you have 2 to the 4 code words: among the 2 to the 7 possible words you can build with seven bits, 2 to the 4 of them are code words, and that is your code book. Of course you then have to go to something larger: you don't do it with seven bits; here I have drawn it with 20, but in practice you might use 1000. So you send all these bits, and they satisfy the parity check equations: the first equation says that the sum of certain bits is even, and so on. Again you can take a random construction, in which you draw the graph randomly with K variables per equation and L equations per variable, which fixes the connectivity of this random graph, and there you are: you have a code book, and you can ask what happens if some bits get flipped. If you send a message through the channel and these two bits are flipped, certain parity checks are no longer satisfied: a check that involves a flipped bit (I use bit or spin equivalently) is violated. So you can see which equations are not satisfied, and you want to infer which bits were flipped.
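As a tiny concrete version of the seven-bit example, here is a sketch; the three check equations below are an illustrative choice (they happen to be Hamming (7,4)-style checks), not necessarily the ones on the slide.

```python
import itertools

# Seven bits, three parity-check equations over GF(2); each tuple lists the
# bit positions (0-based) entering one check. Illustrative choice of checks.
CHECKS = [(0, 3, 4, 6), (1, 3, 5, 6), (2, 4, 5, 6)]

def syndrome(bits):
    # One entry per check: 0 means that parity equation is satisfied (even sum).
    return tuple(sum(bits[i] for i in idx) % 2 for idx in CHECKS)

def codebook():
    # 7 variables minus 3 independent equations leave a 4-dimensional space,
    # hence 2**4 = 16 codewords among the 2**7 = 128 seven-bit strings.
    return [w for w in itertools.product((0, 1), repeat=7)
            if syndrome(w) == (0, 0, 0)]

print(len(codebook()))  # 16
```

Flipping bits of a codeword makes some checks fail, and the pattern of violated checks, the syndrome, is exactly the information the decoder uses to infer which bits were flipped.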
This is an inference problem, our first example of an inference problem; we will talk much more about this tomorrow. Basically the reconstruction is the following: you receive a certain number of bits y1 up to yN, and you want to find the message that was sent. The probability that the message sent was x1 ... xN is composed of two factors. First, the channel evidence: if I receive a bit and find it to be 0, then, because I know the channel, I know that with probability 1 minus p a 0 was sent and with probability p a 1 was sent; this is the evidence I get from having received a 0. And then I have the parity check constraints: all the equations must be satisfied. Now I want to find the most probable message with respect to this probability distribution. It looks a bit complicated, but in fact it is familiar, because if you use the simple mapping from xi equals 0 or 1 to si equals plus 1 or minus 1, going to a representation in terms of spins, then the statement that x1 plus x2 plus x3 is even means that the product s1 s2 s3 is equal to 1. So all the equations correspond to plaquettes: an equation with three variables corresponds to a plaquette of three spins, and it says that the product of the three spins on the plaquette equals 1. We are back to one of the problems I described before as a toy model of a spin glass, a spin glass with one-step replica symmetry breaking and so on. That was a major point of convergence of spin glass theory and information theory, and in fact the decoding algorithm corresponds to iterating mean field equations developed in spin glasses; they were not invented that way, they were invented completely independently.
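The bit-to-spin dictionary used here can be checked exhaustively in a few lines; s = 1 - 2x is one standard convention (x = 0 gives s = +1), under which the parity of a sum of bits becomes the sign of a product of spins.

```python
import itertools

def to_spin(x):
    # x in {0, 1}  ->  s in {+1, -1}, with x = 0 mapped to s = +1.
    return 1 - 2 * x

def parity_even(xs):
    # The parity-check condition: x1 + x2 + ... is even.
    return sum(xs) % 2 == 0

def spin_product(xs):
    # The corresponding plaquette condition: s1 * s2 * ... = +1.
    prod = 1
    for x in xs:
        prod *= to_spin(x)
    return prod

# Parity of the bits is even exactly when the spin product equals +1.
for xs in itertools.product((0, 1), repeat=3):
    assert parity_even(xs) == (spin_product(xs) == 1)
print("mapping verified on all 8 triplets")
```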
But that is what happens. In this case the phase diagram is slightly more complicated. You send a message, you receive another one, you look on the sphere at distance pN from the received word, and it may happen that you are in a more complicated situation, because you have a glass phase: you might be in a phase with metastable states. So the phase diagram is a bit more complicated: you look at the probability of perfect decoding as a function of the noise. Forget about the blue line; the blue line is what you would get with Shannon's code, but Shannon's code is not practical, you cannot use it in finite time. So you have two other curves: one is the phase transition for your algorithm, the iterative algorithm from the cavity method that I will describe, and the other is what you would get if you could really optimize the probability I was describing. The difference between the two is this. In the first phase you are in a simple situation: on the sphere around the received message there is a single code word and there are no metastable states. It is an easy landscape; you look around this sphere, a very large dimensional sphere, but there is one configuration with zero energy, the sent code word, where all the constraints are satisfied. Now in the other phase there is still a single code word, but there are many metastable states, so the energy landscape is more subtle: the code word is there, and you would find it if you had an infinite amount of time and could search the whole sphere, but if you use the mean field algorithm, or you do a Monte Carlo or
something like that, you get trapped in the glass phase. So in practice, in this phase the code works in principle, but you don't have a practical algorithm. This is very reminiscent of what was happening before: there is a ground state, but the naive application of mean field equations or simulated annealing does not find it. And in the last phase you get something more complicated still: there are many ground states and you don't know which one is the correct one, so you cannot answer, whatever computer time you have. Let me close this part from the point of view of information theory; I want to tell you a little bit about a problem in computer science. One thing I want to insist on is that all the problems I am mentioning are core problems in their own field: in information theory, developing smarter error correcting codes is not the only topic, but it is one of the very important ones, and one which has actually made progress in recent years. This next problem is satisfiability, which is described here as the grandfather of NP-complete problems. Everybody has heard about the theory of NP-completeness: the idea that there are hard computational problems which, if P is different from NP, you cannot solve in polynomial time. One of these problems is satisfiability. Satisfiability uses binary variables xi, which can be 0 or 1, meaning false or true, and the constraints are in the form of clauses; a clause is, say, x1 or (not x2) or x3, where "or" is the usual truth function on true and false. You have many constraints like that, and the question is: is there a configuration of the xi which satisfies all the constraints? That is really the first problem that was shown to be NP-complete; it is NP-complete in the sense that if you could solve it in polynomial time you
could solve all the problems which are non-deterministic polynomial in polynomial time. So it is a very important problem. It is still NP-complete if you restrict the length of the clauses to a fixed number, for instance clauses of length 3. And in spite of the fact that it has been the main tool for proving the existence of NP-complete problems, it was hard to find hard instances; in the end people found a generator of hard instances, and that was random K-satisfiability. Random K-SAT is very easy to define: you generate each clause, like this one, by picking three variables at random and negating each of them randomly with probability one half. Again, this is very reminiscent of my plaquettes and my triplets and so on, but applied to binary variables. So that is an ensemble of hard instances. If you look at the probability that a randomly chosen K-SAT problem is satisfiable, as a function of alpha, the ratio of the number of clauses to the number of variables, you find a curve like this: you can simulate it for N which is not too large, N equals 50, 100, 200, and you find that there is a phase transition. It was well evidenced numerically by Kirkpatrick and Selman, around a density of constraints somewhere near 4.2. It turns out it can be proven that there is a phase transition, and not only that: you can also find an algorithm that finds a solution in the whole SAT phase, below the transition, and this again uses spin glass theory and the algorithm derived from the cavity method. The statistical physics of satisfiability is very easy to set up: you have many binary variables, you have a cost function which is the number of violated constraints, so these are three-body interactions, and you look for the configuration of lowest cost.
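The random K-SAT ensemble for K = 3 is easy to sketch. The sizes, sample count, and alpha values below are toy choices, and brute-force enumeration stands in for a real solver, so the numbers only hint at the transition near alpha of roughly 4.2.

```python
import itertools
import random

def random_3sat(n, alpha, rng):
    # alpha = M/N: each of the M clauses picks 3 distinct variables at random
    # and negates each with probability 1/2. A literal is (variable, negated?).
    return [tuple((v, rng.random() < 0.5) for v in rng.sample(range(n), 3))
            for _ in range(int(alpha * n))]

def satisfies(assign, formula):
    # A clause holds if at least one of its literals is true.
    return all(any(assign[v] != neg for v, neg in clause) for clause in formula)

def is_sat(formula, n):
    # Brute force over all 2**n assignments: only viable for tiny n,
    # since the real problem is NP-complete.
    return any(satisfies(a, formula)
               for a in itertools.product((False, True), repeat=n))

rng = random.Random(1)
for alpha in (1.0, 3.0, 5.0, 7.0):
    frac = sum(is_sat(random_3sat(12, alpha, rng), 12) for _ in range(10)) / 10
    print(alpha, frac)
```

For small alpha almost every instance is satisfiable and for large alpha almost none is; sharpening this crossover into a true transition requires sending N to infinity.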
And if there are many of them, you look at the uniform measure over the ground states: how many ground states, how many configurations of energy zero, are there? Looking at this problem, we found that the situation was much more complicated than just the SAT-UNSAT transition. This is just a cartoon: the axis is the number of constraints per variable, the density of constraints, and the green blobs are the regions of solutions. The main transition is here: if the density of constraints is beyond alpha_s, there is no solution. Below alpha_s there are solutions, and if alpha is small enough, less than a certain dynamical transition point, the solutions form one big connected cluster, and it is relatively easy to find a solution in this regime. Then, beyond that dynamical transition point, there starts to be what is called clustering: the solutions group into well separated clusters. And at some point the measure condenses onto a small number of clusters: first there are exponentially many clusters that share the weight of the whole solution space, then there is only a finite number of them, and that is a condensation transition. All of these are very well known and well defined transitions in spin glass theory; this is the one-step RSB spin glass transition, well studied and so on. So in some sense we could export all the concepts and all the phenomenology of one-step RSB spin glasses to this case, and it allowed us, first of all, to find this alpha_s, because without the idea of the geometrical structure of the space of solutions you cannot get the correct value of the phase transition; that is how we derived it. And then it also gave an algorithm to solve the problem in these regimes. So this was just to give the general setup. I will just take five
more minutes for the basic development of the cavity method. So you have phase transitions and glass phases in very different contexts: I have given you one example in information theory and one example in optimization and computer science. The interesting thing is that the dynamical glass transition, the one that slows down the relaxation in spin glasses and gives rise to these very long relaxation times, also slows down all the algorithms; so there is a link between the glass transition and the algorithms. Let me give a short hint about message passing and the cavity method. Nowadays we call it the BP algorithm, and the name BP has two justifications. It gives the initials of Bethe and Peierls, who in 1935 devised a better mean field approximation, one in which you take into account the reaction of a spin to its neighbors: you dig a cavity. So we like to attribute it to Bethe and Peierls, even if they had no idea of how to use it for these other systems, and even less as an algorithm. But in other contexts it is also called belief propagation; so I call it BP. Imagine I have a number of variables that interact through some factors; they could be two-body, three-body, four-body interactions, and so on. Here are my variables, and the interactions are described by these squares: an interaction a relates variables 1, 2 and 4, with a certain Boltzmann weight that involves the energy of interaction between these three variables, and so forth. This gives the structure of the interactions, and the BP equations are sophisticated mean field equations. They involve two types of messages. The first type is: what would be the probability of x1 in the absence of the constraint a? I take my statistical physics
problem and imagine that I take out one interaction; if interaction a is absent, the magnetization of variable 1 in its absence is a message that I call m from 1 towards a, of x1. The second type of message: I take one variable, erase all the interactions around it except one, say c, and ask what the probability of variable 1 is; I call it m from c to 1, of x1. Now I write closed equations relating the two. Imagine I have a piece of my diagram like this. The probability of x2 in the presence of c only: I can write it if I know the probability of x1 in the absence of c, the probability of x3 in the absence of c, and the weight relating 1, 2 and 3; I just have to sum over x1 and x3 the weight times the probability of x1 times the probability of x3. Elementary. And the probability of x1 in the absence of c is just the product of the messages it receives from d, from e, from f. This closes the equations. These are mean field equations, and they can be understood as messages exchanged on the graph: if I have K edges, I have 2K messages and 2K equations. That's good. When is it exact? Well, first of all I cheated, and it was maybe not that clear, because I was doing it fast on purpose, because I am late and because I wanted to cheat you. For instance, when I said I look at the probability of variable 2 due to c, I said it is easy: I have the weight due to c, psi_c of x1, x3, times the probability of x1 times the probability of x3. But if it happens that x1 and x3 are correlated in the absence of c, and they might be correlated if there is an interaction loop that connects them, then their joint probability does not factorize. So here I made a hypothesis of factorization of the joint probability, and from this reasoning you immediately understand when it is exact.
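As a concrete toy version of these message equations, here is one inward sweep on a three-spin chain with a field on one end; the couplings J and h are arbitrary illustrative values. Because this factor graph is a tree, the BP marginal agrees with brute-force enumeration, and on a chain the same sweep is essentially a transfer matrix computation.

```python
import itertools
import math

# Tiny tree factor graph:  s0 --A-- s1 --B-- s2, plus a field factor H on s0.
# Boltzmann weights psi = exp(-E); J and h are illustrative values.
J, h = 0.8, 0.5
SPINS = (-1, 1)

def psi_pair(s, t):
    return math.exp(J * s * t)

def normalize(m):
    z = sum(m.values())
    return {s: v / z for s, v in m.items()}

# Message from the field factor H into variable 0:
m_H0 = normalize({s: math.exp(h * s) for s in SPINS})
# Variable 0 -> factor A (H is its only other neighbour), then A -> variable 1:
m_A1 = normalize({t: sum(m_H0[s] * psi_pair(s, t) for s in SPINS) for t in SPINS})
# Leaf variable 2 sends a uniform message; factor B -> variable 1:
m_B1 = normalize({t: sum(0.5 * psi_pair(s, t) for s in SPINS) for t in SPINS})
# Marginal of s1 = normalized product of the incoming factor messages:
marginal_bp = normalize({s: m_A1[s] * m_B1[s] for s in SPINS})

# Exact marginal of s1 by brute-force enumeration, for comparison:
def weight(c):
    return math.exp(h * c[0]) * psi_pair(c[0], c[1]) * psi_pair(c[1], c[2])

totals = {s: 0.0 for s in SPINS}
for c in itertools.product(SPINS, repeat=3):
    totals[c[1]] += weight(c)
marginal_exact = normalize(totals)
print(marginal_bp, marginal_exact)
```

On this tree the two marginals coincide to machine precision; adding a loop (say an extra factor coupling s0 and s2) is exactly what would break the factorization assumption and make BP approximate.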
It is exact if the factor graph is a tree: if the interaction graph is a tree, you don't have these loops, and all the equations I have written are exact. It is also exact in one dimension, which is a special case of a tree: that is the good old transfer matrix that we learn at school, or dynamic programming. And it is exact on locally tree-like graphs, which is why we can solve K-SAT with this idea. I will skip the special case of infinite range models and just say one word about what happens in a glass phase, when there are many pure states and therefore many solutions, because this is my field; this is the complicated part of the problem. Imagine you have a glass phase. The belief propagation equations are correct if you can neglect the correlation between variables 1 and 3 when you erase the constraint c; so they are correct if the correlations decay at large distance. But in a glass phase the correlations decay at long distance only within one pure state; in a mixture of pure states they do not decay. This immediately tells you that for each pure state, each minimum of the energy landscape, there will be a solution of the BP equations: in a glass there will be many solutions of the BP equations, one for each ground state. Nevertheless it becomes very complicated to find them, because if you iterate the messages from a random configuration, there is no way it will converge to the precise solution corresponding to one of these wells; actually it will not converge at all. So the idea we had in this case was to do statistics of the messages over all the many states. You have these exponentially many states, and you say: I look at one message, the message sent from i to a certain constraint mu; it
will take many different values, and I look at the statistics of this message over all the states; and these statistics I can relate to the statistics of the incoming messages. This amounts to doing statistical physics in the space of messages: I send messages along the edges, I require that each message satisfies the BP equations, and I look at how many sets of messages satisfy all the BP equations. That is called survey propagation: a kind of belief propagation in a meta-space, the space of the messages of BP, and it is the way one-step replica symmetry breaking appears in this problem. These BP and SP equations are extremely smart solvers for many complicated problems; for random constraint satisfaction problems, for random satisfiability, SP is still the best solver. We proposed it in 2002, I think, and there is still no better solver, and this is a field in which a lot of people are working on finding good solvers. So there is really a new idea here: there is a region of the phase diagram in which the space of solutions breaks into clusters; for each cluster there is a solution of BP; you can do the statistics of that and invent a completely new algorithm that works. It is a concrete realization of these concepts. So that is just a bird's eye view of what is taking place. I am running late, I am sorry. I just want to tell you that some aspects of the cornucopia are written here; I have touched only two of these blobs, but there are many more that one could discuss. And I like this sentence on the bottom right, "and it's only the beginning": the cornucopia has been sending us so many fruits in the last few years that we could think, well, it's done; but Jean-Philippe Bouchaud is the
author of this sentence, and actually I agree with him: I think it is still the beginning, and much more is coming, in particular in the field he is interested in, which is economics. I just wanted to mention a few concepts and methods we have seen today that are very important in this context, and to advertise that the next talk will focus on another small aspect of this very big cornucopia, which is inference problems. I will describe how to use all these ideas in some inference problems, and get back a little to machine learning, because machine learning is a big inference problem: you have a lot of data and you want to infer the parameters of the machine. So, sorry for being too long today; thank you.

Thank you, Marc. Are there any questions? I am sure there are many.

In the part where you discussed the satisfiability problem, you said that beyond a particular value of alpha the solution space starts to form clusters; can I know that particular value of alpha?

It depends: for each K-SAT problem, depending on the value of K, there is a number that one can compute, and one knows it. From the formulation that we had, computer scientists stated it as a conjecture, and in recent years it was confirmed that it is the correct value.

And when you were talking about information theory, you said that for a large dimensional hypercube there is a threshold p; does that threshold depend on the dimension of the cube, or is it independent of it?

The threshold appears in the limit of large dimension. Like in physics, the phase transition appears only in the limit of infinite dimension, and the
dimension, in the error correcting codes, is the length of the message, the length of the code words; it is only in this limit that you have a sharp transition. There is a whole field I have not touched upon, which is finite size effects: if you use a real code, maybe with 500 bits and not infinity, the sharp transition I was describing will be rounded by finite size effects. Can you describe that rounding, what is its impact, and so on: this is a very important question if you want to engineer a code.

One last thing. I don't know a lot about it, but I have read that when you want to compute the ground state of some difficult system, there is a trick in replica systems, the replica trick, where you take a limit in which n goes to zero; that limit is hard to take, and you might need to introduce new parameters. Is the introduction of those new parameters linked to replica symmetry breaking?

Yes. I have put all this mathematics under the carpet; I can describe it to you after the talk, because it would take a bit of time. But everything I was describing about the order parameter function, the overlap and so on, is actually the physical understanding of what is hidden behind the replica method, behind this parameter and the number of replicas going to zero. I have just given you the physical interpretation, not the dirty mathematics. Actually it is not dirty at all, it is very nice; but it is not mathematics either, because it has not been turned into mathematics; it is better than that, in some sense.

You mentioned this issue of NP hardness. This NP hardness concept is typically developed for worst case analysis, which has been the standard in computer science for twenty years or more, whereas what you have
been discussing is typical case complexity. So what is the status: is worst case analysis completely abandoned in the field, or can spin glass theory also shed light on worst case analysis in some cases?

You are right in pointing this out: all the theory of NP-completeness is based on the worst case, and it is still very active and very solid in that direction. What we are bringing to the field is to say: look, it is also interesting not to focus on the worst case, but to build an ensemble of problems in which you can actually compute what is happening. For instance, the spin glass problem is NP-hard: finding the ground state of a spin glass is NP-hard, so in the most general realization, if P is different from NP, it is likely that there is no good algorithm for it. But it turns out that there are many examples of spin glasses in which you can either find the exact ground state or at least get very close to it; for instance, Andrea Montanari has recently found such an algorithm for the SK model. So I think what we have learned in recent years is that these are really very different approaches, and one complements the other; but so far I don't think at all that the physics methods, which are based on what happens typically in some ensemble, can help understand the worst case analysis. They are very separate ways of approaching the question.

A very nice talk, and it is very nice to see the confluence of two complicated areas, spin glass physics on one side and information theory and satisfiability on the other; each area is difficult, and the confluence is really fruitful. My question is: in this field of confluence, is it physicists trying to find applications in these areas, or computer scientists looking into the physics community and seeing that there is a beautiful
mathematics there which they can use?

I think it depends a lot on the field. If I think of satisfiability, for instance, there was a kind of joint effort, and the circumstances were interesting: physicists became interested in satisfiability when they saw that there was a phase transition at a critical value of the density of constraints. You take a statistical physicist at random in the audience, you tell him, look, there is a phase transition; he says, oh yes, a phase transition, that's interesting, he wants to know about it. But the computer scientists were interested in exactly the same value of the threshold, the one where the phase transition is, because that is where you have the hardest instances: the instances of K-SAT which are experimentally hardest to solve, hardest to decide whether they are SAT or UNSAT, are just at the limit between the two. So in some sense it was a lucky circumstance that the two communities were focusing on the same topic and had the same interest. In my now relatively long experience of working in interdisciplinary settings, it is very important to identify questions of common interest: it often happens that one community looks at a problem saying, this question is very important, and the other community says, oh, that other question is very important; if the questions don't coincide, it will be difficult to talk. But this was one example where it was very successful.

Any other questions? Actually I have a question: in spin systems there are notions of criticality and universality; are these concepts useful in this context or not?

Yes. I was focusing today on the main points of the phase diagram, that is, where there is a spin glass phase and so on; if you zoom in on the phase transition itself, of
course you will have critical aspects to it. They are very well reflected, for instance, in the dynamics. If you look at systems which have this discontinuous glass transition and you measure the relaxation of a correlation function as a function of time, you find that typically, just above the phase transition, the relaxation first decays like this, then like that, as a function of log time; the relaxation times are so long that you want to plot them on a log scale. So you have this kind of plateau, which is well documented, and what happens is that when you go below the phase transition the plateau becomes infinitely long, and its height corresponds to the spontaneous overlap of the system: the correlation stops decaying at some point, and that is the overlap, the order parameter. In the dynamics you have this plateau, and the approach to it goes as t to the minus alpha on one side and t to the minus beta on the other (or rather beta here and alpha there); these are exponents that you can measure and study. So there is a whole phenomenology and a very interesting theory there too, but I am summarizing it in one hour.

But is this interesting to computer scientists, in this other context? Well, I am not sure that computer scientists are particularly interested in critical exponents, but the mechanism is interesting, because of what is taking place: here you are still in a phase which is relatively easy, so you can thermalize and go very far from where you started; but if you go a little beyond that, to a lower temperature, you will have something like this, and that means that your
dynamics will stop: it will stop in this blob, and the correlation does not decay to zero. So it reflects the geometrical structure, and it is very nice that with these models from computer science you can actually visualize the whole glass phase in terms of geometrical structure in large dimension.

Last question: what you have been talking about was mostly classical computer science, and the glass was a classical Ising glass. Of course they are related, and it is very interesting; but if you come to quantum computation, there is in principle the necessity of error correction, and it should be different from the classical one. Going a step further, the question is not only how to do error correction in quantum computation, but whether there would also be some mapping of quantum computation onto, not the classical spin glass, but some extension of it. Do you anticipate such an extension of the classical spin glass?

First of all, there is a natural mapping to quantum spin glasses, and that is related to an approach that has been advocated a lot. Imagine you have a complicated classical spin system, spins si with an energy E(si) that I would like to minimize, one of these very complicated problems that cannot be solved. One way of turning it into a quantum problem that you can try to solve is quantum annealing: I take Ising quantum spins, I write this energy in terms of the si z operators, and then I add a transverse term in the si x, with a parameter lambda that I tune adiabatically. So let's say, okay: when lambda is equal to 0 I have my classical spin system, which is very hard to solve
... sorry: when lambda is equal to 1 I have my classical spin system, and when lambda is equal to 0 I have a bunch of independent spins, a situation I also know how to handle. Then I can try to tune lambda adiabatically from the lambda equals 0 case, where I know the solution, towards lambda equals 1. That is a version of a quantum spin glass; it is a trivial extension, but it is one of the main objects being studied for the use of quantum computing in optimization.

Okay, absolutely the last question, a subsidiary question to that one. By considering this parallel between information theory and spin glasses you ended up finding this BP decoder, which is efficient; if now you switch to the quantum version, what would be the efficient code you get instead of the BP one? Do people know?

Well, it is not as simple as that, I guess, but there have been quite some developments in quantum error correcting codes. I worked on this a few years ago; I did not follow the latest developments, so I cannot really answer you. But basically you really need to develop a new type of code; they are related to codes we had actually proposed. The point is that when you have errors in the channels of a quantum computation, they can be in the X channel or in the Z channel, and the code has to handle both, so it creates a more complicated structure; but the idea is basically the same. It is a very interesting field of research, and a very important one, because if we don't develop very smart codes, it is likely that we will not see a quantum computer working.

Okay, thank you very much, Marc, for a very nice