Okay, so let's move on to other things I wanted to tell you about this problem of clustering in the stochastic block model. One thing I wanted to tell you about is the argument used to show that there exist cases where we are below the Kesten-Stigum threshold and yet we know that, with non-polynomial computational power, we can cluster the nodes in a significant manner. The best argument I know of is in a paper by Banks et al. from 2016. It applies to the symmetric q-blocks case, so we have two parameters in that case: in the mean progeny matrix we have one constant on the diagonal, and a second parameter off the diagonal. These are the two parameters we need to specify the model. On this slide I call c_in the probability of putting an edge within a block: if I have two nodes in the same block, so sigma_i = sigma_j, then the probability that i is a neighbor of j, given the vector of spins, is c_in / n. And if sigma_i is distinct from sigma_j, then it is c_out / n. So these are my two parameters in that model, plus the number of blocks q. In that case we can figure out the eigenvalues of the mean progeny matrix explicitly. They are alpha, the average degree, given by (c_in + (q - 1) c_out) / q, which is an easy exercise, and lambda_2, which is also a simple explicit function of the three parameters of the model, namely (c_in - c_out) / q.

So how do we prove that there exist triplets of parameters (q, c_in, c_out) such that we are below the Kesten-Stigum threshold and yet we can cluster in a meaningful manner? We will end up using what is known as a first moment method argument. We are going to define some notion of a partition that we will call a good partition, and then the first moment method is the argument used to prove that any partition meeting this definition of a good partition necessarily achieves a positive overlap with the true partition.

Okay, so let's give the definition of a good partition. It is a partition of the n vertices into equal-sized blocks (rounding off if n is not divisible by q), so you split the vertices into q sets of size roughly n/q. And you ask that the number of edges within the blocks is the one you would expect if you had picked the true blocks. We know that within a block we have n/q vertices, and for any two of them we have probability c_in/n of putting an edge. So the expected number of edges within a block is the number of pairs in a set of size n/q, that is n/q choose 2, roughly n²/(2q²), multiplied by the probability c_in/n of an edge between such a pair: that gives roughly n c_in / (2q²) internal edges per block. With q blocks, the expected total number of internal edges is n c_in / (2q), which is of order n. This count is a binomial random variable with mean of order n, so by concentration its fluctuations are of order sqrt(n). We will therefore call a partition good if its number of internal edges is off the expected number by at most n^(2/3), something sitting between the mean, of order n, and the fluctuations, of order sqrt(n), and likewise for the edges across blocks, the ones not confined to a single set of the partition. (I will show a small numerical sketch of these quantities below.)
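As a small aside, here is a minimal numerical sketch of the two ingredients just described: the eigenvalues of the mean progeny matrix with the Kesten-Stigum test, and the good-partition criterion. This is my own illustration in Python; the function names and the assortative-case assumption c_in > c_out are mine, not from Banks et al.

```python
import numpy as np

def kesten_stigum_check(q, c_in, c_out):
    """alpha, lambda_2 and the KS condition for the symmetric q-block SBM
    (assortative case c_in > c_out assumed)."""
    # Mean progeny matrix M: c_in/q on the diagonal, c_out/q off it.
    M = np.full((q, q), c_out / q) + (c_in - c_out) / q * np.eye(q)
    eigs = np.linalg.eigvalsh(M)          # ascending order
    alpha, lam2 = eigs[-1], eigs[-2]      # Perron-Frobenius, then second
    # Sanity check against the explicit formulas quoted above.
    assert np.isclose(alpha, (c_in + (q - 1) * c_out) / q)
    assert np.isclose(lam2, (c_in - c_out) / q)
    return alpha, lam2, lam2**2 > alpha   # KS condition: lambda_2^2 > alpha

def is_good_partition(adj, blocks, c_in, c_out):
    """'Good partition' test: internal and cross edge counts within n^(2/3)
    of their expectations. adj is a symmetric 0/1 array with zero diagonal,
    blocks a list of q index lists of size ~ n/q."""
    n, q = adj.shape[0], len(blocks)
    internal = sum(adj[np.ix_(b, b)].sum() / 2 for b in blocks)
    external = adj.sum() / 2 - internal
    tol = n ** (2 / 3)                    # between sqrt(n) and n
    return (abs(internal - n * c_in / (2 * q)) <= tol and
            abs(external - n * (q - 1) * c_out / (2 * q)) <= tol)

# q = 5, c_in = 6, c_out = 1: alpha = 2, lambda_2 = 1, so below KS.
print(kesten_stigum_check(5, 6.0, 1.0))   # (2.0, 1.0, False)
```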
So we know that if we had picked the correct partition, we would meet these criteria; hence there exists at least one good partition, with high probability. And how do we show that any good partition necessarily achieves a positive overlap? Well, that's where the first moment method kicks in. We want to bound the probability that there exists a partition that is both good and achieves overlap less than some epsilon. We upper bound this probability by the expectation of the number of partitions meeting these two conditions: being good, and having overlap at most epsilon. And the beauty of the first moment method, when it works, is that you can compute such expectations. You do some combinatorics: you count how many partitions have overlap at most epsilon with the true partition, then, fixing one of them, you throw in the edges at random and compute the expectation. So it is something you can get a handle on; it's a bit of combinatorics. In their paper, even skipping steps, it takes one or two pages, so it is not straightforward, but the rationale behind it is very clear. And this works to show that for q larger than four, you do have parameters below the Kesten-Stigum threshold for which this probability goes to zero. Hence, with high probability, there is no good partition that achieves overlap below epsilon. And so the brute-force algorithm that scans all partitions (exponential time) and stops whenever it meets the criteria for a good partition is successful. It will stop, because we know there exists at least one good partition: the true partition is a good one. So I wanted to highlight this use of the first moment method argument, because it is the neatest way I know to show the existence of the hard phase in this case. (A toy version of this brute-force scan follows below.)
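To make the exponential-time algorithm concrete, here is a toy sketch of the brute-force scan, assuming q divides n and reusing is_good_partition from the sketch above. This is an illustration of the idea, not the paper's procedure.

```python
import itertools

def brute_force_cluster(adj, q, c_in, c_out, is_good):
    """Scan all balanced partitions into q blocks of size n/q and stop at
    the first good one. The number of partitions is exponential in n, so
    this is only for tiny examples."""
    n = adj.shape[0]
    size = n // q                      # assumes q divides n

    def balanced_partitions(vertices):
        # Enumerate partitions of `vertices` into blocks of exact size
        # `size`; fixing the first vertex avoids duplicate orderings.
        if not vertices:
            yield []
            return
        first = vertices[0]
        for rest in itertools.combinations(vertices[1:], size - 1):
            block = [first, *rest]
            remaining = [v for v in vertices[1:] if v not in rest]
            for tail in balanced_partitions(remaining):
                yield [block, *tail]

    for blocks in balanced_partitions(list(range(n))):
        if is_good(adj, blocks, c_in, c_out):
            return blocks   # guaranteed to stop: the true partition is good
    return None

# Usage (illustrative): brute_force_cluster(adj, 2, 6.0, 1.0, is_good_partition)
```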
So another thing I want to discuss, as a follow-up on the theorem on the structure of the non-backtracking matrix of the random graphs we were looking at, is the relation of that result to a notion developed in graph theory: the notion of a Ramanujan graph. I don't know if you have heard of expander graphs at some point. An expander graph is usually defined as a graph with a common degree for all nodes, so a regular graph: all nodes have the same degree d, which could be 3, 4, some fixed value. It is a fact that if you take the adjacency matrix of a regular graph and plug in the all-ones vector, you get the degree times the all-ones vector as output. So the degree d is an eigenvalue of the adjacency matrix, and by the Perron-Frobenius theorem it is the eigenvalue of largest modulus. Expanders are regular graphs for which there is a spectral gap that does not vanish even as the number of nodes goes to infinity. So it is a property of families of graphs with a growing number of nodes: the gap between the second-largest eigenvalue and d does not vanish; it stays bounded away from zero.

Okay, so that was a bit of context before defining what a Ramanujan graph is. It is a d-regular graph, for some integer d, whose eigenvalues either have modulus d, like the Perron-Frobenius eigenvalue (you could also have the eigenvalue minus d: that would be the case for a bipartite graph, where minus d is also in the spectrum), or have modulus at most twice the square root of d minus 1. That is the mysterious value used to define a Ramanujan graph. And why this value? Well, such Ramanujan graphs have a spectral gap between d and the next eigenvalue of at least d minus 2 sqrt(d - 1), for an arbitrary number of nodes. So a family of Ramanujan graphs is also an expander family, because this gap does not vanish. But the interesting property comes from a theorem due to Alon and Boppana in the 80s, which says that this is the best possible spectral gap you can get for regular graphs. More precisely: take a d-regular graph G whose diameter is sufficiently large, the diameter being the supremum over pairs of nodes of the graph distance between the two nodes. Suppose the diameter is larger than 2(r + 1); then the second-largest eigenvalue of the adjacency matrix of that graph must be at least the value appearing in the definition of a Ramanujan graph, 2 sqrt(d - 1), minus something that vanishes as r becomes large.

Okay, and so now think about this. If I fix d and want to construct a family of graphs with degree d for all nodes and a number of nodes growing to infinity, then necessarily the diameter has to go to infinity as well, by the following argument. With a fixed degree d, consider the number of nodes within distance r of some node i: at distance 1 I get at most d neighbors, then each of these d neighbors has at most d - 1 neighbors that are two hops away from the first node, and I can proceed like that. Eventually I see that within a fixed distance r there is a bounded number of nodes in a d-regular graph. So if the number of nodes goes to infinity while the common degree d stays fixed, the diameter must go to infinity, and hence the second eigenvalue of the adjacency matrix must asymptotically be above 2 sqrt(d - 1). So you cannot get better spectral separation than in a Ramanujan graph: these are the best expanders there are.
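You can see the Alon-Boppana bound numerically. Here is a small experiment (my addition, using networkx, not part of the lecture) with random 3-regular graphs, where the second-largest adjacency eigenvalue hovers around 2 sqrt(d - 1):

```python
import numpy as np
import networkx as nx

d = 3
bound = 2 * np.sqrt(d - 1)   # the Alon-Boppana limit for d-regular graphs
for n in (100, 500, 2000):
    G = nx.random_regular_graph(d, n, seed=0)
    eigs = np.sort(np.linalg.eigvalsh(nx.to_numpy_array(G)))
    # eigs[-1] is exactly d (Perron-Frobenius); eigs[-2] cannot sit
    # asymptotically below `bound`, by Alon-Boppana.
    print(f"n={n}: second eigenvalue {eigs[-2]:.3f} vs bound {bound:.3f}")
```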
All right, so let me try to answer one of these questions. These tend to be called spectral expanders, in fact, and the usual notion of expansion of a graph is more like the following. You pick a set S of vertices of the graph, and the isoperimetric ratio of S, phi(S), is defined as the number of edges from S to its complement, normalized by the size of S: phi(S) = |E(S, S̄)| / |S|, where E(S, S̄) is the set of edges from i in S to j in S̄, and S̄ is the set of vertices minus S. That's the isoperimetric ratio of the set S, and the isoperimetric constant phi of G is defined as the infimum of phi(S) over subsets S of the vertices of size at most n/2. It is a discrete construction on graphs analogous to the isoperimetric ratio of curves in the plane, if you like: |E(S, S̄)| is like the size of the boundary of the set S, the analog of the perimeter of a region in the plane, and |S| is the analog of the volume of the set, the area enclosed by a curve. You know, the classical isoperimetric problem asks: if I have a string of length one, what is the largest area I can enclose? And it is solved by the circle. So that's the analogy here: the cut is the analog of the length of the string, and the set size is the analog of the area enclosed. Okay, so that's the isoperimetric constant of a graph.

And so the classical definition is that G is a phi-expander if its isoperimetric constant is larger than phi. Now you can talk about the expansion of a family of graphs: a family with a growing number of nodes is an expanding family, in this sense, if there is some non-vanishing phi such that all graphs in the family are phi-expanders, that is, their isoperimetric constants are uniformly bounded from below. And it turns out that having the isoperimetric constant uniformly bounded from below is equivalent to having the spectral gap of the adjacency matrix uniformly bounded from below. This is why the notion of a spectral expander family is equivalent to the notion of an expander family according to this definition. Yes, these graphs do tend to expand: you expect the neighborhoods of a node to grow in an exponential fashion as you increase the radius of the ball you consider. And yes, the equivalence of the two notions, spectral expander and expander, is a consequence of an inequality known as the Cheeger inequality, which some of you may have heard about; it is an important inequality in the study of the mixing times of Markov chains. I could elaborate on that, but I'm not sure we have time. There are many reasons why expanders are interesting. For instance, a random walk on an expander graph reaches equilibrium in a short amount of time; the expansion controls the time to equilibrium of the random walk. So if you want to sample at random from a graph, it being an expander is a good thing for you.

Okay, anyhow, about the second question, which was why it is called a Ramanujan graph. It is because of a conjecture of Ramanujan, in number theory, I believe, that I did not fully understand. Solving this conjecture of Ramanujan was instrumental in exhibiting Ramanujan graphs. The people who produced this definition, Lubotzky, Phillips and Sarnak, also worked on constructing explicit Ramanujan graphs, and that is a notoriously difficult task. But typically you can take infinite groups, you can take quotients of those infinite groups that are finite, and from such a finite quotient you build a Cayley graph. And you can prove that the Cayley graph constructed according to their work, using the Ramanujan conjecture, is a Ramanujan graph, something like that.
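Going back to the isoperimetric constant just defined, here is a brute-force sketch of it (my own illustration; the computation is exponential in n, so it is for intuition on tiny graphs only):

```python
import itertools
import networkx as nx

def isoperimetric_constant(G):
    """Brute force over all S with |S| <= n/2; exponential, tiny graphs only."""
    nodes = list(G.nodes)
    n = len(nodes)
    best = float("inf")
    for k in range(1, n // 2 + 1):
        for S in itertools.combinations(nodes, k):
            S = set(S)
            # Count edges crossing from S to its complement.
            cut = sum(1 for u, v in G.edges if (u in S) != (v in S))
            best = min(best, cut / len(S))
    return best

# A long cycle expands poorly: cutting out a contiguous arc of n/2 vertices
# costs only 2 edges, so phi -> 0 as n grows. A complete graph expands well.
print(isoperimetric_constant(nx.cycle_graph(10)))     # 0.4
print(isoperimetric_constant(nx.complete_graph(10)))  # 5.0
```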
Okay, so I will probably not go into the proof of the Alon-Boppana result unless you ask strongly for it. So let me not do that, for now at least. So what's the relationship between what I was telling you about and these Ramanujan graphs? There is a beautiful formula about non-backtracking matrices, the Ihara-Bass formula, which gives you a lower-dimensional characterization of the spectrum of the non-backtracking matrix of a graph. So assume you have a graph G with associated non-backtracking matrix B, and you are interested in its eigenvalues: you are interested in the roots u of det(I - uB), which are the reciprocals of the eigenvalues. And this quantity, if you scale it by (1 - u²) to the power n - m, where n is the number of nodes and m the number of edges of your graph, becomes the determinant of an n-by-n matrix. So you have gone from computing the determinant of a matrix of size twice the number of edges to the determinant of a matrix whose size is the number of nodes instead. The n-dimensional matrix on the right-hand side is I - uA + u²Q, where A is the adjacency matrix and Q is the diagonal matrix whose entries correspond to the degrees of the nodes minus one: the i-th entry on the diagonal is d_i - 1, where d_i is the degree of node i in your graph. So you have the relationship (1 - u²)^(n-m) det(I - uB) = det(I - uA + u²Q).

And from this you can specialize to a regular graph; so let's see what this gives. For a regular graph, d_i = d for all vertices i, and the right-hand side specializes to det(I - uA + u²(d - 1)I). So if u is a zero of det(I - uB) that is neither 1 nor -1, so that the prefactor (1 - u²)^(n-m) does not interfere, it is going to show up here: it must be a zero of this determinant as well. And this means that 1 - u lambda + u²(d - 1) = 0 for some lambda in the spectrum of A. So this gives you a map between the spectrum of A and the spectrum of B. And because of this map, you can now take the definition of a Ramanujan graph (in my d-regular graph, the eigenvalues have modulus d, or all other eigenvalues have modulus at most 2 sqrt(d - 1)) and translate it into properties of the spectrum of B. Okay, so I leave it to you as an exercise to do this mapping and see how you can characterize the property of being Ramanujan on the spectrum of B instead of on the spectrum of A. And that gives you the corollary here: a graph is Ramanujan if and only if the non-backtracking matrix B has all its eigenvalues of modulus either equal to d - 1, the degree minus one, or at most the square root of d - 1. So there is a subtle mapping here, but it exists. And this correspondence between the spectrum of B and that of the adjacency matrix A has been used to try and develop a theory of Ramanujan graphs for graphs that are not necessarily regular and that can have degrees that differ from one node to another. Okay, so, yes, that's right: actually, the formula is proven by exhibiting an identity of matrices underlying it. And so, anyway, yes, that's true, exactly.
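Since the formula is completely concrete, here is a quick numerical check of it (a sketch of mine, not lecture material) on the Petersen graph, whose nodes are conveniently labeled 0 to 9:

```python
import numpy as np
import networkx as nx

# Build the non-backtracking matrix B on directed edges ("darts").
G = nx.petersen_graph()               # 3-regular, n = 10, m = 15
A = nx.to_numpy_array(G)              # node labels coincide with indices here
n, m = G.number_of_nodes(), G.number_of_edges()
darts = [(u, v) for u, v in G.edges] + [(v, u) for u, v in G.edges]
idx = {e: i for i, e in enumerate(darts)}
B = np.zeros((2 * m, 2 * m))
for (u, v) in darts:
    for w in G.neighbors(v):
        if w != u:                    # forbid the immediate reversal
            B[idx[(u, v)], idx[(v, w)]] = 1.0

Q = np.diag([G.degree(v) - 1 for v in G.nodes])
t = 0.3                               # arbitrary test point
lhs = (1 - t**2) ** (n - m) * np.linalg.det(np.eye(2 * m) - t * B)
rhs = np.linalg.det(np.eye(n) - t * A + t**2 * Q)
print(np.isclose(lhs, rhs))           # True: the Ihara-Bass identity holds
```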
Yeah, the proof of the Ihara-Bass formula is used in the construction of this correspondence. Okay, so people working in graph theory, in particular Audrey Terras, who has done some fundamental work in graph theory and also in group theory, and who is interested in Ramanujan graphs: she proposed this extension of the notion of Ramanujan graph to non-regular graphs. And so basically, you take this corollary, remove the condition that the graph is regular, and you get a definition that extends the previous definition of a Ramanujan graph. So that's what it is. We say that a graph is Ramanujan if and only if the eigenvalues of its non-backtracking matrix either have modulus equal to that of the Perron-Frobenius eigenvalue, so the maximum modulus, or have modulus less than the square root of this maximum modulus. Okay, so that's the extension.

And so now we can look back at our result on the spectra of these non-backtracking matrices of random graphs. In particular, take the simplest stochastic block model there is, which is the Erdős-Rényi graph: only one block, right? So n vertices, and probability alpha/n of an edge between any two vertices. The theorem specializes in that case to say that the largest eigenvalue of B in modulus is given by alpha, up to a term that vanishes in probability as n becomes large. All the other eigenvalues are, with high probability, less than sqrt(alpha) plus a vanishing term. So we can reinterpret this particular case in light of the notion of Ramanujan graphs, in particular the extension to non-regular graphs, in the following manner: up to a little-o(1) error, the Erdős-Rényi graph is a Ramanujan graph according to this extended definition. So it has somehow the equivalent of the maximal spectral separation that we have for the classical Ramanujan graphs. And a similar result had been established for random regular graphs in 2008 by Joel Friedman.

Yes, sorry: the previous definition was a gap in the eigenvalues of the adjacency matrix. So yes, yes, that's right. It does not imply that, indeed. Being Ramanujan... well, you could be Ramanujan and yet disconnected, so you want to throw in the fact that the graph is connected, I guess. Okay, let's talk it over after; I'm done here. Okay, so Friedman had shown that a random regular graph of common degree d on n vertices meets the Ramanujan bound up to a vanishing correction. So we knew from his result that random regular graphs are near-Ramanujan graphs. So now we have somehow an analog for non-regular graphs: the Erdős-Rényi graph, the simplest such model, is not regular, but it does satisfy a similar property.
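As an illustration of this near-Ramanujan property, here is a small simulation (mine; at moderate n the asymptotic statements hold only approximately) of the non-backtracking spectrum of a sparse Erdős-Rényi graph:

```python
import numpy as np
import networkx as nx

def nonbacktracking_matrix(G):
    """Non-backtracking matrix on the 2m directed edges of G."""
    darts = [(u, v) for u, v in G.edges] + [(v, u) for u, v in G.edges]
    idx = {e: i for i, e in enumerate(darts)}
    B = np.zeros((len(darts), len(darts)))
    for (u, v) in darts:
        for w in G.neighbors(v):
            if w != u:                 # forbid the immediate reversal
                B[idx[(u, v)], idx[(v, w)]] = 1.0
    return B

alpha, n = 5.0, 300
G = nx.fast_gnp_random_graph(n, alpha / n, seed=1)
mods = np.sort(np.abs(np.linalg.eigvals(nonbacktracking_matrix(G))))
# Expect one eigenvalue near alpha and the bulk inside radius ~ sqrt(alpha).
print(mods[-1], mods[-2], np.sqrt(alpha))
```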
Okay, so the last thing I wanted to say is how you can relate those results to results you have, I think, seen in other courses this week; in particular, I guess Marc Potters and Marc Lelarge spoke of low-rank deformations of random matrices. So you have heard about the Baik-Ben Arous-Péché (BBP) phase transition. And so typically you have a low-rank matrix, you add a noise matrix, which could be a Wigner matrix, and you have a threshold condition on the magnitude of the eigenvalues of the low-rank matrix that is used as a deformation of the noise matrix. To be specific here, let's assume the noise matrix W_n is a Wigner matrix whose entries have variance sigma²/n, for sigma² of order 1, and we have the low-rank matrix P_n. The threshold condition on the eigenvalues lambda_i of the low-rank deformation is that their square is above sigma². And if they meet this BBP condition, then they are reflected in the spectrum of the matrix P_n + W_n.

Okay, so in a sense, what we have done on the non-backtracking matrix is a sparse version of this. And at first sight it is not very clear why. There is Kesten-Stigum, and there is BBP; so what is the connection? In fact, the Ihara-Bass formula gives us a connection, and that is what I wanted to conclude on. So take again our stochastic block model. What we observe is the adjacency matrix: the matrix of conditional expectations of the edge variables given the spins is low-rank (the rank is at most the number of blocks), and to it we add a noise matrix that is a bit nastier than a Wigner matrix, but still has zero-mean entries with independence assumptions. So let's try now to match this to the deformation of a Wigner matrix by a low-rank matrix. The variance parameter sigma² in BBP is the analog of the sum of the variances of the edge variables conditioned on the spin variables, and in our stochastic block model this is the parameter alpha. Okay, so P_n is the analog of the conditional expectation of the adjacency matrix given the spins, and the spectrum of this conditional expectation is exactly the spectrum of the mean progeny matrix we have introduced. So we have this analogy. And so now you can say that the Kesten-Stigum condition, namely that an eigenvalue lambda_i of the mean progeny matrix has its square strictly larger than alpha, is exactly the BBP condition. Okay, so that gives you a formal correspondence.

And actually we can push that a bit further using the Ihara-Bass formula. So I guess you have also seen how the eigenvalues of the low-rank matrix P_n get transformed when you add the noise matrix, the Wigner matrix: lambda gets transformed into lambda + sigma²/lambda. And we can retrieve that as well. So recall the Ihara-Bass formula, which tells you how to go between the spectrum of B and the spectrum of an n-dimensional matrix. Now, if the degrees in your random graph are close to their mean, you can really retrieve a correspondence between the eigenvalues of the adjacency matrix and the eigenvalues of the non-backtracking matrix. And that is what this corollary says: in the case where your degrees concentrate, an eigenvalue lambda in the spectrum of B, the non-backtracking matrix, is reflected as lambda + alpha/lambda in the spectrum of A. And this is exactly the formula that you had in the BBP theory. So that's what I write here: this suggests that in the SBM, when the average degree is large compared to one, so that there is concentration of the degrees around their mean, we would see in the spectrum of the adjacency matrix the eigenvalues lambda_i(M) + alpha/lambda_i(M), where M is the mean progeny matrix. And actually we can prove this; it is something that has been done essentially by my PhD student Ludovic Stephan in a paper from last year. So we could push the spectral analysis we have for the sparse SBM to a not-so-sparse SBM where the degrees grow sufficiently fast.
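And here is a tiny simulation of the BBP outlier formula just quoted, with a rank-one spike added to a Wigner matrix (my own sketch; lam and sigma are illustrative values chosen so that lam² > sigma²):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, lam = 2000, 1.0, 2.0
# Wigner noise: symmetric, entries of variance sigma^2 / n.
W = rng.normal(0.0, sigma / np.sqrt(n), (n, n))
W = (W + W.T) / np.sqrt(2)
# Rank-one deformation of strength lam (above the BBP threshold).
v = np.ones(n) / np.sqrt(n)
P = lam * np.outer(v, v)
top = np.linalg.eigvalsh(P + W)[-1]
# Outlier near lam + sigma^2/lam = 2.5; the bulk edge sits at 2*sigma = 2.
print(top, lam + sigma**2 / lam)
```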
And so we have a version of the BBP results, with a prediction of the occurrence of eigenvalues in the spectrum of the adjacency matrix, as well as predictions on the correlations of the corresponding eigenvectors with the eigenvectors of the low-rank deformation. So all this goes through, in fact. And the interesting thing is that this formally confirms that what we have looked at is somehow the same phenomenon as the BBP transition. It also gives you another way of proving results of this nature, and it gives you results under assumptions that are not the same: we can cover noise models that are quite different, in fact, using different techniques. Okay. And so I think this is where I wanted to stop for today.

This result that you mention at the end: do you need the connectivity to scale with the system size, with the number of nodes, or can you keep it finite and let n go to infinity? You mean the average degree? Yes. No, we want it to be such that the node degrees concentrate. If we want conclusions on the spectrum of the adjacency matrix, we need this concentration of the degrees around their mean, so we basically need the average degree to grow; of order log n does work, for instance. Okay. But it needs to scale with n.

And did you connect this result... because there is another way to see the connection between BBP and the SBM, through the mutual information. You can prove that the mutual information between the observed matrix, that is, the deformed matrix in the BBP problem, and the hidden deformation, the rank-one part, is the same as the mutual information between the random graph of the SBM model and the planted partition, in the limit of large degree. You can show that the mutual informations are the same, using a Lindeberg principle. And I wonder how it connects to what you did. Well, I think it is different. We really push our estimates on the spectrum of B to be able to control all the error terms when alpha grows sufficiently fast, but not too fast. So there is work here in tracking the magnitude of the corrections we have, and then we can really leverage the Ihara-Bass formula. For the spectrum it is not so hard, but we also have a correspondence between the eigenvectors: we can construct the eigenvectors of B, we control the error in distance, and there is an eigenvector version of the Ihara-Bass formula. You can take an eigenvector of B, project it into dimension n, and you get an eigenvector of A in the regular case. In the non-regular but nearly regular case this also works: you get a near-eigenvector, and you can control the errors there. So I guess it is a bit different. But then, would what you say about mutual information apply even if alpha does not go to infinity? No, for finite alpha but large enough... no, I think you still need a growing degree. Okay, any speed, but growing.

Any questions? All right, thank you very much then. Let me ask those online who want to show their faces to do so, so that if there are enough people online showing their faces we can take a group picture, and we can try with the people here as well, to see if we can catch as many as possible. Let's see. Okay. Come on, show your faces, everyone. Don't be shy. More, more, we need more. Yes, much better. I need at least a full screen filled with you. Come on, at least five more. You can do it.
Try to go over there, actually, because there is a camera; everyone who is here in the room can go there. We still need some faces. Okay, Devendra has a camera problem. Come on, don't hesitate to connect. Don't worry, the picture will remain among us if you don't want to share it. Okay. All right, so the last ones still have a chance to show their face. You have five seconds. Okay. How do I do that? Now I know. All right, smile, everyone. Yeah, I succeeded. I'll take another one. It works. Thanks. All right. See you, everyone, tomorrow.