All right, so today will be my last lecture, and I will mainly talk about record statistics for random walks. But before I leave the IID case completely, I just want to remind you of the main results that we obtained yesterday. So we were looking at records for IID sequences. That means you consider a set of random variables x_1, x_2, ..., x_n, and I want you to see the index, i or k, as a time index, having in mind that this could represent a discrete time series. The first object that we computed, considering the case where these random variables are continuous and also symmetric, is the probability that the record is broken at step k, the probability that a record happens at step k. This is what is usually called the record rate. I'm not sure I really mentioned the name yesterday. I denoted it r_k: the average value of the indicator variable sigma_k. We computed it and realized that it is just 1/k. We did one concrete calculation to obtain it, and we also gave a very simple argument to obtain it directly. Then essentially the rest of the lecture was devoted to the statistics of the number of records, which I denoted by R_n. We first obtained immediately the average number of records as simply the sum of the record rate from k = 1 to n, which is the harmonic number H_n; and in the large-n limit, the sum of 1/k is well known to behave as log n. Then we did an additional computation to get the average of R_n squared. And there, this was actually more involved, because we had to show that essentially the record events at different times are independent.
That means that the two following events, having a record at step k and having another record at step k', turned out to be uncorrelated. Precisely because the sum appears inside the square, we need to consider these correlations, and we showed indeed that they decouple. And I told you that this is actually a more general result, valid for variables which are exchangeable; it does not really rely on independence. Eventually, we found that the variance itself also behaves like log n. So with these two values: if I look at the probability that R_n is equal to m, as a function of m, it will be centered around the average value, which is log n, and the width will be the square root of the variance, namely √(log n). What you see here is that the relative fluctuations tend to 0: the width divided by the typical value goes to 0 as 1/√(log n). So the number of records tends to be more and more peaked around this value. Now, I also did the computation, which I have not reminded you of here, to make the link between the distribution of the number of records and the distribution of the number of cycles in a random permutation; I discussed quickly the Stirling numbers that count the cycles of a given permutation. I didn't show you how to do it, but eventually one can get the Gaussian behavior: the distribution of R_n, at least for its typical values, is a Gaussian centered around log n with a width proportional to √(log n).
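Since the IID results above are universal, they are easy to check numerically. Here is a minimal sketch (my own illustration, not from the lecture) that estimates the mean number of records by Monte Carlo and compares it with the harmonic number H_n; the choice of Gaussian variables is arbitrary, since any continuous p(x) gives the same answer.

```python
import random

def count_records(xs):
    """Number of records in a sequence: x_k is a record if it beats all earlier values."""
    best, n_rec = float("-inf"), 0
    for x in xs:
        if x > best:
            best, n_rec = x, n_rec + 1
    return n_rec

rng = random.Random(0)
n, trials = 20, 100_000
mean_records = sum(count_records([rng.gauss(0.0, 1.0) for _ in range(n)])
                   for _ in range(trials)) / trials
harmonic = sum(1.0 / k for k in range(1, n + 1))  # H_n = sum_{k=1}^n 1/k ~ log n
assert abs(mean_records - harmonic) < 0.05        # <R_n> = H_n, whatever p(x) is
```

Replacing `rng.gauss` by any other continuous distribution leaves the result unchanged, which is the universality statement.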
So now, today: one thing to keep in mind is that in all these results, p(x), the initial pdf of the x_i's, does not enter. It has completely disappeared. In other words, the statistics of records for IID variables is completely universal. In fact, one way to see it, which I didn't present because it is a bit technical, is to show directly that the records can be mapped onto the problem of random permutations, and the problem of random permutations is just universal: it does not depend on any p(x). So that's the IID case; it is perfectly well understood, although the mathematical structure is not that simple. Today, I would like to show you how to look at the same problem in the case where the x_i's are the positions of a random walker after step i. And what I want to show you is that we already have all the tools to do that: the survival probability that we have computed, and in particular the Sparre Andersen theorem, will be extremely helpful. The only thing we need to do is to formulate the problem properly, and that's what we will do today. So again, the idea is to look at record statistics for a strongly correlated set of random variables. I didn't discuss what the record statistics of weakly correlated random variables would be, like, for instance, the Ornstein-Uhlenbeck process; but basically, at large times, the record statistics of a weakly correlated sequence, with, say, correlations that decay exponentially, would eventually converge to the records of IID sequences. One can construct an argument similar to the one that I showed you last time for the extremes. That's why we need to consider sets of variables which are strongly correlated.
And the simplest, and certainly the most relevant, example is the case of random walks. We have already studied these random walks quite a bit, so I will again study this kind of Markov model: I start at 0 and evolve according to the Markov chain x_n = x_{n-1} + eta_n, where the eta_n's are the jumps, IID random variables with distribution p(eta), which I take continuous and symmetric. I will probably also comment a little bit on what happens if you, for instance, add a linear drift. The jumps might have a thin tail, like an exponential or Gaussian tail, but I will also include in the study the case of Lévy flights, which corresponds to a p(eta) with a power-law tail. And we have already seen the whole machinery of first-passage properties for all these families of jump distributions; you will see that we will apply it here, of course. So how does it look? Typically, I will have this kind of sequence. Let me start here at 0, and then I evolve with these jumps: up, down, up again, and so on. That's typically the kind of chain I would have. So I consider the case where I have n steps, and one should remember that since I start at x_0, with n steps I have n + 1 random variables: x_0, x_1, x_2, ..., x_n. And where are the records? By convention, I will assume that the first point is a record. So this one is a record; this is another one; then this is obviously another record, and this one is another record.
So here, with the same notation, I have four records in this example. Again, I would like to say something about the statistics of the number of records, so let's focus on that; and if I have time, I will also try to say something about the ages of the records. You remember that was another observable that we were interested in. But let's focus on the number of records. Yes? That's true, you mean here? These quick steps here are really just one step each; it was not very clear. Is that clear for everybody? So let's first look at the statistics of R_n. I will do exactly as before: I introduce the variable sigma_k, which is 1 if x_k is a record and 0 otherwise, and I use the same trick of rewriting R_n as the sum of these random variables. Now the sum starts at k = 0, because the convention is that we start at x_0, or at least that's my convention. So by definition, R_n = sum over k of sigma_k. If I want to compute the average, and when I say statistics, let's start with the first moment ⟨R_n⟩: this will again involve the quantity r_k, the average of sigma_k, and yesterday I convinced you that this is just the probability to observe a record at step k. So r_k is the probability that x_k is a record. Now let's try to compute this object. Look at this picture here, and suppose we would like to compute the probability to have a record here at step k. First observation, as yesterday: the probability to observe a record here is, of course, completely independent of what happens afterwards.
So let me just focus on this part here and erase the rest. I am looking at this sequence: I start at 0, and for the sake of clarity of the reasoning, let's suppose that the record is broken at a given value y. So I will first compute the probability that the record is broken at step k with some value y, and eventually I will integrate over y from 0 to plus infinity. Now, what do I do? I want to compute the probability that I start at 0 and arrive at y at step k for the first time, because it's a record; that means all the intermediate values have to stay below y. So let me do a transformation that we have already done several times. First I change the origin of space: I move my origin to the record value, which I can do because the random walk is invariant under translation. So this becomes my new origin in y, drawn as a dotted line: the record point is now 0, and the old starting point sits at minus y. Now, here I am actually using something: our random walk is symmetric, the jumps are symmetric, so I can also reverse time. Instead of going in one direction, I go in the other. So I choose my new origin to be at the record point, and I change the arrow of time; I can do that because the random walk is symmetric. So now you see what you have to compute.
The probability that step k is a record, at a given level y, is reformulated as the probability for a random walk to start at 0, arrive at minus y after k steps, and stay negative between step 0 and step k. Eventually, I need to integrate over all positive values of y, that is, over all negative values of minus y. So what I am actually computing is just the survival probability: the probability that my random walk, starting at 0, stays negative up to step k. I don't reverse the sign here, and I don't want to, because I want a framework that also allows me to treat non-symmetric walks. If you have, for instance, a drift, and I start flipping signs, that would become the survival probability on the positive axis but with the sign of the drift changed as well, and I don't want to do that; I just want to stick to this. So the point of this transformation is to convince you that r_k is something I will call q_k^-, the survival probability on the negative axis: the probability that x_1 is negative, x_2 is negative, and so on up to step k, starting at x_0 = 0. Up to now, it's true that I mainly considered the survival probability on the positive axis, but up to now I have also always considered symmetric jumps. So the standard picture I had before was more like: I start at 0 and want to stay negative up to step k. But alternatively, for symmetric jumps, which I am considering here, this is essentially the same as the survival probability on the positive axis up to step k.
Yeah, well, here I really consider the case where I go up to step k: I really want x_k to be a record. Yes, they all need to be negative here: the first one is 0, and all the others have to be strictly smaller than 0. Is that fine? Yeah, exactly. I did this picture because I just wanted to illustrate that this is indeed a first-passage problem. Maybe you saw it immediately from the beginning, and then it's fine; but otherwise it needs a little bit of construction to see precisely why, and also to see clearly that you need to take care that this is the survival probability on the negative axis. But I agree, this is relatively standard. Now, the point is that we know what this q_k^- is, at least for symmetric random walks. For symmetric jumps we have already encountered this quantity: q_k^- is what I called q_0(k) up to now, the survival probability of the random walk starting at 0 up to step k. And we have seen that we could actually compute this probability explicitly using the Sparre Andersen theorem, which tells you that q_k^- is the combinatorial factor binom(2k, k) 2^{-2k}. And what is quite remarkable in this theorem is that the result is completely independent of p(eta). Of course, the picture itself will depend on the jump distribution: if you look at how your random walk looks, in particular if you have heavy tails, it will look quite different.
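To see the Sparre Andersen universality concretely, here is a small Monte Carlo sketch (my own check, with two arbitrarily chosen jump laws): the survival probability on the negative axis is estimated for Gaussian jumps and for symmetric Cauchy jumps (a Lévy flight), and both agree with binom(2k, k) 2^{-2k}.

```python
import math
import random
from math import comb

def survival_fraction(step, k, trials, rng):
    """Monte Carlo estimate of q_k^-: walks from 0 staying strictly negative at steps 1..k."""
    alive = 0
    for _ in range(trials):
        x, ok = 0.0, True
        for _ in range(k):
            x += step(rng)
            if x >= 0.0:
                ok = False
                break
        alive += ok
    return alive / trials

rng = random.Random(1)
k, trials = 6, 100_000
exact = comb(2 * k, k) / 4 ** k   # Sparre Andersen: same for any continuous symmetric jumps
gauss = survival_fraction(lambda r: r.gauss(0.0, 1.0), k, trials, rng)
# symmetric Cauchy jumps (heavy power-law tails), via inverse-CDF sampling
cauchy = survival_fraction(lambda r: math.tan(math.pi * (r.random() - 0.5)), k, trials, rng)
assert abs(gauss - exact) < 0.01
assert abs(cauchy - exact) < 0.01
```

The two empirical fractions match the combinatorial value even though the two walks look completely different path by path.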
But if you look at these record properties, they are completely independent of the jump distribution. Now, another thing that we have seen is that for large k, for k much larger than 1, this behaves like 1/√(πk). So this is r_k, the record rate. The first observation: you remember that for IID random variables, the record rate was behaving like 1/k. So for a random walk, it is more likely to break new records; this 1/√(πk) should be compared to 1/k for IID. Of course, one should then expect the typical or mean number of records to be higher in this case. And this is something that we can easily evaluate, at least for large n. I can evaluate ⟨R_n⟩, because it is just the sum of the r_k, which I now know; and r_k behaves like 1/√(πk) for large k. In the large-n limit, this sum diverges, because r_k decays too slowly for the sum to converge as n goes to infinity. In other words, for large n the behavior of the sum is dominated by large k. So I can essentially replace r_k by its large-k behavior, 1/√(πk), and for large n, when you resum this, a √n comes out, with a prefactor 2/√π: ⟨R_n⟩ behaves as (2/√π)√n. It is a bit like replacing the sum by an integral for large n, and that is the origin of this factor 2. Of course, we can do better, because it is possible to do this sum explicitly.
So for those of you who like binomial coefficients and gamma functions, there is actually an exact result; let me first give it: ⟨R_n⟩ = 2 Γ(n + 3/2) / (√π n!), which is the same as (2n + 1) binom(2n, n) 2^{-2n}. And let me just mention that I like this derivation very much, which is actually not due to myself: this way of computing ⟨R_n⟩ was derived by Satya Majumdar and Robert Ziff, in a quite nice paper, actually. This one? Yes, this is for large values of k. So what I said is that we know r_k decays like 1/√(πk), so it's clear that the sum is dominated by the large values of k, close to n. That means that, at leading order — there will be some remainder, some corrections which I have not written — to evaluate this sum you can just look at what happens for large k. It is governed by this behavior, because if you formally let n go to infinity, the sum does not converge. When you are facing such a sum, which diverges as the upper bound grows, and the same goes for an integral, it means you are dominated by what happens at large values, so I can just discard the rest. I could do this in a more precise way; if you really want to do it nicely, that's why I gave you the exact formula, and then the analysis is very straightforward. But it is also nice to see that you can extract the leading term just by looking at the behavior of r_k for large k. Actually, you remember that for the IID case we had this 1/k.
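The exact formula can be checked against the sum of record rates with exact rational arithmetic. A small sketch of my own, assuming, as quoted above, r_k = binom(2k, k) 2^{-2k} and the closed form ⟨R_n⟩ = (2n + 1) binom(2n, n) 2^{-2n} = 2Γ(n + 3/2)/(√π n!):

```python
import math
from fractions import Fraction
from math import comb

def mean_records_exact(n):
    """<R_n> as the exact sum of record rates r_k = C(2k, k) / 4^k, k = 0..n."""
    return sum(Fraction(comb(2 * k, k), 4 ** k) for k in range(n + 1))

for n in range(40):
    closed = Fraction((2 * n + 1) * comb(2 * n, n), 4 ** n)
    assert mean_records_exact(n) == closed
    # same number in Gamma-function form: 2 Gamma(n + 3/2) / (sqrt(pi) * n!)
    gamma_form = 2 * math.gamma(n + 1.5) / (math.sqrt(math.pi) * math.factorial(n))
    assert abs(gamma_form - float(closed)) < 1e-8
```

For large n the closed form behaves as 2√(n/π), recovering the asymptotics above.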
And again, it is clear that in that case the series is not converging, so for IID you are also dominated by what happens at large values of k: the 1/k gives the log n, while here the 1/√(πk) gives the 2/√π times √n. Yes? You can actually have something different, of course. For instance, one case which is relatively simple: you add a drift. You look at this random walk and you put a negative drift on it. What will happen in that case is that, typically, your random walk will have a certain number of records near the origin: with a negative drift, the walk makes a few positive steps and then goes down forever, so you will typically have a number of records which is of order one. But you can actually obtain all the exponents that you want, in fact; it is quite tunable. That's true, but you see that you are summing these numbers, and these r_k are small. Well, I excluded that case because the sum is diverging here. It's a number of order one: I am looking at the leading behavior, √n. There will be some corrections, and it turns out that here the corrections are of order 1. The term k = 0, for instance, of course exists, it is 1, but it contributes to the constant, and I don't want to compute the constant here. I could do it, because I have an explicit formula, and I could obtain it easily. But you have to convince yourself that for the large-n behavior, you just need to consider the limit of large k. Is that clear? OK, and you remember that this was the case for Brownian motion, but not for the random walk.
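The drift remark can also be illustrated numerically. A sketch of my own (Gaussian jumps with mean −1, a choice made only for illustration): with a negative drift, the mean number of records saturates to a constant instead of growing like √n.

```python
import random

def mean_records(n, drift, trials, rng):
    """Monte Carlo mean number of records of an n-step Gaussian walk with a drift."""
    total = 0
    for _ in range(trials):
        x, best, rec = 0.0, 0.0, 1          # x_0 = 0 is the first record by convention
        for _ in range(n):
            x += rng.gauss(drift, 1.0)
            if x > best:
                best, rec = x, rec + 1
        total += rec
    return total / trials

rng = random.Random(2)
m_short = mean_records(50, -1.0, 20_000, rng)
m_long = mean_records(200, -1.0, 20_000, rng)
# quadrupling n adds essentially nothing: the record number stays of order one
assert abs(m_long - m_short) < 0.05
assert m_short < 5.0
```

Setting `drift = 0.0` instead recovers the 2√(n/π) growth discussed above.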
Because there was this problem for Brownian motion: since it is continuous both in space and time, when it crosses 0 it recrosses it infinitely many times. So in that case, the survival probability starting at x_0 = 0 should indeed be 0 for Brownian motion, but not for the random walk; that is true of the continuum limit of this object. Other questions? OK, so now we want to do something more elaborate. We want to know more: we would like to compute, for instance, the higher moments, the second moment say, or even the full distribution. If you think a little bit about the second moment, you will realize that what we did for the IID case is much more difficult to do here. Remember that for the IID case, to compute ⟨R_n²⟩ you have this kind of correlation, ⟨sigma_k sigma_k'⟩, to compute. And if you think about this quantity for the random walk, it is much harder: you have already seen that deriving the Sparre Andersen result was already quite complicated, using the Pollaczek-Spitzer formula, and now you are asking for something even more complicated, the record rate at two different times. This could be doable, but it would take quite some time. Instead, we will develop an approach which allows us to get immediately the full distribution of R_n. Before doing that, one remark: here I used the fact that for symmetric jumps you can apply the Sparre Andersen theorem. Now suppose that the jumps are not symmetric anymore, for various reasons: because you have added a drift, or because the distribution itself is asymmetric. It turns out that there is a generalized Sparre Andersen formula, and I just want to give it to you because it is very nice.
So basically, this q_k^- can in fact be computed for any jumps which are continuous, at least. So this is just a remark: the generalized Sparre Andersen theorem. Generalized in which sense? In the sense that you now suppose that p(eta) is continuous, again, but not necessarily symmetric; you relax that assumption. In that case, you again have a very nice formula. You remember that the Sparre Andersen formula came from a generating function; what you actually get is a result for this generating function. The q_k^- themselves are quite complicated objects: they really involve the full history of the random walk. And in fact, you can write it, and then I will comment:

  sum from k = 0 to infinity of q_k^- z^k = exp( sum from m = 1 to infinity of (z^m / m) Prob(x_m < 0) ).

That looks like a strange formula when you see it for the first time. But what is quite nice is that on the left-hand side you have a probability that concerns the whole history of your random walk from step 0 to step k: to know whether your particle has survived or not, you really need all the positions from step 0 to step k. On the right-hand side, the only input you have to give is the probability that your random walk is negative at step m only; that is purely local in time. You just need to know with which probability your walk is positive or negative. Now, in principle, this can be anything if the jump distribution is not symmetric. Let's see that we can recover the previous result: how do you recover the formula for continuous, symmetric jumps? For symmetric jumps, obviously, the probability to be positive or negative at a given step is just one half.
Yes, the sum on the left-hand side runs over k from 0 to infinity; inside the exponential the index is m, thank you, sorry. So if you have symmetric jumps, the probability to be positive or to be negative is the same, one half, and I can replace it by 1/2 in the sum. Then you recognize that the series is just the Taylor series of the logarithm: the sum over k of q_k^- z^k is the exponential of (1/2) times the sum over m of z^m/m, and the sum from m = 1 of z^m/m is just minus log(1 − z). So that is exp(−(1/2) log(1 − z)), which is just 1/√(1 − z). And this is what we had: you remember it? That's how I gave you the Sparre Andersen formula, how we derived it from the Pollaczek-Spitzer formula. Once you have that, you immediately recover this. But if the probability is not one half, then it is something else. So with this formalism you can, for instance, look at the case of a linear drift: you put a force on your random walk, and then you can study what the statistics of the number of records will be. This is something we did a couple of years ago. No, no, it is really x_m. But you see, if you take eta_m's which are symmetric, a collection of random variables eta_i which are symmetric, that means that the probability that they are positive is exactly the probability that they are negative; and a sum of these random variables is also symmetric. Take Gaussian variables centered at 0: a sum of them, we know, is a Gaussian, and it is also clearly centered around 0. Is that clear? It is because they are IID.
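The recovery of the symmetric result can be checked order by order with exact power-series arithmetic. A sketch of my own: exponentiate the series A(z) = sum_m z^m/(2m) using the standard recurrence for B = exp(A), namely n b_n = sum_{m=1}^{n} m a_m b_{n−m} (which follows from B' = A'B), and compare the coefficients with the Sparre Andersen values binom(2k, k) 2^{-2k}.

```python
from fractions import Fraction
from math import comb

K = 12
# p_m = Prob(x_m < 0) = 1/2 for a symmetric walk, so a_m = p_m / m = 1/(2m)
a = [Fraction(0)] + [Fraction(1, 2 * m) for m in range(1, K + 1)]
b = [Fraction(1)]                     # b_0 = exp(A)(0) = 1
for n in range(1, K + 1):
    b.append(sum(m * a[m] * b[n - m] for m in range(1, n + 1)) / n)

for k in range(K + 1):
    # coefficients of 1/sqrt(1-z): exactly the Sparre Andersen survival probabilities
    assert b[k] == Fraction(comb(2 * k, k), 4 ** k)
```

Replacing the a_m by m-dependent probabilities, e.g. for a drifted walk, gives the q_k^- of the asymmetric case from the same recurrence.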
Well, in fact, at least in this case it is true. In any case, the mean value is 0, whatever the p(x_i) is; they could even be correlated. Is that clear? When you sum a collection of symmetric random variables, typically the sum will be centered at 0. Take a spin system, the Ising model in the high-temperature phase: the spins can be plus or minus with essentially equal probability at high temperature, and if you sum the spins to get the magnetization, on average it is obviously 0, and the probability that it is positive is exactly the probability that it is negative; your system is invariant under a global inversion. It's the same here: you can change eta_i into minus eta_i, and the random walks will be statistically exactly the same. That's another way of saying it. OK, so now let's go to something more complete. We just looked at the first moment; now we want the full statistics of R_n. I will show you an approach which I have not really touched upon up to now, which is called the renewal structure of the random walk, and I will use this renewal approach to compute it. And instead of computing only the number of records, and this is something quite general, in many cases: you want to extract a somewhat complicated object, so instead you consider the probability of a much bigger object. Here, essentially, I will compute the joint probability of the number of records and of all the ages of the records. So I consider a very big object, but you will see that this very big object actually has a very nice expression. So I want the statistics of R_n, the full distribution, I should say; I was just saying statistics.
Let's call it the renewal approach. It starts with a picture, similar to this one, and let me define the objects that I want to consider now. I start at 0, and I will have this kind of final box that I drew. Yes? No, it is not an approximation. I think you are confusing two things. The first thing is that if you look at the average of this value in a given realization, it will not be exactly 0; but the probability that it is positive is exactly the same as the probability that it is negative. You don't like it; I can see that you are not convinced. No, it is really not an approximation; it is exact. How can you convince yourself in a simpler way? Take two random variables, each plus one or minus one, and make the sum: there will be some probability for each value, and if you construct the different probabilities that you can have, obviously the result is symmetric. Again, you have to realize the following. If I take symmetric jumps and look at the random walk x_i = x_{i-1} + eta_i, and then, with the same eta_i's, at a different random walk x_i = x_{i-1} − eta_i, these are actually the same random walks in law. Of course, if you really run them, they will be different: one will be minus the other. But statistically they are just the same, because the distribution of eta_i is symmetric: I have exactly the same probability to observe +eta_i and −eta_i. So this is really not an approximation: if the jumps are symmetric, this probability is exactly one half.
I look at this sequence, and I now want to define the quantities properly. There are these points that we know well by now, the records; I have some records here. But now I want to look also at the ages of these records. So the ages will be these intervals. I will define the age of the first record as the number of steps that I needed to break it. Here, typically, I will call it tau_1: I have 1, 2, 3, 4 steps, so tau_1 = 4; tau counts the number of steps. This is how I define the ages. In principle, there are two different ways of defining them: either I count the steps, or I count the number of points, and there are some subtleties with that. So I will just count the number of steps, for all of them. So here I have the first one; here I have tau_2, which is 1, 2, 3 steps; then another one, tau_3; then tau_4, which is 1, 2, 3, 4, 5, 6, so 6 steps; and here another one, 1, 2, 3, 4, so that will be tau_5, which is 4 steps. And here I have my final point: I have n steps in total, I start at x_0, and n counts the number of steps. So I have these objects. Now you see that there is finally one more object here: I look at what happens on a finite time interval of length n, so there is a last interval here which is a bit singular, singular in the sense that it is not exactly of the same nature as the tau's, and I will comment on that. I will call it A_n, and here it is 1, 2, 3, 4, 5: I again just count the number of steps. But I want to draw your attention to the following property. All these quantities, these tau_i's, are essentially all the same type of variables. Because how are they defined? You see that I started at 0.
And this tau_1 is the first time that I passed or crossed this initial value. Now the same here: what is this tau_2? Basically, I have a random walk, and after time tau_2 I pass or cross this value again for the first time. So you have these kinds of excursions; each of these portions here is called an excursion. Tau_3 is here. So I have a collection of excursions here. It's like I have glued together these different excursions of a random walk: I start, say, at 0, and I look at when I cross the origin for the first time from the negative side. So you always have this picture here. Now, of course, this last one is different, simply because the last record plays a singular role with respect to the others: by definition, this one has never been broken. It will be broken at some point later with probability 1, we know that for the random walk, but here it is an unfinished excursion. So these are excursions, and this last one is an unfinished excursion. So what I want to compute, really, is the joint distribution of all these guys. That's the goal: I want to compute the joint probability that tau_1 is equal to l_1, tau_2 is equal to l_2, and so on; these are all discrete numbers. Now, the last guy here, generically, will carry the index R_n minus 1: it is tau_{R_n - 1}. Let's check it: we have 1, 2, 3, 4, 5 records, R_n is 5, and the index of this last tau will be R_n minus 1. You can just figure this out. Finally, I want that a_n is equal to some value, say a, and I want also, jointly, that the number of records is equal to m. So that's a very big object. And in principle, once I have constructed this, I can compute the distribution of R_n solely by summing over these random variables. OK?
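To make the definitions concrete, here is a small sketch that extracts the record ages tau_i and the age a_n of the last, still-standing record from a given trajectory. The sample walk is hypothetical, chosen only for illustration; note how the ages automatically satisfy the global constraint discussed below:

```python
def record_ages(walk):
    """Given positions x_0,...,x_n of a walk, return the ages
    tau_1,...,tau_{R-1} of the broken records, the age a_n of the
    last (still standing) record, and the number of records R."""
    record_times = [0]                      # x_0 is always a record
    for k in range(1, len(walk)):
        if walk[k] > max(walk[:k]):         # strict record at step k
            record_times.append(k)
    taus = [t2 - t1 for t1, t2 in zip(record_times, record_times[1:])]
    a_n = (len(walk) - 1) - record_times[-1]   # unfinished excursion
    return taus, a_n, len(record_times)

# hypothetical sample walk (illustration only)
walk = [0, -1, 1, 0, 2, 1, 3, 2, 1, 2]
taus, a_n, R = record_ages(walk)
# the completed ages plus the last incomplete age always sum to n
assert sum(taus) + a_n == len(walk) - 1
```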
So maybe I have to make some comments on the values of these numbers here. You see that these l_i's need to be at least 1; if some l_i were less than 1, it wouldn't mean anything, it would say that there was no excursion. So each l_i has to be at least 1. But if you think a little bit about it, the value of a here can be 0. Imagine that you break the record at the last step: then a_n will just be 0. So a can actually be 0, and that corresponds to the situation where you break the record at the last step. So a itself can be 0 or positive. They are all integers, of course, because this is a discrete-time random walk. And then there is m. Obviously, m also has to be greater than or equal to 1, because the first value, by definition, is always a record. So the question is, how do I compute that? Well, the computation of this big object essentially relies on two properties and two tools. The tools we have already, and I will recall them in a minute. Now, what are these two properties? Well, the first property is that the random walk is a Markov process, and as a consequence, all these variables, the tau_i's and a_n, are just independent. That should be quite clear from the Markov property of the random walk. So the RW, for random walk, is Markovian. And as a consequence, you see, when you are here, the length of this excursion will obviously be completely independent of what happened before, just because this guy here doesn't know anything about what happened before when he jumps. So the tau_i's and a_n are independent. Well, this is almost true: in fact, they are almost independent. Why is that? There is a little constraint on these guys, which is that their sum actually has to be exactly equal to n, because I fixed n.
And so obviously tau_1 plus tau_2 plus ... plus tau_{R_n - 1} plus a_n must be equal to n, exactly. And that, of course, induces some constraint on these random variables. But that's the only constraint that they have; this is why I said "almost" independent. What I wrote here can also be viewed as a condition on a_n: this a_n is just n minus the sum of the tau_i's, and that obviously creates some correlations between a_n and the other tau_i's. Think about it as you wish, but it's important at least to see that, as a whole, there is some correlation between them; and the correlation is that a_n must fill the gap up to n. I said "almost" again because of this global constraint. There is a global constraint, and we will see that it plays a role. Of course, when n is very large, this constraint is not very strong, but still it's there. So that's the first property. Now, the other property is that we know that the random walk is invariant under translation. That means that, typically, if you look at a random walk that starts at x_0 equal to 10, or a random walk that starts at x_0 equal to 100, and you just translate them to start at 0, then they will be exactly the same, statistically. So in other words, this random walk is invariant under translation. And that means the following, if you look at the probability distribution of tau_1, for instance. So for tau_1, again, you start at 0 here, and you look at the first passage through x_0, through the initial point here, which is 0 in that case. So this is one guy. But now here you are sitting, and I can just view this part here, this excursion, as a translation of this one.
But I could imagine that instead of starting at this point here, I have a random walk that starts exactly at 0, and again I ask: what's the probability that I cross 0 for the first time? What I want to say is that, because this random walk is invariant under translation, these tau_i's are just identical: they are different realizations of the same random variable. So the tau_i's are identically distributed. Let's say it like this. So now I need two tools, basically. I should have divided this blackboard; my picture is a bit big, but it's nice to have it, because otherwise it's very hard to figure out what we are doing. So let me just erase this and keep the statement that the tau_i's and a_n are almost independent, reminding ourselves that we have this constraint: the "almost" refers to that. Now, what are the two tools that we need? Well, these are the tools to compute these quantities. What we want to compute is really the probability distribution of these tau_i's and of a_n, so we need to think a bit about what tau_i is and what a_n is. In other words, from what we know already, we have a partial answer to this joint distribution, which is simply that it will be a product: let me call it f of l_1, f of l_2, and so on, up to f of l_{m-1}. That's roughly speaking what I want to have. So let me just define the two tools that we will need. The first one is this guy: I want to compute the probability that tau_i is equal to l_i, and that's f of l_i. This is the first quantity that I want to compute. And on top of that, I will need, in addition, the probability that a_n is equal to a.
So that's the second quantity that I want to get, the probability that a_n is equal to a. We will see how we can compute that. And this is what I mean by saying they are almost independent: I should now have a delta function here that imposes the constraint on the l_i's, that is, delta of l_1 plus l_2 plus ... plus l_{R_n - 1} plus a, and this has to be equal to n. OK, can you read that? This is just a Kronecker delta: it's 0 if the two arguments are different, and 1 if they are the same. You know this notation? Delta_{ij} is 1 if i is equal to j, and 0 if i is different from j. So here I impose that the global sum is exactly equal to n. Yeah, sure. Again, they are independent because this is a renewal structure; the random walk satisfies a renewal property, basically: each time you have crossed zero, essentially you can forget about what happened before and you restart. But nevertheless, I am working here at fixed n. You could instead say, I never stop: you could look at random walks on an infinite horizon, as a mathematician would say. You let n go to infinity, and that will eventually converge to something which might have a good limit. Here I'm doing something a bit different: I want to look at what happens after n steps. This last delta here? Yeah, the subscript notation is not very readable, so it's better for me to use a slightly different notation. I will write delta of (i comma j) instead. So this will be delta of (l_1 plus l_2 plus ... plus l_{R_n - 1} plus a_n, n). That's important.
Is that fine? Well, I mean that they are independent except for the constraint: you have independent random variables, but you fix the constraint. That's almost a definition, I would say. Once you know that these tau_i's and this a_n are independent, which is granted by the fact that you have a Markov chain, you nevertheless have this delta; this is the only thing that correlates these guys. The other way to think about it, as you were mentioning before, is to remove this delta and simply say that a_n has to be n minus the rest. This is another way of saying it, and again, it's just the same. But you will see that technically it's much nicer to write the thing with a delta function; you will see this in a moment. So now what I want to compute is this f. Maybe I should actually write f minus; that's better. I will use the notation f minus because it's related to q minus, as I suppose you have noticed. So f minus of l_i, we have already seen it, right? Because this is just this probability. Let me just draw this small cartoon here. You start at 0, and you do this random walk, and exactly between step l_i minus 1 and step l_i, you cross 0 for the first time. So you start at 0, and this is what we have already computed, which is the first passage probability from below. You remember that? We have seen that. Now, you probably also remember that this first passage probability can be related to the survival probability. The vertical axis here is x, so it's space, and the horizontal axis is time. I've just picked one of these segments here; I'm looking just at this segment. What I'm saying is that, again, what is tau_i? Tau_i is defined as the first passage time through your initial point.
And you want to know the distribution of these guys. So I just isolated one of these pieces, one of these excursions: this one. I started at 0, but you can start anywhere else, as I said, because it's translation invariant. So you start at 0, and you want to compute the probability that you recross 0 from below for the first time between step l_i minus 1 and step l_i. That's the definition. And what I claim is that this probability is exactly this first passage probability. Now, this first passage probability, we have seen it: you can view it as minus the discrete derivative of the survival probability. In other words, this f minus of l_i, you can write it as q minus of (l_i minus 1) minus q minus of l_i. OK, let me write it better, because you won't see anything here: f minus of l_i equals q minus of (l_i minus 1) minus q minus of l_i. Would you buy that? We have seen that already, probably one time. But if you don't like it, you have the right, and I can re-explain it; just tell me. One way to see it is that q minus is the survival probability. If you look at q minus of k, a way to say that you have survived up to step k is to say that you recross the origin for the first time at step k plus 1 or later. So q minus of k is just the sum from l equal k plus 1 to infinity of f minus of l. That's another way of saying it: if you have not crossed the origin up to step k, that means that you recross it afterwards. And we know that this probability is finite, so I can go up to infinity. So this is basically the probability that the first passage time is strictly larger than k, and you have this identity. Once you have this, if you buy that, then you should be able to derive the relation I wrote. Is that fine for this object?
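The identity f^-(l) = q^-(l-1) - q^-(l), together with q^-(k) being the sum of f^-(l) over l > k, can be checked numerically. The sketch below assumes the Sparre Andersen value q^-(k) = C(2k, k)/4^k for symmetric jumps with a continuous distribution, which is quoted later in the lecture:

```python
from fractions import Fraction
from math import comb

def q_minus(k):
    """Sparre Andersen survival probability for a symmetric walk with
    continuous jump distribution: q^-(k) = C(2k, k) / 4^k (universal)."""
    return Fraction(comb(2 * k, k), 4 ** k)

def f_minus(l):
    """First-passage probability as a discrete derivative of survival."""
    return q_minus(l - 1) - q_minus(l)

# q^-(k) = sum_{l > k} f^-(l): telescoping gives
# sum_{l=1}^{L} f^-(l) = q^-(0) - q^-(L) = 1 - q^-(L),
# so the partial sums approach 1 because q^-(L) decays to 0.
L = 50
assert sum(f_minus(l) for l in range(1, L + 1)) == 1 - q_minus(L)
assert float(q_minus(L)) < 0.1   # decays like 1/sqrt(pi L)
```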
That's the first building block, because we have many of them here. And now I need to treat separately the case of a_n. So what is a_n? Well, it's quite simple, in fact. Let's just isolate what a_n is on a small graph here. So this q minus of k was the first tool; now I want a similar graph for the second tool. You see, for this last part, this one is a record, and by definition it is the last record. That means that after it, all the values need to be lower than this guy. So in other words, on this last segment of a_n steps, I have a random walk that starts, let's say, at zero and stays negative up to step a_n. That's, again, a survival probability. So the probability that a_n is exactly equal to a is just q minus of a, because I'm just looking at this kind of thing: I have a random walk that starts here, I don't know where, but I don't care, because this segment is a walk which is essentially independent from the rest. The only thing is that it starts at this point here, but since the random walk is translationally invariant, I can just start it at zero. And the probability that a_n is equal to a is just that I have this kind of excursion of a steps: I need to stay negative up to step a. So it's very nice, because essentially I know what it is. Yeah, that's a good question: where is it? It's in the sum, on this index here; the index is the number of these guys that I have. Yes, I can replace it by m, actually, if you prefer. Maybe it's better. It's the same here, but yes, let's replace it by m. You're right.
And here, in this case, I'm mixing a bit the random variable and its value, which physicists usually don't mind doing, but mathematicians don't like. So the dependence on m, where m is the number of records, is basically in this number of factors here, and also in the delta function. That's a good question, but that's the only way it enters, OK? So now, in principle, all the information is in this guy here. Of course, it's a little bit complicated, but we will see that it actually has a quite nice structure. So let's see how it works. I can erase this now; I hope I could convince you that this joint distribution is given by that, OK? Now, because of this delta function, if I think a little bit about this model in the context of statistical mechanics, it's a bit as if each segment carried a number of particles: this delta function then imposes me to work in a kind of canonical ensemble, and usually we don't like too much to work in the canonical ensemble. It's much easier to let n vary, precisely, and to work in the grand canonical ensemble; and the grand canonical ensemble is just the generating function of the partition function of the canonical ensemble. So I will do that: I will take a generating function of this guy with respect to n, and you will recover something which is similar to the grand canonical ensemble, OK? So that's a remark; that's one way to do it. So I have now a joint law of these l_i's, of a, and of m. It's a big object, and I don't like it too much, because this delta function is a bit annoying. So I have this f minus of l_1, up to f minus of l_{m-1}, q minus of a, and I have this delta. So another way of looking at it, if you want, is to say that I have a certain convolution of these laws.
I can view that as a kind of convolution, if you want, and what turns out to be much nicer to work with is actually the generating function. So why do I do that? Let's stick for the moment with what we have. What I want to compute is the distribution of m: my goal is to obtain this guy, this probability. I could ask other questions from all this; I could look at the ages of the records, of course, but I will not do that today. I just want to look at the number of records, starting from this renewal structure. Maybe a word on this product here: the fact that I have a product here is called a renewal structure. It's extremely powerful, and it can be generalized to other stochastic processes. So, to get the distribution of m, you would agree, I guess, that I need to sum over all the possible values of the l_i's and of a. So that's what I need to do in principle: I sum each l_i from 1 to infinity, and I have an additional sum to perform, over a from 0 to infinity; you remember that the l_i's start at 1, but a can also be 0. So I sum this quantity, f minus of l_1, up to f minus of l_{m-1}, q minus of a, and I have this delta function. So again, it's really a convolution of these different laws. Now, as I said, I don't like too much this delta function, which forces me to work somehow in the canonical ensemble, which is usually quite hard. So what I will do is take a generating function with respect to n: I will start at n equal to 0, and I will compute this guy instead. Yeah, sorry? OK, maybe I should not have said that. So physically, you can think of this delta function as follows: think a bit about the blocks that I have in my system. I want to think about them as some particles; each block is a particle. Now, for some reason, you see that this constrains me, or let me say it differently.
In each block I have a certain number of particles, which is the l_i's and a, and I want the total number of particles to be exactly equal to n. So it's a little bit constraining, because if I ask you, please evaluate the sums here, they are quite intricate. I don't like this constraint too much. So I will say, OK, let's forget about it: I will just work in the grand canonical ensemble, and I will introduce a chemical potential. My chemical potential here, well, you might think of z as a fugacity: I could think of z as the exponential of beta mu. This is an analogy. And then I will have the exponential of n beta mu, and if you sum over n, you switch from the partition function in the canonical ensemble to the partition function in the grand canonical ensemble. Again, I think it's nice to see that the grand canonical ensemble is basically the generating function of the canonical ensemble, in some sense; it's very useful to think about it in these terms, and it gives you a nice tool here. Why do you do that? Obviously, because of what happens when you now sum; but this is just a remark. If you don't like this analogy with the grand canonical ensemble, forget about it. I like it, and I think it's useful, but not necessary, I would say. So now you see that I have to do this sum from n equal to 0 of z to the n; that's what I need to sum now. It looks complicated, but in fact it's quite simple, and it is quite simple because of this delta function here. Because the sum over n here is easy: the sum over n will just give me z to the power l_1 plus l_2 plus ... plus a. So what I claim is that I can do the sum over n of z to the n, and I will write what comes out. I have a sum over l_1, up to a sum over l_{m-1}, each running from 1 to infinity, and then, OK, let's do it.
Still a sum over a from 0 to infinity, and then this f minus of l_1, up to f minus of l_{m-1}, q minus of a. But then, you see, with z to the n times the delta function, summed over n, the only values that survive are precisely the n which are equal to l_1 plus l_2 plus ... plus l_{m-1} plus a. And now life becomes much, much nicer, because, OK, I have a huge number of sums here, but I have a sum of products which can be rewritten as a product of sums, right? Because they are completely decoupled. What I'm saying is that the sum over l_1, the sum over l_{m-1}, and the sum over a from 0 are just completely decoupled. So in other words, I can rewrite it like that. Let me collect what depends on l_1: that's just this f minus of l_1 times z to the power l_1, summed from 1 to infinity. That's the first block. Now let's look at l_2: well, l_2 is basically the same, right? The sum from 1 to infinity of f minus of l_2 times z to the l_2. You agree? And the same up to this l_{m-1}: the sum from 1 to infinity of f minus of l_{m-1} times z to the l_{m-1}. I'm almost done; I still have one guy, which is the sum over a, OK? So you see that they do factorize. That's the key point of using the grand canonical ensemble, or this generating function: here I had a very intricate constraint, but in the generating-function space it becomes extremely simple. Is that fine? But now, in fact, all these objects are just the same, because l_1 or l_2 are just dummy variables here. So at the end of the day, I obtain something very nice. Let me just keep this in the corner, because I want to tell you something about it at some point. What I'm saying is that these are all the same functions here, and these functions, on top of that, have a nice interpretation: they are just the generating functions of these objects.
So this guy is the generating function of f minus of l, and this one is the generating function of q minus, which we actually know: Sparre Andersen tells us exactly what this guy is. So we have a nice expression; let me write it like this. The generating function of this probability, the sum over n of z to the n of P(R_n = m), is: f tilde minus of z, to the power m minus 1, times q tilde minus of z. Now, what are these guys? They are just the generating functions: f tilde minus of z is the sum from l equal 1 to infinity of f minus of l times z to the l, and q tilde minus of z is the sum from a equal 0 to infinity of q minus of a times z to the a. So at the end, it's very simple. I mean, simple: we had to work a little bit, but eventually it was a good idea to consider a much bigger object. And that's something very frequent when you want to compute some marginals. You isolate some object, you want to compute its distribution, and you quickly realize that it will be quite hard, because it looks like a quite complicated object. So you look precisely at what it depends on, you compute the full distribution of all the variables on which the number of records depends, and from it, by successive summation, you obtain what you want. And at the end of the day, with these generating function techniques, you have something very simple. Now, what I want to say is that it's even simpler here, because we are considering symmetric jumps. This q tilde minus is very simple: it is given by Sparre Andersen, and Sparre Andersen tells you that q tilde minus of z is just 1 over the square root of 1 minus z. Now, what about the other one? Well, this one, you see, is also very simple, because f tilde minus of z is just related to q tilde minus: it turns out that f tilde minus of z is just 1 minus the square root of 1 minus z. So this is Sparre Andersen.
I hope you recognize it. Now, this second formula is a consequence of Sparre Andersen. How do I see that? I just need to use this identity here. Let's do it as a small exercise; then we will have a beautiful, very simple formula. And again, you see that the only thing that we really need is this q minus. I told you that up to now, the only thing I needed to compute everything was this q minus, and what I say is that here it is basically the same. Because if I want f tilde minus of z, this is the sum over l of f minus of l times z to the l. Now, I just replace, these are standard identities when you manipulate generating functions: this f minus of l, I can write it as q minus of (l minus 1) minus q minus of l, times z to the l. So let's see what that gives. I just separate the two sums; they converge because z is smaller than 1, so the radius of convergence is fine. For the first term, let me take a z outside and write it as z times the sum of z to the power l minus 1 times q minus of (l minus 1). And then I have a second term, with a minus sign: the sum from l equal 1 to infinity of q minus of l times z to the l. OK, we didn't do much yet. Now I will do two things. In the first term, I just change the index and set l prime equal to l minus 1, so it becomes the sum from l prime equal 0 to infinity of z to the power l prime times q minus of l prime. I just made a shift, a translation of the index. So this I know: this is just Sparre Andersen. Now, what about the second term? Well, this guy, if I look at it, is almost given by Sparre Andersen too, but there is just one term missing, the term corresponding to l equal to 0, which is q minus of 0 times z to the 0. So I add and subtract this term, and now I can use Sparre Andersen; and the first term is z times that.
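The two generating-function statements, the Sparre Andersen series q~(z) = 1/sqrt(1-z) and the consequence f~(z) = 1 - sqrt(1-z), can be checked order by order with exact rational arithmetic. A sketch, assuming the binomial form q^-(k) = C(2k, k)/4^k for the survival probability:

```python
from fractions import Fraction
from math import comb

N = 30  # truncation order for all power series below

# Survival probability q^-(k) and first-passage f^-(l) as its
# discrete derivative, as derived above.
q = [Fraction(comb(2 * k, k), 4 ** k) for k in range(N + 1)]
f = [Fraction(0)] + [q[l - 1] - q[l] for l in range(1, N + 1)]

# Taylor coefficients of sqrt(1 - z) from the generalized binomial
# series: coeff_n = binom(1/2, n) * (-1)^n, built by recursion.
sqrt_coeff = [Fraction(1)]
for n in range(1, N + 1):
    sqrt_coeff.append(sqrt_coeff[-1] * (Fraction(1, 2) - (n - 1)) / n * -1)

# Sparre Andersen: q~(z) = 1/sqrt(1-z), i.e. sqrt(1-z) * q~(z) = 1.
prod = [sum(sqrt_coeff[j] * q[n - j] for j in range(n + 1))
        for n in range(N + 1)]
assert prod[0] == 1 and all(c == 0 for c in prod[1:])

# And the consequence derived above: f~(z) = 1 - sqrt(1-z),
# coefficient by coefficient (f~ has no constant term).
assert f[0] == 0
assert all(f[n] == -sqrt_coeff[n] for n in range(1, N + 1))
```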
Now, q minus of 0 is the probability that, starting at 0, I survive for zero steps: that's just 1. And then, if I isolate the 1 and sum these two pieces, I obtain what I wrote before: f tilde minus of z equals 1 minus (1 minus z) times q tilde minus of z, which is 1 minus the square root of 1 minus z. So essentially this guy is Sparre Andersen also; this guy has computed everything for us. And now, basically, we can compute anything we want; it depends a bit on what you really want, but I am almost done. I arrived at least at this generating function, so let's write it explicitly: the sum over n of z to the n of P(R_n = m) is (1 minus the square root of 1 minus z) to the power m minus 1, divided by the square root of 1 minus z. So that's the final result of this somewhat long calculation, and at the end it is relatively simple. Of course, when you see it for the first time, it might appear a bit cumbersome; I did it quickly because I've done it a hundred times, so I see how it works. But that's the formula that we have. And now, if you work a little bit more, you can extract the distribution: you need to work out what P(R_n = m) is. Here it's a little bit more complicated. In principle, how do you do it? The answer is hidden here: if you really want to compute the probability that R_n is equal to m, you need to expand this series in powers of z and look at the coefficient of the term z to the n. In principle, with Mathematica, for instance, you could do this very simply. And if you are a bit more clever, you can actually get it analytically, for instance using the Cauchy formula that I gave last time. So maybe I can just show you what the final formula is. This was, I think, first obtained for the records by Majumdar and Ziff, and it is this very nice, compact result: P(R_n = m) equals the binomial coefficient (2n minus m plus 1 choose n), times 2 to the power minus (2n minus m plus 1). It's simple: a binomial times two to a power. It turns out that in the probability literature, this formula also has various interpretations.
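The closed form can be confronted directly with the generating function: expanding f~(z)^(m-1) q~(z) in powers of z and comparing coefficients. A sketch with exact fractions (the series are truncated at a hypothetical order N chosen for illustration):

```python
from fractions import Fraction
from math import comb

N = 20  # series truncation order

q = [Fraction(comb(2 * k, k), 4 ** k) for k in range(N + 1)]    # q^-(k)
f = [Fraction(0)] + [q[l - 1] - q[l] for l in range(1, N + 1)]  # f^-(l)

def convolve(a, b):
    """Product of two power series, truncated at order N."""
    return [sum(a[j] * b[n - j] for j in range(n + 1)) for n in range(N + 1)]

def record_number_series(m):
    """Coefficients of sum_n z^n P(R_n = m) = f~(z)^(m-1) * q~(z)."""
    series = q[:]
    for _ in range(m - 1):
        series = convolve(series, f)
    return series

# Majumdar-Ziff closed form:
# P(R_n = m) = C(2n - m + 1, n) / 2^(2n - m + 1),  1 <= m <= n + 1.
for m in range(1, 6):
    series = record_number_series(m)
    for n in range(m - 1, N + 1):
        assert series[n] == Fraction(comb(2 * n - m + 1, n),
                                     2 ** (2 * n - m + 1))
```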
If you like the book by Feller, for instance, you will see that this kind of formula actually appears there. If you don't like Feller's book, you have the right; then you can work it out by yourself, or see other papers: Pitman, for instance, has many derivations of these results in a different context. Anyway, a simple way to do it is to expand this using the binomial coefficients, use Cauchy's formula for all the terms, re-sum everything, and, using some nice properties of binomial coefficients, you arrive at this formula. So once you have that, you can compute everything. A nice question that you may ask is what the typical distribution is in the large n limit. Now, it turns out that in the large n limit, one can extract the behavior of this formula simply by using Stirling's formula, if you want; that's probably the easiest way to get it once you have this expression. Of course, from it you can recover the first moment that we got. Maybe I can just mention that quickly, otherwise it will be a bit long. So you have an explicit formula, and once you have it, I want to understand its large n behavior. So what happens for large n? As usual, a good way to understand the large n behavior, if you don't do it blindly, which would lead nowhere, is to start by looking at the first moment. Now, we have seen already that the mean of R_n is actually proportional to the square root of n, and we have this coefficient here, 2 over the square root of pi. You can check this formula, of course, starting from the distribution: you sum over m of m times this probability.
And then, once we have the full distribution, we can also get the second moment of R_n, or the variance. Now, quite interestingly, what you would find is that the second moment is actually proportional to n, and not to the square root of n; I can give you the coefficient, it's very simple: it is 2n. So what you see, and that's really at variance with what we observed for IID random variables, is that the typical fluctuations are of order the square root of n, and this is actually of the same order as the mean itself. That's quite different from what we observed in the IID case, because, let me just recall it here: for IID we had the mean of R_n going as log n, fine, but we also had the variance, R_n squared minus the square of the mean, going as log n. So in other words, the fluctuations there are of order the square root of log n, and that's what I meant by saying that they are much smaller than R_n itself: the relative fluctuations are square root of log n divided by log n, that is, 1 over square root of log n. Here, on the contrary, the fluctuations are of the same order as the mean. So the fluctuations in the random walk case are much stronger: you break more records, but the fluctuations of this number of records are also much, much stronger. And when you have such a case, where essentially the width is of the same order as the mean value, what you can show is that, if you go to the higher-order cumulants, they will of course all be comparable as well.
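The two leading behaviours quoted here, the mean growing as 2 sqrt(n/pi) and the second moment as 2n, can be checked against the exact distribution at moderate n. A sketch, assuming the Majumdar-Ziff closed form written above:

```python
from fractions import Fraction
from math import comb, sqrt, pi

def record_dist(n):
    """Exact P(R_n = m) for m = 1..n+1 (Majumdar-Ziff closed form)."""
    return {m: Fraction(comb(2 * n - m + 1, n), 2 ** (2 * n - m + 1))
            for m in range(1, n + 2)}

n = 400
p = record_dist(n)
assert sum(p.values()) == 1                        # proper distribution

mean = float(sum(m * pm for m, pm in p.items()))
second = float(sum(m * m * pm for m, pm in p.items()))

# Leading large-n behaviour quoted in the lecture:
# <R_n> ~ 2 sqrt(n/pi)   and   <R_n^2> ~ 2n.
assert abs(mean / (2 * sqrt(n / pi)) - 1) < 0.05
assert abs(second / (2 * n) - 1) < 0.10
# Hence Var(R_n) ~ (2 - 4/pi) n: the width is of the order of the mean.
```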
So that suggests, or you would expect, that this distribution takes a scaling form for large n: the probability that R_n equals m behaves like 1 over square root of n times some function g of m divided by square root of n. That means the typical value of R_n scales as square root of n, and the 1 over square root of n prefactor is simply there for normalization, so that g is a density. And since m is positive, I would expect that g of x is normalized on the positive axis. The question is: what is this g? Well, this g is actually a simple Gaussian here. It's a bit disappointing, but that's the result of the computation. That means if you start from this formula and do the asymptotics, you just use Stirling's formula, n factorial behaving like square root of 2 pi n times the exponential of n log n minus n. If you replace the factorials by this, what you find is that g of x is, first, defined on the positive axis, and it is the exponential of minus x squared divided by 4, normalized by 1 over square root of pi. So what is g of x in this case? It's a half-Gaussian. It's not really a Gaussian, it's a half-Gaussian. So the situation is quite different. Now, just to finish, let me compare what we had for the IID case and for the random walk; I can erase this, I just want to make a comparison to end up. So what are the differences? How do they look? For the random walk, if I look at this probability as a function of m, what I have is something centered close to 0, basically like that.
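To make the half-Gaussian concrete (again a sketch of my own, under the same Gaussian-step assumption as before): the scaling form says that R_n divided by square root of n should, for large n, behave like a random variable with density g(x) = exp(-x^2/4)/sqrt(pi) on x >= 0, whose first and second moments work out to 2/sqrt(pi) and 2. A short simulation is consistent with that:

```python
import math
import random

rng = random.Random(2)
n, trials = 400, 20000

def scaled_records():
    """R_n / sqrt(n) for one Gaussian-step walk (x_0 = 0 counts as a record)."""
    x, best, r = 0.0, 0.0, 1
    for _ in range(n):
        x += rng.gauss(0, 1)
        if x > best:
            best, r = x, r + 1
    return r / math.sqrt(n)

xs = [scaled_records() for _ in range(trials)]
m1 = sum(xs) / trials
m2 = sum(x * x for x in xs) / trials
# moments of g(x) = exp(-x^2/4)/sqrt(pi) on x >= 0: <x> = 2/sqrt(pi), <x^2> = 2
print(m1, 2 / math.sqrt(math.pi))
print(m2, 2.0)
```

Note that the second moment being 2 matches the coefficient quoted above for the second moment of R_n, namely 2n.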
And the width here will be quite large: the width is of order square root of n. That's what we have shown. And this function here is just the half-Gaussian that I computed, this one. What about the IID case? Well, in the IID case things are quite different. If I look at the probability that R_n is equal to m — and of course this is a picture you should have in mind in the large-n limit; we have exact formulas for finite n if you want, but I'm talking here about large n — what happens is that we have a Gaussian centered around log n, with a width of order square root of log n, which is relatively very small. I'm cheating a bit in the picture, because of course the width does grow; it's the relative width that shrinks. So you see, this one is a full Gaussian, completely symmetric around log n — a full Gaussian, compared to the half-Gaussian. OK, now I leave this as an exercise, if you want. It turns out that the number of records here has exactly the distribution of the maximum of the random walk, if you remember. And that's actually not a coincidence. If you think a little bit about it, for some random walk models there is a direct mapping between the number of records and the maximum of the random walk. This is actually true for the plus-minus random walk: you just take right or left steps, each with probability one half. If you think a little bit about it, you can directly relate the number of records to the maximum of the walk, and that's where this analogy comes from.
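For the plus-minus walk the mapping is easy to see: every new record exceeds the previous record by exactly one step, so the number of records is always the running maximum plus one (the plus one counting the starting point x_0 = 0). A tiny path-by-path check of this identity (my own illustration, not code from the lecture):

```python
import random

rng = random.Random(3)

def records_and_max(n):
    """One ±1 walk of n steps; returns (number of records, running maximum)."""
    x, best, r = 0, 0, 1  # x_0 = 0 is the first record
    for _ in range(n):
        x += rng.choice((-1, 1))
        if x > best:
            best, r = x, r + 1  # a record on a ±1 walk raises the max by exactly 1
    return r, best

for _ in range(200):
    r, m = records_and_max(500)
    assert r == m + 1  # exact on every path, not just on average
print("R_n = M_n + 1 on every sampled path")
```

Since the identity holds realization by realization, the full distribution of the record number coincides with that of the maximum, shifted by one, which is the origin of the analogy mentioned above.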
OK, so of course I treated a very specific case, discrete-time random walks with symmetric steps, and I could show you how we can use these ideas, essentially from first-passage properties, to build the full joint probability and ask nice questions about this problem. There are many extensions around it. You will find the latest extensions in the review that I wrote with my colleagues Claude Godrèche and Satya Majumdar. This has been extended to various types of random walks, like continuous-time random walks, which are quite popular right now. So there are many, many things that can be done with this material. And I see that I'm already late, I'm sorry, so with this I will leave you. Good luck for the rest of the school.