So this is the third lecture, though I can do the second lecture again if you want: back by popular demand, my second lecture. OK. At the very end of the second lecture we were talking about recurrences for polynomial powers, and we're going to dive into that a bit more. Let me remind you briefly what we were doing. I had my polynomial f, of degree d = 2g + 2, and we were looking at h = f^m, where m = (p − 1)/2. We wrote down a differential equation for h, namely f·∂h = m·∂f·h, where ∂ is the degree-preserving differential operator. And from that we got a recurrence for the coefficients of h: h_k = (1/(k f_0)) Σ_{j=1}^{d} (j/2 − k) f_j h_{k−j}, where 1/2 stands for (p + 1)/2 mod p. This expresses h_k as a linear combination of the previous d values, provided that k is not 0 modulo p. So I want to take a quick look at what happens when we try to use this recurrence to compute the matrix A_f. Remember, A_f is the g × g Hasse-Witt matrix we're trying to compute; its entries are given by h_{vp−u}. So what do we do? We start with h_0, which is very easy to compute: it's just the m-th power of the constant term f_0, and I can do that very quickly using repeated squaring again, also sometimes called binary powering. Then I compute h_1 using the recurrence. There was actually a question in the problem session yesterday: what about the h's with negative indices? What is h_{−1}? The answer is that all the h's with negative indices are 0. You can take that as a convention, but you can also check, when you prove the recurrence, that that's what they have to be. So h_1 is a linear combination of h_0, h_{−1}, h_{−2}, the previous d values. Then you get h_2, and you keep going all the way up to h_{p−1}. At this point we have the first row of A_f: it consists of the last g coefficients we just computed, the row v = 1. But now we have a problem: to get h_p, we need to divide by k = p, and we cannot do that mod p. (We were dividing by 1 and by 2 and by 3 and by 4 all along; it's only at k = p that it breaks.) And we really do need h_p, because for the second row we have to keep going all the way to h_{2p−1}. Now you might say: OK, I can't get h_p, but I could try k = p + 1, for example; that gives me some relation involving h_p. Does that tell me what I need? Well, you can play around with it, and it doesn't: there's always one degree of freedom that you haven't dealt with. So that doesn't work, and we're going to have to try something completely different, which is to start again from the beginning, but lift the whole problem, the whole computation, to working modulo a power of p instead of modulo p.
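To make the mod-p version concrete, here is a minimal Python sketch of what we just did; the function name and interface are mine, not from the notes. It assumes f is given as its coefficient list over F_p with f_0 ≠ 0, uses the recurrence in the denominator-free form k·f_0·h_k = Σ_{j=1}^{d} ((m+1)j − k) f_j h_{k−j} mod p, and stops exactly where the lecture stops: the next step, k = p, would require dividing by p.

```python
# Minimal sketch (hypothetical interface) of the mod-p recurrence.
# f = [f_0, ..., f_d] over F_p with f_0 != 0, d = 2g + 2, h = f^m,
# m = (p-1)/2.  Recurrence:
#   k * f_0 * h_k = sum_{j=1..d} ((m+1)*j - k) * f_j * h_{k-j}  (mod p),
# with m + 1 = (p+1)/2 playing the role of 1/2, and h_k = 0 for k < 0.

def first_row(f, p, g):
    d = len(f) - 1
    m = (p - 1) // 2
    inv_f0 = pow(f[0], -1, p)
    h = [pow(f[0], m, p)]             # h_0 = f_0^m, by repeated squaring
    for k in range(1, p):             # k = p would mean dividing by p: stuck
        s = 0
        for j in range(1, min(d, k) + 1):
            s += ((m + 1) * j - k) * f[j] * h[k - j]
        h.append(s * inv_f0 * pow(k, -1, p) % p)
    return [h[p - u] for u in range(1, g + 1)]   # row v = 1 of A_f
```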
So the first place I learned this idea from was the paper of Bostan, Gaudry, and Schost, which I've mentioned many times in these notes; it's a very important paper in this area. But I'm sure the idea is much older than that. I don't know how far back it goes, but obviously it's a much older thing. So we are going to lift to ℤ/p^μℤ for some μ, let's say at least 2. One of our jobs is going to be to figure out what μ we need, the smallest μ we can get away with, because larger μ means a more expensive computation, for several reasons. OK, so here's what I'm going to do. I choose a lift F = F_0 + F_1 x + ⋯ + F_d x^d in (ℤ/p^μℤ)[x], a polynomial over the integers mod p^μ. By a lift I mean that each F_j is congruent to the original f_j modulo p. Any lift is fine; you just choose it once and for all. The most obvious thing to do, if the little f_j's are represented as integers in the range 0 to p (a pretty reasonable way to represent them), is to take the F_j's to be literally the same integers, but now thought of modulo the higher power of p. So we lift like that, and then we put H = F^m, which is again a polynomial modulo p^μ. Note that H really has to be the m-th power of this particular choice of lift. You can't just pick some other lift of little h; I mean, you can do that, but then nothing will work. It has to be consistent with the choice of F. Now, the point is that it's enough to compute the target coefficients H_j modulo p, where by the target coefficients I mean the ones coming up in the matrix. We only need them modulo p, so we might compute them incorrectly modulo p² or p³ or whatever other powers; as long as we get them correct modulo p, we're happy. OK, so since H = F^m, we can go through some of the same setup as before: there's a differential equation and a recurrence satisfied by the coefficients of H. It satisfies the same differential equation, F·∂H = m·∂F·H, except that this is now a congruence modulo p^μ; everything here is happening modulo p^μ. And we can write down a recurrence using exactly the same method. This time I'm going to be slightly careful and not put the k in the denominator, so I'll write it as k H_k ≡ (1/F_0) Σ_{j=1}^{d} ((m+1)j − k) F_j H_{k−j} (mod p^μ). About that 1/F_0: I think I mentioned this last time, but remember we are assuming that little f_0 is not 0, and that tells us that F_0 is invertible, so we can divide by F_0, no problem. Now, the recurrence looks slightly different from the mod-p one, because I need to remember that p is no longer 0. Over there, that 'half' was actually (p + 1)/2; but now I'm working modulo a higher power of p, so I have to keep track of the p hidden inside it. That's why I've left the coefficient as (m+1)j − k, with m + 1 = (p + 1)/2. That hidden p is going to cause much pain when we get to section 7, but let's just write it like that for the moment. So again, remember, this is all modulo p^μ.
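If you want to convince yourself that the lifted recurrence really holds, here is a tiny sanity check you can run; the parameters and the polynomial are toy values I made up for illustration. It expands H = F^m directly and verifies the congruence for every k, including the k's divisible by p, where the recurrence holds but no longer determines H_k uniquely.

```python
# Toy verification of the lifted recurrence (all parameters invented).
# Expand H = F^m mod p^mu directly, then check
#   k * H_k = (1/F_0) * sum_{j=1..d} ((m+1)*j - k) * F_j * H_{k-j}  (mod p^mu)
# for every k, including k divisible by p.

p, mu, g = 7, 2, 1
d, m, pmu = 2 * g + 2, (p - 1) // 2, p ** mu
F = [3, 1, 4, 1, 5]                    # some lift, with F_0 invertible mod p

H = [1]                                # H = F^m by repeated multiplication
for _ in range(m):
    new = [0] * (len(H) + d)
    for i, hi in enumerate(H):
        for j, fj in enumerate(F):
            new[i + j] = (new[i + j] + hi * fj) % pmu
    H = new

inv_F0 = pow(F[0], -1, pmu)
for k in range(1, len(H)):
    rhs = sum(((m + 1) * j - k) * F[j] * H[k - j]
              for j in range(1, min(d, k) + 1))
    assert (k * H[k] - rhs * inv_F0) % pmu == 0
```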
Now, what we're going to do is try the same thing as before: start with H_0, and try to compute H_1, H_2, and so on using the recurrence. And occasionally we're going to have to divide by something nasty: whenever k is divisible by p, we have to divide by p, and that is going to introduce an error into our computed values of H. So here's how I'm going to set it up. We've got these coefficients H_0, H_1, H_2, and so on, and I'm going to define a sequence of approximations H̃_0, H̃_1, H̃_2 ('H-twiddle'). These are going to be the values that we actually compute, and they may be different from the H's; as long as we hit our target values modulo p, I'm happy. So let me show you how this works. We start with H̃_0, which is just H_0; we can calculate it exactly, because it's just F_0^m, so there's no error there, and we can do it very quickly by repeated squaring, except that now we're working modulo p^μ. Then we compute H̃_1, H̃_2, et cetera, inductively, by solving that recurrence over there with the H's replaced by H-twiddles: k H̃_k ≡ (1/F_0) Σ_{j=1}^{d} ((m+1)j − k) F_j H̃_{k−j}, all of this, of course, modulo p^μ. So we use the previously computed approximations in the recurrence to compute subsequent values. I hope that's clear. I want to show an example of what happens when we do this, because some pretty funky stuff happens. Let's try an example. I'm not actually going to compute any numbers, of course; that would be too hard, and you can use your computer for that. Let's take a genus 3 example, so my degree d = 2g + 2 is 8, and I'm going to take μ = 3, so I'm working mod p³; you'll see why in a moment. So let's see what happens. We start with H̃_0; that's correct modulo p³. Then we apply the recurrence, and you can see that I'm not dividing by p anywhere, so H̃_1 is still going to be correct modulo p³; what I mean is that it's congruent to H_1 modulo p³. You just plug it in: you divide by F_0, and there's no division by p; you have to divide by k, but k is just 1, so there's no problem. And you keep going, H̃_2 and so on, all the way up to H̃_{p−1}, and all of these are correct modulo p³. At this point, again, we have the first row of A_f: these last few coefficients give me the first row. Now I get to H̃_p, and this is the interesting point. When we solve for H̃_p, we have to divide by p. So what's going to happen is that our value is only going to be correct modulo p², and there's no hope of computing it modulo p³. There is not enough information there; we've really lost that information, or rather it was never there in the first place. So H̃_p is only correct modulo p². And we keep going. Next we look at H̃_{p+1}. For H̃_{p+1} we don't have to divide by p again, we're dividing by p + 1, but the problem is that H̃_p is one of the values in our linear combination, so it's going to muck up our value of H̃_{p+1}, which is therefore also only going to be correct modulo p². And we keep going, and the same thing happens all along: all these values are correct modulo p².
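Here is a compact sketch of the whole recurrence strategy with the division step made explicit; again, the function name and interface are mine and not from the notes, and the divisibility of the right-hand side by p^e (when p^e exactly divides k) is precisely the fact that has to be proved. A real implementation would keep only the last d values of H̃ instead of the whole list; more on the memory question below.

```python
# Sketch (my interface, not the notes' code) of the recurrence strategy
# modulo p^mu.  F = [F_0, ..., F_d] lifts f, d = 2g + 2, m = (p-1)/2.
# When k = p^e * k' with p not dividing k', the right-hand side is divisible
# by p^e (this is what the theory guarantees); we strip that factor and pick
# one solution of the congruence, and the ambiguity is the precision we lose.

def hasse_witt_matrix(F, p, mu, g):
    pmu = p ** mu
    d, m = len(F) - 1, (p - 1) // 2
    inv_F0 = pow(F[0], -1, pmu)
    H = [pow(F[0], m, pmu)]            # Htilde_0 = F_0^m, exact mod p^mu
    A = []
    for k in range(1, g * p):
        s = 0
        for j in range(1, min(d, k) + 1):
            s += ((m + 1) * j - k) * F[j] * H[k - j]
        s = s * inv_F0 % pmu
        e, kk = 0, k
        while kk % p == 0:             # factor k as p^e * kk
            kk //= p
            e += 1
        s //= p ** e                   # divisibility guaranteed by the theory
        H.append(s * pow(kk, -1, pmu) % pmu)
        if k % p == p - 1:             # row v of A_f ends at k = v*p - 1
            v = (k + 1) // p
            A.append([H[v * p - u] % p for u in range(1, g + 1)])
    return A                           # entries claimed correct only mod p
```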
Well, you might get lucky: some of them might be correct modulo p³, but probably not; they're probably correct exactly modulo p². And now we have the second row, because these values are correct modulo p², so they're also correct modulo p, which means we have our second row. I'll quickly write out the third row too, and the reason I'm doing this in such detail is that something genuinely weird happens. For the third row we get to H̃_{2p}, so we have to divide by p again, and now we're only correct modulo p. We keep going, and we get to H̃_{3p−1}; these are all correct modulo p, and here we get the last row, the third row. My genus was 3, and that's why I chose p³: so that all three rows would come out correct modulo p. And then, of course, we can stop; we don't need any more coefficients. If we went on past H̃_{3p}, dividing by p yet again, the values wouldn't be correct at all; they would look like random garbage. If you generalize this argument, it looks like we need to take μ to be v_p((gp − 1)!) + 1, where v_p counts the number of powers of p. The factorial (gp − 1)! collects all the denominators that show up, since we divided by all the integers from 1 up to gp − 1, and then you need to add 1, because we need the result to be correct modulo p. OK? Now, the incredible thing, which those of you who've read ahead in the notes will have spoiled for yourselves (but I'm sure no one read ahead; you're not that kind of student; it really warmed my heart the other day when I asked whether anyone wanted me to keep going for five minutes and everyone actually responded; normally my students wouldn't be answering, because they'd already be asleep), the amazing thing is that you can actually do better than this. And it's very, very surprising, because this argument looks tight: there was really no room to do anything. But corollary 5.3.1 says that you can actually take μ smaller than this, namely μ = ⌊log_p(gp − 1)⌋ + 1. The valuation bound gives μ = 3 in our example, but you can check that this formula gives 2. So it's completely crazy. Look at what it's saying: you can start mod p²; these middle coefficients are then only going to be correct modulo p; then you keep going, and you divide by p again, and I'm claiming it doesn't matter, these will still be correct modulo p. How can this happen? It looks impossible. Well, there's a proof in the notes; I'm not going to go through it in the lectures now. It has to do with the fact that at every point these coefficients satisfy the recurrence by design, which means they satisfy the differential equation (which I seem to have erased): the H-twiddles satisfy the same differential equation as the H's, and magically that is enough to sort of resurrect this mod-p information. I wish I had more time; I could go into that. But this is actually one of the most annoying things in all these sorts of algorithms, because you actually have to work a bit to prove these statements. In one variable it's OK. In two variables it starts getting very difficult, and I don't completely understand what's going on in two variables. But it's a very interesting phenomenon.
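The two bounds are easy to compare numerically; here is a quick check, where the Legendre-formula helper and the example values g = 3, p = 101 are mine.

```python
# Comparing the two precision bounds.  v_p(n!) is computed by Legendre's
# formula, and floor(log_p(g*p - 1)) + 1 is just the number of base-p digits
# of g*p - 1.  The example parameters g = 3, p = 101 are invented.

def vp_factorial(n, p):               # Legendre: v_p(n!) = sum_i n // p^i
    v = 0
    while n:
        n //= p
        v += n
    return v

def mu_valuation_bound(g, p):         # the bound that looks tight
    return vp_factorial(g * p - 1, p) + 1

def mu_corollary_bound(g, p):         # floor(log_p(g*p - 1)) + 1
    mu, t = 0, g * p - 1
    while t:
        t //= p
        mu += 1
    return mu

assert mu_valuation_bound(3, 101) == 3
assert mu_corollary_bound(3, 101) == 2   # the surprising saving
```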
So it helps us, in the sense that we need less precision than we expect in these sorts of algorithms. But it also hurts us, because it hurts our brains, basically. And I think there's a lot that's not understood about this sort of phenomenon yet, especially in a larger number of variables. I really encourage you to try this on the machine. Just try it; it's kind of incredible. It's like, where does it come from? OK. So let me just quickly say what complexity bound we get: the complexity of what I'll call the recurrence strategy for computing A_f. By the recurrence strategy I mean doing what I've just done here: take your f, lift it to this capital F, and start computing the sequence of H-twiddles until you have all the rows of A_f. It's not too hard to work out, because you just need to count the number of operations needed to compute all these coefficients, and it turns out to be O(g² p) operations. Here's what's happening: the number of coefficients you're trying to compute is about gp, and the extra factor of g comes from evaluating all the terms of the sum, which has O(g) terms. And I'm using the fact that we're actually working modulo p², not modulo some higher power p^μ; for that I need to assume something like p ≥ g, and then the μ from the corollary is always just 2, so we're working modulo p² everywhere, OK? The cost of an operation modulo p², as before, is (log p)^{1+ε}, so altogether O(g² p (log p)^{1+ε}). If we compare this to the expansion method, where you just compute the whole polynomial, then I believe we previously had something like O(g p (log p)²); I think it was actually a little more than that, but it simplifies to this if you assume p ≥ g. If you compare these two, you see that the dependence on p is still pretty much linear, which is not surprising: you still compute all the coefficients. The dependence on g has gone up; you can actually get rid of that if you work a little bit harder. I'll leave that; I think I had it as a problem at some point and then removed it, but you can think about how to improve that power of g if you like. And the dependence on p has actually gotten slightly better. The comparison is actually a bit misleading, though: in practice this algorithm is much, much faster than the expansion one, and the main reason is that it uses almost no memory. With this algorithm you only have to keep track of the previous d coefficients: you keep just those d coefficients, compute the new one, and throw the oldest one away. So you're operating on very little memory, whereas taking a large power of a polynomial uses a huge amount of memory, and that makes an enormous difference; memory is very slow. Drew Sutherland has lots of data on this. I'm not sure it's for exactly this algorithm, maybe not, but we have a paper on a similar algorithm, not for hyperelliptic curves but for plane quartics, which is going to be at ANTS next week, and he's got data showing that this kind of algorithm is much, much faster. Actually, he doesn't even compute the powers for comparison, because it's so hopelessly slow. OK. Yeah.
Sorry, I'm not sure I understand the question. Well, you have to work modulo p² everywhere, so you can't represent things as elements of F_p anymore; you have to represent them as integers modulo p², and they take twice as much space. Is that what you're asking? Yeah, you do. OK, I should have said more about this; you're right. So the question was: how do you do the division by p? The answer is that whenever you need to solve this congruence, you just solve it; you write down any solution. Suppose k = p, for example. What you'll find is that the right-hand side is divisible by p, so you divide the right-hand side by p, and then you're basically done. Whenever there's some indeterminacy, that is, whenever k is divisible by p, there is not going to be a unique solution, and it doesn't matter: you just choose whatever solution you like, which you can do, and then you continue. There'll be some error, because you didn't know which was the right solution, and what's proved in the notes is that it doesn't matter which solution you pick; everything will work out. Thanks, good question. Any other questions before I go into section 6? Yes. Right, and in fact what will happen is that this first stretch here will be correct modulo p², and after that everything will be correct modulo p, even though you've divided by p lots of times. It's very, very strange. As I said, it's one of these things that's hard to believe, and you don't really believe it until you try it on the computer. That's what he said. I mean, I had to try it on the computer before I came here; I wanted to make sure. OK. All right, section 6. Oh, there's a question up there. You mean these powers here, the big-O constants? Yeah, so I have no idea. I suspect that the constants are smaller for this algorithm. See, the problem is there's this tension between what is the theoretical answer and what is the answer in practice. I just know from a lot of experience that in practice this is much, much faster. The other problem is that, the way I've written it, this (log p)^{1+ε} is not very realistic. What will typically happen is that you're dealing with operations modulo p², and typically that will fit into one machine word, or maybe two, so the operations mod p or mod p² are all being done in hardware. It'll just be some constant; it doesn't really depend on how big p is. So that part is a theoretical bound and not very meaningful. The p part is very meaningful: it really will scale linearly like p, because that's the number of operations you need to do. For the expansion method it's much more complicated, because there you're doing polynomial arithmetic on enormous polynomials, and roughly what's happening is that one factor of log p is coming from the size of the coefficients and the other log p is coming from the fast Fourier transforms. I think you really will see both of those factors of log p; I bet you could probably measure it and roughly see them in the data for large enough p, maybe. I'm not sure if I've really answered your question. OK. Let's move on to section 6, and this is where we get to some really fun stuff: the square root algorithm, or square-root-time algorithm. Everything we've seen so far is linear in p, and now we're going to get this down to square root of p.
And I'm going to first of all do a bit of a warm-up just to get the ideas, and that's going to be Wilson primes. This has nothing to do with point counting, but it's another fun problem that I've worked on before and still think about occasionally. So what is a Wilson prime? We all know Wilson's theorem: (p − 1)! ≡ −1 (mod p). A Wilson prime is one for which this congruence holds modulo p². Here are the Wilson primes we know about: 5, because 4! = 24, which is 1 less than 25; 13, which was discovered when people were still doing these things by hand (it takes a little while to do it by hand; good exercise); and 563, which I believe was discovered the first time anyone threw a computer at this problem. And there are no others less than 2 × 10^13. That was a computation that I did with Edgar Costa, who might be here, and Robert Gerbicz, about 10 years ago now. Oh, where's my eraser? Here it is. So our goal is: given a prime p, compute (p − 1)! modulo p². If it's −1, we have a Wilson prime, and otherwise we don't. I don't know of any way of finding Wilson primes apart from just testing every prime in some range. You can sort of work out the heuristics for what you expect to happen, and everyone expects that there are infinitely many and that they're extremely sparsely distributed. And now we've gotten out so far that it's getting pretty hopeless to think that we'll find any more. But you never know. So my theory is: if I look for Wilson primes, there's a very small chance of finding one; but if I also look for some other kind of prime, well, there's also a small chance of that; and if I do enough different types of primes, then maybe I'll find one interesting one. If I just come up with enough stupid properties, then of course I'll find a large prime having one of these properties. The naive algorithm for computing this (p − 1)! is: you start with 1, then you multiply by 2 and reduce modulo p², then you multiply by 3 and reduce modulo p², and you keep doing this. It's a hopeless algorithm, but it's pretty good when you've got GPUs doing lots and lots of primes all at the same time, and some people have tried doing it that way. So that's a linear-time algorithm: you've got to do roughly p multiplications, so it still takes a while when p is large. I want to show you Strassen's idea for getting this down to square root of p. So here's Strassen's idea. Let me write s = p − 1; this is the number of terms we need to multiply together, so (p − 1)! is now 1 · 2 · 3 ⋯ s. What we're going to do is split this product into about √s blocks of size √s. Let t = ⌊√s⌋; that's about √s, which means I can write s = t² + t′ for some leftover t′, and you can check that t′ is not too big: it's actually O(√s), and I think I worked out a precise bound in the notes. OK, and then I'm going to split up this product into t blocks of t terms, plus t′ leftover terms. So (p − 1)! is going to be 1 · 2 ⋯ t, that's my first block; then (t + 1) ⋯ 2t, that's my second block; and I keep going.
And then my last block is going to be, let me check I get this right, ((t − 1)t + 1) ⋯ t², sorry about all the brackets. So that's block number 0, block number 1, up to block number t − 1, which takes us up to t². And then I have the leftover terms, (t² + 1) ⋯ (t² + t′). So apart from the leftovers, this is t blocks of t terms. Now, Strassen's idea is to introduce a polynomial, which I'm going to write as Q(k) = (k + 1)(k + 2) ⋯ (k + t). And Strassen knew a thing or two about polynomials, and he realized there's a way of rewriting this product so that you could compute it much more quickly using some techniques of fast polynomial arithmetic. You see, the first block is actually Q(0), and the next block is Q(t), and so on up to the last block, which is Q((t − 1)t). So it's Q at all the multiples of t; there are t multiples of t here. And then there's the other stuff, the leftovers. OK, so Strassen's idea is: we're actually going to compute this polynomial Q(k), and by compute it I mean work out its coefficients, working modulo p², because everything here is going to be mod p². And then we're going to evaluate that polynomial at these t points. There are fast algorithms for both of these things, and I'm going to explain briefly how they work. So here's the algorithm. The first step is to compute Q(k) as a polynomial with coefficients modulo p². Now, those of you who are not computing people will say, 'but I already have the polynomial Q(k); here it is.' The point is that I need to actually expand it out, because I need the actual coefficients as the input to the next part of the algorithm. So let's compute this Q(k). I'm going to use something called a product tree. Product trees are discussed in general in section 2 of the notes, but I'm just going to show you what it looks like here. And just to keep my life simple, let's assume t is a power of 2, just for this picture. You don't really need to assume that; it works perfectly well without it. (This is actually the difference between Drew Sutherland and me: in my code, t doesn't have to be a power of 2, and in his code, t is always a power of 2. So my code is more painful, but more general, and we sometimes have fights about this. But anyway.) So how does this work? I'm going to write down all these linear factors at the top: k + 1, k + 2, k + 3, k + 4, all the way up to k + t. And then I'm just going to multiply them together in pairs. So I multiply together k + 1 and k + 2, and I actually have to do the multiplication: I get k² + 3k + 2. I'll do one more: (k + 3)(k + 4) is k² plus, oh dear, 7k + 12, et cetera; some quadratic each time. And I keep going: I multiply these two together, getting k⁴ plus something plus ⋯ plus 24, and so on, and all these things keep going down to the bottom of the tree. At the bottom of the tree I have, well, what do I have? I have this whole product, which I can't really write out expanded, but you know what I mean: the actual expanded product. So this is a very simple example of a product tree.
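In code, the product tree is only a few lines. Here's a toy Python version over ℤ, my own sketch: in the real algorithm every coefficient is reduced mod p² and the multiplications use fast, FFT-based arithmetic. The asserts reproduce the t = 4 products from the board.

```python
# Toy product tree for expanding Q(k) = (k+1)(k+2)...(k+t).  Polynomials are
# coefficient lists, lowest degree first.  Schoolbook multiplication only;
# the real thing reduces mod p^2 and uses fast multiplication.

def poly_mul(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def product_tree(leaves):
    levels = [leaves]                  # leaves first, root level last
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([poly_mul(prev[i], prev[i + 1])
                       for i in range(0, len(prev) - 1, 2)] +
                      ([prev[-1]] if len(prev) % 2 else []))
    return levels

tree = product_tree([[j, 1] for j in range(1, 5)])  # k+1, k+2, k+3, k+4
assert tree[1] == [[2, 3, 1], [12, 7, 1]]           # k^2+3k+2, k^2+7k+12
assert tree[2] == [[24, 50, 35, 10, 1]]             # the root: Q(k) for t = 4
```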
Back to the picture: these ones up here are called the leaves, and this is called the root, the root node. I think in my notes the trees are upside down, actually; I just realized this. I mean, the root should be at the bottom, right? I guess this is an Australian tree; it's the other way around. In my notes the root is always at the top. I just used some package, and it puts the root at the top, so all these American trees are upside down. It's very strange. OK. Now, the idea is that, of course, we're going to use fast polynomial multiplication to compute all these products. Yeah, question? No, it doesn't: you could put the linear factors in any order. That doesn't matter in this particular case, because the only thing I really care about here is this polynomial at the bottom. But in section 7 we are going to have product trees coming back with a vengeance, and there it very much matters which order you put the nodes in. So you'll see; hopefully we'll get to that in section 7 on Friday. OK, no more questions. Now, I'm just going to briefly go over the complexity analysis. At the last level, the one producing the root, we had two big polynomials of degree roughly t/2 that we had to multiply together: two ginormous polynomials. At the level before, we multiplied two pairs of polynomials that were half the size. But at each level of the tree, the total amount of polynomial is the same: the sum of all the degrees at each level is the same, namely the t we started with; each level has half as many polynomials of twice the degree, and so on. So it turns out that when you estimate the cost, it's actually the same at each level, up to a constant maybe. (In practice, what you find is that the level at the root is the most expensive, and it gets cheaper as you go toward the smaller polynomials.) So the cost at each level is going to be, let me work it out: O of t, the total degree, times the coefficient size, which is log p, times another log p from the fast multiplication. Since t is about √p, that's O(p^{1/2} log² p) per level, and the number of levels of the tree is log t, which is O(log p). So the total cost is O(p^{1/2} log³ p), and the key point here is that it's basically √p. This analysis is done in a bit more detail in the notes, and since I've only got 10 minutes, I don't want to go into it any further. OK, so now I have to explain how we evaluate Q at these t points, and the idea is to use a technique called multipoint evaluation. So step two is: evaluate Q(k) at k = 0, t, 2t, up to (t − 1)t. We have a polynomial of degree roughly t, in fact exactly t, which we're evaluating at exactly t points, and this is exactly what multipoint evaluation does. You see, if you evaluated separately at each point, you would get a bound quadratic in t: first you evaluate at t, which takes t operations; then you evaluate at 2t, which takes another t operations; and so on. Multipoint evaluation lets you do them all at the same time, much faster. So: use multipoint evaluation, and let me explain briefly how that works. The first thing you do is compute another product tree.
So suppose we want to evaluate at points α_1 up to α_t, in whatever ring we're working in; here we're working in ℤ/p²ℤ, and in our example these would be those t multiples of t. What we do first is write down a product tree whose leaves are the linear polynomials k − α_i, the ones having the α_i's as roots, and we keep going down the tree until at the bottom we have the product of all of these factors. So this is our second product tree, and its cost is exactly the same as in the analysis over there: O(p^{1/2} log³ p). So first we compute that tree. Then, and this is really the key part, we're going to use this tree to figure out Q(k) modulo all those polynomials. This is done with what's called a remainder tree. I'm going to start at the top; now my tree really will be upside down, because I always want to start at the top, so this tree is like hanging from the ceiling. First of all I have Q(k) modulo (k − α_1) ⋯ (k − α_t). I don't really have to do anything here, because Q has degree t and this modulus has degree t too; I suppose I could do one little reduction to get rid of the degree-t term, but it's not a big deal. And then what I do is split, like this: on one branch I figure out Q(k) modulo just the first half of the factors, (k − α_1) ⋯ (k − α_{t/2}), and on the other branch Q(k) modulo the rest of the terms, (k − α_{t/2+1}) ⋯ (k − α_t). The way I compute these is I just divide by those polynomials and take the remainders, and I know those polynomials, because they are the two nodes near the bottom of the product tree over here: the two polynomials that were multiplied together to form its root node. And then I just keep splitting like this. These are both half the degree, so next I have four polynomials of a quarter the degree, and I go all the way to the bottom. At the bottom of the tree I have Q(k) modulo k − α_1, and reducing modulo a linear polynomial just gives the value of the polynomial at that point, so this is exactly Q(α_1). Similarly with all the others: I get Q(α_2) and so on, all the way up to Q(α_t). I know that if you haven't seen this before, it's a lot to absorb, and I really think the best way to understand this algorithm is to go and implement it. It's not hard to do, as long as you have a computer algebra system that can do all this polynomial arithmetic for you; as long as you can multiply and divide polynomials, it's pretty straightforward to implement. Now, this step is also O(p^{1/2} log³ p), and the reason is that at each stage we're doing a polynomial division, and you can prove (I think I mentioned this before) that division is big O of multiplication. OK. So this is how you compute the main parts of that product. Then you have to multiply all these results together; so, I guess, step three is: multiply together all of these values Q(it) at the multiples of t, and the leftover terms. And remember there are only something like √p leftover terms, so that's also √p. So altogether you get √p. Conclusion: you can compute (p − 1)! modulo p² in time O(p^{1/2} log³ p). Now, I don't really have time to get back to the point counting, which is really what we want to do with this, so I'll do that on Friday.
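Since the best way to understand this really is to implement it, here is a complete toy version in Python, all my own naming. It is correct but uses schoolbook polynomial multiplication and division, so it does not achieve the O(p^{1/2} log³ p) bound; swapping in FFT-based multiplication (and the fast division that comes with it) is what the complexity analysis assumes.

```python
# Toy end-to-end version of Strassen's algorithm for (p-1)! mod p^2.
# Polynomials are coefficient lists, lowest degree first, reduced mod n = p^2.
from math import isqrt

def poly_mul(a, b, n):
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % n
    return c

def poly_rem(a, b, n):                   # remainder of a modulo monic b
    a = a[:]
    while len(a) >= len(b):
        q = a[-1]
        off = len(a) - len(b)
        for i in range(len(b)):
            a[off + i] = (a[off + i] - q * b[i]) % n
        a.pop()                          # leading term is now zero
    return a

def product_tree(leaves, n):             # levels[0] = leaves, levels[-1] = root
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([poly_mul(prev[i], prev[i + 1], n)
                       for i in range(0, len(prev) - 1, 2)] +
                      ([prev[-1]] if len(prev) % 2 else []))
    return levels

def multipoint_eval(f, points, n):       # remainder tree down the product tree
    levels = product_tree([[(-x) % n, 1] for x in points], n)
    rems = [poly_rem(f, levels[-1][0], n)]
    for lvl in reversed(levels[:-1]):    # node i descends from parent i // 2
        rems = [poly_rem(rems[i // 2], node, n) for i, node in enumerate(lvl)]
    return [r[0] for r in rems]          # residue mod (k - alpha) = value there

def wilson_remainder(p):                 # (p-1)! mod p^2 in ~sqrt(p) blocks
    n, s = p * p, p - 1
    t = isqrt(s)                         # s = t^2 + t', with t' = O(sqrt(s))
    Q = product_tree([[j, 1] for j in range(1, t + 1)], n)[-1][0]
    w = 1
    for v in multipoint_eval(Q, [i * t for i in range(t)], n):
        w = w * v % n                    # the blocks Q(0), Q(t), ..., Q((t-1)t)
    for j in range(t * t + 1, s + 1):    # the leftover terms
        w = w * j % n
    return w

assert wilson_remainder(5) == 24         # 4! = 24 = 5^2 - 1: Wilson prime
assert wilson_remainder(13) == 13**2 - 1 # 13 is a Wilson prime
assert wilson_remainder(11) != 11**2 - 1 # 11 is not
```

Note that the remainder tree walks back down the same levels the product tree built, which is why product_tree returns all of its levels rather than just the root.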
But I might just make a few more comments about this; I've got a minute or two left. So firstly, the most important improvement on this algorithm, if you're interested, is that you can get this exponent 3 on the log down to 2. There's an algorithm, I mentioned this before, of Bostan, Gaudry, and Schost. It's a very nice algorithm with a similar idea, but you never actually compute this polynomial Q; you don't actually expand it out. What you do instead is sort of compute values of some of these sub-products at certain points. It's quite clever, and it saves a factor of log p. And that is a big deal, right? A factor of log p: your p might be in the billions, or trillions even, so that can be a factor of 30 or 40 or something like that. A huge, huge difference. Now, we actually used the algorithm I just described in that big computation of Wilson primes up to 2 × 10^13: we didn't use it for the main part of the computation, but we used it as a check. After we'd computed all these remainders, we picked some little sample of them and verified them individually using this sort of algorithm. This is pretty much the fastest way I know to compute these Wilson remainders. Anything else I want to say? This algorithm uses a lot of memory, which is a problem; you really have to keep track of all these polynomials, so that hurts. Yeah, I think that's all I'm going to say today. So thank you very much. Any questions?