Okay, there are lots of different ways to think about this and different ways to present it. I think this is the third iteration of presenting this material, and each time I've taken a different approach. I think this one is the easiest to get your head around, but other people may think differently. So if you want a different presentation, you can go look at any of my talks or any of David Harvey's talks; you can probably find two or three different ways of expressing the same idea.

But the key point is that the algorithm is very simple. You're building two trees, and then you're taking a vector and reducing it down. You can think of it as reducing the vector down the tree, but taking slightly different actions at the left and the right child. That's what makes it different from the multipoint evaluation algorithm we talked about at the beginning, which was completely uniform: it did the same thing everywhere.

All right, so how long does this take? That's the $64,000 question. Our initial matrices at the leaves, the M_k's, are all three-by-three matrices with integer entries. We can ignore the coefficients of f: maybe the coefficients of f are huge, but we're going to express our complexity bounds in terms of n, and as n marches off to infinity, f stays right where it is. So in a complexity expression in terms of n, all the coefficients of f are O(1). What is increasing in these matrices is k, and k runs up to n, so k is getting bigger. So each of the matrix entries has O(log n) bits, and so do the moduli, because we're going to be reducing modulo the m_k's. So the total size of all the matrices we have at the bottom of our initial tree is n log n bits: we have n matrices, the entries have log n bits, and the matrices are of fixed size, three by three, so that's just a constant factor.
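To make the shape of this concrete, here is a minimal Python sketch of the two product trees and the asymmetric descent. To be clear, this is my own toy illustration, not the actual implementation: the function names, the 1×3 row-vector convention, and the demo matrices and moduli are all made up, and it computes V · M_1 ··· M_{k−1} mod m_k for each k, which is the general shape of the computation rather than the specific matrices and moduli the Hasse invariant algorithm feeds into it.

```python
# Toy sketch: given matrices M_1..M_n and moduli m_1..m_n, compute
# V * M_1 * ... * M_{k-1} mod m_k for every k, using a product tree of
# matrices, a product tree of moduli, and a reduction down the tree that
# acts differently at the left and right child.

def mat_mul(A, B):
    """Product of two 3x3 integer matrices (entries grow; fine for a sketch)."""
    return [[sum(A[i][t] * B[t][j] for t in range(3)) for j in range(3)]
            for i in range(3)]

def vec_mat(v, A, m):
    """Row vector times 3x3 matrix, entries reduced mod m."""
    return [sum(v[t] * A[t][j] for t in range(3)) % m for j in range(3)]

def product_tree(items, mul):
    """Bottom-up product tree; levels[0] is the leaves, levels[-1] the root."""
    levels = [list(items)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([mul(prev[i], prev[i + 1]) if i + 1 < len(prev)
                       else prev[i] for i in range(0, len(prev), 2)])
    return levels

def remainder_tree(V, mats, mods):
    """Return [V * M_1 * ... * M_{k-1} mod m_k] for k = 1..n."""
    mod_tree = product_tree(mods, lambda a, b: a * b)
    mat_tree = product_tree(mats, mat_mul)
    vals = [[x % mod_tree[-1][0] for x in V]]   # V reduced at the root
    for level in range(len(mod_tree) - 2, -1, -1):
        nxt = []
        for i, v in enumerate(vals):
            left, right = 2 * i, 2 * i + 1
            # Left child: just reduce mod the left subtree's modulus.
            nxt.append([x % mod_tree[level][left] for x in v])
            # Right child: also multiply in the product of all the
            # matrices in the left subtree -- the asymmetric step.
            if right < len(mod_tree[level]):
                nxt.append(vec_mat(v, mat_tree[level][left],
                                   mod_tree[level][right]))
        vals = nxt
    return vals

def naive(V, mats, mods):
    """Quadratic-time reference implementation of the same map."""
    out = []
    for k, m in enumerate(mods):
        v = [x % m for x in V]
        for A in mats[:k]:
            v = vec_mat(v, A, m)
        out.append(v)
    return out

# Tiny demo with made-up matrices and moduli:
V = [1, 0, 0]
mats = [[[k, 1, 0], [0, k, 1], [0, 0, k]] for k in range(1, 6)]
mods = [2, 3, 5, 7, 11]
assert remainder_tree(V, mats, mods) == naive(V, mats, mods)
```

The asymmetry lives entirely in the two `nxt.append` calls: going left we only reduce, going right we also fold in the left sibling's matrix product, which is exactly the non-uniformity that distinguishes this from plain multipoint evaluation.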
So it's all n log n bits in total. To multiply all those pairs up, multiplication has complexity M(n log n), so we're doomed to gain another log factor: we're plugging an n log n into an n log n, and we end up with n log² n. And that's just to get one layer up in the tree. To do it all the way to the top, there are log n layers in the tree, so we're going to spend O(n log³ n) time just to compute the product trees. But the good news is that by the time we've done that, we're halfway done, because the time spent reducing down the trees is no worse than the time spent computing the product trees.

So this gives a complexity bound of O(n log³ n) time using O(n log² n) space, which is quasi-linear in n. Which means that if we think about how much time we're spending per prime — there are about n / log n primes p up to n — then if we average the total time across all the primes p up to n, we're only spending something like log⁴ p time on each prime. That's dramatically faster than any of the algorithms we've seen so far, at least asymptotically. But of course you can't just pick your favorite prime; it's a bulk purchase. You're buying the Hasse invariants for all the primes up to n, and you can't just pick your favorite one. But when you do that, you get a very efficient algorithm.

And this is not so different from the factorization sieve that you implemented on the first day, which gives you an average polynomial time — in fact an average quasi-linear time — algorithm for factoring integers. And we don't even know a randomized probabilistic algorithm that can factor integers in quasi-polynomial time; all the best known algorithms are subexponential. Yet on your first day here, in probably no more than 10 lines of code, you wrote an average quasi-linear time algorithm for factoring integers. Of course, you have to factor all the integers up to a given bound; you can't pick your favorite integer and factor it, so it has limited use.

And this is somewhat true here with this average polynomial time algorithm for elliptic curves. In practice we don't use this algorithm, because the practical range of n — the biggest we could feasibly make n is maybe, I don't know, 2^50 to 2^60; at some point we're just going to run out of memory, or run out of computers on the earth to run this algorithm on — in that range, running Mestre's algorithm on each individual elliptic curve would still be faster. And you might say: how could that possibly be true? Just to do a rough back-of-the-envelope sketch, suppose we have a prime p on the order of 2^64. The complexity of Mestre's algorithm, ignoring the log factors, is something like p^(1/4), and 2^64 to the 1/4 power is 2^16. Even if we throw in the log factors — I guess there was a log squared in there — that would multiply it by about 36, so it'd be something like 36 · 2^16. This algorithm would take log⁴ p: if p is 2^64, then log p is 64, which is 2^6, and raising that to the fourth power gives 2^24. And 2^24 is bigger than 36 · 2^16 — and that's assuming the constant factors are equal, when in fact the constant factors favor Mestre's algorithm.
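Just to sanity-check that back-of-the-envelope arithmetic, here is a throwaway snippet; the 36 is the lecture's rough fudge factor for Mestre's log factors, not a precise constant.

```python
from math import log2

p = 2 ** 64
mestre = 36 * p ** 0.25    # ~ 36 * 2^16 = 2359296
per_prime = log2(p) ** 4   # 64^4 = 2^24 = 16777216
print(mestre, per_prime)   # Mestre's estimate is roughly 7x smaller
```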
So in practice, we don't use this algorithm for elliptic curves, even for computing L-functions of elliptic curves with very large conductor. But as soon as you go up one genus, at genus 2, this algorithm already wins when p is around 2^16, and in genus 3 it wins essentially from the get-go: you hardly have to do any point counts before you'd prefer this algorithm, in a situation where you want to count points modulo all the primes up to a given bound N.

And I'll just close by mentioning that you can improve the space: you can gain a log² N factor in the space complexity, and that's quite important in practice, because what limits this algorithm is that you run out of memory, not time. You do that by implementing what's known as a remainder forest (there's a rough sketch of the idea at the end). Remember the matrix V that I mentioned was coming into our product tree — I said we could optionally multiply that matrix V into all the matrices in the product tree. The reason you might want to do that is that in a remainder forest, rather than building one big tree, you actually build something like log² N of them, and each one needs to receive, from the preceding tree, information that encodes V times the product of all the matrices to the left. You do that by taking your vector V and multiplying it by the matrix at the top of the tree, and that's your input V to the next tree in the chain. So you have this V jumping along the treetops, across your log² N trees. This doesn't change the time complexity — it improves the constant factors a little — but it improves the space complexity by a log² N factor.

I'm realizing I'm close to out of time, so let me just wrap up by presenting the algorithm, and maybe what I'll do is invite you to go take a look at the implementation on your own. It's not complicated: the implementation of this average polynomial time algorithm is really about the same length as the O(p^(1/2))-time algorithm using polynomial evaluation and interpolation. And I've given you a full implementation of not just a single remainder tree — it actually does implement the remainder forest. And you'll see that the practical performance is actually quite good. So maybe what I'll do is just scroll down to the end here, where you can see I took some timings.

Using the remainder forest algorithm, it took about 80 seconds to count points on the elliptic curve 11a1 mod p for every prime p (except 11) up to 2^20. I did the same thing using Mestre's algorithm, and it took about 34 seconds, so that was still ahead. And if I used the built-in implementation of Mestre's algorithm that's available in Magma and also in Sage, it would be even smaller than 34 seconds — probably something more like 10 seconds. So still not winning, but I was actually impressed that this implementation came that close. It's not inconceivable to me that if you pushed it a bit farther, it would get even closer.

All right, I'd better stop there.
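Since the remainder forest came up, here is a rough sketch of the treetop-jumping idea, reusing `mat_mul`, `vec_mat`, `product_tree`, and `remainder_tree` from the sketch earlier. Again, this is my own simplification, not the actual implementation: the chunking scheme and the choice to reduce V modulo the product of the remaining moduli are illustrative stand-ins for the carefully tuned version in the real code.

```python
def remainder_forest(V, mats, mods, n_trees):
    """Same outputs as remainder_tree, but built as n_trees smaller trees,
    so peak memory is governed by one chunk rather than one big tree."""
    out = []
    chunk = (len(mats) + n_trees - 1) // n_trees
    for i in range(0, len(mats), chunk):
        Ms, ms = mats[i:i + chunk], mods[i:i + chunk]
        out.extend(remainder_tree(V, Ms, ms))
        # Jump along the treetops: fold this chunk's full matrix product
        # into V, reduced modulo the product of the moduli still to come,
        # so V stays small without losing anything the later trees need.
        rest = 1
        for m in mods[i + chunk:]:
            rest *= m
        if rest > 1:
            top = product_tree(Ms, mat_mul)[-1][0]
            V = vec_mat(V, top, rest)
    return out

# Agrees with the single big tree on the demo data from above:
assert remainder_forest(V, mats, mods, 2) == remainder_tree(V, mats, mods)
```

The time complexity is unchanged; the point is that each tree only ever holds one chunk's worth of products in memory, which is where the log² N space saving in the real version comes from.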