Thanks very much to the organizers for inviting me. It's a great pleasure and honor to be here. Everything I talk about today is joint work with Takashi Taniguchi.

The main object of study for today's talk is the family of all elliptic curves. Let me tell you what I mean by that. Every elliptic curve over Q can be written uniquely in the form y^2 = x^3 + ax + b, where a and b are integers. To make sure that this way of writing elliptic curves is unique, you also have to assume that p^4 | a implies p^6 does not divide b, for all primes p. We'll call this elliptic curve E_{a,b}, and we take our family of all elliptic curves to be the set of all E_{a,b}, where a and b satisfy the above conditions; additionally, the cubic polynomial needs to have distinct roots, so we also need 4a^3 + 27b^2 ≠ 0.

Now, every elliptic curve has a rank associated to it, and of course the study of ranks of elliptic curves is a very deep and important topic. But today we're going to study what happens to the rank, and to other arithmetic invariants of elliptic curves, as we vary over elliptic curves in this family. The main conjecture motivating this topic in arithmetic statistics is due to the works of Goldfeld and Katz-Sarnak. The conjecture says that 50% of elliptic curves in our family have rank zero, 50% have rank one, and the average rank of elliptic curves in our family is 1/2; which is to say that the zero percent of elliptic curves whose rank is not zero or one don't actually affect the average very much.

Now, whenever I make a statistical statement like this, such as "50% have rank zero" or "the average rank is 1/2", I need to order my elliptic curves in a certain way, because this is an infinite set of elliptic curves. What we would like to do, of course, is order elliptic curves by conductor or by discriminant, but that turns out to be a little bit too difficult. So instead we will order elliptic curves by height, where the height of E_{a,b} is the maximum of 4|a|^3 and 27b^2. The 4 and the 27 here are not important; they are there so that the height mimics the discriminant in certain ways. And we do expect the conjecture to be true under almost any natural ordering: by conductor, by discriminant, or by height. Today we're going to order elliptic curves by height.
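To make the setup concrete, here is a minimal sketch (my own illustration, not code from the talk) that enumerates the family up to a given height. It uses sympy's nextprime for the minimality check, and the coefficient cutoffs come directly from 4|a|^3 <= X and 27b^2 <= X.

```python
# Enumerate the family {E_{a,b} : y^2 = x^3 + a x + b} of height at most X.
# Minimality: no prime p with p^4 | a and p^6 | b; nondegeneracy: 4a^3 + 27b^2 != 0.
from sympy import nextprime

def is_minimal(a, b):
    # Any violating prime p satisfies p**4 <= |a| (or a == 0, in which case
    # p**6 <= |b|); the pair (0, 0) is excluded by the discriminant condition.
    bound = abs(a) if a != 0 else abs(b)
    p = 2
    while p ** 4 <= bound:
        if (a == 0 or a % p ** 4 == 0) and (b == 0 or b % p ** 6 == 0):
            return False
        p = nextprime(p)
    return True

def curves_up_to_height(X):
    A = int((X / 4) ** (1 / 3))   # from 4|a|^3 <= X
    B = int((X / 27) ** 0.5)      # from 27 b^2 <= X
    for a in range(-A, A + 1):
        for b in range(-B, B + 1):
            if 4 * a ** 3 + 27 * b ** 2 != 0 and is_minimal(a, b):
                yield (a, b)

print(sum(1 for _ in curves_up_to_height(10 ** 6)))  # roughly a constant times X^(5/6)
```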
Okay. So the way we study ranks of elliptic curves, algebraic ranks of elliptic curves, is frequently via their 2-Selmer groups. So let me tell you what the 2-Selmer group of an elliptic curve is. Instead of giving you a definition of it, I'll just tell you the exact sequence it fits into, because that will explain why it tells us something about the rank. If E is an elliptic curve over Q, then there is an exact sequence

0 → E(Q)/2E(Q) → Sel_2(E) → Sha(E)[2] → 0,

where E(Q)/2E(Q) injects into the 2-Selmer group of E, which surjects onto the 2-torsion in the Tate-Shafarevich group of E. This is a nice exact sequence. It's particularly nice because the 2-Selmer group is finite, and it's a very important quantity: for example, Mordell's proof that the ranks of elliptic curves are finite is basically, in secret, a proof that the 2-Selmer group of an elliptic curve is finite.

Now, there's a Cohen-Lenstra style heuristic, given by Poonen and Rains, which predicts the distribution of the 2-Selmer groups of elliptic curves; in fact, it predicts the distribution of the p-Selmer groups for all primes p. And there's a heuristic of Delaunay which predicts the distribution of the 2-torsion of the Tate-Shafarevich group, and it all matches together. So you've got the Goldfeld and Katz-Sarnak heuristics, which tell you how E(Q)/2E(Q) is distributed: namely, 50% of the time it's trivial and 50% of the time it's Z/2Z. You've got the Poonen-Rains heuristics, which tell you how the 2-Selmer group is distributed. And you've also got Delaunay's conjectures, which tell you how the 2-torsion in the Tate-Shafarevich group is distributed. Everything fits together nicely, so we have a very good picture of how this should behave. In fact, there is very beautiful work of Bhargava, Kane, Lenstra, Poonen, and Rains in which they predict how this entire exact sequence is distributed as the curves vary over a family F.

Now, these different conjectures are made in very different ways. For example, the Katz-Sarnak conjectures are made by studying the family of L-functions associated to elliptic curves, in conjunction with the BSD conjecture, whereas the Poonen-Rains heuristics are very much Cohen-Lenstra style heuristics that model the 2-Selmer group as a random group. The fact that they all fit together is very strong evidence to believe that these conjectures are true.

But what does the data tell us? The data, I'm afraid, is not particularly satisfying. It had been noticed, when people looked at the Cremona tables, that the ranks of elliptic curves were significantly higher than the Goldfeld-Katz-Sarnak conjecture would predict: there seemed to be a particularly high percentage of rank-two curves, the average rank seemed to be bigger than predicted in the data, and so on and so forth. But the Cremona tables didn't have all that much data, because Cremona ordered elliptic curves by conductor, and it's actually very difficult to even know that you've listed out every single elliptic curve of a given conductor; it's computationally quite expensive. However, there is recent work of Balakrishnan, Ho, Kaplan, Spicer, Stein, and Weigandt, who collected a very large amount of data on elliptic curves ordered by height. In particular, they can go up to heights as large as 10^{10.5}, and they still see a much larger average rank than predicted: even in this very large height range, the average rank looks more like 0.97 or something like that, rather than the 1/2 that it's supposed to be. But, very curiously, they also see that the average size of the 2-Selmer group looks smaller than predicted: it looks like 2.7 in the top height range, which is smaller than the 3 that it's supposed to be. So, in the exact sequence above, the rank side looks much bigger than predicted while the Selmer side looks much smaller. And this really is just a feature of the data, because the average size of the 2-Selmer group has in fact been computed: it's a result of Manjul Bhargava and myself that the average size of the 2-Selmer group of elliptic curves, when they're ordered by height, is exactly 3.
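As a small numerical aside (mine, not from the talk), here is a check of the Poonen-Rains prediction for p = 2, using their distribution formula as I recall it; treat the exact expression as an assumption. The point is that it forces the average size of the 2-Selmer group to be p + 1 = 3, the benchmark quoted above.

```python
# Poonen-Rains (as recalled): Prob(dim_{F_p} Sel_p = d)
#   = ( prod_{j>=0} (1 + p^{-j})^{-1} ) * prod_{j=1}^{d} p / (p^j - 1),
# which implies the expected size E[p^d] of Sel_p equals p + 1.

def poonen_rains_probs(p, dmax=50, tail=300):
    norm = 1.0
    for j in range(tail):                 # normalizing constant prod_{j>=0}(1 + p^-j)^-1
        norm /= 1 + p ** (-j)
    probs, pr = [], norm
    for d in range(dmax + 1):
        probs.append(pr)
        pr *= p / (p ** (d + 1) - 1)      # ratio Prob(d+1) / Prob(d)
    return probs

probs = poonen_rains_probs(2)
print("total probability:", sum(probs))                                    # ~ 1.000
print("average #Sel_2   :", sum(2 ** d * q for d, q in enumerate(probs)))  # ~ 3.000
```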
So the average is what it's supposed to be, and the 2.7 in the data is smaller for some reason that needs an explanation. The main result of this talk is a possible explanation for why the average size of the 2-Selmer group looks smaller: the existence of a secondary term. So this is the main result I'll be talking about, and it's due to Taniguchi and myself. What we prove is that if you sum the size of the 2-Selmer group over the elliptic curves in our family with height less than X, then this grows like 3 times the number of elliptic curves with height less than X, in accordance, of course, with the result of Manjul's and mine; but there is in fact a secondary term, some constant times X^{3/4}, along with a power-saving error term. That is,

sum over H(E) < X of #Sel_2(E) = 3 #{E : H(E) < X} + c X^{3/4} + O(X^{3/4 - delta})

for some delta > 0. Okay, so just to orient ourselves: for the main term, the number of elliptic curves with height less than X, you're basically counting a with |a|^3 less than X and b with b^2 less than X, so there should be about X^{1/3 + 1/2} of them. So the main term grows like X^{5/6}, while the secondary term grows like X^{3/4}: if you sum up 2-Selmer groups of elliptic curves, the first term is 3 times some constant times X^{5/6}, while the second term is some constant times X^{3/4}. And I say this is a possible explanation of what's going on with the data because, for the moment, we are not actually able to compute what this constant c is. We can only prove that it exists. We would expect it to be negative; we can't even prove that it's negative, but we have some theoretical evidence that it is, and so far that's the best we can do. Okay, sorry, I should, I guess, monitor the chat to see if there are any questions. Not yet.

Okay, so before I dive into how this result is proved, I want to talk about why we care about secondary terms in arithmetic statistics. One reason to care about secondary terms is that they are necessary for good error terms. For example, in the result of Bhargava and myself in which we compute the first term, the error term was just a little-o of X^{5/6}. Takashi's and my result gives the first power-saving error term that we have for the count of 2-Selmer groups of elliptic curves. The main term grows like X^{5/6}, and note that if you want a power-saving error term better than X^{3/4}, then you automatically need to get the secondary term: you're not actually going to prove a power saving better than X^{3/4} unless you also recover the secondary term. So if you want really good error terms, then you need to isolate the secondary term, and good error terms are in fact necessary, for example, to study the families of L-functions that you might associate with this. For instance, the Katz-Sarnak heuristic is obtained by studying families of L-functions associated to elliptic curves, and what they predict is that the low-lying zeros of this family have an orthogonal distribution. That has been proven for test functions of certain support, and those results in fact give you bounds on the average analytic rank of elliptic curves, so they're very useful results.
But suppose you want, for example, to study families of elliptic curves weighted by their 2-Selmer groups, and you want to try to prove something about how the 2-Selmer group is distributed jointly with how the analytic rank is distributed. One of the consequences of Poonen-Rains, for example, is that if you take the distribution of 2-Selmer groups and restrict to elliptic curves having rank zero, or restrict to elliptic curves having rank one (and there should be about 50 percent of each), the distribution doesn't really change. If you want to prove things of that sort, then what you have to do is study families of L-functions of elliptic curves in which each elliptic curve is weighted by the size of its 2-Selmer group, and you have to prove some result about that family of L-functions; and you can't do that without power-saving error terms. So our result, for example, will imply that this weighted family of L-functions also satisfies Sato-Tate equidistribution. That's one reason you might care about secondary terms: they give good power-saving error terms, which are also important for other things.

Another reason to care about secondary terms is that they are really needed to align the theory with the data. It's the nature of arithmetic statistics that we have a very wide-ranging set of conjectures and we haven't proven very many of them. We know how class groups of number fields are expected to be distributed; we know how Selmer groups of elliptic curves are expected to be distributed; we have a whole bunch of things that we very strongly believe are true, but proving them is a much slower endeavor. And if you want to be sure that our conjectures are actually correct, and you want to get evidence from data, then you're going to need to understand what the secondary terms look like, because otherwise you're just not going to be able to match the data with the prediction. For example, even in our case, in the height range around 10^{12}, the first-order term looks like 10^{10} and the second-order term looks like 10^9, so there's only a factor of 10 between them. If you really want to see good convergence to the predicted answers, you need to know what the second-order term is.
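For scale, here is a quick back-of-the-envelope computation (my own) comparing the two terms; their ratio shrinks only like X^{-1/12}, which is why the secondary term remains visible even in large data sets.

```python
# Relative size of the main term ~ X^(5/6) and the secondary term ~ X^(3/4).
for exp in (6, 9, 12, 15):
    X = 10.0 ** exp
    main, second = X ** (5 / 6), X ** (3 / 4)
    print(f"X = 1e{exp:>2}:  X^(5/6) = {main:.2e}   X^(3/4) = {second:.2e}"
          f"   ratio = {second / main:.4f}")
# At X = 1e12 the terms differ by only a factor of 10 (10^10 vs 10^9),
# matching the height range discussed above.
```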
A very nice example of this showed up in the family of cubic fields. The classical Davenport-Heilbronn theorem says that the number of cubic fields with discriminant less than X grows like X/(3 zeta(3)). But the fit with the data was really bad: there just seemed to be far fewer cubic fields than this theorem would predict. Roberts conjectured that in fact this count is X/(3 zeta(3)) plus some constant, which he actually wrote down explicitly, times X^{5/6}, along with an error term. And this conjecture has been proven in independent works: one of Bhargava, myself, and Tsimerman, and, independently, one of Taniguchi and Thorne. Once you plot the data against not just the main term but the sum of the main and secondary terms, you get a perfect fit. There is absolutely no doubt, when you look at the data, that it's correct; it fits quite wonderfully.

A third reason you might care about secondary terms is that they often have theoretical significance. I put a question mark there, because secondary terms are nowhere near as well understood in arithmetic statistics as primary terms. For primary terms we have conjectures for a whole bunch of families of arithmetic objects: we have the Cohen-Lenstra heuristics, which tell us how the primary terms for summing up any torsion subgroup of the class groups of fixed-degree number fields should grow; we have primary terms for the number of number fields of fixed Galois group with discriminant up to X; we have primary terms if you want to sum up Selmer groups of elliptic curves. We don't really have an understanding of what secondary terms should look like, but we do expect them to have theoretical meaning. Some of this theoretical meaning can be seen in the function-field setting, where there's some speculation that second-order terms could correspond to secondary homological stability in the sense of Galatius, Kupers, and Randal-Williams.

But I'm going to talk about a very nice source of secondary terms in arithmetic statistics that comes from prehomogeneous vector spaces, so let me give you an example of what happens in the prehomogeneous case. Look at the representation of GL_2 on the set of integral binary cubic forms, which I'll call V_3(Z). This representation is prehomogeneous, which means that the ring of polynomial invariants is generated by a single element; here it is generated by the discriminant, which I'll call Delta. There's a result of Davenport which counts the number of GL_2(Z)-orbits on integral binary cubic forms with discriminant less than X, and it says that this grows like 3 zeta(2) times X. This result was generalized by Shintani. To tell you what Shintani's generalization is, let me define the Shintani zeta function in this case. Given any n, we let h(n) be the number of GL_2(Z)-orbits on integral binary cubic forms having discriminant equal to n, and then you package all these numbers h(n) together into a zeta function: the Shintani zeta function is defined to be xi(s) = sum over n of h(n)/n^s. What Shintani proved is that this zeta function has an analytic continuation to the whole complex plane, along with a functional equation, and that it has poles at s = 1 and s = 5/6. As a consequence, he was able to refine Davenport's count.
Namely, the number of GL_2(Z)-orbits on binary cubic forms with 0 < Delta < X grows like the same constant Davenport got, times X, plus a certain constant times X^{5/6}, with a power-saving error term. And this second-order term corresponds, of course, to the pole at s = 5/6 of the Shintani zeta function.

There is a zeta function naturally associated to any prehomogeneous vector space, and in fact there is a general theory of Sato and Shintani. What do I mean by that? You take a reductive group G and an irreducible representation W of G which is prehomogeneous, and what it means to be prehomogeneous is that the ring of polynomial invariants is generated by a single element. Then, exactly as Shintani did in the case of binary cubic forms, you can form the Shintani zeta function associated to this representation, and this general theory of Sato and Shintani proves that the zeta function has a functional equation and an analytic continuation. They also prove that there are only finitely many possible poles for this zeta function: they show that the poles are contained inside the set of zeros of the Bernstein-Sato polynomial associated to the invariant. So that's a very satisfying theory: given any prehomogeneous representation, you get the zeta function, and then you get a set of possible poles from the Bernstein-Sato polynomial. But this still doesn't mean that you can say what the poles actually are, so there are still many open questions here.

For example, let me give you a prehomogeneous vector space that corresponds to quintic fields. Take quadruples of alternating 5-by-5 matrices; this is a representation of the group GL_4(Z) x SL_5(Z), and again it is prehomogeneous, so the Shintani zeta function has an analytic continuation and a functional equation. But there is a whole list of possible poles, corresponding to the zeros of the Bernstein-Sato polynomial, which was computed in work of Kashiwara, Kimura, Kawai, and Sato. The possible poles (I'm going to miss some of them) included 4/3, 10/9, 1, 9/10, 5/6, and I think I missed a 7/6: a whole bunch of possible poles. Then there was work of Kable and Yukie which showed that three of these do not actually occur as poles, and there's work of Bhargava which says that 1 is a pole: he proves that 1 is a pole and computes the residue at 1. But there's still a question mark on the remaining ones, and answering which of the remaining possibilities are poles will give you lower-order terms in the counting function for this vector space. For instance, if you can compute the secondary term in counting these orbits, you'll know whether or not 9/10 is a pole of the Shintani zeta function associated to this representation.
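To spell out the link between poles and counting terms that keeps being invoked here, this is the standard Tauberian/Perron step, in a short sketch of my own rendering (not the speaker's):

```latex
% Poles of the Shintani zeta function produce the terms of the counting function.
\[
  \xi(s) \;=\; \sum_{n\ge 1}\frac{h(n)}{n^{s}},
  \qquad
  \sum_{n\le X} h(n) \;=\; \frac{1}{2\pi i}\int_{(\sigma)} \xi(s)\,\frac{X^{s}}{s}\,ds
  \quad\text{(Perron's formula).}
\]
Shifting the contour to the left past the poles at $s=1$ and $s=\tfrac56$ picks up their residues:
\[
  \sum_{n\le X} h(n)
  \;=\; \Bigl(\operatorname*{Res}_{s=1}\xi(s)\Bigr)\,X
  \;+\; \tfrac{6}{5}\Bigl(\operatorname*{Res}_{s=5/6}\xi(s)\Bigr)\,X^{5/6}
  \;+\; E(X),
\]
where the error term $E(X)$ is controlled by growth estimates for $\xi(s)$ on the shifted contour.
```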
Okay, so now let me come back to our situation. Remember, we're looking at elements of the 2-Selmer groups of elliptic curves. I told you previously that secondary terms are known for the family of cubic fields, and the way the secondary term was computed there is that the set of cubic fields injects into the set of GL_2(Z)-orbits on integral binary cubic forms; secondary terms for counting these orbits were determined by Shintani, and those correspond to secondary terms in counting cubic fields. Similarly, the set of all 2-Selmer elements of elliptic curves doesn't quite inject, but it is related to the representation of PGL_2(Z) on integral binary quartic forms. Now, this representation is not prehomogeneous, but it's the next best thing: it's a coregular representation. If I look at the ring of invariants for the action of PGL_2 on the space of integral binary quartic forms, then it's generated by two elements, which I'll call A and B, where A is a degree-2 polynomial and B is a degree-3 polynomial in the coefficients of the binary quartic form. And the relation between the 2-Selmer group of an elliptic curve and orbits of this representation is that the size of the 2-Selmer group of E_{a,b} can be written as a sum over PGL_2(Z)-orbits of integral binary quartic forms f with invariants equal to a and b, where each f is weighted by a certain weight function w(f).

[Host:] Sorry for interrupting; there is a question in the chat. Would you like to unmute and ask? [A question is asked about the relation between the poles of the zeta function and the orbit-counting asymptotics.]

Yes, absolutely, it's a great question, thank you. The way this works is via a standard principle (I'm blanking on the name, but it's very well known in analytic number theory) about zeta functions. The Sato-Shintani zeta function associated to a prehomogeneous vector space takes the number of orbits with a given discriminant and packages all of that into a zeta function. Now, on the one hand you have the question: can I count the number of orbits having bounded discriminant? On the other hand, you have the question: what are the analytic properties of the zeta function? And those two questions are very closely related. For example, the fact that those three values are not poles, and that 1 is a pole, says that if we count the number of these orbits with discriminant less than X, then the leading term is a constant times X, which is in fact what Bhargava proved. If, for example, 4/3 were a pole, then the leading term would grow like X^{4/3}, not like X. Does that answer the question? [Yes, thank you.]

[Follow-up question:] Kable and Yukie gave an upper bound, smaller than X^{10/9}? Yes, that's what they did. Absolutely. I should say also that Bhargava's result, which implies that 1 is a pole and which computes the residue at 1, was used by him to give asymptotics for the number of G(Z)-orbits on W(Z) with bounded discriminant. And then, doing a sieve together with his work parameterizing quintic rings, he managed to actually prove that the number of quintic fields with discriminant less than X grows linearly in X, which is a huge result in arithmetic statistics. This orbit-counting result, which he proved in conjunction with his higher composition laws result parameterizing quintic rings, provided the two ingredients that went into counting quintic fields.
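Before returning to binary quartic forms, here is a small concrete illustration (mine, not from the talk) of their invariants. In the normalization common in the literature they are called I and J (degree 2 and degree 3 in the coefficients, matching the talk's A and B up to scaling), and they satisfy 27 disc(f) = 4I^3 - J^2.

```python
# Invariants of f(x, y) = a x^4 + b x^3 y + c x^2 y^2 + d x y^3 + e y^4.
def invariants(a, b, c, d, e):
    I = 12 * a * e - 3 * b * d + c * c
    J = 72 * a * c * e + 9 * b * c * d - 27 * a * d * d - 27 * b * b * e - 2 * c ** 3
    return I, J

def disc(a, b, c, d, e):
    I, J = invariants(a, b, c, d, e)
    num = 4 * I ** 3 - J ** 2      # 27 * disc(f) = 4 I^3 - J^2
    assert num % 27 == 0
    return num // 27

# Example: f = x^4 + y^4 has (I, J) = (12, 0) and discriminant 256.
print(invariants(1, 0, 0, 0, 1), disc(1, 0, 0, 0, 1))
```

One can then take the height of a form to be the height of its pair of invariants, exactly as for elliptic curves; that is the height used in the next result.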
Okay. So, just as the family of cubic fields was associated to orbits on binary cubic forms, the family of 2-Selmer elements of elliptic curves is associated to PGL_2(Z)-orbits on binary quartic forms. And just as the first step in getting a secondary term for cubic fields was Shintani's result giving a secondary term for GL_2(Z)-orbits on binary cubic forms, the natural first thing to do when trying to get a secondary term for counting 2-Selmer elements of elliptic curves is to get a secondary term in counting PGL_2(Z)-orbits on binary quartic forms. And this we do. This is a result, again, of mine with Takashi, in which we count PGL_2(Z)-orbits on binary quartic forms with height less than X. And what's the height on binary quartic forms? It's the same height that you put on elliptic curves, max(4|A|^3, 27B^2): given a binary quartic form, you have the two invariants A and B associated to it, and so you form the natural height. What we prove is that this count is a certain constant times X^{5/6} (the primary term here was proven by myself and Manjul Bhargava), plus a secondary term: a constant c_2' times X^{3/4}. Sorry, I think I've used c for every single indeterminate constant so far, but let me continue doing that. And here we can prove that this constant c_2' is less than zero: we can give some sort of explicit description of c_2', and it is negative. It's a negative secondary term.

Okay. So we can count binary quartic forms with two main terms, and we want to go from there to counting 2-Selmer elements of elliptic curves with two main terms. But here's where things get very different from the case of cubic fields. To prove our main result, we need to prove an analogue of this counting result in which the binary quartic forms are weighted by the function w(f). Now, things aren't all bad: this also happens in the cubic case, and w(f) is a local function, a product over all primes p of w_p(f), where w_p is defined on V(Z_p). However, and this is very different from the cubic case, where the corresponding local weight was defined mod p^2, here w_p is not defined mod p^k for any k; it's genuinely just some function on V(Z_p). So we need a bunch of ingredients. First, we write w_p as a sum over k of functions w_{p,k}, where w_{p,k} is defined mod p^{2k}. Second, w_{p,k} is sparse, which is to say that w_{p,k} is supported on forms whose discriminants are highly divisible by p. So the first thing that we do is write our weight function as a sum of functions, each of which is periodic, and which are also increasingly sparse. We do this essentially by interpreting w_p as counting vertices in the Bruhat-Tits tree of PGL_2.
Once we have this, what we want to do is sum w(f) over PGL_2(Z)-orbits of integral binary quartic forms. We've shown that w(f) is a sum over n of w_n(f), where w_n(f) is simply the product, over the prime powers p^k exactly dividing n, of w_{p,k}(f). So we have to sum a weight function which we've broken up in this way, and each w_n is supported on forms f for which n^2 divides the discriminant of f: an increasingly sparse set. And to count this, we use something very familiar from sieve theory: we break the sum up into small n and large n. For n small, we count using equidistribution techniques; for n large, we prove a uniformity estimate and simply bound the contribution with a tail estimate. And that's how the result is proven. I think I'm out of time, fifty minutes having passed, so I'll stop now for questions.
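As a postscript, here is a toy computation (my own, not from the talk) illustrating the sparseness that makes the small/large-n split work: among integral binary quartic forms in a small coefficient box, the proportion whose discriminant is divisible by n^2 drops quickly as n grows, which is why the large-n tail of the weight decomposition contributes little.

```python
# Empirical sparseness of the condition n^2 | disc(f) for quartic forms in a box.
from itertools import product

def invariants(a, b, c, d, e):
    I = 12 * a * e - 3 * b * d + c * c
    J = 72 * a * c * e + 9 * b * c * d - 27 * a * d * d - 27 * b * b * e - 2 * c ** 3
    return I, J

def disc(coeffs):
    I, J = invariants(*coeffs)
    return (4 * I ** 3 - J ** 2) // 27     # 27 * disc(f) = 4 I^3 - J^2

B = 4                                       # coefficients range over [-B, B]
forms = [f for f in product(range(-B, B + 1), repeat=5) if disc(f) != 0]
for n in (2, 3, 5, 7, 11):
    frac = sum(1 for f in forms if disc(f) % n ** 2 == 0) / len(forms)
    print(f"n = {n:>2}: proportion of forms with n^2 | disc = {frac:.4f}")
```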