 I'll start by just recalling one or two points from last time. So last time I introduced the notions of Gower's norms. So these are norms on functions. I'm going to go over exactly the definition of those again later and I also talked about nil sequences which are the main topic of these lectures in fact. So let me just remind you what I said about what a nil sequence is. So a nil sequence is phi of p of n where p from z to g is a polynomial sequence. g is some s step or class s. What is Immanuel? What's the French term for class? I'll use class, class s, nilpotent group. The nilpotent Lie group, which I'm going to assume is simply connected. And phi is automorphic, a smooth automorphic function, gamma automorphic function for some lattice gamma in G. Now I spent a lot of time in the first lecture talking about arithmetic progressions. And I mentioned, I stated something called a generalized von Neumann theorem which provides a link between counting arithmetic progressions and the Gower's norms. And I used that to motivate the following question. So when is the Gower's norm, when is the Gower's UK norm small? Because if you can show that for a particular function that Gower's norm is small, then that function doesn't contribute to counting arithmetic progressions. So what I want to explain in detail today is at least the statement of the answer to that question. But let me state a rough form first. So theorem, this is the inverse theorem for the Gower's norms. So let, so suppose that, suppose delta is greater than zero and suppose f is a bounded function from one up to n to the complexes and suppose that its Gower's norm is, is at least delta. So suppose f in UK of n is at least delta. Then the conclusion is that f correlates with a nil sequence and then there is a nil sequence chi of n equals phi of p of n such that the average value of f of n times chi of n is at least delta primed, where here delta primed is bounded away from zero in a way that depends on delta and k. So if a Gower's norm of a function is large, then it correlates with a nil sequence. Now as I've stated it here, this result has precisely no content at all, it turns out. And the reason for that is that just every function from one up to n can be written in this form, chi of n is phi of p of n. In fact with just taking G equals r, so I'll leave that as an exercise. So to add content you need to say some additional things about chi. So here chi is, it's in fact a k minus one, a class k minus one, nil sequence and its complexity with complexity bounded in terms of just delta and k only. Well perhaps an even more trivial reason why the first statement had no content is that actually the way I stated it to begin with, I wasn't asserting any bound on chi. So I could just have multiplied k by some huge number and this would be true. So among other things complexity controls the size of k. And there's a converse to this, conversely, well if f has a large inner product with a nil sequence, then its gara's norm is large converse. If chi is a class k minus one nil sequence of complexity, well let me not say it like that. So just if chi is a k minus one, a class k minus one nil sequence that correlates with f, then f has a large gara's norm. So f in the gara's u k norm is at least delta primed, where now delta prime depends on delta k and on the complexity of chi, depends on delta k and the complexity. So one of my main aims in this lecture is going to just be to at least tell you what complexity is. And as I said, I don't think this has ever been done in a lecture by anybody. So at least we will see a rigorous statement of what the inverse theorem for the gara's norms is. But the point of this is that somehow these nil sequences are a complete list of characters, if you like, or the nil sequences of bounded complexity are a complete list of characters for this higher order Fourier analysis, which is supposed to be to do with these gara's norms, which themselves control arithmetic progressions and other related questions. Let me say a few words, though, about the proof of this theorem. So this is actually very difficult and long. It was finally finished off in a 140 page paper by Tau and Ziegler and myself, and certain special cases were done earlier. And I think the consensus is that we don't really understand why this is true yet. So we don't really understand why nil sequences are the complete list of obstructions to this gara's norm being small. The converse is much easier and I'm going to tell you, probably even sketch a proof of this. So it's relatively easy to show that if f correlates with a nil sequence, then the gara's norm is large. So any questions before I set off on trying to make all of this more precise? Good. Yeah, so the plan is I'm going to just define all of these terms properly. So let's define the gara's norms, first of all. Now, actually, it's best to define the gara's norms on a billion group, first of all, rather than on the interval 1 up to n. So let z be an abelian group, a finite abelian group. And let k greater than or equal to 2 be an integer. Then we define the gara's norm. Well, first of all, define it's 2 to the kth power, f in u to the k of z to the 2 to the k. It is defined to be this average, so the average over x and then over h1 up to hk of the product over omega. Omega ranges over the n cubed, 0, 1 to the n of, and I'll explain this in a moment, curly c to the omega f of x plus omega dot h. So here, curly c is just complex conjugation, and mod omega is just the sum omega 1 plus up to omega k. So let me remark as I did last time, it's not obvious that the quantity on the right is real and non-negative, but that actually turns out to be the case. So it turns out that the right-hand side, right-hand side, by the way, you should, I feel very embarrassed that my French is so bad that I couldn't even dream of lecturing in French, but if I say something in English that makes no sense, like yesterday I used the phrase red herring, which apparently makes no sense whatsoever in French. So if I do that again, please just tell me, okay? So the right-hand side is real and non-negative, but that's not quite obvious. It's not particularly difficult either, but it's not obvious, and so there's a unique choice of 2 to the kth root, which makes that norm real and non-negative. So define f in uk of z to be the unique real and non-negative 2 to the kth root. So depending on how much time I have in these lectures, I mean I do want to get to more substantial things, I may give some indications of the proofs of these facts, which rely on something called the Gower's Cauchy-Schwarz inequality. The main tool in all of this part of the theory is just the Cauchy-Schwarz inequality applied many times. So it also turns out that we have the triangular inequality, f plus g in uk of z is bounded by f in uk of z plus g in uk of z. And finally let me observe the, there's a nesting property of these norms. So also the u2 is less than or equal to the u3, etc. So all of these facts follow from something called the Gower's Cauchy-Schwarz inequality from Gower's Cauchy-Schwarz, which as I said is just a sort of exotic version of the Cauchy-Schwarz inequality and proven using just the Cauchy-Schwarz inequality. Is there some, sorry is there something? You didn't tell us what is f in this formula. So it's defined on z to the power k or just on z? No, that's a good question. So these, sorry, these are ranging over z. No, f is a function, you're right, f is a function on z. So let f from z to the complex is be a function, yes. So let me explain actually now why I've introduced this in z. But there are many advantages of working on an abelian group. This makes many averaging arguments a lot easier and you don't ever have to worry about the range that things are averaged over. So now if, now suppose f is a function on 1 up to n, then what we do is we embed 1 up to n inside a cyclic group. So embed n inside z modulo n prime z. And it doesn't actually matter which n prime we choose, so just take it to be quite big. So with n primed bigger than 10k times n, say. And it's also convenient for technical reasons to take it to be a prime. And now we define the Gauss-Uk norm of f on n to be the norm of the same function regarded as a function on the group. This is a slight abuse of notation. f is a function on 1 up to n, but it can also be thought of as a function on this group just by setting it to be 0 outside of the image of that embedding. Normalized by the norm of the interval on that same group. So that looks very technical. Let me explain two things about it. First of all, it doesn't depend on the choice of n primed. And actually the reason for that is it's just the same thing as the average. You can also express it like this, but where the average is only over 2 to the k tuples for which all of these elements lie inside the interval 1 up to n. So same as, let's call this star, in which the average is only over x plus omega dot h lying inside 1 up to n. But that's a nasty set of x's and h's to average over. It's kind of a convex subset of z to the k plus 1. I'm not going to dwell too much on the precise distinction between the norm of functions up to n and the norm on this group. They're really the same up to a constant factor. You just slightly lose some properties. Like this monotonicity is not quite true. You have to include some extra factors of 2 and things. But more or less, they're the same. So that's some, they're of at least defined the Gauss norms. So now let me define properly what these nil sequences are. Because last time I didn't even, I mean, I think many people know, but I didn't even remind you what a nilpotent league group is in general. And actually it's good to define these in greater generality than that. So let me do that. That will come back in just a second. Can everyone see what they need to on that? Yeah. Okay, so let me talk about filtrations. And I'm trying to be consistent in my use of terminology, but I can't guarantee that this is all standard terminology. So let G be a group, a pre-filtration on G, which I will call G dot, and it's going to be a sequence of subgroups G i, i equals naught to infinity, is a nested sequence G is equal to G naught, contains G1, contains G2. This containment symbol includes the equality case, so maybe I should just make that completely clear with. So it's a nested sequence of subgroups. And we say that it has class S, I guess class at most. Class S, if all of the terms from S plus one onwards are trivial. So here E is the identity element on G. And I'll say that it's a filtration just if the first two terms are equal. So if G naught equals G1, I actually forgot the most important axiom of a pre-filtration. This is at the moment just a nested sequence of subgroups. It's supposed to satisfy, satisfying that the commutator of G i with G j is contained in G i plus j for all i and j. All i and j bigger than one, bigger than zero. So let me just remind you what the commutator of two groups is. H commutator k is the group generated by the pairwise commutators H and k, H and k, H and h, and k and k. Not the same thing as the set of commutators. So that's what a filtration is. Now there's a particular filtration, a particularly well-known filtration, is the lower central series filtration. So the lower central series is defined by G naught just to distinguish it, I won't use the brackets. So G naught equals G1 equals G and G i plus 1 is the commutator of G with G i. So it's actually the maximal filtration because any filtration G commutator G i has to be contained in G i plus 1. So it's actually the maximal. I suppose maximal, minimal, it's the, let's go with maximal. Maximal filtration on G. Any other filtration will have G i contained in G i without the brackets just by induction. So maybe I'll call it the minimal filtration. It's not absolutely immediate that the lower central series is a filtration, but that is true and that's easy to prove by induction. So this is a filtration by induction. So a group is, as is well known, a group is nilpotent if the lower central series filtration has finite class. So a group is nilpotent of class S if G sub S plus 1 is equal to the identity, i.e. if the lower central series filtration, filtration has class at most S. And because the lower central series filtration is the minimal filtration, a group will only admit a filtration of finite class if it's nilpotent. So that's what nilpotent is. So let me just give a quick example, the Heisenberg that we saw last time. So G is 1, 1, 1, R, R. So G naught equals G1 equals G. G2 is the commutator subgroup, GG, which is 1, 1, 1, 0, 0, R. And then G3 is trivial, so it's nilpotent of class. So why have I mentioned all of this? Well, it turns out that for the purposes of the theory I'm interested in, there's nothing really special about the lower central series. Very occasionally the lower central series needs to be discussed. But usually everything works just for an arbitrary filtration. And moreover, it's good to have the flexibility of arbitrary filtrations. So for example, they're closed under intersections with subgroups, whereas the lower central series need not be. So it's much nicer category, well, nice place to work. So a polynomial sequence is something that is defined relative to a filtration. Polinomial sequence, so the group definition. Let's let G bullet equals GI, I equals 0 to infinity be a filtration. The group of polynomial sequences, poly ZG dot, of polynomial sequences P from Z to G. So this definition depends on more than just G. It depends on the filtration is generated by, is the group generated by sequences of the form little GI to the PI of N. So those are sort of basic polynomial sequences where GI must always lie in a suitably high element of this filtration. So we saw this, I gave some examples in the Heisenberg group before. I'm only allowed, in the Heisenberg group, I'm only allowed quadratic terms, which sit inside the commutative subgroup G2. So EG example, 111 alpha N, beta N, gamma N squared is a polynomial is in poly ZG bullet where G bullets the lower central series filtration on the Heisenberg. That's actually not absolutely immediate, but it's an easy exercise to write this as a product of more basic sequences like this. Now again, it depends how much time I have, but there's actually a very remarkable theory of these polynomial sequences in the case where the group is nilpotent. I should say that, sorry, I only want to make this definition when G is a finite class. So G is nilpotent and the filtrations of finite class. There's just a quite remarkable theory that I want to show you some bits of, at least, due mainly to Alexander Leibman, but also to Horst and Krah. And this is the idea that there are some different characterizations of what it means to be a polynomial map in this sense. So alternative characterizations. When you say the group generated, you mean by point one product? Yes, yeah. So there are some alternative characterizations and the most remarkable, so EG, P lies in poly of ZG bullet if and only if you have this derivative property, delta H1 delta H2 to delta Hj P of n takes values in G sub J for every choice of H1 up to Hj. So for all H1 up to Hj in the integers and for all J greater than one, we're here delta sub H, well it's kind of a multiplicative notion of discrete derivative on this group. So delta sub Hf of n is f of n times f of n plus h inverse. So this is somehow, this is a very natural definition of polynomial. These are derivatives and you're saying that somehow certain derivatives lie in certain subgroups. So this is definitely a non-trivial theorem and one fact that you may immediately note about this is that with this definition here, it's completely not obvious that these P form a group. That's just not obvious at all. It's very non-obvious that it's a group with this definition. So in many ways this is the most, really this is somehow the natural definition and then it's a theorem that all polynomial sequences are of that form and that everything of that form is a polynomial sequence. G is nilpotent here. Yes. Again, this is important that G is always nilpotent. Does it follow from the user? Yes, if there's a filtration of finite class that implies that it's nilpotent because the lower central series sits beneath, sits above the filtration. Sorry, beneath the filtration and therefore if the filtration eventually terminates, so does the lower central series. So you can only have a filtration of finite class in the nilpotent case. And actually there's an example which I don't remember off the top of my head. But if G is not nilpotent, these two notions are different. So there are polynomial sequences satisfying one but not the other. So are there any questions on this before I move on? Yeah. Do you hope that the contrary enclosure, it just said you are inside? I think I meant it the other way around. You can have, I mean you can have really quite big, quite flabby filtrations that are just constant for a long time and then start decaying down. In fact, there's a very trivial filtration in which everything in it is just the group G. It's not a finite class. Why do you want to take G0 and C1 to be G? Why not just start at C1 equals the commutator and C and G? Well, if I label things like that then the filtration property is going to look slightly weird. It'll be something like ij and ij, i plus j minus one or some such thing. This turns out to be the neatest way to deal with things. I've done that thing with the board where I'm going to struggle to get this one down again. I think, what is it? Is this possible? So this is a very algebraic theory which I may give some hints of if I have time. It's not that easy to prove these facts and I only have six lectures. So this is valid just in the setting of nilpotent groups. But I'm exclusively going to be interested in a much more restricted setting than that. So we'll only be interested in. And I think this is, I'm not very happy with this piece of nomenclature. But let me say a proper filtration. Maybe I should call it a Lee filtration. I think that's a better term, but let's stick with proper filtration. So these are ones in which everything involved is a closed and simply connected Lee group. So simply connected Lee group in which G is a simply connected Lee group. Each GI is a closed connected subgroup. So the Heisenberg example that I showed you before is definitely of that type. So think Heisenberg example. Actually, let me give you another example. This is a pretty trivial example, but I'll mention it again later. Another example would be to take G to be R and G0 equals G1 equals Gs equals R and then Gs plus 1 and all higher terms to just be trivial. So that's what I'd call a quite a flabby sort of filtration. It's definitely not a minimal one. And an exercise, well, in fact, just follows from the definition. So with here, traditional polynomials P from Z to R of degree at most S. So EG and P of n is alpha n to the S. Those will all be polynomial sequences in the sense of my definition there, in the more general sense. So it really is a generalization of a familiar notion. So now I need to introduce, if I want to talk about nil sequences, I want to introduce a lattice. While I'm rubbing this out, maybe I should say, I mean, I think it's instructive to see the general definition of these things. But personally, I usually just think about examples on the Heisenberg group, which have a lot of the general structure already, although not all of it. So lattices and automorphic functions. So a lattice is just a discreet and co-compact subgroup. So gamma less than or equal to g is a lattice if it is discreet and co-compact. So we've seen already an example in the Heisenberg EG 111 Z Z Z in the Heisenberg. So there may not be a lattice in G. There are examples in dimension seven, I think, of class two nil-potent groups admitting no lattice. But that's totally irrelevant for me. I want to give you very much the sense that this is, even though I've used words like simply connected leagroup, I'm not doing any kind of topology here. This is really algebra. Once you think of these things as like vector spaces, they have no interesting topological content, really. I'll make that clearer because I'm going to mention the Lie algebra. So being a simply connected nil-potent leagroup means that you are, the exponential map from the Lie algebra is a homeomorphism. So they really are algebraic objects, I would say. So it's possible that G admits no lattice, but I simply won't care about such Gs in this theory. They just never come up. I'm always going to assume we say that the filtration G-bullet is rational. Well, if gamma basically forms a lattice in each element of that sequence. And we'll only ever be interested in rational filtrations. Again, there are sort of what I'd call soft results, and by soft I don't mean easy. There are statements due to Malchev along the lines of that if that any lattice will be rational with respect to the lower central series. I believe that's a theorem. But again, it's irrelevant to us. Every situation that we're given, it will somehow be clear that these things hold. So I'm not interested in pathologies. Any questions about that? So that's lattice, and now I can define automorphic function. For me, an automorphic function, a smooth automorphic function, is some phi in C infinity of G. C infinity of G is a meaningful notion. G is supposed to be a Lie group satisfying that phi of gamma G equals phi of G for all gamma in gamma. So now I can define precisely what a null sequence is. So definition, a null sequence of class S is a function chi of n is phi of p of n, where all of the objects written there are as above. So p is a polynomial with respect to some filtration for some proper filtration G bar, and which is rational, rational with respect to some lattice gamma, and phi is some smooth gamma automorphic function. So there we are. That's a formal definition of what a null sequence is. It's a bit more complicated involving this additional structure of a filtration, which must be rational with respect to a lattice. So maybe now is a good time to make some remarks about this. But I don't necessarily think that this particular definition will last forever more. And here's why I think this. I talked about null sequences being the generalizations of additive characters, which are e to the 2 pi i theta n. So those additive characters are a very special type of null sequence. And here I'm allowing any smooth function phi. And the reason for that is that we don't know which is somehow the natural subclass of smooth functions phi that we should be considering. So maybe one should take eigenfunctions of some operator on G mod gamma. I don't know. I don't know. There seems to be no persuasive reason to select one class at the moment. So this definition is potentially subject to revision in the future, but for now it's fairly workable. Now the statement of the inverse theorem for the Gauss-Norms involved a notion of complexity. And without that, it has no content because these null sequences can be just any these can take essentially arbitrary values on arbitrarily long intervals. So to make the theory useful, we have to have a notion of complexity. So I want to at least tell you the definition because I want to give a complete statement of the inverse theorem for the Gauss-Norms. But these complexity issues in this subject, they're always annoying and they're never conceptually difficult. So there's something that I always just put in to what I tell and myself we always would just put in an appendix to our papers. They kind of always work out in the end. A bit like I don't know if anybody's done linear algebra over Q where you have to keep track of the heights of the rationals that you're using and so on. These things are always tedious, but never really difficult. Well, sometimes really difficult. I guess Immanuel has done some situations where it has actually been very important to keep track of those things. But for me, they'll never be conceptually difficult. So complexity. To make the theory quantitatively useful, maybe before saying this, I should also say the notion of a null sequence first arose in work of Bergelsen-Hosten-Kra in a Gothic theory. And their notion was a little bit different to this in that they, in a Gothic theory, it's somehow natural to just use a continuous function phi. And they didn't use polynomial sequences. They used a special type of polynomial sequence, just a linear sequence. The need to use polynomial sequences arises only in this quantitative theory for reasons that I'll mention. So to make the theory quantitatively useful, we need to know, we need measures of two things really. The complexity of the filtration with respect to gamma and also how smooth the function fires. So the smoothness. So to quantify those notions, well, you have to choose a basis with respect to which you're measuring these things. And the way to do that is to pass to the Lie algebra. So we work in the Lie algebra, curly G. I'm going to run into trouble here. I need to make sure that does not look like a phi. Yeah. Well, there's a Lie algebra, G. Well, I myself am not super familiar with Lie theory. I don't want to go over huge amounts of the theory, but what I recommend, if you're not familiar with it, just in the Heisenberg case, figure out what the Lie algebra is. It's a three-dimensional vector space with a bracket operation on it. So the thing is, for these simply connected, since G is nilpotent and simply connected, we have homomorphisms, homeomorphisms, not homomorphisms, homeomorphisms between the Lie algebra called the exponential map and the logarithmic map. And as I say, this really means we're doing algebra and definitely not any kind of topology. So we're doing algebra. Now, the Lie algebra is a vector space, so I can just pick a basis for it. So complexity. Complexity can only be measured with respect to a particular choice of basis. Choose a basis, curly B, for the Lie algebra, G. Now, it's convenient not to choose an arbitrary basis, but to choose one that is adapted. So we say that curly B is adapted to the filtration. So if x1 up to x sub dim, I guess I want to actually start from the other end, x sub dim g minus dim g i plus 1 up to x sub dim g is a basis for the Lie algebra of g i, which is defined to be the log of g i in the filtration. So an adapted basis is one that's good for computing with respect to filtrations. And once we've made such a choice, we can quantify those two notions above. So I can now give you the definition of both of those things. Are there any questions on that? So I should have time to just finish this before the end of the lecture. Okay, so definition. We say that the complexity, which I would write hash sub b of the filtration and gamma, because that's the things it depends on, is at most m if there is an integer. So m is supposed to be an integer if the following are true. So basically, the idea is that I want all of the interesting things about how the filtration and the lattice interact to be bounded in terms of m. And it turns out that these are the interesting things. So first of all 1, so the structure constants describe the Lie bracket in terms of this basis. So those are the aijks. They satisfy they're all at most m and they're also m rational. So mijk lies inside the integers. So somehow there is this result of Malchev that says that a Lie group only admits a lattice if all of the if and only if the structure constants are rational. So this is a quantitative version of g admitting a lattice, gamma. And I found it convenient when writing notes on this to introduce just one slightly technical extra condition. And I'll tell you why in a moment. And so this is just purely technical, but I want to give things completely. Let me just remind myself exactly what it was. Yeah, for each value of i, there are at most m values of j such that that's not zero. Not trivially zero. And I included that just because it makes these notions behave better with respect to certain functorial properties of groups like taking direct products. I found that's quite convenient. So just ignore that. And then secondly is the assertion that m times the integer lattice in the basis b is contained within log of the lattice gamma, which is contained within 1 over m times the integer lattice on the basis b. So this is another statement about how the lattice sits inside. It behaves with respect to this basis. So log gamma is just the image under the logarithm map of all the elements of the discrete group gamma. It's actually not necessarily a lattice. It's not necessarily a subgroup, actually. So you have got to be careful with these li operations. They're definitely not homomorphisms. So it's a lattice up to finite index, but it's not closed under addition necessarily. In fact, for the Heisenberg, it's not. So that's a definition of complexity. And then I have to define how smooth phi is. Well, that's fairly routine, it turns out. Let me get filtrations down again. Don't need that anymore. 10 minutes, is that OK? I think we'll make it. We'll spend the first few minutes just recalling trivial definitions anyway. Actually, maybe not. This is France, right? I think I can manage it in six minutes. So we're going to quantify the smoothness using some basically, well, essentially, sub-left derivative norms. We define the smoothness norms. Smoothness, so I write Wm curly b. This will be the supremum of basically all derivatives with respect to the basis elements. So that's an m prime of phi in L infinity. So where the supremum is overall choices of I1 up to Im prime, and overall derivatives of order at most m. So it's just the biggest any derivative can be, and derivative here is in the usual layout of a sense. So where dx phi is the derivative at 0, d by dt at t equals 0, of phi times x of xp tx. So this is a standard definition. So I don't know if there are any genuine experts in the audience, or I know this is being videoed. I don't know if people ever watch these things. But if so, I should like to remark that these notions of complexity are rather different to the ones that are used in papers by Taron myself, where we talk about Lipschitz norms on phi, which require one to put a metric on G modulo gamma. And I've come to the conclusion, and actually I think Terry came to this conclusion years ago really, that smoothness norms are the right way to proceed. And in fact, other people who work with quantitative equidistribution results, such as the papers of Einseidler, Margulis and Venkatesh, they use these smoothness norms as well. One should not be working in the Lipschitz category. Smoothness is correct. So let me finish then by just stating precisely the inverse theorem for the Gauss norms. So I basically stated it before, but so suppose that f in the Gauss norm is at least delta, then there is a nil sequence, chi of n, which is phi of g of n, where the filtration has class at most k minus 1, so where g tilde has class at most k minus 1. And with respect to a suitable basis, for a suitable basis, curly b adapted to g bullet, we have that the complexity with respect to b of the filtration is bounded only in terms of delta and k. So I'll write this as big o delta k1. And the smoothness of phi is also so bounded. So the smoothness is bounded for any m. And finally, and of course the important bit, and f correlates with chi for some delta primed greater than zero in a way that depends on delta and k only. So you can see somehow why this is not normally stated formally. It's a bit of an effort, but the gist of it is what I explained at the beginning. So you correlate with a nil sequence of finite complexity. So next time I will look at the converse of the inverse theorem for the Gauss norms, and you'll be pleased to hear that although these notions of complexity are in the background and would be important if you wanted to do everything rigorously, I shall not be mentioning them explicitly. So I'll just say that they can be bounded. So that's it for today. I guess we should go make haste.