Okay, yes, well, thanks for the invitation. So I'm going to give six lectures, and this first one will be quite introductory. I hope nobody's offended by me going over some fairly basic facts at times, but let me start by explaining what the lecture series is about. The title is Nilsequences, but I think a better title might have been Higher Order Fourier Analysis. I'm going to be looking at various questions, usually in additive and combinatorial number theory, where the traditional techniques involve Fourier analysis, and I'll explain a little bit about that. But those techniques only go up to a certain point, and then they don't work. Much more recently, this idea of higher order Fourier analysis has been introduced, and nilsequences are the characters of higher order Fourier analysis. I quite often give one- or two-hour lectures on this topic, and it's nice to have the opportunity to give a longer series, because I can go into more detail about these objects in their own right. There are some interesting algebraic aspects of the theory that I want to dwell on in a bit more detail than I normally would. So let me begin by telling you what some of the questions we might be interested in answering are. Goals. Here are two theorems that this theory can be used to prove. I won't be giving all of the details of either of them, but hopefully they provide some motivation for wanting to learn about these objects. The first goal is to understand linear patterns in the prime numbers. One example is the following theorem of Terence Tao, myself and Tamar Ziegler from 2011; I'll just state one particular case.
The theorem is that the number of k-term arithmetic progressions of primes (distinct primes less than x which are equally spaced) is asymptotically a constant times x^2 / log^k x. Maybe I'll just pause to explain why that's a natural answer. The number of arithmetic progressions of integers less than x is essentially x^2: you can choose the first term in about x ways and then the common difference in about x ways. And the probability that a given number is prime is 1/log x by the prime number theorem, so the factor x^2 / log^k x is not terribly surprising. Then there's an arithmetic factor, which deals with the fact that prime numbers are usually odd and don't tend to be divisible by 3. So the count is asymptotically c_k x^2 / log^k x for some explicit constant c_k >= 0. Actually, I'll state a more general theorem later on. This is not just a theorem about arithmetic progressions: much more general patterns of primes can be considered. In fact, apart from some degenerate cases, you can essentially count how often an arbitrary system of linear forms over the integers takes prime values, that is, how often psi_1(n), ..., psi_t(n) are all prime, where psi_1, ..., psi_t are linear forms in d variables and the vector n ranges over some nice set, an open set or a box or something like that. Unfortunately, the degenerate cases include most of the really famous open problems, like the twin prime conjecture and the Goldbach conjecture, but almost any other system is covered by this theorem. So that's one main motivating goal. The other one concerns questions related to what's called Szemeredi's theorem. Szemeredi's theorem is a very famous theorem, not about a specific set like the primes, but about an arbitrary set. So suppose that A is a subset of the first N integers, and suppose it has density alpha, so that A has size alpha N.
Then Szemeredi's theorem states that if N is big enough in terms of alpha and k, then A will contain k distinct elements in arithmetic progression. This is a theorem that's been proven many times. Szemeredi's original proof was combinatorial, and there's a very well-known, very beautiful proof using ergodic theory. But it was Timothy Gowers (who I understand is in Paris today and speaking at 5 p.m., if anyone's interested) whose proof of this somehow kicked off the whole subject of higher order Fourier analysis, so I'll be saying a few things about Gowers' proof. Gowers' main motivation in coming up with this proof, I think, was to get sensible bounds for how big N has to be. Previously there'd been no respectable bounds for that quantity, and Gowers found some. But I'm not going to focus so much on that, more on the underlying structures. And I may talk about a curious refinement of this, a curious addendum if you like, which needs much more of the theory that I'm going to talk about: there is some common difference d such that A contains quite a lot of arithmetic progressions with that particular common difference, at least (alpha^k - o(1)) N progressions of common difference d, where the o(1) means a quantity that tends to zero as N tends to infinity. But this holds only for k equal to three and four, and it actually turns out to be false for k equal to five and higher. This is best possible: if you just take a random set, picking elements from 1 up to N independently with probability alpha, then you'd expect the number of progressions with common difference d to be about this for many values of d.
So this is a curious and much deeper fact that has to do with some subtle properties of nilpotent groups of class two, which I'm going to be discussing in these lectures. Those two theorems will be a motivation, but really I want to talk more generally about this thing called higher order Fourier analysis. Okay, so let's begin properly. Why nilsequences, or higher order Fourier analysis? The answer, as I said before, is that there are some problems that can be handled by traditional Fourier analysis and some that cannot. Let me ask two additive number theory questions about an arbitrary set A contained in the first N integers. Question one: are there solutions to the equation x + y = z in A? That's a very basic type of combinatorial additive number theory question. And question two, much more related to the two theorems I've put on the board: are there four-term arithmetic progressions x, x + d, x + 2d, x + 3d in A? What I want to convince you of is that the very classical tools of Fourier analysis can help you solve question one, but that they are not sufficient for studying question two. This has traditionally been a motivation for introducing a higher order theory. So, to study question one and other related questions, we use classical Fourier analysis, also known as exponential sums; it has many different names, and depending on exactly how you apply it, some people would call it the Hardy-Littlewood method. These are all different facets of classical Fourier analysis. The exponential sum attached to A is defined as follows.
Define S_A(theta) to be the average over n <= N of the characteristic function of A times e^(2 pi i theta n). There are two pieces of notation here that I'm going to use quite a lot, so let me explain them properly. Here 1_A(n) is 1 if n is in A and 0 if n is not in A. And a slightly curious but very useful tradition of recent work in additive number theory is that we use the expectation symbol to mean average: the expectation over n <= N is the same thing as 1/N times the sum over n <= N. So S_A(theta) = E_{n <= N} 1_A(n) e^(2 pi i theta n). Now, if you've got two different sets A and B and you want to compare how many solutions to x + y = z they have, it turns out that all you need to do is compare their exponential sums. In a sense that's a trivial statement, because the exponential sum determines the set, but I mean it slightly differently. Let me state it as a proposition. Suppose I have two sets A and B inside {1, ..., N}. Count the number of solutions to x + y = z inside A, that is, the number of pairs (x, y) for which x, y and x + y all lie in A, and compare it with the number of pairs for which x, y and x + y all lie in B. The conclusion is that if the exponential sums of A and B are close, then these two counts are close: the difference between them is bounded by 3 delta N^2, where delta is the supremum over theta of |S_A(theta) - S_B(theta)|. The way I'd like to think about this is that exponential sums are a good enough tool for counting these configurations: exponential sums suffice for studying the pattern x, y, x + y. I'm going to prove this proposition just to show you what I mean by Fourier analysis.
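Before the proof, here is a small numerical sanity check of the proposition, and of the orthogonality formula that drives it (my own illustration, not part of the lecture; the sets, the density 0.3 and the grid size are all arbitrary choices). The sup over theta is approximated by a maximum over a grid, so this is a sketch rather than a proof of anything:

```python
import cmath
import random

N = 300
random.seed(0)
A = {n for n in range(1, N + 1) if random.random() < 0.3}
B = {n for n in range(1, N + 1) if random.random() < 0.3}

def S(X, theta):
    """Normalized exponential sum S_X(theta) = E_{n <= N} 1_X(n) e(theta n)."""
    return sum(cmath.exp(2j * cmath.pi * theta * n) for n in X) / N

def count_sums(X):
    """Number of pairs (x, y) with x, y and x + y all in X."""
    return sum(1 for x in X for y in X if x + y in X)

# The integrand S(t)^2 * conj(S(t)) is a trig polynomial with frequencies
# x + y - z lying in [2 - N, 2N - 1], so averaging over t = j/M with M = 4N
# computes the integral over [0, 1] exactly, up to float error.  This checks
# the orthogonality formula used in the proof.
M = 4 * N
via_fourier = N**3 * sum(S(A, j / M)**2 * S(A, j / M).conjugate()
                         for j in range(M)).real / M
assert round(via_fourier) == count_sums(A)

# delta approximated by the maximum over the same grid (a lower bound for
# the true supremum, so this is a sanity check rather than a proof).
delta = max(abs(S(A, j / M) - S(B, j / M)) for j in range(M))
diff = abs(count_sums(A) - count_sums(B))
print(diff, "<=", 3 * delta * N**2)
```

The grid average computes the integral exactly because the integrand is a trigonometric polynomial of bounded degree; the printed inequality is the content of the proposition.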
The key to it is that there's a formula for the number of summing triples x, y, x + y in terms of the exponential sum. Proof. We have the formula that the number of pairs (x, y) for which x, y and x + y all lie in A is equal to N^3 times the integral from 0 to 1 of S_A(theta)^2 times the complex conjugate of S_A(theta), so essentially the cube of the exponential sum. Now where did I pull this formula from? Well, let me first remark that it's not hard to prove: it follows from the orthogonality relations. If you substitute in the definition of S_A(theta), it is just a consequence of the orthogonality relation that the integral from 0 to 1 of e^(2 pi i theta m) d theta is 1 if m = 0 and 0 if m is a nonzero integer. It's a fun exercise to expand that out and check it. There are more conceptual ways to arrive at the formula: really what's going on is that the number of solutions to x + y = z in A is given by a convolution of 1_A with itself, and as probably most people know, convolution becomes multiplication when you pass to the Fourier transform; that's what has happened here. Similarly for B. We also note that the integral of the square of this exponential sum, again by the orthogonality relations, is just the size of A divided by N^2: the integral from 0 to 1 of |S_A(theta)|^2 d theta equals |A|/N^2, which is at most 1/N. This is just Parseval's identity, and again similarly for B. The normalizations are chosen so that the exponential sum is always bounded by one in modulus.
Now note a simple inequality for complex numbers. The quantity z^2 z-bar - w^2 w-bar can be expanded as z^2 (z-bar - w-bar) + w-bar (z - w)(z + w), so its modulus is at most |z - w| (|z|^2 + |z||w| + |w|^2), which is at most (3/2) |z - w| (|z|^2 + |w|^2), using |z||w| <= (|z|^2 + |w|^2)/2. (The fact that |z| and |w| are at most one is actually irrelevant here: this holds for any complex z and w.) Now apply this with the formula above. The difference between the two counts is N^3 times the integral from 0 to 1 of S_A(theta)^2 times the conjugate of S_A(theta), minus the same expression with B. By the inequality, this is bounded by (3/2) N^3 times the supremum over theta of |S_A(theta) - S_B(theta)| times the integral from 0 to 1 of |S_A(theta)|^2 + |S_B(theta)|^2 d theta. By Parseval, that last integral is (|A| + |B|)/N^2, which is at most 2/N. So altogether the difference is at most (3/2) N^3 times delta times 2/N, which is 3 delta N^2, as claimed.
Okay, so basically there's a routine argument for deducing this statement just using the Fourier transform: all you need is the orthogonality relation, which gives you a formula for the expression we want, together with Parseval's identity. Right, now the main point. That was supposed to be a classical exercise, and the main point is that this fails for more exotic configurations. So here's a proposition, a negative proposition, about four-term progressions. There exist sets A and B inside {1, ..., N}, both of size about delta N, with the property that their exponential sums are close, so the sup over theta of |S_A(theta) - S_B(theta)| is o(1), really close, but they have quite different counts of four-term progressions: the number of pairs (x, d) with x, x + d, x + 2d and x + 3d all in A, minus the corresponding count for B, is at least some positive constant delta' times N^2 in absolute value. So this is saying, in a strong sense, that exponential sums are not a good enough tool to study four-term progressions.
We can have two sets whose exponential sums behave essentially the same, but which behave quite differently when you count four-term progressions. So exponential sums cannot handle four-term progressions, in the sense that a tiny perturbation of the exponential sum can produce a huge perturbation in the number of four-term progressions. And the example here motivates the whole theory in a way. The example is, I guess, due to Gowers, though in the very closely related context of ergodic theory it's due to Furstenberg and Weiss; I'm going to describe a slight variant of their example. I'll take A to be a random set of density delta. I should say this is going to be a sketch proof rather than a complete one. So, sketch proof. Let's fix delta. The reason it's only a sketch is that it's quite easy, I think, to convince you that something does go wrong, that exponential sums are not an efficient tool for handling four-term progressions, but it's actually kind of hard to rigorously prove that they are not, and somehow that's not an interesting thing to do: once we're convinced that they're not going to work, we don't need to rigorously prove they don't work, and I'm not going to. So take A to be a random set of density delta; by that I mean that we pick elements to lie in A independently with probability delta. I'm going to make some claims that I won't prove, though I'll say something about them later; let me number them. Claim (1): what's the exponential sum of A? Well, it's basically delta times the exponential sum of the whole interval {1, ..., N}, since I've selected each point n with probability delta, and with high probability this holds uniformly in theta. A rigorous justification of that is not totally trivial.
You could use something like the second moment method (exercise: use the second moment method or a large deviation bound). It's not totally trivial because there's a whole continuous range of thetas to worry about, infinitely many events; but if theta and theta' are close, they give essentially the same event, so you can reduce to a union bound over finitely many events. Anyway, that's a very believable fact. Now, what is B going to be? Take B to be the set of all n <= N for which the fractional part of n^2 sqrt(2) lies between 0 and delta. Here {t} denotes the fractional part of t, and there's nothing really special about sqrt(2): you just want to choose something that's not close to a rational, and all we use about sqrt(2) is its poor approximation by rationals. Claim (2) is the same claim as I made for the random set: the exponential sum of B is more or less the same as for a random set, that is, S_B(theta) is also delta times the exponential sum of the whole interval, plus o(1), uniformly in theta. (By the way, I've just realized that I've lapsed into a standard analytic number theorist's notation that may not be familiar to everybody: e(t) means e^(2 pi i t).) Now, I suppose that claim is natural: if you draw this set B, it looks a bit like a random set. If I just wrote down on the board a random 100 elements and this set, I defy you to see the difference between them. But the proof of this statement is a bit delicate. What you would need is Weyl's inequality for quadratic exponential sums, an inequality for averages of the form e(alpha n^2 + beta n). It's a standard sort of thing in analytic number theory, and the basic fact is that if alpha is highly irrational then this average is bounded quite non-trivially, the trivial bound being 1.
But you need a bit more than that as well: you'd need to smooth out the interval [0, delta] and expand its characteristic function as a Fourier series. In short, you'd need a bit of technique to rigorously prove this: Weyl's inequality plus a smooth approximation to the characteristic function of [0, delta]. So it's a tricky exercise in pretty standard analytic number theory techniques. Okay, so the exponential sums of A and B are essentially the same. Hence (1) plus (2) imply that the sup over theta of |S_A(theta) - S_B(theta)| is o(1): the exponential sums are close. Now what about the four-term arithmetic progressions? How many four-term progressions does A have? This is a relatively easy calculation. The number of four-term progressions between 1 and N, that is, the number of pairs (x, d) for which x, x + d, x + 2d and x + 3d are all at most N, is roughly N^2/6; that's an easy exercise, essentially just integrating over a two-dimensional domain. And because A is random, once you fix one of those progressions (provided its elements are distinct, which we may assume), it lies inside A with probability delta^4. So let me formally state that as claim (3): the number of four-term progressions in A is asymptotically (delta^4/6) N^2, with high probability. Again, that's something I haven't actually proved. The argument I just gave computes the expected number of four-term progressions in A, which is not the same as showing that the count concentrates around its mean; but again, some second moment method would give this. So you can see why I'm sketching this. It's a collection of facts that are all very believable, but proving them is hard work, and it would be somehow pointless, because it's all negative: it's all just a counterexample.
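The claims so far can be played out numerically (my own illustration, not part of the lecture; the choices N = 2000 and delta = 0.1 are arbitrary, and the sup over theta is only sampled on a grid, so this is a sketch):

```python
import cmath
import math
import random

N = 2000
delta = 0.1
random.seed(2)
# A: a random set of density delta.  B: the set {n : {n^2 sqrt 2} < delta}.
A = {n for n in range(1, N + 1) if random.random() < delta}
B = {n for n in range(1, N + 1) if (n * n * math.sqrt(2)) % 1 < delta}

def S(X, theta):
    """Normalized exponential sum S_X(theta)."""
    return sum(cmath.exp(2j * cmath.pi * theta * n) for n in X) / N

def count_4aps(X):
    """Number of pairs (x, d), d >= 1, with x, x+d, x+2d, x+3d all in X."""
    return sum(1 for x in X for d in range(1, (N - x) // 3 + 1)
               if x + d in X and x + 2 * d in X and x + 3 * d in X)

# Claims (1) and (2): the two exponential sums are uniformly close
# (here only sampled on a grid of 999 points).
grid_max = max(abs(S(A, j / 999) - S(B, j / 999)) for j in range(999))
print("max_theta |S_A - S_B| on grid:", grid_max)

# Claim (3) and the point of the example: the 4AP counts differ markedly.
print("4APs in A:", count_4aps(A))   # roughly delta^4 * N^2 / 6
print("4APs in B:", count_4aps(B))   # of order delta^3 * N^2
```

With these parameters, B should contain noticeably more four-term progressions than A, even though the sampled exponential sums agree to within a few hundredths everywhere.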
Let me write up here. Now we come to the interesting thing, which is the number of four-term progressions in B, and the reason this is interesting is that there's a certain identity. Observe the identity x^2 - 3(x + d)^2 + 3(x + 2d)^2 = (x + 3d)^2. It's a fairly natural identity. In light of it, let me define S to be the following set: S is the set of all n <= N for which 3 delta / 7 <= {n^2 sqrt(2)} <= 4 delta / 7. The seven here is 1 + 3 + 3, the sum of the absolute values of the coefficients in the identity. Note that S is contained inside B; it's a smaller set, which will typically have about one seventh the size of B. The identity tells you that if x, x + d and x + 2d all lie in S, then automatically x + 3d lies in B. This follows easily from the identity and the triangle inequality: {(x + 3d)^2 sqrt(2)} is the fractional part of x^2 sqrt(2) - 3(x + d)^2 sqrt(2) + 3(x + 2d)^2 sqrt(2), and if the three fractional parts each lie in [3 delta / 7, 4 delta / 7], then this combination lies in [0, delta]. Hence the number of four-term progressions in B is at least the number of three-term progressions in S. Now, how do we evaluate the number of three-term progressions in S? Again, I'll tell you how one would do that; it's a lengthy exercise. It turns out that S also has the same exponential sums as a random set of the same size, a random set of size delta N / 7 in fact, and to prove that you need the same argument again: Weyl's inequality, smooth approximation and so on, as in claim (2). But then we can appeal to a variant of the proposition I proved over here, which wasn't about three-term progressions but about the closely related pattern x, y, x + y.
That variant will tell you that the number of three-term progressions in S is pretty much the same as the number of three-term progressions in a random set of this size, by an argument similar to the previous proposition. When I was thinking about these lectures, it did occur to me that it might have been natural to deal with three-term progressions in that proposition; the reason I didn't is that I don't want to give the impression that I'm obsessed with arithmetic progressions, because these methods are more general than that. But anyway, a trivial variant of that proposition applies. It follows that the number of three-term progressions in S is roughly the number of three-term progressions in a random set of size delta N / 7, and then by an argument similar to claim (3), that is roughly (delta/7)^3 times N^2/4. The point now is that this behaves like delta cubed, whereas the count for A behaves like delta to the fourth, so at least if delta is small enough these will be quite different quantities. (I should have said: let delta be sufficiently small.) So the number of four-term progressions in B is at least of order delta^3 N^2, whereas in A it's essentially delta^4 N^2, and the result follows, as delta^3 / (4 times 7^3) is much bigger than delta^4 / 6 for small delta. If you wanted to inflict more punishment on yourself, you could even count precisely how many four-term progressions there are in B, with some more smooth cutoffs and applications of Weyl's inequality. Anyway, there are two main points I want to take from this. There's the fact that exponential sums are good for some problems but not for others. And then there's this little hint that something else is going on for these more complicated problems, something that is not a complete mess: B is very far from an arbitrary set.
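Both the identity and the containment step (three-term progressions in S give four-term progressions in B) are easy to check by machine. A sketch of my own, with arbitrary small parameters:

```python
import math

# The identity x^2 - 3(x+d)^2 + 3(x+2d)^2 = (x+3d)^2, checked on a range of integers.
for x in range(-20, 21):
    for d in range(-20, 21):
        assert x**2 - 3 * (x + d)**2 + 3 * (x + 2 * d)**2 == (x + 3 * d)**2

delta = 0.07
N = 30000

def frac(n):
    """Fractional part of n^2 * sqrt(2)."""
    return (n * n * math.sqrt(2)) % 1

# S: fractional part in [3*delta/7, 4*delta/7]; B: fractional part in [0, delta).
S_set = [n for n in range(1, N + 1) if 3 * delta / 7 <= frac(n) <= 4 * delta / 7]
in_S = set(S_set)

checked = 0
for x in S_set:
    for d in range(1, (N - x) // 3 + 1):
        if x + d in in_S and x + 2 * d in in_S:
            # x, x+d, x+2d in S forces x+3d into B (up to float rounding).
            assert frac(x + 3 * d) <= delta + 1e-6
            checked += 1
print("three-term progressions in S checked:", checked)
```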
So there's some kind of quadratic behaviour going on, rather than linear behaviour. Let me write down the key points. First, additive characters e^(2 pi i theta n) are good for some questions, but not all. Second, we saw a hint of some higher order behaviour, in this case quadratic. Any questions on anything so far? So now I'm going to tell you what a nilsequence is. This is going to be a little bit of a leap from what I've just said, and we'll only really see the connection after another lecture or so. I'm first going to tell you what an additive character is, again, but written in a strange way. An additive character e^(2 pi i theta n) can be written as phi(p(n)), where p: Z -> G is the linear map given by p(n) = theta n, G here is the real numbers R, and phi is a Gamma-periodic function on G, where Gamma is Z. What that means is that phi(x + gamma) = phi(x) for x in G and gamma in Gamma. So that's an additive character; we know those things quite well. And a nilsequence is just a generalization of this, in which I replace each of these objects by somewhat more general things. A nilsequence of class s is a generalization in which p: Z -> G is an arbitrary polynomial map (a definition to be given later); G is no longer an abelian group such as R, but an s-step nilpotent Lie group (again, I'll be recalling the definitions, and we'll also be assuming that it's simply connected; I'll give an example in just a moment); and phi is, well, perhaps slightly controversially I like to use the word automorphic instead of periodic once things are non-abelian, so phi is Gamma-automorphic, where Gamma contained in G is a lattice, and phi is complex-valued.
So here phi: G -> C is a complex-valued automorphic function, where Gamma is a lattice, and automorphic means that phi(gamma x) = phi(x) for gamma in Gamma. When the group is not necessarily abelian, we write the group operation multiplicatively, as usual. So you can see that this is a direct generalization, and the additive characters are a special case. My job is to convince you that, at least as far as we understand this phenomenon, these are the natural generalizations of the additive characters. These are going to be the basic objects of study. I will recall this definition a few times over the lectures, and also give some examples; it is, of course, a pretty crucial definition. I haven't said what a polynomial map is, and I haven't said, although probably people know, what a nilpotent Lie group is. So I want to give an example now. This is the example that I personally invariably end up playing with when I'm thinking about these things, because somehow the abelian case is misleading, and large nilpotent groups are unpleasant in general; you get into a real mess with commutators, and sometimes it's going to be quite distressing. So the Heisenberg example is what I'm going to talk about, and I'll just fill in the blanks in the definition above. Here we take G to be the group of 3 x 3 upper triangular matrices with 1's on the diagonal, with the usual matrix multiplication. This is a two-step nilpotent, simply connected Lie group. Two-step nilpotent means that commutators of order three vanish: [[x, y], z] is equal to the identity for all x, y, z in G, where (it's to some extent arbitrary which way you define your commutators) I'll take [x, y] = x y x^{-1} y^{-1}. Let me take Gamma to be the standard lattice, the matrices of this shape with integer entries. Then all I need to define a nilsequence coming from this is an example of a polynomial map, and I can give one without going into the general definition.
So p(n) = [[1, alpha n, 0], [0, 1, beta n], [0, 0, 1]] and p(n) = [[1, alpha n, gamma n^2], [0, 1, beta n], [0, 0, 1]] are both polynomial maps. Again, I'll go into the general definition later, but the key point is that all of the terms of degree d lie in the d-th subgroup of the lower central series, or lower. So you couldn't have a quadratic term just above the diagonal here, but the top right entry lies in the commutator subgroup, so a quadratic term is allowed there. Now, how do I define an automorphic function on G? There are several ways of constructing automorphic functions phi: G -> C. One example is a very classical kind of construction: phi(g) is the sum over gamma in Gamma of psi(gamma g). You just take a fixed function psi, compactly supported on G let's say, and translate it around by Gamma. I should say that I want my automorphic functions to be smooth, C-infinity; that's important. So you could choose your favourite compactly supported smooth function on G and translate it like this to get an automorphic function. In the notes I have a calculation where I show that by choosing psi judiciously you actually get the classical Jacobi theta function this way. I'm not going to do that calculation here, because it's not only irrelevant to the theory but, I think, a red herring (I don't know what the French word for that is): it's a connection to a different part of mathematics that seems to be quite specific to this particular G. So theta functions, for example the Jacobi theta function, arise in this way; the theta function itself is not an automorphic function, but you get it this way. And then you can also just proceed totally by hand, which somehow I feel gives a bit more intuition into certain aspects of this. So, second method, a sort of brute force method: you just define any function you like on a fundamental domain for the action of Gamma on G.
So let f be any smooth function on a fundamental domain for Gamma\G, and then define phi(g) to be f(pi(g)), where pi is the natural map onto the fundamental domain. I'm going to ignore what happens at the edge of the fundamental domain just for this discussion, so let's pretend that this makes sense almost everywhere. You can do a computation here to get some sense of the flavour of these automorphic functions, and I want to do that computation briefly. Any questions on that? So let's pick a natural fundamental domain and figure out what that map pi is. Let g be the matrix [[1, x, z], [0, 1, y], [0, 0, 1]], and let's choose gamma = [[1, k, m], [0, 1, l], [0, 0, 1]] in the lattice Gamma such that gamma g has all its off-diagonal entries between 0 and 1. So the set of all matrices with those entries in [0, 1) is a fundamental domain for Gamma\G, and let's see exactly how we can do that. Indeed, gamma g is [[1, k + x, m + z + k y], [0, 1, l + y], [0, 0, 1]]. So what you can do is choose k to be minus the integer part of x, l to be minus the integer part of y, and then m to be minus the integer part of z + k y, which is z - [x] y, where [x] denotes the integer part. So that tells me what the map pi is: pi([[1, x, z], [0, 1, y], [0, 0, 1]]) = [[1, {x}, {z - [x] y}], [0, 1, {y}], [0, 0, 1]], where {.} denotes the fractional part. So that's reduction to the fundamental domain. Now let's take any smooth function we like on the fundamental domain; again I'm going to ignore edge effects. Take f([[1, x, z], [0, 1, y], [0, 0, 1]]) to be e^(2 pi i z). It's smooth on the interior of the fundamental domain, but there are some issues about what happens at the edge.
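The reduction map pi, and the two-step nilpotency of the Heisenberg group, can be checked in a few lines of code, writing (x, y, z) for the matrix [[1, x, z], [0, 1, y], [0, 0, 1]] (my own sketch, not part of the lecture):

```python
import math

def mul(g, h):
    """Heisenberg product: (x1,y1,z1)*(x2,y2,z2) = (x1+x2, y1+y2, z1+z2+x1*y2)."""
    (x1, y1, z1), (x2, y2, z2) = g, h
    return (x1 + x2, y1 + y2, z1 + z2 + x1 * y2)

def inv(g):
    """Inverse: mul(g, inv(g)) is the identity (0, 0, 0)."""
    x, y, z = g
    return (-x, -y, -z + x * y)

def pi(g):
    """Left-multiply by a lattice element so all entries land in [0, 1)."""
    x, y, z = g
    k, l = -math.floor(x), -math.floor(y)
    m = -math.floor(z + k * y)          # z + k*y = z - floor(x)*y
    return mul((k, l, m), g)

g = (3.7, -1.2, 2.9)
assert all(0 <= t < 1 for t in pi(g))
# pi agrees with the closed formula ({x}, {y}, {z - floor(x)*y}):
x, y, z = g
closed = (x % 1, y % 1, (z - math.floor(x) * y) % 1)
assert all(abs(a - b) < 1e-9 for a, b in zip(pi(g), closed))

# Two-step nilpotency: commutators land in the center (0, 0, *),
# so commutators of commutators vanish.
def comm(g, h):
    return mul(mul(g, h), mul(inv(g), inv(h)))

h = (0.5, 2.0, -1.0)
c = comm(g, h)                          # equals (0, 0, x_g*y_h - x_h*y_g)
assert abs(c[0]) < 1e-9 and abs(c[1]) < 1e-9
assert all(abs(t) < 1e-9 for t in comm(c, h))   # [[g, h], h] = identity
```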
And then we get an automorphic function, a Gamma-automorphic function phi from G to C, given by phi of the matrix 1, 1, 1, x, y, z equals e to the 2 pi i times, bracket, z minus y times the integer part of x. It's quite a fun exercise to verify the Gamma-invariance of this just by hand. The problem is that it's not quite a smooth function, but let me just ignore that. So it's not quite legitimate, as phi is in places discontinuous. They're mild discontinuities, not bad ones, but they are discontinuities. So that's what an automorphic function kind of looks like. And then to get a nilsequence, well, I could just apply that automorphic function to a polynomial. So what would be some examples? So e of minus beta n times the integer part of alpha n, and e of gamma n squared minus beta n times the integer part of alpha n, are examples of nilsequences. So that's very instructive, because you can see that we actually are seeing some higher order behavior here. So there's a quadratic phase here. But the key point is that there's also slightly more general behavior, involving what we call bracket quadratics. And that's important in the theory. So, any questions on that computation, apart from how I'm going to continue to write after all this? Okay, I'm going to stop quite soon, I think, just on physical grounds. So maybe I'll just write this on this board. Here's a key point: we get quadratic phases and slightly more general bracket quadratic phases. So you might think, and I used to think, why would one think in this slightly abstract way about nilpotent groups and lattices and automorphic functions when you could just write this down? This is a perfectly concrete thing. But I've now come to totally the opposite point of view, which is that if you ever see a thing like this, you should interpret it as coming from a nilpotent Lie group. Otherwise you get into all sorts of trouble. So I might show you some examples of the kind of trouble that can arise a little bit later on.
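Both the Gamma-invariance exercise and the bracket quadratic examples are easy to check numerically. A small sketch (mine; the irrational values of alpha, beta, gamma are arbitrary choices for illustration):

```python
import cmath
import math
import random

def mul(a, b):
    # Heisenberg multiplication in coordinates (x, y, z) of [[1, x, z], [0, 1, y], [0, 0, 1]]
    (x1, y1, z1), (x2, y2, z2) = a, b
    return (x1 + x2, y1 + y2, z1 + z2 + x1 * y2)

def e(t):
    # e(t) = exp(2 pi i t)
    return cmath.exp(2j * math.pi * t)

def phi(g):
    # the automorphic function e(z - y * floor(x)) from the lecture
    x, y, z = g
    return e(z - y * math.floor(x))

random.seed(1)
# Gamma-invariance: phi(gamma * g) = phi(g) for any integer lattice element gamma
for _ in range(500):
    g = tuple(random.uniform(-5.0, 5.0) for _ in range(3))
    gamma = tuple(random.randint(-5, 5) for _ in range(3))
    assert abs(phi(mul(gamma, g)) - phi(g)) < 1e-6

# the nilsequence: phi along the polynomial n -> (alpha*n, beta*n, gam*n^2)
# gives the bracket quadratic phase e(gam*n^2 - beta*n*floor(alpha*n))
alpha, beta, gam = 2 ** 0.5, 3 ** 0.5, 5 ** 0.5
for n in range(20):
    value = phi((alpha * n, beta * n, gam * n * n))
    bracket = e(gam * n * n - beta * n * math.floor(alpha * n))
    assert abs(value - bracket) < 1e-9
```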
Okay, so to finish, I want to just quickly introduce something called the inverse conjecture for the Gowers norms. The inverse conjecture for the Gowers norms is the statement that says these nilsequences are the higher order characters. So in other words, given any kind of linear problem, like finding four-term progressions, it's enough to work with these generalizations of the additive characters. So it's a sort of sufficiency theorem. And one of my aims, not in this lecture but tomorrow, is to just state the thing properly. It's actually a surprisingly difficult thing to really state properly, and I don't think I've ever done so in a lecture; in fact I don't think anybody has ever stated it properly in a lecture, so for anyone who hangs around tomorrow, it would be a world first. I'm certainly not going to be able to prove it in this course of lectures; it has a nightmarishly difficult proof. So there are these things called the Gowers norms. The Gowers norms are a family of norms on functions which control essentially all linear expressions. So there is a family of norms, which we call the Gowers U2 norm, the Gowers U3 norm, and then there's a U4 norm and higher, on functions f from the interval 1 up to N to the complexes. This is another piece of shorthand notation, I don't know if I've mentioned it yet: bracket N is the same thing as the interval of integers from 1 up to N. So there's a family of norms; you can measure the size of a function with respect to them, and they control quite general linear expressions. And the way in which they do that is via statements known as generalized von Neumann theorems. So I'll tell you two generalized von Neumann theorems, such as this one: T_sum of f1, f2, f3 is bounded by a constant times the Gowers U2 norm of any one of the fi's, where T_sum counts solutions to x plus y equals z weighted by those functions. So it's something like 1 over N squared times the sum over x and y of f1 of x, f2 of y, f3 of x plus y.
Provided that x plus y is where it should be, not outside? Oh, yes. Let's just say that f3 is only defined up to N. And then another such statement would be: T_4AP of f1, f2, f3, f4 is bounded by the Gowers U3 norm. So you'll notice there's a non-homogeneity here. So actually I missed an assumption: these fi's are supposed to be bounded, so fi of x is bounded by 1, always. So here's a different such statement, where naturally T_4AP of f1, f2, f3, f4 is equal to the average over x and d of f1 of x times f2 of x plus d times f3 of x plus 2d times f4 of x plus 3d; again, the fi's are bounded. Well, I can't prove those statements, for the simple reason that I haven't told you what the Gowers norms are; that would have to be a prerequisite. So here I'll tell you what the first two Gowers norms are, and you can probably extrapolate the definition. I'm going to be going over this again later anyway. So the U2 norm of f is the average, over all x, h1 and h2, of f of x, times the conjugate of f of x plus h1, times the conjugate of f of x plus h2, times f of x plus h1 plus h2, all of that to the power one quarter. The average is over all choices of x, h1 and h2 for which all four of these points lie between 1 and N. As I said, I'm going to come back to this later, so I won't dwell on that for now. And then the U3 norm is the same thing, but generalized in the obvious way to three dimensions, h1, h2, h3, with eight shifted copies of f, all to the power one over eight. And then the sequence continues in a pretty natural way. But it's not obvious that they're norms; again, I may sketch the proof of that later. These generalized von Neumann theorems are actually not very deep. They're just applications of the Cauchy-Schwarz inequality, although slightly painful ones. So these follow from the Cauchy-Schwarz inequality. I probably will show you one example of that later on, just so that you believe me.
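These definitions are easy to experiment with numerically if we sidestep the edge effects by working on the cyclic group Z_N rather than on the interval 1 up to N; on Z_N the generalized von Neumann bound for T_sum holds with constant 1 (a standard fact, via Cauchy-Schwarz or Fourier analysis). A sketch with my own names, not code from the lecture:

```python
import cmath
import math
import random

N = 16
random.seed(0)

def e(t):
    return cmath.exp(2j * math.pi * t)

def u2_norm(f):
    # ||f||_{U2}^4 = average over x, h1, h2 of
    #   f(x) * conj f(x+h1) * conj f(x+h2) * f(x+h1+h2), everything mod N
    total = 0.0
    for x in range(N):
        for h1 in range(N):
            for h2 in range(N):
                term = (f[x] * f[(x + h1) % N].conjugate()
                        * f[(x + h2) % N].conjugate() * f[(x + h1 + h2) % N])
                total += term.real
    return (total / N ** 3) ** 0.25

def t_sum(f1, f2, f3):
    # T_sum(f1, f2, f3) = average over x, y of f1(x) f2(y) f3(x + y)
    total = sum(f1[x] * f2[y] * f3[(x + y) % N] for x in range(N) for y in range(N))
    return total / N ** 2

# three random 1-bounded functions: the generalized von Neumann bound
# holds with any one of the three norms on the right-hand side
fs = [[e(random.random()) for _ in range(N)] for _ in range(3)]
for f in fs:
    assert abs(t_sum(*fs)) <= u2_norm(f) + 1e-9
```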
I wouldn't want to do it in general, but only because it's notationally nasty, not because it's conceptually difficult in any way. So once you have this definition of a Gowers norm and this notion of generalized von Neumann theorem, here's a strategy for counting arithmetic progressions in a set A, let's say four-term progressions in A. So A here is just some set that's given to you. It might be the set of primes; it might be some other set. How are we going to count four-term progressions in A? Well, what we want is T_4AP of 1_A, 1_A, 1_A, 1_A. But we can't just try to apply this theorem straight away, because we're not trying to show that this quantity is small. So what you do is guess a main term in some way. So split 1_A as a structured part, f_struct, plus what we call a pseudorandom part, f_rand, in a suitable way. There's a way of doing that, for example, that's specific to the prime numbers, which I may show you later on. And then T_4AP of 1_A, 1_A, 1_A, 1_A splits as a structured part, T_4AP of f_struct four times, plus 15 other terms, each involving at least one copy of f_rand. It's a quadrilinear form, so you end up with 16 terms when you expand, such as T_4AP of f_rand, f_rand, f_rand, f_rand. So with luck, you can just evaluate this structured term; it's supposed to be the main term. And then you could try to show that the other 15 terms are error terms, that they're small. And for that it's sensible to use this theorem here: all you need to do is show that the Gowers U3 norm of f_rand is small. So that motivates the question of how you show that. How would you show that the Gowers norm of a certain function is small? That's the question I'm going to talk about mainly next time. And it turns out that that is intimately linked with the notion of nilsequences.
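The 16-term expansion is just quadrilinearity, and can be checked directly. Here is a toy sketch on Z_N (my own illustrative choice of decomposition: the structured part is the constant function at the density of A, so the main term is density to the fourth power):

```python
import itertools
import random

N = 31
random.seed(0)

def t_4ap(f1, f2, f3, f4):
    # average over x, d in Z_N of f1(x) f2(x+d) f3(x+2d) f4(x+3d)
    total = sum(f1[x] * f2[(x + d) % N] * f3[(x + 2 * d) % N] * f4[(x + 3 * d) % N]
                for x in range(N) for d in range(N))
    return total / N ** 2

one_A = [1.0 if random.random() < 0.5 else 0.0 for _ in range(N)]
alpha = sum(one_A) / N
f_struct = [alpha] * N                  # structured part: the density of A
f_rand = [a - alpha for a in one_A]     # pseudorandom part, mean zero

# the quadrilinear form expands into 16 terms, one per choice of struct/rand per slot
parts = (f_struct, f_rand)
terms = [t_4ap(*[parts[c] for c in choice])
         for choice in itertools.product((0, 1), repeat=4)]
main_term = t_4ap(f_struct, f_struct, f_struct, f_struct)

assert abs(sum(terms) - t_4ap(one_A, one_A, one_A, one_A)) < 1e-9
assert abs(main_term - alpha ** 4) < 1e-12
```

The strategy in the lecture is then exactly this: evaluate the main term, and bound the other 15 terms by the Gowers U3 norm of f_rand via the generalized von Neumann theorem.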
And in fact, the inverse conjecture for the Gowers norms states that a function has small Gowers norm, Gowers U3 norm, say, if and only if it's orthogonal to all two-step, that is class two, nilsequences. So next time I won't be talking about progressions much, but I will introduce the Gowers norms again, remind you what nilsequences are, and then talk at much greater length about the relation between the two. But I'm totally exhausted for now, and I'm sure you are too, so I suggest that we adjourn and go and watch Tim Gowers talk about logic, if you want.