This talk is about probabilistic degrees of symmetric Boolean functions; it is joint work with Srikanth and Utkarsh. The class of Boolean functions we are interested in is the symmetric Boolean functions. To start with the basics: the domain of our functions is the Boolean hypercube, and we use the Hamming weight function, where the Hamming weight of a vector is the number of ones in it. We can partition the hypercube into layers, one layer per weight, and symmetric functions are those that take exactly one value per layer. If you consider polynomial representations of symmetric Boolean functions, you get symmetric polynomials that are also multilinear, in the sense that all monomials in the representation are square-free.

Let us start with some examples. The majority function takes the value 1 when the input weight is at least n/2 and 0 otherwise; in the pictures, black indicates 0 and white indicates 1. The degree of majority as a polynomial is Ω(n). The OR function is slightly different: it takes the value 0 at the all-0s input and 1 everywhere else, and again, as we know, its degree is n. The MOD function is defined depending on the characteristic: we choose a prime q not equal to the characteristic of the field, so if the characteristic is 0 we pick any prime, and if it is positive we choose a prime different from it. MOD_q takes the value 1 when the input weight is 0 mod q and 0 otherwise, and its degree is again Ω(n). The threshold function is a more general definition that captures both majority and OR: fix a threshold parameter t; THR_t takes the value 1 whenever the input weight is at least t and 0 otherwise, and again the degree of the threshold function is Ω(n).

So these are some examples of symmetric Boolean functions whose degree is Ω(n), relatively high, and what we would like to do is approximate them by low-degree polynomials. That is where our notion of a probabilistic polynomial, or probabilistic degree, comes into the picture. If you think of polynomials as a model for computing Boolean functions, then a natural complexity measure is the degree; if you think of the randomized version, what you get is a probabilistic polynomial, a distribution over polynomials used to approximate the function up to some error, and the corresponding complexity measure is what we call the probabilistic degree, the maximum degree of the polynomials in the support. More formally, start with a Boolean function f and an error parameter ε. An ε-error probabilistic polynomial for f is a random polynomial: it need not follow the uniform distribution, it is some distribution over polynomials with finite support, such that at any given input, if you randomly pick a polynomial from the distribution, it agrees with the function with probability at least 1 − ε.
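To keep the examples concrete, here is a small Python rendering of the functions just defined, viewed through the Hamming weight. This is a minimal sketch of my own; the function names are illustrative, not notation from the talk.

```python
# Symmetric Boolean functions depend only on the Hamming weight |x|;
# each example below is computed from the weight alone.

def weight(x):
    """Hamming weight: the number of ones in the 0/1 vector x."""
    return sum(x)

def maj(x):
    """Majority: 1 iff |x| >= n/2."""
    return 1 if 2 * weight(x) >= len(x) else 0

def or_(x):
    """OR: 0 at the all-0s input, 1 everywhere else."""
    return 1 if weight(x) > 0 else 0

def mod_q(x, q):
    """MOD_q: 1 iff |x| is 0 mod q."""
    return 1 if weight(x) % q == 0 else 0

def thr(x, t):
    """Threshold THR_t: 1 iff |x| >= t; t = 1 gives OR, t = n/2 gives MAJ."""
    return 1 if weight(x) >= t else 0
```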
The degree of the random polynomial, the probabilistic polynomial, is defined to be the maximum degree of the polynomials in its support, and the ε-error probabilistic degree is defined, quite naturally, to be the minimum among the degrees of all probabilistic polynomials that approximate the function with error ε.

(In response to a question:) Yes, we fix a field. Since these are Boolean functions, we are only concerned with the base field, and we assume the characteristic is fixed throughout; it could be 0 or positive, but it is fixed. Also note that this is different from approximate agreement: it is not 1 − ε agreement on a fraction of points; it is that at every point you agree with probability 1 − ε.

Let us go on to some history. The definition was given by Razborov, and the initial works of Razborov and Smolensky led to lower bounds for Boolean circuits in the class AC0[p]. Then the works of Tarui and of Beigel, Reingold and Spielman led to probabilistic degree upper bounds for AC0. This in turn was used by Braverman to show that polylogarithmic independence fools AC0, settling the Linial-Nisan conjecture. More recently, the probabilistic degree of any symmetric Boolean function was shown to be O(√n) by Alman and Williams when the error is fixed to a constant, and this in turn was used to get a subquadratic-time algorithm for a version of the nearest neighbor problem. Symmetric Boolean functions are also interesting in the context of other complexity models: for example, AC0[p] circuits of quasi-polynomial size; the polynomial itself as the computational model, with the approximate degree over the reals as the complexity measure; and constant-depth perceptrons. So this is a fairly interesting class of Boolean functions, and an interesting definition of a complexity measure.

So what do we do here? Let us start by looking at how to approximate majority. Razborov and Smolensky gave a lower bound of Ω(√n) for constant error; throughout the talk we fix the error to be 0.01, that is, 1/100. And fairly recently, the Alman-Williams proof gave a matching upper bound of O(√n). This was a curious case of the upper bound following the lower bound. Alman and Williams did more: they showed that for any symmetric Boolean function the probabilistic degree is O(√n). Some other examples are as follows. Razborov also gave matching upper and lower bounds of O(1) for the OR function over finite fields. Over the reals we do have a gap: the known lower bound is roughly √(log n) and the upper bound is log n, and we do not yet know how to bridge this gap. And for the MOD_q function, if we fix distinct primes p and q, with p the characteristic, then again we have matching upper and lower bounds; this is again a case of the upper bound following the lower bound.

So what are we interested in in this talk? We would like such bounds for the more general class of symmetric Boolean functions. The question is as follows: given any symmetric Boolean function, can you find suitably interesting upper and lower bounds on the probabilistic degree, at constant error, as was done historically for majority and for MOD_q? This is the basic question for this talk.
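To illustrate both the definition and the constant bound for OR over a finite field, here is a sketch of the standard construction over F_2, usually attributed to Razborov: sample k random subsets S_1, ..., S_k of the coordinates and take p(x) = 1 − ∏_j (1 + Σ_{i ∈ S_j} x_i) mod 2. At the all-0s input p is always 0, and at any other input each factor vanishes independently with probability 1/2, so the error is 2^(−k) while the degree is only k. The Monte Carlo check below is my own, not from the talk.

```python
import random

def sample_or_poly(n, k):
    """Sample a degree-k probabilistic polynomial for OR over F_2:
    p(x) = 1 - prod_j (1 + <S_j, x>) mod 2 for k random subsets S_j."""
    subsets = [[random.randint(0, 1) for _ in range(n)] for _ in range(k)]
    def p(x):
        prod = 1
        for s in subsets:
            inner = sum(si * xi for si, xi in zip(s, x)) % 2
            prod = prod * (1 + inner) % 2
        return (1 - prod) % 2
    return p

# At every fixed input, agreement holds with probability >= 1 - 2^(-k).
n, k, trials = 20, 7, 5000
x = [1] + [0] * (n - 1)            # any nonzero input, so OR(x) = 1
agree = sum(sample_or_poly(n, k)(x) == 1 for _ in range(trials))
print(agree / trials, "expected at least", 1 - 2 ** -k)
```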
Given any symmetric Boolean function, can we find upper and lower bounds on the probabilistic degree for constant error? More importantly, do they match? Our answer is yes, up to polylog factors in n.

To get to the statement of our theorem, we consider a decomposition of Boolean functions. For any symmetric Boolean function g, let us first define the period in a very natural way. These are symmetric Boolean functions, so they can be thought of as functions of the Hamming weight, more or less univariate functions. The period of g is the smallest positive integer k such that g(i + k) = g(i) for all valid values of i. (There is a slight abuse of notation here: this is the Boolean function g viewed as a function of the weights, that is all.) The standard decomposition of f is defined as follows: look at the interval of weights [n/3, 2n/3], choose the function g of least period that agrees with f on this interval, and take h to be the mod-2 sum of f and g. The pair (g, h) is called the standard decomposition of f.

One example is as follows. If you take majority and restrict attention to the interval [n/3, 2n/3], then in this region the period is n/3; when you extend the pattern naturally to both sides you obtain g, and the mod-2 sum is h. The point to note here is that h is constant on the large middle region of the hypercube and possibly arbitrary on the periphery. Functions of this kind are what interest us; we call them k-bounded functions, meaning functions that are constant on the set of weights [k, n − k] and take arbitrary values on the periphery. It is quite possible that, given a k, there is also a k′ < k such that the function is k′-bounded, so we want the best value of k, which is what we denote by b(f): given a Boolean function f, b(f) is the smallest k such that f is k-bounded. Why are we interested in this quantity? Because we will be interested in b(h), where h comes from the standard decomposition of f.

So start with a symmetric Boolean function f over a field of characteristic p, where p is 0 or a fixed prime, and take the standard decomposition (g, h); h is b(h)-bounded, and it is not k-bounded for any k < b(h). That is the observation here. The upper bounds we get are as follows. If the period of g is positive and not a power of p (not being a power of p also covers the case of characteristic 0), then we get an upper bound of √n. If the period is a power of p, then the bound can possibly be lower than √n: it is the minimum of √n and the quantity per(g) + √(b(h)). One thing to note here is that we are only concerned with symmetric Boolean functions, and O(√n) is already an upper bound, given by Alman and Williams. So essentially the theorem says that if the period is not a power of p then we cannot do any better, and if it is a power of p then we can perhaps use the finite-characteristic property of the field and do slightly better. The second part of the theorem says that the lower bounds match up to polylog factors.
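The decomposition is easy to compute in the weight-function view. The sketch below is my own, under my reading of the definitions above (the paper's exact interval conventions may differ); a symmetric function on n variables is represented by its list of values at weights 0 through n.

```python
def standard_decomposition(f):
    """f: list of n+1 bits, f[w] = value at Hamming weight w.
    Returns (g, h): g extends, with the least period, the pattern of f
    on weights [n/3, 2n/3], and h is the mod-2 sum of f and g."""
    n = len(f) - 1
    lo, hi = n // 3, 2 * n // 3
    for k in range(1, hi - lo + 2):
        # least k such that f looks k-periodic on the middle interval
        if all(f[i] == f[i + k] for i in range(lo, hi - k + 1)):
            break
    # extend the middle pattern k-periodically to all weights 0..n
    g = [f[lo + (w - lo) % k] for w in range(n + 1)]
    h = [fw ^ gw for fw, gw in zip(f, g)]
    return g, h

def b(f):
    """b(f): the smallest k such that f is k-bounded, i.e. constant on
    the set of weights [k, n-k]."""
    n = len(f) - 1
    for k in range(n // 2 + 1):
        mid = f[k:n - k + 1]
        if all(v == mid[0] for v in mid):
            return k
    return n // 2 + 1

# Example: for majority on n = 24 bits, h = f xor g is constant on the
# middle weights, so it is (n/3)-bounded.
f = [1 if 2 * w >= 24 else 0 for w in range(25)]
g, h = standard_decomposition(f)
print(b(h) <= 24 // 3)   # True
```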
So, again: if the period is not a power of p, the lower bound is Ω(√n) up to polylog factors, and otherwise it matches the other expression; we are hiding the polylog factors, which is why we write a tilde.

Let us start with the lower bounds. Here we will use the notion of a restriction; it is more or less clear what it means, but let me define it for this talk. Start with the Boolean function f and take a partition of the set of variables into three parts N, Z and U. The restriction of f corresponding to this partition is defined as follows: N is the set of variables that are left unassigned, so they stay variables; Z is the set of variables assigned 0; and U is the set of variables assigned 1. For any such partition, the corresponding function on the smaller subcube is called a restriction of f.

Now let us start with the lower bound for thresholds and slowly build up to the final theorem. The statement is that the probabilistic degree of THR_t is Ω(√t), and the proof is more or less straightforward: look at the graph of the function and note that there is a subcube on which this threshold function is exactly majority. We restrict to that subcube, obtain the majority function on 2t variables, and therefore the probabilistic degree is Ω(√t).

How do we go ahead from here? Next is the lower bound for h, and this is where we use restrictions. Start with the standard decomposition and write k for b(h). There is a slightly technical definition: n′ is taken to be something smaller than n, namely n/6 + k/3, and the restrictions h_i that we are concerned with are obtained from h by assigning certain variables 1, assigning certain variables 0, and leaving the rest free. Now we can observe that the h_i, when you think of them as functions of the weight and write them down as row vectors, give you an upper triangular matrix, and then one can show that the threshold on n′ bits with threshold parameter k/3 is a linear combination of these h_i. As a result, we can lower bound the probabilistic degree of h by the probabilistic degree of this threshold, because what we have essentially done is reduce the threshold to h. The threshold parameter here is k/3, which is why we get Ω(√k), and we are done with this case.

More generally, how does it go? Start with an arbitrary symmetric Boolean function, let B denote the period of g, and let k be b(h). We look at cases. If B is not a power of p, the first case we consider, more or less the same set of tricks works: we can show that the MOD_q function, where q is chosen appropriately (in particular, q ≠ p), reduces to g, in the sense that MOD_q can be written as a low-degree polynomial in restrictions of g. It may not be a linear combination, but it is a low-degree polynomial; its degree is about log B.
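In the symmetric world, restrictions are easy to carry out at the level of the weight function: assigning u variables the value 1 and z variables the value 0 leaves a symmetric function on n − u − z variables whose value at weight w is the original value at weight w + u. Here is a small sketch of my own of the threshold-to-majority restriction used in the argument above, assuming t ≤ n/2.

```python
def restrict(f, u, z):
    """Restrict a symmetric function, given as its weight-value list
    f[0..n]: assign u variables to 1 and z variables to 0, leaving
    n - u - z variables free."""
    n = len(f) - 1
    m = n - u - z
    return [f[w + u] for w in range(m + 1)]

# THR_t on n variables, with n - 2t variables set to 0, becomes
# majority on the remaining 2t variables: 1 iff the weight is >= t.
n, t = 30, 9
thr_t = [1 if w >= t else 0 for w in range(n + 1)]
maj_2t = restrict(thr_t, u=0, z=n - 2 * t)
assert maj_2t == [1 if w >= t else 0 for w in range(2 * t + 1)]
```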
So MOD_q reduces, as a low-degree polynomial, to restrictions of g, and therefore the probabilistic degree of g is Ω(√n) up to polylog factors. If B is a power of p, then majority reduces to g, again as a low-degree polynomial in restrictions of g, and again we know that the probabilistic degree of majority is Ω(√n), so we are through. In the third case, the threshold reduces to h as a linear combination, as we saw just now, and we have the probabilistic degree lower bound for thresholds; therefore we have covered all the cases for g and h. (In response to a question on how the bounds for g and h combine, and whether writing f in terms of g and h depends on the field: the polynomial for f in terms of g and h is g + h − 2gh, so it is fine, it does not depend on the field.) Then we combine these bounds and get the desired lower bound for f. That covers the lower bound side.

So what do we do for the upper bounds? We essentially start with the Alman-Williams bound, whose more general version is this: given a symmetric Boolean function f and error ε, the probabilistic degree of f is O(√(n log(1/ε))). Start with the standard decomposition. If B is a power of p, then we write g as a polynomial in the elementary symmetric polynomials and conclude that the probabilistic degree of g is O(B). If B is not a power of p, we just use the Alman-Williams bound and do nothing more. On the other hand, h reduces to thresholds; it can be written as a sum of thresholds, and how we do this we will see in a moment. We know the probabilistic degree of THR_t is O(√t), so we get the probabilistic degree of h as O(√k), and combining both bounds gives the construction of a probabilistic polynomial for f. That finishes the theorem.

Just before wrapping up, a few more details. For thresholds, how do the upper bounds work? There is a more general expression for the threshold upper bound as a function of the error ε, and the proof more or less follows Alman-Williams: it is an inductive construction. Assume we have probabilistic polynomials for all n′ < n. Start with an input vector x and sample n/10 of its coordinates; by the induction hypothesis we readily have a probabilistic polynomial on n/10 variables for any error ε′ and any threshold value. If the Hamming weight of x is much larger than t, then even after sampling, with high probability the Hamming weight of the sample x̂ is on the correct side of t/10, so the lower-order probabilistic polynomial outputs the correct value; we therefore invoke the (ε/4)-error probabilistic polynomial given by the induction hypothesis and output its value. In the other case, when the Hamming weight of x is greater than t but not too far above it, there is the additional possibility that x̂ jumps sides, so the probabilistic polynomial applied to x̂ might give us the value 0 instead of 1. What do we do in that case? Well, the Hamming weight is then still in the interval from t to t + √(t log(1/ε)), and this is the kind of bound we are looking for, so we just ignore x̂, use an exact polynomial interpolating that interval, and output its value.
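To see why the sampling step is safe in the first case, here is a small Monte Carlo sketch, my own and with illustrative parameters: when the weight of x exceeds t by an additive slack on the order of √(t log(1/ε)), the sampled weight lands above t/10 with high probability (by a Chernoff bound), so the inner probabilistic polynomial on n/10 variables already answers correctly.

```python
import random

def sample_coords(x, rate=0.1):
    """Keep each coordinate of x independently with probability `rate`,
    so that about n/10 coordinates survive."""
    return [xi for xi in x if random.random() < rate]

# If |x| >= t + slack with slack of order sqrt(t log(1/eps)), then the
# sampled weight exceeds t/10 with high probability.
n, t = 10000, 2000
slack = int(10 * t ** 0.5)          # generous constant, for illustration
x = [1] * (t + slack) + [0] * (n - t - slack)
trials = 1000
good = sum(sum(sample_coords(x)) > t / 10 for _ in range(trials))
print(good / trials)                # should be close to 1
```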
The only change from the Alman-Williams proof is essentially the base case. Alman and Williams only had to consider ε < 2^(−n), in which case you just use the exact polynomial and get degree n; in our case there is the additional case ε < 2^(−t), where we essentially reduce the problem to OR and use its probabilistic polynomial.

So, in summary, what we do is this: we want to obtain low-degree approximations of symmetric Boolean functions in a probabilistic sense, via probabilistic polynomials. We make use of the standard decomposition, obtain bounds on the components of the decomposition, and combine them; that is how the proof goes. For the upper bounds, the construction is inspired by Alman and Williams: new in the base case, and the same in the inductive case. For the lower bounds, we use restrictions and reductions to functions whose probabilistic degree lower bounds are already known. That is our strategy.

Question: the bound you use for the probabilistic degree of OR is not tight; is that the only place the log factor comes in, or are there other places? Answer: at two places. The log factor comes in because of the non-tightness of the OR bound, and also in the construction: we write the function as a polynomial in restrictions, and although that polynomial is low-degree, its degree is still about log B, so an additional log factor comes in from there. That one is from the construction, so I cannot say it has to be there, but it is there by this proof. Question: and do you also have to work with a different ε, even though throughout you are looking at constant ε? Answer: correct, we scale down the error by a factor of 1/4. All of it works as long as we do not scale down too much, and that is essentially the idea behind the Alman-Williams construction.