I'm extremely happy to finally have Professor Toni Pitassi among us. Toni did her PhD at the University of Toronto. Then she was a faculty member at the University of Arizona and at the University of Pittsburgh before she moved back to the University of Toronto in 2001. Toni has made deep contributions to computational complexity theory in general, and in particular to the theory of concrete lower bounds in many areas, including proof complexity, Boolean circuit complexity, communication complexity, differential privacy, and so on. She has been very widely recognized as an expert in these areas: she was elected an ACM Fellow in 2018, and she was also an invited speaker at the International Congress of Mathematicians in 2018. So without much further ado, I'll invite Toni to talk.

Thank you, Arkadev. It's really great to be here; I've been wanting to come for a long time, so it's great to be here. Okay. So I'm going to give a mostly survey talk about lower bounds in communication complexity using a technique called lifting. I'm happy to go slowly if you have any questions. I'm going to start by explaining the technique, then give some applications of the lower bounds, and then I'll use the remaining time to give some ideas of how to prove a lifting theorem.

Most of the stuff I'll be talking about, at least for the first, I don't know, forty minutes, is joint work with a bunch of people, but mostly with Mika Göös, who was a PhD student of mine and graduated a few years ago; Tom Watson, who was a postdoc; and Robert Robere, who was also a student of mine and graduated two years ago. More or less this is a program to prove lower bounds for stronger and stronger computational models, and I'll tell you where we get stuck. Then I'm also going to talk about a new lower bound result that was obtained at the Simons Institute; that's where the picture up there is from. It was actually a lot of fun: I got to work with Arkadev, who I hadn't had a chance to work with for a while. He was a postdoc in Toronto and we spent a lot of time together working on lots of fun things, so it was great to have a chance to work with him again, and with Yuval Filmus, who was a previous student of mine, and Sajin Koroth and Or Meir. It was a lot of fun working on this result, so I'll tell you about it as well.

Okay, so what I want to tell you about is basically this lower bound program, where on the left you have the concrete model of computation that you're trying to prove the lower bound for; I'm calling that "algorithms". In the middle is communication: we're going to convert the lower bound question in the concrete model into a question in communication complexity, so we want to convert it into a question about an information bottleneck. Then, to solve that problem, I want to use what I'm calling a lifting theorem. It's a way of taking a simple decision tree or query-type lower bound, a lower bound in a simple computational model, and converting it sort of automatically, via the lifting theorem, into a lower bound for an associated communication problem. So all of our work is between the pink and the green, the general lifting theorem; for going from algorithms to communication I'll mostly use other people's results, which I'll tell you about.
Okay, so far we've been able to re-prove all sorts of results using this technique. I'll tell you how to re-prove monotone formula and circuit lower bounds using this method, and also explain how to use it to get lower bounds for monotone span programs, and then if there's time maybe I'll tell you a little bit about how to use it to get lower bounds for extended formulations of linear programs and for pseudo-deterministic models of computation. There are other applications of lifting, but I won't talk about those today. I actually first got interested in it when Arkadev was in Toronto, motivated by an application in proof complexity, and I didn't realize back then how general a tool it would be for solving lots of problems.

Okay, I'm sure most people are familiar with communication complexity; if not, I'll tell you what it is. It's a wonderful model of computation that was defined by Yao in '79, and there are lots of variants of it, but the vanilla version is: you have two players, Alice and Bob, who each hold, say, an n-bit Boolean string, and they're trying to compute some joint function f of their inputs. Typically, in applications until the last ten years or so, the function f was a Boolean function; I'm going to be much more interested in functions that are not Boolean, and also in search problems and partial functions. Anyway, they want to compute this joint function f of their inputs, so they agree on a protocol where they send bits back and forth, and at the end of the protocol they should both know the value of the function. The communication complexity of the function on inputs of length n each is the minimum, over all protocols pi, of the number of bits they have to send back and forth in the worst case over all input pairs. Trivially they can solve any problem by one of the two players just giving up and sending the other player their entire n-bit string: if I'm Alice, I just send Bob my whole string and make him do all the work, and that's an n-bit protocol. So what we're interested in is problems where there's a protocol of cost more like constant or logarithmic; everything is scaled down, so "efficient" means polylog in n instead of polynomial in n.

And just as for complexity classes, you can define many flavors: deterministic, randomized, nondeterministic, and so on; all the same classes live in the communication world. The two I'll mostly focus on today are the deterministic setting, where the protocol is deterministic and has to give the right answer all the time, with probability 1, and the randomized setting, where the players are allowed to flip coins and have to output the right answer with, say, probability at least two thirds over their coin flips. Here's an example: the equality function, where they want to know if their strings are equal. This can be solved quite easily by a randomized protocol with a constant number of bits, if it's a public-coin protocol, but the deterministic complexity is maximal: it's linear, so one player basically has to send the entire string to the other player in order for them to determine the value deterministically. So again, I'll be interested mostly in search problems, total search problems.
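As an aside, here's a minimal sketch of that constant-cost public-coin protocol for equality; this is the standard random-inner-product construction, and the code and parameter names are my own illustration, not from the talk.

```python
import random

def equality_protocol(x, y, reps=2):
    """Public-coin randomized protocol for EQUALITY (standard textbook idea,
    not from the talk). Alice and Bob share a random string r; each sends the
    inner product of their input with r mod 2. If x != y, a random r catches
    the difference with probability 1/2 per round, so `reps` rounds give
    one-sided error 2^-reps (reps=2 already beats the 1/3 error bar)."""
    n = len(x)
    for _ in range(reps):
        r = [random.randint(0, 1) for _ in range(n)]          # shared public coins
        alice_bit = sum(xi * ri for xi, ri in zip(x, r)) % 2  # 1 bit from Alice
        bob_bit = sum(yi * ri for yi, ri in zip(y, r)) % 2    # 1 bit from Bob
        if alice_bit != bob_bit:
            return False  # definitely not equal
    return True  # equal with high probability

# Communication is 2*reps bits, independent of n.
print(equality_protocol([0, 1, 1, 0], [0, 1, 1, 0]))  # True
print(equality_protocol([0, 1, 1, 0], [0, 1, 0, 0]))  # False with prob. 3/4
```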
So for example, here are two total search problems that I'll come back to repeatedly. The first is the Karchmer-Wigderson search problem associated with a Boolean function. Say you have a Boolean function f. Alice gets a 1-input of the function, Bob gets a 0-input, and they want to find a coordinate i where their inputs differ. Since f is a well-defined function, it's a total search problem: there's always at least one coordinate where they differ, since Alice has a 1-input and Bob has a 0-input, so their job is to find such a bit. The other problem that I'll come back to all the time, and it's going to turn out that these two problems are actually the same problem, is the total search problem associated with an unsatisfiable CNF formula. So there's a fixed unsatisfiable conjunctive normal form formula, such as this one down here, and the only thing you have to know about it is that it's unsatisfiable. You somehow partition the variables into two halves, giving, say, the x part to Alice and the y part to Bob; together they hold an assignment to all the variables, and they want to find a violated clause, a clause that's falsified by their joint assignment. Again, it's total because the formula is unsatisfiable, so there's always at least one clause that's false.

Just to give you a flavor: eventually I'm going to be more or less trying to communicate to you that communication complexity and its generalizations are really about norms, about matrix operations on matrices, so to a large extent it's really the study of linear algebra. So I'm going to tell you about the communication matrix associated with a search problem. I didn't label it, but the rows are labeled with all of Alice's possible inputs: she has a string of length n, so there are 2^n rows corresponding to all her inputs, and the columns are labeled with Bob's 2^n possible inputs. Since it's a total search problem, each entry is labeled with all the legal answers on that pair of inputs. In this picture there are five legal answers, one through five, with different colors, and you can see that the matrix gets covered by these colors; it's not always this pretty, but each entry has at least one color associated with it, because it's a total search problem, and some entries can have more than one color.

In a deterministic protocol, if Alice speaks first, her message partitions the rows into two halves, not necessarily as nicely as in the picture, but it's hard to draw otherwise; then when Bob communicates, that partitions each of those two halves, now column-wise, into two pieces, and so on. When they're finished with the protocol, what should have happened is that the whole matrix has been partitioned into subrectangles, and each subrectangle has to be covered by a single color, because at that point they have to know an answer: there has to be at least one color that covers every entry of the subrectangle. So this is a picture of an example. Does that make sense? This is already showing you some kind of rank-like measure, because in some sense you're partitioning the matrix into rank-one submatrices.
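Going back to the CNF search problem for a second, here it is as a small sketch; the clause encoding is my own convention, not something from the talk.

```python
def find_falsified_clause(cnf, assignment):
    """The search problem associated with an unsatisfiable CNF: given a total
    assignment to the variables, return the index of some falsified clause.
    Encoding is my own convention: a clause is a list of nonzero ints, +i for
    variable z_i and -i for its negation (variables are 1-indexed)."""
    for idx, clause in enumerate(cnf):
        satisfied = any(
            assignment[abs(lit) - 1] == (1 if lit > 0 else 0) for lit in clause
        )
        if not satisfied:
            return idx  # totality: an unsatisfiable CNF always has one
    return None  # unreachable if cnf is truly unsatisfiable

# Example: (z1) AND (NOT z1) is unsatisfiable; every assignment falsifies a clause.
print(find_falsified_clause([[1], [-1]], [0]))  # 0: clause (z1) is falsified
```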
Okay, so this technique of lifting that I referred to before is what I'm going to be spending my time on. The idea is to start with a function f. I wrote it as a Boolean function, but again, for the applications it's typically not going to be a Boolean function; it's going to be a total search problem, on n bits. What I want to do is understand f in terms of a simple query measure, which in the deterministic case is deterministic decision tree complexity: how many bits of the input you have to know in order to know the value of the function. Does everybody know what decision tree complexity is? Nobody's saying. Should I tell you?

Okay, so a decision tree over the variables x1 through xn is a tree whose vertices are labeled with variables, each with two outgoing edges labeled 0 and 1; maybe this one is labeled with x2, maybe this one with x4, maybe this one with x6. At the leaves you put the answers: in the previous slide the answers could be one through five; if the function is Boolean the answers are 0 and 1. This computes a function in the obvious way: given an input, there's a unique root-to-leaf path consistent with the input. For example, if the input is all zeros, then this would be the path that's followed, and the leaf's label would be the value output. The complexity measure of a particular decision tree is its height, and for a given function f you look at the best decision tree. The height is telling you, in the worst case over all inputs, how many bits you have to query in order to know the value of the function, so it's probably the simplest model of computation that we know of. And just as there are variants of communication complexity, circuit complexity, and uniform complexity classes, you have the same thing for query classes: there's deterministic, there's randomized, and so on. These are very well studied classes, and it's usually quite easy to prove lower bounds on decision tree complexity and its variants.

So the idea here is to take a function f where we completely understand its decision tree complexity, and then compose f with a gadget g. Think of g as a very small function, hopefully on a constant number of bits or a polylogarithmic number of bits. The gadget g typically takes two inputs, an x and a y, where the x part will be given to Alice and the y part to Bob. So the composed function has inputs x1 through xn, which go to Alice, and y1 through yn, which go to Bob, and we go from the query complexity of f to the communication complexity of the composed function. What we would like to show is that there's some relatively simple gadget g with the property that, no matter what f is, the communication complexity of the composed function is pretty much exactly the query complexity of f.
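Here's the decision tree model from a minute ago as a small sketch; the tuple encoding is my own, not from the talk.

```python
def eval_decision_tree(tree, z):
    """Evaluate a decision tree on input z. Encoding is my own: a leaf is
    ('leaf', answer); an internal node is ('query', i, left, right), meaning
    query z_i and follow `left` if z_i == 0, `right` if z_i == 1."""
    while tree[0] == 'query':
        _, i, left, right = tree
        tree = left if z[i] == 0 else right
    return tree[1]

def height(tree):
    """Decision tree complexity of this particular tree: the worst-case
    number of queries on any input, i.e. the height of the tree."""
    if tree[0] == 'leaf':
        return 0
    return 1 + max(height(tree[2]), height(tree[3]))

# A height-2 tree computing z0 AND z1:
t = ('query', 0, ('leaf', 0), ('query', 1, ('leaf', 0), ('leaf', 1)))
print(eval_decision_tree(t, [1, 1]), height(t))  # 1 2
```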
If we can prove that kind of statement, that the communication complexity of the composed function equals the query complexity of f, it's called a lifting theorem, and then we're in great shape: we can just use everything we know about decision tree complexity, it translates sort of automatically to the communication setting, and we're good to go. That's what this says here: we want to translate the query complexity of f into the communication complexity of the composed function.

Just to give you some motivation, almost all the lower bounds that we know and love in communication complexity are already lower bounds for composed functions, so it's something people have already been doing; it just maybe wasn't so explicit early on. For example, set disjointness is the most popular example; Arkadev and I wrote a survey all about set disjointness, because it's the analog of satisfiability in the communication world, sort of the NP-complete problem of the communication setting. Set disjointness is: Alice is given an n-bit string, Bob is given an n-bit string, you view them as subsets of {1, ..., n}, and you want to know if those sets are disjoint. Actually it's reversed: if there's a coordinate where they're both 1, in other words the sets are not disjoint, then the value is 1; otherwise, if they're disjoint, the value is 0. And if you view set disjointness as a composed function, the outer function f is the OR function, and the gadget is just the AND of two bits. Sure enough, the decision tree complexity of the OR function (by the way, I have my zeros and ones mixed up here) is linear, and that's easy to see: whatever variable is queried first, you just always take the edge labeled 0, and the tree has to see that all the bits are 0 in order to know that the OR is 0, so any decision tree for the OR function has linear height. And very famous results from, I don't know, twenty years ago show that, likewise, the deterministic and the randomized communication complexity of disjointness is also maximal: it's linear.

Okay, so once we have a lifting theorem we can use it, and one of the things that's nice about it is, first of all, we can get intuition from the query world, and that intuition in that simple situation translates easily to the communication setting. And we get both an upper and a lower bound on the communication complexity of the lifted problem, which is quite nice to work with: since we have both the upper and the lower bound, we can use it to get separations for lots of different things.
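Going back to disjointness for a second, here's the composed-function view as a sketch; the code is my own illustration, using the talk's convention that output 1 means the sets intersect.

```python
def compose(f, g, x, y):
    """The composed function (f o g^n): apply the two-party gadget g to each
    coordinate pair, then the outer function f to the resulting bits."""
    return f([g(xi, yi) for xi, yi in zip(x, y)])

OR = lambda bits: int(any(bits))
AND2 = lambda a, b: a & b

def set_disjointness(x, y):
    # DISJ = OR o AND: output 1 iff the sets intersect (are NOT disjoint),
    # matching the convention in the talk.
    return compose(OR, AND2, x, y)

print(set_disjointness([1, 0, 1], [0, 0, 1]))  # 1: both sets contain element 3
print(set_disjointness([1, 0, 0], [0, 1, 1]))  # 0: the sets are disjoint
```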
So there have been a whole bunch of lifting theorems in the last ten years or so, and I'll just tell you about a few of the basic ones. There are some theorems proven prior to this that are also lifting theorems, or in the style of lifting theorems, most notably Sherstov's theorems on the pattern matrix method, but I'm going to consider this the first lifting theorem because it's the first one that goes from deterministic decision trees to deterministic communication complexity. It was proven originally by Raz and McKenzie, although they didn't state it as such; it was sort of embedded in a really beautiful proof they had separating levels of monotone circuit classes. The theorem they proved: f is an arbitrary n-bit Boolean function, or search problem, and the gadget g is not so small; it's actually a kind of universal gadget called the index gadget, and it's lopsided. One player, Alice, has a string of length, say, 20 log n, order log n, and Bob's input is much bigger: it's exponential in that, of length n^20, so it's quite big. The value of g on a pair (x, y) is: you think of x as pointing to a location in y, and the value is that bit of y. That's why it's like a universal function, a universal gadget. It might look bad to you that it's so big, but the fact is that the upper and lower bounds are usually dictated by the size of the shorter string, because remember, one player can always send their entire string to the other player, so for each gadget there's always an upper bound of the size of the smaller input, which is order log n.

So the theorem they proved is that, no matter what f is, the decision tree complexity of f times order log n is equal (well, there are constants in there) to the communication complexity of the lifted problem. The easy direction: if you have a decision tree for f, you can automatically use it to get a protocol for the communication problem. How do you do that? Well, I should have called the outer variables z1 through zn; typically I'm going to use the variable z for the outer function, and when this gets converted by the gadget g, z1 becomes g(x1, y1), where in this case x1 is a (20 log n)-bit string and y1 is an n^20-bit string. If we have a decision tree for f, the way we convert it into a communication protocol is to basically follow the tree: if the first variable queried is z1, then the players compute the gadget g on x1 and y1, and they do that in the trivial way, where Alice sends her whole string x1 to Bob. So that takes order log n bits per query, which is why the communication complexity is order log n times the decision tree height. The other direction is the hard direction: to show that, given an arbitrary communication protocol for the lifted function, which could be completely ugly and might have nothing to do with a decision tree, you can somehow extract from it a decision tree for the outer function.

Then we proved a randomized lifting theorem. It's a very similar theorem, but in the randomized setting: if you start with a randomized decision tree, which is just a distribution over deterministic decision trees, and with the appropriate choice of gadget (in the original paper we used the same index gadget), you can show a lifting theorem: the randomized decision tree complexity of any function f, times theta(log n), is exactly the randomized communication complexity of the composed function. This actually was a lot of work; it took a long time for us to get it. And then in the new paper that I referred to, with Arkadev, Yuval, Sajin, and Or, I think we got a much nicer proof of this, and the proof unifies the previous two theorems using the same technique. It's also a little more versatile: we show that more or less any gadget with low discrepancy can work, not just the special index one. There are many papers between these that I'm not going to tell you about; Arkadev has another nice one.

Yep? What is the smallest gadget? You're asking what the smallest gadget is. It's not any better: it's still an outstanding open problem to do better than order log n. Okay, and then the last theorem, which I may or may not get to, is another lifting theorem.
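Before that, here's the index gadget and the easy direction of the deterministic simulation as a sketch; the encodings are mine, and the tree format matches the earlier decision tree snippet.

```python
def index_gadget(x_bits, y):
    """Raz-McKenzie index gadget: x is a b-bit pointer into the 2^b-bit
    string y, and g(x, y) = y[x]."""
    pointer = int(''.join(map(str, x_bits)), 2)
    return y[pointer]

def tree_to_protocol(tree, xs, ys):
    """Easy direction of deterministic lifting: run the decision tree for f,
    and whenever it queries z_i, Alice sends her whole b-bit pointer x_i
    (cost b = O(log n) bits) so both players learn z_i = g(x_i, y_i).
    Returns (answer, bits communicated)."""
    bits = 0
    while tree[0] == 'query':
        _, i, left, right = tree
        bits += len(xs[i])                  # Alice sends x_i to Bob
        z_i = index_gadget(xs[i], ys[i])    # both players now know g(x_i, y_i)
        tree = left if z_i == 0 else right
    return tree[1], bits

# Tiny usage: n = 2 blocks, b = 2 pointer bits, so each y_i has 4 bits.
t = ('query', 0, ('leaf', 0), ('query', 1, ('leaf', 0), ('leaf', 1)))  # z0 AND z1
xs = [[1, 0], [0, 1]]              # Alice's pointers: 2 and 1
ys = [[0, 0, 1, 0], [1, 1, 0, 1]]  # Bob's tables: y0[2] = 1, y1[1] = 1
print(tree_to_protocol(t, xs, ys))  # (1, 4): answer 1, with 2 + 2 bits sent
```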
That last lifting theorem is for a constant-size gadget, and, I'll explain more later, but it's sort of: if you convert decision trees in the natural way into a linear-algebraic version of query complexity, you end up with something called Nullstellensatz degree, which is like polynomial degree, and that is going to correspond to a communication measure that I'll tell you about later. So we have a lifting theorem in that setting too.

Okay, so I'll start by telling you how to re-prove the lower bounds for monotone formulas, and then we'll see how it goes after that. Back to this picture: on the left is the model of monotone formulas, that is, formulas over OR and AND with binary fan-in and no negations. We'll also talk about lower bounds for monotone circuits, where instead of a formula, which is a tree, you have a DAG, so the computation can come together again; the fan-in is still two and it's still monotone, still over OR and AND, but it's a circuit instead of a formula. These were classic results from quite a while ago, with sort of complicated proofs, and the proofs are all different; we'll see how to prove both the monotone formula bounds and the best monotone circuit bounds that we know of using this technique.

Okay, so first I want to explain this old, beautiful equivalence between monotone formula size, or circuit depth, and the Karchmer-Wigderson communication problem; I'll review that, and then I'll explain how to just use lifting to automatically get the monotone formula bound. The Karchmer-Wigderson game, I mentioned it on the first slide: the search problem is, you start with a Boolean function f, Alice gets a 1-input of the function, Bob gets a 0-input, and they're trying to find a coordinate where their inputs differ. There's a monotone version of this game as well, where the function f is monotone, and now they want to find not just a coordinate where they differ, but a coordinate where Alice's xi is 1 and Bob's is 0. The theorem that they proved, and the proof is really quite easy, but it's a really great way to think about formula size, is that for any function f, the log of the formula size of f is, up to constants, the communication complexity of the Karchmer-Wigderson game associated with f; equivalently, the circuit depth of f is equal to the communication complexity of the Karchmer-Wigderson game associated with f. And if f is monotone, you get a monotone version of it: the log of the monotone formula size equals the communication complexity of the monotone game, or in other words the monotone circuit depth.

So let me just tell you one direction. It's an equivalence, but the direction we mostly want says that if you have a monotone (or non-monotone) formula of a certain depth, you can use it to solve the associated Karchmer-Wigderson game. Just look at this picture; let's say this is our formula. Remember, Alice has an input on which the output gate evaluates to 1, and Bob has an input on which the output gate evaluates to 0. All they do is trace a path from the root to a leaf. At the root they know that, when each of them evaluates it on their own input, Alice gets 1 and Bob gets 0. And if it's an AND gate, like it is at the top, then we know that on Bob's input one of the two subtrees has to evaluate to 0, and on Alice's input both subtrees evaluate to 1.
So Bob speaks and tells Alice which way to go, so that his input is still 0 and her input is still 1: he sends one bit, left or right, when it's an AND gate, and when it's an OR gate, she sends one bit saying whether to go left or right. So the total number of bits is just the depth of the tree, and if it's a formula you can be a little trickier and get that it's the log of the formula size. So this is just an equivalent way to view proving formula lower bounds, and notice that it's not a Boolean communication problem: it's this total search problem. So we've shown that proving lower bounds for monotone formulas is equivalent to proving lower bounds on the deterministic communication complexity of this monotone game, and now we want to use lifting to get lower bounds on that from some kind of decision tree bound.

And this is the second trick, and this trick is in almost all of these works: we're going to view the Karchmer-Wigderson game in a slightly different way that's going to be really productive, really useful. Let me try to explain. Remember, in the beginning I told you about another type of total search problem, the one that comes from an unsatisfiable formula. You have an unsatisfiable CNF C, and the search problem associated with C, the ordinary search problem, is: given an assignment, find a falsified clause. We can turn this into a communication search problem by just splitting the inputs of C in half, giving Alice half and Bob half, and it's the same thing: they want to find a clause that's violated. One nice way, we don't have to do this, but one nice way to partition the inputs to make our job easier, is to compose the original unsatisfiable CNF with the gadget g: replace each variable of the unsatisfiable formula C with the gadget, so z1 is replaced by g(x1, y1), and now they want to solve the search problem associated with the lifted CNF. Alice gets the x inputs, Bob gets the y's, and they want to find a clause that's violated.

And it's not very hard to prove that this communication search problem is more or less equivalent to the monotone Karchmer-Wigderson search problem. What that theorem states is that if you have an unsatisfiable CNF and you partition the variables in any way into two pieces, that gives you a communication total search problem, and you can view Alice's inputs, the assignments of the x's, as 1-inputs to a monotone function associated with this unsatisfiable CNF, and you can view Bob's inputs as 0-inputs to that monotone function, and basically the Karchmer-Wigderson search problem for that function is the same as this one. Does that make sense? I was going to try to do the proof of this, but I'm out of time; anyway, this is just saying that we can always transform a lifted search problem like this into an equivalent Karchmer-Wigderson game for an associated monotone function.

So, putting this all together (I'm going to skip the proof), all we have to do is start with some unsatisfiable, unlifted CNF whose complexity in the decision tree model is linear, and that's really easy to do: there are all sorts of unsatisfiable formulas, the pigeonhole principle, random CNFs, where any decision tree for finding a violated clause requires linear height.
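Here's the Karchmer-Wigderson walk from the root to a leaf as a sketch; the formula encoding is my own, not from the talk.

```python
def kw_protocol(formula, alice_input, bob_input):
    """Karchmer-Wigderson protocol sketch: Alice's input makes the formula 1,
    Bob's makes it 0. Walking down, the invariant 'current subformula is 1 on
    Alice's input and 0 on Bob's' is preserved, so the leaf reached is a
    variable where they differ. Encoding is my own: ('var', i),
    ('AND', left, right), ('OR', left, right). Returns (coordinate, bits)."""
    def ev(t, inp):
        if t[0] == 'var':
            return inp[t[1]]
        l, r = ev(t[1], inp), ev(t[2], inp)
        return (l & r) if t[0] == 'AND' else (l | r)

    bits = 0
    node = formula
    while node[0] != 'var':
        _, left, right = node
        if node[0] == 'AND':
            # Bob sends 1 bit: a child that is still 0 on his input.
            node = left if ev(left, bob_input) == 0 else right
        else:
            # Alice sends 1 bit: a child that is still 1 on her input.
            node = left if ev(left, alice_input) == 1 else right
        bits += 1
    return node[1], bits  # in the monotone case, Alice has 1 and Bob has 0 here

# x0 AND (x1 OR x2): Alice has (1,1,0) -> 1, Bob has (1,0,0) -> 0.
f = ('AND', ('var', 0), ('OR', ('var', 1), ('var', 2)))
print(kw_protocol(f, [1, 1, 0], [1, 0, 0]))  # (1, 2): they differ on x1
```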
And by the way, that decision tree hardness is equivalent to talking about resolution height: the decision tree complexity of finding a falsified clause equals the DPLL height, where DPLL, Davis-Putnam-Logemann-Loveland, is the tree-like resolution method for proving that a formula is unsatisfiable. So you start with an unsatisfiable CNF where you know its decision tree height is linear; like I said, that's easy. Then you lift that unsatisfiable CNF in the way we described, which gives you the lifted search problem, and that's equivalent to the Karchmer-Wigderson game for an associated monotone function. By the lifting theorem, that automatically gives us linear lower bounds on the communication complexity of the monotone Karchmer-Wigderson game, and then, by the equivalence with monotone formulas, that finally gives us the lower bound. So at the end of the day, putting it all together, all you have to do is start with an unsatisfiable formula with a very simple type of lower bound, and that, together with the lifting theorem, automatically gives you monotone formula lower bounds.

(I'm in trouble here, I don't know what to say; I'm just going to keep going.) So an open problem was to show that this works not just for monotone formulas but for circuits, and until a couple of years ago we didn't know how to use the lifting technology for monotone circuit bounds. But in a really lovely paper from a couple of years ago, they used the randomized lifting theorem that I mentioned to prove lifting for communication DAGs instead of communication trees, and those in turn correspond to monotone circuits. So all the same stuff, with a new lifting theorem that builds on the randomized lifting theorem, enables you to re-prove the famous monotone circuit lower bounds that were proven by Razborov in, I think, the eighties. One thing that's really nice about this proof is that it's actually quite different from the old proofs: the old proofs were bottom-up, using sunflowers or the method of approximations, and this is a much different, top-down proof. It gets pretty similar lower bounds, though, so everybody's looking to see.

Okay, so now I'll tell you a little bit about the linear-algebraic lifting theorem that I mentioned, which we can use to get lower bounds for linear secret sharing schemes, which are sort of equivalent to monotone span programs. Here is the same picture, but on the left the structured model of computation (I'll tell you what this means in a minute) is monotone span programs, and they actually end up being equivalent, strangely enough, to linear secret sharing scheme complexity. So that's the model of computation, and we'll show the analog of Karchmer-Wigderson, which was proven by Anna Gál quite a while ago, a beautiful result showing that the complexity of these can be characterized by a generalization of communication complexity that I'll tell you about, and then we can use this new lifting, where on the right, instead of decision tree height, we have something like polynomial degree.

Okay, so I'll start with secret sharing schemes; do people know what they are? I didn't know much about them until a couple of years ago. Secret sharing is very cool: if you don't know about it, I think it's really cool, and there are lots of open problems; this is a wide open area if you're looking for something to work on. So there's a dealer with a secret, and there are certain special subsets of a population.
Maybe she wants to share her secret with the green group and the pink group and the purple group, but she doesn't want to share her secret with any group of people that doesn't include one of those special groups. So maybe she wants, like, a parent in the room, or a giraffe in the room; she has special groups in mind. What she does is send a message, a share, to each of the players, such that, for example, the pink group can get together and figure out the secret using their shares, and likewise the green and the purple, but any other subset, like this group over here, that doesn't contain one of the special ones, should not be able to reconstruct s, should not be able to learn anything about the secret s. And the question is, how big do these shares, these messages, have to be in order to do this?

This is an old problem from '79. Think of it as n parties, so the population is n people, and you have special subsets P1 through Pm, and the question is how long the messages have to be. There's an upper bound of 2^n, which is terrible, and the question, which is still wide open, is whether you can do better than this in general. A lot of focus was on linear secret sharing schemes, where the shares are linear functions, and even there, this was studied a lot in the eighties and early nineties just to prove lower bounds in that case, and there was a synthesis of papers that led to some beautiful results, giving a quasi-polynomial lower bound, which was the best known. Using this technique, we were able to get exponential bounds for this problem. It's still totally wide open in the non-linear regime.

What's happening? Yeah, so it's information-theoretic: it means that any subset that doesn't include one of the special subsets should not, information-theoretically, be able to learn anything about the secret. Does that answer your question? Yeah, that's right. And one thing people have looked at a lot: you can look at the complexity of the family of subsets and try to understand how big the shares have to be as a function of that complexity, and if the complexity is low, then the shares are correspondingly small, but if you don't bound the complexity, it's not known whether you can do better than 2^n. There was a breakthrough result that did slightly better than 2^n; it's still exponential, but with a constant, I forget exactly what it is, that's slightly better than the trivial 2^n bound. But it's still open, completely wide open, so it's a great problem.
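Just to make shares concrete, here's a minimal sketch of a linear (XOR-based) scheme for a single qualified set; this is a standard toy construction of my own choosing, not from the talk. Handling a whole monotone access structure this way, one minimal qualified set at a time, is exactly what blows the share size up.

```python
import random

def share_secret(secret, qualified_set, n):
    """Toy linear secret sharing over GF(2) for ONE qualified set: members
    of the qualified set get random bits that XOR to the secret; everyone
    else gets an independent random bit, so any subset missing a member of
    the qualified set sees only uniform noise."""
    shares = [random.randint(0, 1) for _ in range(n)]
    members = sorted(qualified_set)
    # Fix the last member's share so the qualified set XORs to the secret.
    partial = 0
    for i in members[:-1]:
        partial ^= shares[i]
    shares[members[-1]] = secret ^ partial
    return shares

shares = share_secret(1, {0, 2, 3}, 5)
print(shares[0] ^ shares[2] ^ shares[3])  # 1: the qualified set reconstructs
# Any subset missing one of parties 0, 2, 3 sees jointly uniform bits.
```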
So I'll show you how, once you have this lifting theorem, you get this really easily. And this model of monotone span programs captures not just monotone span programs but also monotone branching programs and monotone formulas, and secret sharing schemes, so you get sort of the same proof for all of them. Like I said, it turns out that the complexity of a monotone function in this model is equivalent to the linear secret sharing share size for the corresponding groups. I really like this model of computation; it was defined by Karchmer and Wigderson quite a while ago, and I think of it as the natural linear-algebraic extension of monotone formulas, but it's actually in some sense stronger than monotone computation: it can even do things that polynomial-size monotone circuits can't do, so in some sense it's incomparable to these monotone models.

You describe a span program by a matrix, and you label the rows with the input bits; it's important that the same input bit can label many different rows, so in this case x2 occurs twice, but typically it might occur many, many times. Each row contains some vector over whatever field you're in. Then you say that an input alpha, an assignment to the Boolean variables x1 through xn, in this case x1 through x5, is accepted by the span program if the following holds: you look at the rows whose labels are consistent with the assignment, so in this case x1, x2, and x5 are 1, which means the rows labeled x1, x2, and x5 are consistent and you're allowed to use them, and you ask whether the all-ones vector is spanned by those rows. If it is, you accept the input; if not, you don't. This is the monotone version; there's a non-monotone version as well, where you can label the rows by literals, either variables or their negations. Same idea: you light up the rows consistent with the input and ask whether the all-ones vector is in their span.

Okay, so like I said, these models are essentially equivalent even though they look quite different, and that's the model of computation we're interested in. And it turns out the complexity is equivalent to a particular communication-type measure that I'll define now, called algebraic tiling, which to my mind is the natural algebraic generalization of ordinary communication complexity. This is the same matrix you saw in the beginning, the communication matrix, and again it's for a total search problem: rows are labeled with all the 1-inputs, columns with all the 0-inputs, and here I've just drawn five separate matrices, one for each possible answer, where I drew all the entries labeled with answer one in their own matrix, and so on; nothing's happening here, I've just drawn them separately so you can see them a little better. A tiling is: we get to construct five matrices, one for each possible answer to the search problem, over some field. Matrix A1 has to be zero outside of the yellow area; inside the yellow area it can be anything you want, any values in the field. Likewise, A2 has to be completely zero outside of the blue area, and so on. And then the requirement is that these matrices sum, entrywise, to the all-ones matrix, and the complexity of the tiling is the sum of the ranks of the constructed matrices A1 through A5.

It might look weird, but just to point out: a deterministic protocol gives you a tiling; it's a special case of a tiling. Let's see if I drew a picture; I didn't, but I have it earlier. Remember what a regular protocol was: a partition of the original matrix into monochromatic subrectangles. So for A1, take all the subrectangles that give the answer one, and put ones everywhere in A1 corresponding to those subrectangles; that all has to be contained in the yellow part, because the answer there is one. So A1 has ones and zeros in the yellow part and zeros everywhere else, and likewise for all the other answers.
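Here's the span-program acceptance test from a moment ago as a sketch over GF(2); the encoding and the tiny example are my own.

```python
def span_program_accepts(rows, labels, alpha):
    """Monotone span program acceptance over GF(2) (a sketch; encoding mine):
    rows is a list of 0/1 vectors, labels[r] is the variable labeling row r
    (the same variable may label many rows), and alpha is a 0/1 assignment.
    Accept iff the all-ones vector lies in the span of the rows whose
    labeling variable is 1 under alpha."""
    n_cols = len(rows[0])
    live = [row[:] for row, v in zip(rows, (alpha[l] for l in labels)) if v == 1]
    target = [1] * n_cols
    # Gaussian elimination over GF(2): reduce `target` by the live rows.
    for col in range(n_cols):
        pivot = next((r for r in live if r[col] == 1), None)
        if pivot is None:
            continue
        live.remove(pivot)
        for r in live:
            if r[col] == 1:
                for j in range(n_cols):
                    r[j] ^= pivot[j]
        if target[col] == 1:
            for j in range(n_cols):
                target[j] ^= pivot[j]
    return all(t == 0 for t in target)

# Rows (1,0), (0,1), (1,1) labeled x1, x2, x2: this program computes x2.
rows = [[1, 0], [0, 1], [1, 1]]
labels = [0, 1, 1]
print(span_program_accepts(rows, labels, [1, 1]))  # True: (1,0)+(0,1) = (1,1)
print(span_program_accepts(rows, labels, [1, 0]))  # False: only (1,0) is live
print(span_program_accepts(rows, labels, [0, 1]))  # True: row (1,1) suffices
```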
So the matrices coming from a protocol are actually zero-one matrices, and the fact that they sum to the all-ones matrix is sort of trivial, because each entry is one in exactly one of them and zero everywhere else; and the sum of the ranks is just the total number of subrectangles in the partition into monochromatic subrectangles. That's why I think of this as the linear-algebraic extension of communication complexity: now you don't have to have a one or a zero in every entry, with each entry getting exactly one one; you can have various values. If you have a place in the matrix that could be yellow or blue, then you can put one number in the yellow part and another in the blue part, as long as they sum to one over the field. So it's more powerful; it possibly lets you get a smaller complexity. And what Anna Gál proved, this beautiful result, is that the log of this tiling complexity is exactly the monotone span program size, which is also exactly the optimal size of the messages in any linear secret sharing scheme for the access structure corresponding to the function you're trying to prove the lower bound for. Does that make sense? So then, if you want to prove lower bounds for monotone span programs, you convert to this generalization of communication complexity.

Yes? Can you say that one more time? So you're wondering about the non-monotone version of it. The difference is, it's a little hard to describe because I did it right away for the search problem, but you could think of these colors, these A1, A2, A3: if it's a monotone function, there will be n of them, where A1 would be, say, "Alice's x1 is 1 and Bob's y1 is 0", and in the non-monotone version it could go either way. Yeah, good question.

Okay. So what I want to do is explain this lifting theorem that we get between tiling and Nullstellensatz proofs, and I hope it's somewhat convincing that this really is just the linear-algebraic analog of the deterministic lifting we did. I'm going to replace decision tree height with polynomial degree. So again, we're going to do the same trick we did before, where we take the unsatisfiable formula and lift it; that's going to correspond to the Karchmer-Wigderson situation. In the same way that, once you have an unsatisfiable formula, understanding the decision tree height corresponds to proving a lower bound for a very weak proof system, namely tree-like resolution, the polynomial degree required to find a violation of an unsatisfiable CNF is exactly what's called Nullstellensatz degree. You don't really need to know that; maybe this is a better definition for now. So you have an unsatisfiable CNF, and we're going to convert it to low-degree polynomials.
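For reference, the refutation form of Nullstellensatz that this corresponds to looks like the following; this is my summary of the standard textbook definition, not something stated verbatim in the talk.

```latex
% Standard Nullstellensatz refutation (textbook form; my summary, with the
% usual encoding assumed). Encode each clause C_j as a polynomial p_j that
% vanishes exactly on the assignments satisfying C_j; e.g. the clause
% (z_1 \lor \lnot z_2) becomes p_j = (1 - z_1)\, z_2. A refutation is a
% choice of polynomials g_j, h_i over the field with
\[
  \sum_j g_j \, p_j \;+\; \sum_i h_i \,\bigl(z_i^2 - z_i\bigr) \;=\; 1 ,
\]
% which certifies that the p_j (plus the Boolean axioms) have no common
% 0/1 root. The Nullstellensatz degree of the CNF is the minimum, over all
% refutations, of \max_j \deg(g_j p_j).
```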
And then the Nullstellensatz degree of this unsatisfiable collection of low-degree polynomials is just: we want a small-degree polynomial that solves the search problem, okay? You give it as input an assignment to the variables, and it should output a violated clause. You're looking at a polynomial over the reals, or over some field, and you want to know the minimal degree of any polynomial that does that. And if it bothers you that the polynomial outputs a value instead of zero or one, you can convert it to zero-one: for each clause you have a separate polynomial which should output 1 if that clause is violated and 0 otherwise. So again, it gives you a partition of all the assignments, where for each assignment exactly one polynomial says 1. Does that make sense?

So the lifting theorem that I proved with Robert Robere, and it's mostly Robert's work, goes from Nullstellensatz degree, the polynomial degree of solving the search problem, through algebraic tiling, to monotone span program size. Putting all this together, all you have to do is prove lower bounds on polynomial degree for an unsatisfiable formula, such as random CNFs, and then you automatically get these lower bounds on linear secret sharing schemes. And unlike other lifting theorems, one thing that's really cool about this one is that the inner gadget is constant size. Actually, where's Marc? He's here somewhere; Marc Vinyals was one of the people on the later paper. So the gadget g is a constant-size gadget, which means there's almost no loss, so you really do end up with truly exponential lower bounds over here; in particular, it gives you truly exponential lower bounds on monotone formula size.

Okay, so I probably went on too long. I think, okay, you have a choice: I have maybe eight minutes left, and I can either tell you a little bit about the proof, or I can tell you about a third application. I'm a little more inclined to tell you about the proof, but raise your hand if you would rather hear about this last application. Okay, and raise your hand if you'd rather hear a little about the proof. Oh boy, why did I do that? Okay, we can split into two; maybe I'll split the difference, four minutes and four minutes.

Okay, so pseudo-determinism is a very cool notion that I learned about from Shafi Goldwasser at the Simons lower bounds meeting; I guess it's been around for a while. It's a notion of randomization for search problems, or for functions, where you are allowed to use random coins, but you have to almost always give the same answer. For a Boolean function it's not interesting, because there's only one answer. But for a search problem, where there can be more than one answer, you want an algorithm that's allowed randomization, but with very high probability over the random coins it always gives the same answer. So they defined this model, called pseudo-deterministic algorithms; you can define it for decision trees, communication complexity, whatever. And they asked: what is the power of it relative to ordinary randomized computation?
And one motivation they gave for it is related to reproducibility in science: machine learning algorithms, even as processes, involve a lot of randomness, and it's often contentious whether things can be reproduced. If you had a pseudo-deterministic algorithm, then somebody should be able to run it with their own random coins and get the same answer. So I'm not going to go through this whole thing, but I'll tell you a little bit about the decision tree lower bound, and then you can use modifications of what I've been talking about to get a similar separation in the communication world.

So here's the classic problem that you would think should be easy randomized; it is hard deterministic, and you'd also think it should be hard pseudo-deterministically. It's a promise problem called find-one: you have a vector of n bits, and you're promised that at least half of them are ones, and you want to find a one. We want to study the decision tree complexity of this. For an ordinary deterministic decision tree, the height is linear: as an adversary, whatever the tree queries, I just answer zero, and I can go n/2 steps showing all zeros before I'm forced to reveal a one, so the height is at least n/2. That's the lower bound. And what about the randomized case? In the randomized case it's easy, because all you have to do is guess a coordinate, and you have a 50-50 chance of getting a good coordinate; if you guess a logarithmic number of times, your probability of being right is really high. So randomized it's easy, but notice that you always get a different answer, and in some sense that's not really a guarantee of much. Whereas a pseudo-deterministic algorithm is allowed to flip coins but has to almost always give the same answer.

So the question is, can you prove a pseudo-deterministic lower bound for find-one? It's not that hard to see that you can get some lower bound, like n to the epsilon. What we prove here is a square root of n lower bound. I think the answer is linear; I was going to announce linear, but I haven't checked everything yet, so we might have a linear bound, but for sure we have a square root of n bound. And a nice way to get this improved bound is, again, to go through the CNF trick. The proof idea is that you start with an unsatisfiable random CNF, say a random 3-CNF with, I don't know, 100n clauses. Fix one that's unsatisfiable and has the property that on every assignment a constant fraction of the clauses are false; for almost any random CNF with that many clauses this will be true. Then we can ask our same question, find a violation, and that is sort of embedded in find-one, because on every input you can do the following: given an assignment to the variables of the CNF, you associate with it a vector of length 100n, however many clauses there are, and you put a one in position i if clause i is false.
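Here are find-one and the clause-violation embedding as sketches; the conventions match the earlier CNF snippet, and the parameter choices are mine.

```python
import random

def find_one_randomized(v, reps=30):
    """Randomized query algorithm for FIND-ONE: v has >= n/2 ones, so a
    uniformly random probe hits a 1 with probability >= 1/2; after `reps`
    probes the failure probability is <= 2^-reps. Note that it returns a
    *different* coordinate on different runs, which is exactly what
    pseudo-determinism forbids."""
    n = len(v)
    for _ in range(reps):
        i = random.randrange(n)  # one query
        if v[i] == 1:
            return i
    return None  # under the promise, reached with probability <= 2^-reps

def clause_violation_vector(cnf, assignment):
    """The embedding from the talk: map an assignment of an unsatisfiable,
    expanding CNF to the 0/1 vector whose i-th bit says 'clause i is
    falsified'. A constant fraction of bits are 1, so FIND-ONE on these
    vectors is the falsified-clause search problem. Clause encoding as in
    the earlier CNF sketch (+i / -i literals, 1-indexed variables)."""
    return [
        0 if any(assignment[abs(l) - 1] == (1 if l > 0 else 0) for l in c) else 1
        for c in cnf
    ]
```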
And since the unsatisfiable formula has a constant fraction of violated clauses for any assignment, each assignment converts to a vector of still-linear length with a constant fraction of ones. So again, the randomized complexity of finding a one is easy, and we want to prove a pseudo-deterministic lower bound: a randomized algorithm that almost always gives the same answer is going to require something like square root of n queries. And the nice thing about converting to the search problem is that we can now use lower bounds that we already have on Nullstellensatz degree, which, with a little more work, automatically give us lower bounds for this question. This problem, like I said, embeds into find-one; it's a particular sub-collection of all the vectors with a constant fraction of ones, but it's a nice collection that's easier to work with because it has good expansion properties. So the same way that before we were able to use unsatisfiable CNFs and their nice properties, together with general theorems, to get lower bounds, we can kind of do the same thing here. That's all I'm going to say about this, unless you have any questions.

Okay, so just a few words about the lifting theorem, and I'm going to talk about the newer one. I'm not going to get into many details; I'm just trying to give you a flavor of it. This is the statement of the theorem: the randomized decision tree complexity of f, regardless of what f is, is more or less equal to the randomized communication complexity of the composed function. And here the gadget g is on b bits, Alice gets b bits and Bob gets b bits, and b has to be something like, well, it's still Omega(log n), so think of b as order log n, and we get the same kind of statement as before. This works for any gadget g that has sufficiently low discrepancy, but you can just think of a fixed g if you want.

Okay, so how are you going to prove this? I just want to give you some flavor of the proof. The way we proved it is a constructive argument. Somebody gives you a protocol; let's just do the deterministic case. So somebody gives you a deterministic protocol for the composed problem, and you have to somehow extract from it a decision tree for the outer function. We want a constructive algorithm: I hand you a protocol, and you run the protocol somehow and construct a decision tree out of it. So what we want to do, on any input z, where z is an input to the function f, is to somehow simulate the protocol on all of the (x, y) inputs that are consistent with this z, and instead of querying the x's and y's, we want to query the z's. The weird thing is: how can you do this when you don't know z? You don't know the z's until you query them. So the only thing you really can do is, when Alice speaks first and sends some message, she partitions all of her possible x's, values for x1 through xn, into two halves, and all you can really do is simulate this randomly, going left or right with probability proportional to how big her two pieces are.
So if she partitions all of her inputs into two pieces of sizes, say, a quarter fraction and a three-quarters fraction, then intuitively we want to go left with probability one quarter and right with probability three quarters. We haven't queried anything yet, and that's fine, because we're allowed to produce a randomized decision tree. So that's what we want to do, and we want to hope for the best. We start by simulating the protocol on this uniform distribution, but at some point the protocol pi is going to have transmitted too much information about some particular coordinate of z, some zi. At that point we want our decision tree to actually query this variable zi, going left or right according to the answer, and then we want to continue the simulation. So we want some measure of information: when too much information has been learned about a coordinate, we query that bit, and then we carry on.

Of course, there are some issues here. One issue is that when you query that bit, that partitions all the (x, y)'s into two pieces, and now you might lose the property that nothing is known about the remaining coordinates, so you have a lot of cleanup to do, in a sense. So the invariant that we maintain is a measure of information about subsets of coordinates. Let me go back to the very high level: we're going to simulate the protocol pi, like I said, going left or right with probability proportional to the size of the current subrectangle, as long as there's no coordinate, or subset of coordinates, about which we've learned too much information; we call that being dense. As soon as too much information has been learned about some subset of coordinates, say one coordinate, maybe Alice sent a message and all of a sudden too much is known about, say, the x part of the third coordinate, then at that point we query that coordinate and go left or right, and that partitions all the x and y strings into two parts, the left part being the inputs where that coordinate has value zero and the right part where it has value one. But then the problem is that we could have lost our invariant on the remaining coordinates, so we have to do a cleanup to restore the invariant in a way that doesn't remove too much stuff.

What the new proof does is look at what happens when you do query a coordinate, and there are some troublesome situations you can get into. So we define a notion of dangerous values, particular values that leak too much information; there are some bad strings that leak too much information, and what we basically have to do is argue that there are not too many of these dangerous values, so they can be discarded, and once they're discarded, you're back to the nice uniform situation where you can carry on with the protocol. This is super high level, and you might not have gotten anything out of it, so we can talk about it sometime if you want, or you can ask Arkadev. But that's, at a very high level, how the argument goes.
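Here's a toy illustration of the two core moves in that simulation, random descent proportional to rectangle size, plus flagging a coordinate once too much has been learned about it. The deficiency measure and the threshold are my own simplified stand-ins for the actual "blockwise density" bookkeeping in the proof.

```python
import math, random
from itertools import product

def deficiency(X, i, b):
    """Min-entropy deficiency of block i under the uniform distribution on
    the surviving set X: 0 when block i is still uniform, up to b bits when
    the transcript has pinned it down completely. (My simplified stand-in
    for the density measure used in the real proof.)"""
    counts = {}
    for x in X:
        key = x[i * b:(i + 1) * b]
        counts[key] = counts.get(key, 0) + 1
    p_max = max(counts.values()) / len(X)
    return b + math.log2(p_max)

# Toy run: n = 3 blocks of b = 2 bits on Alice's side. Each simulated
# message from Alice cuts her surviving set in half at random; we follow a
# side with probability proportional to its size and flag any block whose
# deficiency crosses a threshold -- that's when the simulation would make
# the decision tree query z_i and then do its cleanup.
n, b, threshold = 3, 2, 1.0
X = list(product([0, 1], repeat=n * b))
for round_ in range(4):
    half = set(random.sample(range(len(X)), len(X) // 2))
    side = [x for j, x in enumerate(X) if j in half]
    other = [x for j, x in enumerate(X) if j not in half]
    X = side if random.random() < len(side) / len(X) else other
    leaky = [i for i in range(n) if deficiency(X, i, b) > threshold]
    print(f"round {round_}: |X|={len(X)}, query-worthy blocks: {leaky}")
```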
Yes, so just to wrap up. One question I'm interested in is whether we can prove similar lifting theorems for information complexity instead of communication complexity, and this would be great for a lot of reasons. First, it would just be interesting, because for information complexity we know that communication and information aren't always the same, not all protocols can be compressed, so it would be interesting if in the lifted situation that didn't happen: if we could prove a lifting theorem in the information complexity setting, taking us from decision tree complexity to information complexity, that would tell us that for composed functions there's no difference between the two. The other reason it's interesting is that there are lots of nice applications where you really do seem to need the stronger information complexity bound, for example in differential privacy, which is one area where a lot of lower bounds are obtained via information complexity.

And, as was asked before, we still don't have lifting theorems with constant-size gadgets in the simple deterministic setting or the randomized setting. And of course the big open problem is to try to move away from monotone computation: to figure out how to solve the real problem, which is to prove a lifting theorem for non-monotone computation, and therefore maybe get non-monotone formula lower bounds. Even if we could just break the cubic barrier in this way, that would be great. So that's all.