Our speaker is Pravesh Kothari from Carnegie Mellon, and he will give a talk about strongly refuting all semi-random Boolean CSPs. Go ahead, Pravesh. Okay, let me know if you have trouble hearing me. So thanks, Andrei and Yakub, for inviting me to give this talk. I'm super excited to tell you about this very recent work, joint with my student Jackson Abascal and with Venkat Guruswami, who I think is on the call. Before I begin, I want to apologize: I tried making actual slides in actual PowerPoint, but I was very far from finishing, and I thought, well, you're all mathematicians, we can take the math. So I'm just going to do it in the style that I'm teaching these days. Hopefully it will be fine and my handwriting will be readable. Of course I'll be slow, so feel free to stop me. Good. So I don't expect you to understand any of the words in the title right now, except maybe CSPs, and I will explain everything about what the result means. I actually probably won't get to the details of how our result works, because my main focus is to impart some understanding of the tools that we have in this area; if I only get through that, I would consider it a successful talk. Towards the end, if I have time, I'll tell you a little bit about some of the new ideas in this work. Good, so let's begin. Let me start straight away by defining the central object. The goal in this talk is to design certain kinds of algorithms, called refutation algorithms, for certain problems related to CSPs. There is a description in gory detail, which I'll go through very slowly, but at a high level you should think of it as a natural average-case problem related to the max-CSPs we are all familiar with. More specifically, it's an algorithmic problem whose input is a CSP instance. I'm always going to write n for the number of variables in the instance and m for the number of constraints; think of m as some function growing with n, and m will actually always be at least n, as I'll soon explain. The constraints are generated by applying a certain predicate, which I think of as an arbitrary k-bit Boolean function, so it's an arity-k predicate, and it takes only Boolean assignments. For this talk, my Boolean values are plus or minus 1; those are my bits for today. And I'm going to label satisfied as 1: a constraint is satisfied when the predicate takes the value 1 at the assignment, and unsatisfied when the predicate takes the value 0. So hopefully the high-level description of the instance is clear: it has n variables and m constraints, and it is described by the predicate P. Now, more specifically, when I want to choose an instance, I have to give you two sets of things. First, the k-tuples on which this predicate is applied: I will have m different k-tuples, and I will think of this collection as a k-uniform hypergraph on n vertices, so this hypergraph H describes the ordered k-tuples that appear as constraints in your CSP instance. Good. But that's not enough, because on every k-tuple I will apply a certain literal pattern before applying the predicate P. The literal pattern, or negation pattern, is described by another k-bit string.
This time, I have a k-bit string for every hyperedge of the graph H. And because I'm working in this Boolean setting where bits are plus or minus 1, I can think of each of b_{i1}, b_{i2}, ..., b_{ik} as a plus-or-minus-1 bit itself. Here's how the constraint works in this setting. If the i-th k-tuple is (x_{i1}, x_{i2}, ..., x_{ik}), then to generate the constraint I first shift the bits by multiplying them by the corresponding b values: x_{i1} gets multiplied by b_{i1}, and so on up to x_{ik} multiplied by b_{ik}. Then I apply the predicate, and the i-th constraint is satisfied at x when P, evaluated at this shift of x by b, takes the value 1. Now, if you're more familiar with the 0/1 version of things: multiplying by b_{i1} simply corresponds to negating x_{i1} when b_{i1} is minus 1, because multiplying by minus 1 flips x_{i1} between plus 1 and minus 1. So this is simply the plus-minus-1-world description of the usual negations that you might be very familiar with. Good. So now we have a description of the instance in some gory detail. For such an instance, I write val(I) for the value of the instance: the maximum fraction of constraints that any assignment can satisfy. In particular, I'll think of I(x) as the function that calculates the fraction of constraints satisfied by a given assignment x. So I treat the script I as a function from {plus 1, minus 1}^n to the interval [0, 1], and val(I) is the maximum possible fraction of constraints satisfied by any assignment x. Again, feel free to stop me; I'm going to be super slow and try to be super clear. Good. So given such an instance with all this description to play with, what is our goal? Our input, remember, was this CSP instance. Our goal is to output some value v in the interval [0, 1]. I say the algorithm is correct if for every possible input instance I, the output value v satisfies v greater than or equal to val(I), meaning I always return a value that is at least the true value of the instance, never smaller. Notice that so far you can trivially solve this problem by not even looking at the instance and outputting 1 all the time, so of course the problem isn't interesting yet. In fact, I talked about average case, and there is nothing average-case about the problem so far: this correctness guarantee has to hold for every instance I. The average-case component comes in through the usefulness part of the guarantee. In addition to the correctness guarantee, I also want a usefulness guarantee, and that is going to be an average-case guarantee. In particular, it says that when the instance is chosen according to some distribution D (I'm going to describe some natural distributions on CSP instances very soon), the algorithm should output a value v strictly less than 1 at least 99% of the time. In other words, for 99% of the instances chosen according to D, the algorithm outputs a value strictly less than 1.
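Here is a minimal sketch in Python of the objects defined so far, just to fix notation; all function and variable names are mine, not from the talk, and the brute-force value computation is of course only feasible for tiny n.

```python
import itertools

def fraction_satisfied(tuples, patterns, predicate, x):
    """Evaluate I(x), the fraction of constraints an assignment x satisfies.

    tuples:    m ordered k-tuples of variable indices (the hypergraph H)
    patterns:  m lists of +/-1 literal-pattern bits (the b_i's)
    predicate: maps a k-tuple of +/-1 bits to 1 (satisfied) or 0
    x:         assignment, a list of n values in {+1, -1}
    """
    satisfied = 0
    for C, b in zip(tuples, patterns):
        # shift the bits: multiply each variable by its +/-1 pattern bit
        shifted = tuple(b[j] * x[C[j]] for j in range(len(C)))
        satisfied += predicate(shifted)
    return satisfied / len(tuples)

def value(tuples, patterns, predicate, n):
    """val(I): the max of I(x) over all 2^n assignments (brute force)."""
    return max(fraction_satisfied(tuples, patterns, predicate, list(x))
               for x in itertools.product([1, -1], repeat=n))
```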
What does strictly less than 1 mean? So remember, v always has to be an upper bound on val(I). If v is strictly less than 1, it means in particular that the algorithm is claiming that the instance is unsatisfiable. And because of the correctness guarantee, because the algorithm never underestimates the value, the transcript of the algorithm gives me a certificate that the input instance was unsatisfiable. So this is basically a formalization of the problem of generating a certificate of unsatisfiability for a random instance drawn according to the distribution D. Any questions so far on the setup? Good. Again, feel free to stop me any time. So let's clarify: which distributions are of interest to me? I'm going to talk about two different kinds of distributions on CSP instances. The main result is really about semi-random CSP instances, but in order to generate intuition and relate it to known problems, I also want to talk about the usual, plain random CSP model. I feel like if you've ever seen the phrase "random CSP" at all, this is the model you've seen. Let me describe what that means. Remember our instance: to choose it, we had to choose the set of k-tuples that appear in it, described by a hypergraph H, and we had to describe the negation patterns on each hyperedge. In the random CSP model, both choices are made uniformly at random: choose the hypergraph to be a uniformly random k-uniform hypergraph with m hyperedges, and choose every negation pattern to be a uniform k-bit string, independently for every edge. That's the random CSP model, and plugging this particular distribution into the usefulness guarantee gives me one instance of the refutation problem I just defined. The second, slightly more general distribution I care about today is when the graph H is completely arbitrary: there is no distribution on H at all, and I want the algorithm to work for every possible choice of the hypergraph H, while the negation patterns continue to be chosen uniformly at random and independently for every hyperedge. Those are the two models I will care about. I'm going to call the second the semi-random model and the first the fully random model. The point is that the semi-random model has much less randomness in its description than the random CSP model, in particular because the k-tuples are completely fixed. Good.
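In code, the two models differ only in whether the hypergraph is sampled or handed to you; a sketch under the same naming assumptions as before, ignoring details like repeated indices within a tuple:

```python
import random

def fully_random_instance(n, k, m, rng=random):
    """Fully random model: k-tuples and literal patterns both uniform."""
    tuples = [tuple(rng.randrange(n) for _ in range(k)) for _ in range(m)]
    patterns = [[rng.choice([1, -1]) for _ in range(k)] for _ in range(m)]
    return tuples, patterns

def semi_random_instance(tuples, k, rng=random):
    """Semi-random model: the k-tuples (hypergraph H) are arbitrary and
    fixed; only the literal patterns are drawn uniformly, independently."""
    patterns = [[rng.choice([1, -1]) for _ in range(k)] for _ in tuples]
    return tuples, patterns
```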
Now, when I described the problem, some of you might have felt some awkwardness: why should the correctness guarantee and the usefulness guarantee be satisfiable at the same time? In particular, if v is less than 1 and v is always lower-bounded by val(I), such a goal is feasible only if the true value of the instance, when the instance is chosen according to the distribution D, is strictly smaller than 1. If the instances are satisfiable, there is no algorithm that can ever meet my requirements. So let's make sure that this goal is actually feasible for both models. I'm not going to give you the details of this very, very simple argument, but just by a Chernoff-bound-plus-union-bound argument, you can prove that as long as m is bigger than n over epsilon squared, up to a constant that depends on the arity of the CSP (you can take it to be 2^{O(k)}, so for constant k this is some n over epsilon squared constraints), then regardless of whether the hypergraph was chosen uniformly at random or is a fixed, completely arbitrary graph, as long as the negation patterns are chosen uniformly at random and independently for every hyperedge, with 99% probability no assignment satisfies more than a |P^{-1}(1)|/2^k + epsilon fraction of the constraints. Now let's think a little more about what this expression is. P is the predicate, and P^{-1}(1) is the set of all assignments that satisfy the predicate. It's a k-bit predicate, so there are 2^k possible assignments, P^{-1}(1) is a subset of all k-bit strings of size at most 2^k, and the ratio |P^{-1}(1)|/2^k is the fraction of assignments that satisfy the predicate. So this upper bound on val(I) is saying that you're basically very, very close to the fraction of assignments that satisfy the predicate P. In particular, if the predicate P is non-trivial, meaning it's not satisfied by every assignment in the world, then this number is strictly less than 1 when you make epsilon a tiny enough fixed constant. Good. Now, some intuition for this value, again because it's very simple to prove: it is simply the expected fraction of constraints satisfied when you choose a perfectly uniform random assignment x (without the epsilon, of course). So what this bound is saying is that if you have more than n over epsilon squared constraints, then basically no assignment beats the guarantee of the random assignment significantly; the random assignment is virtually the best possible assignment for this instance. In particular, as I said, if epsilon is tiny enough and P is non-trivial, then this value is strictly less than 1, which means our refutation goal is reasonable, not bogus, as long as I have at least this many constraints. Good.
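Spelled out, the feasibility argument is just the following (a sketch with constants suppressed; only the randomness of the pattern bits is used, which is why it covers the semi-random model too):

```latex
\[
  \Pr_b\!\left[\,\mathcal{I}(x) \ge \tfrac{|P^{-1}(1)|}{2^k} + \varepsilon\right]
  \;\le\; e^{-\Omega(\varepsilon^2 m)}
  \qquad\text{for each fixed } x \in \{\pm 1\}^n,
\]
since over a uniform pattern $b_i$ each constraint is satisfied independently
with probability $|P^{-1}(1)|/2^k$. Union-bounding over all $2^n$ assignments,
\[
  2^n \, e^{-\Omega(\varepsilon^2 m)} \;<\; 0.01
  \quad\text{once}\quad m \;\gtrsim\; \frac{n}{\varepsilon^2},
\]
with a $2^{O(k)}$ factor absorbing the arity dependence.
```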
Very good. Before going further, one last bit of information I want to impart. I focused on the goal of proving that the value is less than 1. But now that we've discussed that the true value is around |P^{-1}(1)|/2^k, I can make my goal a little stronger: prove to me, 99% of the time, not just that the value is less than 1, but that it is at most |P^{-1}(1)|/2^k + epsilon. It's a reasonable goal because it's the truth; in particular, a brute-force algorithm could meet it, so it's legitimate to ask of our algorithm. Such a goal has a name: strong refutation. The goal I told you earlier is referred to as weak refutation when you want to distinguish it from strong refutation. Good. And so the key algorithmic question of concern to us, after all this discussion, is: what is the smallest number of constraints required for an efficient refutation algorithm to exist? Notice that as m increases, in a certain sense the contradictions in the instance increase, so it should become easier and easier to prove that it is unsatisfiable; and in particular, a brute-force algorithm succeeds as long as m is at least n over epsilon squared. So you can ask: what is the smallest m at which I can efficiently produce a certificate of unsatisfiability? Formally speaking, what is the smallest m for which a polynomial-time refutation algorithm exists? That's our main concern today, and we'll focus mostly on strong refutation, because that's what our algorithms accomplish. Good. Now that I have told you the problem, I want to tell you a little about its very long history. It's going to be extremely abridged, and I'll be extremely sparse in my references, pardon me for that, but I'll at least try to convince you of the different fields this same problem occurs in and why people care about it. Most of the work in this direction has gone into studying the fully random model, the first model I described. One reason it was studied early on, especially in the early 90s, is in the context of proof complexity. As you saw, this problem is at heart about generating certificates, or proofs, of unsatisfiability, so it's no surprise that proof complexity theorists were very interested in it, and that is in fact why they cared. One message of this line of work is that in various restricted proof systems that proof complexity theorists study, there are no short refutations, no efficient refutations under various notions of efficiency, for, say, random SAT and other random CSPs when m is about order n. So in the regime around n over epsilon squared, where a brute-force algorithm exists, where a long certificate exists, you can ask whether there is also a short and simple certificate in various senses of the word, and one message of this line of work is that there isn't: this average-case version of the problem is likely hard, at least in these restricted proof systems. That's basically a long line of work; I only have two early papers here, but many, many papers follow up on them. Good. At the same time, there was algorithmic progress: papers trying to design efficient algorithms for this problem. One class of algorithms now known to do very well, indeed to give the best known guarantees, are certain algorithms called spectral algorithms, and I'll explain what this means in just a little bit. But the main number I want you to take away right now is this: if we focus on random 3-SAT, where the predicate P is the 3-SAT predicate, then an efficient refutation algorithm is known when m exceeds n^{1.5}. Okay. Now one note of caution for the whole talk: when I write this squiggly inequality, I'm going to ignore polylog factors.
And you'll see it's reasonable to do so, because we are about a square root of n larger than the information-theoretic minimum that we saw using the Chernoff-plus-union-bound argument earlier. Because we are a factor root n larger, it's not a big deal to ignore polylog factors; we are really interested in the polynomial asymptotics because of this kind of bound. So the message is that we only know an efficient algorithm for the refutation problem when the density exceeds the minimum possible density by an additional square-root-of-n factor, up to polylog terms, and there is no known algorithm that works below n^{3/2}. In other words, if you have n^{1.49} constraints, there is no known efficient polynomial-time algorithm that can generate refutations for random 3-SAT. Good. So, the next piece of history. Unfortunately I don't have the reference handy, but I believe this algorithm appears in a paper of Coja-Oghlan, Goerdt, and Lanka, if I remember correctly. After that, there was a lot of work in the late 90s and early 2000s on finding better algorithms, and people realized that they couldn't quite come up with them. A big step was taken in this direction by Feige in 2001, in a really nice, really famous paper where he formulated a conjecture that this is in fact not possible. That conjecture is called the random 3-SAT hypothesis. (Actually, it's a cool sociological study in how people name things: I always thought of hypotheses as weaker than conjectures, and I wondered whether a hypothesis can rise to the stature of a conjecture.) Anyway, Feige's random 3-SAT hypothesis basically says that there is no polynomial-time refutation algorithm, in the sense we just discussed, when m is Theta(n). In other words, the conjecture says that if you want a polynomial-time refutation algorithm, you need a density growing asymptotically faster than n. This is of course a pretty conservative conjecture given the status of the algorithms, and over time stronger variants of the conjecture have been proposed, as well as variants for other CSPs, other predicates, et cetera. These have served as starting points for a whole host of hardness results that I will not enumerate, but the applications arise in learning theory, statistics, game theory, and many other fields. So you can basically use this as a starting point for average-case hardness results. Okay, good. And, not to stress the applications too much, one strong application arises in cryptography, where you can use this average-case assumption to build what are called local pseudorandom generators. I don't want to go into too much detail; let's just have the right buzzwords. Good. In terms of algorithmic progress directly relevant to us, there is a string of recent works in the last five years that addressed this problem in complete generality for random CSPs, the first random model.
So, there were two algorithmic works that gave new spectral algorithms with very good guarantees, generalizing this n-to-the-three-halves guarantee to other CSPs, in a certain sense to every predicate. And it turns out that among a large class of algorithms captured by the so-called sum-of-squares proof system, these works showed that the algorithms are in fact optimal. So at least if you restrict to the broad family of sum-of-squares algorithms, we basically have a complete understanding of what happens in the refutation problem for the random CSP model, for every predicate. Again, I don't have much time to discuss the specifics, but this characterization is in terms of a simple property of the predicate: you look at the predicate and read off how uniform it is (I'll say precisely what this means near the end of the talk), and that completely decides the threshold for efficient refutation, at least if you only care about sum-of-squares algorithms. The message of these works that I want to convey is: if you believe in spectral algorithms, and given that we don't know of any other technique that does better for these problems, then we essentially completely understand the status of random CSPs. These results also serve as evidence for the hardness assumptions used in the reductions I mentioned earlier. Good. That's the story of random CSPs. The story of semi-random CSPs is less well developed. The model I defined earlier was proposed by Feige about 15 years ago. In some sense, Feige was motivated by several works that studied semi-random versions of the coloring problem, and said we should consider this problem for 3-SAT, et cetera. Random CSPs were already well studied, and he generalized the setting to semi-random ones. The main motivation, as you might imagine, is that when you're designing algorithms, you want to know which properties of the random instance are really relevant for your algorithm to succeed. In other words, you want your algorithms to be somewhat robust, to use as few properties of a random instance as possible; in a certain sense, if you want to understand the complexity of the problem, that's what you should do. And maybe you're concerned about how brittle your algorithm is, how reliant it is on assumptions about the structure of the random model. That was the intrinsic motivation Feige had when he formalized this model. Since then, the question of extending algorithms for random CSPs to semi-random settings has been raised by a number of different works, and it occurs specifically in the context of certain pseudorandom generators that cryptographers have proposed in order to build the cool gadget called indistinguishability obfuscation. Again, not our interest here, but there are genuine motivations for studying the semi-random model arising in cryptography. Good. That's all I want to say about this abridged history of CSP refutation in both the random and semi-random models.
So now I can tell you the main result of this work, as an informal statement: for every CSP, the guarantees you obtain in the random case can be matched in the semi-random case. In other words, where the algorithms we had so far used the randomness of the hypergraph H, you can now design new algorithms that match the exact same guarantees even when the graph H is completely arbitrary. And when I say the same, we lose some extra polylog factors that we don't know how to remove; so it's not the same if you care about polylog factors, but if you care only about the asymptotic growth of m, then we actually match the random case. In other words, if you think the algorithms for the random case are optimal, this says we have the right algorithms for the semi-random case too; if not, you had better improve the random case before trying to improve the semi-random case. The other way to interpret this: somewhat surprisingly, the semi-random case is no harder than the random case. Okay. Let me pause and see if there are any questions so far, because I have dumped a lot of information on you. Okay, good, let me continue. I now want to tell you a little about what techniques go into solving this refutation problem, and hopefully, at least at a high level, some of the new ideas in this work. Sorry, Pravesh, can I interrupt with a question from the beginning? Your predicates: are you assuming in this work that they are symmetric? Somehow it was suggested by this hypergraph viewpoint. Do they need to be symmetric? Good question. No. The hypergraph language was loose; you are absolutely right. I really have a collection of ordered k-tuples, so my predicates are not necessarily symmetric. Okay, thanks, so it's more general than I thought. Yes, exactly. Good. Okay, so let's continue. Most of the talk is really about ideas that would be considered old, but I actually don't know how well known these ideas are, and they're really cool, so I want to leave you with some things you might enjoy even more than our present work. The first trick I want to tell you about: it turns out that if you don't care about losing constants in the threshold for the number of constraints m, then you can essentially focus attention on solving the problem for k-XOR. k-XOR, if you don't know it, I'll define soon; it's the specific predicate that takes the value 1 exactly when the parity of the bits is even, and the value 0 (or minus 1, in the plus-minus convention) when the parity is odd. This is a classical trick that Feige already used in his 2001 work; honestly, it might be even older than that.
And for applying it to other CSPs, this trick was generalized in the work of Allen, O'Donnell, and Witmer. The message is that there is a completely black-box reduction, working in both the random case and the semi-random case, which says that if you can solve the refutation problem for k-XOR at the right density, the right m, then you can solve it for all CSPs at the right m for those CSPs. Again, I will not tell you how this trick works, because I want to get to more interesting ideas, but it is good to remember: this refutation problem, which can look insurmountable at first because it seems you must handle every predicate in the world, completely reduces to just handling XOR. Suddenly it seems much more tractable, because there are only a few predicates to deal with. Good. Given that, I can state the main quantitative result. The genuinely new thing we do is for k-XOR, and we then plug in this Feige XOR trick to get the result for all CSPs. In particular, we prove that there is an algorithm that solves the semi-random refutation problem for k-XOR whenever m is at least n^{k/2}. If you remember the 3-SAT example I gave earlier: it turns out that 3-SAT essentially reduces to 3-XOR, and the n^{3/2} bound I told you about is exactly the k equals 3 instance of this n^{k/2} bound. That earlier result was about the random case, and ours is about the semi-random case, but the bound is exactly the same, up to polylog factors, as the fully random case. So the main result is really proving this n^{k/2} bound for XOR in the semi-random setting, for every k. Okay. Let me refine this observation by pointing out a distinction that a priori might look like some insane technicality but turns out to be a real issue, one we are going to tackle: the difference between even-arity and odd-arity XOR, like 4-XOR versus 3-XOR. It turns out, and I'll tell you exactly why, that the even-arity case is easy: there are some simple tricks that handle even k, but those simple tricks do not work for odd k. So our main result can be refined even further: the real, true, new result is semi-random refutation for odd-arity k-XOR, because the even case is really simple and already known. That's all we are doing in this work: solving the case of odd-arity k-XOR for semi-random refutation. Okay, good. I'm going to come to the details in a second, but just to justify why this distinction matters: even for fully random CSP refutation, the even-arity case was done much earlier, while the odd-arity case took a long time. At the right level of generality, it was finished only about five years ago, in a very nice work of Barak and Moitra, who used some heavy random matrix theory tools, the trace moment method, to establish the n^{3/2} threshold for 3-XOR and, more generally, the n^{k/2} threshold for odd k-XOR. Okay.
And as I already told you, an n^{3/2}-type bound was known for random CSPs earlier, but it turns out those bounds establish only weak refutation, and even weak refutation required a lot of work, even in the fully random setting: you need some extremal combinatorics results about the existence of certain combinatorial objects called even covers to make it happen. The only point of these two remarks is that odd-arity refutation was non-trivial even in the fully random case. As I'll tell you, our main idea is to use a combination of three different techniques, which I'll explain in a second: spectral methods; a semi-definite-programming-based refutation algorithm; and a combinatorial decomposition theorem. One very nice takeaway, and if I don't get to it I really recommend checking out the overview section of our paper, which is about a page and a half long, is an extremely simple proof that we can now give of this n^{3/2} bound in the fully random case. As I told you, odd-arity XOR refutation, even in the fully random case, was non-trivial and required a really technical proof, but one of the key points that comes out of our work is that the new technique gives a really simple proof for the fully random case. That's really important, because we use it as a starting point: if we didn't have a really simple proof, we would not have been able to extend it to the semi-random case. So the key idea is to find a really simple refutation algorithm for fully random odd-arity XOR. My ambitious plan was to explain this proof completely, but given how time is progressing, I probably won't get to it; therefore I strongly recommend the page-and-a-half overview section in the paper. With the notation I've set up right now, it should be completely readable without any other context. Good. So let me now explain the three pieces I told you about: spectral refutation, semi-definite-programming-based refutation, and the combinatorial decomposition. I need some extra notation, unfortunately, but it's simple. Say the k-tuples that appear in my CSP instance are C_1 up to C_m. Then, to clarify, the XOR predicate that we've been talking about corresponds to the function P(x) that computes just the product of the bits in the k-tuple defining the constraint. The point is that in the plus-minus-1 world, the product computes the parity of the bits, because the product records how many times minus 1 appears. So XOR has a really simple representation: it is simply a monomial in the plus-minus-1 world. Good. Now what about the negation pattern? Remember, we had this negation pattern applying negations to every possible bit.
So, because my predicate is simply a monomial, I get to multiply all these negation-pattern bits together and extract a single bit, because that's all the negation pattern does now. This means I can represent negation patterns by a single bit instead of the k bits needed for general predicates. So my negation pattern is now a single bit b_i, one for each constraint, and remember, it is chosen uniformly at random and independently for every constraint in both models I care about today. Given this, the function I(x), specialized to XOR, is simply (1/m) times the sum over i of b_i times x_{C_i}. Notice x_{C_i} is just the monomial on the k-tuple of variables that appears in C_i: I take the plus-minus-1 parity of the bits in C_i, multiply it by the negation bit that applies to it, and average over all constraints. Now, why is this I(x) relevant? Notice that if an assignment x satisfies constraint i, then b_i times x_{C_i} equals 1; in other words, the product x_{C_i} takes the value b_i. That's what it means for the constraint to be satisfied. So if x satisfies all the constraints, each b_i times x_{C_i} is 1, and the average is 1. Stop me if anything here is unclear. Okay, very good. So this particular polynomial expression, which characterizes the instance, is the only thing I now have to deal with. In particular, as I observed, if the instance is satisfiable, then the maximum of I(x) is 1. And in general, if every assignment gives I(x) a value of at most some number epsilon, then you can prove that val(I) is at most one half plus epsilon over two. This is just a shift by one half and a rescaling of the value we saw earlier: if you look at a random assignment, the expected value of I(x) is zero, while val had expected value one half, so the shift by one half appears. The thing to take away is that if I want to refute XOR, to certify an upper bound on the value of a k-XOR CSP, all I have to do is certify that the maximum of I(x) is strictly less than 1, and for strong refutation, at most some tiny epsilon. So let's record that as a goal. My algorithm takes an instance of XOR and should output a value v; v should be an upper bound on the maximum of I(x) over all assignments, which I'm calling the bias of I; and v should be at most epsilon 99% of the time when the instance is random. Now recall where the randomness lives in the semi-random case. The C_i's are fixed, because the hypergraph is completely arbitrary; I have no control over it and it is not chosen at random. The only randomness in the semi-random case comes from choosing the negation patterns, which for XOR is simply a single bit per constraint. So b is chosen uniformly at random from {plus 1, minus 1}^m, and together with the hypergraph H that describes the instance; and I want my algorithm to output at most epsilon 99% of the time. That's a lot of notation, so here is the XOR objective in code.
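A tiny sketch of this objective in the same Python naming as before (names mine):

```python
import numpy as np

def xor_objective(tuples, b, x):
    """I(x) = (1/m) * sum_i b_i * prod_{j in C_i} x_j for a k-XOR instance.

    If x satisfies every constraint (each product equals b_i) this is 1;
    a uniformly random x gives 0 in expectation.  bias(I) = max_x I(x).
    """
    m = len(tuples)
    return sum(b[i] * np.prod([x[j] for j in tuples[i]])
               for i in range(m)) / m
```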
So, let me pause for a few seconds and see if there are any questions I can clarify. Okay, good. Now I want to show you the two tools I was talking about. Let me tell you how spectral refutation even plays a role here: how I can use eigenvalues of certain matrices to come up with upper bounds, with refutation algorithms of this kind. Let's understand the trick when k is 2, so 2-XOR; really, really simple. In the case of 2-XOR, the idea is the following. Look at I(x), the expression we defined on the previous slide. Each x_{C_i} is a monomial of degree 2, a product of two bits, because I have a 2-XOR instance. So I'm going to write this expression as a quadratic form of a certain matrix. What matrix? Define the matrix A, indexed by the variables of the CSP instance, so an entry of the matrix is identified by a pair of variables (r, s). Look at the pair (r, s): either it appears as a constraint or it doesn't. If it doesn't, set A_{rs} to zero. If it does, then the monomial x_r x_s gets multiplied by b_i in the expression, so put b_i over 2 in that entry; the 2 just adjusts for the fact that the pair appears twice, as (r, s) and as (s, r). The observation is that I can take this expression, a degree-2 polynomial when viewed as a function of x, and write it as a quadratic form of the matrix A: A_{rs} plus A_{sr} gives me b_i, and both entries multiply x_r x_s, so I literally get the same expression. The quantity is simply a scaled quadratic form of the matrix A at the assignment x. Okay, good. So now what? Remember, bias of I is the maximum of I(x) over all possible x, which I can upper-bound in the following way: for every x, not just plus-minus-1 assignments but literally every vector x in the world, x transpose A x is at most the squared L2 norm of x times the largest eigenvalue (largest singular value) of A. Notice this is a lossy inequality, because the bound holds not just for plus-minus-1-valued vectors x but for all vectors; it may not be a great bound in general, but it is a valid upper bound. And the squared L2 norm of a plus-minus-1 vector x is n, because each entry squared is 1 and there are n of them. So bias of I is at most (n/m) times the largest eigenvalue of A, which means I can use the following simple refutation algorithm: given an instance of 2-XOR, compute the matrix A, compute its largest eigenvalue, and output v equal to (n/m) times that eigenvalue. This is algorithm number one, what I'm going to call spectral refutation; a sketch follows.
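Here is the spectral refutation as a minimal sketch (assuming each constraint is a pair of distinct variable indices; names mine):

```python
import numpy as np

def spectral_refutation_2xor(n, tuples, b):
    """Output v = (n/m) * lambda_max(A), a certified upper bound on bias(I).

    A[r,s] = A[s,r] = b_i / 2 when the i-th constraint is on the pair (r,s),
    so x^T A x = sum_i b_i x_r x_s, and for every x in {+/-1}^n:
    I(x) = (1/m) x^T A x <= (1/m) ||x||^2 lambda_max(A) = (n/m) lambda_max(A).
    """
    m = len(tuples)
    A = np.zeros((n, n))
    for i, (r, s) in enumerate(tuples):
        A[r, s] += b[i] / 2.0
        A[s, r] += b[i] / 2.0
    lam_max = np.linalg.eigvalsh(A)[-1]  # eigenvalues in ascending order
    return (n / m) * lam_max
```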
Notice that without doing anything, I have a correct algorithm: the value v I output is always an upper bound on val(I). The only question is how good it is; we'll come to that in a second. Before going further, I want to explain a second kind of refutation, which I'll call SDP refutation. Here's what it means. Take our instance, the same quadratic form; let's still focus on 2-XOR. Instead of upper-bounding I(x) in terms of the eigenvalues of A, this time I'm going to be slightly clever and try to use the fact that my vectors x have only plus-or-minus-1 coordinates. I take a simple SDP relaxation of max 2-XOR. If you've not seen this before, it doesn't matter; just trust me that there is a natural SDP relaxation, and here is how it looks. It maximizes (1/m) times the trace of A dot X, where X is the SDP variable, the solution matrix, subject to the constraints that X is positive semi-definite and the diagonal entries of X are 1. In particular, the diagonal constraints are where we use the fact that the coordinates of x are plus or minus 1. One way to think about the SDP: the objective equals (1/m) trace of A dot x x transpose when X is x x transpose for an assignment x, so what we are doing is relaxing the rank-1 matrix x x transpose, which has 1s on the diagonal, to an arbitrary PSD matrix X with 1s on the diagonal. Again, if you've not seen this before, it doesn't matter. Because it's a relaxation, it is again always an upper bound, no matter what the instance. And it turns out you can analyze this SDP quite nicely using a cool tool called Grothendieck's inequality. You don't need to know its statement; in our case it tells you that although we just argued the SDP value is an upper bound on the true bias of I, it is actually not a bad upper bound: the SDP value is at most K times the maximum of I(x), where K grows like log n. In other words, this bound is tight to within a log n factor of the truth. So how do I use this for refutation? I use the Chernoff-plus-union-bound argument for random or semi-random 2-XOR to bound the true value, and then I know the SDP gives a value at most a log n factor worse. So I can use either of these tools to get refutation that works whenever m is at least around n log n for 2-XOR; both tools are enough to handle random or semi-random 2-XOR. Okay, let me pause again for a moment to make sure I didn't lose anyone: I've now told you how to do 2-XOR completely, with two different tools. A sketch of the SDP version follows.
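A sketch of the SDP refutation in cvxpy (my rendering of the standard max-2-XOR relaxation, not code from the paper; assumes an SDP-capable solver such as SCS is installed):

```python
import cvxpy as cp
import numpy as np

def sdp_refutation_2xor(A, m):
    """Relax xx^T (rank-1, PSD, unit diagonal) to an arbitrary PSD matrix X
    with diag(X) = 1 and maximize (1/m) tr(AX).  Always an upper bound on
    bias(I); by the Grothendieck-type bound from the talk, it is within an
    O(log n) factor of max_x I(x)."""
    n = A.shape[0]
    X = cp.Variable((n, n), PSD=True)
    problem = cp.Problem(cp.Maximize(cp.trace(A @ X) / m),
                         [cp.diag(X) == 1])
    problem.solve()
    return problem.value
```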
Good. So what about larger k? Let's look at why the even-k case is easy; I told you earlier it requires no work. Here is the one-line idea. If k is even, I simply think of the k-XOR monomial on k bits as a monomial of degree 2 in an enlarged space: I create n^{k/2} variables, one for every possible (k/2)-tuple of variables, and think of the k-XOR as a 2-XOR over these. If that makes sense to you, here is the notation: I've created variables y_S for all possible (k/2)-tuples S, and I write an arbitrary even-arity XOR constraint as just a 2-XOR constraint in this enlarged variable space. And notice that what I told you earlier immediately gives us n^{k/2}: our algorithm for 2-XOR worked when m exceeded n log n, and the n has changed to n^{k/2} here. And I told you the 2-XOR tools work for both the random and semi-random cases, so I have completely told you how to do even-arity XOR refutation for both cases. That's it, done; see the sketch below. All we have to do now is the odd-arity case, which is of course the main result of the paper.
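A sketch of the even-arity lifting, reusing the 2-XOR spectral routine from before (names mine; since every plus-minus-1 assignment x induces a consistent assignment to the y variables, any upper bound for the lifted 2-XOR is a valid upper bound for the original k-XOR):

```python
def lift_even_xor_to_2xor(n, tuples, b):
    """Turn a k-XOR instance (k even) into 2-XOR on N = n^(k/2) variables:
    each (k/2)-tuple S becomes one variable y_S, and the constraint
    prod_{j in C_i} x_j = b_i becomes y_S * y_T = b_i, where S and T are
    the two halves of C_i."""
    k = len(tuples[0])
    half = k // 2
    N = n ** half

    def index(sub):  # position of a (k/2)-tuple in the enlarged space
        out = 0
        for j in sub:
            out = out * n + j
        return out

    lifted = [(index(C[:half]), index(C[half:])) for C in tuples]
    # feed into the earlier routine: spectral_refutation_2xor(N, lifted, b),
    # giving the bound (N/m) * lambda_max, i.e. refutation once m >~ n^(k/2)
    return N, lifted, b
```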
So let me explain the idea for odd-arity XOR refutation. I'm maybe at most ten minutes from the end of my talk, so let's try, and I'll beg a little extra time from Andrei and Yakub when I get there. You see, there's a very basic problem here: if you try the enlarged-space trick we used for even arity, you would have to use n squared variables, because the natural way to do it is to make variables for every pair and every singleton. In that enlarged space you have n squared variables, which means you succeed for, say, 3-XOR only when m is on the order of n squared. But the right bound you want is n^{3/2}, so you are off by root n. This pesky root n is all that we are removing in this work; that's all we are trying to do: to be cleverer than this simple enlarging trick. So let me tell you how people did it; Barak and Moitra in particular did it for the fully random case, and it's a pretty instructive idea. Here's the idea. Write down the degree-3 polynomial that is I(x) for 3-XOR. For each fixed variable i, they look at all constraints containing variable i, pull x_i out of the monomial, and write the remaining degree-2 polynomial as a quadratic form of some matrix B_i, just as we did earlier: whenever you have a degree-2 polynomial, it's a 2-XOR instance and can be written as a quadratic form. So now I can write I(x) as a sum over i of x_i times x transpose B_i x, where B_i is the 2-XOR instance of all constraints that touch i. So far so good. Now here is the main trick: I have a sum over products of two terms, so I decouple them using the Cauchy-Schwarz inequality. Instead of writing square roots, I square both sides. What's so great about this? Look at the first factor: x_i squared is 1 for every i, so it becomes a constant. And in the second factor, I get to square x transpose B_i x; it becomes a degree-4 polynomial, which I can treat as a 4-XOR instance. So I have reduced my task, perhaps in a lossy way, to bounding this 4-XOR instance. Still following me? Okay, good. Now, the idea is that you can take (x transpose B_i x) squared and write it as a quadratic form in the enlarged space of n squared variables, as a 2-XOR, just like the trick I told you for even arity. And it turns out that the matrix, the quadratic form that defines this 2-XOR instance, can be written nicely as the tensor product of B_i with itself. Trust me on this, because we are running out of time; it is some simple manipulation, simply writing the 4-XOR instance as a 2-XOR instance in the enlarged space, where y is a variable in the n-squared-dimensional space, as a quadratic form of the sum over i of B_i tensor B_i. The full chain of inequalities is recapped below. So, just as in the spectral refutation case, if I can come up with a good upper bound on the largest eigenvalue of this particular matrix, I'll be done. Still good? Barak and Moitra actually do this in the completely random case, but notice that things are getting complicated when it comes to analyzing the largest eigenvalue of this matrix. In the fully random case they succeed, using the workhorse of random matrix theory, the trace moment method, to accomplish this. It turns out that if you try to prove this eigenvalue bound in the semi-random case, it not only doesn't work, it's actually false: there is no good upper bound in general on this matrix in the semi-random case. So you can't succeed using spectral refutation; spectral refutation fails. You can say, okay, we had this other tool, SDP refutation; what about that? If you remember correctly, we had this analysis using Grothendieck's inequality: if we can upper-bound the true value of the instance, then Grothendieck's inequality tells us the SDP value is not too far. This seems almost enough, but we are actually very far, because we are now trying to upper-bound the value of a 4-XOR instance in the enlarged space, with n squared variables, not a 2-XOR on n variables as before, a 4-XOR produced from a 3-XOR. If you wanted to run the Chernoff-plus-union-bound argument to upper-bound the true value of the produced 4-XOR instance, you would need at least n squared bits of randomness. But you started from a 3-XOR instance, and you are trying to handle the regime with only about n^{1.5} constraints, hence far fewer random bits. So you are now in trouble: in some sense, you have to come up with a way to analyze the true value of this instance that beats the union bound by a whole lot.
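To recap the chain of inequalities for 3-XOR in one place, with the normalizations (which got garbled on the slide) filled in as I understand them:

```latex
\[
  \mathcal{I}(x) \;=\; \frac{1}{m}\sum_{i=1}^{n} x_i \,(x^\top B_i x)
  \;\Longrightarrow\;
  m^2\,\mathcal{I}(x)^2
  \;\le\; \Big(\sum_i x_i^2\Big)\Big(\sum_i (x^\top B_i x)^2\Big)
  \;=\; n \sum_i (x^\top B_i x)^2
\]
by Cauchy--Schwarz, and with $y = x \otimes x \in \{\pm 1\}^{n^2}$
(so $\|y\|^2 = n^2$),
\[
  \sum_i (x^\top B_i x)^2
  \;=\; y^\top \Big(\sum_i B_i \otimes B_i\Big)\, y
  \;\le\; n^2\,\lambda_{\max}\Big(\sum_i B_i \otimes B_i\Big),
\]
so $\operatorname{bias}(\mathcal{I})^2 \le (n^3/m^2)\,
\lambda_{\max}\big(\sum_i B_i \otimes B_i\big)$: a good eigenvalue bound on
this one matrix gives strong refutation.
```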
So hopefully you get some idea of why both spectral and SDP refutation are in trouble if you try to apply them in the semi-random case. Okay. So, Yakub, I have only one slide to explain at a high level how we overcome this; is that okay, or should I stop? Well, I guess you can have two minutes, but that's about it. Okay, I'll take only two minutes, so let me skip the remaining math and give you pointers instead. The first thing we do, as I promised, is give a completely new proof of this eigenvalue upper bound on the sum over i of B_i tensor B_i, the matrix that came up in the Barak-Moitra analysis, and we do it using an off-the-shelf matrix concentration inequality, without running the trace moment method. On a good day I would tell you how this works, because it is very simple, but today I will not. Then we open up the proof of why this whole analysis works and do what we set out to do: we identify a deterministic condition on the graph H that makes this analysis work. So now we are doing what we really intended: we were trying to understand which conditions make the analysis of the random case work, and we extract the deterministic condition that makes a simple algorithm work. Then there is only one natural thing to do. This condition was somehow the key idea behind the fully random case, so you identify it as a kind of pseudo-randomness condition, and then we take an arbitrary instance and show we can decompose it into a pseudo-random instance and a structured instance. For the pseudo-random instance we use some variant of spectral refutation (it turns out the vanilla one doesn't work; it requires additional ideas), and for the structured instance we show that, because of the way it is structured, we can actually make semi-definite refutation work. So, in some sense, combining this combinatorial decomposition, which can be computed efficiently, with these two tools, we are done. That's all I want to tell you. Sorry for going a little over time and for not telling you the complete proof. That's okay, let's thank Pravesh. I believe there is a question for you in the chat. Okay, let me take a look. Oh, there it is. Yes, it seems like Anesh's question is already answered. Okay, good. All right, any more questions for Pravesh, please? Leborn? Yes. So just very roughly, what kind of algorithms are used for this decomposition you need? What is it about? Is it some matrix thing or something completely different? Ah, good. No, it's actually a very combinatorial thing. I'll just tell you the punchline, even though the proof itself is not that hard. Here's the punchline. Let deg(i, v) be the number of constraints in your instance that contain both variables i and v; it's a 3-XOR instance, so think of all the constraints that contain the two variables i and v, and in general there can be several. It turns out that the analysis for the random case works as long as the following mysterious combinatorial condition holds: for all pairs of variables v and v prime, the sum over i of deg(i, v) times deg(i, v prime) is at most order log n. A sketch of this condition in code follows.
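A sketch of checking this condition (names mine; the precise condition in the paper may differ in details such as the threshold and how multiplicities within a tuple are counted):

```python
from collections import defaultdict
from itertools import combinations

def satisfies_pseudorandom_condition(tuples, threshold):
    """deg(i, v) = number of constraints containing both variables i and v.
    Check that sum_i deg(i, v) * deg(i, w) <= threshold (think O(log n))
    for every pair of distinct variables v, w."""
    deg = defaultdict(int)        # unordered pair -> co-occurrence count
    neighbors = defaultdict(set)  # variable -> variables it co-occurs with
    for C in tuples:
        for a, c in combinations(sorted(set(C)), 2):
            deg[(a, c)] += 1
            neighbors[a].add(c)
            neighbors[c].add(a)

    def d(a, c):
        return deg[(min(a, c), max(a, c))]

    variables = sorted(neighbors)
    for v, w in combinations(variables, 2):
        total = sum(d(i, v) * d(i, w) for i in neighbors[v] & neighbors[w])
        if total > threshold:
            return False
    return True
```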
And this is actually very easily satisfied in the random case, because if you take fewer than n squared constraints, then the number of constraints that use any fixed pair of variables is basically order 1. So that's it: this is our combinatorial condition, our pseudo-randomness condition. Whenever this condition holds, the random-case analysis goes through; that is the key observation. And what we do is break the instance into parts: the pseudo-random part satisfies that deg(i, v) is always small for every pair of variables, and we collect the rest into the structured instance, where this is not true, and show that something else holds there that makes the SDP refutation go through. Sorry for not being able to give more detail here, but it's a very simple combinatorial property; nothing very complicated is going on. Thank you, I'm happy with that. Good. Any more questions, please? Well, I might ask a very simple question. You started with the random case, where everything is random, and then you removed some randomness. So what is this randomness: what is it that you have to keep, and what can you remove? How much less random can you make it? Good; you're asking whether we can go beyond semi-random in some sense, is that the question? Maybe not beyond, but something different, less random. Right. In a certain sense, I think the semi-random model I discussed is probably the least random model in which I expect one can succeed here. Perhaps there is a way to choose the literal patterns in some mildly correlated way and make this analysis still go through; you might be able to relax the strict requirement that the literal patterns for the different k-tuples be independent. But to be honest, I don't see natural models that go much beyond that. In a certain sense, it seems we are already at roughly the minimal level of randomness that at least our algorithmic ideas need. Okay, thanks. If I may, I might have another question relevant to this one. You're dealing with Boolean CSPs, right? Yes. So do you have some intuition about the right definition of semi-random in, say, the three-element case, when the domain has three elements? Ah, good. Yes, I think so. The k-tuples are still going to be arbitrary; that's the natural thing to do. So the question is what the literal pattern should be, and maybe one way to do it is: you pick an element of F_3 to the k at random and shift the constraint additively by it. It's easier to write in the additive world: imagine your assignment now lives in F_3 to the n, over the field of three elements, and for every constraint you choose a b in F_3 to the k. The i-th constraint then basically corresponds to taking (x_1, ..., x_k), the k-tuple that appears, adding the vector b, and requiring the predicate to take the value 1 on the shifted tuple. Does that seem reasonable? So instead of the random flips, I'm applying a random shift to every variable. And you think this will be enough?
Because you could actually make it more random by taking any permutation of the three values, not just the shifts. Yeah, that's true. But at least in the fully random case, where we actually do know how to handle the higher-alphabet CSPs, this shift notion is enough. So my guess is that it might be enough even here, but I might be wrong, because we have not tried the higher-alphabet case at all; in fact, that's a nice open question to work out. Yeah, I think since it gets the expectation right, this should be enough, since all you use is first-moment-type things and independence. Okay, cool. Thanks. Any more questions for Pravesh? Yes, there's a question in the chat. The question is this: you use only one predicate; what if you have more than one? Yeah, it's a really good question. Even though this has explicitly not been studied even in the fully random case, I think existing techniques, with some minor modifications, might give the right answer for any set of predicates. If you take a collection of predicates: I told you earlier about the property that completely governs the behavior of random CSPs; let me refer to it vaguely as the degree of uniformity. It seems to me that the behavior for refutation, at least for weak refutation, would be completely governed by the easiest predicate for refutation in your template; in some sense, it would be some simple max slash min of this property over the predicates in the template. If I had to generalize, that would also be my guess for the semi-random case, even though neither of these things has been tried or written down yet. Does that make sense, Michael? Okay, good. Yes, I'm curious about it. So what you're saying is that the predicates are actually somehow linearly ordered by how good or bad they are, right? Actually, yeah. Okay, I'm tempted to tell you what this property is, because it's very simple; give me at most two minutes, since I've referred to it twice now. Let's look at 3-XOR, the predicate we talked about. What's the property? It has the following property: there is a distribution on the space of assignments that is supported only on satisfying assignments of 3-XOR and is pairwise uniform, meaning every pair of coordinates of a sample is jointly uniform. And what is the distribution? It is simply the uniform distribution on all the satisfying assignments; you can check very easily that it is pairwise uniform. This is the key property that governs whether you need n^{3/2} constraints or some constant times n is enough. And there is a generalization of this property, the ability to support a t-wise uniform distribution rather than a pairwise one, that governs the higher refutation thresholds. So it's a very simple property; a brute-force check for 3-XOR is sketched below.
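A brute-force verification of this property for 3-XOR (a sketch, names mine):

```python
import itertools
from collections import Counter

def three_xor_is_pairwise_uniform():
    """The uniform distribution on satisfying assignments of 3-XOR
    (x1 * x2 * x3 = 1 in the +/-1 world) has every pair of coordinates
    jointly uniform on {+/-1}^2."""
    sat = [x for x in itertools.product([1, -1], repeat=3)
           if x[0] * x[1] * x[2] == 1]        # the 4 satisfying assignments
    for a, b in itertools.combinations(range(3), 2):
        counts = Counter((x[a], x[b]) for x in sat)
        # pairwise uniform: all 4 sign patterns appear, equally often
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True

print(three_xor_is_pairwise_uniform())  # True
```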
And what I'm saying is: if you have a set of predicates, and there is at least one of them that is not pairwise uniform, meaning there is no distribution on the set of satisfying assignments of that predicate which is pairwise uniform, then you should be able to refute, at least weakly refute, the whole CSP with about n log n constraints. If, on the other hand, all the predicates are pairwise uniform, meaning for every predicate there is some distribution supported entirely on its satisfying assignments that is pairwise uniform, then you would need n^{3/2} or a larger number of constraints. That is my guess, given what I understand about the fully random case right now. Does it make some sense? Okay, yeah, thanks. I mean, it's obviously hard to give intuition for why this property matters, but it turns out it does. Okay, well, thanks a lot, Pravesh; let's thank him again for a great talk. The next talk will be in early December, given by Alex Brandt, and you will be notified as usual. Okay, thanks, Andrei and Yakub. Thanks all for joining. Thanks for the talk. Thank you.