Hi, everyone. So yeah, I'm going to talk about worst-case hardness for LPN via code smoothing and some applications such as cryptographic hashing. And this is joint work with Vadim Lyubashevsky, Vinod Vaikuntanathan, and Daniel Wichs. So let's all remember what the LPN problem is. Essentially, it's the problem of solving a random system of linear equations over the binary field. So we have a system of m equations in n variables. You can represent it as a matrix A, and this is going to be a random matrix, because we're thinking about a set of random linear equations. And the variables are this secret vector s, which is what we actually want to recover. We think of these variables as also chosen at random, because this is actually the hardest case. So I'm going to give you the coefficients of the equations, the random matrix A, and I'm going to give you the vector b, which is A times s. And the goal is to find the variables s. Of course, this is not an intractable problem; there's no noise here, so you can think about this as learning parity without noise. But if you want to think about learning parity with noise, then we add a little bit of noise to each one of these equations: flipping the output of the equation with some small probability. This can be thought of as adding IID Bernoulli noise. So this is the learning parity with noise problem, a very well studied problem. Let's think about it for a second: in which parameter regimes is this problem easy, and in which is it hard? It's not so hard to see that as the Hamming weight of the noise gets higher, closer to 1/2 (I'm only going to think about noise with weight less than 1/2; otherwise, it's symmetric), the problem actually becomes harder. We already saw that if the weight of the noise vector is 0, the problem is easy, because you can just solve it as a system of linear equations. So I'm thinking about delta as the relative weight of this vector, which is also the parameter of the Bernoulli noise. For delta equal to 0, the problem is easy. And also for slightly larger values than 0: it's not hard to see that a noise rate on the order of 1/n is also going to be easy to solve. And actually, you can take this a little further and show that even if the relative noise is log n over n, you can still solve this problem in polynomial time. The idea is to pick a random set of n out of these m equations. With probability 1 over a polynomial, all of these equations are going to be noise-free, and you can use these equations to recover the secret s, and then test it against the other equations. I should say that people think of variants where the number of equations m is fixed ahead of time to be some polynomial in n, say m equals n squared, or variants where the adversary can ask for as many equations as they want, up to any polynomial. So this is the case of easy-to-solve LPN. But if you go a little higher, even to a noise weight of log squared n over n, then the problem, as far as I know, stops being tractable in polynomial time. And in fact, the algorithm that I just described is going to take quasi-polynomial time.
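To make the easy regimes concrete, here is a minimal sketch of the subsampling attack just described, in Python with NumPy. The function names and parameter choices are mine, not from the paper; this is an illustration, not the paper's algorithm verbatim.

```python
import numpy as np

def gf2_solve(A, b):
    """Solve A x = b over GF(2) by Gaussian elimination; return x, or None if inconsistent."""
    M = np.concatenate([A % 2, (b % 2).reshape(-1, 1)], axis=1).astype(np.uint8)
    m, n = A.shape
    pivots, r = [], 0
    for c in range(n):
        rows = np.nonzero(M[r:, c])[0]
        if len(rows) == 0:
            continue
        M[[r, r + rows[0]]] = M[[r + rows[0], r]]   # swap a pivot row into place
        for r2 in range(m):
            if r2 != r and M[r2, c]:
                M[r2] ^= M[r]                        # eliminate column c elsewhere
        pivots.append(c)
        r += 1
        if r == m:
            break
    if M[r:, n].any():                               # a 0 = 1 row: inconsistent
        return None
    x = np.zeros(n, dtype=np.uint8)
    for i, c in enumerate(pivots):
        x[c] = M[i, n]
    return x

def subsample_attack(A, b, trials, rng):
    """The attack from the talk: pick n of the m equations at random, solve them
    as if noise-free, and test the candidate secret against all equations.  With
    noise rate delta = log n / n, a trial hits a noise-free subset with probability
    about (1 - delta)^n = 1/poly(n), so polynomially many trials suffice; at
    delta = log^2 n / n the same approach needs quasi-polynomially many trials."""
    m, n = A.shape
    for _ in range(trials):
        idx = rng.choice(m, size=n, replace=False)
        s = gf2_solve(A[idx], b[idx])
        # a wrong s leaves residual weight ~ m/2; the true s leaves ~ delta*m
        if s is not None and ((A.astype(np.int64) @ s + b) % 2).sum() < 0.4 * m:
            return s
    return None
```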
So it's going to take n to the log n time. And while this is perhaps not the worst running time in the world, it's still not polynomial. So we're going to still think about it as a hard instance of LPN, but it's sort of only barely hard. And I should say that you can also think about any little omega of log n over n, not necessarily log squared; but somehow claiming hardness below n to the log n gives me the chills, so I'm not going to do it. So this is on one end: we're going to think about this barely hard regime of parameters, where LPN is hard, but only barely. On the other hand, we saw that if the noise rate is equal to 1/2, then the noise vector is uniformly random, which means that the vector b does not contain any information about s, and therefore the problem is not solvable even information theoretically. And you can relax a little bit and say that if the weight is something like 1/2 minus negligible, then this property still remains, and the problem is still not solvable even information theoretically. So we want to consider the highest noise level where the problem is still at least information theoretically solvable, and this is the case where the relative noise is 1/2 minus 1 over polynomial. In this case, at least if we set m to be a large enough polynomial, the problem becomes at least information theoretically solvable. And we think of this as the hardest instance of the problem where we could still hope for computational advantage. I should say that for cryptographic applications, this parameter regime can be useful, at least in some cases, so it's not a completely contrived parameter regime. So this is what I wanted to set up about LPN, when it's easy and when it's hard, and we're going to come back to these two parameter regimes later on.

But right now, let me talk about a completely different problem, maybe not completely different. I want to talk about the learning with errors problem. And as you can see, it gave me a very easy time preparing the slides, because the syntax of the problem is almost identical. Again, we want to solve a set of random linear equations with some additional IID noise. However, the difference is that now the system of equations is not over the binary field; it's over some modulus q, and this q is going to be asymptotically large. For the purpose of this talk, just think about q being some polynomial larger than n. Usually, people think about the IID noise as being Gaussian, but the exact distribution doesn't matter; the important thing is that the L2 norm of the noise vector needs to be small. Notice that here we can talk about the L2 norm, because these elements live modulo q, which is much bigger than the noise, whereas talking about the L2 norm in the LPN setting is possible, but it doesn't really carry much meaning. This LWE problem has been pretty useful for cryptography, and we actually know quite a lot about its properties and about its uses. So for example, we have a worst case to average case reduction for the LWE problem. This is an average case problem, because everything here is sampled from a distribution. However, it was shown by Regev that if you can solve this average case problem, then you can also solve worst case problems on lattices. So you can actually solve these short vector problems on lattices.
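As a point of comparison with the LPN sketch above, here is a minimal sketch of sampling an LWE instance. The parameter choices are my own illustrative ones; the talk only assumes q is some polynomial larger than n and that the noise has small L2 norm, not any specific distribution.

```python
import numpy as np

def lwe_sample(n, m, q, sigma, rng):
    """Generate an LWE instance b = A s + e (mod q).  Same syntax as LPN, but
    over Z_q for a large modulus q, with small (here: rounded Gaussian) noise."""
    A = rng.integers(0, q, size=(m, n), dtype=np.int64)
    s = rng.integers(0, q, size=n, dtype=np.int64)
    e = np.rint(rng.normal(0.0, sigma, size=m)).astype(np.int64)
    b = (A @ s + e) % q
    return A, b, s

# Illustrative parameters (mine, for the sketch): q ~ n^2 and sigma ~ sqrt(n),
# so the noise vector's L2 norm, about sigma * sqrt(m), is tiny relative to q.
rng = np.random.default_rng(0)
A, b, s = lwe_sample(n=64, m=128, q=64**2, sigma=8.0, rng=rng)
```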
You don't need to know what those are. Even for the hardest lattice in the world, if you can solve this average case problem, you can solve it. And I guess it's your choice whether to think that this means that this average case problem is actually hard, because it's harder than a worst case problem, or that the worst case problem is actually not that hard, because it's no harder than some average case problem. But at least we know that we have this connection. We also know that this problem is contained in the complexity class statistical zero knowledge (SZK). Again, you don't need to know for the purpose of this talk what SZK is. However, we know, or we conjecture, that problems in SZK are unlikely to be NP-hard, for example. So we have some evidence that this problem is not super hard. These are some properties that we know about LWE. And we also have a lot of cryptographic applications: starting from basic symmetric and public-key encryption, collision-resistant hash functions, homomorphic encryption, attribute-based encryption, and recently non-interactive zero knowledge. So we know how to do a lot of things. And given the minimal edit distance between this slide and the previous one, you would think that maybe this structure can also carry over to the LPN setting. Whatever we can do with LWE, can't we just translate q to 2? And maybe, as I said, the L2 norm doesn't make sense in the LPN setting, so maybe we need to replace it with Hamming weight or something. And we would hope that things would still work out, or at least that a lot of these things would carry over. But it's not the case. We don't know of a worst case to average case reduction or an SZK result. And in terms of applications, we can get symmetric encryption and, in some parameter regimes, also public-key encryption. But a lot of these other applications are still not known. I should say that recently there has been a lot of progress in constructing cryptographic primitives from LPN, so I should probably check ePrint on whether I should update the slide, and I'm hoping that there's going to be a lot of progress in filling in these blanks in the near future. However, at this point, we could still ask: why is it so different? Why is what we can do and what we can say about LPN so different from what we can say and do about LWE? There have been a number of attempts to explain this difference. But we want to try to close this gap as much as we can, and what we show in this work is a first step in this direction, trying to equate the status of these two problems.

So the first thing that we show is new properties: a worst case to average case reduction for LPN. We would like to show that LPN, this average case problem, is harder than some worst case problem. And the worst case problem that we're going to think about is the nearest codeword problem, which comes from the problem of decoding linear codes. For our purposes, you can think about the nearest codeword problem just as a worst case version of LPN. So rather than the coefficients of the matrix A being random elements and the noise being sampled from some distribution, you should think of A and e as chosen adversarially, with some bound on the parameters; in particular, the noise e is going to be sparse but adversarial. This is the nearest codeword problem.
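For reference, here is how I would write down the nearest codeword problem as the worst-case analogue of LPN; this formalization is my own paraphrase of the slide, not a quote from the paper.

```latex
\[
\mathrm{NCP}:\quad \text{given } A \in \mathbb{F}_2^{m \times n} \text{ and }
b = As \oplus e \in \mathbb{F}_2^{m} \text{ with } \mathrm{wt}(e) \le \delta m,
\text{ find } s \in \mathbb{F}_2^{n},
\]
where the generator matrix $A$ and the error $e$ are chosen adversarially,
subject to the weight bound, instead of being sampled at random as in LPN.
```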
And actually, for our reduction, we require the nearest codeword problem with a balanced matrix A, which means that if you think about the matrix A as a generator matrix for a linear code, then all codewords in this code have relative Hamming weight very close to 1/2. How close? Something like within 1/2 plus or minus 1 over poly. So this is perhaps a somewhat aggressive parameter regime; this is what we know how to do. And as an excuse, you can say that, well, since a random code has this property, most codes have this property. I don't know how convincing that is, but at this point, this is what we can say. This is still a worst case problem: any A and e that have these properties are covered by our reduction. And this is not all the bad news. What we show is a worst case to average case reduction with the worst possible performance. We show that the hardest instance of the average case problem, so the hardest LPN you can consider, with noise 1/2 minus 1 over poly, is harder than a worst case problem; but it's only harder than this nearest codeword problem in the barely hard parameter regime, where the weight of the noise is something like log squared n over n. This is still better than what was known before. You could hope to not lose so much in the reduction, but this is what we get in this work. The other thing that we show is containment in SZK. Again, this can be interpreted as evidence of easiness: the problem is not going to be NP-hard. And as you would predict, we have this proof of easiness for the easiest version of these problems: we show that the barely hard LPN, or nearest codeword problem, is contained in SZK. So the easiest parameter regime we can think of is unlikely to be NP-hard. This was not known before; now it is. We can also get new applications based on these techniques. One thing that we can show is collision-resistant hashing based on the barely hard LPN problem. And I should say that concurrently, Yu et al. presented an almost identical construction (the construction itself, that is). In follow-up works, we're actually able to use these techniques for other applications, such as IBE and other things. And again, there are more things that are known now. So this is what we show in this work, and let me tell you a little bit about our techniques.

The technique that we use is what we think of as smoothing. If you've heard this term in the lattice regime, hopefully it will connect; if not, let me tell you what I mean. I'm going to start with a work by Lyubashevsky from 2005. Lyubashevsky considered an LPN-to-LPN reduction. The setting is the following: you want to solve an LPN instance with n squared many equations. However, what you have is a solver for LPN that needs a lot more equations, let's say n to the 100. (Actually, Vadim needed even more than that, but these are the parameters we're going to think about.) So we have a solver that needs many, many LPN equations, and we only got much fewer. And what this reduction shows is that so long as your n-to-the-100 solver can manage with much, much higher noise than the n squared instance that you started from, this is actually doable.
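As a quick aside on the balancedness condition above, here is a minimal empirical sketch of what it means operationally (the function name is mine). Note that sampling random messages only gives evidence: the actual condition is worst case, over every nonzero codeword.

```python
import numpy as np

def max_codeword_imbalance(A, samples, rng):
    """Estimate how far codewords of the code generated by A stray from relative
    Hamming weight 1/2.  Codewords are A @ x for nonzero messages x; a 'balanced'
    code has every nonzero codeword of relative weight 1/2 +- 1/poly(n)."""
    m, n = A.shape
    worst = 0.0
    for _ in range(samples):
        x = rng.integers(0, 2, size=n, dtype=np.uint8)
        if not x.any():
            continue  # the zero codeword is excluded from the condition
        rel_weight = ((A.astype(np.int64) @ x) % 2).sum() / m
        worst = max(worst, abs(rel_weight - 0.5))
    return worst  # a balanced code keeps this below 1/poly(n)
```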
So you can think about it as a reduction where, if you have a solver which requires many equations but tolerates high noise, you can actually use it to solve LPN with much fewer equations but much smaller noise. And let's see how it is done. This is done using re-randomization, and the observation is that the matrix A, this random LPN matrix, can actually be used as an entropy extractor. We know that multiplying a random matrix by a vector with enough entropy is going to extract randomness. So we have this n squared by n matrix, which corresponds to the A part of the n squared LPN instance that we got. And what we know is that if we multiply it on the left by a vector R that comes from a distribution with sufficient entropy, then what we get in the end is a vector a prime, where the joint distribution of A and a prime is uniform, or statistically close to uniform. So in order for this reduction to work, we need to find distributions on R that, on one hand, have sufficient entropy. The required entropy is, of course, at least n bits, because we're generating this a prime, which has roughly n bits of entropy; so we need the distribution of R to have entropy at least n. And in addition, for the purposes of the reduction, as we're going to see in a second, we want R to have as low a Hamming weight as possible. Quickly, if we think about it, then obviously we can get such a distribution with Hamming weight n. But if we think a little bit more, one can actually see that you can generate a distribution with Hamming weight something like n over log n which still has entropy n. And this gap between n and n over log n is going to buy us a lot in this setting. This is what we're going to capitalize on. Once you have this distribution, what you do is the following. The reduction generates n to the 100 vectors R, and each one of these vectors R is used to generate a new sample for the n-to-the-100 solver. So we sample n to the 100 such R's, and for each one of these R's, we compute a prime as I showed here: a prime is R times A, and b prime is just R times b. And we claim that this is going to be close to an LPN instance with the same secret but with somewhat higher noise. And since we can generate as many of these as we want, in particular n to the 100 of them, we can feed them into the n-to-the-100 solver and hopefully get our s back. So let's try to see by how much the noise blows up with this operation. This b prime, which equals R times b: now I'm going to open the parentheses. b just equals A times s plus e. So what we get is R times A times s, which is just a prime times s; this is good, because this is what we want, it corresponds to the equation we're getting for the new LPN instance; plus some e prime. And this e prime is equal to the inner product of the R that we sampled from our distribution and the noise e of the original LPN instance. And if you do the arithmetic, you're going to see that e prime actually grows by quite a lot. And you need to do the calculation exactly and not just the first order approximation, which is going to give you something useless.
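Here is a minimal sketch of this re-randomization step (function names mine). For concreteness it already uses the distribution on R that comes up in the Q&A at the end of the talk: the XOR of roughly n over log n independent random weight-1 indicator vectors, which has Hamming weight at most n over log n and min-entropy about n.

```python
import numpy as np

def sample_r(m, n, rng):
    """Sample r in {0,1}^m as the XOR of ~n/log n uniformly random weight-1
    indicator vectors (the distribution described in the Q&A).  Each indicator
    contributes ~log m bits of entropy, so for m = n^2 the total is roughly 2n
    bits, while the Hamming weight is at most n/log n (collisions may lower it)."""
    k = max(1, int(n / np.log2(n)))
    r = np.zeros(m, dtype=np.uint8)
    for i in rng.integers(0, m, size=k):
        r[i] ^= 1  # XOR in one weight-1 indicator vector
    return r

def rerandomize(A, b, num_out, rng):
    """Smoothing step: from an instance (A, b = A s + e) with m = n^2 equations,
    manufacture num_out samples for the sample-hungry solver.  For each output:
        a' = r^T A  (near-uniform, since a random A extracts from high-entropy r)
        b' = r^T b  = <a', s> + <r, e>,  i.e. the new noise bit is e' = <r, e>."""
    m, n = A.shape
    A64, b64 = A.astype(np.int64), b.astype(np.int64)
    A_out = np.zeros((num_out, n), dtype=np.uint8)
    b_out = np.zeros(num_out, dtype=np.uint8)
    for i in range(num_out):
        r = sample_r(m, n, rng).astype(np.int64)
        A_out[i] = (r @ A64) % 2
        b_out[i] = (r @ b64) % 2
    return A_out, b_out
```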
But if you do the calculation carefully, what you get is that 1 minus 2 delta prime is going to equal (1 minus 2 delta) to the power n over log n, where n over log n is the weight of the R that we picked. So we see that delta prime converges to 1/2 really quickly. However, if we plug in the barely hard parameter regime, so if we start with delta equal to log squared n over n, then delta prime is actually not going to escape into the regime that is information theoretically unsolvable. It's still going to be information theoretically solvable: delta prime is still going to be 1/2 minus 1 over polynomial. And this is what happens in our reduction. So this is an average case to average case reduction, which just changes the number of samples, and we'd want to apply the same method also for a worst case problem. So rather than starting with an LPN instance with m equals n squared, now we're going to start with a comparable instance, but the matrix A and the vector e are going to be chosen adversarially rather than uniformly. We would like to see what happens in this case. But before that, let me say that the application to collision-resistant hashing actually already follows from this setting of average case to average case reduction. And the reason, or I guess the right way of thinking about it (thanks to Léo Ducas for expressing it this way), is that we already know that the standard way of constructing collision-resistant hashing in this setting is to use the binary short integer solution problem. If you've thought about it, then you probably know what I mean; if not, it doesn't matter that much. And what this reduction actually shows is how to convert a solver for the binary short integer solution problem into a solver for the barely hard LPN problem. This is how the reduction works. And once you have this, you can just plug it into stuff that we know, and you get collision-resistant hashing.

So let's get back to the worst case to average case setting. Now we have this worst case problem, NCP, and we have this matrix A. And we know that A is balanced, or close to being balanced, not necessarily perfectly balanced; but other than that, we don't know anything about it. So an adversarial matrix is not going to be an entropy extractor, and we cannot just use verbatim the same argument as before. However, we don't actually need an entropy extractor, because we can choose the distribution of R to be a distribution of our choosing; we don't have to work with an arbitrary distribution that has high entropy. And what we need to show, and what we actually show, is that for a specific distribution of R that has the properties we want, any close-to-balanced matrix is actually going to be a deterministic extractor for this particular high-entropy distribution. The analysis uses the Fourier transform, and is actually similar to analyses done in coding theory, in particular in work by Kopparty and Saraf. And one has to think a little about the right way to define the distribution of R so as to make the analysis go through. There are a few obvious guesses for how to choose R with these parameters, but it turns out that only one of them leads to a clean analysis, or at least to a clear analysis that we can perform. So this is how the worst case to average case reduction works (the noise-growth calculation from before is spelled out below). And let me just end with some open problems.
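To spell out that bias calculation: it is the standard piling-up lemma, worked out here with the talk's numbers, taking logs as natural for the approximation (a sketch, not a quote from the paper).

```latex
% Piling-up lemma for e' = <r, e>, where r has Hamming weight w = n / log n
% and the bits of e are i.i.d. Bernoulli(delta):
\[
1 - 2\delta' \;=\; \prod_{i\,:\,r_i = 1} \bigl(1 - 2\delta\bigr)
\;=\; (1 - 2\delta)^{\,n/\log n}.
\]
% Plugging in the barely hard rate delta = (log^2 n)/n:
\[
1 - 2\delta' \;=\; \Bigl(1 - \tfrac{2\log^2 n}{n}\Bigr)^{\,n/\log n}
\;\approx\; e^{-2\log n} \;=\; \frac{1}{n^{2}},
\qquad\text{so}\qquad
\delta' \;=\; \frac12 - \Theta\!\bigl(n^{-2}\bigr),
\]
% which is still 1/2 - 1/poly(n): the noise stays inside the
% information-theoretically solvable regime, as claimed.
```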
So I guess I didn't say anything about the statistical zero knowledge result, but again, it follows from the same intuition of smoothing. If you know about the statistical distance problem, which is a complete problem for SZK, then you can think about how to convert these ideas into a reduction. But again, I think this is a first step, and part of the point is to lay out some open problems for people, including ourselves, to think about. One is, of course, to extend the parameters of the reduction, to get a less pathetic connection between the parameters of the worst case problem and the parameters of the average case problem. And you can also ask whether the balancedness of the code can be relaxed, and whether some other parameters can be improved. Then there is the question of whether, even for the specific technique of smoothing, we are doing the best that's possible. Can you think of different distributions that have better properties? Maybe you can try to play with a distribution that does not have statistical entropy but only computational entropy, or something of that sort, in order to either improve the result or show that at least the use of this technique is tight in some way. Smoothing for unbalanced codes: actually, we can get some smoothing results even for codes that are unbalanced but have good minimum distance. But this smoothing property, this extraction property, is not going to be strong enough to get a meaningful worst case to average case reduction. So more can be investigated in that regime. Another thing that always bothers me when I think about these results is that we don't actually know whether our results follow from a trivial reason. It could be the case that even though currently we only know how to solve the log n over n regime, even the log squared n over n regime is solvable in polynomial time. And this would explain all of our results, right? I mean, being in SZK, having a worst case to average case reduction, and so forth. So if there is actually a polynomial time algorithm for log squared n over n, then this would be one explanation as to why we can do what we do, and it would be good to know whether this is the case. I'm almost done. Yeah, more cryptography from LPN: people are already working on it, that's great. The last bullet I want to talk about is one of Vinod's obsessions. Can you construct candidates for collision-resistant hashing that are provably not based on the hardness of statistical zero knowledge? One has to think a little in order to define this formally in a meaningful way, but I think what Vinod wants is some oracle separation, some oracle world where SZK is easy and yet collision-resistant hashing exists, I think. Hopefully I'm not misinterpreting. Thank you.

Thanks a lot for the talk. Hey, Zvika. So what is this distribution on R for which you can do the analysis? Can you tell us?

Yeah, so OK, let me first ask you: what do you think the distribution of R is?

Bernoulli, independent.

So unfortunately, we don't know how to do the analysis for this distribution. Notice that this distribution actually does not have high min-entropy; it's only statistically close to having high min-entropy. So the second guess would be to just sample it from a Hamming ball. And this, I think, is doable, but the analysis is really hard.
And what we end up using is just a sum of independent indicator vectors. So we think about vectors of weight 1, and you sample n over log n of those vectors at random and take their sum mod 2. Because they're independent, the analysis turns out to be tame and doable.

OK, another question. Can you do something similar also for the binary version of SIS, a reduction from the minimum distance problem? So this is the problem that you said is information theoretically unsolvable, because you cannot uniquely recover the secret. But you can still define a problem which is: find a short vector even when it is not unique. So the analogue of SIS; and instead of reducing from the closest vector problem, you would be reducing from minimum distance.

So I guess one way to think about what we do here is that you actually have a reduction from binary SIS to this barely hard LPN. But you're asking about worst case to average case for binary SIS itself?

Yes.

So other than going through this, yeah... no, we don't, actually. One direction would be quite...

You're going to get stuck, right, because you have the...

Yeah, we don't know how to do this at this point.

OK, any more questions? Or maybe I have one. So as you said, whenever we do something in LWE, we want to do the same in LPN. The worst case problems for LWE are obviously the lattice problems. You can see these worst case problems for LPN as some kind of shortest codeword problem, right? But is this some kind of a lattice problem too, or?

So yeah, usually we think of coding problems as different from lattice problems. Actually, never mind, I have a slide about it, but let's not go there. Actually, you could think even about coding problems as lattice problems. If you think about the lattices that naturally come from LWE, they're going to be periodic over q times the unit cube, and the lattices that come from binary codes are going to be periodic over 2 times the unit cube. So these lattices still exist, and they still have interesting short codeword problems, but a lot of the connections that we have kind of break. For example, finding short vectors in these lattices is easy, because there's always a vector of length 2: (2, 0, ..., 0) is going to be in the lattice. However, decoding still translates, even in the Euclidean regime, to a closest vector problem. So closest vector problems in these types of lattices are still going to be hard, and perhaps I should also mention Daniele's works on showing NP-hardness of these decoding problems.

OK, thanks. Thanks to the speaker. I think you're out of time. OK, you're out of time.

So you could consider a variant which is between LWE and LPN, where the modulus is large, but the noise is kind of all-or-nothing: a p-fraction of the coordinates are noisy and the rest are exact. And as far as I know, the previous worst case to average case reductions for lattices don't work for this variant of LWE. So my question is: can you do the same tricks in this domain, and are the parameters better when the modulus is large, when the field is large?

So you could do similar things. I didn't write it down; I mean, I wrote down only a little bit. I think the parameters get a little bit worse. Your worst case problem actually becomes more restricted: rather than requiring the code to be close to balanced, you need it to be balanced over every sort of q-ary coset.
So it seems that things do not play as nicely. However, it's definitely something that one should investigate. OK, so let's thank the speaker. And welcome.