Hi, I'm going to talk about our work titled "BKW Meets Fourier: New Algorithms for LPN with Sparse Parities." This is joint work with my advisor, Professor Dana Dachman-Soled, and with Huijing Gong and Hunter Kippen.

Let me remind you about learning parity with noise (LPN). There is a secret s which has n coordinates. Each time we call the oracle, a vector x, which also has n coordinates, is sampled uniformly at random, and the inner product of x and the secret s is computed. We either return the inner product with probability 1 - η, where η is the noise rate and is less than one half, or we flip it, returning the inner product plus one, with probability η. Note that everything here is done modulo 2. A sample coming from this oracle is represented by an ordered pair (x, b), where b is the label. We denote this oracle by LPN_{0,η}, where η is the noise rate and 0 means that x comes from the uniform distribution.

It is easy to recover s if there is no noise: we can just run Gaussian elimination. But even for a constant noise rate, the best algorithm, due to Blum, Kalai, and Wasserman (BKW), runs in time 2^{O(n/log n)} and needs essentially the same number of samples. There is a later improvement by Lyubashevsky which needs only a polynomial number of samples but runs in slightly worse time.

There is a search-to-decision reduction for the LPN problem, which makes it more interesting for cryptographic constructions; in particular, there is an encryption scheme based on LPN. LPN is believed to be resistant against quantum computers, and it has also been shown that, without loss of generality, we can assume the secret is drawn from the same distribution as the noise.

In this work we consider LPN with a sparse secret. Normally, about η·n coordinates of the secret are one; in this work we assume that k, the sparsity of the secret, is much less than that. We consider two settings: a constant noise rate, and a low-noise setting in which the noise rate is subconstant.

One motivation for this work: suppose the secret has been recovered using some type of attack, for example a side-channel attack, but we do not have high confidence in every coordinate of the recovered key s′. Then we can turn our focus to s - s′. Since s and s′ agree on most coordinates, we expect s - s′ to be sparse, and we can focus on recovering that. A sparse secret might also be used in some applications for efficiency purposes.

For the constant noise setting, we show that for sparsity k = n/(log n)^{1+1/c}, for a certain range of the parameter c, the runtime of our algorithm is (log n)^{o(k)}, which is also 2^{o(n/log n)}. In this setting the runtime of brute force is (log n)^{Ω(k)} and the runtime of BKW is 2^{Ω(n/log n)}, so notice that the runtime of our algorithm is asymptotically better in the exponent than both of these algorithms. To get an idea of the range of k, you can think of k as much smaller than n/log n and much bigger than n/(log n)², where "smaller" and "bigger" are captured asymptotically in the exponent of the log.
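As a concrete illustration (my own minimal sketch, not code from the paper), here is a Python simulation of the LPN_{0,η} oracle with a k-sparse secret as just described; the parameter values at the end are only illustrative.

```python
import random

def make_lpn_oracle(s, eta):
    """LPN(0, eta) oracle: x is uniform over {0,1}^n and the label is
    b = <x, s> mod 2, flipped with probability eta."""
    n = len(s)
    def oracle():
        x = [random.randrange(2) for _ in range(n)]
        b = sum(xi & si for xi, si in zip(x, s)) % 2   # inner product mod 2
        if random.random() < eta:                      # noise: flip the label
            b ^= 1
        return x, b
    return oracle

# Illustrative parameters: a k-sparse secret of length n.
n, k, eta = 32, 4, 0.1
s = [1] * k + [0] * (n - k)
random.shuffle(s)
oracle = make_lpn_oracle(s, eta)
x, b = oracle()   # one sample (x, b) from LPN(0, eta)
```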
For the low-noise setting I'm going to talk about just a special case here; for the full parameters please refer to the paper. For sparsity k = √n / log log n and noise rate η = log n / √n, our learning algorithm runs in time (n/k)^{o(k)}. The runtime of brute force is at least (n/k)^k, and the runtime of lucky brute force, which we will talk about later, is (n/k)^{ω(k)}. Again, the runtime of our learning algorithm is asymptotically better in the exponent than both of these algorithms.

As another application of our work, we showed that by applying known reductions to LPN and solving the resulting instances with our algorithm, we obtain applications to learning other classes of functions, such as DNFs and juntas.

Here is the outline of this talk. First we look at Fourier analysis of Boolean functions, which is the basic building block of our work. Then we show how to recover the secret from a Fourier coefficient, and we show that if the samples are biased, we get an improvement in finding the secret. So our focus shifts to constructing these biased samples: in the constant noise setting we use the BKW algorithm, and in the low-noise setting we use a design by Nisan and Wigderson.

Consider a Boolean function f from {0,1}^n to {0,1}. In this talk, and also in the paper, we change our notation from {0,1} to {-1,1} by mapping 1 to -1 and 0 to 1. With this convention, multiplication over {-1,1} is XOR in the previous notation.

The Fourier expansion of a Boolean function is simply its representation as a real multilinear polynomial; let me explain what I mean. Assume we have a function that takes two coordinates x1, x2 and just outputs the maximum of the two. The right-hand side of the slide shows the Fourier expansion of this function; it has four terms. For example, the second term carries the Fourier coefficient at the singleton set {1}, and the last term carries the Fourier coefficient at the set {1,2}. Fourier coefficients are indexed by subsets of the input coordinates. To compute the Fourier coefficient at the singleton set {1}, we take the expected value, over all values of the input x, of the function times the first coordinate; to compute the Fourier coefficient at the set {1,2}, we take the expected value of the function times the coordinates x1 and x2.

To define the Fourier expansion more accurately: every function f can be represented as a linear combination, taken over all subsets S of {1, ..., n}, of a Fourier coefficient f̂(S) multiplied by χ_S, where χ_S is simply the product of the input coordinates indexed by S. To compute the Fourier coefficient f̂(S), we compute the expected value of f(x)·χ_S(x). Going back to our example: the Fourier coefficient at the empty set is 1/2, the Fourier coefficients at the singleton sets {1} and {2} are again 1/2, and the Fourier coefficient at the set {1,2} is -1/2, so max(x1, x2) = 1/2 + (1/2)x1 + (1/2)x2 - (1/2)x1·x2.
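To make these definitions concrete, here is a small Python sketch (my own, using the {-1,1} convention above and 0-indexed coordinates) that computes Fourier coefficients exactly from the expectation formula and reproduces the coefficients of the max example.

```python
from itertools import product

def fourier_coefficient(f, n, S):
    """Exact Fourier coefficient f_hat(S) = E[f(x) * chi_S(x)], where the
    expectation is over x uniform in {-1,1}^n and chi_S(x) is the product
    of the coordinates x_i for i in S (0-indexed here)."""
    total = 0
    for x in product((-1, 1), repeat=n):
        chi = 1
        for i in S:
            chi *= x[i]
        total += f(x) * chi
    return total / 2 ** n

max2 = lambda x: max(x)   # the max function from the talk, in +/-1 notation
for S in [(), (0,), (1,), (0, 1)]:
    print(S, fourier_coefficient(max2, 2, S))
# prints 0.5, 0.5, 0.5, -0.5, matching the expansion
# max2(x) = 1/2 + (1/2)x1 + (1/2)x2 - (1/2)x1*x2
```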
Now let's see how we can recover the secret using Fourier coefficients. Let's focus on the noiseless case, where we are only given inner products ⟨x, s⟩.

Take the function x1 + x3 mod 2. This function has an input x with only three coordinates, and in this case the secret is 101. If we change to the {-1,1} notation, the same function can be represented as x1·x3. If we use the previous slide and compute the Fourier coefficients, we see that the Fourier coefficient is zero at every subset except the set {1,3}, and note that the subset {1,3} is exactly the set of locations where the secret is nonzero. So all of the Fourier weight is on a single Fourier coefficient. But to find that single coefficient, we would have to run a brute-force search over all subsets of size at most k, which is infeasible.

But what if the samples do not come from the uniform distribution? What if each coordinate of x is zero with some probability that is not one half, say with probability 1/2 + p/2? Essentially, we can define Fourier coefficients in this case as well, except that this time we normalize the χ functions. To compute a Fourier coefficient we do the same calculation as before, except that now x is sampled from this biased distribution, which we denote by D_p. In this case it can be shown that the Fourier coefficient at a singleton set {j} is zero for the coordinates where the secret is zero, and nonzero for the coordinates where the secret is one. Our learning algorithm wants to distinguish these two cases: it wants to decide whether the Fourier coefficient at a singleton set is zero or nonzero. But remember that the samples are noisy, so we really have to distinguish between zero and something that is scaled down by the noise rate. Still, the goal is to distinguish zero versus something nonzero.

So how can we think of a p-biased LPN oracle? We can think of it exactly as before: we have a secret, but this time x′ is sampled from a biased distribution, one in which each coordinate of x′ is zero with probability 1/2 + p/2. We compute the same inner product as before, but this time it is flipped with probability η′ (for standard LPN this was η). Again, the output is an ordered pair (x′, b′), where b′ is the label, and we denote this oracle by LPN_{p,η′}, where p indicates that x is no longer uniform and η′ is the noise rate.
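To illustrate the distinguishing step, here is a hedged Python sketch. The normalized character (x_j - p)/√(1 - p²) and the separation parameter eps are my rendering of the normalized χ functions and of the ε discussed later, not formulas quoted from the paper; samples are assumed to be ±1-valued pairs (x, b) from the p-biased oracle.

```python
import math

def estimate_singleton(samples, j, p):
    """Empirical biased Fourier coefficient at the singleton {j}:
    the average of b * (x_j - p) / sqrt(1 - p^2) over samples (x, b),
    where x is +/-1 valued with E[x_j] = p (i.e., coordinate j is "0"
    with probability 1/2 + p/2 in the 0/1 notation)."""
    norm = math.sqrt(1 - p * p)
    return sum(b * (x[j] - p) / norm for x, b in samples) / len(samples)

def recover_support(samples, n, p, eps):
    """Distinguishing step from the talk: the coefficient is zero when
    s_j = 0 and has magnitude about eps when s_j = 1, so we threshold
    the estimate at eps / 2.  Taking T = len(samples) large enough makes
    each estimate accurate to within eps / 2."""
    return [j for j in range(n)
            if abs(estimate_singleton(samples, j, p)) > eps / 2]
```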
In order to construct this oracle, we use the algorithm by Blum, Kalai, and Wasserman. What BKW does is the following: for each sample, represented as (x, b), split x into a number of equal-size blocks. It then partitions the queries based on the value of the first block. For example, we put all the queries that have 01 in the first block into one partition, all the queries that have 10 into another partition, and so on, and then we zero out the queries: by combining queries within each partition, we make sure that the first block of every resulting query is zero. That is essentially the first step of the algorithm. The algorithm then progresses by looking at the second block, partitioning based on the second block, zeroing it out, and moving on to the next block. In each step we lose some number of queries, and the noise rate of the new samples grows.

Now, we want to construct a p-biased LPN oracle, but we only have access to the original oracle, which is not biased: we are given samples (x, b), and from them we have to construct the p-biased samples, denoted (x′, b′). The original samples have the property that x is zero in the j-th coordinate with probability exactly one half, while the samples from the biased oracle are zero there with probability 1/2 + p/2.

So how can we do this? Each time we call our oracle, it first samples an index set R: each index is selected independently with probability p. Let's mark the selected indices in dark blue. Once we have this index set R, we can run the BKW_R algorithm. BKW_R is essentially the same as the BKW algorithm, but it focuses only on the coordinates pointed to by R. What this means is that, upon receiving some samples from the original LPN oracle, which have arbitrary values in the dark blue coordinates, zero, one, or whatever, BKW_R outputs a single query that is zero in all the dark blue coordinates.

One detail is that the size of R might change across different oracle calls, but if we fix a, which is a parameter of the BKW algorithm, we can make sure that the same number of samples is combined to obtain each single biased sample. We showed that the samples (x′, b′) output by the BKW_R algorithm are independent and distributed identically to samples drawn from the p-biased LPN oracle, for a different noise rate; the slide shows the relation between the noise rate η′ of the p-biased LPN oracle and the noise rate η of the original LPN oracle.

To remind you what the learning algorithm does: it has to distinguish between zero and a parameter that we call ε, and we set T, the number of p-biased samples, so that we get an estimate within distance ε/2. Putting everything together, in our paper we show that for sparsity k = n/(log n)^{1+1/c}, for a certain range of parameters, we have an algorithm that runs faster than both brute force and the BKW algorithm, where "faster" is captured asymptotically in the exponent.
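Here is a minimal Python sketch of a single zeroing-out step of the kind just described, over 0/1 queries. Combining each query with one representative per partition is one simple way to realize the step; the BKW_R variant would apply the same idea restricted to the coordinates in R.

```python
from collections import defaultdict

def bkw_step(samples, lo, hi):
    """One BKW reduction step: partition queries by the value of the
    block x[lo:hi], then XOR each query with a representative of its
    partition so that the block becomes all zeros.  Labels are XORed
    along with the queries, which is why the noise rate grows and one
    query per partition is lost at every step."""
    buckets = defaultdict(list)
    for x, b in samples:
        buckets[tuple(x[lo:hi])].append((x, b))
    reduced = []
    for group in buckets.values():
        rep_x, rep_b = group[0]          # representative of this partition
        for x, b in group[1:]:
            reduced.append(([xi ^ yi for xi, yi in zip(x, rep_x)],
                            b ^ rep_b))
    return reduced
```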
For the low-noise setting, we present an algorithm called Samp that draws only a polynomial number of samples from the LPN oracle and uses them to construct a much larger set of p-biased samples. The catch is that these samples are only close to being pairwise independent. The idea of this construction is based on a design of Nisan and Wigderson, and the learning algorithm is essentially the same as before, except that we have to repeat it a number of times.

Here is how we generate the p-biased samples. We draw only (2np+1)² samples from the original LPN oracle, and Samp puts these queries into buckets. The buckets overlap only a little: for example, buckets B_{0,1} and B_{0,2} can have at most t queries in common.

Each time we call our oracle, it does three things. First, it samples R: as before, we have dark blue coordinates that have to be zeroed out. Second, it looks at each bucket, selects some queries from that bucket, and makes sure that the coordinates pointed to by R, the dark blue coordinates, are zeroed out. The third step is just a detail: we want to make sure that, no matter what, the same number of queries is added together to generate each biased sample, so we essentially add noise. The important thing to notice is that we need only (2np+1)² samples to start with.

In our paper we showed that the samples (x_i, b_i) output by this construction are distributed identically to samples drawn from a p-biased LPN oracle, this time with a different η′; the slide shows the relation between the η′ of this new construction and the η of the original LPN oracle. We also showed that x_i and x_j are pairwise independent: what we bound here is the covariance of the newly generated samples.

Finally, let me compare with the baselines. Brute force runs in time roughly (n/k)^k. Lucky brute force is the following: if we draw m samples, where m = n/(1-η), we expect approximately n noiseless samples among them. So the idea is to select n out of these m samples and run Gaussian elimination; since we expect about n of the samples to be noiseless, we expect to eventually recover s by repeating this procedure, which is why we call it lucky brute force. The runtime of this algorithm is essentially e^{ηn}. For this example parameter setting, our algorithm is faster than both brute force and lucky brute force. This is just one example of the parameters; for the full range of parameters, please refer to the paper. A small sketch of the lucky brute force baseline follows below. Thank you so much for your attention.
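For completeness, here is a hedged Python sketch of the lucky brute force baseline described above; the GF(2) solver and the trial loop are my own minimal rendering, not the paper's implementation.

```python
import random

def gf2_solve(rows, rhs):
    """Gaussian elimination over GF(2); returns one solution s of
    rows * s = rhs, or None if the system is inconsistent."""
    n = len(rows[0])
    aug = [row[:] + [b] for row, b in zip(rows, rhs)]
    pivots, r = [], 0
    for c in range(n):
        pivot = next((i for i in range(r, len(aug)) if aug[i][c]), None)
        if pivot is None:
            continue
        aug[r], aug[pivot] = aug[pivot], aug[r]
        for i in range(len(aug)):        # eliminate column c everywhere else
            if i != r and aug[i][c]:
                aug[i] = [a ^ p for a, p in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    if any(row[-1] for row in aug[r:]):  # a 0 = 1 row: inconsistent
        return None
    s = [0] * n
    for i, c in enumerate(pivots):
        s[c] = aug[i][-1]
    return s

def lucky_brute_force(samples, n, trials):
    """Lucky brute force: repeatedly pick n of the m samples, solve as
    if they were noiseless, and yield each candidate secret.  A trial
    succeeds when all n chosen samples happen to be noiseless, which
    has probability (1 - eta)^n, so on the order of e^(eta * n) trials
    are needed in expectation."""
    for _ in range(trials):
        chosen = random.sample(samples, n)
        s = gf2_solve([x for x, _ in chosen], [b for _, b in chosen])
        if s is not None:
            yield s   # candidates should be checked against fresh samples
```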