Good morning, everyone. This talk is about how to prove knowledge of small secrets. It is joint work with Ivan Damgård, Kasper Green Larsen and Michael Nielsen from Aarhus University.

Let me start with a very short reminder of what zero-knowledge proofs are, just to recap so we all know what we are talking about. We have a relation R, a prover and a verifier. There is a statement that both the prover and the verifier know, and the prover wants to convince the verifier of the truth of this statement using an interactive protocol Pi. For that, the prover holds a witness w, the two parties exchange messages, and the protocol has to satisfy three properties. First, completeness: whenever the prover actually has a correct witness for the relation, the verifier accepts. Second, soundness: if the prover runs the protocol with something that is not a correct witness for the relation, the verifier should not accept. And third, zero knowledge: one can simulate a proper protocol transcript given only the relation and the statement, meaning one can efficiently generate a sample from a distribution that is close to the distribution of real protocol transcripts without having the witness.

Why would you want to prove that a secret is short? Very often, when we use zero-knowledge proofs inside interactive protocols, we use them as a building block, for example to show that we know the preimage of a ciphertext: we know a plaintext and some randomness, and together they encrypt to a given ciphertext. For certain encryption schemes we then have to prove an additional property, namely not just that we know a plaintext and some randomness, but that both of them satisfy a certain shortness criterion. For example, in lattice-based cryptosystems the encryption procedure always includes something like sampling from a Gaussian distribution, or sampling randomness from some small interval, and you actually have to prove in zero knowledge, inside the interactive protocol, that this constraint is fulfilled. Why? When you decrypt in a lattice-based cryptosystem, you usually obtain a noisy plaintext: your message plus some noise on top, which disappears after a modular reduction. The problem is that if the noise gets too large, then, depending on the message, it may or may not wrap around, and an adversary in an interactive protocol could exploit this, for example by observing whether the protocol aborts, to figure out which messages another party encrypted. This is exactly what we want to avoid, and that is why these zero-knowledge proofs are crucial in this situation.

More generally, in our work we define homomorphic one-way functions over the integers, ivOWFs for short, as the main building block. So what is a homomorphic one-way function over the integers? We have an abelian group G, and we map from the integers, or from vectors of integers, into this abelian group G. Such a map is called a homomorphic one-way function over the integers if, first of all, it is a one-way function: evaluating the function can be done in polynomial time, but if I give you an element of this abelian group G and ask for a preimage, it is very hard to find one that is required to be short.
This is where the shortness comes into play. The third criterion is the homomorphic property, meaning that the map into the abelian group G is supposed to carry some of the additive structure of the integers over into G.

As an example, in addition to lattice-based cryptosystems, one can consider GGH hashing as the prime example of such a homomorphic one-way function. In GGH hashing you sample a random matrix M that is very wide in one dimension and compressing in the other, and you define f to be the application of this matrix to a binary vector. So the input to the function is a rather long binary vector, say of length r, and the output is a much shorter vector whose entries come from Z_q rather than being just binary. It has been known since 1996 that finding short preimages for this GGH hash function reduces to lattice problems that are believed to be hard. Other examples of homomorphic one-way functions are the SWIFFT hash function, Ring-LWE encryption as I mentioned, or integer commitments.

The title of this talk is how to prove that something is short, and the first idea that comes to mind is: let's use sigma protocols. To remind you how that works: the prover samples an auxiliary value s that is supposed to be short, applies the one-way function and sends the result to the verifier. The verifier flips a bit e and says: either show me that this auxiliary value was good, so send me s if e is 0, or, if e is 1, send me the sum of the two values s and x, where x is the element whose shortness you want to prove. So it is a typical sigma protocol. In addition to the usual checks, we have the following: since s is smaller than B and x is supposed to be smaller than B, the sum of the two must be smaller than 2B. It will turn out that the bound we can actually verify in the end, and how far it is from the original bound, is very crucial; we call this gap the soundness slack in our work. And as you know, a sigma protocol done this way only gives you soundness one half, because you only have a one-bit challenge. So if you want security against a cheating prover except with probability 2^-k, you have to repeat the protocol k times, which means generating k of these auxiliary values; this is what we call the overhead.

So this sigma protocol gives you a very small soundness slack. Can we also make the overhead small? Then we would be done, but then I probably would not be standing here; it turns out it is not that easy. A first attempt: let's choose the challenge e from a larger interval. What could possibly go wrong? The protocol would still be correct. But to prove that you can extract a witness using the special-soundness property, you would normally take two accepting transcripts with the same auxiliary value and subtract the equations that are supposed to hold from each other. Then it is immediate that you have to divide by the difference of the two challenges, over the integers. If e is zero or one, this difference is one or minus one, and you can always divide by that. But if you choose e from a larger interval, you may have to divide by two, three or five, which over the integers is not always possible.
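To make this concrete, here is a minimal sketch of the one-bit sigma protocol just described, instantiated with a toy GGH-style hash f(x) = Mx mod q as the ivOWF. All parameter values are illustrative, and the masking (or rejection sampling) that a real protocol needs for zero knowledge is deliberately left out; the sketch only shows the homomorphic check and the 2B bound that causes the soundness slack.

```python
# Minimal sketch of the one-bit sigma protocol described above, using a toy
# GGH-style hash as the homomorphic one-way function over the integers (ivOWF).
# Parameters are illustrative; the masking needed for zero knowledge is omitted.
import numpy as np

q, m, r = 2**16 + 1, 32, 1024        # toy parameters, not secure
B = 1                                # honest witnesses are binary, so ||x||_inf <= B
rng = np.random.default_rng(0)

M = rng.integers(0, q, size=(m, r))  # wide, compressing random matrix

def f(x):
    """The ivOWF: f(x) = M*x mod q. Linear, hence f(x + s) = f(x) + f(s) mod q."""
    return (M @ x) % q

# Statement: y = f(x) for a short witness x known to the prover.
x = rng.integers(0, B + 1, size=r)
y = f(x)

# --- one iteration, soundness error 1/2 ---
s = rng.integers(0, B + 1, size=r)   # prover: short auxiliary value
a = f(s)                             # prover's first message

e = rng.integers(0, 2)               # verifier: one-bit challenge
z = s if e == 0 else s + x           # prover's response, computed over the integers

# Verifier: homomorphic check plus the relaxed norm bound. Checking 2B instead
# of B is exactly the soundness slack discussed above.
assert np.array_equal(f(z), (a + e * y) % q)
assert np.max(np.abs(z)) <= 2 * B
```

Repeating this k times independently gives soundness error 2^-k at the cost of k auxiliary values, which is the overhead the rest of the talk tries to reduce.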
But don't worry, not all hope is lost. The most straightforward idea would now be: if we cannot take a single ciphertext, or a single preimage, and do better, can we prove knowledge of many of them at the same time while keeping the overhead small? If you just do the naive repetition of sigma protocols, you get an overhead of k, where k is the statistical security parameter, while the soundness slack is very small. In work by Cramer and Damgård from 2009, they showed that basically the complete opposite is also possible: you can have a soundness slack that is exponentially far away from what you actually wanted to prove, but on the other hand the overhead is very, very small. In the SPDZ MPC protocol there was an additional trade-off between these two quantities: they achieved polynomial soundness slack and logarithmic overhead using techniques due to Nielsen and Orlandi. And in this work we show that by allowing a slightly super-polynomial soundness slack we can go down to constant overhead. The reason we care about this is that the overhead translates into additional messages we have to send in an MPC protocol whenever we use these zero-knowledge proofs, and the less we send, the better for MPC in the real world.

So how do we get down to constant overhead? In the first step of the protocol, say we have n values we want to prove knowledge of, x_1 up to x_n, and we sample T auxiliary values, for a T to be defined later. First we do a traditional cut-and-choose: the prover chooses a lot of auxiliary values, applies the function f to them and sends the results to the verifier; the verifier chooses a subset, and the prover opens that subset. The verifier looks at all of these opened values, and if they are all short and all actually exist, then in the next step the prover sends, for each of the remaining values, the sum of the secret he wants to prove knowledge of and an auxiliary value. What do we intuitively achieve by doing this? From the cut-and-choose we get that most of the auxiliary values that were not opened are also short and do exist. If we then send sums involving all the secrets, it follows that most of the values we want to prove knowledge of exist and are short. So we do cut-and-choose on random values, but at the same time achieve a cut-and-choose effect on what we actually want to prove knowledge of. We start with n values to prove knowledge of, and with this step we get down to k values that still have to be proven, where k is the statistical security parameter. What we show in our work is that if you set T to be three times n, so linear in the number of values you want to prove knowledge of, then everything works out. In normal cut-and-choose you would set T to be 2n, but we have to do some rejection sampling in order to keep the sums small, and we compensate for the rejections by making T a bit larger. In addition, we show that, whereas the SPDZ-2 work needs to assume random oracles in order to be able to extract, we can get around this using some interesting techniques. A rough sketch of this first step follows below.
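As a rough, self-contained illustration of this first step, here is what the cut-and-choose could look like in code. The fraction of auxiliary values that gets opened, the pairing of secrets with unopened auxiliaries and all concrete sizes are placeholder choices, not the parameters of the actual protocol, and the rejection sampling mentioned above is omitted.

```python
# Illustrative sketch of the first (cut-and-choose) step, with a toy GGH-style
# ivOWF. Sizes, the opened fraction and the pairing rule are placeholders.
import numpy as np

rng = np.random.default_rng(1)
q, m, r, B = 2**16 + 1, 32, 256, 1           # toy ivOWF parameters, not secure
n, T = 64, 3 * 64                            # n secrets, T = 3n auxiliary values

M = rng.integers(0, q, size=(m, r))
def f(x):                                    # the ivOWF: f(x) = M*x mod q
    return (M @ x) % q

# Prover's short secrets and their public images.
X = rng.integers(0, B + 1, size=(n, r))
Y = [f(x) for x in X]

# Step 1a: the prover commits to T short auxiliary values via f.
S = rng.integers(0, B + 1, size=(T, r))
C = [f(s) for s in S]

# Step 1b: the verifier asks to open a random subset; the prover reveals those
# s_j, and the verifier checks that they are short and match the commitments.
opened = set(rng.choice(T, size=T // 2, replace=False).tolist())
for j in opened:
    assert np.max(np.abs(S[j])) <= B and np.array_equal(f(S[j]), C[j])

# Step 1c: each secret x_i is paired with an unopened auxiliary value s_j and
# the prover sends z_i = x_i + s_j; the verifier checks the homomorphic
# relation and the relaxed norm bound (rejection sampling omitted).
unopened = [j for j in range(T) if j not in opened][:n]
for i, j in enumerate(unopened):
    z = X[i] + S[j]
    assert np.array_equal(f(z), (Y[i] + C[j]) % q)
    assert np.max(np.abs(z)) <= 2 * B
```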
So now that we are down from n to k unproven preimages, let's get from k to zero. How do we do that? The idea is to do basically the same as before with the cut-and-choose, but instead of going through each value individually, we let the verifier sample random sums of these preimages and let the prover prove knowledge of those, and for that proof we again use the same cut-and-choose as before. Why do we hope this works? What is the intuition? Let's look at it from a balls-and-bins perspective. Say the verifier tells the prover: put these values into the first bin, these into the second bin and these into the third bin, and then prove knowledge of the sum of the values in each bin. Say the red values are the ones we were not yet able to explain; then the verifier is going to be happy if there is exactly one of these unexplained, bad preimages in a bucket, and not so happy if there are zero, or two, or three of them. The reason for the verifier's happiness lies in the linearity, or rather the homomorphic property, of the one-way function. Assume we already have an x_4 and an x_9 that explain certain values, and in the soundness proof we are left with extracting an x_3, and the prover was able to convince us that the sum of these values is short. Since we already know that the other two values in the equation are short, and since the proof that the sum is short is also sound, we can extract a preimage of the sum and simply subtract x_4 and x_9 from it; what is left is a preimage that maps to the same value as x_3 and is also short. That is the most fundamental insight.

For an actual proof we now have to figure out how often this event happens, so that we know how many sums the verifier has to generate. In our work we establish a certain invariant: for a given set of bins and bad elements, if you play this game a certain number of times, then the good event occurs often enough to be able to extract, except with probability exponentially small in the number of bad preimages. This holds for certain choices of the parameters. Unfortunately, this guarantee degrades as the number of bad preimages goes down: if, say, we have extracted half of them in one round, then a single run of the game in the next round only lets us extract the rest except with probability 2^(-k/2), so we actually have to play the game twice. We moreover show that if we start with k sums in the first round and extract, say, half of the preimages, then the second time we only need k/2 of these sums, or buckets, in every such instance. We now have two instances to handle, but the number of buckets is cut in half, so it cancels out: the number of sums you play per round is constant. And as you can see, if we halve in the first round and halve again in the second round, we have to play this game for a logarithmic number of rounds. So this is already the overall protocol: in the first step, as I said, you do the cut-and-choose and open the sum of the auxiliary value and the secret you want to prove knowledge of.
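Written out, the extraction step behind this balls-and-bins intuition is a single application of the homomorphic property. In the following sketch, z is the preimage of the sum extracted from the sound sum proof, B is the honest bound, and B' is an illustrative name for the (slacked) bound that the sum proof guarantees:

```latex
f(z) = y_3 + y_4 + y_9, \quad \|z\|_\infty \le B'
\;\Longrightarrow\;
x_3' := z - x_4 - x_9 \ \text{satisfies}\ 
f(x_3') = f(z) - f(x_4) - f(x_9) = y_3
\ \text{and}\ \|x_3'\|_\infty \le B' + 2B .
```

So x_3' is itself a short preimage of y_3, which is exactly what the extractor needs; note that the bound grows by 2B in this step, which illustrates one place where soundness slack accumulates.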
And in the second step you do a cut-and-choose again, but this time, instead of the actual elements, you use these random sums, chosen as just explained. So what about the overhead? If you have roughly a constant number of sums per round, say linear in k, and you play a logarithmic number of rounds, then all in all you have about k times log k sums you have to prove knowledge of, so we have to choose n to be bigger than k times log k to get an overhead that is constant. Additionally, the cut-and-choose is imperfect: from the first step, out of the n values, k remain unexplained, and the same happens again when we do the cut-and-choose in the second step. But we show that the chance that this actually matters is very, very small, so we get away with it. The caveat is that when we prove a bound on the soundness slack, in our work at least, we were only able to achieve something that is slightly super-polynomial in the security parameter. But you can always, say, if you want security parameter 40, just run the proof twice with security parameter 20, and you are fine.

The question is, of course: are we done yet? First of all, our analysis comes with large hidden constants. The slack is super-polynomial and the number of auxiliary values is constant, but it is a big constant, so to apply this in practice you would definitely like to get these hidden constants down. Moreover, we have this quasi-polynomial soundness slack, and getting it down to something polynomial, or even linear, would definitely be desirable. It would also be nice to see whether this actually works, and how it performs, in practice. After we published our work on ePrint, there was subsequent work by Cramer and Damgård showing that you can do the whole thing with a linear soundness slack, but they need n to be bigger than k squared, where in our case it was k times log k. Their techniques for getting there are a little more involved than our somewhat simplistic balls-and-bins game, so it is definitely also interesting to look at their work and how it evolves. If you compare everything on a graph, this is how the current state of the art looks: you have sigma protocols at one end, Cramer-Damgård at the other end, the SPDZ proof is here, and we were able to get somewhere close to here. Of course, the long-term goal is to get both the soundness slack and the overhead as small as possible. And with those words, I'd like to thank you for your attention, and I'm open for questions.