In this video, I will present our recent work on concretely efficient zero-knowledge arguments for arithmetic circuits and their application to lattice-based cryptography. My name is Carsten Baum and this is joint work with Ariel Nof. The paper will appear at the PKC 2020 conference, and the full version can be found on ePrint at the link shown below.

Let me first recap what a zero-knowledge argument of knowledge actually is. Here we are in a situation where a prover and a verifier both know a certain statement, and the prover tries to convince the verifier that this statement is true. To do so, the prover and the verifier exchange a sequence of messages, at the end of which the verifier decides whether or not to accept the interaction. In order to be a zero-knowledge argument of knowledge, this interaction has to have three properties: first, it must be complete; second, it must have knowledge soundness; and third, it must provide the zero-knowledge property.

Completeness means that if the statement is true and the prover has a witness for it, then the prover will always be able to convince the verifier of the truth of the statement, and the verifier will always accept at the end of the protocol. Soundness means that if the statement is not true, then a prover will only be able to convince the verifier with negligible probability. In particular, knowledge soundness means that there exists an algorithm called the extractor such that, if we have a prover that can convince the verifier with a certain probability, then the extractor can extract a witness for the statement from that prover. Last but not least, zero-knowledge intuitively means that the verifier does not learn anything from the interactive protocol beyond the fact that the statement is true. More concretely, it means that we can construct an algorithm called the simulator that outputs transcripts distributed as in the interactive protocol, except that the simulator does not get access to the witness which the prover actually holds.

More concretely, assume that the prover and the verifier both know a circuit C and an output y of the circuit, and that the prover moreover has a witness w. The prover wants to convince the verifier that it knows this w and that C, when applied to w, outputs y. The whole construction is called an argument of knowledge because we assume that the prover is computationally bounded.

In this work we present two main results. First, a new zero-knowledge argument of knowledge protocol based on the so-called MPC-in-the-head technique. Our protocol is concretely efficient in the sense that its communication may not be sublinear in the circuit size, as it is in some other protocols, but it has very small concrete constants and is very fast when implemented, and we only use symmetric-key primitives, which means that our solution is plausibly post-quantum secure. We then show how to apply our protocol to a problem in lattice-based cryptography: namely, we show how to construct a protocol that convinces a verifier that a certain SIS instance has a solution, with low prover time and moderate proof size in comparison to other solutions. In order to understand how our protocol works, let me first introduce the concept of multiparty computation, or MPC for short.
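As a toy illustration of the kind of statement being proven, here is a minimal sketch of the relation C(w) = y; the circuit, the modulus and all values are made-up examples for illustration, not from the paper.

```python
# Toy illustration of the relation proven by an argument of knowledge:
# the prover claims to know a witness w such that C(w) = y.
# The circuit C, the modulus and all values are hypothetical examples.

P = 2**61 - 1  # arithmetic circuits in this talk are defined over a prime field

def C(w):
    """An example arithmetic circuit: additions and multiplications mod P."""
    return (w[0] * w[1] + 3 * w[2]) % P

def relation_holds(circuit, y, w):
    """The prover knows w; the verifier should only learn that C(w) = y."""
    return circuit(w) == y

w = [5, 7, 11]   # the prover's secret witness
y = C(w)         # the public output known to both prover and verifier
assert relation_holds(C, y, w)
```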
In MPC, a set of servers each hold some input w_i, and they run an interactive protocol over a network, after which each of them obtains an output. What we want is that if we feed the individual inputs of the servers into a circuit, then the outputs that the servers obtain are exactly the outputs of the circuit. Now, a trivial solution would be that each party simply broadcasts its input to all the other parties, so let's make the whole thing more interesting. In MPC we require, first of all, that if the parties learn an output, then it is the correct output of the computation. That means that even if some servers try to disturb the computation, they will not be able to make an honest server output another value and believe that this output is correct. Moreover, in an MPC protocol we furthermore require privacy, or t-privacy, where t-privacy means that no t parties can learn anything beyond their own inputs and outputs from the protocol; that is, they can run the computation but derive no information about the inputs of the other parties by only looking at the values that they see in the transcript of the protocol.

A well-known paradigm for constructing zero-knowledge protocols is the so-called MPC-in-the-head paradigm. There, a prover simulates in its head an MPC protocol between multiple servers. It first secret-shares its witness for the statement to all of the servers in the multiparty computation, and then these servers run an evaluation of the circuit C; this whole interaction happens only in the head of the prover. After this is done, the prover commits to the views of the different servers, meaning it commits to the input shares, the randomness, and the exchanged messages of each server individually, and it sends these commitments to the verifier. The verifier then asks the prover to open a certain subset of, say, t views of this multiparty computation, and the prover opens these commitments according to the verifier's choices. The verifier then inspects the views it sees, namely the secret shares of the inputs, the randomness, and the messages, and checks that each of the servers acted consistently with its input, its randomness, and the messages it received, and that the correct output was obtained; if so, the verifier accepts, and otherwise it rejects.

MPC-in-the-head has been around since the seminal work of Ishai et al. in 2007, and the first practical instantiations of this paradigm were provided by the ZKBoo protocol and the ZKB++ work that followed it. Furthermore, the Ligero protocol showed how to instantiate the MPC-in-the-head paradigm with sublinear communication complexity, so asymptotically less than what we have in our solution. In 2018, the KKW protocol showed how to combine MPC-in-the-head with the so-called preprocessing paradigm, which has been very popular in the area of MPC for a long time. In comparison to these previous works, we generalize this preprocessing-based MPC-in-the-head and show how to use it specifically for protocols that run over large prime fields.
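Returning to the MPC-in-the-head flow described above, here is a minimal sketch of the prover's first steps, namely additively secret-sharing the witness and committing to each simulated party's view; the hash-based commitment, the field size and the number of parties are illustrative assumptions, not the exact construction from our paper.

```python
import secrets, hashlib

P = 2**61 - 1   # prime field; illustrative choice
N = 8           # number of simulated parties; illustrative choice

def share(value, n=N, p=P):
    """Additively secret-share `value` into n shares over F_p."""
    shares = [secrets.randbelow(p) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % p)
    return shares

def commit(view_bytes):
    """Toy commitment: hash of the view together with fresh randomness."""
    r = secrets.token_bytes(32)
    return hashlib.sha256(r + view_bytes).hexdigest(), r

# The prover secret-shares each witness element and commits to each
# simulated party's view (here, for brevity, just its input shares).
witness = [5, 7, 11]
shares_per_party = list(zip(*[share(w) for w in witness]))
commitments = [commit(repr(v).encode()) for v in shares_per_party]

# The prover would send `commitments` to the verifier, who later asks for a
# subset of the views to be opened and checked for consistency.

# Reconstruction sanity check: summing all shares recovers the witness.
assert [sum(col) % P for col in zip(*shares_per_party)] == witness
```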
In MPC with preprocessing we assume that, before the actual MPC protocol runs, a trusted third party samples correlated randomness according to a certain distribution and then secretly gives each of the individual servers a share of that randomness. This preprocessing is supposed to be input-independent, and it allows us to lower the cost of the actual MPC protocol in terms of communication or computational complexity. Now, it is not straightforward to apply this paradigm in the zero-knowledge setting for MPC-in-the-head, simply because in MPC with preprocessing we assume that the party who shared the randomness is a trusted third party and that the randomness is therefore correct. In MPC-in-the-head we have to let the prover simulate this trusted third party, but then it is not clear that the secretly sampled data is actually correct, which gives the prover more opportunities to cheat, and fixing this is the key issue when constructing MPC-in-the-head with preprocessing.

A popular application of preprocessing in MPC protocols is to enable multiplication of secret-shared values. For this, let us denote by [v] a secret sharing of a value v, and let us say that ([a], [b], [c]) is a random multiplication triple if a and b are indeed random and c = a·b. Now, a straightforward approach is to multiply [x] and [y] directly using ([a], [b], [c]); this is a deterministic procedure and will always succeed, but it requires that the triple is always correct, so this is a strong assumption about the preprocessed data. On the other hand, there exists a technique that allows us to verify whether three values [x], [y], [z] actually satisfy z = x·y using a random multiplication triple ([a], [b], [c]), and if they do not satisfy this multiplication relation, then the check catches this with probability 1 − 1/|F|, where |F| is the field size. That means we can use preprocessed data that is not always correct, by adding additional randomness to the MPC protocol.

Intuitively, the protocol by Katz et al. follows the first approach outlined above: the prover first generates correlated randomness by committing to random multiplication triples and sends these commitments to the verifier. The verifier then asks to open a subset of these triples, and the prover decommits the chosen subset to prove to the verifier that those multiplication triples were correctly formed. The prover then runs the multiparty computation in its head using the remaining triples and commits to the views of the parties as outlined before. The verifier asks to open a certain subset of the views of the parties, these are decommitted by the prover, and the verifier checks first that the opened triples were correct, then that the views are consistent, and then that the output is correct as well. This follows a cut-and-choose paradigm, because the prover first generates all these commitments, and then, in an interaction called a cut-and-choose protocol, the verifier checks some of them while the rest are used in the actual protocol.
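Before moving on, here is a minimal sketch of the sacrificing-style check on plain (unshared) values; in the actual protocol all of these values are secret-shared and the openings are done by the simulated parties, so this only shows the underlying arithmetic, with an illustrative field size.

```python
import secrets

P = 2**61 - 1  # prime field; illustrative choice

def sacrifice_check(x, y, z, a, b, c, p=P):
    """Check z == x*y by sacrificing the triple (a, b, c) with c == a*b.

    The random challenge eps makes an incorrect (x, y, z) pass with
    probability only 1/p.
    """
    eps = secrets.randbelow(p)            # random public challenge
    rho = (eps * x - a) % p               # would be opened from shares
    sigma = (y - b) % p                   # would be opened from shares
    v = (eps * z - c - sigma * a - rho * b - rho * sigma) % p
    return v == 0

# A correct multiplication passes; an incorrect one is almost always caught.
a, b = secrets.randbelow(P), secrets.randbelow(P)
assert sacrifice_check(3, 5, 15, a, b, (a * b) % P)
assert not sacrifice_check(3, 5, 16, a, b, (a * b) % P)
```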
In our protocol we instead follow the second approach outlined before. First, the prover commits towards the verifier to random multiplication triples, but also to an evaluation of the circuit; for that, it commits to secret shares of all the wires of the circuit. The verifier then asks the prover to use certain randomness for a verification computation on these secret shares, and the prover simulates in its head this verification computation with that randomness and again commits to the views of the parties. The prover is then asked by the verifier to open a subset of the views, and it decommits those views. The verifier first checks that these views are consistent as before, then that the actual computation of the circuit gives the correct output, and then that the verification of the multiplications shows that all the multiplications were correct. After this, it decides to accept or reject accordingly. In comparison to the protocol by Katz et al., this is a sacrificing-based approach, in the sense that we sacrifice one random multiplication triple to check one multiplication in the actual circuit.

As was shown, both cut-and-choose and sacrificing can thus be used to construct an MPC-in-the-head protocol. Both lead to similar communication cost per multiplication gate in the simulated MPC protocol, and the MPC protocol that is used has free addition gates and free multiplication-by-constant gates, which is important for our application. Over rings or small fields, sacrificing as we present it is not that beneficial: the soundness of the sacrificing check is proportional to the field size, and if the field is small then the advantages are not that big. On the other hand, if we do the computation over a large prime field, then the soundness of the sacrificing check allows us to use fewer repetitions in the MPC-in-the-head protocol, and thus we get less overall communication.

When using the MPC-in-the-head paradigm as we do, the communication complexity of the overall protocol directly depends on the communication complexity of the MPC protocol that is simulated in the head of the prover. If we have free multiplications by constants and free additions, this means that we can construct a low-overhead zero-knowledge argument for the so-called SIS problem. In the SIS problem, the prover and the verifier both have a matrix A and a target t, and the prover wants to convince the verifier that it has a vector s such that A·s = t and, furthermore, that s is small, meaning that it is below a bound β that the verifier also has. It is clear that this would be a very easy task if we only wanted to show that A·s = t; the hardness comes from the fact that s has to be small, and this is a crucial building block for all kinds of post-quantum lattice-based cryptographic protocols, so it is very important to have efficient protocols for proving such a statement. In order to demonstrate how to apply our zero-knowledge protocol to the SIS problem, let me first introduce the binary SIS problem. In binary SIS, the prover wants to convince the verifier that it knows a value s such that A·s = t and that s is actually a bit string.
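To make the binary SIS relation concrete, here is a minimal sketch of the statement being proven; the modulus and the dimensions are small illustrative choices, not the parameters from our experiments.

```python
import secrets

Q = 2**61 - 1            # illustrative prime modulus
N_COLS, N_ROWS = 16, 8   # illustrative dimensions (much larger in practice)

def binary_sis_relation(A, t, s, q=Q):
    """Prover's claim: A*s = t (mod q) and every entry of s is a bit."""
    As = [sum(A[i][j] * s[j] for j in range(len(s))) % q for i in range(len(A))]
    bits_ok = all(si * (si - 1) % q == 0 for si in s)  # s_i in {0, 1}
    return As == t and bits_ok

# Build a random instance with a known binary witness.
A = [[secrets.randbelow(Q) for _ in range(N_COLS)] for _ in range(N_ROWS)]
s = [secrets.randbelow(2) for _ in range(N_COLS)]
t = [sum(A[i][j] * s[j] for j in range(N_COLS)) % Q for i in range(N_ROWS)]
assert binary_sis_relation(A, t, s)
```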
Now, if the prover secret-shares s to the parties in the MPC protocol, then the first of the two steps in the proof is actually for free, because linear operations on secret-shared values are free and the matrix A is a public value known to both the prover and the verifier; so all we actually have to take care of in the proof is showing that s is a bit string. Showing that s only consists of bits is easy, because each value s_i is a bit if and only if s_i · (s_i − 1) equals zero. That means that if we follow this very straightforward approach, we obtain a circuit that has m multiplication gates, where m is the length of s, so this is the first step of implementing this. For arbitrary bounds, this can be handled by constructing general range proofs: since the prover knows the binary decomposition of each value s_i, it can prove that each value of this binary decomposition is a bit, and then we can compute the linear combination of these bits in the MPC protocol, which is free, and then perform the multiplication with the matrix A, which is also free. More generally, using such a standard range proof this means m · log β multiplication gates, so this is a very direct way of applying our protocol to the SIS problem (a small sketch of this direct approach follows below).

The outlined solution only uses standard multiplication gates, so an obvious question is whether one can do better, for example by having a circuit with fewer multiplication gates or by using special gates, and in our work we show that this is indeed possible. First of all, bit tests can be done using square gates instead of multiplication gates, where square gates have less overhead in the preprocessing stage. Furthermore, for the binary SIS problem we show a circuit that uses just one multiplication gate instead of m multiplication gates, where m is the length of the input vector, and for general SIS we show a circuit that has one multiplication gate but where the proof is approximate, meaning that the proof does not give the exact bound β but something that is a bit bigger than β. Note that many proofs in the SIS setting only give such an approximate guarantee. We also show a solution for the general SIS problem that does not use any multiplication gates whatsoever.

The main idea that we use to construct this is what we call circuit sampling. The idea of circuit sampling is that the prover and the verifier together negotiate the circuit that will finally be evaluated by the MPC protocol. Here the hope is that if we have a large number of possible circuits to choose from, we can choose a class of circuits that overall has smaller complexity and thus reduce the communication complexity of our protocol. This works as follows. First, the prover commits to an extended witness: from the witness that it has for the original circuit, it computes an extended version that contains some additional information which can be helpful during the evaluation of the negotiated circuit. Then the prover and the verifier together pick such a circuit, and the circuit is applied to the extended witness as before. The idea is that a prover might try to tailor its extended witness to one specific circuit instance, but if there is a large number of possible circuits, then a prover cannot alter or make up a witness that would make all of these circuits evaluate to a positive outcome while at the same time it does not have a valid witness to begin with.
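Here is the promised sketch of the direct range-proof approach on plain values (in the protocol everything below runs on secret shares): each coordinate is decomposed into bits, each bit costs one multiplication, and the recombination is a free linear operation; the modulus and the bound are illustrative assumptions.

```python
Q = 2**61 - 1          # illustrative prime modulus
BETA = 2**10           # illustrative bound, so each s_i uses log2(beta) bits

def bit_decompose(v, nbits):
    """Binary decomposition of v; the prover knows it and adds it to the witness."""
    return [(v >> k) & 1 for k in range(nbits)]

def range_check(s_i, nbits, q=Q):
    """One bit test (a multiplication) per bit, then a free linear recombination."""
    bits = bit_decompose(s_i, nbits)
    bits_ok = all(b * (b - 1) % q == 0 for b in bits)          # nbits multiplications
    recombined = sum(b << k for k, b in enumerate(bits)) % q   # free linear operation
    return bits_ok and recombined == s_i % q

nbits = BETA.bit_length() - 1   # log2(beta) bits per coordinate
s = [3, 512, 1023]              # example witness entries, all below beta
assert all(range_check(si, nbits) for si in s)
# Total cost of this direct approach: m * log2(beta) multiplication gates.
```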
Circuit sampling in itself is not necessarily straightforward, for two reasons. First of all, as mentioned, soundness might be complicated, in particular since we also want the prover to have a say when choosing such a circuit, for reasons that will become clear later; and furthermore, we still have to make sure that the zero-knowledge property of the protocol is preserved.

As an example of the circuit-sampling technique, consider the range proofs that we have to do in our protocol in order to verify that a witness is indeed valid for an SIS instance. For the range proof to go through, we have to test whether input values are bits, and this is done by multiplying each input value with the input value minus one and then testing whether the outcome of this product is zero. So if we have n input values and we want to check that all of them are bits, this means that we have to do n multiplications, and for these n multiplications we originally have to generate n random multiplication triples in the preprocessing. Both these multiplications and this preprocessing become part of the communication to the verifier.

So can we do better? Yes, indeed we can, by amortizing these bit tests. For an amortized bit test that checks whether all values of w are bits simultaneously, we instead have the following solution. First, let us define a polynomial f of degree n − 1 where f at point i evaluates to w_i, meaning f at point 1 evaluates to the first element of the witness, and so on. This polynomial is uniquely defined by the input w, and furthermore we can evaluate it at any value in the field using Lagrange interpolation. Lagrange interpolation only requires us to perform linear operations, and these are for free in the MPC scheme. Moreover, let us additionally define a polynomial g which at point i evaluates to w_i − 1. Additionally, as part of the extended witness we share a polynomial h, and we then have to test whether h is indeed the product of f and g. This can be done using only one multiplication: the verifier chooses a random value from the field, the prover evaluates h, f and g at that point, multiplies f at that point with g at that point and subtracts the result from h at the value chosen by the verifier, and if the output of this computation is zero, then by the Schwartz–Zippel lemma, with overwhelming probability the polynomial h of degree 2n − 2 is indeed the product polynomial of f and g.

This helps us establish that f and g, defined as above, imply that all the values of w are bits, as follows. By the fact that the evaluation of a polynomial at a certain point is a homomorphism, we only have to check that h at the points 1 to n is indeed zero, but this actually comes for free: we can simply construct h this way, because the verifier already knows that h must have this specific shape. That means that in order to send h as part of the extended witness, we only have to send n − 1 additional values, so we have a larger extended witness, which now has length 2n instead of n, but we only have to do one multiplication. Multiplications are the specifically costly part in our proof system, so this saves communication. For the details, please have a look at the paper.
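Here is a minimal sketch of this amortized bit test on plain values over a prime field; in the protocol, f, g and h are held as secret shares and the single product check is done inside the MPC with a sacrificed triple, so the modulus, the witness and the direct Lagrange evaluation below are purely illustrative.

```python
import secrets

Q = 2**61 - 1  # illustrative prime modulus

def lagrange_eval(points, values, x, q=Q):
    """Evaluate the unique polynomial through (points[i], values[i]) at x (mod q)."""
    total = 0
    for i, (xi, yi) in enumerate(zip(points, values)):
        num, den = 1, 1
        for j, xj in enumerate(points):
            if i != j:
                num = num * (x - xj) % q
                den = den * (xi - xj) % q
        total = (total + yi * num * pow(den, -1, q)) % q
    return total

w = [1, 0, 1, 1]                 # witness bits; n = 4
n = len(w)
pts_n = list(range(1, n + 1))

# Prover: f(i) = w_i, g(i) = w_i - 1, and h = f*g has degree 2n-2, so it is
# determined by its values on 2n-1 points; h(1..n) = 0 is known to the verifier,
# so only the remaining n-1 values join the extended witness.
pts_h = list(range(1, 2 * n))
f_on_h = [lagrange_eval(pts_n, w, x) for x in pts_h]
g_on_h = [lagrange_eval(pts_n, [(wi - 1) % Q for wi in w], x) for x in pts_h]
h_on_h = [fi * gi % Q for fi, gi in zip(f_on_h, g_on_h)]
assert h_on_h[:n] == [0] * n     # holds iff every w_i is a bit

# Verifier: pick a random point r and check h(r) = f(r)*g(r), one multiplication.
r = secrets.randbelow(Q)
f_r = lagrange_eval(pts_n, w, r)
g_r = lagrange_eval(pts_n, [(wi - 1) % Q for wi in w], r)
h_r = lagrange_eval(pts_h, h_on_h, r)
assert (h_r - f_r * g_r) % Q == 0
```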
We also make other techniques available for SIS in our paper; in particular, we show that rejection sampling can be done if both the prover and the verifier have a say in which circuit will be used, and for that I also want to refer to the full version of our paper, which can be found on ePrint as mentioned before.

We implemented our protocol and tested how practical it is in terms of runtime. For that, we implemented the binary SIS solution, but the unoptimized version where each bit of the witness is tested individually using a multiplication. We used an Amazon c5.9xlarge instance for each of the prover and the verifier, and we ran our protocol for different parameters: namely, we let the field size range from 15 bits up to 61 bits, and the matrix A range in dimension from 256 × 1024, meaning compressing from 1024 input values over Z_q down to 256, up to a matrix compressing from 4096 down to 512. Observe that in our protocol the matrix A is unstructured, meaning that we cannot use additional techniques for efficient polynomial multiplication. For example, if we choose the largest instance that we measured, namely a modulus of 61 bits and a matrix that shrinks from 4096 to 512 field elements, and if we want 40 bits of soundness in our interactive argument, then the protocol runs in 1.2 seconds if both the prover and the verifier have only a single thread at their disposal; if each of them uses 60 or more threads, then, as can be seen in the graphic, the runtime shrinks to around 250 milliseconds for the overall protocol. We observed in our implementation that over 90 percent of the runtime is actually due to the multiplication with the unstructured matrix A, which in our experiment was chosen uniformly at random; if we instead had A as a ring-SIS instance, then using efficient NTT algorithms we could probably achieve a much lower runtime for the overall argument.

Depending on whether one prefers a faster argument or less communication, one has to choose the parameters for the MPC instances differently. If one chooses more parties per MPC instance, then each MPC instance has a better soundness guarantee and one needs fewer MPC instances that have to be communicated to the verifier; this gives less overall communication, but it increases the computation time, because the prover and the verifier now have to run more parties in their heads. If one instead uses fewer parties per MPC instance, this gives a lower soundness guarantee for each individual MPC instance, but it leads to less overall computation time, because one simply repeats the protocol more often, and this parallelizes nicely with multithreading.

To conclude, our paper makes the following three contributions. First, we generalize the MPC-in-the-head-with-preprocessing paradigm in order to support both protocols with cut-and-choose and with sacrificing, and we give full proofs for both of these approaches. Second, we show how to apply our protocol to the SIS problem in order to construct an efficient interactive argument system. Third, we introduce some new ideas to lower the communication complexity in the SIS setting, and in order to prove security for these techniques we construct a new approach called circuit sampling, which might be of independent interest. With this, I would like to thank you for your attention.