Hi, my name is Daniel and I'm going to be talking about our work on improving the performance of the Picnic signature scheme. This was joint work with Greg from Microsoft Research. In this talk I'll first go over the background of Picnic: what is Picnic and how does it work internally? Then we're going to explore some of the parameter choices that can be made in the Picnic signature scheme. Finally, I'll show you some protocol optimizations that we performed for Picnic, and our new parameter set, called Picnic 3, that combines all of them.

In the 90s, Peter Shor presented a quantum algorithm for integer factorization and discrete logarithms. This algorithm essentially means that all of the public-key cryptography currently in use is broken once we have powerful enough quantum computers. These quantum computers are theoretically viable, and there are already small-scale prototypes that work; currently there is a lot of engineering effort by the big players to scale these quantum computers up to sizes that can tackle much larger problems. To get ahead of this, the National Institute of Standards and Technology has started a post-quantum standardization project, which aims to find new cryptographic algorithms that are resistant against quantum computers and attacks using them. This project has recently entered its third round, and it aims to find new algorithms for key encapsulation, public-key encryption, and digital signatures. Picnic is one of the digital signature candidates in this post-quantum standardization project, and over the next few slides I'm going to show you how Picnic works internally.

At the beginning we have to talk briefly about multiparty computation (MPC). Multiparty computation allows a set of n players to compute a shared function on some secret inputs and arrive at a common output. Importantly, the individual players in this MPC learn nothing about the inputs of any other player, and they learn nothing about any intermediate values in the computation of this function. If you choose the right MPC system, then even if n-1 players, so all but one, collude with each other and share their information, they cannot learn anything about the remaining player's input.

From these MPC protocols you can actually build zero-knowledge proofs, and that is a very nice idea called MPC-in-the-head. It transforms a semi-honest MPC protocol into a zero-knowledge proof of knowledge. The general idea behind this MPC-in-the-head approach is as follows. You have an MPC protocol with n players and (n-1)-privacy, so that even n-1 parties working together cannot learn anything. The prover in our zero-knowledge proof simulates the execution of the function he wants to prove inside this MPC: the prover plays the role of every single party and commits to the state of each party during the execution of the MPC protocol. The verifier can then ask the prover to open all but one party of his choice, and because of the privacy property, the verifier learns nothing about the secret input from these n-1 opened parties. But he can still check that the n-1 opened parties are consistent with each other, so he gains some assurance that the function was actually executed correctly. This can be repeated again and again until the verifier gains enough assurance to accept the proof; a rough sketch of a single round of this idea is shown below.
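To make the commit-and-open structure concrete, here is a minimal Python sketch of a single round, assuming a toy "MPC" that merely XOR-shares the witness, and plain hash-based commitments; all names and details are illustrative assumptions and not the actual Picnic protocol.

```python
# Toy sketch of one MPC-in-the-head round, for illustration only -- this is
# NOT the actual Picnic/KKW protocol. The "MPC" here just XOR-shares the
# witness among the parties; the hash-based commitment (without commitment
# randomness) is an assumption made for this sketch.
import hashlib
import secrets

N = 3  # number of simulated parties


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def commit(view: bytes) -> bytes:
    # Commitment to a party's view (a real scheme adds commitment randomness).
    return hashlib.sha3_256(view).digest()


def prover_commit(witness: bytes):
    # The prover simulates all N parties: here a party's "view" is simply its
    # XOR share of the witness. He commits to every view.
    shares = [secrets.token_bytes(len(witness)) for _ in range(N - 1)]
    last = witness
    for s in shares:
        last = xor_bytes(last, s)
    views = shares + [last]
    return views, [commit(v) for v in views]


def prover_open(views, hidden: int):
    # Open all parties except the one chosen by the verifier.
    return {i: v for i, v in enumerate(views) if i != hidden}


def verifier_check(commitments, opened, hidden: int) -> bool:
    # The verifier re-computes the commitments of the N-1 opened parties and
    # checks them; (N-1)-privacy means the opened views reveal nothing about
    # the witness itself.
    return (len(opened) == N - 1 and hidden not in opened
            and all(commit(v) == commitments[i] for i, v in opened.items()))


# One round: a single round gives only limited soundness, so the protocol is
# repeated many times in parallel.
views, comms = prover_commit(b"secret witness")
challenge = secrets.randbelow(N)          # verifier's random challenge
opened = prover_open(views, challenge)
assert verifier_check(comms, opened, challenge)
```

In the real protocol the simulated parties of course evaluate an actual circuit on the shared secret, and the views contain all the messages needed to check that evaluation; the sketch only shows the share-commit-open skeleton.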
From this interactive zero-knowledge proof we can build a non-interactive variant, and we do that by means of the Fiat-Shamir transformation, a decades-old standard technique where the prover, instead of getting the challenge from the verifier, computes the challenge himself. Since the challenge needs to be random and unpredictable, the prover sets the challenge to the output of a random oracle applied to his first message. This transformation is proven secure in the random oracle model, and there are also some recent results for the quantum random oracle model. On the slide we can see this transformation again: the protocol now consists of only a single message.

From this non-interactive zero-knowledge proof you can build a signature scheme. To do that, we first build an identification scheme: we pick a block cipher, which in our case we use as a one-way function, and we publish as the public key a single plaintext-ciphertext pair, where encrypting the plaintext under the secret key yields the ciphertext. To identify, we execute a zero-knowledge proof of knowledge of the secret key. To build a signature scheme from this identification scheme, we make it non-interactive via the Fiat-Shamir transformation, but we also include the message in the challenge generation of the non-interactive zero-knowledge proof of knowledge. This is a technique that is already used in existing schemes, for example the Schnorr signature scheme.

The concrete instantiation of Picnic is as follows. In the first round of the NIST post-quantum competition there was the Picnic 1 variant, where we have LowMC as our one-way function, our block cipher. LowMC is a block cipher that is especially suited for execution in MPC protocols: it has a very low number of AND gates, and these AND gates contribute most of the communication during the MPC execution. The MPC protocol was a (2,3)-circuit decomposition, which means that we have three players and 2-privacy, so that two players working together cannot learn anything. The hash function was SHAKE. In Picnic 2, which was added in the second round of the NIST post-quantum competition, we again have LowMC and SHAKE as our building blocks, but the MPC system was replaced with KKW, an MPC protocol with a pre-processing phase. This pre-processing phase is run before the actual execution of the MPC protocol and produces some correlated randomness, which is then used in the online phase.

These two instances of Picnic have very different performance characteristics. Picnic 1 has very fast signing and verification times, at around 1 millisecond, but the signature size is around 30 kilobytes. With Picnic 2, the signing and verification times are much higher, but the signature size is lower, at around 12 kilobytes. We are interested in improving these trade-offs, and therefore we are going to explore the KKW proof system used in Picnic 2 further. The KKW proof system has a few important parameters: the number of MPC parties used in the protocol, the number of parallel repetitions, and the number of online phases that are opened to the verifier. The sketch below illustrates how these three parameters translate into a concrete security estimate.
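As a rough illustration of how these parameters feed into a security estimate, the following sketch evaluates a simplified KKW-style soundness bound; the formula and the concrete parameter values are assumptions for illustration (roughly in the spirit of the Picnic2-L1 and reduced-party settings), not the exact cost analysis from the Picnic specification.

```python
# A simplified illustration of how the three KKW parameters determine the
# soundness error; this is a sketch of the standard bound, not the exact
# attack-cost analysis used in the Picnic specification. Parameter values
# below are assumptions for illustration only.
from math import comb, log2


def kkw_soundness_error(n_parties: int, M: int, tau: int) -> float:
    """Best cheating probability over the number k of pre-processing
    executions in which the prover cheats.

    - Cheated pre-processing survives only if all k cheated executions land
      among the tau online executions:   C(M-k, tau-k) / C(M, tau)
    - In each of the remaining tau-k online executions the prover must guess
      the hidden party:                  (1 / n_parties) ** (tau - k)
    """
    best = 0.0
    for k in range(tau + 1):
        p_preprocessing = comb(M - k, tau - k) / comb(M, tau)
        p_online = (1 / n_parties) ** (tau - k)
        best = max(best, p_preprocessing * p_online)
    return best


# Illustrative parameter sets (assumed values, roughly in the spirit of
# Picnic2-L1 and of the reduced-party trade-offs discussed in the talk).
for n, M, tau in [(64, 343, 27), (16, 252, 36)]:
    eps = kkw_soundness_error(n, M, tau)
    print(f"n={n:3d}, M={M}, tau={tau}: ~{-log2(eps):.1f} bits of soundness")
```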
Using these three parameters, we get exact formulas for the security, and we of course want, for example, 128-bit security; we can also calculate the resulting signature size from the parameters, but the concrete speed is a bit harder to predict. You can use implementation techniques such as packing the parties' shares into one CPU register to gain a lot of speed-up, and we can try to approximate this by counting the number of hashing operations and similar metrics, but the real-world performance of an optimized implementation is what is most interesting for comparison. So what we did first in our work was implement many different parameter sets in the optimized implementation and look at the trade-offs in signing and verification speed, so that we can really compare these numbers. As we can see here, the first row gives the current parameters of Picnic 2, and if we reduce the number of parties we get a small increase in signature size but a much larger improvement in signing and verification times. So these trade-offs can be worth it, and they will be one of the building blocks of Picnic 3.

Another set of parameters we are interested in are those of LowMC. LowMC is not a fixed block cipher; it is actually a family of block ciphers, and it is parametrizable in pretty much every parameter: the block size, the key size, the number of S-boxes per round, and, importantly, the allowed data complexity for an attacker. This is very interesting for Picnic, since we only ever reveal a single plaintext-ciphertext pair to an attacker, and this eliminates a large class of attacks. The original exploration of the LowMC parameter space focused on block and key sizes that match the security levels, but we expanded this search to include instances with a full S-box layer. What are these instances with a full S-box layer? A partial S-box layer applies some number of S-boxes and leaves the rest of the state untouched by the identity function, whereas a full S-box layer sends every bit of the state through an S-box. The problem is that the S-boxes are 3 bits wide, and this does not divide 128 or 256 bits evenly. We therefore propose to move to new state sizes: 129 bits at L1; at L3, 192 bits already works out perfectly; and at L5 we drop down one bit to 255. A tiny check of this divisibility constraint is shown below.
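As a trivial illustration of that constraint, the following snippet checks which candidate state sizes admit a full layer of 3-bit S-boxes; the list of sizes is just the ones mentioned here.

```python
# Full S-box layer check: with 3-bit S-boxes, a full layer needs the state
# size to be a multiple of 3. The candidate sizes are those from the talk.
for bits in (128, 129, 192, 255, 256):
    if bits % 3 == 0:
        print(f"{bits:3d}-bit state: full layer of {bits // 3} S-boxes")
    else:
        print(f"{bits:3d}-bit state: not a multiple of 3, no full layer")
```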
Let's now talk about some of the protocol optimizations that we performed for this MPC protocol. In the KKW MPC protocol we have an invariant that holds for each wire in the circuit: the real value on the wire is masked, and each party holds one share of the mask. Because this masking is linear, XOR gates can be computed for free, and AND gates need multiplication triples, like in many other MPC systems. These multiplication triples are defined as follows: for the masks lambda_a and lambda_b on the input wires, the triple is valid if the sum of all parties' shares of the triple mask lambda_{a,b} equals the product of the sums of the individual shares of lambda_a and lambda_b. In contrast to some other MPC protocols, these multiplication triples are circuit dependent: the triple here is specific to the input wires a and b, and this means that during the pre-processing, where we calculate these multiplication triples, we also have to evaluate the whole circuit, and this is where a large part of the cost comes from.

During this pre-processing, as I said, we generate these multiplication triples: for each AND gate with input wires a and b and output wire c, we read the masks from the random tapes; each party has a random tape from which all of its random values are read. If we just read the masks for a and b and the triple from these random tapes, we are not guaranteed that this is a valid multiplication triple fulfilling the previous condition. So during pre-processing we calculate the error in the multiplication triple and then fix the last party's value so that this error is removed and the triple is valid for these masks. We observe that the error in the multiplication triple depends only on the sum of all parties' masks, not on the individual shares. One optimization we can therefore perform is to apply any linear gates following the AND gates only to the sum of these masks instead of to each party's share individually. While in general this might not seem like a big improvement, due to the design of LowMC, which has a very dense and heavy linear layer, this actually saves a lot of time during pre-processing and results in about 1.5 times faster signing and verification.

The second optimization looks at the online phase. In the online phase, the parties use the fresh output masks from the AND gates and have to apply the linear transformations to these masks, through the linear layer and then through the key addition, until they arrive at the next S-box layer and can perform the procedure for the next AND gates. What we propose is a slight change to the pre-processing: instead of sampling the fresh output mask at the output of the AND gate, we sample the fresh mask at the input of the next layer's AND gate and then calculate backwards, applying the inverse linear transformations until we arrive back at the S-box layer, and fix our multiplication triples that way. This optimization gives us an improved online phase where the parties can simply read the input mask they need for an AND gate from their random tape; the linear operations are already baked in, so to say, from the pre-processing phase, so they never need to compute the linear layer in the online phase, and the linear layer only gets computed during pre-processing. The sketch below illustrates the triple correction for a single AND gate and why its error depends only on the sums of the masks.
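Here is a minimal sketch of the pre-processing correction for one AND gate, working over single bits instead of full LowMC states; the names and the party count are assumptions for illustration, not the actual Picnic implementation.

```python
# Minimal sketch of the pre-processing correction for one AND gate, over
# single bits instead of full LowMC states (illustration only). Each party
# reads its mask shares from its random tape; the last party's triple share
# is corrected so the triple becomes valid.
import secrets

N = 16  # number of MPC parties (an assumed example value)


def random_bit() -> int:
    return secrets.randbelow(2)


# Shares read from the parties' random tapes.
lam_a = [random_bit() for _ in range(N)]   # shares of input mask lambda_a
lam_b = [random_bit() for _ in range(N)]   # shares of input mask lambda_b
lam_ab = [random_bit() for _ in range(N)]  # shares of the triple lambda_{a,b}

# The error depends only on the *sums* of the masks, not on individual shares:
#   error = (sum lam_a) * (sum lam_b) XOR (sum lam_ab)
sum_a = sum(lam_a) % 2
sum_b = sum(lam_b) % 2
error = (sum_a & sum_b) ^ (sum(lam_ab) % 2)

# Correct the last party's share (in the protocol this correction is what
# gets stored as the last party's auxiliary value).
lam_ab[N - 1] ^= error

# Now the triple is valid: sum of triple shares equals product of mask sums.
assert sum(lam_ab) % 2 == (sum_a & sum_b)
```

Because the error is computed from the sums of the shares, any linear layer that follows can be applied once to these sums instead of to every party's share, which is exactly the first optimization described above.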
The third optimization is that at the end of the MPC protocol the parties have to broadcast their shares of the output, so that it can be checked that the unmasked output is actually equal to the ciphertext in the Picnic public key. Our observation is that there are only linear operations between the last S-box layer of LowMC and the final ciphertext, so we can combine the communication for the AND gates of this last S-box layer and the output broadcast into one single message. This new communication can be computed from the previous, older communication, so no additional information is leaked, and this optimization saves n bits of communication per MPC instance, where n is the block size, which is reflected in a lower signature size.

Let's finally talk about Picnic 3. Picnic 3 is a new set of Picnic instances combining the previous improvements: first, we take the trade-off of using 16 parties in the KKW MPC protocol; we use a LowMC instance with a full S-box layer, which also provides better performance; and we apply all of the presented optimizations to the MPC protocol. The existing security analysis for Picnic 2 is still applicable, since the structure of Picnic 3 is very similar.

Let's have a quick look at the performance of Picnic 3. Here we compare it to the old, existing Picnic instances, and we can see that Picnic 3 has signature sizes very close to the old Picnic 2 sizes, only about 2% larger, but the signing and verification times are much, much faster: signing is about 8 times faster and verification is about 5 times faster. In the last row of the table we also give an additional Picnic 3 instance with one more internal round of LowMC, to provide a larger security margin against any future cryptanalytic attacks on LowMC. As we can see, this does not have a very big impact on signing and verification times, but it does increase the signature size a little bit.

In the NIST post-quantum standardization project there is also another submission, SPHINCS+, whose security is likewise based only on the security of symmetric primitives: SPHINCS+ uses hash functions for its security properties, whereas Picnic uses hash functions for its transformations and commitments and LowMC as its block cipher. We can compare the performance of Picnic 3 to the SPHINCS+ instances at the 128-bit security level: the Picnic 3 parameter set provides faster signing times than SPHINCS+ and slower verification times. For the signature size, it is a good middle ground between the fast SPHINCS+ instances, which have a signature size of about 17 kilobytes, and the small SPHINCS+ instances, which have much smaller signatures of about 8 kilobytes but very large signing times. So the Picnic 3 instances provide competitive performance compared to the existing SPHINCS+ instances.

In conclusion, we revisited the parameter choices for the Picnic signature scheme, presented some optimizations for the MPC protocol, and finally proposed the new Picnic 3 instances. These Picnic 3 instances provide 8 to 14 times faster signing and about 5 times faster verification than the previous Picnic 2 instances, while the signature size is very comparable, at around 2% larger. We implemented these Picnic 3 instances in the optimized implementation, which can be found at this GitHub link, and they are also part of the round 3 submission to NIST, where Picnic has advanced to the third round as an alternate candidate in the signature category. In the full paper you can also find more on optimized instances for interactive verification, and some benchmarks for alternative hash functions that provide better performance than SHAKE, such as the KangarooTwelve hash function.
One avenue for future work concerns the cryptanalysis of LowMC, because Picnic has a very specific attacker scenario, where we only ever reveal a single plaintext-ciphertext pair to an attacker, and this is a scenario that is not very often considered in previous cryptanalysis. Therefore there is an ongoing LowMC cryptanalysis challenge, with a total prize fund of 100,000 US dollars, which can be found at lowmcchallenge.github.io. If you are interested in symmetric cryptanalysis and want to broaden the field to this very specific scenario, have a look at that page. Another area for future work on Picnic is an optimized implementation for embedded platforms: in this work we only focused on x86 targets, and our optimized implementation is tuned for those. On embedded platforms we also have to consider resource constraints, since we have lower memory and maybe even lower flash storage, and this is a concern that has to be addressed in implementations, especially with complex components such as the MPC simulation in Picnic 3. This is another area where future work can focus. And with that, I will end my presentation here. I hope to have brought you a little bit closer to this topic of the new optimizations to Picnic and the new Picnic 3 instances. Thank you and goodbye.