Hi, my name is Jan Jancar, and thanks for watching our pre-recorded talk, Minerva: The Curse of ECDSA Nonces. This work was done in cooperation with my colleagues at CRoCS, the Centre for Research on Cryptography and Security at Masaryk University. This talk presents our discovery of a group of side-channel vulnerabilities in implementations of ECDSA, which we titled Minerva. It also presents a systematic analysis of lattice attacks on noisy leakage of the bit length of ECDSA nonces, using data collected from vulnerable implementations.

Now, at the beginning of our research into implementations of elliptic curve cryptography stands the ECTester tool. We created it as a tool for testing black-box ECC implementations, targeting both Java Cards and software libraries. Java Card is a very popular programmable smart card platform that runs a subset of Java. It is likely the most used programmable smart card platform out there, and any vulnerability found in one is likely to have a large real-world impact. The general idea of the tool is to independently verify implementations of ECC for correctness and security. The tool contains test suites, which you might imagine as testing known attacks against ECC, such as invalid-curve or twist attacks and so on.

Now, to move forward with this talk, I will fix some notation. We will be working over a short Weierstrass curve over a prime field, with a generator G of prime order n, and we are interested in ECDSA signatures and ECDSA signing. In our target implementations, the signing operation looks like this. First, a uniformly random nonce is generated; then scalar multiplication is performed with this nonce, and the x-coordinate of the result is the first part of the signature. Third, some modular arithmetic is performed with the hash of the message, the private key, and the nonce. Lastly, the two components of the signature are exported as an ASN.1 DER-encoded sequence.

Now, our ECTester tool obviously had some support for ECDSA, and so it implemented several ECDSA tests. As the signatures are encoded in ASN.1, which is somewhat known for parser errors, we included tests for ASN.1 parsing in ECDSA signature verification, which passed on all implementations. We also tested signature malleability, essentially testing whether signature verification accepts things like unreduced scalars and so on, and this also passed on all the implementations. When testing with just ordinary test vectors, we discovered some discrepancies where some smart cards did not verify correct signatures, but we found nothing exploitable, merely an implementation broken on some parameters. We also looked at nonce randomness and verified how uniformly random the nonces generated by the implementations are, and we found no obvious issues.

Now, having run out of interesting and directly testable things, we decided to also look at timing. This is a heat map of the most significant byte of the random nonce used in signing, which is on the x-axis, against the duration of signing on the y-axis. This is how a constant-time implementation looks, and we expected to find something similar in all implementations. Instead, we found this on a smart card implementation, then this in the default Java implementation, then this, and again, and then this, and even this, which I don't really know how to explain. So the timing test clearly fails on many of these implementations.
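As a rough illustration of the signing flow just described, here is a minimal Python sketch over a toy additive group (plain modular addition stands in for the curve group; the constants and names are purely illustrative, and this is not the code of any tested implementation). The detail to notice is that a textbook left-to-right double-and-add loop runs exactly k.bit_length() times, which is one way signing time can end up correlated with the leading zero bits of the nonce.

```python
import hashlib
import secrets

# Toy additive group standing in for the curve: "points" are integers mod q and
# "point addition" is addition mod q, so the group order is also q. Illustrative only.
q = 2**127 - 1          # a Mersenne prime, used as both field and group order here
G = 0xDEADBEEF % q      # arbitrary "generator"

def scalar_mult(k: int, P: int) -> int:
    """Left-to-right double-and-add: the loop runs exactly k.bit_length() times,
    so a nonce with more leading zero bits finishes measurably faster."""
    R = 0
    for i in reversed(range(k.bit_length())):
        R = (R + R) % q          # "double"
        if (k >> i) & 1:
            R = (R + P) % q      # "add"
    return R

def sign(msg: bytes, priv: int) -> tuple[int, int]:
    h = int.from_bytes(hashlib.sha256(msg).digest(), "big") % q
    k = secrets.randbelow(q - 1) + 1            # uniformly random nonce
    r = scalar_mult(k, G) % q                   # x-coordinate of k*G in real ECDSA
    s = (pow(k, -1, q) * (h + r * priv)) % q    # the modular-arithmetic step
    return r, s                                 # exported as an ASN.1 DER sequence in practice
```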
In total, we found one leaking smart card and five libraries. Then we learned about the TPM-FAIL paper, which found two implementations leaking the same way, and recently another paper discovered this leak in the Mozilla NSS library, and who knows where else this leak might be. In total, we tested 13 libraries and eight smart cards, all with various scalar multiplication algorithms, and found several different leakages, as you can see on the previous slide. It's important to note that the Athena smart card in which we found the clearest leakage was also Common Criteria certified, I think at EAL level 4, and it was also FIPS 140-2 validated. So even certified devices succumb to such simple leakages.

Now, let's look at what is actually leaking and how. Here you see the same most-significant-byte versus signature-time heat maps as before, and it turns out that the Athena, the libgcrypt, and the SunEC Java implementations leak the bit length of the nonce roughly linearly. You can see in libgcrypt that there are several layers of leakage, but they only differ by a constant offset, most likely caused by the operating system or similar effects. The MatrixSSL implementation leaks the bit length, but to some extent also the Hamming weight. The Crypto++ and wolfSSL implementations leak in a more complex way that depends on the most significant byte of the random nonce, but they also leak the bit length in the process. We will however focus on the Athena, libgcrypt, and SunEC implementations, which leak the bit length linearly.

Now, this is an overlay of several aligned power traces of ECDSA signatures on the Athena smart card with different bit lengths of the nonce. As you can see, the leakage happens in the scalar multiplication loop, which loops exactly bit-length many times, which creates the timing difference and is also clearly discernible on the power traces. When we instead look at bit length versus time, the linear but noisy leakage is evident. So from a power trace the attacker is able to count the loops very simply and arrive at the exact bit length of the nonce without any noise, but if only the timing from the reader side is available, there is enough noise that discerning individual bit lengths of the random nonce becomes harder.

Now, to be able to compare the leakage in the implementations and to simulate it, we modeled this leakage as a random variable L consisting of a constant base time, which for each implementation represents the constant-time part of the computation, like hashing for example, then the interesting leakage of the bit length from the scalar multiplication loop, and then some noise. In this model, each implementation has three parameters specifying how long it spends in the constant-time operations, how long one iteration of the loop takes, which is the iter_time constant, and what the standard deviation of the noise is. Below you can see a histogram of the Athena smart card leakage, where the signature time is on the x-axis, and you can see it split into different groups based on the bit length of the random nonce.
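As a small, hedged illustration of that leakage model (the parameter values below are made up, not the fitted values from our measurements), one can simulate it directly: the signing time is base_time plus k.bit_length() times iter_time plus Gaussian noise, and grouping the samples by bit length reproduces the clustered structure visible in the histogram.

```python
import random
import secrets
from collections import defaultdict

N_ORDER = 2**256 - 189  # illustrative 256-bit bound for the nonce, not a real curve order

def simulate_signature_time(base_time: float, iter_time: float, sigma: float) -> tuple[int, float]:
    """Draw one signature under the model L = base_time + bitlen(k) * iter_time + N(0, sigma^2)."""
    k = secrets.randbelow(N_ORDER - 1) + 1
    t = base_time + k.bit_length() * iter_time + random.gauss(0.0, sigma)
    return k.bit_length(), t

# Illustrative parameters only; a larger sigma blurs the bit-length groups together.
samples = [simulate_signature_time(base_time=1000.0, iter_time=5.0, sigma=3.0) for _ in range(10000)]
groups = defaultdict(list)
for bits, t in samples:
    groups[bits].append(t)  # one timing cluster per nonce bit length, as in the histogram
```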
Now, you might be asking: okay, but how do you exploit this? It's only one leading zero bit on average per signature, so about one bit of information per signature, and there is noise. What are you going to do about it? You are most likely going to have errors, and it seems unexploitable from the get-go.

However, luckily Boneh and Venkatesan introduced the hidden number problem way back at CRYPTO '96. The hidden number problem is formulated as recovering a secret element given some number of most significant bits of random multiples of it. It is also stated for least significant bits and other variants, like approximations, but we are interested in the most-significant-bits statement. This may start to sound familiar, and indeed, if we think about what the leakage of leading zero bits of the nonce gives us, we first have this: basically, we have an oracle giving us the most significant bits of a random nonce, and when we expand this using the relation of the nonce to the signature and the private key, we finally have this. The original paper also presents a way of solving the hidden number problem by transforming it into an instance of the closest vector problem, which allows us to exploit this leakage.

Now, the basic attack on this kind of leakage is from previous works and goes as follows. The attacker first collects N signatures and takes the d fastest, which obviously contain a larger number of leading zero bits and thus the largest amount of usable information. The attack assumes some bounds l_i on the number of leading zero bits of the i-th used nonce; I'm being quite vague here, but let's say the attacker just does this somehow. Next, a lattice with the following basis is constructed, utilizing the bounds and the values from the known signatures. The target vector u is constructed from the known signatures and the assumed bounds, and finally the attacker solves the closest vector problem on this lattice with this target vector. The closest vector will often have this very special form with the private key as the last element. This is because of the inequalities we started with: as you can see, for all i, these quantities modulo n will be very small because of the bounds we put on them, which we could do because we knew these nonces are very short.

Now comes the question: can we improve this attack? As the basic attack was previously known, we decided to analyze it further and see whether there is a way to improve it, mostly with regards to how it handles noise and how to minimize the number of signatures. The attack is pretty quick if you have enough signatures and little noise, and you can get the private key out in about 5 minutes, so we did not focus on lowering the runtime, as it did not seem like a good target, but instead focused on lowering the number of signatures. Also, if there is a lot of noise, this kind of lattice is very sensitive to it, so if any of the inequalities stated with your bounds are not true, the attack breaks down very quickly and you will not find the private key, you will just find some random integer.
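For reference, here is a hedged sketch of a standard lattice construction for this kind of attack, following the usual HNP-to-CVP formulation from prior work (scaling conventions vary between papers, so this is not necessarily the exact matrix on the slides). With t_i = r_i * s_i^(-1) mod n and u_hat_i = -H(m_i) * s_i^(-1) mod n, each signature gives k_i = t_i * x - u_hat_i mod n with 0 <= k_i < n / 2^(l_i), and the basis and target can be built as follows.

```python
def hnp_lattice(ts, u_hats, bounds, n):
    """Build the (d+1)-dimensional lattice basis and CVP target for the HNP instance.
    If the bounds l_i hold, the lattice vector closest to the target is expected to
    have the private key x as its last coordinate. Sketch only; scalings vary."""
    d = len(ts)
    basis = []
    for i in range(d):
        row = [0] * (d + 1)
        row[i] = 2 ** (bounds[i] + 1) * n
        basis.append(row)
    basis.append([2 ** (bounds[i] + 1) * ts[i] for i in range(d)] + [1])
    target = [2 ** (bounds[i] + 1) * u_hats[i] for i in range(d)] + [0]
    return basis, target
```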
To systematically evaluate previously known and new improvements to the basic attack, we used four data sets of measured signatures from the vulnerable implementations, with varying noise. The sim data set is a noise-free simulated data set, the sw data set is from the libgcrypt software library, the tpm data set is taken from the recent TPM-FAIL paper and represents measurements of an STMicroelectronics TPM chip, and finally the card data set represents measurements from the Athena smart card. The card data set obviously has the largest noise, because it is measured on the reader side, the card itself was quite noisy, and the exchange of commands and responses between the card and the reader also added a lot of noise. The simulated data set obviously has no noise.

Now, to get insight into the behavior of the attack with regards to the number of signatures N and to the dimension of the lattice, or the number of used signatures d, we run the attack five times, randomly sampling N signatures out of the selected data set, and we do this for a grid of parameters N and d. So for each point of this 2D grid we have five results of a randomized attack, from which we can look at things like the success rate and other interesting quantities.

Now, the first question to tackle is that of the assignment of the bounds l_i for the i-th fastest signature. Most previous works simply use a constant bound for all the used signatures, calculated based on d essentially, and they also evaluated their attacks with respect to d and presented their graphs accordingly. We instead use geometric bounds calculated based on N, which we introduced as they better approach the true distribution of leading zero bits in the d fastest signatures. Here you can see a sample of the leading zero bits of the d fastest signatures as the blue dashed line, with the geometric bounds as the green line. They obviously overlap quite a lot, and below them you can see box plots of the difference between our geometric bounds and the true leading zero bits. So you can see the errors, and that we are pretty close to the true distribution; we get somewhat more errors on the boundaries of the individual steps.

Now, using geometric bounds gives us a large improvement in the success rate of the attack, as visible from this heat map of the number of successes of the attack out of five tries on the four data sets. We have the constant bounds with c = 3, so claiming three leading zero bits for all the d signatures, on the left, and on the right our geometric bounds, with the data sets color-coded. You can see that it specifically improves the results on the card data set, for which with the constant bounds there are only about four or five successes out of all the thousands of attack tries that we did. With the geometric bounds the success rate improved quite a lot, and also for the other data sets. As we want to analyze the success rate with regards to the number of signatures, we average the success rate over the dimension range and give the results here. This shows a significant improvement in success rate with geometric bounds for basically all of the data sets, and all further experiments will thus use these geometric bounds.

While the hidden number problem naturally transforms into the closest vector problem, we can also use an embedding strategy to further transform it into the shortest vector problem. We investigated both variants, using Babai's nearest plane algorithm to solve CVP and a search over the reduced basis vectors to solve SVP. We confirm the findings of previous works that show SVP solving performing better and achieving success at a smaller number of signatures. There also exist other methods of solving both of these problems, which might improve the success rate even more, but in all of our further experiments we used SVP solving if not otherwise noted.
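Coming back to the geometric bounds for a moment, here is one plausible way such bounds can be computed (a hedged sketch of the idea; the exact formula in the paper may differ): out of N random nonces, roughly N / 2^l have at least l leading zero bits, so the i-th fastest signature can be assigned about log2(N / i) of them.

```python
import math

def geometric_bounds(N: int, d: int) -> list[int]:
    """Assign bound l_i to the i-th fastest of N signatures, assuming the number
    of leading zero bits of a random nonce is geometrically distributed."""
    return [max(math.floor(math.log2(N / i)), 0) for i in range(1, d + 1)]

# Example: the fastest signatures get the largest bounds, decreasing in steps as i grows.
print(geometric_bounds(10000, 80))
```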
Recentering is another possible improvement to a lattice attack like this one, which has also been used in previous works. As the nonces are non-negative and the inequalities only bound their absolute value, we can recenter them by subtracting half of the bound, obtaining a bound tighter by one bit. So you subtract this half value, and because the inequality bounds an absolute value, you gain a tighter inequality. It can be thought of as obtaining a free bit of information, or rather as using the bit of information that the nonces are non-negative. When evaluated, we found a significant improvement in success rate on all data sets except the most noisy one. For the simulated data set this improvement decreased the minimum number of signatures needed for success to only 500. All further experiments thus used recentering if not otherwise noted.

Errors arise when the inequalities given by the assumed bounds do not hold due to noise, and even in small quantities they very often lead to an unsuccessful attack. So another attack improvement used in previous works aims to avoid errors by using random subsets of signatures. We evaluated this by sampling d random signatures out of the 1.5d fastest signatures, doing this 100 times, and comparing the success rate to the attack without random subsets. We found that for the most noisy data set this indeed improves the success rate, as can be seen from the blue lines on the graph. However, it decreases the success rate for the other, less noisy data sets. This can be explained by the fact that taking a random sample out of the 1.5d fastest signatures does not choose the signatures with the largest amount of information, that is, the d fastest signatures. As this is also a time-consuming technique, since each attack now runs about 100 times, we did not use it in the further experiments. But it is a promising technique for fixing errors in very noisy and error-prone data sets.

Another possible improvement of the attack is that instead of bounding the nonces themselves, we can bound a difference of two nonces. If we pair nonces with matching bounds, no information should be lost, and errors might cancel out. If, for example, two nonces with the same assumed bound both had one bit more than the assumed bit length, subtracting them would clear that bit and fix the resulting inequality. The resulting value of the subtraction might be negative, so we can no longer use recentering. This technique improved the success rate, which was at first quite surprising to us, until we found out how it actually fixes errors. It improved the success rate on all but the most noisy data set and is comparable with recentering, so you can think of it as a variant of that technique. We think that it did not help the most noisy data set because there is just so much noise that the chance of correcting most of the errors is too small.
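A hedged sketch of how such a differenced instance could be set up (the pairing strategy here is illustrative, not necessarily the one we used): for two signatures with the same assumed bound l, the difference k_i - k_j = (t_i - t_j) * x - (u_hat_i - u_hat_j) mod n is bounded in absolute value by n / 2^l, it may be negative, and an error bit just above the bound that both nonces share cancels out.

```python
def difference_instance(ts, u_hats, bounds, n):
    """Pair up signatures with equal assumed bounds and bound the nonce differences
    instead of the nonces themselves. Sketch only; the pairing is illustrative."""
    grouped = {}
    for i, l in enumerate(bounds):
        grouped.setdefault(l, []).append(i)
    new_ts, new_us, new_bounds = [], [], []
    for l, idxs in grouped.items():
        for i, j in zip(idxs[::2], idxs[1::2]):
            new_ts.append((ts[i] - ts[j]) % n)
            new_us.append((u_hats[i] - u_hats[j]) % n)
            new_bounds.append(l)  # |k_i - k_j| < n / 2**l, but the value may be negative
    return new_ts, new_us, new_bounds
```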
In this kind of lattice attack, the lattice reduction step is the most costly. The technique of random subsets is able to avoid errors, but has to run the costly step many times to do so. We introduce a technique which allows us to fix some errors without running the lattice reduction step many times, instead running the relatively cheap Babai's nearest plane algorithm many times. It is important to note that when solving via CVP, the u values are not part of the lattice basis construction. So you have the lattice basis and you have the vector u, and we can construct the lattice and reduce it only once, and then try to solve CVP with many different vectors u, with changes in some of the high bits of the elements u_i. This actually corresponds to clearing errors where some high bits of the nonce k_i are set: if you imagine flipping some high bit of u_i, it corresponds to flipping that bit in the nonce k_i, checking whether there was an error, whether you corrected it this way, and hoping that you will then find the private key.

We evaluated this technique by trying to correct up to three errors at positions one bit over the bound, and we stopped the attack as soon as one try was successful. When you are fixing errors this way you have to fix them all at once, so you have to try all possible triples across your d signatures, which obviously takes some time. The heat maps display the minimal number of errors fixed before an attack was successful. As you can see, at the boundary between the attack not working at all and working without fixing any errors, there is quite a lot of improvement, where this kind of error fixing actually moved the attack from not working to succeeding. So this looks like a promising technique for fixing errors in lattice attacks when using CVP, and for saving time by not reducing the lattice multiple times but only changing the target and searching around different places of the lattice. When we average this over the dimensions, we also see a significant improvement in the success rate. It is only comparable to the success rate when using ordinary SVP, so this is not an improvement that goes beyond SVP solving, but it matches it while using CVP, and it might be possible to make it even better.

Now to conclude: we have performed a systematic analysis of lattice attacks on noisy leakage of the bit length of nonces in ECDSA. We found that our geometric assignment of bounds lowers the minimum number of signatures needed for attack success, and that SVP solving outperforms CVP solving via the nearest plane algorithm. We found that recentering improves the success rate, and that correcting errors via bit flips in the u values is a promising improvement. Along the way we also demonstrated an attack on data from the TPM-FAIL paper, in which we lowered the number of signatures required from 40,000 to only 900. We also demonstrated the attack on the Athena smart card, which was Common Criteria certified and FIPS 140-2 validated. So thanks again for watching this talk, I hope you enjoyed it. You can check out the paper and other supplementary material at this link. It contains our proof-of-concept code, more figures, the paper itself, and so on.
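As a small supplement to the proof-of-concept material mentioned above, here is a hedged sketch of the candidate-flip search described earlier. It assumes a caller-supplied solve_cvp callable that wraps Babai's nearest plane algorithm over a basis reduced once beforehand and returns the recovered private key or None; the exact position and sign of the corrected bit depend on conventions and are illustrative.

```python
from itertools import combinations

def attack_with_flips(u_hats, bounds, n, solve_cvp, max_errors=3):
    """Retry CVP solving with targets whose coordinates are shifted by the bit just
    above the assumed bound, i.e. guessing that some nonces had one leading zero
    bit fewer than assumed. Sketch only; solve_cvp is assumed, not provided."""
    lam = n.bit_length()
    d = len(u_hats)
    for errors in range(max_errors + 1):
        # errors == 0 tries the unmodified target first, then single flips, pairs, triples
        for positions in combinations(range(d), errors):
            candidate = list(u_hats)
            for i in positions:
                # shifting u_hat_i by 2^(lam - l_i) corresponds to clearing that bit of k_i
                candidate[i] = (candidate[i] + 2 ** (lam - bounds[i])) % n
            key = solve_cvp(candidate)
            if key is not None:
                return key, errors
    return None, None
```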