Hello everybody. Good evening at GPN20. Thank you for coming to the talk. This is Joachim, and he will show us insights into not sufficiently random randomness in digital signatures. He usually works on programming language design, and he will present in English, so I am greeting you all in English. Let's enjoy the talk. Afterwards, you have time for questions. If you need to leave the room during the talk, please do so quietly and take all your stuff with you. That's it from me. Enjoy the talk. Thank you so much for being here.
Thank you. It's great to be at GPN again. It's great to be at a non-virtual conference again, giving a talk. It's just much more fun this way. I'm going to talk about joint work with Nadia Heninger, a relatively famous cryptanalysis person, now living in San Diego as a professor. I mostly did some Python programming and helped a little bit. She's the real cryptanalysis crack, so all the tough math is not really mine here. Speaking of tough math, I decided to cut this talk into a very fast 30-minute talk rather than a long one-hour talk. I'll skip all the math. If you have more questions about the math, we can talk about that later. I hope to get through the intro fast and then get to the fun anecdotes. So to warm up a little bit, there's the law of large numbers. That's a math theorem saying that if you do something many times, then on average you get the expected outcome. This is boring. Less boring is the law of truly large numbers. If you imagine something that happens to you with a one in a million chance, which sounds very rare, that happens in New York 10 times a day, just because there are so many people in New York.
And then Nadia's conjecture is: if you look at crypto implementations, code that does signatures and this kind of stuff, and assume there are many different independent crypto implementations, then even the most egregious stupid bug is going to be present somewhere, and that's what she builds her research career on. She finds these bugs and writes papers about them, and that seems to be working well. And we're looking at one of these projects that we did. The context here is ECDSA. It's one of these signature schemes that allow you to make a cryptographic signature, and it's the one used in Bitcoin and Ethereum and many other places. As I said, I will skip the math. The only thing that we really need to look at here is that there's a per-signature nonce called K. K is just a number, a large number because there are always large numbers in crypto, and nonce means number used only once. So you have to use a new one each time you make a signature. Now let's skip the math and go to the ways you can screw up. First, the nonce needs to be kept secret. If you create a signature for some text with your secret key, then during this signature creation you will have to use this nonce. It's just a number. But if you somehow tell anybody about the nonce that you used, then they can take the signature, which is usually public information, and the message you signed, and the nonce, which they shouldn't have, and they can calculate the secret key, which is bad. Why is it bad? Well, the context here might be Bitcoin. The secret key roughly means the account you have with all your bitcoins in it. So if anybody else knows your secret key, they can take all your bitcoins. And that's, I think, bad. Okay, so keep the nonce secret. That's maybe relatively easy. But then it has to be a nonce. It has to be used only once.
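The first pitfall, a leaked nonce, can be illustrated with a minimal sketch in Python. It assumes textbook ECDSA over secp256k1, where a signature is `(r, s)` with `s = k^-1 (h + r*d) mod N`, so a known `k` gives `d = (s*k - h) / r mod N`. The function names are made up for this illustration and are not taken from the talk or the paper. The curve arithmetic is deliberately naive and not constant-time, so it is for demonstration only.

```python
import hashlib
import secrets

# secp256k1 domain parameters (the curve used by Bitcoin and Ethereum).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def scalar_mul(k, point):
    """Double-and-add scalar multiplication (illustrative, not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def sign(d, h, k):
    """Textbook ECDSA signature (r, s) of hash h under secret key d and nonce k."""
    r = scalar_mul(k, G)[0] % N
    s = pow(k, -1, N) * (h + r * d) % N
    return r, s


def recover_key_from_nonce(r, s, h, k):
    """Solve s = k^-1 (h + r*d) mod N for the secret key d, given the nonce k."""
    return (s * k - h) * pow(r, -1, N) % N


d = secrets.randbelow(N - 1) + 1          # victim's secret key
k = secrets.randbelow(N - 1) + 1          # nonce that later leaks
h = int.from_bytes(hashlib.sha256(b"pay 1 BTC to Alice").digest(), "big")
r, s = sign(d, h, k)
recovered = recover_key_from_nonce(r, s, h, k)
print("secret key recovered:", recovered == d)
```

Everything the attacker uses besides `k` is public: the signature `(r, s)` and the message hash `h`.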
So the second pitfall is: if you use the same number in this process of creating a signature a second time, for the next transaction that you're signing with your Bitcoin stuff, then the attacker can just take these two signatures and do some very simple math, and they can calculate K. And as we've just seen, if they know K, they know the secret key and you lose all your bitcoins. So don't use the number twice. Use a different number every time. Now, does it make sense to just count up 1, 2, 3, 4, 5, and so on? No, that doesn't work either. It has to be a really uniformly random number that you choose. Because if you use a different K every time, but the K is small, you have a problem. So what does small mean? We're talking about 256-bit numbers here, so these are very large numbers. And if you don't use all the bits, if you only use the low bits, like the low, I don't know, 64 bits or something, then that's a small number. It's still a pretty large number, but it's small compared to the range you should have. And if you create a few signatures with all these nonces being small, then you can do some not so simple math. This is the lattice math, which is quite fancy and which is the stuff that Nadia does, and I don't. But then you can just throw a few signatures into a big black box called lattice magic and you get the secret key out. And that's bad because, I think I mentioned it, yes, you lose all the bitcoins. Okay, so we have a way of attacking signatures. Now the question is, okay, where do we get signatures to attack? So where would we find many, many, many signatures created by many different implementations of code, maybe some of them even by people who are more enthusiastic about crypto than they have a clue about crypto? Well, obviously... oh, sorry, I'll get to that in a moment. There's more we can do first. So I said we can look at small Ks and you get the secret key. This is the lattice attack we just had. But it's more than just small Ks.
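The "very simple math" for a repeated nonce can be sketched as follows. This is an illustration under stated assumptions: textbook ECDSA with `s = k^-1 (h + r*d) mod N`, and a stand-in for the curve step. In real ECDSA, `r` is the x-coordinate of `k*G`. The recovery algebra never touches the curve, so the sketch replaces that step with a hash of `k`, which keeps the important property that equal nonces give equal `r`. The helper names are hypothetical.

```python
import hashlib
import secrets

# Order of the secp256k1 group; all signature arithmetic is modulo N.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def msg_hash(message):
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def toy_sign(d, h, k):
    """ECDSA-shaped signature. ASSUMPTION: r is a stand-in for x(k*G) mod N."""
    r = int.from_bytes(hashlib.sha256(k.to_bytes(32, "big")).digest(), "big") % N
    s = pow(k, -1, N) * (h + r * d) % N
    return r, s


def recover_from_reuse(h1, s1, h2, s2, r):
    """Two signatures with the same r (same nonce) give up k, then d.

    s1 - s2 = k^-1 (h1 - h2)  =>  k = (h1 - h2) / (s1 - s2)
    s1 = k^-1 (h1 + r*d)      =>  d = (s1*k - h1) / r
    """
    k = (h1 - h2) * pow((s1 - s2) % N, -1, N) % N
    d = (s1 * k - h1) * pow(r, -1, N) % N
    return k, d


d = secrets.randbelow(N - 1) + 1
k = secrets.randbelow(N - 1) + 1           # used twice: the mistake
h1, h2 = msg_hash(b"transaction 1"), msg_hash(b"transaction 2")
r1, s1 = toy_sign(d, h1, k)
r2, s2 = toy_sign(d, h2, k)

k_found, d_found = recover_from_reuse(h1, s1, h2, s2, r1)
print("same r in both signatures:", r1 == r2)
print("nonce and key recovered:", (k_found, d_found) == (k, d))
```

The repeated `r` is what makes reuse visible to anyone reading the signatures, without knowing either secret.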
It also works if all the nonces have a shared prefix. So this is now a large number, this uses all the 256 bits, but the part on the left is the same for all of them. You can attack that by doing simple math between two signatures, and suddenly the common prefix becomes all zero and the numbers are small. So shared prefixes are bad. And shared suffixes are bad too, because then you can do some other math. So obviously, if you pick your Ks in a way that they all share the same prefix or the same suffix, then you're screwed. Okay, now where do we find a bunch of signatures to look at and maybe find interesting secret keys? Well, obviously, in the cryptocurrency space. So that's what we did. We looked at Bitcoin, Ethereum, Ripple. We also looked at non-cryptocurrency spaces, things like SSH and HTTPS implementations. And then we tried to crack them. So that was our plan of attack. We scraped the blockchains for lots of signatures, and there are many signatures there. Then we grouped them by public key, because obviously we need to attack signatures created with the same secret key. There are still many of those. And then we ran the various attacks on them. So we look for the simple one: people just using the same K multiple times. And then we do the fancy one, where we try to find signatures whose nonces are not chosen uniformly at random from the whole range of numbers. And hopefully, after running a lot of computation, we get a bunch of secret keys. Secret keys in Bitcoin equate to access to the accounts. So we can look at how many bitcoins are there. If there are many bitcoins, then we are rich and we can retire. If there are not many bitcoins, then we can write a paper. So we wrote the paper. But it's still fun to write a paper. All right. So what did we find? For the first one, repeated nonces, this is a known problem. It wasn't very new that if you use K twice, you're screwed.
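The first steps of the plan of attack above, grouping signatures by public key and looking for a repeated K, might look roughly like this. It is a sketch, not the code used for the paper. The record layout `(pubkey, r, s, h)` and the function name are assumptions made for the example. The check relies on the fact that `r` depends only on the nonce, so two different signatures under one key with the same `r` used the same K.

```python
from collections import defaultdict


def find_reused_nonces(signatures):
    """Group signatures by (public key, r) and keep groups with a reused nonce.

    `signatures` is an iterable of (pubkey, r, s, h) tuples; this layout is
    hypothetical. Returns a dict mapping (pubkey, r) to a sorted list of the
    distinct (s, h) pairs that share that r.
    """
    groups = defaultdict(set)
    for pubkey, r, s, h in signatures:
        groups[(pubkey, r)].add((s, h))   # a set drops exact duplicates
    return {key: sorted(sigs) for key, sigs in groups.items() if len(sigs) > 1}


# Tiny made-up data set: key "A" reuses r=11 for two different messages.
sample = [
    ("A", 11, 501, 9001),
    ("A", 11, 502, 9002),
    ("A", 12, 503, 9003),
    ("B", 11, 504, 9004),   # same r but a different key: not a reuse by "A"
    ("B", 13, 505, 9005),
    ("B", 13, 505, 9005),   # the same signature seen twice: not a reuse
]
reused = find_reused_nonces(sample)
print(reused)
```

Each group that comes out of this can then be fed into the two-signature algebra for repeated nonces.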
It's actually so well known that there are attackers who are constantly watching the whole blockchain for a transaction that uses K for a second time and therefore reveals the private key. And if that private key still has some money in its account, they take it very quickly. So the money suddenly disappears once you have a bad signature on the blockchain. Interestingly, for Ripple, we found one key that was affected by that, and there was actually money in there. It was enough to get two pizzas or something. So this shows that the bad guys are only watching Bitcoin, and maybe Ethereum, not sure, but they don't watch Ripple yet. So that's an interesting insight. But really, repeated nonces are not that new a thing. The really new thing, the contribution we made, was looking for these biased nonces, these nonces that share a common prefix or a common suffix. And we plotted the keys that we found. So we found a bunch of keys, 300 keys. I mean, I guess there were 80 million keys on the blockchain, so it's not many, but it's something. It's there. And there were various clusters we could identify based on the kind of shape the nonces have. So we have some common prefixes, some common suffixes, and then we have these small nonces where the high bits are zero. And we can see in a plot over time that there must have been something appearing and then disappearing. So we wanted to know more about where these bad signatures come from. I mean, we already checked that there's not much money in there, so at least we need to do some research here. But there's not much we can do. We can just look at the blockchain, and we get things like, well, the transaction data. But there's not much there. It's like: account from, account to, amount. That doesn't tell us much. We don't get information about the implementation used. I mean, that would be great: we would be able to tell people, hey, your code is bad, maybe you should fix it before you lose your money.
So we tried to find out what we can. And for a few cases, we could find out some things. Now, I said I shortened the talk, so we start with number two here. Cliffhanger: if you want to know more, I can tell you later, or there's another recording of a similar talk. So this is repeated Ks. That's the less interesting one. But still, for one of these, we looked at the address belonging to the key and we googled for it. Googling is always very great when doing research. And we found it on some web page that was collecting donations for a wallet, a piece of software that keeps your money, darkwallet.is. And it turns out that it's the donation destination address, which still holds, or at that time held, 17 bitcoins, which is now, well, the value changes every time I update the slides. It's annoying. So there's real money in there. And it's a three out of five multi-signature address, which means there exist five secret keys, and if you want to use the money, you have to sign the transaction with three of these keys. And one of these keys did two transactions, somewhere in history, using the same nonce, which I guess then isn't a nonce because it was not used only once, and which allows us to get that secret key. We tried to contact the authors and they said something like, oh, maybe that was me, I was doing some signatures by hand when I was building this. So, lesson to be learned: don't play around with real accounts when you write signatures by hand. You might be screwing up. So I guess now it's a two out of four multi-signature address, so the money is still somewhat safe, just less safe. Because we were looking at small nonces anyway, I actually did a complete brute force over all signatures, checking if the nonce is a very small number, below two to the power of 32. We find, well, we find some values that obviously look like they were written by hand, which I find amusing. There's nothing bad about this. People play around with that. That's fine.
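The brute force over very small nonces can be sketched like this. It is an illustration only: the bound is kept at 2**10 so that it runs instantly, whereas the talk mentions going up to two to the power of 32, and the function names are invented for the example. The idea is to precompute `r = x(k*G) mod N` for every small `k`, then look up the `r` of each signature. A hit reveals `k`, and with it the secret key via `d = (s*k - h) / r mod N`.

```python
import hashlib
import secrets

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def small_nonce_table(bound):
    """Map r = x(k*G) mod N to k for every 1 <= k <= bound."""
    table = {}
    point = None
    for k in range(1, bound + 1):
        point = point_add(point, G)        # point is now k*G
        table[point[0] % N] = k
    return table


def crack_small_nonce(r, s, h, table):
    """Return the secret key if the nonce behind r is in the table, else None."""
    k = table.get(r)
    if k is None:
        return None
    return (s * k - h) * pow(r, -1, N) % N


BOUND = 2**10                              # illustrative; far below 2**32
table = small_nonce_table(BOUND)
r_for_nonce = {k: r for r, k in table.items()}

# A victim signs with a hand-picked, tiny nonce.
d = secrets.randbelow(N - 1) + 1
k = 777
h = int.from_bytes(hashlib.sha256(b"hello blockchain").digest(), "big")
r = r_for_nonce[k]
s = pow(k, -1, N) * (h + r * d) % N

print("secret key recovered:", crack_small_nonce(r, s, h, table) == d)
```

One table serves every signature on the chain, so the cost is dominated by building it once.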
Play around with stuff. Just don't play around with other people's money, or with too much. So these are clearly hand-generated. A bit more serious were examples where we know there's a well-known vulnerability that caused signatures to go bad. And they all revolve around the problem of where we get a random number that's actually random. And that's unfortunately quite hard. So the first example that I would like to mention briefly: there was a time when on Android, well, there's an API to get a random number, and it was buggy. So if your process, the app running on the phone, forks, so it creates two processes, and then both get a random number, they will get the same one. And that means if both are now doing crypto signatures, then you use the same K twice. And I think I mentioned, if you put them on a blockchain, somebody will notice that and will take your money. So that's a really bad problem, and people lost money through that. The other one is also really bad, but it's more amusing in a way. Blockchain.info is one of these Bitcoin-related web pages, and they have a wallet where you can store bitcoins, or at least you can make transactions there somehow. And as I mentioned, you want to use randomness when creating a transaction. So how does this wallet get randomness? It uses a web service, random.org. It's a website where you can get a random number. It's maybe not too bad of an idea. They have good sources of randomness, so maybe that works. And, also very reasonably, because these random numbers are very sensitive, at some point random.org decided: you can only use us over SSL, so the connection has to be encrypted. Very good of them to do that. So they introduced this, and every non-HTTPS request was redirected to HTTPS.
So what happened was: when the Blockchain.info wallet code was trying to get randomness, it would, for some reason, still use the old URL, and then get a redirection on the HTTP level, telling it to use something else, but it wasn't following the redirection. So it used whatever it got back as its random number, but I guess everybody got the same redirection response, so they all got the same number, and a bunch of people lost their bitcoins. Okay, so this was repeated nonces. This is well known, but it's still nice to recapitulate what went wrong there. So let's look at some of the new findings we got by looking for nonces that are still used only once, but are not fully random, and what we maybe can guess there. So for this cluster here, somebody gave us a tip what it could be, Gregory Maxwell, so very thankfully we actually have a good story of what this very likely is. In around 2014, in the code of a Bitcoin library called Bitcore, a JavaScript library for all kinds of things you need to do when you want to interact with Bitcoin, created by the company BitPay, they refactored things a little bit and started using a different library for dealing with large numbers. And in this commit, they use a secure random API to get a random number, and they pass the parameter 8 there, asking for 8 bytes of randomness. So 8 bytes, how many bits is that? 64. And how many bits do we want? Right, so that's not enough bytes, not enough bits. Somebody just put in the wrong number. That can happen, and a bit unfortunately it went into a release, but very shortly after, somebody noticed this and fixed it. This is like a month later. So the problem that we see here is not really that there was a bug and it had to be fixed. That happens all the time. The really shocking, or disturbing, or worrying, let's stay with worrying, observation here is that the fix came a month later, and still for a whole year we see bad signatures because of this.
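The effect of asking for 8 bytes instead of 32 can be shown in a few lines. The original code was JavaScript, so this Python sketch only illustrates the size mismatch and does not reproduce the Bitcore commit. The helper name `random_nonce` is hypothetical.

```python
import secrets


def random_nonce(num_bytes):
    """Draw a nonce from num_bytes of cryptographically secure randomness."""
    return int.from_bytes(secrets.token_bytes(num_bytes), "big")


weak = [random_nonce(8) for _ in range(1000)]     # what the buggy call asked for
strong = [random_nonce(32) for _ in range(1000)]  # what a 256-bit nonce needs

print("largest 8-byte nonce uses", max(k.bit_length() for k in weak), "bits")
print("largest 32-byte nonce uses", max(k.bit_length() for k in strong), "bits")
```

Each 8-byte nonce is perfectly random and never repeats in practice, yet its top 192 bits are always zero, which is exactly the "small K" shape that the lattice attack exploits.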
Because there was a release in between, actually a few, and it seems to be a property of the JavaScript ecosystem that you just start using a dependency, you pin the dependency to a specific version, and unless somebody tells you, you probably forget to upgrade it when a new version comes out. And if that new version fixes an important security problem, then, well, you have that problem for as long as you don't update. And that points to a very fundamental problem we currently have in our software and build ecosystem, and I'm not going to offer any solutions here, but it's hopefully something that will be better at some point in the future, because this is embarrassing, or at least worrying. But it was really nice to be able to find these signatures in this raw amount of data on the blockchain, and then trace them back to what actually happened there. So, two more. This one is not blockchain, this is SSH connections, but it's still an interesting anecdote for what we found here. So here we found signatures where the random nonce had a fixed suffix. So the low eight or low four bytes were always the same. So what do we do when we see a hex number and we want to know what it means? We Google for it. So I took this common suffix, Googled for it, and I actually found stuff. There were random dumps of binaries, mostly ARM binaries for some reason, that had this number in them. So I found matches, but it wasn't really clear what was happening. But I remembered that ARM does this thing with little-endian or big-endian. So they write the numbers in the wrong order. Or the right order, depending on who you ask. So I Googled again, but now I reversed the hex number. So I Googled for c67178f2. And then I found this number in the specification for SHA, the hash function. Something you use a lot when you do crypto and other stuff.
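The byte-order step of that search is easy to reproduce. The sketch below assumes that the 4-byte suffix seen in the signatures was `f27871c6`, which is inferred from the reversed value `c67178f2` given in the talk. `0xc67178f2` is the last of the 64 round constants of SHA-256. The function name is made up for the example.

```python
def reverse_bytes(hex_string):
    """Reverse the byte order of a hex string, e.g. little- to big-endian."""
    return bytes.fromhex(hex_string)[::-1].hex()


# Last of the 64 round constants in the SHA-256 specification.
SHA256_LAST_ROUND_CONSTANT = 0xC67178F2

# ASSUMPTION: the suffix as it appeared in the nonces, inferred from the talk.
observed_suffix = "f27871c6"

reversed_suffix = reverse_bytes(observed_suffix)
print(reversed_suffix)  # c67178f2
print(int(reversed_suffix, 16) == SHA256_LAST_ROUND_CONSTANT)
print(int.from_bytes(bytes.fromhex(observed_suffix), "little")
      == SHA256_LAST_ROUND_CONSTANT)
```

Reading the observed bytes as a little-endian 32-bit integer gives the same constant, which fits the picture of a word-oriented implementation writing its state to memory on a little-endian machine.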
A hash function takes some big amount of data and gives you a small number that changes every time you change the original data. And it's used pervasively in crypto and other places. And I don't know exactly how SHA works, but it seems to be multiple rounds of basically mixing up your data. And then there are some constants that play a role in this computation. The number that we were looking for is the very last constant here. This is the same number if you reverse it. So this was a nice find. My theory is that somebody was trying to use SHA to get their random number, which is a good idea. Using a cryptographic hash function really helps you, if you do it right, to get a uniform distribution on your random numbers. But they somehow screwed up. Maybe they thought: we just need it for randomness, we don't need to do all the rounds of the SHA algorithm. And now they get a random number that's random everywhere but in the last four bytes, and yes, you lose all your bitcoins. Well, okay, this was SSH, so no bitcoins to be lost. But this happens when you roll your own crypto, when you implement the crypto stuff on your own, which is usually a bad idea. And with that I get to the last one. I am really fast. That's good, maybe I can do the backup slides after all. This is the one I was most looking forward to. And it'll all fall apart, but we'll see. So, again, Bitcoin. I had a bunch of 128-bit suffixes. So the second half of the nonce is always the same, given the same secret key. And I was staring at these numbers and then I found, hey, really, the second half of the nonce is the same as the first half of the secret key. How does that happen? Any ideas? What is your first thought when you see that? Buffer overflow, right. Any other first thoughts? Right, that was my only first thought as well.
And I'm a big fan of high-level languages and Haskell and memory safety, so I was immediately jumping at the chance: this is obviously somebody using a language like C and messing up their memory, because they probably wrote something like this. This is just me making things up. But maybe, you know, the buffer is too small and they're copying the secret key and they're overwriting half of the randomness. What a great story! And it gets even better, because if you look at these signatures, some of the secret keys are actually quite strange. So, 11 is not a very secret number. I think all of you know 11. So, it turns out that all the secret keys that are affected by this are secret keys that are compromised anyway. Small numbers, or keys that somehow leaked on the internet. There's something called a brain wallet that allows you to take a passphrase you can remember and turn that into a secret key. But of course, if the passphrase is something not very secret like "dog", then somebody else can also get the secret key. So, these were all secret keys that are kind of insecure. And all these transactions were emptying out these accounts. So, it looks like there's a bad guy who's watching the blockchain for any money in an account that belongs to a secret key that is compromised anyway, and then they use their code to clear it out. And, I mean, I mentioned before that we know that this is happening, and I just found it nice to be able to say, haha, the attacker has a bug in their code. This was my story until 12 hours ago, or 10, when I was looking at this slide again. And then I was wondering: if this is the first half of the secret key, did I ever check whether the other half may be somehow related to the message hash? So, to maybe recap, when you sign a message, you take the message, you hash it, that gives you a 256-bit number, and then you do some math with the message hash, the secret key and the nonce. And so I thought, did I ever look at that?
So I looked at the raw data, and indeed, in all these signatures the first half is taken from the message hash. Now, that still raises the question of why that is the case, but, sorry, it was so nice with the buffer overflow. But it looks like there's no buffer overflow; the attacker probably has a reason to do that. My theory is they want to be really fast, because there might be multiple people watching for these accounts, and they all race to get a transaction in to take that money out. If they do something complicated like getting real randomness, maybe they're a bit slower with their transaction, and therefore they just did something very simple. This is one theory. Another theory, what they might have been doing, is the following. So I'll put this up. This is what I found out this morning. But actually, interestingly, we now know more about the structure of this bad implementation, and we can look more thoroughly for these; maybe we find out something more. From a research perspective this is actually nice. For giving a talk at GPN at 8:30 it's a bit disappointing, but hey, I can still tell the story even though it's wrong, right? Anyway, one big takeaway from this is: well, it looks like getting random numbers is hard, but you don't even have to. We said that K needs to be different for each signature you make and it has to be uniform in the whole range of numbers. There's another way of getting a number like that, and that is called deterministic hashing. You take the input, that is, the hash of the message you need to sign, you take your secret key, which nobody else should know anyway, and then you put them together and you hash that, and you use that as the nonce.
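The idea of deriving the nonce from the secret key and the message hash can be sketched as follows. This is a simplified illustration and is not RFC 6979, the standard construction, which uses an HMAC-based generator with more steps. The function name and the counter-based retry are assumptions made for this example.

```python
import hashlib
import hmac

# Order of the secp256k1 group; a valid nonce k satisfies 1 <= k < N.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def deterministic_nonce(secret_key, message_hash):
    """Derive a nonce from the secret key and the 32-byte message hash.

    Simplified sketch, NOT RFC 6979: HMAC-SHA256 keyed with the secret key
    over the message hash plus a counter, retrying until 1 <= k < N.
    """
    key_bytes = secret_key.to_bytes(32, "big")
    counter = 0
    while True:
        data = message_hash + counter.to_bytes(4, "big")
        digest = hmac.new(key_bytes, data, hashlib.sha256).digest()
        k = int.from_bytes(digest, "big")
        if 1 <= k < N:
            return k
        counter += 1


d = 0x1E99423A4ED27608A15A2616A2B0E9E52CED330AC530EDCC32C8FFC6A526AEDD  # demo key
h1 = hashlib.sha256(b"first transaction").digest()
h2 = hashlib.sha256(b"second transaction").digest()

k1 = deterministic_nonce(d, h1)
k2 = deterministic_nonce(d, h2)
print("same message gives the same nonce:", deterministic_nonce(d, h1) == k1)
print("different messages give different nonces:", k1 != k2)
```

Signing the same message twice yields the same nonce and therefore the identical signature, which leaks nothing new. Different messages yield unrelated nonces without any call to a random number generator.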
This has everything you want from a nonce, and there's no more randomness involved, so there's much less to screw up. Because this is much easier, Bitcoin and so on have started using this method, and newer crypto schemes only use this method. And I was looking at this slide when I noticed: maybe the attacker is just doing something like this but screwing it up, forgetting to hash it or something. So we don't know, and we probably will never know, but it was still fun trying to imagine for a while that I can be snobbish about bad programming languages. Right, this brings me more or less to the end. I guess the main takeaway is: don't build your own crypto. There's just too much stuff that can go wrong when you write your own cryptographic implementations, so use existing libraries, unless you really, really know what you're doing, or you're doing something where it doesn't matter too much if it breaks. I said I shortened the talk, and I shortened it even more while going through it very quickly. If you like conspiracy theories: there's something strange about the curve, the parameters of the signature scheme used in Bitcoin and others, which means that there's a certain number that makes your signatures go very small, and we don't know why this is the case. That's interesting. Somebody might have messed with it when it was defined. And there's more stuff in the paper. I don't need this printout anymore, I'll leave it here, so if anybody wants to pick it up and have a look, you can. And yes, we still have some time for questions, that's great. Thank you for coming with me on this very fast ride through these slides.
Thank you so much for the talk. If you do have any questions, I can come to you with the microphone so everyone can hear your question. I will come to you. You can pass the mic. Please don't talk.
My question is how the diagrams that you showed work. So of course there's the axis of time, then there are the different kinds of compromising options, and what is the size, actually?
It's just the number of
signatures on that day or in that period of time. So there were just very many here of this particular kind, for example.
Okay, thank you. Next question over here.
You told us you found numbers that were used twice. Maybe you saw in the blockchain: how long did it take until someone stole the Bitcoin account's money?
I didn't check for that directly, but I believe it's immediate, I guess in the next block. Because whoever is watching, once there are two people watching, they're just going to race for it, so they need to be really fast. So I would expect that.
Thinking about how Bitcoin works, you submit your transaction and it enters something called the memory pool. So even before it's actually on the blockchain, it's kind of distributed around, so that's already early enough for people to compromise it. So they might actually be able to take the money you're just about to move, by racing the transaction that you just did that is revealing the key.
Yeah, so I think it's effectively instantaneous. It's maybe gone before it's in the blockchain. Yeah, so the thing is, you're always revealing something when doing a transaction. If the transaction is taking all the money out of your account, it doesn't matter if you're revealing the key, because now the account is empty. But if you're not quick, they can put in a transaction before yours and even take that money.
We have time for another question. Any more?
So if you believe the conspiracy theories, I guess, about ECDSA, what else should we use?
I wouldn't say I believe them, I just say there are some. I don't think of myself as a cryptographer who can give advice. I think it's good enough for now, the ECDSA stuff.
Okay, well then, again, thank you all so much for coming. If you have any more questions for Joachim, maybe you can meet outside on the beautiful meadow or inside in the hack center. When you leave the room, take your empty water bottles with you so it's easy for the trolls to clean up the room for the next talk. And thank you again for speaking here tonight, and have a nice evening.