Hello, this talk is about substitution attacks against message authentication, and it is joint work between myself and Bertram Poettering. In this talk I'm going to first give a high-level motivation as to where the notion of algorithm substitution attacks came from, then I will talk about prior work on algorithm substitution attacks, looking at encryption as an example, and then I will move on to message authentication and our work. Okay, so the motivation really comes from the Edward Snowden revelations, which showed that there is widespread mass surveillance of communications, carried out by the US and UK spy agencies, together with maybe a few others. And the good news from the Snowden revelations, from the point of view of cryptographers, is that hardness assumptions haven't been broken. In particular, there is no evidence that there is some secret supercomputer that can break RSA or break any of the other hardness assumptions that we assume hold. But the bad news is that cryptography is being circumvented on a massive scale. That might be through malware, through getting data and keys from corporations, through planting backdoors in standards, or through mass collection of metadata. As an example, I'm going to talk about the Dual EC DRBG, the dual elliptic curve deterministic random bit generator, which is supposed to provide a source of random, or random-looking, numbers. But there are built-in weaknesses in the choice of the elliptic curve points that parameterize the algorithm: it is possible to choose the points in such a way as to establish a backdoor, which was observed by Shumow and Ferguson in 2007 at a rump session of Crypto. And then later the Edward Snowden revelations suggested that the NSA had influenced that design process. So for this idea that cryptography could be subverted, there is some evidence that it is happening, or has happened, in the real world.
And then before I go on to talk about prior work on algorithm substitution attacks and give some definitions, I just want to talk about the wider conversation, which came out of a seminar hosted by Royal Holloway a couple of weeks ago, and which really made me think slightly differently about, or question, some of my assumptions around mass surveillance. So cryptographers, as we know, are very interested in mass surveillance. And, this is my personal opinion, but thinking about mass surveillance feels like about as close to doing something political as you can get as a cryptographer. The Royal Holloway seminar had the authors of the paper "Crypto and Empire: The Contradictions of Counter-Surveillance Advocacy". As I understand their argument, it is that counter-surveillance needs to be repoliticized. Cryptographers thinking about very narrow technical questions means that we don't have to take a political position; we don't have to think about who is being surveilled, or for what reasons, or what political or societal forces are going into that. You can just think about narrow technical questions. As an example: after 9/11 there was mass targeting of Muslims, and directly engaging with why Muslims as a group could be so easily targeted, or with some of the prejudices that we may have as a society against the minorities that end up being persecuted or surveilled, is, I think, worth thinking about. I don't really have time to go into this in any greater depth, but I thought it was very interesting and I'd recommend you have a look at the paper. So, continuing with algorithm substitution attacks, some definitions, and encryption as an example. Okay, so an algorithm substitution attack replaces a cryptographic scheme with a subverted version.
And this subverted version is going to reveal some information to an adversary engaged in mass surveillance, and it is going to do that whilst remaining undetected by its users. So here in this diagram we've got the adversary, we have Alice and we have Bob, and Alice is sending some messages which are first encrypted to obtain ciphertexts, which are sent to Bob. Alice and Bob both share a secret key, so Alice can encrypt using the secret key and Bob can decrypt using the same secret key. The algorithm substitution attack would be to replace this encryption algorithm by an algorithm that behaves essentially the same way as the encryption algorithm would, but deviates in some way that the adversary is aware of, so that the adversary can look at these subverted ciphertexts and learn something. The foundations of this idea were laid by Young and Yung in a series of works they called kleptography. And post-Snowden, the term "algorithm substitution attack" was introduced by Bellare, Paterson and Rogaway in 2014. They showed how an algorithm substitution attack could be launched against encryption, and they gave some definitions. Their attack and definitions were improved on by Bellare, Jaeger and Kane; in particular, their attack was stateless, where the Bellare-Paterson-Rogaway attack had been stateful. And then later the definitions that were used were critiqued by Degabriele, Farshim and Poettering. I think their key insight was that the prior work, Bellare, Paterson and Rogaway, had insisted that the subverted algorithm is perfectly correct. But that doesn't necessarily need to hold: if the algorithm deviates very rarely, then it is very difficult to detect. So in Degabriele, Farshim and Poettering's work there is this idea of a trigger message. An encryption algorithm could just output the key, say, once every ten million ciphertexts or so, and that is effectively an algorithm substitution attack that you wouldn't be able to detect.
Okay, so algorithm substitution attacks work by implanting a subliminal channel into ciphertexts. Continuing with this example of encrypted messages, this is what the ciphertexts could look like; it's obviously a toy example. If you imagine that the first bit of each ciphertext leaks a bit of the secret key that Alice and Bob share, then by observing the ciphertexts the adversary can learn what the secret key is. And to embed that subliminal channel, previous approaches used the technique of rejection sampling. So for encryption, we have a key k, randomness r and a message m, and the encryption algorithm outputs a ciphertext c. And for any randomness r, you get a valid ciphertext, which means that decrypting that ciphertext will give you the message that you started with. And that works for any randomness. So what you can do is just resample the randomness until the ciphertext is in the format that you want: in our example from the previous slide, until the first bit is the intended bit of the key that you're leaking. Okay, so generically this is a very powerful technique, and you could leak the secret key, or you could send some other message in this subliminal channel. But if you send the secret key, that's kind of the most useful thing to the adversary, because the adversary can now decrypt all of these ciphertexts. So we're just going to think about exfiltrating the key. Some other work: algorithm substitution attacks seem to be a fairly active area of research. Just a couple of recent-ish papers that I found: there are algorithm substitution attacks against signature schemes, against cryptocurrencies, against key encapsulation mechanisms and data encapsulation mechanisms, against lattice-based crypto, against tweakable block ciphers.
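To make the rejection-sampling idea concrete, here is a minimal Python sketch. The toy encryption scheme (a nonce plus a hash-derived keystream) is my own illustrative choice, not a scheme from the talk or the literature; the point is only that resampling the randomness lets the subverted encryptor steer the first ciphertext bit while every ciphertext still decrypts correctly.

```python
import hashlib
import os

def encrypt(key: bytes, msg: bytes) -> bytes:
    # Toy randomized encryption: ciphertext = nonce || (msg XOR keystream).
    # Any choice of nonce yields a valid, decryptable ciphertext.
    nonce = os.urandom(16)
    stream = hashlib.sha256(key + nonce).digest()[:len(msg)]
    return nonce + bytes(a ^ b for a, b in zip(msg, stream))

def decrypt(key: bytes, c: bytes) -> bytes:
    nonce, body = c[:16], c[16:]
    stream = hashlib.sha256(key + nonce).digest()[:len(body)]
    return bytes(a ^ b for a, b in zip(body, stream))

def subverted_encrypt(key: bytes, msg: bytes, leak_bit: int) -> bytes:
    # Rejection sampling: resample the randomness until the first bit of
    # the ciphertext equals the key bit we want to leak (~2 tries on average).
    while True:
        c = encrypt(key, msg)
        if c[0] >> 7 == leak_bit:
            return c

key = os.urandom(16)
key_bits = [(key[0] >> (7 - i)) & 1 for i in range(8)]  # leak first 8 bits

# Subverted ciphertexts still decrypt correctly, so users notice nothing...
cts = [subverted_encrypt(key, b"attack at dawn", b) for b in key_bits]
assert all(decrypt(key, c) == b"attack at dawn" for c in cts)
# ...but the eavesdropper reads the key bits straight off the ciphertexts.
assert [c[0] >> 7 for c in cts] == key_bits
```

Since a fresh nonce is drawn each attempt, the expected number of resamples per ciphertext is two, which is why this channel is so hard to spot from the outside.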
And there's also work looking at protecting against subversion attacks. For example, you could eliminate the randomness and use nonce-based encryption. Or you could have keys that are so big that it's infeasible to exfiltrate them. Or you could sanitize the randomness in some way; there's a technique called reverse firewalls, and there's also something called self-guarding protocols. I'm not going to go into these in any great detail, but just to say that there is other interesting work out there. And there's also the idea of a watchdog algorithm. So now to move on to our work, to talk about message authentication and our algorithm substitution attacks. First of all, authentication deals with a situation where Alice is communicating with Bob, and Bob wants to know that the messages Alice sends really did come from Alice: they haven't been tampered with, and they haven't come from someone else pretending to be Alice. And to do that, we use a tagging algorithm. Alice can take the message and, with the secret key that she shares with Bob, derive a tag, and send that tag with each message. Now Bob, receiving the message with the tag, can check that the tag really belongs to that message by using the shared secret key. And what you want in terms of security is that it is very difficult, or impossible, for an adversary to inject messages or to modify messages. Basically, it should be very difficult to create a tag that is valid for a message. Okay, so syntax-wise, we have a tagging algorithm Tag that takes the key and a message and outputs a tag t. And then the verification algorithm takes the symmetric key k, a message and a tag, and outputs a verdict v. So v will be one if the tag is accepted as correct, and v will be zero if it is rejected: one for correct, zero for incorrect. And then, as a correctness property of a message authentication code scheme, we want that all honestly generated tags should be accepted.
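As a concrete instance of this Tag/Verify syntax, here is a short sketch with the MAC instantiated as HMAC-SHA256. The HMAC choice is mine, for illustration only; the talk's definitions are for an arbitrary MAC scheme.

```python
import hashlib
import hmac

def tag(key: bytes, msg: bytes) -> bytes:
    # t <- Tag(k, m), instantiated here with HMAC-SHA256
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify(key: bytes, msg: bytes, t: bytes) -> int:
    # v <- Verify(k, m, t): 1 = accept, 0 = reject
    return 1 if hmac.compare_digest(tag(key, msg), t) else 0

k = b"key shared by Alice and Bob"
t = tag(k, b"hello Bob")
assert verify(k, b"hello Bob", t) == 1   # correctness: honest tags verify
assert verify(k, b"hello Eve", t) == 0   # a tampered message is rejected
```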
So if Alice generates a tag for a message, then that tag should verify. Okay, so in contrast to prior work, we look at subverting the receiver rather than the sender. When you have Alice and Bob sharing a key, you could target Alice to leak that key, or you could target Bob, and there's no reason why you shouldn't target Bob. Our attacks leak the secret key. Once the secret key is known, you can forge any tag, which you could use to enable attacks against confidentiality in the Encrypt-and-MAC setting. You could get users to accept compromised authenticated software updates, and so force malware onto users. Or you could inject malicious packets into secured communication streams, for example to de-anonymize users. As a more general attack, rather than trying to exfiltrate the secret key, you could exfiltrate arbitrary information, for example the key for a different application, or the internal state of a random number generator. Okay, so syntax-wise, we denote subversion with a subscript. You have a subverted verify algorithm; it needs to have the same syntax, so it still takes a key, a message and a tag, and outputs a zero or a one. And in the diagrams I've added some horns to Bob, to say that Bob is now subverted: the verification algorithm is subverted, and so, potentially, are its outputs. Okay, so earlier I gave a kind of intuitive definition of what unforgeability means. You can quantify that by means of a game where the adversary gets access to a tagging oracle and a verification oracle. And then you want a similar notion for subversion, where this time the adversary interacts with a subverted tagging and a subverted verification algorithm.
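Roughly, the unforgeability game looks like the following sketch: the adversary is given the two oracles and wins if the verification oracle ever accepts a pair it did not obtain from the tagging oracle. The HMAC instantiation and the code shape are my own for concreteness; the precise game-based definitions are in the paper.

```python
import hashlib
import hmac
import os

def uf_game(adversary) -> bool:
    # Unforgeability game sketch: the adversary gets a tagging oracle and a
    # verification oracle, and wins if the verification oracle ever accepts
    # a (message, tag) pair that was not produced by the tagging oracle.
    key = os.urandom(16)
    tagged, won = set(), [False]

    def tag_oracle(m: bytes) -> bytes:
        t = hmac.new(key, m, hashlib.sha256).digest()
        tagged.add((m, t))
        return t

    def verify_oracle(m: bytes, t: bytes) -> int:
        ok = hmac.compare_digest(hmac.new(key, m, hashlib.sha256).digest(), t)
        if ok and (m, t) not in tagged:
            won[0] = True  # a fresh valid pair is a forgery
        return 1 if ok else 0

    adversary(tag_oracle, verify_oracle)
    return won[0]

# Replaying an honestly obtained tag is not a forgery:
assert uf_game(lambda T, V: V(b"msg", T(b"msg"))) is False
# Blindly guessing a tag (almost surely) fails too:
assert uf_game(lambda T, V: V(b"msg", b"\x00" * 32)) is False
```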
And it's maybe worth saying that if you, for example, just output a tag that was the same as the message, then it would be very easy for an adversary to win some kind of unforgeability game, but it would also be very obvious to someone interacting with the algorithm that something has gone wrong. So you have some basic correctness requirements, otherwise you have trivial detectability. And that's another notion that you can make more formal: for detectability, you have a detector that interacts with the subverted tagging or verification oracles, and its task is to work out whether it is interacting with the real or the subverted algorithm. So, intuitively, a good algorithm substitution attack, from the point of view of the attacker, would be very difficult to detect but very effective at leaking the key. And that's another notion, which we have formalized as key recovery. The important thing to note here is that the game stops once the adversary holds a key equal to the user's key. I'm not going to go into full detail, but we have two different versions: a passive version with just an eavesdropping adversary, and an active version where the adversary can craft its own tags. Okay, so the passive version. What we do is subvert the verification algorithm so that it rejects a sparse subset of valid message-tag pairs. It needs to be a sparse subset, because otherwise you have something that is easily detectable. And it works as follows: the subverted verification algorithm takes the tag and computes a hash of it to obtain an index into the key. If the key bit at that index equals zero, it outputs v = 0. So this is a correct tag, and the output should be v = 1, but the subverted algorithm outputs v = 0. And the adversary sees this message and tag, which it knows should be accepted, but which has been rejected.
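The passive attack can be simulated in a few lines of Python. Here the MAC is instantiated as HMAC-SHA256 and the key index is derived from the first byte of a hash of the tag; both are illustrative choices of mine, not the paper's exact construction, and the rate-limiting parameter from the talk is omitted for simplicity.

```python
import hashlib
import hmac
import os

KEYLEN = 16            # 16-byte key => 128 key bits
NBITS = 8 * KEYLEN

def key_bit(key: bytes, i: int) -> int:
    return (key[i // 8] >> (7 - i % 8)) & 1

def tag_index(t: bytes) -> int:
    # Index into the key, derived by hashing the tag (illustrative choice).
    return hashlib.sha256(t).digest()[0] % NBITS

def subverted_verify(key: bytes, msg: bytes, t: bytes) -> int:
    # Honest check first: invalid tags are rejected as usual.
    if not hmac.compare_digest(hmac.new(key, msg, hashlib.sha256).digest(), t):
        return 0
    # Bogusly reject the sparse subset of VALID pairs whose tag hashes to an
    # index where the key bit is 0 -- the verdict IS the key bit at that index.
    return key_bit(key, tag_index(t))

# The eavesdropper watches honestly tagged traffic plus the accept/reject
# verdicts, and recovers one key bit per observed pair:
key = os.urandom(KEYLEN)
recovered = {}
for n in range(5000):
    msg = f"message {n}".encode()
    t = hmac.new(key, msg, hashlib.sha256).digest()   # Alice's honest tag
    recovered[tag_index(t)] = subverted_verify(key, msg, t)

assert len(recovered) == NBITS                        # all 128 indices seen
assert all(v == key_bit(key, i) for i, v in recovered.items())
```

Only the zero key bits, about half of them, ever cause a bogus rejection, which matches the rate of one bogus rejection for every second key bit mentioned below.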
So the adversary has now learned one key bit. And we assume that the adversary can learn whether a tag is accepted or rejected: maybe that happens at a higher application level, maybe through a retransmission or something, but there will be some way for the adversary to learn whether that tag has been accepted or rejected. Okay, so that's the attack, and it works fairly well. You have one bogus rejection for every second key bit, on average, to leak the whole key, and we can limit that bogus rejection rate even further with a parameter, to make it less detectable. What's happening here is that correctness is affected: sometimes correct message-tag pairs are rejected. And here's a nice graph showing that the probability of the key being exfiltrated goes up nicely, so eventually, once you have observed enough message-tag pairs, you are able to completely recreate the key. Okay, and there's a second attack, this time an active attack. So this time we think about an adversary that is able to tweak these tags in some way, or perhaps inject its own, which is a slightly different setting. Traditionally in the mass surveillance setting we think about an adversary that is just eavesdropping, but I think there's no real good reason why you can't have an adversary that is active to some extent. Okay, so this time we take the verification algorithm and subvert it so that it accepts a fraction of invalid tags; there's a nice complement there with the passive attack. And to do that, the adversary takes a message-tag pair that it sees and intercepts, and hashes the tag to obtain an index. If the key bit at that index equals zero, then the adversary replaces the tag with an encrypted version of that tag, which I'm denoting P(t). So now the verification algorithm receives the message together with P(t). First, it will check whether the tag is authentic, so an authentic tag will still get accepted, as it should.
If it finds that this tag is not authentic, then it will check whether it has been encrypted in the special way by the adversary: it will decrypt it, computing P inverse of what it received, to get back what should be the original tag, and then check whether that tag corresponds to the message. If so, then it will set v = 1, when really v should be 0. And if the tag it received is not of this form either, then it will output v = 0, as it should. So there's only a special set of message-tag pairs that will be bogusly accepted. Okay, so when the adversary observes that a tag it knows should be rejected has actually been accepted, it has learned one bit of information about the key. So this time, again, we've got keys being exfiltrated with one bogus message-tag pair for every second key bit, on average. And this time we're affecting authenticity, because some inauthentic tags will be accepted. And there's a similarly nice graph. Okay, so on to the conclusion. Our work, in contrast to prior work, attacks the receiver rather than the sender. In the setting of symmetric cryptography, where both sender and receiver share a key, that's quite a reasonable thing to do: either party holds the same key, so you could target either. We present two attacks that show that targeting the receiver can lead to successful algorithm substitution attacks leaking the secret key, and we do that in two variants, passive and active, which are suited to slightly different scenarios. I think the high-level takeaway is that proofs of security don't necessarily mean so much when the adversary doesn't play the games that we anticipate. In other words, security is defined with respect to some model, but the adversary's hobby is to try and get you to go outside of that model. So I'm going to finish on everyone's favourite cryptography cartoon. Thank you very much.