Welcome to the last session of FSE. It will be a session about new designs, and there will be three talks. The first talk is "Adiantum: Length-Preserving Encryption for Entry-Level Processors" by Paul Crowley and Eric Biggers, and Paul will give the talk.

Thank you very much for the introduction. I'm on the Android platform security team, and I want to talk about a problem that my colleague Eric and I faced, and what we did about it. This is going to be one of the less technical talks. It's the last day; we're all tired. But also, Adiantum is not a deep technical advance. It's a combination of well-understood techniques, and what I want to talk about is the practical problem we faced and why we chose Adiantum as a way of solving it.

Any of you who have Android devices in this room, undoubtedly those devices will be encrypted. Those devices will have something like the ARM cryptographic extensions, or perhaps even an inline crypto engine, that makes AES super fast, and so they will be encrypted using AES. But a lot of devices not in this room, for example devices being used in developing countries, or IoT devices, have processors like the Cortex-A7, which lacks the ARM cryptographic extensions for AES. On those devices, AES just does not have acceptable performance. This affects storage encryption, which is my area, but it affects all sorts of things, for example TLS connections. In TLS, the solution to this was RFC 7539, which uses Dan Bernstein's primitives, ChaCha20 and Poly1305, to build a fairly straightforward AEAD mode that you can use for these kinds of applications. I'm afraid I don't have benchmarks for it here, but in benchmarks similar to the ones I'll show later for Adiantum, it is way faster than AES on these devices, and that gives users acceptable performance. So for network connections, RFC 7539 solves this problem.
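As a sketch of why ChaCha suits such cores: the whole cipher is built from the quarter round below, which needs only 32-bit additions, XORs, and rotations, operations that are cheap and constant-time even on a Cortex-A7 with no crypto extensions. This is just the RFC 7539 quarter round transcribed into Python for illustration; the inverse is included to show that every step, and hence the round, is a bijection.

```python
def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xffffffff

def quarter_round(a, b, c, d):
    # The ChaCha quarter round (RFC 7539, section 2.1): four add/XOR/rotate steps.
    a = (a + b) & 0xffffffff; d = rotl32(d ^ a, 16)
    c = (c + d) & 0xffffffff; b = rotl32(b ^ c, 12)
    a = (a + b) & 0xffffffff; d = rotl32(d ^ a, 8)
    c = (c + d) & 0xffffffff; b = rotl32(b ^ c, 7)
    return a, b, c, d

def quarter_round_inv(a, b, c, d):
    # Undo the four steps in reverse order: each add/XOR/rotate is invertible,
    # so the quarter round is a permutation of the four-word state.
    b = rotl32(b, 32 - 7) ^ c;  c = (c - d) & 0xffffffff
    d = rotl32(d, 32 - 8) ^ a;  a = (a - b) & 0xffffffff
    b = rotl32(b, 32 - 12) ^ c; c = (c - d) & 0xffffffff
    d = rotl32(d, 32 - 16) ^ a; a = (a - b) & 0xffffffff
    return a, b, c, d
```

ChaCha applies this quarter round to the columns and diagonals of a 4x4 word state; no table lookups or carries across words, which is why it runs well where AES (without hardware support) does not.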
The trouble is, because it's an AEAD mode, the ciphertext has to be larger than the plaintext: there has to be room for a MAC, and there has to be a nonce that is not reused. For storage encryption, we have to have a ciphertext which is the same size as the plaintext, and I'll get on to why that is.

The most familiar example of storage encryption is full disk encryption. For every physical sector on the device, we have a virtual sector. A write to the virtual sector, which has to be exactly four kilobytes because that's what the software expects, is encrypted and sent to the hardware as exactly four kilobytes, because that's what the hardware supports. If we had flash storage that gave us a little bit more room, a few extra bytes in our sectors, then we could use an AEAD mode and store those extra bytes, and that would change the picture a little. I've been saying this since about 2000; I've been having this conversation: if the storage manufacturers could just give us a bit of extra room here, we could really do something different. But even if they were to turn around tomorrow and say, "OK, we're going to do it", it would take a long time for that hardware to become available, and to me this problem is urgent. There are devices out there that are not encrypted because we don't have a solution to this.

Now, on a lot of more recent Android devices we don't use full disk encryption; we use file-based encryption. At first I hoped that this would give us another way to address the problem: file-based encryption could give us a bit more flexibility in the format, and maybe we could make room for a little bit of extra storage for a nonce and a MAC. Sadly, it doesn't really work that way. First of all, it would be a major engineering effort by itself to write a file system that could make room for this; it would be a pretty deep change.
But even given those deep changes, there would be certain circumstances where you would see a major hit. Databases, for example, expect that if they have a multi-gigabyte database file, they can go to some four-kilobyte-aligned sector, make a change, and see a single write. If we have to update a nonce and a MAC for that sector, making it a little bit larger, that means it's got to be at least two writes, and when they read from it, at least two reads. That halves the speed. It breaks atomicity, making it a major challenge to update this in a way that is atomic. But worse than that, it's really bad for flash devices. The lifetime of a flash device is measured in writes; if under these circumstances we're doubling the number of writes, we halve the lifetime of the flash device, which would be a serious inconvenience for users. So even in the file-based encryption scenario, we can't get away from the requirement that the ciphertext be the same size as the plaintext.

In order to call itself an Android device, a device has to pass what we call the Compatibility Definition Document. For many years now, the CDD has required that Android devices be encrypted. But there's a carve-out, so that people in developing-world countries can have a full-fledged smartphone operating system on the hardware that people there can afford. That carve-out says that if your device encrypts with AES at below 50 megabytes a second, you may ship it unencrypted. Now, that's sad in that lots of people have unencrypted devices, but it also provides some pretty weird incentives. If you can just slow down AES on your device to 49 megabytes a second, bingo, you can ship that device unencrypted, and now your device is faster than the device that runs AES at 51 megabytes a second and has to encrypt. So I was pretty unsatisfied with this, and I wanted to fix it. Here's what we did.
So, given that the ciphertext has to be the same size as the plaintext, we can't achieve the formal properties of an AEAD mode. It has to be deterministic, because it's a bijection, and we rewrite new content to old sectors, so there's no way to store a nonce that isn't reused. The best we can achieve is a tweakable super-pseudorandom permutation (SPRP). For every sector, we want there to be a bijection between the plaintext and the ciphertext which is indistinguishable from a random permutation, and across all of the sectors, we want a family of permutations that's indistinguishable from a family of random permutations, and that's assuming the attacker has access to both the encryption and decryption functions.

An example of a tweakable super-pseudorandom permutation is AES-XTS, and this is what we use on modern file-based-encrypted devices. To encrypt a four-kilobyte sector, we simply apply AES-XTS to each of its 256 sixteen-byte blocks in parallel. The tweak comes in two parts: there's a part for the sector offset, and a part for the offset within the sector. However, on the devices we target, we see performance of around 59 cycles per byte, which is something like 20 megabytes a second, and that is way too slow for our users' experience. If they're loading an app or something, it visibly takes too long, and we can't expect our users to accept that.

We can achieve a better security guarantee if our super-pseudorandom permutation is applied to the entire four-kilobyte sector, so that a change to any bit of the plaintext in that sector affects the entire ciphertext, and vice versa, and for every tweak it appears to be a completely new permutation. Not only does this give us better security properties, it gives us an opportunity to be faster: because we're operating on the data in much larger chunks, we can use primitives that work in bulk and give us greater speed in bulk. An SPRP has to read every byte of the plaintext before it writes anything.
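To make "narrow block" concrete, here is a toy sketch of the XTS layout in Python. A BLAKE2b-based Feistel permutation stands in for AES, so this is not real AES-XTS, just the same structure: the sector is treated as 256 independent 16-byte blocks, each masked with a per-block tweak obtained from the encrypted sector number by repeated doubling in GF(2^128).

```python
import hashlib

MASK = (1 << 128) - 1
SECTOR = 4096

def _toy_perm(key: bytes, block: bytes, decrypt: bool = False) -> bytes:
    # Toy 16-byte permutation standing in for AES (NOT secure):
    # a 6-round Feistel network over two 8-byte halves.
    def F(i, half):
        return hashlib.blake2b(bytes([i]) + half, key=key, digest_size=8).digest()
    l, r = block[:8], block[8:]
    if not decrypt:
        for i in range(6):
            l, r = r, bytes(x ^ y for x, y in zip(l, F(i, r)))
    else:
        for i in reversed(range(6)):
            l, r = bytes(x ^ y for x, y in zip(r, F(i, l))), l
    return l + r

def _double(t: int) -> int:
    # Multiply by x in GF(2^128) with the XTS polynomial x^128 + x^7 + x^2 + x + 1.
    t <<= 1
    if t >> 128:
        t = (t & MASK) ^ 0x87
    return t

def xts_sector(k1: bytes, k2: bytes, sector_num: int, data: bytes,
               decrypt: bool = False) -> bytes:
    # XTS-style layout: 4096 / 16 = 256 independent blocks, each with its own
    # tweak derived from the sector number (the two-part tweak from the talk).
    t = int.from_bytes(_toy_perm(k2, sector_num.to_bytes(16, "little")), "little")
    out = bytearray()
    for j in range(0, SECTOR, 16):
        tb = t.to_bytes(16, "little")
        x = bytes(a ^ b for a, b in zip(data[j:j + 16], tb))
        y = _toy_perm(k1, x, decrypt)
        out += bytes(a ^ b for a, b in zip(y, tb))
        t = _double(t)
    return bytes(out)
```

Because each 16-byte block is encrypted independently, changing one plaintext block changes only the corresponding ciphertext block. A whole-sector SPRP rules that out, at the cost of having to read every input byte before writing any output.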
So you have to have at least two passes, and because we want it to be super-pseudorandom, we need the same property in the decryption direction, which means a minimum of three passes. We've gone for a hash-XOR-hash structure, simply because the hash is faster than the XOR stream pass, and so doubling up the hash is cheaper than doubling up the stream. We were inspired by examples like HCTR and HCH; there's a lot of interesting work in this area, which is cited in the paper. Like the other work we looked at, HCTR and HCH are built on the assumption that you're going to use AES and multiplications in GF(2^128). They have this hash-counter-hash structure: there's a narrow 16-byte block on the left, and they hash the wide block on the right and combine it into the block on the left. They use that to generate the nonce that goes into the counter-mode encryption, and then they hash again on the other side. And there's a single block cipher call on the thin block on the left, to defeat the Luby-Rackoff attack on the three-round Feistel structure. But because, like the other things I looked at, this is based on AES and GF(2^128), it achieves better security properties but it doesn't perform better on our hardware.

So the work we did was simply to take these ideas from HCTR and HCH and combine them with the ideas from RFC 7539, with the high-performance primitives we have. That means we changed the hash combiner on the left to an addition mod 2^128, because that's the combiner that works well with our hash, but the structure is basically very similar. Also, because ChaCha is very well behaved compared to AES in counter mode, we can achieve a slight optimization where ChaCha and AES can run in parallel in the decryption direction.
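The overall skeleton just described, a hash on each side of a single narrow block cipher call, with a stream cipher doing the bulk XOR, can be sketched in Python. Everything cryptographic below is a toy stand-in: a keyed BLAKE2b reduced mod 2^128 plays the role of the NH/Poly1305 hash, a 6-round BLAKE2b Feistel plays the role of the single AES call on the 16-byte block, and BLAKE2b in counter mode plays the role of ChaCha. Only the skeleton follows the mode; none of the primitives here are Adiantum's.

```python
import hashlib

MASK128 = (1 << 128) - 1

def _hash(hk: bytes, tweak: bytes, left: bytes) -> int:
    # Toy stand-in for the epsilon-almost-Delta-universal hash (NH + Poly1305).
    h = hashlib.blake2b(key=hk, digest_size=16)
    h.update(len(tweak).to_bytes(4, "little") + tweak + left)
    return int.from_bytes(h.digest(), "little")

def _narrow(ek: bytes, block: bytes, decrypt: bool = False) -> bytes:
    # Toy stand-in for the single AES call: a 6-round Feistel on 16 bytes.
    def F(i, half):
        return hashlib.blake2b(bytes([i]) + half, key=ek, digest_size=8).digest()
    l, r = block[:8], block[8:]
    if not decrypt:
        for i in range(6):
            l, r = r, bytes(a ^ b for a, b in zip(l, F(i, r)))
    else:
        for i in reversed(range(6)):
            l, r = bytes(a ^ b for a, b in zip(r, F(i, l))), l
    return l + r

def _stream(sk: bytes, nonce: bytes, n: int) -> bytes:
    # Toy stand-in for the ChaCha keystream, keyed by the narrow-block output.
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.blake2b(nonce + ctr.to_bytes(8, "little"), key=sk,
                               digest_size=64).digest()
        ctr += 1
    return out[:n]

def encrypt(keys, tweak: bytes, pt: bytes) -> bytes:
    hk, ek, sk = keys
    p_l, p_r = pt[:-16], pt[-16:]                     # wide left, narrow right
    p_m = (int.from_bytes(p_r, "little") + _hash(hk, tweak, p_l)) & MASK128
    c_m = _narrow(ek, p_m.to_bytes(16, "little"))     # the one block cipher call
    c_l = bytes(a ^ b for a, b in zip(p_l, _stream(sk, c_m, len(p_l))))
    c_r = (int.from_bytes(c_m, "little") - _hash(hk, tweak, c_l)) & MASK128
    return c_l + c_r.to_bytes(16, "little")

def decrypt(keys, tweak: bytes, ct: bytes) -> bytes:
    hk, ek, sk = keys
    c_l, c_r = ct[:-16], ct[-16:]
    c_m = ((int.from_bytes(c_r, "little") +
            _hash(hk, tweak, c_l)) & MASK128).to_bytes(16, "little")
    p_l = bytes(a ^ b for a, b in zip(c_l, _stream(sk, c_m, len(c_l))))
    p_m = int.from_bytes(_narrow(ek, c_m, decrypt=True), "little")
    p_r = (p_m - _hash(hk, tweak, p_l)) & MASK128
    return p_l + p_r.to_bytes(16, "little")
```

A one-bit change anywhere in the plaintext changes the narrow block, hence the stream nonce, hence (with overwhelming probability) the entire ciphertext — the wide-block property that per-block AES-XTS lacks.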
We sacrifice the symmetry of encryption and decryption, but it gives us an opportunity for a slight speed-up in decryption. And that gives us a massive performance boost: 17.8 cycles per byte, well over twice as fast as AES-XTS. That was a big win, but we needed more: the discussion we were going to have with OEMs, trying to make sure this feature was mandatory, was going to be a lot easier for every cycle per byte I could shave off this mode.

So the first change we made was to switch from ChaCha20 to ChaCha12. Currently the best attacks on ChaCha break seven rounds, and that's been the case for over a decade now, during which it has seen a lot of cryptanalysis. Every round of ChaCha adds a lot of strength, so we felt good about choosing that. That gave us 13.6 cycles per byte, which is a significant improvement. But there was room for one other improvement: while Poly1305 is very fast, there are still faster epsilon-almost-Delta-universal hash functions on this hardware. So we looked at NH for hashing the bulk of the data. NH is blindingly fast, around 1.5 cycles per byte. NH's output is variable-length: the more you hash, the larger its output. So we use NH essentially as a compression function, to reduce the amount of data we feed to Poly1305; Poly1305 then handles the final hashing stage, and that's combined with the tweak. That gives us our mode, Adiantum, at 10.6 cycles per byte.

This is the overall performance. Because of the single AES invocation, we get faster the wider the block we're encrypting, but we're still fairly acceptable even on the 512-byte sectors that we used to get on all devices. And we're faster not only than AES, but than other block ciphers such as NOEKEON and Speck. There's a larger performance table in the paper.

I think I've got time to talk a little bit about the proof. It's fairly straightforward.
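The NH-as-compression step can be sketched as follows. This uses UMAC-style NH over 32-bit words, not Adiantum's exact NH parameters (those are in the paper), and the 1024-byte chunk size is illustrative: each chunk is compressed to an 8-byte sum, and the concatenation of those sums, far shorter than the message, is what would then be fed to Poly1305.

```python
import struct

W = 1 << 32

def nh_chunk(key_words, msg_words):
    # One NH evaluation (UMAC-style, 32-bit words): sum over word pairs of
    # ((m[2i] + k[2i]) mod 2^32) * ((m[2i+1] + k[2i+1]) mod 2^32), mod 2^64.
    # Only additions and multiplications: this is why NH is so fast in software.
    s = 0
    for i in range(0, len(msg_words), 2):
        s += ((msg_words[i] + key_words[i]) % W) * \
             ((msg_words[i + 1] + key_words[i + 1]) % W)
    return s % (1 << 64)

def nh_compress(key: bytes, msg: bytes, chunk_bytes: int = 1024) -> bytes:
    # Variable-length output: each message chunk contributes its own 8-byte sum,
    # so the output grows with the input, which is why a final Poly1305 pass
    # is still needed to reduce it to a fixed-size hash.
    key_words = list(struct.unpack("<%dI" % (len(key) // 4), key))
    out = b""
    for off in range(0, len(msg), chunk_bytes):
        chunk = msg[off:off + chunk_bytes]
        msg_words = list(struct.unpack("<%dI" % (len(chunk) // 4), chunk))
        out += nh_chunk(key_words, msg_words).to_bytes(8, "little")
    return out
```

On a 4096-byte sector with 1024-byte chunks, the 4096 bytes collapse to 32 bytes before Poly1305 ever runs, which is where the speed-up comes from.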
This is the main step in the proof, where the adversary is trying to distinguish between two worlds, World X and World Y. They can make plaintext and ciphertext queries of any length and tweak. World X is essentially Adiantum, except that we replace the block cipher AES with a random permutation pi, and we replace ChaCha with a random function f. In World Y, for every query the attacker makes, they get a random reply of the appropriate length. We're going to use the H-coefficient technique; thanks to one of my reviewers for suggesting that.

After the final query, we're going to make the attacker's life a little bit easier: we're going to give them the hash key, which can't hurt and can only help them. Once the attacker has the hash key, one thing they can do is calculate all of the intermediate hashes inside Adiantum. In the diagram you've got these hashes on either side of AES, and we're going to allow the attacker to calculate those. They can take the plaintext at the top and the tweak at the side and get the plaintext hash, and the same thing for the ciphertext at the other end. Having done that, they can look for collisions in either the plaintext hashes or the ciphertext hashes. It doesn't matter if a plaintext hash collides with a ciphertext hash; it only matters if there's a collision within a specific layer. If they find such a collision, they win.

So what's the probability? For whatever queries they make, we can bound the probability of finding such a collision, and the wonderful thing about the H-coefficient technique is that we only have to consider the probability in World Y. This is the effort that it saves us. There the result is pretty simple: the reply to a query is totally random, and it's combined with whatever the hash value is, so the resulting hash on that side is totally random, and the probability of it colliding with any specific previous query is 2^-128.
That covers the ciphertext side of a plaintext query; it's exactly the other way round for a ciphertext query. On the plaintext side of a plaintext query, we apply the epsilon-almost-Delta-universal property of the hash function. We forbid pointless queries, so for any given previous query, either the plaintext or the tweak or both must be different. The result is that the probability that the two hashes differ by any given amount is at most epsilon, meaning the probability of a collision on the plaintext side is at most epsilon. So across all pairs of queries, the probability that we'll see such a collision is at most (epsilon + 2^-128) times q choose 2. If that happens, we call it a bad transcript.

So suppose we get a good transcript. For a given deterministic attacker, what's the probability that we'll see a particular set of responses to their queries? In World Y, it's pretty simple: all responses are equally likely, so for a response to a query p, the probability is 2 to the minus the length of p. In World X, it's almost the same. For us to see a particular response, first of all, f must have exactly the correct output to XOR into the content on the left-hand side to convert it to the expected ciphertext; the probability of that is 2 to the minus (the length of p minus 128). On the right-hand side, pi has to encrypt in just the right way. For the first query, the probability of that is 2^-128. As the queries go on, because this is a good transcript, each query presents an input that pi has never seen before, and it's guaranteed to need an output that pi has never produced before, so the probability of getting it right goes up by a tiny bit every time.
So that's 1 over (2^128 minus i), where i is the number of queries before this one. And I should mention that the reason we can make that assertion about f is that the nonces being fed to f are unique every time, so its output is always independent of all previous outputs, and these two probabilities are also independent. So overall, the probability that we'll see this transcript is the product of the two. The important thing is that this is always either the same (if it's one query) or just a little bit bigger than the probability in World Y, and this holds across all queries.

So now we can play the H-coefficient technique. Every good transcript is at least as likely in World X as in World Y; we've bounded the probability of a bad transcript; and that bounds the distinguishing advantage, which is at most (epsilon + 2^-128) times q choose 2. Epsilon depends on a bound on the total length of the messages and tweaks the attacker sends: we're using polynomial hashing, so the longer the messages and tweaks get, the larger epsilon gets. We plug that into our formula, we add PRP and PRF terms for ChaCha and AES, and we get this rather large expression. The key thing is that it is linear in the message and tweak length, quadratic in the number of queries, and small.

The good news is, Adiantum is already part of Linux 5.0. And I don't really think in terms of years. I don't call years 2017 and 2018; I call them things like Oreo and Pie, because those are the names of the Android dessert releases that come out each year. This year's release was Android Pie, and we have added Adiantum to Android Pie, and some devices will be using it. And in Android Q, which we don't know what it'll be called yet (if you know of a dessert named after the letter Q, please let me know), there will be no carve-out.
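Pulling the proof sketch from a few paragraphs back together into one formula (notation assumed from the talk; the exact statement and constants are in the paper), the shape of the final bound is roughly:

```latex
\mathrm{Adv}^{\pm\widetilde{\mathrm{prp}}}_{\mathrm{Adiantum}}(q)
  \;\le\; \binom{q}{2}\left(\varepsilon + 2^{-128}\right)
  \;+\; \mathrm{Adv}^{\mathrm{prp}}_{\mathrm{AES}}
  \;+\; \mathrm{Adv}^{\mathrm{prf}}_{\mathrm{ChaCha}}
```

where q is the number of queries and epsilon, the almost-Delta-universality bound of the hash, grows linearly with the total message and tweak length, giving the "linear in length, quadratic in queries" behaviour described above.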
We will require encryption on all devices, and where AES is too slow, those devices will use Adiantum. Thank you very much.

Any questions from the audience? Then I will ask a question. You said it has some favorable properties compared to XTS. Could it also be interesting to use it on devices which do have AES support, replacing the ChaCha call with AES in counter mode or something? Is that something you looked at?

Where you're using AES instructions, that's very appealing. A lot of these devices have inline encryption engines, and those are hard to make changes to; but also, an SPRP mode is a terrible fit for an inline encryption engine. They want to be able to stream the data past and decrypt it as it goes, and an SPRP mode totally rules that out: you have to read everything before you can write anything. So maybe that's possible, but I think that would take a little bit longer to land.

OK, and maybe another question I was interested in: could you give some ballpark figures, like what percentage of Android devices this will be deployed on in the next year?

I know it's many millions, but I think I'm not allowed to say, so let's take that question offline. But it's a lot. I mean, it's the next billion users we're aiming this at; it's a lot of people.

OK, any questions from the audience? Then let's thank the speaker again.