systems often precludes simply pulling all the ciphertexts down and decrypting them. So in this paper, our main contribution is a generic algorithm which, with apologies to Omer Reingold and the people who built the zigzag product, we call the zigzag. This is a generic algorithm that solves the problem of backwards-compatible FPE. The tokenization upgrade example that I gave before, we're going to refer to as domain completion in the rest of the talk. We prove that the zigzag meets the natural security notion for that setting. We also do a couple of analyses of the runtime of the zigzag and prove that it's fast except with negligible probability. For domain extension, which is the format expansion example we saw before, we prove that unfortunately the natural strongest security you would want in this setting is actually impossible. So we give a new, slightly weaker security goal and we analyze the zigzag with respect to that.
So a little more formally: a domain completion setting involves an FPE, and an FPE from a set D to itself with key K is a permutation of that set for every key. In domain completion, we're going to call the old FPE F, and we're going to consider a partial permutation, which is to say a permutation that's only defined on a subset of its possible inputs. We're going to call this F sub K star, and we're going to call the domain of this partial permutation T. Concretely, you can think of T as the plaintexts that were present in the system before the domain completion occurs. The goal in this setting is to build a new FPE, ZZ sub K prime, over the same set D so that for every point in T, its image under ZZ agrees with its image under F. So basically the domain-completed cipher ZZ has the same image as the old cipher on all the points in T.
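Written out, the functionality requirement just described looks like the following. The symbols are the ones named in the talk; writing the keys as subscripts is an assumption about how the slide renders "F sub K star" and "ZZ sub K prime".

```latex
\[
  \mathrm{ZZ}_{K'} : D \to D \ \text{is a permutation, and} \qquad
  \mathrm{ZZ}_{K'}(x) = F_{K^*}(x) \quad \text{for all } x \in T \subseteq D .
\]
```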
The security goal in this setting is the traditional one for FPE, which is strong pseudorandom permutation (SPRP) security. This basically says that the cipher is indistinguishable from a random permutation, but the slight twist for domain completion is that we're going to give the adversary knowledge of T. In the next few slides, rather than referring to the function F sub K star abstractly, we're just going to think of it as a token table, which is basically just a big attribute-value map, as I described before.
So there's an obvious approach to this problem, which is to use a tokenization scheme and a new FPE scheme in parallel. When you want to encrypt a point that was in the token table already, you just return its value. And if you want to encrypt a point that's not in the token table, you just encrypt it with the FPE. Unfortunately, it's not that hard to see that you're going to get collisions in the output. So this doesn't even preserve permutivity on the set that we're trying to encrypt, and we have to discard this basic construction.
But fortunately, a slight tweak of this basic idea gives us what we need, and this is the zigzag construction. The idea here is to use a modified form of cycle walking to repair the permutation on the points that collide. There are two easy cases and one that's a little more difficult. The first easy case is if the point is in the set T: then we're just going to return its value in the token table. The next easy case is if the point is not in the set T and its image under the new cipher E doesn't collide with any of the images of the points in T. In that case, we're just going to return that image, because there's no problem and no permutivity is violated. The third case, and the one that's a little more complicated, is when you encrypt a point not in T whose image collides with the image of some point in T.
And here what we're going to do is basically decrypt to find the point in T that caused this collision, and then we're going to re-encrypt that point with E, repeating if the new image collides as well. So this is slightly reminiscent of cuckoo hashing, if you're familiar with that data structure.
Just some analysis of this basic algorithm. If the set T is at most half the size of the domain D, the zigzag algorithm runs in amortized constant time over a sequence of encryption and decryption queries, except with negligible probability. The intuition here is that what we're doing can be modeled as a sampling-without-replacement experiment from combinatorics, and once you view the problem that way, you can use a standard tail bound on the hypergeometric distribution to bound the probability of having to go through this loop a lot of times. The last thing I'll say about the zigzag in the domain completion setting is that it meets the strong pseudorandom permutation security goal that we want. We prove this via a reduction to the underlying permutations E and F, and this reduction is actually tight. There's no loss in security.
So now we will discuss the domain extension setting in a little more detail. Again, the point of this setting is that we want to be able to encrypt both the points in the old set that existed before the extension and the points in the new extension set. We want to do this while maintaining the ability to decrypt the points that existed in D before the update, and all while preserving permutivity over the whole set M. As before, we're going to call our old permutation F sub K star, and the new domain we're going to call M, which I've conveniently put in blue. Just remember that D is a subset of M. So M contains D.
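The three cases of the zigzag described above can be sketched in a few lines of Python. This is a toy illustration, not the paper's code: the names `make_zigzag`, `token_table`, `E`, and `E_inv` are hypothetical, plain dictionaries stand in for the old and new ciphers, and the decryption direction (not spelled out in the talk) is an assumption that simply walks the same chain backwards.

```python
def make_zigzag(token_table, E, E_inv):
    """Return (encrypt, decrypt) for a toy zigzag.

    token_table: dict mapping each old plaintext in T to its old ciphertext.
    E, E_inv:    dicts for the new cipher and its inverse over the domain.
    All three are stand-ins (assumptions) for real keyed ciphers.
    """
    table_inv = {y: x for x, y in token_table.items()}

    def encrypt(x):
        # Easy case 1: x is in T, so return its token-table value.
        if x in token_table:
            return token_table[x]
        y = E[x]
        # Hard case: y collides with the image of some point t in T.
        # Find t by inverting the table, then re-encrypt t with E.
        # The walk never revisits a point, so the loop halts.
        while y in table_inv:
            y = E[table_inv[y]]
        # Easy case 2 (or end of the walk): no collision, return y.
        return y

    def decrypt(y):
        # Images of points in T decrypt through the table.
        if y in table_inv:
            return table_inv[y]
        x = E_inv[y]
        # Walk the chain in reverse until we leave T.
        while x in token_table:
            x = E_inv[token_table[x]]
        return x

    return encrypt, decrypt


# Tiny example over D = {0, 1, 2, 3} with T = {0}.
E = {0: 1, 1: 2, 2: 3, 3: 0}
E_inv = {y: x for x, y in E.items()}
token_table = {0: 2}

# The parallel "table or FPE" approach collides: 0 and 1 both map to 2.
naive = [token_table.get(x, E[x]) for x in range(4)]

encrypt, decrypt = make_zigzag(token_table, E, E_inv)
zigzag = [encrypt(x) for x in range(4)]
print(naive)   # [2, 2, 3, 0]
print(zigzag)  # [2, 1, 3, 0]
```

In the example, the point 1 first lands on 2, which is already the table value of 0, so the walk re-encrypts 0 with E and returns 1 instead.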
So in this setting we need to build an FPE over the whole set M with the same kind of preservation property that we had before, in particular that the image of each point in T is the same under ZZ and under the old cipher F. Here we're not going to refer to a token table. We're just going to be a little bit more abstract. So we have the old cipher F sub K star over the set D, and we have the new cipher E sub K over the whole set M. And it's not that hard to see that the zigzag actually works perfectly fine for domain extension. The two easy cases stay the same, and the hard case where there's a collision works the exact same way as well, which is not all that surprising because we never actually needed the sets to be the same in the zigzag for domain completion.
The question here, though, is what security this achieves. Unfortunately, in the paper we prove that SPRP security is actually impossible for any domain-extended cipher, that is, any cipher that meets this functionality goal. We prove this with a distinguisher that's actually pretty simple. All it really has to do is pick some points of T arbitrarily and query them one after another. It sees whether any of the images of the points that it's querying fall outside of D. If they do, then the distinguisher returns that it's in the ideal world. Otherwise it guesses that it's in the real world. The key intuition here is that for a random permutation, it's very unlikely for the images of all the points that it queried to fall in D. But for a domain-extended cipher, this has to happen. It happens basically with probability one because of the functionality of the cipher. And the probability that the distinguisher gets fooled is this nasty kind of quotient of factorials on the upper right of the slide. So this is a pretty fast-growing function, especially if the set M is much larger than D. So the question then becomes whether we can prove any meaningful security in this setting.
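To get a feel for the distinguisher just described, here is a small Python sketch. The slide's exact formula is not in the transcript, so `prob_all_images_in_D` is an assumption based on the standard counting argument: a uniformly random permutation of M sends q fixed distinct points into D with probability (|D|/|M|) * ((|D|-1)/(|M|-1)) * ..., which is a quotient of falling factorials. The function names and the example sizes are hypothetical.

```python
from fractions import Fraction


def prob_all_images_in_D(size_D, size_M, q):
    """Chance that a uniformly random permutation of M maps q fixed,
    distinct points entirely into D (sampling without replacement)."""
    p = Fraction(1)
    for i in range(q):
        p *= Fraction(size_D - i, size_M - i)
    return p


def distinguisher(encrypt, queried_points, D):
    """Query some points of T; answer "ideal" if any image leaves D,
    otherwise guess "real", as in the talk."""
    in_D = set(D)
    for x in queried_points:
        if encrypt(x) not in in_D:
            return "ideal"
    return "real"


# Hypothetical sizes: |D| = 10**4 and |M| = 10**5.
for q in (1, 5, 10):
    fooled = prob_all_images_in_D(10**4, 10**5, q)
    print(q, float(fooled))
```

With these sizes a single query already fools the distinguisher only one time in ten, and every extra query multiplies in another factor of roughly |D|/|M|.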
And we're gonna do this by weakening the security goal a little bit. We're gonna target indistinguishability from a slightly different ideal object. We call this definition strong extended pseudorandom permutation (SEPRP) security. Intuitively, a permutation is an SEPRP if it's indistinguishable from a permutation sampled uniformly, but subject to our functionality requirement. The first theorem that we prove, and it's a pretty simple one, is that the zigzag meets SEPRP security in the domain extension setting. The next one, which was a little bit tougher to prove, is that any SEPRP cipher gives any adversary at most a factor of two advantage in message recovery games. We prove this by taking a message recovery notion from Bellare et al.'s prior work on FPE and generalizing it to accommodate the domain extension setting. Once we do this, we look at the experiment where the challenger picks a hidden message and gives the ciphertext to the adversary, and the adversary has to guess the message by making encryption queries. The only thing that's really hidden from the adversary here is membership in T. This is like one hidden bit of information, so you basically have to make twice as many queries to make up for it.
In this paper, we did a few other analyses of the problem that we were considering. The first is that rather than weakening your security goal, if you weaken the amount of knowledge the adversary has about the setting, you can modify the zigzag construction in a way that actually achieves SPRP security. Unfortunately, I won't have time to go into that in this talk, but you can see the paper for more details. Another thing that's kind of worrisome about the zigzag is that it has variable timing for different inputs, which in general is a pretty bad property for a cryptographic algorithm to have.
But we prove in the paper that the timing side channel here basically only leaks whether the point is in T or not. Since in the strongest setting we assume that the adversary has this anyway, this is pretty inconsequential.
An anonymous reviewer gave us an alternate way of achieving domain completion and extension, via a construction that we analyze in the paper and call rank-encipher-unrank. The advantage of this alternate construction is that, unlike the zigzag, it actually has fast worst-case performance. But the zigzag only has bad runtime with negligible probability, so I don't see this as a huge advantage. The disadvantage is that you have to do a lot of pre-processing and build a data structure that's specially made for this setting. So the storage overhead is a little high: it's on the order of the size of the set T. And because this data structure is accessed in a point-dependent way that's more granular than the side channel you get from timing, we conjecture, though we don't have a proof of this, that it would lead to memory access pattern side channels in practice.
So in summary, in this paper we introduced the idea of backwards-compatible crypto, which I think is important because the inflexibility of cryptographic primitives has been pointed out as a serious practical problem before. If any of you were at Crypto, Brian Sniffen gave a really nice talk where he pointed out that the models that cryptographers use are very inflexible and don't always line up with practical considerations. Key rotation, for example, is one that he specifically highlighted as a huge problem in practice that's not really modeled in academic work. And I think this paper is cool because we make progress towards building more flexible cryptographic primitives that can be changed after they're deployed.
And we give a generic algorithm that solves the problem of backwards-compatible FPE, and we show how it solves the problems of domain completion and domain extension. So the techniques that we develop in this paper are efficient, they're provably secure, and they solve real problems for practitioners. Thanks for listening. Any questions?
So you say that the construction is efficient, but as far as I understand, with a non-zero probability, your while loop never terminates, so the average complexity is infinite. So can you have a construction with a good average complexity?
It does terminate.
If the old scheme, the tokenization scheme, has a fixed point, then if you decrypt, you continue to obtain this fixed point and the while loop never terminates, right?
I didn't think about it. Maybe we can take it offline. That's possible.