My name is Philip Hodges, and I will be presenting research done with Douglas Stebila on algorithm substitution attacks, including a formalism for detecting such attacks with state resets, and asymmetric methods for subverting symmetric encryption. First, I will define an ASA. Then I will briefly discuss motivation for considering this kind of attack model. I will talk about some past work on which our work is heavily based, and then I will discuss the two parts of our research: state resets, and making ASAs asymmetric. So firstly, what is an ASA? An algorithm substitution attack, or ASA, is an attack on a cryptographic scheme, say a symmetric encryption scheme SE, where a component of the scheme is substituted with a malicious version. In essence, this model broadly captures what might happen if the attacker is able to substitute the secure algorithm a user intends to use with another algorithm of the attacker's choice. This model captures a very broad range of methods of undermining cryptography. In our case, we are mostly going to consider symmetric encryption specifically. The encryption function SE.Enc, from the symmetric encryption scheme SE, is substituted with an algorithm Sub.Enc, which will run instead on every invocation of the encryption function. We call Sub.Enc the subverted encryption algorithm. The goal of the attacker here is full recovery of the key used for symmetric encryption after observing some number of ciphertexts. Now, you might think that this would be easy, because we are giving the attacker a lot of capability. Indeed, with no other requirement, it would be trivial: the attacker can simply substitute in an algorithm which outputs the key K on any input. In practice, however, this would be obvious to the users. Decryption would fail, for example. The users would likely notice, and they would stop using their encryption.
The likely enormous amount of effort that the attacker put into implementing this attack would yield nothing. We could require that the ASA create valid ciphertexts, so that decryption is always successful and the users would still be able to communicate. In fact, we'll go a little bit further than that: we want an ASA to be undetectable to the users. To define undetectability, we will use a detectability game. This is a simple distinguishability game where the user U plays the role of the adversary, tasked with determining whether the oracle they are given access to is the subverted or unsubverted algorithm. If U has a strong advantage at this game, then we say that the ASA is detectable. If not, it is undetectable. Note that I've introduced another component to the ASA, the key kbar. kbar is generated by a key generation algorithm specific to the ASA, and used by the subverted encryption algorithm. While kbar could be considered part of the description of Sub.Enc, it will be useful to us to treat this subversion key kbar as a parameter. It is worth noting that we're referring here to black box cryptography. Indeed, this area of research started with a paper by Adam Young and Moti Yung in 1996, warning against black box cryptography. In practice, a user could detect an ASA by reading the code of the algorithm that they're using to see that it isn't what it's supposed to be. However, there are many situations where such a thing may not be possible. Furthermore, it can be difficult to ensure that code perfectly matches a specification, and code vulnerabilities can go undetected for years in the best of circumstances. So while this model of black box cryptography does restrict the detection capabilities of the user, it is certainly still worth exploring. With all these requirements, it is no longer a trivial question whether an ASA exists that both recovers keys and is undetectable. So there are two questions. First, why do we care about ASAs?
And second, do undetectable ASAs exist? So now I'll talk a little bit about motivation, answering that first question. So a quick timeline, very quick in fact. In 2013, Snowden revealed a vast amount of classified NSA material, some of which disclosed that the NSA works to deliberately undermine cryptographic standards when it is of benefit to the US. In 2014, the first research on ASAs was published. The first works in this area were heavily motivated by the threat of mass surveillance by powerful adversaries, and ASAs were their way to model the undermining of cryptography towards that goal. So I'll discuss some of those past works; our research draws quite heavily on them, so I'll go into some detail so that the improvements that we've made will be evident. So in 2014, Bellare, Paterson, and Rogaway renewed interest in this area, and they coined the term ASA. They proved this theorem: for any randomized, coin-injective symmetric encryption scheme SE, there exists an undetectable ASA against SE that recovers keys efficiently. They constructed an ASA that works against many symmetric encryption schemes. Their idea was to resample the ciphertext C of SE.Enc with fresh coins until a certain condition on the ciphertext is met, shown here for a secret counter t and a pseudorandom function F. If the ciphertext satisfies this condition, then we can see that the attacker, who knows kbar, will be able to learn a single bit of the key K by recomputing F(kbar, C). After observing many ciphertexts, the attacker can learn the entirety of K as t is iterated through every index. One might also imagine that we could prove that this ASA is undetectable, since the distribution of the ciphertexts will be the same as for the unsubverted algorithm. I'll go into a little bit more detail on that later. So here is an implementation of BPR's ASA idea. Note that the index t for the key K is kept as state and incremented on each encryption.
The value S is a fixed parameter that limits the number of ciphertext samples drawn during a single invocation in this main loop. As we can see, the subverted encryption algorithm resamples C using fresh coins until the result of applying the PRF F on C is equal to the current bit of the key K. In 2015, Bellare, Jaeger, and Kane improved the ASA of BPR by removing the dependence on state, and improved their analysis to avoid the coin-injectivity requirement, relying only on high min-entropy of the original encryption. Their main idea was to include the index t in the PRF output: sample C until the two outputs of the PRF, t and w, yield correct information about the key K, that is, until K[t] = w. Key recovery looks a little different from BPR's ASA. Since the leaked key index t is random, we cannot guarantee that the key will be fully leaked after a number of ciphertexts equal to the length of the key, as in BPR's ASA. However, we can do a coupon collector type analysis to see that, with high probability, every index will be selected at least once after a modest number of ciphertexts, and therefore K can still be recovered by the attacker. Here's an implementation of BJK's ASA. This is very similar to BPR's, but because the index t is no longer maintained as state, this ASA is both slightly simplified and stateless. This is undetectable for generally the same reasons as BPR's ASA was undetectable, but I'm going to talk about this one in a little bit more detail so we get an idea of what undetectability looks like. Here I've written the detectability game that I had earlier with the ASA substituted in. The proof of undetectability proceeds with a sequence of game hops based on indistinguishability between the games. It can be seen relatively quickly like this. The first game hop replaces F by a lazily sampled random function. The second replaces the lazy random sampling by true random sampling.
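Stepping back from the proof for a moment, BJK's leaking mechanism and the coupon-collector recovery described above can be sketched as a short, self-contained simulation. This is an illustrative toy only: the names are mine, a SHA-256 hash stands in for the PRF F, and random bytes stand in for the randomized encryption SE.Enc; a real key would be 128+ bits.

```python
import hashlib
import os
import secrets

KEY_BITS = 16   # toy key length; real symmetric keys are 128+ bits
S = 64          # max resampling attempts per encryption

def prf(kbar: bytes, c: bytes):
    """Toy stand-in for the PRF F: hash (kbar, c) to an index t and a bit w."""
    d = hashlib.sha256(kbar + c).digest()
    return int.from_bytes(d[:4], "big") % KEY_BITS, d[4] & 1

def subverted_encrypt(k_bits, kbar: bytes, msg: bytes) -> bytes:
    """BJK-style stateless subversion: resample the ciphertext with fresh
    coins until the PRF output (t, w) satisfies k_bits[t] == w."""
    for _ in range(S):
        c = os.urandom(16) + msg          # stand-in for randomized SE.Enc
        t, w = prf(kbar, c)
        if k_bits[t] == w:
            return c
    return c                              # give up after S tries (prob. 2^-S)

# Attacker: observe ciphertexts, learning one key bit (at a random index)
# per ciphertext, until every index has been seen at least once.
k_bits = [secrets.randbelow(2) for _ in range(KEY_BITS)]
kbar = os.urandom(16)
recovered, count = {}, 0
while len(recovered) < KEY_BITS:          # coupon collector over indices t
    c = subverted_encrypt(k_bits, kbar, b"attack at dawn")
    t, w = prf(kbar, c)
    recovered[t] = w
    count += 1

recovered_bits = [recovered[i] for i in range(KEY_BITS)]
```

By the coupon-collector argument, full recovery takes about n·H_n ciphertexts in expectation (roughly 54 for the 16-bit toy key here), which is the "modest number of ciphertexts" mentioned above.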
Then, since the loop condition no longer depends on C because of the true random sampling, it doesn't depend on the choice of R, and that means that the distribution of the ciphertexts produced by this function is the same as the distribution of the ciphertexts produced by the original encryption function, and hence there is no way to distinguish between the two cases b = 0 and b = 1. This is the exact relation that we get in the proof. The red part is from the first game hop, and the blue part is from the second. The blue term is a bound on the probability of seeing two repeated ciphertexts during the game with lazy random sampling. n is the total number of oracle calls, and eta is the min-entropy of SE.Enc. Recall that S is the maximum number of loops that the subverted encryption algorithm will perform before returning a value. While much of our work is comprised of extensions of the two papers I mentioned so far, there is a lot of other research in this area that is still ongoing. This includes ASAs on MACs, signature schemes, and KEMs, among other things, as well as countermeasures, where different approaches have been taken based on designing non-black-box schemes. These include so-called split programs, reverse firewalls, and self-guarding schemes. Now we'll discuss the main contributions of our paper. There are two areas that we made contributions to: handling of state in the detection game, and asymmetric ASAs. Stateful ASAs may behave differently than stateless ones, and hence become detectable if their state is reused or reset deliberately. So which ASAs are detectable in this way, how do we formalize this, and should all realistic ASAs be stateless? These are some of the questions that we have tried to answer. The second contribution is the development of asymmetric ASAs, where knowledge of the key kbar embedded in the ASA is not sufficient to recover the key K.
This would be desirable to the attacker, since kbar is not securely held, and anyone who knows kbar can exploit the ASA to recover K. A third party might be able to find kbar even if the user is unaware of the ASA. We will show ways for the attacker to prevent that outcome. So we'll start with the issue of state resets. BPR noted in their paper that a reset of the state will lead to increased detection ability for an observer, but this increase does not appear to be enough to lead to actual detection. On the other hand, BJK claimed of the BPR subversion that a state reset, as can happen with a reboot or cloning to create a virtual machine, leads in their attack to detection. This seems contradictory, so which is actually the case? Does state reset lead to detection or not for BPR's subversion? BJK, in their paper, defined strong undetectability, which effectively forbids undetectable ASAs from having any state at all by providing their state to the user in the detection game. If there is any state, the user will detect the ASA, and if not, well, they might not be able to. In 2019, Baek et al. designed a stateful ASA targeting a signature scheme. They proved that even if the state of the ASA is cleared during the detection game, the ASA is undetectable, so state resets cannot detect their attack. Now this notion of state reset doesn't quite reflect the vision of BJK, who described state resets which might happen when cloning a virtual machine as well as resetting one. In that case, state could be set back to any previous point in time, rather than just to the null state from the beginning of the game. In 2020, Chen, Huang, and Yung designed a stateful ASA on key encapsulation that recovers the exchanged key using two consecutive ciphertexts. Their approach to state was to argue informally that because the state used by their ASA is limited, that is, it retained only information from the previous invocation, their analysis need not include state resets.
Now this was a reliance on only informal arguments to justify using a small amount of state, without really specifying what a small amount of state is. So the question is, how should we address state? The research thus far has been inconsistent. Stateful ASAs have been completely discounted by BJK as more easily detected, and other works such as Chen et al. justify their use of state through appeals to practicality and believability rather than formal analysis. So we propose a middle road, which comes from formalizing exactly what kind of state reset capability we imagine the user or detector having. BJK suggested that state resets could occur from cloning or rebooting of virtual machines, and we could imagine in a cloud setting that a user with black box access to encryption functionality could have access to these capabilities as well. The Baek et al. paper addressed reboots, but not cloning. We will address both, giving the user or the detector in the detection game the ability to reset state to any previously used state. So here is our state reset detectability game, called SRDET. In this game, we are allowing Sub.Enc to be stateful, and we are explicitly tracking the state on each invocation i with the variables tau_i. The reset oracle allows the detector to set the next tau_i to be the same as any previously used state, without having direct access to that state at any point, or even knowing if state is being used. Baek et al. used a model like this, where the only valid argument to reset was j = 0, corresponding to clearing state back to its initial value. Aside from those changes, this game is the same as the previous detectability game that I showed. So let's explore how this compares to previous definitions a little bit. Here are three existing definitions of detectability, along with our definition from the previous slide, placed to demonstrate which definitions imply which other definitions.
So if an ASA is undetectable according to BJK, then it has no state. Therefore, it will be undetectable with our notion of state reset, as the reset oracle will have no effect. As I stated earlier, our definition implies the definition of Baek et al., since their state reset is a special case of ours. Finally, any of these definitions implies that of BPR, since BPR's definition does not account for state at all, and so the ASA is allowed to have any kind of state. More surprisingly, no two definitions here are equivalent. That is, there are separating examples at every level. Moreover, these examples are not artificially constructed, but are in fact the ASAs in many of the papers that I've discussed so far. So let's see what happens when we apply our new detectability definition to these ASAs. It turns out that if we allow resetting state to any previously used value, then the ASAs from Baek et al. and Chen et al. are quite easily detected. In practice, a user would have to run the scheme in question, inducing it to store non-trivial state. Cloning that state and having it rerun using the same state would produce the exact same result, while in an unsubverted scheme, new randomness would be sampled, causing random output no matter what cloning occurs. This is easily translated to a detection strategy in the SRDET game. However, the ASA from BPR is actually undetectable. We rewrote BPR's ASA using BJK's framework and adapted BJK's game-hopping proof method to the BPR ASA. Then we showed that the proof goes through completely independent of the presence of the state reset oracle. In other words, resetting state provides no assistance in detecting BPR's ASA at all. So what does this mean? We think that the following two points should be heeded for future research in the area. For researchers who avoid or discount stateful schemes, it should be made clear what detection threat model they are working in, and why state is unrealistic.
For researchers who develop stateful schemes, undetectability should be proven in a formal model including some version of state reset, or detection methods in such a framework should be acknowledged. More importantly for our work, we've shown that stateful schemes can be undetectable, and knowing this supports the development of our stateful asymmetric ASAs on symmetric encryption in the next part of the paper. So let's now discuss asymmetric ASAs. I'll talk a little bit about the concept of NOBUS. NOBUS stands for "nobody but us". The designer of an ASA who wishes to exploit a symmetric encryption scheme would have a strong incentive to ensure that other third parties do not have the same ability for exploitation. If we assume these third parties are, say, nation-state intelligence agencies, then we can imagine they might have the resources to reverse engineer the ASA. Essentially, the designers would want to ensure that exploitability of the ASA is not contingent upon possession of a key that the designers can't maintain control over. The solution here is to replace the subversion key kbar by a pair of asymmetric keys, XK and EK. If only EK is embedded in the ASA, then this is the only value recoverable by reverse engineering. If XK is what is required for exploitation, then our imagined third party cannot exploit the ASA, because they will not be able to recover XK. We call ASAs that use two keys in this manner asymmetric ASAs. Again, we ask: is it possible to implement a successful ASA in these circumstances? Well, it turns out that some of the ASAs I've talked about already are asymmetric, so the answer is certainly yes. However, you may remember that these two asymmetric ASAs are detectable by state resets. In fact, all of the asymmetric ASAs that we've seen in the literature are detectable by state resets.
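To see concretely why such stateful subversions fall to state resets, here is a minimal simulation of the cloning-based detection strategy from the state reset discussion. Everything here is illustrative and not any specific scheme from the literature: the counter-based subversion is a generic stand-in for an ASA whose output depends deterministically on hidden state.

```python
import hashlib
import os

def make_stateful_subversion(kbar: bytes):
    """Toy stateful subversion: encryption coins are derived deterministically
    from (kbar, counter) instead of being sampled fresh on each call."""
    state = {"ctr": 0}
    def enc(msg: bytes) -> bytes:
        coins = hashlib.sha256(kbar + state["ctr"].to_bytes(8, "big")).digest()[:16]
        state["ctr"] += 1
        return coins + msg        # stand-in for SE.Enc run with those coins
    return enc, state

def honest_enc(msg: bytes) -> bytes:
    return os.urandom(16) + msg   # fresh coins on every invocation

def reset_detects(enc, snapshot, restore) -> bool:
    """SRDET-style detector: query, reset to a previously used state, query
    again; an identical repeated ciphertext betrays hidden state."""
    saved = snapshot()
    c1 = enc(b"m")
    restore(saved)
    c2 = enc(b"m")
    return c1 == c2

sub_enc, sub_state = make_stateful_subversion(os.urandom(16))
detected_subverted = reset_detects(sub_enc,
                                   lambda: dict(sub_state),
                                   lambda s: sub_state.update(s))
detected_honest = reset_detects(honest_enc, lambda: None, lambda s: None)
```

After the reset, the subverted oracle replays its deterministic output and is caught, while the honest oracle samples fresh randomness and produces a different ciphertext, exactly as described above.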
Furthermore, although BPR specified definitions for asymmetric ASAs on symmetric encryption specifically, they did not provide a construction, and to date nobody else has either. So that's what we did. So there are two different reasonable requirements that we could have for our asymmetric ASA, and this distinction hasn't been made in other literature. The first type is the simpler one, where we simply say that the ASA must be completely undetectable, regardless of whether the detector is in possession of the embedded key EK. In the second type, we relax this assumption slightly. We consider two types of adversaries in the detection game. Before, we were modeling users U as adversaries in the detection game who did not have possession of an embedded key. We could carry that assumption over here. Then we ask, what actual requirement do we have against the third-party adversary, which we're now going to call V? So we'll require in the type 2 asymmetric ASA that the ASA is not exploitable by V, i.e., it is secure. I will present both a type 1 and a type 2 asymmetric ASA. This differentiation might be small, but the relaxed requirement for the type 2 ASA does allow for some advantages. I unfortunately won't have time to cover these advantages in detail here. So this is the detection game we will use for our new ASAs. As written, it describes both regular and augmented detectability games, and of course includes a state reset oracle. The augmented game includes the boxes, while the non-augmented one does not. For type 1 asymmetric ASAs, where we require undetectability against adversary U in possession of EK, we will use the augmented game. For type 2 asymmetric ASAs, where we only require undetectability against U who does not possess EK, we will use the non-augmented game, which is identical to the game that I showed earlier. We'll start with our type 1 asymmetric ASA. The idea here is relatively simple. 
BPR used a technique like this one to find a ciphertext C which leaked information about K. We will modify this only slightly: first encrypt K under a public key encryption scheme to get kappa, but then use the exact same technique to leak kappa. Since the attacker has XK, they can decrypt kappa after recovery, but nobody else can. I'm going to claim that key recovery follows: it should be clear that if BPR's technique works, it works for values other than K, and so this leak will work, even if it may take more ciphertexts to do so. Now how do we make this undetectable to someone with EK? Using EK, you could test the output of F(EK, C). If this is anything but random, then the ASA will be detectable. Therefore, all of the leaked bits kappa[sigma] must be random, and furthermore, we must never reuse them. To achieve this, we can use an IND$ PKE scheme and compute fresh encryptions after using all the bits of the encrypted key. An IND$ scheme is one whose ciphertexts are indistinguishable from random. So here's the type 1 asymmetric ASA, ASub1. The first section consists of bookkeeping to compute new encryptions of K into kappa and iterate through each one bit by bit. Once we have leaked every bit of kappa, we recompute a new kappa. The second part is just BPR's leaking technique applied to kappa. It's worth mentioning that the IND$ property for PKE can be achieved with ciphertexts of length less than 400 bits at a 128-bit security level, so this is certainly an achievable property for our application. Using game hops similar to BJK's proof of undetectability for their ASA, plus one more for the IND$ security of the PKE, we showed the relation here, bounding the augmented undetectability of our ASA. This is very similar to the bound that BJK achieved, but note that we had to model F as a random oracle rather than a PRF because EK is known, which wasn't necessary in BJK's proof.
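The type 1 ASA's bookkeeping can be sketched as follows. This is an illustrative skeleton under stated assumptions, not the paper's construction: `pke_encrypt_key` is a hypothetical callable returning a fresh IND$ encryption of K as a list of bits, a SHA-256 hash stands in for F modeled as a random oracle, and `make_asub1` is my own name for the wrapper.

```python
import hashlib
import os

S = 64  # max resampling attempts per encryption

def f_bit(ek: bytes, c: bytes) -> int:
    """Toy stand-in for F modeled as a random oracle, truncated to one bit."""
    return hashlib.sha256(ek + c).digest()[0] & 1

def make_asub1(encrypt, pke_encrypt_key, ek: bytes):
    """Type 1 bookkeeping: leak the bits of kappa = PKE.Enc(EK, K) in order,
    computing a fresh encryption once every bit has been used."""
    state = {"kappa": [], "t": 0}
    def sub_encrypt(msg: bytes) -> bytes:
        if state["t"] == len(state["kappa"]):
            state["kappa"] = pke_encrypt_key()   # fresh IND$ ciphertext of K
            state["t"] = 0
        target = state["kappa"][state["t"]]
        c = encrypt(msg)
        for _ in range(S):                       # BPR-style rejection sampling
            if f_bit(ek, c) == target:
                break
            c = encrypt(msg)                     # resample with fresh coins
        state["t"] += 1
        return c
    return sub_encrypt

# Toy run: a dummy 8-bit "PKE ciphertext", leaked one bit per invocation.
kappa = [1, 0, 1, 1, 0, 0, 1, 0]
ek = os.urandom(16)
sub_enc = make_asub1(lambda m: os.urandom(16) + m, lambda: list(kappa), ek)
leaked = [f_bit(ek, sub_enc(b"hi")) for _ in range(len(kappa))]
```

The attacker, observing ciphertexts in order, recomputes f_bit to read off kappa bit by bit, then (in the real scheme) decrypts kappa under XK to obtain K.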
Our type 2 asymmetric ASA will start with the technique used by BJK instead, which we illustrate here. In our ASA, we again encrypt K to get kappa, and use BJK's leaking technique to leak kappa. Since the attacker has XK, they can decrypt kappa after recovery, and as before, no one else can. And again, as before, we observe that key recovery works here because the underlying leaking mechanism works no matter what the leaked value is. Through a coupon collector analysis, a 400-bit value could be recovered with high probability after observing fewer than 3,000 ciphertexts. And in this case, since we're discussing a type 2 asymmetric ASA, the undetectability requirement does not impose that kappa needs to be indistinguishable from random, since our detection adversary will not have EK. So this is our type 2 asymmetric ASA. The description is similar to our type 1 ASA, but the bookkeeping related to the encryption of K to kappa is simpler, since we no longer need to re-encrypt at any point. The looping component uses the leaking mechanism of BJK. We now consider the undetectability of our type 2 ASA. Recall that we consider only non-augmented detectability against user adversaries; that is, the adversary does not possess EK. It might not be surprising that the results and analysis of BJK's proof of undetectability carry over here almost exactly, since we are using the same leaking technique as them, just on a different value. Note also that despite the state being kept by our ASA, the state reset oracle given to the detector in the detection game gives no advantage in detection. Let's now consider the second requirement for our type 2 asymmetric ASA. We require that the ASA is not exploitable by a third party who is in possession of EK. To do this, we will show that the ASA itself is still a secure encryption scheme. We'll consider IND-CPA security. In order to discuss the security of ASub2, we defined the security game IND-CPA', shown here.
This is a modification of a typical IND-CPA game to handle the key generation and state required for ASub2.Enc. We showed that the security of ASub2 in this modified security game is bounded by the IND-CPA security of the PKE scheme and the IND-CPA security of the underlying symmetric encryption scheme. As it turns out, the choice of IND-CPA as the security notion is quite arbitrary: using another notion of security, the proof still goes through, with the security notion in the leftmost and rightmost terms modified accordingly. In the final section of the paper, we consider generic modifications to symmetric ASAs to make them asymmetric ASAs. The modifications that I showed earlier are actually independent of the leaking mechanism being used, so long as such a mechanism can be used to leak an arbitrary value. We call an ASA with this property, that it can be used to leak an arbitrary value, flexible. We can apply these modifications to any flexible symmetric ASA to obtain a flexible asymmetric ASA. Some of the ASAs we've seen today, including BPR's and BJK's, are flexible, and others are not. That's all I've got. Thank you very much for watching.