Hello, my name is Baiyu Li, and I will be talking about our paper on the security of homomorphic encryption on approximate numbers. This is joint work with my advisor, Daniele Micciancio.
So here is a brief overview of this talk. I will first talk about the passive security model of approximate homomorphic encryption. I will introduce a new security notion, IND-CPA^D security, to formalize passive attackers against approximate homomorphic encryption. I will compare this new security notion with the classic IND-CPA security. I will then present a passive key recovery attack against the CKKS scheme. This attack is within the IND-CPA^D security model. It is both efficient and effective against most of the previous versions of the open source FHE libraries. And I will also briefly talk about countermeasures against this attack.
So this talk is about homomorphic encryption, which is a crypto primitive that allows you to compute on encrypted data. Essentially, FHE makes this small diagram commute. FHE can be extremely useful for building privacy-preserving protocols. And over the last several years, we have seen great progress in improving the efficiency of FHE schemes. Nowadays, many services are being offered using FHE technology. And one of the schemes, the CKKS scheme, has become a very popular and serious candidate in many of these applications.
So what is the CKKS scheme? It is a special kind of homomorphic encryption scheme. It is an approximate homomorphic encryption scheme. What this means is that the decryption function on an encryption of x does not return x exactly, but rather returns something close to x. The scheme itself is instantiated based on standard LWE encryption, where the raw decryption function computes an inner product between the secret key and the ciphertext. The result is a noisy encoding of the plaintext x.
Typically, with exact FHE schemes, you would need to apply an error correction code to extract the plaintext x. The CKKS scheme considers this noise as a sort of approximation error. It does not do any error correction. Instead, the CKKS scheme decodes the entire noisy encoding of x into an x' that is only approximately equal to x. In many applications, especially numerical computations, approximate results are often already acceptable. And on the other hand, by not doing any error correction, the CKKS scheme becomes much more efficient compared to the standard exact FHE schemes. The CKKS scheme has been implemented in many of the open source FHE libraries, and it has been used in many privacy-preserving machine learning applications.
For security, you can show that the CKKS scheme is IND-CPA secure based on standard lattice assumptions. The proof is essentially the same as for other LWE-based schemes, such as BGV and BFV. Here, IND-CPA security is typically considered the standard security notion for passive security. So at this point, we may be tempted to conclude that the CKKS scheme is passively secure. But we want to ask if that's really the case, or maybe the real question is: is IND-CPA a suitable model to consider for approximate encryption?
Well, to answer that question, let's look at homomorphic encryption as a formal object. Formally, a homomorphic encryption scheme consists of four algorithms. You can use the key generation algorithm to generate a set of keys. The party with the public key can encrypt his message. Another party with the evaluation key can homomorphically evaluate her circuit to get another ciphertext. Finally, the secret key holder can decrypt the final ciphertext to learn the final computation result. For exact schemes, the very first requirement is that the scheme must be correct.
This essentially requires that the decryption of a homomorphically evaluated ciphertext must return the same result as if the computation were carried out in the clear on the plaintext.
For security, we consider passive attackers against FHE schemes. For example, take this sunglasses guy in this picture. Such an attacker can influence legitimate users in their choices of plaintext messages as well as in their choices of homomorphic computations. This passive attacker can also eavesdrop on the communication line to learn the ciphertexts. And a very important fact is that this passive attacker is also capable of observing the final computation result in plaintext. For example, in this picture, our favorite heroes Alice and Bob communicate through this encrypted line. At the end of the communication, Bob decrypts Alice's message. At that point, Bob's behavior may depend on the decryption result. And this passive attacker, this sunglasses guy, can observe Bob's behavior to learn the final computation result in plaintext. So this is a very crucial fact: we must consider passive attackers who can observe the final computation result in plaintext.
For exact schemes, to formalize such a passive attacker, we typically end up with IND-CPA security, or indistinguishability under chosen plaintext attack. This security can be defined by the IND-CPA security game. I assume everybody is familiar with this classic definition, so I will skip it. But what I really want to emphasize is that the formulation of IND-CPA security does not consider the decryption function. It only concerns the encryption and evaluation functions of the scheme. And this is totally OK, because for exact schemes, the adversary already knows the decryption result due to the correctness requirement. But such a correctness requirement is not satisfied by approximate encryption schemes.
So this brings up the question of whether we should incorporate the decryption function into the security definition to capture all the powers of a passive attacker. To do this, we introduce a new security notion, IND-CPA^D security, that is, IND-CPA with a special decryption oracle, to formalize these passive attackers against approximate homomorphic encryption schemes.
This security is defined by a standard indistinguishability game where the adversary is given access to three stateful oracles: the encryption oracle, the evaluation oracle, and the special decryption oracle. The encryption oracle is a standard left-or-right oracle that takes a pair of messages from the adversary. It encrypts one of the messages, depending on a secret bit b, returns the ciphertext to the adversary, and stores both plaintexts as well as the ciphertext in the state. The evaluation oracle H is also standard. It takes a circuit and a sequence of indices J from the adversary. It picks the tuples indexed by J, homomorphically evaluates the circuit on the ciphertexts indexed by J, returns the final ciphertext to the adversary, and stores the plaintext computation results in both the left and the right world, as well as the final ciphertext, in the state.
The decryption oracle is very special. It takes only an index j from the adversary. It picks the tuple corresponding to this index from the state and compares the two plaintext messages in the tuple. If these two plaintext messages are equal, then the decryption oracle simply decrypts the j-th ciphertext and returns the decryption result to the adversary. Otherwise, the decryption oracle just returns an error symbol.
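The bookkeeping of the three oracles just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the formal definition from the paper: the class names `IndCpaDGame` and `ToyScheme`, the method names, and the use of `None` as the error symbol are all assumptions made here, and `ToyScheme` is a placeholder with no security whatsoever that exists only so the game can run.

```python
import secrets


class ToyScheme:
    """Hypothetical placeholder scheme with NO security; it only lets the game run."""

    def keygen(self):
        return "sk", "pk", "ek"

    def enc(self, pk, m):
        return ("ct", m)

    def eval(self, ek, circuit, cts):
        return ("ct", circuit(*[ct[1] for ct in cts]))

    def dec(self, sk, ct):
        return ct[1]


class IndCpaDGame:
    """State and oracles of the game; the adversary never sees sk or the bit b."""

    def __init__(self, scheme, b=None):
        self.scheme = scheme
        self.sk, self.pk, self.ek = scheme.keygen()
        self.b = secrets.randbelow(2) if b is None else b
        self.state = []  # list of tuples (m0, m1, ciphertext)

    def encrypt(self, m0, m1):
        # Left-or-right encryption: encrypt m_b, remember both plaintexts.
        ct = self.scheme.enc(self.pk, (m0, m1)[self.b])
        self.state.append((m0, m1, ct))
        return ct

    def evaluate(self, circuit, indices):
        # Evaluate homomorphically on stored ciphertexts, and in the clear
        # on both the left and the right plaintexts.
        rows = [self.state[j] for j in indices]
        ct = self.scheme.eval(self.ek, circuit, [r[2] for r in rows])
        m0 = circuit(*[r[0] for r in rows])
        m1 = circuit(*[r[1] for r in rows])
        self.state.append((m0, m1, ct))
        return ct

    def decrypt(self, j):
        # Decrypt only honestly generated ciphertexts whose left and right
        # plaintexts agree; otherwise return the error symbol (None here).
        m0, m1, ct = self.state[j]
        if m0 != m1:
            return None
        return self.scheme.dec(self.sk, ct)
```

For example, after `encrypt(3, 5)` and `encrypt(4, 2)`, a decryption query on either index is refused because the two plaintexts differ, but evaluating addition on both gives 7 in the left and in the right world, so that ciphertext may be decrypted.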
Well, this is pretty technical, but all this definition wants to do is formalize a passive adversary who can encrypt messages honestly, who can evaluate ciphertexts honestly, and who can also observe the decryption results of honestly generated ciphertexts. So this decryption oracle is very different from the decryption oracle you would typically see in active security definitions such as CCA or CCA2, because here the decryption oracle can only decrypt ciphertexts that are honestly generated by the encryption and evaluation oracles. It cannot decrypt ciphertexts arbitrarily chosen by the adversary. So this oracle does not give the adversary any power to mount active attacks.
With this definition, the very first thing we want to do is a sanity check. We want to make sure that we don't give unnecessary power to the adversary. Formally, we can show that for exact homomorphic encryption schemes, this new security definition, IND-CPA^D security, is equivalent to the classic IND-CPA security. So we don't really give any unnecessary power to the adversary. But for approximate schemes, we can show that IND-CPA^D security is strictly stronger than the classic IND-CPA security. We will show this by presenting a key recovery attack against the approximate scheme CKKS in the IND-CPA^D model.
Before I go into the details of this key recovery attack, let me summarize the theoretical results that we obtained in our paper. We show that for exact schemes, our new security definition, IND-CPA^D security, is equivalent to the classic IND-CPA security. This shows that our new definition is a conservative extension of IND-CPA security. For approximate schemes, we can show that there exists a strict hierarchy of variants of IND-CPA^D security based on the number of decryption queries allowed to the adversary.
We can show that the unrestricted IND-CPA^D security is separated from q-IND-CPA^D security, where the adversary is allowed to make only an a priori bounded number of decryption queries. And this separation goes all the way down to the variant in which the adversary is not allowed to make any decryption query, which is the same as the classic IND-CPA security. We also show that, in terms of query order, the non-adaptive version of IND-CPA^D security is separated from the fully adaptive, unrestricted IND-CPA^D security. We also define a simulation-based IND-CPA^D security, and we show that there also exists a strict hierarchy of simulation-based IND-CPA^D security based on the number of decryption queries. You can find the details of all these theoretical results in our paper on ePrint.
So now let me present our key recovery attack against the CKKS scheme. The CKKS scheme is based on Ring-LWE encryption, and it is typically instantiated with a cyclotomic ring of power-of-two order. In CKKS, the secret key is a pair of polynomials, and the ciphertext is typically also a pair of polynomials. The raw decryption function in CKKS computes an inner product between the secret key and the ciphertext. CKKS applies a special encoding scheme based on the canonical embedding of the cyclotomic ring. The encoding function computes the inverse of the canonical embedding, scales it up, and rounds it to the nearest integers. The decoding function first scales down a polynomial and then applies the canonical embedding to get a complex vector. The full encryption and the full decryption functions are standard: they are just the composition of the raw encryption or raw decryption function with the encoding or the decoding function. What's really important here is that the raw decryption function is a linear function in the secret key, and the decoding function is also a linear function.
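The functions just described can be written out as follows. The notation is an assumption made here for illustration and is not taken from the talk: the secret key is written as sk = (1, s), the ciphertext as ct = (c_0, c_1) over the ring modulo Q, the scaling factor as \Delta, the canonical embedding as \varphi, and the encryption noise as e.

```latex
\begin{aligned}
\mathsf{Dec}^{\mathrm{raw}}_{sk}(ct) &= \langle sk, ct \rangle = c_0 + c_1 \cdot s \bmod Q = m + e, \\
\mathsf{Encode}(z) &= \lfloor \Delta \cdot \varphi^{-1}(z) \rceil, \qquad
\mathsf{Decode}(m) = \Delta^{-1} \cdot \varphi(m), \\
\mathsf{Dec}_{sk}(ct) &= \mathsf{Decode}\bigl(\mathsf{Dec}^{\mathrm{raw}}_{sk}(ct)\bigr)
  = \Delta^{-1} \cdot \varphi\bigl(c_0 + c_1 \cdot s \bmod Q\bigr).
\end{aligned}
```

Apart from the reduction modulo Q, every step on the right-hand side is linear in s.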
So the full decryption function, as a composition, is a linear function in the secret key. This brings us to the core idea of our key recovery attack. In this attack, we consider a passive attacker who can observe some ciphertexts and who can also observe the decrypted numbers of those ciphertexts. With this information, the attacker first tries to re-encode the decrypted numbers into a polynomial m'. Then the attacker just computes the inverse of the full decryption function, and at the end of the attack, the attacker may have the secret key as the final computation result s'.
Well, this attack looks simple and very efficient, but in practice there are some obstacles. For example, in some cases you may use a power-of-two modulus, and in those cases you don't always have an inverse for an arbitrary polynomial. If that's the case, the attacker can just collect more ciphertext and decryption result pairs and apply Gaussian elimination to obtain the secret key. A more serious obstacle is due to computation errors, because the encoding function is not strictly the inverse of the decoding function. Due to floating-point errors, there may be re-encoding errors between the encoding function and the decoding function. If there is a re-encoding error, then by computing the inverse of the full decryption function you will not necessarily get the secret key. But if the re-encoding error is not too big, the attacker can just fall back to a lattice attack. All this implies that the CKKS scheme is not a passively secure scheme in the IND-CPA^D model.
We implemented our attack against most of the open-source FHE libraries. In our experiments, we chose lattice parameters such that we obtain at least 256-bit security under the classic IND-CPA security notion.
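The core idea of the attack described above can be illustrated with a toy example. This is a minimal sketch under strong assumptions, not the implementation used in the experiments: the ring degree `N = 8` and prime modulus `Q = 12289` are tiny illustrative values, encryption is a simplified symmetric-key Ring-LWE, and the attacker is assumed to re-encode the decrypted result without any error, so it holds exactly m' = m + e. All function names are hypothetical. The inversion is written as a linear solve by Gaussian elimination, and a fresh pair is collected whenever the system happens to be singular.

```python
import random

N = 8      # toy ring degree; real parameters are far larger
Q = 12289  # toy prime modulus, so every nonzero scalar has an inverse


def negacyclic_mul(a, b):
    """Multiply two polynomials in Z_Q[X]/(X^N + 1)."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                out[k] = (out[k] + ai * bj) % Q
            else:  # X^N = -1 in this ring
                out[k - N] = (out[k - N] - ai * bj) % Q
    return out


def add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]


def keygen(rng):
    return [rng.choice([-1, 0, 1]) % Q for _ in range(N)]


def encrypt(rng, s, m):
    # c1 is uniform; c0 = m + e - c1*s, so that c0 + c1*s = m + e.
    c1 = [rng.randrange(Q) for _ in range(N)]
    e = [rng.randint(-3, 3) % Q for _ in range(N)]
    c0 = add(sub(m, negacyclic_mul(c1, s)), e)
    return c0, c1


def raw_decrypt(s, ct):
    c0, c1 = ct
    return add(c0, negacyclic_mul(c1, s))  # the noisy encoding m + e


def solve_mod_q(A, y):
    """Solve A x = y over Z_Q by Gaussian elimination; None if A is singular."""
    n = len(A)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] % Q), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, Q)
        M[col] = [(v * inv) % Q for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(vr - f * vc) % Q for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def recover_key(ct, m_prime):
    """Attacker's step: solve c1 * s = m' - c0 for s, using only public data."""
    c0, c1 = ct
    A = [[0] * N for _ in range(N)]
    for j in range(N):
        unit = [0] * N
        unit[j] = 1
        col = negacyclic_mul(c1, unit)  # column j of "multiply by c1"
        for i in range(N):
            A[i][j] = col[i]
    return solve_mod_q(A, sub(m_prime, c0))


def demo(seed=0):
    rng = random.Random(seed)
    s = keygen(rng)
    while True:  # collect another pair if c1 is not invertible
        m = [rng.randrange(-1000, 1000) % Q for _ in range(N)]
        ct = encrypt(rng, s, m)
        m_prime = raw_decrypt(s, ct)  # what the passive attacker observes
        guess = recover_key(ct, m_prime)
        if guess is not None:
            return s, guess
```

Calling `demo()` returns the true key and the attacker's guess, which coincide in this error-free toy setting. With a power-of-two modulus or a nonzero re-encoding error, this direct solve is not enough, which is where the extra samples or the lattice attack mentioned in the talk come in.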
We homomorphically computed the variance of large numbers in order to get more floating-point errors. This is to make our attack harder. We also homomorphically evaluated logistic and exponential functions. This is to produce bigger encryption noise, also in order to make our attack harder. In almost all of our experiments, we could successfully recover the secret key.
We disclosed this attack to the library teams in October last year. Following that, there have been very extensive discussions about this attack and about mitigations. During this process, more sophisticated attacks have been discovered, and some heuristic mitigations have been implemented in those libraries. These mitigations are so far only heuristic. So the question is, can we have a provably secure solution to make the CKKS scheme IND-CPA^D secure?
The answer is yes: we can just add a large Gaussian noise during decryption. This is typically known as the noise flooding technique, and it has been used in many areas, for example in threshold encryption and in the circuit privacy problem of homomorphic encryption. But in practice, noise flooding has a serious drawback, because it requires adding a large noise. This means that you will need a bigger modulus, because typically you need to add about 40 bits of noise during decryption. As a result, this may require you to use 128-bit integers to do the polynomial arithmetic. So by doing this, you will certainly see a performance penalty.
So this brings up some open problems. The first one is whether we can design a more efficient solution to achieve IND-CPA^D security for approximate homomorphic encryption schemes.
Maybe it is very hard to achieve full IND-CPA^D security, but it may be possible to design an efficient solution for approximate homomorphic encryption that achieves a limited notion of IND-CPA^D security, in which the adversary is allowed to make only a bounded number of decryption queries.
Another very interesting direction is whether we can build efficient schemes for homomorphic approximate computations. The idea of such a scheme is to homomorphically evaluate approximate operations whose results are known deterministically in the clear. If that's the case, because the operation is deterministic, you can trivially prove IND-CPA^D security as long as the scheme is IND-CPA secure. I should say that there has been some recent progress in this direction, but it is still an open problem to build efficient schemes of that kind.
So as a final remark, I want to mention that we introduced a passive security model for approximate homomorphic encryption schemes in this talk. We introduced a new security notion, IND-CPA^D security. This security notion is equivalent to the classic IND-CPA for exact schemes, so it does not give the attacker any additional power for exact schemes. But for approximate schemes, we can show that this new security notion is strictly stronger than the classic IND-CPA security. And we show that there exists a strict hierarchy of variants of IND-CPA^D security based on the number of decryption queries. We also presented a key recovery attack against the CKKS scheme in this IND-CPA^D model. The attack itself is very simple and efficient. Some heuristic countermeasures against this attack have been implemented in many open source libraries.
But it may require further study to say how much concrete security can be achieved through these countermeasures. We can also achieve provable security with the CKKS scheme, but an efficient solution is still an open problem.
So this is the end of my talk, and thank you very much for your attention. You're welcome to check out our paper on ePrint.