Okay, so our next talk is "High-Precision Bootstrapping for Approximate Homomorphic Encryption by Error Variance Minimization", and Yong-Woo Lee is giving the talk.

Okay, thank you for the introduction. My name is Yong-Woo Lee, and I am from Samsung Advanced Institute of Technology. It is our great pleasure to share our work at Eurocrypt, and thank you for attending. So we have achieved high-precision bootstrapping for approximate homomorphic encryption, which is very similar to the previous talk, and we used an error variance minimization technique for that. In this paper, we propose how to find the optimal approximate polynomials for the approximate homomorphic encryption scheme CKKS, from the viewpoint of SNR. So this paper is basically about how to find a good approximate polynomial for the CKKS scheme. Using this approximate polynomial, we achieve a high-precision CKKS scheme, for example more than 90 bits, and the interesting thing is that it has multiplicative depth equivalent to the prior arts. We also propose an algorithm for efficient homomorphic evaluation of polynomials, which we call the lazy baby-step giant-step (BSGS) algorithm; it is about two times faster than the original baby-step giant-step algorithm. And we also propose a polynomial approximation that is specific to the baby-step giant-step algorithm.

This table compares the prior arts and our work. The prior arts usually use an indirect approximation, for example the sine function with the double-angle formula, or arcsine, something like that. Our method, in contrast, approximates the modular reduction directly. So the prior arts work with a composition of small polynomials, while ours uses one single high-degree polynomial. Our precision is more than 90 bits, while the prior arts achieve up to around 40 bits. In this table we do not include the previous talk, because it is essentially parallel work. The depth is very similar, and the measure of noise is different. The measure of noise means: when we say an approximation is good, good in terms of what? The prior arts usually use minimax as the measure, which means minimizing the maximum error. Ours takes SNR, the signal-to-noise ratio, as the measure of noise, which is a very widely used notion.

So here is the outline. I'll first give you some preliminaries, then I will tell you about our approximation method, and then the efficient evaluation algorithm. Finally I will show the implementation results and conclude the talk.

First, preliminaries. High-precision homomorphic encryption is itself a very interesting topic. Note that the best known precision so far is even less than standard double precision. This is because bootstrapping is the bottleneck of CKKS accuracy, and that is why it is interesting to study high-precision bootstrapping for the CKKS scheme. We can also think about the Li-Micciancio attack. Recently, Li and Micciancio proposed a passive key-recovery attack on the CKKS scheme, and its known countermeasure is the noise-flooding technique: to use noise flooding, we have to add a huge error after decryption, for example 30 or 40 bits, and by doing that we lose most of the accuracy. I'm not sure whether the bootstrapping error helps this attack or not; maybe, maybe not. But if we have high-precision CKKS, everything becomes very simple: we can readily apply the noise-flooding technique.
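Just to pin down the two measures mentioned in this comparison, written out in my own notation (this is not taken from the slides): the prior works minimize the worst-case approximation error, while this work minimizes the expected squared error, which is the noise power in the SNR sense when the error has mean zero.

```latex
% Minimax measure (prior works): worst-case error over the approximation range I
E_\infty(p) = \max_{x \in I} \bigl| F_{\mathrm{mod}}(x) - p(x) \bigr|

% Variance / SNR measure (this work): expected squared error under the input distribution
E_2(p) = \mathbb{E}_X\!\left[ \bigl( F_{\mathrm{mod}}(X) - p(X) \bigr)^2 \right],
\qquad \mathrm{SNR} = \frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}}
```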
I will review the approximate homomorphic encryption scheme. The CKKS scheme, proposed by Cheon, Kim, Kim, and Song, is an approximate homomorphic encryption scheme that is efficient for real or complex numbers. Its interesting characteristic is that the message contains an error: when we decrypt the ciphertext, we get m plus e, and we cannot extract the error, unlike in the BGV or BFV schemes. Addition and multiplication are directly supported in the CKKS scheme, so we can evaluate any polynomial. However, non-arithmetic operations such as comparison or modular reduction cannot be represented by additions and multiplications, so we have to use polynomial approximation for them.

When we do a ciphertext-ciphertext multiplication in the CKKS scheme, we have to perform an operation called relinearization, and it is expensive. Relinearization is key switching from s squared, the key of the three-component ciphertext, back to the linear key s. The reason a three-component ciphertext is generated after multiplication is that multiplication of RLWE-like ciphertexts is done by a sort of tensor product. And the key switching requires a lot of NTTs, which is heavy.

We also have to consider rescaling in the CKKS scheme. In CKKS, the plaintext is scaled by a scaling factor, and when we multiply, that factor is squared. In order to make its growth linear instead of exponential, we have to perform rescaling, and when we do the rescaling, the message scale and the ciphertext modulus are reduced together. We have to note that rescaling introduces a rounding error.

For a circuit, we define depth as the maximal length of a path from an input to the output gate, and we only care about multiplicative depth because addition is way cheaper than multiplication. For example, a degree-d polynomial has depth about log d. The level of a ciphertext is the maximum depth the ciphertext can still evaluate, and bootstrapping is the homomorphic evaluation of the decryption circuit. So our goal in CKKS bootstrapping is to refresh the level of the ciphertext. But the decryption circuit itself also consumes depth, so less depth spent on bootstrapping means more levels left after bootstrapping, which means fewer bootstrappings in the whole procedure.

I will briefly review CKKS bootstrapping. After many, many rescalings we are left with a very small ciphertext modulus q, and we want to increase it to a large Q. If we simply view the ciphertext modulo this large Q, we pick up a multiple-of-q term, q times I, and we want to remove it. In order to perform the coefficient-wise operation that reduces this q·I term, we first perform a linear transformation, a homomorphic encoding called CoeffToSlot. Then we need the modular reduction, but we cannot do modular reduction directly, so we evaluate a polynomial approximation of it, which we call F_mod, and this finally removes the q·I term. Then we do SlotToCoeff, which is the inverse operation of CoeffToSlot, and this is very important: note that SlotToCoeff is given by sums of m_i multiplied by ζ_i, where the m_i are the coefficients of the message m and ζ_i is a root of unity, so it has size one.

Okay, what is the signal-to-noise ratio, SNR? SNR is a very widely used measure of signal quality, for example in wireless communication or storage devices. Basically, for most noisy media we use SNR to measure the signal quality. It is defined as the ratio of the signal power to the noise power. We can also think of CKKS as a noisy medium for computation.
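As a toy illustration of this measure (my own sketch, not part of the paper's implementation), the SNR of a noisy vector can be estimated numerically and reported as bits of precision:

```python
import numpy as np

def snr_bits(signal: np.ndarray, noisy: np.ndarray) -> float:
    """Estimate the SNR of a noisy copy of `signal` and report it as bits.

    SNR is the ratio of signal power to noise power; 0.5 * log2(SNR) is
    roughly the number of correct bits per slot.
    """
    noise = noisy - signal
    p_signal = np.mean(np.abs(signal) ** 2)
    p_noise = np.mean(np.abs(noise) ** 2)
    return 0.5 * np.log2(p_signal / p_noise)

# Toy example: a random message perturbed by a tiny additive error.
rng = np.random.default_rng(0)
m = rng.standard_normal(1 << 10)
e = 1e-9 * rng.standard_normal(1 << 10)
print(snr_bits(m, m + e))   # roughly 30 bits of precision
```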
So a larger SNR is better. To increase the power of the message, we can use a larger scaling factor, which is very easy. But it is not good, because a larger scaling factor means consuming more of the ciphertext modulus at each rescaling, which means fewer levels. So we would rather focus on minimizing the noise power, which is the same as the error variance when we assume the error has mean zero, which is very reasonable, right?

So how do we do the approximation? What do we have to consider when we design an approximate polynomial for the CKKS scheme? First, the polynomial basis is noisy. Messages in the CKKS scheme have errors, so the polynomial basis is noisy, and the basis error is similar to the rounding error introduced by rescaling. Since the basis has error, multiplying it by a large coefficient amplifies the error, so we do not want to use large coefficients. Second, depth is very valuable: if bootstrapping, or a circuit, has huge depth, that is bad, because we need a lot of bootstrapping for the whole circuit. This is basically a penalty for compositions of small-degree polynomials; the previous works used compositions of small-degree polynomials for the approximation, and this is a kind of penalty for that. And especially in bootstrapping, the final error is not the approximation error: we still have to do SlotToCoeff, and SlotToCoeff is a linear combination of independent errors, so we need a measure of error that takes SlotToCoeff into account, not just a good approximation by itself. That is very important.

So how does the noisy basis affect the polynomial? Let's see. Say f(x) is an approximation of F_mod; it does not have to be the modular reduction function, it can be any function we want to approximate. We write it as a sum of c_i times φ_i, where the φ_i form an arbitrary polynomial basis. In the CKKS scheme, evaluating f(x) does not actually give f(x), because there is e_basis, the error included in the polynomial basis, so this term is added as well, and that error is multiplied by the coefficients c_i. So the actual error is the approximation error plus the amplified basis error. Usually e_basis is very small, so it is acceptable, but when the c_i become huge, this error can become dominant. So both the approximation error and the coefficients should be small in magnitude.

How do we express this? If we say w_i is the variance of the i-th basis error from the previous slide, that error is multiplied by c_i, so its variance is multiplied by c_i squared. And the approximation error and the basis error are independent, so the total variance is given by this expression, where e_approx is F_mod minus f. What we have to find are the coefficients, and the coefficients represent the polynomial, the polynomial that minimizes this variance. The w_i are determined by the basis error, and the basis error is determined by the rounding error.

When we think about the variance, there is another interesting fact we can use, which is that the distribution of the input is not uniform. We do not know the distribution of the message, because the message distribution may be related to security concerns, but the q·I term has a specific distribution. So it is better to reduce the error well in the region where the probability is high, right? That is exactly what the variance captures. In this graph, the black line is the modular reduction function, and we observe that I follows something like the Irwin-Hall distribution, because by the RLWE assumption it is a sum of uniform random numbers. So around zero it is highly probable, and far from zero it has low probability.
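Here is a minimal numerical sketch of this objective (my own toy version, assuming a made-up bell-shaped input distribution and a single uniform basis-noise weight, so it is not the paper's exact setup). Since the objective is quadratic in the coefficients, the minimizer is obtained by solving a linear system:

```python
import numpy as np

def optimal_coeffs(f, xs, probs, basis, w):
    """Coefficients c minimizing  E[(f(X) - sum_i c_i phi_i(X))^2] + sum_i w_i c_i^2.

    The expectation is taken as a weighted sum over sample points `xs` with
    probabilities `probs`; `w` penalizes amplification of the basis noise.
    The objective is quadratic in c, so setting its gradient to zero gives
    the linear system  (G + diag(w)) c = b.
    """
    Phi = np.stack([phi(xs) for phi in basis], axis=1)   # shape (#points, #basis)
    G = Phi.T @ (probs[:, None] * Phi)                   # G[i, j] = E[phi_i phi_j]
    b = Phi.T @ (probs * f(xs))                          # b[i]    = E[f phi_i]
    return np.linalg.solve(G + np.diag(w), b)

# Toy run: approximate sin(2*pi*x) (a stand-in for the modular reduction
# function) with odd powers, weighting inputs by an assumed bell-shaped
# distribution concentrated around zero.
xs = np.linspace(-1.0, 1.0, 2001)
probs = np.exp(-0.5 * (xs / 0.25) ** 2)
probs /= probs.sum()
basis = [lambda x, k=k: x ** k for k in range(1, 16, 2)]   # x, x^3, ..., x^15
w = np.full(len(basis), 1e-12)
c = optimal_coeffs(lambda x: np.sin(2 * np.pi * x), xs, probs, basis, w)
print(c)
```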
Now let's see why this is good when we consider the error through SlotToCoeff. SlotToCoeff is given by the message times ζ, as I told you, but when we consider the error, the error is also multiplied by ζ and added up. So say e_mod is the error in the slots after the modular reduction. When we do SlotToCoeff, the error e_boot becomes a summation of those errors, so the variance of the error after bootstrapping, after SlotToCoeff, is simply the sum of the error variances after modular reduction. So it directly gives the variance: if we reduce each variance, we get the minimum variance after bootstrapping. However, if we use minimax as the measure of error, then even if we reduce the errors after modular reduction, it only gives an upper bound; it does not guarantee the minimax error after bootstrapping.

And about optimality: let's say P_deg is the set of polynomials of degree less than or equal to deg. Our approximation is direct, so our search space is P_deg itself. However, if we use a composition of polynomials, the search space is much narrower than ours. So the direct approximation has error no larger than the indirect one; for example, we can write an easy inequality here. Another reason our method is nice is that it is easy: it has a simple analytic solution. To find the c that minimizes this value, you can see that both terms are quadratic, so the solution is easily given by a derivative. We find the zero of this derivative, and the zero is given by the solution of a system of linear equations, which is easily solved.

So that is our approximation, and now I'll briefly tell you about our new baby-step giant-step algorithm. Naive evaluation of a degree-d polynomial requires d multiplications, because we have to compute x, x squared, up to x to the d, and sum them all. The baby-step giant-step algorithm proposed by Han and Ki requires about the square root of d multiplications. In baby-step giant-step, the polynomial is recursively divided into smaller polynomials, and then built up again. Given p(x), we divide p(x) by T_k and get a quotient p_0 and a remainder p_1. If we have p_0 and p_1, we multiply T_k by p_0 and then add p_1, and we recover p(x). Those values are obtained recursively.

When we look at the building blocks of the BSGS algorithm, we need ciphertext-ciphertext multiplications, plaintext-ciphertext multiplications, and additions. We observe that plaintext multiplication and addition do not require relinearization: we can multiply by a plaintext, or add ciphertexts, without relinearizing. So here is a very simplified picture of how the lazy BSGS algorithm works. A white box is a relinearized ciphertext, and a blue box is a non-relinearized ciphertext. We start with the degree-one basis element T_1. To get T_2, we square it, right? To get T_3, we multiply T_1 and T_2, but to multiply we have to relinearize T_2. To get T_4, we square T_2. To get T_5, we multiply T_3 and T_2, so we have to relinearize T_3, and so on. After that, we have four relinearized ciphertexts and three non-relinearized ciphertexts. We then have to multiply by the coefficients, and during this multiplication we do not have to worry about the scaling factor, because we can adjust the scaling factor here. We can simply multiply by those coefficients because they are plaintexts, then add everything up, and finally we get a non-relinearized p(x); if we need to, we can relinearize it. So in this simple procedure we save two or three relinearizations, right?
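As an aside before the results, here is a plain, non-homomorphic sketch of the quotient/remainder splitting just described (my own illustration in a power basis, rather than the Chebyshev basis the actual algorithm uses; ciphertext operations and lazy relinearization are not modeled here):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def eval_split(coeffs, x, k):
    """Evaluate a polynomial by recursively splitting it as
        p(x) = q(x) * x^k + r(x),   deg(r) < k,
    the quotient/remainder structure behind baby-step giant-step evaluation.
    `coeffs` are power-basis coefficients, lowest degree first.  In the
    homomorphic version, x^k would be a precomputed basis element and
    each '*' a ciphertext multiplication."""
    if len(coeffs) <= k:                 # low-degree piece: evaluate directly
        return P.polyval(x, coeffs)
    divisor = np.zeros(k + 1)
    divisor[-1] = 1.0                    # the polynomial x^k
    q, r = P.polydiv(coeffs, divisor)    # p = q * x^k + r
    return eval_split(q, x, k) * x ** k + eval_split(r, x, k)

# Sanity check against direct evaluation.
coeffs = np.arange(1.0, 17.0)            # a degree-15 polynomial
x = 0.37
assert np.isclose(eval_split(coeffs, x, 4), P.polyval(x, coeffs))
```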
This is a comparison of the baby-step giant-step algorithms. The green one is the original BSGS algorithm proposed by Han and Ki, and the yellow one is the lazy BSGS algorithm. Oh, I'm sorry: the x-axis is the degree of the polynomial and the y-axis is the number of relinearizations, which is the most dominant computation. The blue line is for odd polynomials; I did not explain it, but our approximate polynomial is odd, and we can use that oddness to reduce the computation.

Now the implementation results and the conclusion. This is a simplified version of our implementation results. Looking at the first row, to achieve 31-bit accuracy we use depth 10, whereas in previous work, to achieve similar accuracy, you need depth 11 or 12; with depth 10 you achieve only 22-bit accuracy. And if we want high precision like 90 or 100 bits, we only need depth 11. So our method is good in terms of depth.

Conclusion. We have proposed an optimal approximate polynomial. It can be applied to modular reduction, and with it we achieved high-precision bootstrapping. We also proposed an efficient algorithm to homomorphically evaluate a high-degree polynomial; it also applies when the degree is not high, and you can see that even a degree-seven polynomial has some gain. By using our method we can reserve more levels after bootstrapping, and we can use those levels for efficient circuit design or to speed up the bootstrapping, whatever we like. Also, since we have high-precision bootstrapping, everything becomes simple: we can directly apply the noise-flooding technique for IND-CPA-D security. Thank you for listening, and I'm happy to take any questions.

Thank you. We have questions for Yong-Woo.

Hello, thank you for your talk. I simply want to ask, how would you compare your lazy BSGS algorithm with the Paterson-Stockmeyer algorithm?

Oh, that's actually a good question. I didn't compare them directly, but in the paper we give the equation for the number of multiplications, so you can simply check it. And as far as I know, for the Paterson-Stockmeyer algorithm we can also easily see how many multiplications are needed.

Yeah, it needs approximately the square root of 2n multiplications. Thank you.

Yeah, and the difference between BSGS and Paterson-Stockmeyer here is that BSGS is good in terms of depth: for the same depth, BSGS can evaluate a slightly larger-degree polynomial, yes.

Do we have another question?

Hello, thank you for your talk. I may have misunderstood something. When you talked about depth 10, did you mean that before doing the bootstrapping you need to have depth 10 left in your scheme, or that after the bootstrapping you have the possibility of doing a circuit of depth 10?

Okay, it is the depth for the polynomial approximation only, so it is basically the depth required for bootstrapping. For SlotToCoeff and CoeffToSlot we use more depth, but that is not the focus of this paper.

Okay, so before doing the bootstrapping, you need depth 10 left to be able to do the multiplications, yes?

Yeah, sure. But before doing the bootstrapping, actually, in CKKS we do not need levels, because we just increase Q, and that is what gives the levels, so we have to increase Q a lot.

Okay. Any other questions? All right, if not, let's thank Yong-Woo and Nathan. That's the end of the session.