harmless, but as we will see, it was not. In fact, their instantiations of Ring-LWE simply leak exact equations in the secret, so it is like learning without errors, and in particular you can just use linear algebra to recover the secret. So this is the conclusion: you can recover the entire secret with near certainty. It remains a legitimate question whether these evaluation-at-one attacks apply, but the main conclusion is that they are not a threat to Ring-LWE as they may have appeared to be. OK. To talk about this, I will have to explain a little bit what Ring-LWE is, and to start with that I will talk about LWE, because Ring-LWE is a ring version of it. LWE is about solving linear systems of equations, and this is modulo p, so over the integers modulo p, over the field F_p. But these equations are not exact equations, because otherwise one could just use the techniques of linear algebra, like Gaussian elimination. These equations are given with some errors. So you can read this as a bunch of linear equations, and in every equation, the b_i that you are given is not the correct outcome: it is the correct outcome plus some error e_i that is sampled from some known distribution and is considered to be very small. It is part of the problem, and this will be important in a second, that the linear part of your system is generated uniformly at random. It is also part of the model that an attacker, a person who wants to solve this problem, can ask for new equations indefinitely. So this is LWE. It was introduced by Regev about 10 years ago. And I have rewritten this approximate version as an exact version by including this error vector here; this is the shape I will continue using.
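The LWE setup just described can be sketched in a few lines. This is a toy sketch only: the modulus, dimension, and the error distribution below are illustrative choices, not parameters from the talk.

```python
import random

# Toy sketch of LWE sample generation; parameter sizes are illustrative only.
p = 97          # modulus (real parameters are much larger)
n = 8           # secret length
random.seed(0)

s = [random.randrange(p) for _ in range(n)]          # the hidden secret vector

def lwe_sample():
    """One noisy equation: b = <a, s> + e (mod p), with a uniform and e small."""
    a = [random.randrange(p) for _ in range(n)]      # uniformly random linear part
    e = random.choice([-1, 0, 1])                    # toy stand-in for a small error
    b = (sum(ai * si for ai, si in zip(a, s)) + e) % p
    return a, b

a, b = lwe_sample()
# Without the error, b - <a, s> would be 0 mod p; with it, it stays small.
residual = (b - sum(ai * si for ai, si in zip(a, s))) % p
assert residual in (0, 1, p - 1)
```

With exact equations (e = 0 always), Gaussian elimination over F_p recovers s from n samples, which is exactly why the errors are essential.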
Okay, so two features of LWE, two of the reasons why this problem is appreciated so much. The first is that it came along with a hardness proof, so to speak: if you can come up with an algorithm for solving LWE, you can also come up with algorithms, and this reduction is quantum, so quantum algorithms, for solving certain well-known versions of lattice problems, shortest-vector-type problems. So this is one reason why it is appreciated. Another reason is that it turns out to be a very versatile building block for a lot of crypto; fully homomorphic encryption is one of the main applications. And as I said already, there is this quantum element in there, so it is believed to be quantum-secure, and for post-quantum crypto it is also being studied. There is one main drawback to LWE, and that is a key-size issue. If you want to hide the secret, which we think of, say, as a vector of length n (let's ignore the size of p for now), you have to use this entire linear system as a public key. So you need a public key of size n squared, at least: m, the number of rows, has to be strictly bigger than the number of columns, since the system has to be over-determined for the problem to be well-defined. But let's think of this as size n squared: you need size n squared to hide something of size n. People have tried to solve this, and that is the goal of Ring-LWE. Let me first start with something I call ring-based LWE. It is on purpose that I do not write Ring-LWE there: what I am going to say now is not exactly Ring-LWE. But the idea behind using rings in LWE is as follows. Instead of thinking of our secret as a vector of length n, we think of it as a polynomial of degree at most n minus 1. So s_0, s_1, up to s_{n-1} is encoded as s_0 plus s_1 times x plus s_2 times x squared and so on. This is just a matter of language.
It's two ways of writing down the same thing. But of course, when you think of polynomials, you can multiply them. Now, if you multiply two polynomials of degree at most n minus 1, you will most likely end up with a polynomial exceeding that degree, so you have to reduce modulo some polynomial to get back to the size you want. And this is a parameter of the system: this f(x), modulo which we are computing. So here we still have the integers modulo p, but now we also work modulo f(x). Once you have set up this multiplicative structure, you have given the secret key space a ring structure; this is basically what happened. And you get a family of matrices that are attractive for our purposes, namely multiplication matrices. If you take a random such polynomial and multiply the secret by it, reducing the outcome modulo f(x), then in terms of the original vector notation this behaves like a matrix multiplication. The idea is to use such matrices. Now, it is very important to be aware that these matrices are not random. But this is the whole point: to encode this entire matrix, which is of size n by n, it suffices to store the polynomial you multiply with, which is of size n. You save a factor n. That is the idea. Just to give you the flavor: if you work modulo x to the n minus 1, then the matrix of multiplication by a looks like this. It is what is called a circulant matrix: you have this first column, and the other columns are obtained from the first by shifting it cyclically. So obviously it suffices to know the first column to know the entire matrix. Now, this is actually a bad example, because of a potential threat that is around, and these are exactly the evaluation-at-one attacks I wanted to talk about. So I will try to be very brief.
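The circulant structure can be checked directly: multiplication by a fixed a(x) modulo x^n - 1 agrees with applying the circulant matrix whose first column is a. A minimal sketch with toy values:

```python
# Sketch: multiplication by a fixed a(x) modulo x^n - 1 acts as a circulant matrix.
# All values here are illustrative.
n, p = 4, 17

def polymul_mod(a, s, n, p):
    """Multiply coefficient lists a, s modulo x^n - 1 and modulo p."""
    out = [0] * n
    for i, ai in enumerate(a):
        for j, sj in enumerate(s):
            out[(i + j) % n] = (out[(i + j) % n] + ai * sj) % p
    return out

a = [3, 1, 4, 1]
s = [2, 7, 1, 8]

# Circulant matrix of multiplication by a: column j is a, shifted down cyclically by j.
M = [[a[(i - j) % n] for j in range(n)] for i in range(n)]
matvec = [sum(M[i][j] * s[j] for j in range(n)) % p for i in range(n)]

assert matvec == polymul_mod(a, s, n, p)
```

The n-by-n matrix M is fully determined by the length-n list a, which is exactly the factor-n saving in key size.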
This is just a standalone remark; if you do not get the details, that is not important. The most important thing is to know that this threat is around. As soon as f(1) is zero modulo p — it may even be exactly zero, as is the case with x to the n minus 1, which is exactly zero if you evaluate it at one — then you have a ring homomorphism. We have this secret key space, and we get a map from it to a much smaller space, F_p, by evaluating these polynomials at one. This is a well-defined map, and one can use it to convert ring-based LWE samples: you can evaluate everything at one, and this behaves well; the condition on f(1) is needed for it to behave well. And because you are now in a much smaller space, you can examine things exhaustively. I will go over this quickly, but you can examine things exhaustively, try to discover non-uniform behavior, and if you play this game in the right way, you can hope to extract information about s evaluated at one. Recall s is here; s evaluated at one is just the sum of its coefficients. So you can hope to play this game to extract some information about the sum of the coefficients. This is this class of attacks. It is not the main point of this talk, so I will not discuss these attacks any further; it is just important for you to know that they exist. One safety measure one could take is to work with an irreducible f(x); this already excludes certain versions of the attack. For instance, x to the n minus 1 is excluded, because it has a factor x minus 1; but the actual condition is that f(1) is zero modulo p, and that is not entirely excluded by the irreducibility condition. In any case, from now on we assume that f(x) does not have non-trivial factors. So what is Ring-LWE then?
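The evaluation-at-one reduction can be sketched as follows: since f(1) = 0 for f = x^n - 1 (so certainly f(1) ≡ 0 mod p), summing the coefficients of each sample collapses it to a one-dimensional noisy equation in s(1). All values below are toy choices:

```python
# Sketch of why evaluation at 1 behaves well when f(1) ≡ 0 (mod p):
# reducing a product modulo f(x) changes it by a multiple of f(x), which
# vanishes at x = 1, so evaluation at 1 commutes with the ring operations.
p, n = 31, 4   # f(x) = x^n - 1, which satisfies f(1) = 0

def ev1(poly):
    return sum(poly) % p   # evaluating at x = 1 means summing the coefficients

def polymul_mod(a, s):
    out = [0] * n
    for i, ai in enumerate(a):
        for j, sj in enumerate(s):
            out[(i + j) % n] = (out[(i + j) % n] + ai * sj) % p
    return out

a = [5, 2, 9, 1]
s = [1, 3, 0, 7]
e = [1, 0, -1, 0]                       # small error polynomial (toy choice)
b = [(c + ei) % p for c, ei in zip(polymul_mod(a, s), e)]

# The sample collapses to a one-dimensional noisy equation in s(1):
assert ev1(b) == (ev1(a) * ev1(s) + ev1(e)) % p
```

Since e is small, e(1) is also small, so an attacker collecting many samples can hope to distinguish the correct guess for s(1) among only p candidates.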
Well, we had this idea of replacing the linear part of our system with a matrix of multiplication. So the direct analog of LWE in this ring-based world would read like this: this vector b equals this matrix of multiplication times the secret plus some error. The direct analog of LWE would be to take these errors e_i independently, chosen at random and small; this is what happens in LWE, so if we want to play this game literally, this is what we would do, and this is what is written here. Take them independently from the same distribution, which is like a normal distribution with a small standard deviation. I have represented this distribution as a sphere, to symbolize the fact that the distribution is the same in every direction; there is no preferred direction. But this is not what Ring-LWE is, and this is important. The people who introduced Ring-LWE also gave a security proof, a hardness statement, but that hardness proof does not apply to this version. In particular, evaluation-at-one attacks are known to work in certain special cases in this setup; this was shown by Eisenträger, Lauter and Stange. This setup is sometimes called Poly-LWE. So what is Ring-LWE then? Well, let's have a look at the samples again. You see that I left some blank space here, because there is something to be filled in there, so let's do that. We plug in an n by n matrix that arises as a product of two n by n matrices: a blue one and a red one. The blue one is the canonical embedding matrix. I will not define what this is, but the basic point is that the ring in Ring-LWE does not necessarily have a canonical representation: you might have two polynomials defining essentially the same ring, and the previous version, the Poly-LWE one, is not robust against these kinds of changes. And so there exists a sort of canonical representation of your ring.
And this is where you generate the errors; this matrix then pulls these errors back to the original setting. That is what this B inverse does. If you have never seen this before, just accept that there is a matrix there. And then there is another matrix, which is multiplication by the derivative f'(x); it is some multiplication matrix. The reason it is there is that in actual Ring-LWE, the authors propose to pick secrets from something called the dual ring. You can reformulate everything in terms of the original ring, but then you have to put this matrix there. So this matrix is basically witnessing the fact that your actual secret space is not the ring I mentioned: you can convert everything back into secrets living in that ring, but then this is the price you pay. Oh, sorry, I pushed the wrong button here. So this is where we were. There are two matrices: one for going back from this canonical world to the world we are working in, and one to compensate for the fact that we are sampling our secrets in the ring rather than its dual. And for this version, a hardness statement was proven. Now, we had this sphere before; now I draw an ellipsoid, because this factor, this matrix we put in front here, might completely skew the spherical distribution. The e_i are still thought of as being sampled independently from the same distribution, so they are spherical, but by applying this matrix to them we completely skew this. The resulting errors in each coordinate might not be independent, and some might be bigger than others; this is represented by the ellipsoid here. And it not only skews things, it also scales them up, which is also very important. As we will see in a second, this product of the two matrices is a big matrix: on average, it scales things up a lot. So this, yeah, thank you, this is what I will explain now.
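The skewing effect can be illustrated numerically: pushing spherical Gaussian noise through a fixed anisotropic matrix produces wildly different standard deviations per coordinate. The diagonal matrix below is just an illustrative stand-in for the product of the two matrices, not the actual one from the talk:

```python
import numpy as np

# Sketch: applying a fixed matrix to spherical Gaussian noise skews it, so the
# per-coordinate standard deviations can differ wildly. The matrix is a toy
# stand-in for B^{-1} times the multiplication-by-f'(x) matrix.
rng = np.random.default_rng(0)
n = 3
spherical = rng.normal(0.0, 1.0, size=(n, 100_000))   # same sigma in every direction
skew = np.diag([100.0, 1.0, 0.01])                    # strongly anisotropic map
errors = skew @ spherical

stds = errors.std(axis=1)
# Some coordinates blow up, others nearly vanish:
assert stds[0] > 50 and stds[2] < 0.5
```

A real instance uses a dense, non-diagonal matrix, but the picture is the same: the sphere becomes an ellipsoid with very unequal axes.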
So let's have a better look at these matrices. One can prove that the determinant of this multiplication-by-f'(x) matrix is a quantity called the discriminant; if you know what the discriminant of a polynomial is, it is basically that, up to absolute value. It is important to know that in practice this is a huge quantity, very big in general. So that is the determinant of this guy, and it scales things up by a lot. On the other hand, B inverse scales down: the determinant of B is the square root of delta, so the determinant of B inverse is one over the square root of delta. And if you do the math, on average you have one over the square root of delta times delta, so you get the square root of delta. So on average, your errors are being scaled up. I put the square root of delta to the power one over n here, because there are n directions and the determinant takes the product over all of them, if you want. So in the average direction, your errors are being scaled up by this quantity under multiplication by this matrix. But this is only on average; this is very important, because we still have this extreme skewness: in some directions the scaling might be completely different than in others. OK, so what did the people from the "provably weak instances of Ring-LWE" paper do? Well, for convenience, they ignored the fact that we actually have to sample our secrets from this dual space, and sampled secrets in the original ring. It is not completely clear to us how harmful this is; you can try it. But basically, if you naively remove this matrix, the one expressing the fact that we have to pick secrets from the dual space, then you have a problem. Because recall, the determinant of the remaining matrix is one over the square root of the discriminant, so it scales down.
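The discriminant claim can be sanity-checked on a small example: for a monic polynomial f, the discriminant equals the square of the Vandermonde determinant of its roots, and a Vandermonde matrix of roots is exactly the kind of matrix the canonical embedding produces. A numerical sketch with f = x^3 - 2, whose discriminant is -108:

```python
import numpy as np

# Sketch: the discriminant as the square of a Vandermonde determinant.
# For monic f, disc(f) = prod_{i<j} (r_i - r_j)^2 = det(V)^2, where V is the
# Vandermonde matrix of the roots r_i. For f = x^3 - 2 the discriminant is
# -4*0^3 - 27*(-2)^2 = -108.
roots = np.roots([1, 0, 0, -2])          # roots of x^3 - 2 (one real, two complex)
V = np.vander(roots, increasing=True)    # rows (1, r_i, r_i^2)
disc = np.linalg.det(V) ** 2
assert abs(disc - (-108)) < 1e-6
```

For the polynomials in the attacked examples, n is in the hundreds and the discriminant is astronomically large, which is why the average scale-up factor matters so much.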
And the discriminant is a huge quantity, so you scale down massively, and you risk that your errors become almost nothing, so that if you round your equations you get an exact linear system that you can solve using linear algebra. So that is definitely not a good idea. What remedy did they take? Well, to compensate for the scaling down by one over the square root of delta, they scaled up again by the square root of delta. This compensates, but only on average. So maybe a look ahead: remember that this matrix had determinant delta, so from the Ring-LWE point of view a more natural choice would have been to scale up by delta instead of by the square root of delta. But they only scaled up by the square root of delta, which takes care of things on average. So this is the version they took, and this is what I am saying here: this factor compensates on average. But remember the skewness: it might be that B inverse scales things down in one direction much more than in others, and if you then scale everything up by the same factor in every direction, in some directions this compensation factor might be far insufficient. And this is indeed what happens. As we will see on the next slide, in some directions, for instance in the last so many coordinates, scaling up by the square root of delta to the one over n is far insufficient, and your errors will still be very small there. So here is a graph showing this for a concrete example they gave. I will ignore the size of the Gaussian from which one starts; it is something like 8 in that case, something of order 1, say. They worked modulo this polynomial and modulo this prime. Remember that f(1) is indeed 0 modulo p, which they needed for the evaluation-at-one attack; for us this is not relevant. But the crucial observation is that for this polynomial, the ellipsoid is extremely skew.
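The average-versus-worst-direction point can be made concrete: rescaling an anisotropic set of per-direction factors by their geometric mean restores size 1 on average, but the weak direction stays tiny. The numbers below are toy stand-ins for the per-direction scalings of B inverse:

```python
import numpy as np

# Sketch: compensating an anisotropic scale-down with one uniform scale-up.
# The diagonal entries stand in for the per-direction scalings; their geometric
# mean is what a determinant-based compensation factor (like sqrt(delta)^(1/n))
# restores.
down = np.array([100.0, 1.0, 0.0001])        # per-direction scaling (toy numbers)
geo_mean = down.prod() ** (1 / 3)            # the "average" compensation factor
compensated = down / geo_mean

# On average (in the geometric-mean sense) the errors have size 1 again...
assert abs(compensated.prod() ** (1 / 3) - 1.0) < 1e-9
# ...but in the weak direction they remain tiny, so rounding will kill them:
assert compensated.min() < 1e-3
```

This is exactly the failure mode: the uniform factor is correct in the determinant sense while leaving some coordinates far below the rounding threshold.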
So in some directions this will be very big, and in other directions it will be very, very small, and this graph illustrates that. Of course, what I drew is a three-dimensional ellipsoid, but we are in 256-dimensional space, so we cannot draw the actual ellipsoid; we represent it like this. For instance, in the coordinate corresponding to the constant term, one dimension, say, we have a distribution — this is an empirical plot, but you can make it work in theory also — with basically standard deviation 400. But in another direction, say the one corresponding to x to the 200, we basically have 0. Well, that is because the scale is so small, so let's zoom in. For instance, in the 210th dimension, the ellipsoid is such that the standard deviation in that direction is something like 0.3. Now have a look at this line at height one half: if we round our samples, then everything below one half becomes identically 0. The brown line is three times the standard deviation, so with 99.9% certainty your errors will be below it, and hence definitely below this black line at one half. So if you round, then with very high probability you round these down to 0, and you get exact equations in these last coordinates. So let's have a look here: if you work this out, look at the last so many coordinates, and round them, they become 0 with very, very high probability. And this gives you exact equations in the secret: the last so many equations of the linear system are exact. So if you have enough samples, you can just break the system using linear algebra. So this is the summary: evaluation at 1 allowed these people to recover s(1), as we announced.
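The final linear-algebra step can be sketched with toy numbers: once the error in an equation is below one half in absolute value, rounding removes it entirely, and a handful of exact equations determine the secret, here via Cramer's rule modulo p. Everything below is illustrative, not the talk's actual parameters:

```python
# Sketch of the rounding step: where the skewed error is (almost surely) below
# 1/2 in absolute value, rounding b_i recovers the exact equation
# <a_i, s> = round(b_i) mod p.
p = 97
s = [5, 11]                                    # toy secret of length 2
samples = [([3, 4], 0.31), ([7, 2], -0.18), ([1, 9], 0.07)]

exact = []
for a, e in samples:
    b = (a[0] * s[0] + a[1] * s[1]) % p + e    # noisy, but with a tiny error
    exact.append((a, round(b) % p))            # |e| < 1/2, so it rounds away

# With enough exact equations, plain linear algebra mod p recovers s;
# for 2 unknowns, Cramer's rule on two equations suffices.
(a1, b1), (a2, b2) = exact[0], exact[1]
det = (a1[0] * a2[1] - a1[1] * a2[0]) % p
det_inv = pow(det, -1, p)                      # modular inverse (Python 3.8+)
s0 = ((b1 * a2[1] - b2 * a1[1]) * det_inv) % p
s1 = ((a1[0] * b2 - a2[0] * b1) * det_inv) % p
assert [s0, s1] == s
```

In the actual attack only the last block of coordinates of each sample yields exact equations, so several samples are stacked until the exact part of the system has full rank.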
And they had to use about 20 samples, with a success rate of about 20%. But after rounding, the last n over 7 equations roughly become exact, so if you have 7 or 8 samples, you basically have enough equations to apply linear algebra, and this then allows one to recover the secret exactly. Similar remarks apply to their other examples. OK, so let me conclude with some thoughts. Currently, evaluation at 1 is not a threat, contrary to what they claim, but it is still an interesting question. If you remedy their examples, if you scale things up properly, then they are no longer examples; the work would have to be redone, and it is still not entirely clear whether that is possible or not. If it were possible, it would be very interesting. You might say the problem lies in the fact that they used this non-dual version, that they removed this matrix, so that they sampled the secrets from a non-dual space. But the skewness is not really a phenomenon specific to this non-dual world: this matrix was very skew, but this one can also be extremely skew. So to us, it is more a matter of insufficient scaling than of dual versus non-dual. Of course, the fact that they made this mistake was partly because of the confusion there. But still, there is a related question: if we had scaled by this instead of by the square root of that, to compensate for the removal of this factor, would we then get an equally hard, a provably hard, version of Ring-LWE or not? Another interesting remark: if you look here, these errors are very small, but these are very big. If you now scale things up properly, so as to make these big enough, then those become huge, and because we are working modulo p, things start wrapping around modulo p, and the first errors become indistinguishable from uniform, which for most LWE or Ring-LWE applications is a problem. And so these polynomials are useless for most applications.
That's maybe an interesting observation. And the last thing is that the cyclotomic case — cyclotomic polynomials are the main candidates for finding their way into cryptographic practice, I guess — seems naturally protected against this kind of geometric growth. I will not go into the details, but for these polynomials, this insufficient scaling would probably still have been sufficient. I will stop here. All right, questions. What happens to your work if you just look at Poly-LWE and get rid of the B matrix entirely as well? Yeah, then we cannot conclude anything. Poly-LWE is a bit more secure against insufficient scaling, if you want, because you do not have this matrix B that skews everything. But there is no security proof for it. Have you observed the geometric growth for other rings? No, we have not investigated that. For x to the n plus x plus 1, for instance: if all your roots are roots of unity, then you basically do not have this geometric growth. As soon as the zeros of your defining polynomial are not roots of unity, this B matrix is essentially a Vandermonde matrix, or the inverse of a Vandermonde matrix. You have all these exponents there, and as soon as you plug in roots that are not roots of unity, you get exponential growth in that matrix, and this is what is reflected in these graphs. So it would be interesting, maybe, for x to the n plus x plus 1. All right, let's thank the speaker again.
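The closing point about Vandermonde growth can be illustrated numerically: the entries r^k of a Vandermonde matrix stay bounded when the roots r are roots of unity, and grow geometrically in the column index as soon as |r| differs from 1. A small sketch comparing two toy polynomials:

```python
import numpy as np

# Sketch: Vandermonde entries r^k are bounded for roots of unity, and grow
# geometrically otherwise. Toy comparison of x^n - 1 and x^n - 2.
n = 32
unity_roots = np.roots([1] + [0] * (n - 1) + [-1])   # x^n - 1: all on the unit circle
other_roots = np.roots([1] + [0] * (n - 1) + [-2])   # x^n - 2: |r| = 2^(1/n) > 1

V_unity = np.vander(unity_roots, increasing=True)
V_other = np.vander(other_roots, increasing=True)

assert np.abs(V_unity).max() < 1.01                  # entries |r^k| = 1
assert np.abs(V_other).max() > 1.8                   # |r|^(n-1) = 2^((n-1)/n) ≈ 1.96
```

The further the roots sit from the unit circle, the faster the entries grow, which matches the protected status of cyclotomic polynomials mentioned in the answer.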