Today, what I want to do is go through the basics of how fully homomorphic encryption works. Fully homomorphic encryption, as I'll get into a bit later, is a kind of encryption that allows you to run computations on encrypted values. There are a lot of things that are really interesting about this. You can do privacy-preserving computation really easily. It's also an important building block in some multi-party computation protocols, and in a lot of the proposed obfuscation protocols that people are starting to come up with. And the interesting thing with fully homomorphic encryption is that it sounds scary. There's a general impression that it's this very complex form of cryptography that only got figured out a couple of years ago, and so it must be really complicated to understand. It turns out it's completely not. Fully homomorphic encryption protocols are, I would say, literally simpler to fully understand than even a lot of elliptic-curve-based constructions. Now, the reason we're not using fully homomorphic encryption for everything is basically that it's not very efficient: it has high overhead, and its ciphertexts are very large. I'll get into why this is the case later as well. So to start off, let's talk about what fully homomorphic encryption is. The goal of fully homomorphic encryption is to have an encryption scheme where you can compute on encrypted data. If you have an encryption of x, you want to be able to turn that into an encryption of f(x) for some function f without being able to decrypt — so without being able to determine x or f(x). You can compare this to the partially homomorphic encryption protocols that we know about already.
So if you think about elliptic curve points — now, this doesn't technically satisfy the definition of fully homomorphic encryption, because it's not an encryption scheme, it's just a group that has a homomorphism — but if you look at elliptic curve points, if you take some number x and multiply it by the generator point, and then you add the generator multiplied by y, you get the generator multiplied by x + y. Elliptic curve points have this really nice distributive property, and there are a lot of protocols that make use of it. In the case of blockchains, for example, deterministic wallets with master public keys rely on these elliptic curve homomorphisms; zero-knowledge proof protocols based on elliptic curves, things like Bulletproofs, and even plain signatures rely on this additive homomorphism. So it already provides a lot of value. And if instead of adding you want to be able to multiply, then you can just use RSA. With simple RSA encryption, the ciphertext of x is x^e, and (xy)^e = x^e · y^e. So if you have the ciphertext for x and the ciphertext for y, you can create the ciphertext for xy. So we have one family of schemes that gives us additive homomorphism, and another family that gives us multiplicative homomorphism. The question is: can we come up with a scheme that lets us do both? Can we make ciphertexts that you can both add and multiply? Because if you can add and multiply, then you basically have general-purpose computation: if you restrict your ciphertexts to just being zeros and ones, then you can do pretty much every logic gate. If you want to do AND, then AND of x and y is just x times y.
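As a tiny illustration of the RSA multiplicative homomorphism just mentioned, here is textbook RSA with made-up toy parameters (the modulus and exponent below are purely illustrative and far too small to be secure):

```python
# Textbook RSA with toy, insecure parameters, purely to show that
# ciphertexts multiply: Enc(x) * Enc(y) = Enc(x * y) mod n.
n = 3233            # 61 * 53 -- illustration only, never use sizes like this
e = 17              # public exponent

def rsa_enc(m):
    return pow(m, e, n)

x, y = 5, 7
# (x^e) * (y^e) = (x*y)^e  (mod n)
assert (rsa_enc(x) * rsa_enc(y)) % n == rsa_enc(x * y)
```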
If you want to do OR — well, for OR you can invert x, invert y, do an AND, and then invert again, or you can use the simpler formula x + y − xy. If both inputs are zero, this is obviously zero. If x is one and y is zero, you have a one, plus a zero, minus a zero, so you get one. If y is one and x is zero, it's symmetric, so you also get one. And if x and y are both one, you get one plus one minus one, so again one. For XOR, you can do either just x + y — if you don't care about the values staying within the range zero to one and only care about them being even or odd — or, if you do care about staying within zero and one, x + y − 2xy. So if x and y are both one, it's one plus one minus two, and you get zero. So if you can add and multiply, then you can do pretty much whatever logic gates you want, and you can do arbitrary computation. Now, it's not going to be efficient arbitrary computation, because if you want to do math on numbers, you pretty much have to decompose those numbers into bits, turn your addition and other operations into circuits, and then evaluate the circuits — so you can see how the complexity multiplies. But you can do it. To give the intuition for how these FHE protocols work, we'll start with a toy protocol. This is, I believe, technically secure under some very, very unfavorable key sizes and complexity, but it's the way people first figured out how to do both addition and multiplication.
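The gate formulas above are easy to sanity-check with plain integer arithmetic:

```python
# The gate formulas from the text, using only addition and multiplication
# over bits (0/1).
def AND(x, y): return x * y
def OR(x, y):  return x + y - x * y
def XOR(x, y): return x + y - 2 * x * y

# Check all four input combinations against Python's native bit operators.
for x in (0, 1):
    for y in (0, 1):
        assert AND(x, y) == (x & y)
        assert OR(x, y)  == (x | y)
        assert XOR(x, y) == (x ^ y)
```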
So imagine your private key is a big prime P — think of it as a really big prime number. To encrypt a message, where the message is restricted to either zero or one, you take a big random number, multiply it by P, then add a small random number multiplied by two, and add your message. So what we have is a multiple of P offset by some noise, with the message hidden in the least significant bit of the noise. If you want to decrypt, you decrypt the ciphertext with the key by computing the ciphertext mod P — which kicks out the multiple-of-P term — and then mod two, which kicks out the noise term, and you have your message left. By the way, if people have questions at any point, please feel free to ask; I'm happy to answer. So why is this scheme additively homomorphic? Why is this approach — where you encrypt a message by taking a big multiple of your key, adding some noise, and hiding your message in the least significant bit of the noise — additively homomorphic? Well, imagine you have two ciphertexts. Each of these ciphertexts is a multiple of P, plus some noise, plus a message, and these terms naturally distribute. So CT1 + CT2 actually is a ciphertext — a ciphertext in the correct format — because you have a multiple of P, namely (K1 + K2)·P; the even parts of the noise add up, so 2R1 and 2R2 become 2(R1 + R2); and your messages add up, so you have M1 + M2. And note that this addition is mod two.
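A minimal sketch of this toy scheme — all parameter sizes here are made up and far too small to be secure; they just show the mechanics of encrypting and decrypting a bit:

```python
import random

# Toy approximate-GCD-style scheme: ciphertext = k*P + 2*r + m,
# decryption = (c mod P) mod 2.  Parameters are illustrative only.
P = 10007                             # secret key: a (here tiny) prime

def enc(m, P):
    assert m in (0, 1)
    k = random.randrange(1, 2**20)    # big random multiplier
    r = random.randrange(0, 10)       # small noise (2r + m must stay << P)
    return k * P + 2 * r + m

def dec(c, P):
    return (c % P) % 2                # mod P kills k*P, mod 2 kills 2r

assert dec(enc(0, P), P) == 0
assert dec(enc(1, P), P) == 1
```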
So if your first message is one and your second message is one, then this is two — but that two is indistinguishable from just being part of the noise, and so it becomes the same as zero. If we just want to do binary circuits, then this is totally fine: you can add two ciphertexts and get a ciphertext in the correct format for the XOR of the message bits. Now, why is it multiplicatively homomorphic? Say you multiply your two ciphertexts. Then you have an expression of the form (K1·P + 2R1 + M1) · (K2·P + 2R2 + M2), and you fully expand it. When we expand it, first let's figure out which terms are multiples of P — because if you remember the decryption process, it starts by reducing mod P, which kills all your multiples of P, so we don't have to care about them. You have K1·P multiplied by all three terms of the second factor, and K2·P multiplied by all three terms of the first, so five terms in total are multiples of P. Next we figure out which of the remaining terms are even: 2R1 multiplied by the rest of the second factor, and 2R2 multiplied by the rest of the first — a bunch of terms that get multiplied by two. And then there is one term left that is not a multiple of P and not a multiple of two, which is just the product of the messages, M1·M2. So if you take this expression, reduce mod P, and then mod two, the mod P kills those five terms, the mod two kills those three terms, and you're left with just M1·M2. So we see that this is additively homomorphic and multiplicatively homomorphic. Now, why does it have any level of security?
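Continuing the toy scheme, we can check both homomorphisms directly — ciphertext addition decrypts to XOR and ciphertext multiplication to AND, as long as the noise stays far below P (toy parameters again):

```python
import random

# Same toy scheme: c = k*P + 2*r + m.  After one multiplication the
# non-multiple-of-P part is at most (2r+m)^2, which must stay below P.
P = 10007

def enc(m):
    return random.randrange(1, 2**20) * P + 2 * random.randrange(0, 10) + m

def dec(c):
    return (c % P) % 2

for m1 in (0, 1):
    for m2 in (0, 1):
        c1, c2 = enc(m1), enc(m2)
        assert dec(c1 + c2) == m1 ^ m2   # additive homomorphism (XOR)
        assert dec(c1 * c2) == m1 & m2   # multiplicative homomorphism (AND)
```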
Basically, it's the approximate GCD problem. If you did not have the noise — if you did not have the two-times-small-random term — then you would have a bunch of multiples of P plus zero or one, and some of them are going to be exact multiples of P. Then you can just use any GCD algorithm — say, the Euclidean algorithm — GCD all of these multiples together, get P, and break the scheme. But as it turns out, if you only have approximate multiples, then figuring out what P is becomes much harder. So the security of this scheme is based on the hardness of computing approximate GCDs, as opposed to exact GCDs, which is easy. There's a more general principle here, which is that solving systems of equations becomes much more difficult when those equations have errors — or noise, if you prefer. If you add some noise into your equations, then parameters you could normally solve for easily suddenly become much harder to find, and this is pretty much what the security of all of these constructions ends up being based on. As a side note: public key encryption. I showed this scheme as a secret key scheme — you need P to encrypt. Turning these schemes into public key schemes is really easy: you publish a bunch of encryptions of zero, and then you can public-key encrypt zero by computing a random linear combination of these, plus some more error. And if you want to encrypt one, you can just take such an encryption of zero and add one.
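To see why the noise is essential, note that without the 2r term every encryption of zero is an exact multiple of P, and a plain GCD over a handful of samples recovers the key (toy numbers again):

```python
import math
import random

# Noiseless "encryptions of zero" are exact multiples of P, so their GCD
# leaks the secret key.  With the 2r noise term this attack breaks down.
P = 10007
noiseless = [random.randrange(1, 2**20) * P for _ in range(5)]

recovered = 0
for c in noiseless:
    recovered = math.gcd(recovered, c)

# With a few random samples, the GCD is P itself with high probability;
# it is always a multiple of P.
assert recovered % P == 0
```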
If you want something more generic, then you can add some encryptions of one to your public key, and to encrypt one you add a bunch of encryptions of zero plus an odd number of encryptions of one, and you have a new encryption of one. So turning any scheme of this form from private key into public key is pretty easy; that's not the problem. Now, why doesn't this already work? There are two problems. One is that multiplication doubles the length of the ciphertext: here you had K1·P and K2·P, and in the product the leading term is K1·K2·P² — K1·K2 times P squared. So the bit length of the product is going to be the sum of the bit lengths of the incoming ciphertexts, and if you try to do more than a couple of multiplications it blows up horribly and becomes impractical. You can't do more than a logarithmic number of multiplications, period. But there is another problem, which is overflow of the errors. If you remember, the sum of two ciphertexts turns into a new ciphertext of the same format, and so does the product — so look at what happens to the even error terms. When you add, the error roughly doubles: you get the sum of the errors. When you multiply, the error squares: where you had R1 and R2, you now have 2·R1·R2, so the bit length of the error doubles. So once again, the maximum multiplicative depth is going to be the log of the number of bits in P — that is, log log P. Think of the maximum multiplicative depth as being really low, something like ten or even five; it's tiny. So how do we overcome this overflow problem? There are two general categories of solutions.
The first category of solutions is to do clever tricks so that multiplying only increases the error by a constant factor. Then, instead of multiplicative depth log log P, you get multiplicative depth proportional to log P — proportional to the number of bits in P. If you imagine a protocol where every multiplication increases the error by some fixed factor, say a factor of a thousand, then that's only ten bits. And if your modulus has a bit length of, say, 10,000, then your multiplicative depth goes up to 1,000, which is already huge. So that's one solution. The second kind of solution is bootstrapping. Bootstrapping is basically a key switching mechanism. Imagine you have some ciphertext that has some noise, and you want to convert it into a fresh ciphertext that has less noise. What we'll do is take the circuit that represents decrypting x under some secret key, and evaluate it homomorphically. So we take x — x is a ciphertext — and represent x as a bunch of bits. And we're going to provide a bootstrapping key; think of this as being part of the public key of the system. First, you can encrypt the bits of x under some new key s2, and second, we provide encryptions of the bits of s under the key s2. What this means is that, for some key s2, you have the regular public key for s2, and you also have an encryption of s under s2.
And so, given the value of x in the clear, you convert it into its bits, then into encryptions of the bits of x. Now you have encryptions of the bits of x and encryptions of the bits of s, and you just compute Dec homomorphically — you compute the decryption process using these inputs. The decryption process returns one bit, so what this gives you is one bit: the decryption of x. It's the same bit that x represents, except it comes out encrypted under the key s2. So before, we had x, which is an encryption of some bit under s, and now we have an encryption of that same bit under the key s2. And the important thing is that x might have had a lot of error, but this result is only going to have a fixed amount of error, regardless of how much error there was in x. The reason is that, from the perspective of the key s2, all of these input bits start off as ciphertexts at the minimum error level — they start at depth one — and the decryption circuit has some constant depth. So the amount of error in the result is going to be constant, even if there was a lot of error in the original ciphertext. Before I continue, maybe I should offer an opportunity for some questions, because bootstrapping is a little bit complicated to understand. So maybe just think for a bit and make sure you understand what's going on here. One question I have regarding bootstrapping: is there randomness involved, and where does it come from? So bootstrapping just means that you're evaluating the decryption circuit.
The place where randomness is involved is that the process of homomorphically encrypting these bits uses values from the public key, and those values already have some noise inside of them. The process of executing the circuit after you have those values doesn't require you to add any extra randomness, but the bootstrapping key already has randomness that got used in the process of constructing it. Right, that makes sense. Could you describe what exactly — sorry — what's the output of the decryption function? What does it look like when it's evaluated as a circuit? Right, so the output of the decryption function is either the bit zero or the bit one, depending on what the ciphertext X represents. Right, but the thing I'm having trouble with is that I don't want the one evaluating the circuit to actually see those bits, right? Correct. The reason the person evaluating the circuit cannot see those bits is that they do not have S. Computing the decryption requires you to have X and requires you to have S. In the bootstrapping key, you're not going to give them S — you're going to give them an encryption of the bits of S under S2. So the only thing they can do is evaluate the decryption circuit in a way that gives them outputs encrypted under the key S2. They have no way of extracting outputs in the clear. I see, so they don't actually get to run Dec in the clear — they get the decryption circuit, some representation of it, and the bits of S encrypted using S2, and they run this circuit using those bits. Exactly. They have S encrypted under S2, they have X in the clear, so they just encrypt X under S2, and Dec is just a circuit that's part of the algorithm.
So everyone has it, and you just evaluate Dec homomorphically using bits encrypted under S2, and you get zero or one encrypted under S2. Right, so what you write below this — Enc of Dec — that's not actually something I have to do; I only have to run Dec homomorphically, and that automatically gives me the encryption. Okay. Exactly, that's just what it's equivalent to. Yeah, right, okay. So what is K? K is just whatever value is being encrypted. All right, thank you. Once you've re-encrypted this, why can't you backpropagate the new lower error — the re-encrypted form with the lower error? Why can't you backpropagate that lower error and gain a log factor, or apply the constant-growth multiplication trick through every previous multiplication again? How would you backpropagate the lower error? I don't know, I haven't tried to play with this yet, but intuitively it feels like there may be some method of doing that. No, I think the problem is that lower error plus higher error equals higher error, and generally, for any function, if you throw in a lower-error thing and a higher-error thing, you get a higher-error thing out. Because the errors are just randomly generated, you're not going to be able to make them cancel in some way. Okay, right. So, one subtle technical point. This X is encrypted under S, and here we're saying we encrypt it under S2. Technically, you could just set S equal to S2: you could encrypt the bits of X under S, and have the bootstrapping key be the bits of S encrypted under S itself. This makes things a lot easier, because instead of having an entire chain of keys with a chain of bootstrapping keys, you can have just one key. So why, in the standard descriptions of bootstrapping, don't people do that? Basically, for weird technical reasons: you can't prove a security reduction straight to the lattice assumptions.
So there's this special term for it — it's called circular security — and it's your choice whether or not to trust circular security, depending on whether you're one of those cryptographers who says no, you have to have proofs, or one of those who says yeah, whatever, it's fine. A lot of people think it's fine. I have a question: this S2 is also a public key, so the corresponding private key isn't known? You just have to use several public keys of the recipient for this. Right. What is the circuit depth of the decryption? Good question, and I'm about to get to it on the next slide. The main problem is that the multiplicative circuit depth of this decryption involves a modular reduction, and X mod P has a circuit depth of log log X, which is bigger than log log P — higher than what this FHE scheme can support. So basically, one of the reasons nobody cares about the scheme I just spent ten minutes describing is that bootstrapping is not possible for it: in general, bootstrapping has a circuit depth which is logarithmic. So the next parts of this are going to be about how you adjust the scheme so that you actually can bootstrap. We're going to skip over a few things. Craig Gentry came up with some really clever tricks in 2009, involving subset sums and other things I don't understand, to try to get around this problem — to reduce the circuit depth of bootstrapping to a point where you actually can bootstrap in fairly simple FHE schemes. Craig Gentry is great and we love Craig Gentry — and he's going to come back in the story; he is one of the people who created the matrix FHE protocol — but we're going to go straight to Zvika Brakerski and Vinod Vaikuntanathan's work in 2011, which is the first scheme that really managed to get around this problem.
Right, so the schemes I'm going to show from here on don't depend on approximate GCD. They depend on a slightly different but similar-seeming assumption, which is the learning with errors assumption. The learning with errors assumption basically says that if you have a system of linear equations — a bunch of variables, a bunch of equations — normally you can solve it using Gaussian elimination, and it's fairly easy. But if you only have approximate equations — if every equation has a small additive error, even when the system is heavily over-specified — then finding any of the variables becomes very hard. So solving systems of linear equations whose outputs have errors in them is hard, and, as far as we know, quantum hard. These mechanisms also reduce to the hardness of lattice problems — the shortest vector problem and other problems that are fairly well studied in mathematics. So the schemes from here on are not going to depend on approximate GCDs; they're going to depend on the hardness of solving approximate systems of linear equations. Let's get to BV 2011. The key is going to be a vector of numbers, S1, S2, up to SN, and the ciphertext is going to be a vector A that satisfies the property that A dot S equals the message plus an even error, all modulo some odd Q. Q doesn't have to be prime, but it has to be odd. So basically you generate an A such that A1·S1 + A2·S2 + ... + AN·SN = M + 2E. The connection between this and the learning with errors problem is that you can think of each of these ciphertexts as an equation, with your secret key S1, S2, S3, S4, and so on as the variables.
The coefficients in the equation are the A's, and the output is some small number — zero or one depending on your message, plus some error — and that error masks the values, which makes it hard to extract S. So your ciphertext, if you're comfortable thinking in terms of vectors and dot products, is just a vector A such that A dot S equals your message plus two times an error; or you can think of it as one noisy linear equation. One optimization is that you can set the first number in your secret key to one, which makes it very easy to construct these A's: you generate all the other elements of A randomly, then set A1 to compensate for whatever the rest of the dot product gives. That is, you construct A1 to be the negative of the rest of the dot product, so the whole thing cancels to zero, and if you additionally add M + 2E into A1, then the whole dot product just becomes M + 2E. So constructing vectors that satisfy this equation is very easy. As I mentioned, an alternative interpretation of this vector view is that a ciphertext is a noisy equation whose variables are your secret key. Addition is easy: if you have AL such that AL dot S equals one message plus an even error, and AR such that AR dot S equals another message plus an even error, then dot products are additive, so (AL + AR) dot S is just the sum of your messages plus the sum of your errors. And the way you decrypt, obviously, is you actually compute A dot S mod Q and take the last bit. So you can decrypt the sum: you get twice the combined even error, plus the sum — or rather the XOR — of your two messages.
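Here is a sketch of this LWE-style ciphertext with made-up toy parameters (far too small to be secure), using the S1 = 1 trick to construct A, and checking that ciphertext addition decrypts to the XOR of the bits:

```python
import random

# BV-style toy ciphertext: vector a with  a . s = m + 2e  (mod q),
# where s[0] = 1 and 2e is a small even error.  Illustrative sizes only.
q = 2**16 + 1                 # odd modulus
n = 8                         # key length
s = [1] + [random.randrange(q) for _ in range(n - 1)]

def enc(m):
    a = [0] + [random.randrange(q) for _ in range(n - 1)]
    e = random.randrange(0, 8)                       # small noise
    rest = sum(ai * si for ai, si in zip(a, s)) % q
    a[0] = (m + 2 * e - rest) % q                    # force a.s = m + 2e
    return a

def dec(a):
    t = sum(ai * si for ai, si in zip(a, s)) % q
    return t % 2              # works while the noise stays well below q

for m1 in (0, 1):
    for m2 in (0, 1):
        c1, c2 = enc(m1), enc(m2)
        csum = [(x + y) % q for x, y in zip(c1, c2)]
        assert dec(c1) == m1
        assert dec(csum) == m1 ^ m2     # addition decrypts to XOR
```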
So addition here is simple; multiplication is harder. The problem is that in the scheme I showed you before, you're operating with single numbers, and single numbers can be added and multiplied. Here you're operating on vectors, and vectors can't just be multiplied with each other. Well, they kind of can: there's this notion of a tensor product (the outer product), where a vector multiplied by another vector is a big vector that contains all the pairwise products of their elements. If AL is a vector with ten elements and AR is a vector with ten elements, the tensor product is a vector of a hundred elements: the first times the first, the first times the second, the first times the third, all the way up to the first times the tenth, then the second times the first, and so on, up to the tenth times the tenth. Now, there's a nice algebraic identity that says (AL tensor AR) dot (S tensor S) equals (AL dot S) multiplied by (AR dot S). Remember that AL is one ciphertext, AR is another ciphertext, and the product of the decryptions, M1 times M2, comes from the first dot product times the second. So because of this identity, if you take two ciphertexts and tensor them — turn them into this big vector containing all possible products of their elements — that becomes a valid ciphertext of M1·M2 under the key S tensor S. So you can multiply, but the length of your secret key grows quadratically, and so the challenge is that you need a way of going back to linear.
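This identity is easy to verify with plain integers (in the real scheme everything is additionally reduced mod Q; the specific vectors below are just example values):

```python
# (AL tensor AR) . (S tensor S) == (AL . S) * (AR . S)
def tensor(u, v):
    # all pairwise products, row-major order
    return [ui * vj for ui in u for vj in v]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

aL, aR = [1, 2], [3, 4]
s = [5, 7]

assert tensor(aL, aR) == [3, 4, 6, 8]
assert dot(tensor(aL, aR), tensor(s, s)) == dot(aL, s) * dot(aR, s)
```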
To go back to linear, we're going to have a procedure called relinearization. Relinearization feels a bit like bootstrapping, because what you do is you have encryptions of the key under a new key, and then you evaluate things under that new key — but it's doing something slightly different. Relinearization does not solve the error problem; it only solves the problem of taking a ciphertext under the key S tensor S and turning it into an encryption under S. The relinearization key is going to contain encryptions of Si·Sj times 2^d: it contains Si·Sj for all i and j — all elements of S tensor S — and not just those elements, but also those elements times two, times four, times eight, and so on, for every power of two smaller than your modulus. What this lets us do is compute an encryption of the dot product of (AL tensor AR) with (S tensor S) as a linear combination of the encryptions of Si·Sj·2^d. Just as with bootstrapping, we're not going to evaluate this expression in the clear, because you don't want the party executing the circuit to be able to figure out the output in the clear. Instead, we evaluate it homomorphically under some new key — or, if you're willing to assume circular security, the encryptions can be under S itself. So we evaluate the whole thing homomorphically as a linear combination of all these powers. For example, say AL is (1, 2) and AR is (3, 4), so AL tensor AR is (3, 4, 6, 8).
That is: three is one times three, four is one times four, six is two times three, eight is two times four. Then, to evaluate the dot product of AL tensor AR with S tensor S, what you're computing is three times S1·S1, plus four times S1·S2, plus six times S2·S1, plus eight times S2·S2. And to avoid having to multiply ciphertexts — because multiplying ciphertexts makes the error blow up — you have these powers of two, so each multiplication by a constant can be expressed as a linear combination of them. To express three times S1·S1, you take the encryption of S1·S1 times 2^0 plus the encryption of S1·S1 times 2^1, and this gives you S1·S1 times three, encrypted under your new key. For four: four is the second element of AL tensor AR, so it multiplies the second element of S tensor S, which is S1·S2, and since four is 2^2, we use the encryption of S1·S2 times 2^2. For six: six is 2^1 plus 2^2, so we add the encryption of S2·S1 times 2^1 to the encryption of S2·S1 times 2^2. And for eight, which is 2^3, we add the encryption of S2·S2 times 2^3. If you add all of these encryptions, you get a vector which, if you were to dot product it with your key, would give you three times S1·S1, plus four times S1·S2, plus six times S2·S1, plus eight times S2·S2, plus some error — because each of these terms has some error added to it, and you're adding all of that error together. And that sum is the same thing as the dot product we wanted.
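The powers-of-two trick can be sketched in isolation. Below, an "encryption" of t is modeled abstractly as t plus a small even error (this models only the arithmetic of the trick, not the actual vector ciphertexts): multiplying by a known constant c reduces to adding the encryptions of t·2^d selected by c's bits, so the error grows only by roughly the number of set bits, instead of squaring.

```python
import random

# Model: "enc(t * 2^d)" = t * 2^d + small even error.
t = 12345                     # stands in for some key product Si * Sj
K = 16                        # powers provided: t * 2^0 ... t * 2^15

# relinearization-key analogue: noisy encodings of t * 2^d
enc_powers = [t * 2**d + 2 * random.randrange(0, 4) for d in range(K)]

def times_const(c):
    # c * t as a sum of the encodings selected by c's binary expansion
    return sum(enc_powers[d] for d in range(K) if (c >> d) & 1)

c = 6                         # 6 = 2^1 + 2^2
approx = times_const(c)
err = approx - c * t
# error is even and bounded by (max noise) * popcount(c), not squared
assert err % 2 == 0
assert 0 <= err <= 6 * bin(c).count("1")
```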
So basically what we've done is we've created this sum of a bunch of ciphertexts such that if you were to decrypt it — so if you were to dot product it with the new key — then you would actually get a value which equals the evaluation of this, plus some more error. And so if you were to then cancel the error out — because this gives you M1·M2 plus some error, plus some more error — you would get M1·M2. So this preserves the message, and because it's an addition of a bunch of linear-sized ciphertexts, the ciphertext size and the key size also go back to linear. So I'll stop again and wait for questions, because relinearization is also not very intuitive.

Well, this is super cool. One of the questions I have is, you mentioned that this helps reduce the size of the key, but doesn't help with the error. Correct. Why does it not help with the errors? It does not help with the error basically because this expression itself is going to evaluate to this multiplied by this, and AL decrypts to ML plus two EL, AR decrypts to MR plus two ER, and so the product of those things is going to contain a term that has four E squared. So here you have four E squared, plus some more error that comes from a logarithmic number of these terms, right? So the error still blows up from being a multiple of E to being a multiple of E squared. Now, there is another magic trick to make the error blow up less quickly, and I'll get to this after, but first I'll wait for questions about this part.

What does it mean to encrypt Si·Sj times two to the d under this encryption scheme? This is, yes, a very good question — I was about to answer this myself. So the encryption scheme as I provided it, right, it can only encrypt zero and one, because it has some even error, right?
And so you might ask, well, what does it mean to encrypt this thing, which is obviously going to be much bigger than zero and one? And the answer, basically, is that the "encryption", in quotes, of Si·Sj times two to the d is going to be a vector A such that if you were to dot product it with the key, you would get Si·Sj times two to the d plus some even error, right? So you cannot actually extract Si·Sj times two to the d from the ciphertext, because there's the error, and the error is just going to wash away everything except the really high bits and the low bit — or rather, it's going to wash away some bits in the middle. But it is basically a vector where, if you dot product that vector with the key, you get Si·Sj times two to the d plus even error. And the reason why this is still useful is that if you add up these ciphertexts, then you still get AL tensor AR dot producted with S tensor S, plus a whole bunch of even errors, but these even errors just add together with the even errors that were already inside of here. And so whoever ends up ultimately decrypting this ciphertext — the errors are all even, and so they'll be able to just cancel them out.

Okay, I got this. It's just this kind of niceness property — parts of the final ciphertext that you cleverly add up. Right, okay.

So here is the magic trick to make your error blow up less quickly. So first, kind of a fun fact, right? We can switch ciphertexts between what you can think of as being two perspectives, right? So one perspective is: your ciphertext is A, and A dot producted with S equals M plus two E, where M is either zero or one. Now, if you just modularly divide A by two — so this is all mod Q — then A modularly divided by two, dot producted with S, gives you: M becomes M over two, and two E becomes E, right?
And so your small even error becomes a small error that doesn't have to be even. And M over two: zero divided by two is zero, and one divided by two is the ceiling of half your modulus, right? So if your modulus is, say, 999, then one divided by two is going to be 500, right? Because 500 times two is a thousand, which modulo 999 reduces to one. And so you can basically take a ciphertext that's of this form, divide your A by two, and convert it into a ciphertext that has this other format, where instead of your message being in the lowest bit, your message is in the higher-order bits. Now, this first perspective is better for multiplication, because if your message is in the low-order bits, then when you multiply your two messages together, your message is still preserved, right? Your message is in the field of two elements — or I guess you can't divide, so the ring of two elements might be a little more precise. Basically, if your message is in the low-order bits, then everything that happens with the error happens on top of the message, and the message just flies underneath and doesn't get affected. But in the other perspective, the message is at the top, and so everything interferes with it. So in that perspective you can't multiply, but it has a key advantage, which is that it makes the noise less structured, right? Here the noise has to be even; here the message just becomes small. And so by cleverly switching between these perspectives, you can, without having the secret key, change the modulus of a ciphertext. So here's what we do, right? We start off with A such that A dot product S equals your zero or one plus two times the error. Then you do a perspective switch, and so you have an A prime, which is A modularly divided by two, which is equal to a small error plus either zero or half your modulus. Now what we're going to do is we're going to switch it from mod Q to mod P, right?
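The perspective switch above can be checked numerically. This is a minimal sketch with the talk's modulus 999; `to_high` and `to_low` are hypothetical names for the two directions of the switch:

```python
q = 999
inv2 = (q + 1) // 2   # 500: the "modular half", since 500*2 = 1000 ≡ 1 (mod 999)

def to_high(v):
    """Low-bit form m + 2e  ->  high-bit form m*ceil(q/2) + e (mod q)."""
    return (v * inv2) % q

def to_low(v):
    """Reverse perspective switch: double everything mod q."""
    return (v * 2) % q

m, e = 1, 7
low = (m + 2 * e) % q              # message in the lowest bit, even error
high = to_high(low)
assert high == (m * inv2 + e) % q  # message moved to the top; error is now just e
assert to_low(high) == low         # and the switch is reversible
```

Note that after the switch the error `e` no longer has to be even, which is exactly the "less structured noise" point.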
And so what we're gonna do is we're gonna take this ciphertext, take the A prime, multiply it by P, and then integer-divide it by Q. And if you do this, then what you're gonna get is a value which — if you then multiply it by S; so you rescale A but you do not rescale S, right? This is really important. So the rescaled A prime gives you, basically — well, if you dot product it with the same S, then the output is gonna be rescaled, and so zero becomes zero, the ceiling of Q over two becomes roughly the ceiling of P over two, and a small error becomes a small error, except the error itself also gets downscaled, right? It gets downscaled from being some small fraction times Q to being the same small fraction times P, and so the actual magnitude of the error goes down. And then you basically do a reverse perspective switch: you go from this perspective, where it's either zero or the ceiling of P over two, back to the other perspective — you double all this, and then it becomes either zero or one plus two times the error. Now this operation —

I didn't understand the previous step. Why does this times-P-divided-by-Q trick work? Yeah, why does it give you the one on the right? Why is the message now zero or P over two? That's not obvious.

Basically, if you just think of this as being just a multiplication, then if it was an encryption of zero, it's just going to give you a multiple of Q, and if it was an encryption of one, it gives you a multiple of Q plus half of Q, right? And so if you multiply this by P over Q, then multiples of Q turn into multiples of P, and half-multiples of Q turn into half-multiples of P.
So I guess one example: say Q is 10,000 and P is 1,000, just for simplicity. Then if A prime dot S here was, say, 50,000 — or let's say 50,000 plus a bit of an error — then if you multiply by P over Q, so P is 1,000, Q is 10,000, you divide it by 10, and your 50,000 just becomes 5,000, and 5,000 modulo 1,000 is still zero. But if you instead had, say, 55,000, then 55,000 times P over Q is gonna be 5,500, and so modulo 1,000 it's gonna be 500. And the error that gets introduced by the floor division not being perfect is still a small error, so it just mixes together with the small error that already exists.

So this should really say E prime as opposed to E, right? Is that correct? Yes, sorry, that's correct — this should be E prime. And even the contribution from E, that's also gonna change. Right, and I'll just go back to this example: let's say your A prime dot S, as an integer dot product, was equal to — instead of 55,000, let's say 55,030, so your error is 30. Then if you multiply by P over Q, 55,030 turns into 5,503, and so modulo 1,000 it's gonna be 503. 500 is P over two, and so your new error — it was 30 before — is now three.

So basically the ratio between the error and the modulus size remains the same. Correct. And I'll explain why — it seems like you're not accomplishing anything because the error ratio is the same, but you're actually accomplishing something really important, which I'll get into in the next slide. So again: are people satisfied that if you have a ciphertext whose error gets bigger, you can turn it into a ciphertext whose error as a number is bounded by some constant, but whose modulus keeps being reduced, right? So this is kind of the lesson here. Yeah. Okay. So here is why we switched, right?
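The worked example above (Q = 10,000, P = 1,000, integer value 55,030) can be replayed directly; `mod_switch` is a hypothetical name for the rescaling step:

```python
q, p = 10_000, 1_000

def mod_switch(v):
    """Rescale a mod-q ciphertext value down to mod-p by multiplying by p/q."""
    return v * p // q     # the floor-division rounding folds into the small error

v = 5 * q + q // 2 + 30   # a multiple of q, plus q/2 (message "1"), plus error 30
w = mod_switch(v)         # 55,030 -> 5,503
assert w % p == p // 2 + 3  # message is now p/2; the error scaled from 30 down to 3
```

The absolute error shrank from 30 to 3 while the error-to-modulus ratio stayed the same, which is the observation the questioner makes next.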
So there's two reasons to switch, actually. The first reason is that this makes bootstrapping actually practical, right? Bootstrapping involves computing this dot product A·S mod Q, and computing A·S mod Q has a circuit depth of log log Q. By reducing the modulus, what we can do is turn something which to decrypt requires a circuit depth of log log Q into something that has a circuit depth of log log P. And P here can be very small, right? Once you do the modulus switch, you don't have to do any more computation, so P can be a pretty small constant, and this basically means that the circuit depth of bootstrapping becomes a constant. And so you can just increase Q to as much as you need in order to make the bootstrapping procedure actually possible. Cool. Yeah.

So in the bootstrapping key, basically, the way that we do it — bootstrapping is just computing this dot product mod P as a circuit. And so to make this computable, you're going to provide the Si times two to the d values, modulo P, in binary representation, right? So for every Si times two to the d, you provide a binary representation, and so you have N log P numbers in binary representation. Then, to compute Ai times Si, you take — for every bit that's on in Ai, because you have Ai in the clear — the corresponding binary representation of that power of Si. So you just have a bunch of these numbers in binary representation — they're not single ciphertexts, they're log P ciphertexts each — and then you add a whole bunch of these numbers together. And then, to take the modulo: the best choice of P is going to be two to the power of N minus one, and that makes the modulo really easy, because you just make your addition circuits wrap around.
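The "wrap around" point — why a modulus of 2^N − 1 makes the reduction free — can be sketched on plain integers (the real version operates on encrypted bits; `wrap_add` is a hypothetical name):

```python
N = 10
M = (1 << N) - 1                 # modulus 1023 = 2^10 - 1

def wrap_add(a, b):
    """Add two values mod 2^N - 1 by folding the carry-out back into bit 0."""
    s = a + b
    return (s & M) + (s >> N)    # 1024 "wraps around" to 1, as in the talk

assert wrap_add(1000, 100) == (1000 + 100) % M   # 1100 mod 1023 = 77
```

No separate division circuit is needed: the carry that would have landed on bit N is simply re-added at bit 0.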
So how do we implement these addition circuits, right? Basically, the addition circuit is just: you have these N log P numbers, and you have to add them all together, right? And then for the modulo: your addition circuits, instead of carrying onto even higher bits, every time you carry past the size of P you just wrap around, right? And that gives you the modulo by two to the N minus one for free, right? So if your modulus is, say, 1,023, then your bits go up 128, 256, 512, and then 1,024 is equal to one, so you just wrap that around to one — so taking the modulo is really easy. So your setting is: you just have to add together a whole bunch of numbers, right?

And the first step is that you can reduce three numbers to two numbers — you can have a three-to-two adder that has multiplicative depth one, right? And the way this works is basically that every bit of P is going to be just the XOR of the corresponding bits of A and B and C, and every bit of Q is going to be the two-of-three function of A and B and C, right? So this is zero if A plus B plus C is zero or one, and one if A plus B plus C is two or three — you can think of these as the sum and carry digits, right? And the Q you also multiply by two, which basically means you just left-shift Q by one, right? The reason why this works is basically that every bit in A and B and C gets represented inside of P, but then in the specific case where you have two bits that are turned on, inside of P they get turned into zero, but you flip the corresponding bit of Q from a zero to a one, and then Q gets moved to the left by one, which multiplies it by two. And so every individual bit in A, B or C gets correctly represented in P plus Q, right? So this is a three-to-two adder, and it has multiplicative depth one.
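The three-to-two adder just described is the classic carry-save adder; here is a sketch on plain integers (on encrypted bits, the XOR would be homomorphic addition and the AND homomorphic multiplication, which is where the depth-one cost comes from):

```python
def three_to_two(a, b, c):
    """Carry-save (3-to-2) adder: two outputs with the same sum as the three inputs."""
    p = a ^ b ^ c                            # per-bit XOR: the "sum" digits
    q = ((a & b) | (a & c) | (b & c)) << 1   # per-bit two-of-three, left-shifted: the "carry" digits
    return p, q

p, q = three_to_two(13, 7, 22)
assert p + q == 13 + 7 + 22
```

Repeatedly applying this reduces N numbers to two with depth logarithmic in N, leaving only one full (carry-propagating) addition at the end.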
Adding two numbers X plus Y does require logarithmic multiplicative depth, right? So going from N numbers down to two numbers requires a multiplicative depth which is logarithmic in N — logarithmic in the number of numbers; we don't care about their size — but then adding the last two does require multiplicative depth log log (X plus Y). And here X and Y are mod P, so we're basically saying log log P, or log of the number of bits of P, right? So if P is 1,023, then P has 10 bits; the log of 10 is going to be about four, and so your multiplicative depth is just going to be four. Now, the reason why addition circuits have this logarithmic multiplicative depth is basically that you have to worry about threading the carry bits through the number, and in the code that I'll link to later, I have algorithms for doing this. So in general, computing a linear combination of size N log P: first of all, you have these N numbers, and adding the N numbers together — the sum has size N log P, so that's log of N log P, which becomes log N plus log log P; and then you have another log log P. So the total is going to be O(log N plus log log P), and you can pretty easily just set Q high enough that this is less than log log Q. So basically you can implement the decryption circuit, and so you can bootstrap.

So here's optimization number two, right? I mentioned back when we were over here that there are two tricks to overcome overflow, where one is bootstrapping and the other is a clever trick to make your multiplication only increase your error by a constant factor. So this is the clever trick that I'm gonna talk about, right? Basically: imagine if, instead of just doing a modulus switch before bootstrapping, we do a modulus switch after every multiplication, right? So if you don't do this, right?
So if you imagine a circuit that has some multiplicative depth, then your error is gonna square every time you do a multiplication, right? So imagine your first ciphertext — let's say the modulus is 10 to the 16 minus one, and your ciphertext has error 10 squared. If you square it, the error is gonna be 10 to the four. Actually it'll be 10 to the four times some constant, but for simplicity let's say it's 10 to the four. Then the next time you multiply, the error is 10 to the eight; the time after that, the error is 10 to the 16, and then the error overwhelms the ciphertext, and decryption starts failing. So if you just do this kind of partially homomorphic encryption naively, then after a logarithmic multiplicative depth, you're screwed.

Now, what happens if you do modulus reduction, right? You have x, which has an error of 10 squared, modulo 10 to the 16 minus one. Then you multiply it, and so you have x squared with an error of 10 to the four. Now you perform a reduction: we're gonna cut the modulus by a factor of 100, and so your error gets cut from 10 to the four back down to 10 squared, and your modulus gets cut from 10 to the 16 down to 10 to the 14. And now if you square this again, right, your x squared goes up to x to the four, your error of 10 squared goes back up to 10 to the four, and your modulus is the same. Then you reduce again, and your error goes down to 10 squared, and your modulus hops down again. Then you square again, the error goes up from 10 squared to 10 to the four, and then you just reduce the modulus again.
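The ladder just described can be simulated with plain integers. This is a toy sketch of the bookkeeping only (no actual ciphertexts; I use 10^16 rather than 10^16 − 1 to keep the arithmetic exact, and a simple "error squared exceeds modulus" cutoff as the failure condition):

```python
# Track (modulus, error) through repeated squarings with a modulus switch
# after each multiplication: the error is rescaled back down each time,
# and the modulus pays for it by stepping down a factor of 100.
modulus, error = 10**16, 10**2
steps = 0
while error**2 < modulus:    # one homomorphic squaring per iteration
    error = error**2         # errors multiply: 10^2 -> 10^4
    modulus //= 10**2        # modulus switch: cut the modulus by 100 ...
    error //= 10**2          # ... which scales the error back down to 10^2
    steps += 1
```

Running this, the error stays pinned at 10^2 while the modulus walks down 10^16 → 10^14 → … → 10^4, supporting a number of multiplications linear in log of the modulus rather than log log.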
And so notice that instead of the error squaring every time, we have the modulus just very nicely, orderly stepping down by a constant factor every time, right? The clever trick here is basically that when you multiply the ciphertexts, the errors get multiplied, and so if you just make sure that the errors as numbers stay small, then squaring the error is always just a constant-factor multiplication. And so instead of having a logarithmic number of steps, you can have a linear number of steps — your supported multiplicative depth goes up from log log q to log q. Pause for questions again.

I didn't completely follow everything. So it's a multiplicative depth of what? Do we want to decrease the multiplicative depth? Sorry — by multiplicative depth, I mean the maximum multiplicative depth that you can support. So look at the example at the top and the example at the bottom. At the top, your error blows up to the point where it overwhelms the modulus after just three multiplications, right, because each time you're squaring it. At the bottom, we're keeping the error under control, right? The error just goes 10 squared up to 10 to the 4, down to 10 squared, up to 10 to the 4, down again — except every time, the modulus steps down by a constant factor of 10 squared. And so here, instead of doing three steps, you can do eight steps, right? Here you can do log log q steps; here you can do log q steps.

And both optimizations are kind of unlocked by this modulus switch, right? Right — both the bootstrapping and this technique that lets you basically not need bootstrapping half the time are unlocked by the modulus switch.

Okay, so now we move on to what this presentation was supposed to be about, which is matrix FHE. So the problem that matrix FHE solves is basically: can we find something that's more natural and more efficient than relinearization, right?
So relinearization is expensive, basically because it requires you to add a really huge number of ciphertexts — if you imagine a large modulus, you're talking about hundreds of ciphertexts for each element. So can we make something that's somewhat nicer? And can we also create a protocol that's somewhat simpler, where adding and multiplying ciphertexts is actually done by adding and multiplying elements in some naturally reasonable way? So: matrices, yay.

So here is the construction, right? To encrypt zero with a secret key, what we're going to do is come up with a matrix A such that the secret key multiplied by the matrix is an error. And to encrypt one, we're gonna have a matrix A such that the secret key multiplied by A gives the secret key plus an error, right? So basically what we're doing is we're hiding an approximate eigenvector, where the eigenvalue is zero if the message is zero, and one if the message is one. And if you really want, you can potentially encrypt other values, as the eigenvalue is the place where the plaintext gets hidden. And it's only an approximate eigenvector, so you can't extract it using the standard eigenvector-recovery techniques, because learning with errors makes that hard.

So addition is easy, right? If you add together a matrix such that S times A equals something with a matrix where S times A equals something else, then S times (AL plus AR) is just gonna be whatever the value is for the first one plus whatever the value is for the second one, right? So S times (AL plus AR) — or decrypt of AL plus AR — becomes decrypt of AL plus decrypt of AR, which is gonna be your left message times S, plus your left error, plus your right message times S, plus your right error.
And so you have the sum of your messages times S, plus the sum of the errors, right? That's basically the same thing that we had before. Except here, because instead of dealing with vectors we're dealing with matrices — and matrices, yay, can actually be multiplied — the decryption of the multiplication of the matrices is going to be S multiplied by AL times AR. And remember, you can think of matrices as being linear maps, so you're gonna pass S through AL, and then pass the result through AR, and so the eigenvalues get multiplied together as well. And you can show that this expands out. First S passes through AL, then you multiply that by AR. And S times AL is gonna be ML times S plus EL, as before, right? Here, S times A is whatever your message is — either one or zero — multiplied by S, plus your error. Except here we still have to multiply by AR, so we have ML·S times AR, plus EL times AR. And multiplying by AR basically means your ML·S is gonna turn into ML·MR·S, because S times AR by itself gives you MR·S, right? And so if S times AR gives you MR·S, then ML times S times AR gives you ML·MR·S. So the decryption of the product gives you the product of the messages here. And then over here you have the two error terms, right? One of the error terms is gonna be the left error multiplied by AR, and the other is the right error multiplied by the left message. So you can add, you can multiply — but there is a problem, right? And the problem is that the error is not just getting multiplied by the other message; you have this EL times AR term. Your error gets multiplied by the magnitude of the matrix elements.
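The "approximate eigenvector" idea can be demonstrated with an error-free toy (the real scheme adds small LWE noise, which is exactly what makes the eigenvector unrecoverable). Here a ciphertext of m is built as m·I plus an outer product u·rᵀ with S·u = 0, so S stays an exact eigenvector with eigenvalue m; all names and the construction are for illustration only, in pure Python so it stays self-contained:

```python
# Error-free toy of matrix FHE: S·A = m·S exactly, built as A = m*I + outer(u, r)
# where S·u = 0 so S remains an eigenvector with eigenvalue m.
s = [1, 2, -3, 1]                        # the "secret" eigenvector

def encrypt(m, u, r):
    assert sum(si * ui for si, ui in zip(s, u)) == 0   # u must be orthogonal to s
    n = len(s)
    return [[(m if i == j else 0) + u[i] * r[j] for j in range(n)] for i in range(n)]

def vecmat(v, A):                        # row vector times matrix
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

AL = encrypt(2, [2, -1, 0, 0], [5, 1, -2, 3])
AR = encrypt(3, [3, 0, 1, 0], [-1, 4, 2, 1])
assert vecmat(s, matadd(AL, AR)) == [5 * x for x in s]   # eigenvalues add
assert vecmat(s, matmul(AL, AR)) == [6 * x for x in s]   # eigenvalues multiply
```

With noise added, the same identities hold up to the error terms derived above (EL·AR plus ML·ER), which is where the blowup problem comes from.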
So if you start with matrices that have a fairly small magnitude, then every time you do a multiplication, the magnitude of your matrix elements keeps on doubling, and so once again you're limited to your depth being log log Q. By the way, one other reason why this matrix approach is nice: from here on, we don't actually care about the modulus being odd, we don't care about it being prime, we don't care about the modulus having any special properties. And so to make the math easy, we're just gonna set the modulus equal to some power of two, 2^k, and that makes the math really easy. It makes bootstrapping really easy as well, because you can just forget the higher-order bits.

So the problem is, right, that you have this EL multiplied by AR, and these AR values themselves get big — they get bigger every time you multiply — so how do you stop the error from just blowing up? The fix to this problem is this really strange and clever bit-splitting trick. The bit-splitting trick basically says that if we treat X as being a vector, then we're gonna define these two functions: one is called powers of two, the other is called bitify. Powers of two says, for every element in X, you concatenate the powers of two times that element; and bitify of X just turns X into a bit representation. Then you have this identity: the dot product of powers of two of X with bitify of Y equals the dot product of X with Y. And to visually show this, I have an example right here. So if you have X as one, two, three, and Y as six, five, four, then under powers of two of X, one turns into one, two, four; two turns into two times one, two, four, so two, four, eight; and three turns into three, six, 12. And Y turns into its bit representation: six is one, one, zero; five is one, zero, one; four is one, zero, zero.
And the dot product here — well, here it's one times six, and then here we're basically doing powers times bit representations. So you have two times one and then four times one, and this adds up to six. Then over here, two times five: the five turns into one, zero, one, and so you add two times one, and then eight times one — the eight compensating for this one having a higher place value — and two times one plus eight times one is 10, which is two times five. And then three times four is 12: for the four, you just have a one here, and that one gets multiplied by three times four, so you have 12, right? So I guess also pause and just stare at this and verify for yourselves that this works. Right — so basically, converting X into powers of two and converting Y into a bit representation is a dot-product-preserving operation.

It kind of feels a little bit like the relinearization trick. Exactly, yes. It's very similar — this is just explicitly representing relinearization using vector math. It's exactly the same sort of stuff.

So the bit-splitting technique also applies to matrices, because matrix math is basically just batched vector math. And so powers of two of S times bitify of A equals S times A — actually, this should be a dot; this should be actual matrix multiplication. And so what we're going to do, instead of doing AL times AR, is AL times bitify of AR, right? And we're gonna set our ciphertexts, instead of being N times N, as being N times N log P. Basically, the idea is that whatever is on the left side, we're gonna treat as being permanently in this powers-of-two representation, and so bitify is gonna give us an N log P times N log P matrix.
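The dot-product-preserving identity with the talk's example (X = 1, 2, 3 and Y = 6, 5, 4) can be written out directly; `K` is the number of bits per entry (3 is enough for values up to 7):

```python
K = 3  # bits per entry

def powers_of_two(x):
    """Concatenate x_i * 2^d for each element, d = 0..K-1."""
    return [xi * (1 << d) for xi in x for d in range(K)]

def bitify(y):
    """Concatenate the K-bit (low-bit-first) representation of each element."""
    return [(yi >> d) & 1 for yi in y for d in range(K)]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, y = [1, 2, 3], [6, 5, 4]
assert dot(powers_of_two(x), bitify(y)) == dot(x, y)   # both are 1*6 + 2*5 + 3*4 = 28
```

The key point for the error analysis is that `bitify(y)` contains only zeros and ones, so multiplying an error vector against it never scales the error by large matrix entries.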
And so the dimensions match up, but instead of multiplying AL times AR, we're gonna multiply AL times bitify of AR. And the reason why we do this is basically that if you look here, you have these two error terms — one has ER times the message and one has EL times AR. And if AR's elements are always going to be zero and one, then here you just have EL, and here you just have ER multiplied by the message, which is just gonna be zero or one. And so the error blowup is small, right? The error blowup is just gonna be adding a whole bunch of terms that are proportional to the original error.

So I'll admit that the thing that's difficult to see is why AL times bitify of AR is still eigenvector-preserving the same way that AL times AR was eigenvector-preserving. The intuitive idea, right, is that these ALs are permanently in this powers-of-two form, and these ARs are gonna be bitified, and so their dot product is going to be the same as if you just had a regular AL together with a regular AR. And so the eigenvector here — well, it's not fully an eigenvector: instead of turning S to S, it's actually gonna turn S into powers of two of S. But this works, and you can also just experimentally verify that it works, right? And the benefit is that, as I mentioned, instead of the error blowing up multiplied by big AR values, you just guarantee that these AR values are always zero and one, and so the error blowup is smaller, right?

By the way, I have code here, so if anyone wants to look — and the code is actually not long: the homomorphic encryption here is about 300 lines of Python, and this is under 200 lines of Python.
So if you want to see for yourself — if you want to play around with the matrices and verify things for yourself — I highly recommend looking through the code as well.

So, optimizations, right? There's a bunch of ways to optimize this. First of all, you might notice that one of the really interesting properties of matrix FHE is that the error growth is asymmetric. Here you're multiplying by AR, and here you have just EL, and here you have ER being multiplied by a message — and you have this interesting property that if the ciphertext on the right has a higher error, then the error actually barely increases at all, right? What this basically means is that, whereas in tensor-based FHE you don't really have this property — the errors just get multiplied — here it can actually make sense to perform an operation asymmetrically. So say, if you want to do a bunch of multiplications, you just fold them all in on the same side, one after the other. The error-balance criterion then becomes more complicated: it's not just max multiplicative depth, it's not just max polynomial degree, it becomes a sort of weird, complicated thing.

Other optimizations: you can decompose in other bases — you don't need to decompose into binary powers of two; you can also do this in base three or base four or base 16, and this potentially lets the error increase less quickly. You can pack multiple bits into a ciphertext, and do addition and multiplication on a lot of bits at the same time. There's also this thing called ring LWE, where basically, instead of thinking about a whole bunch of independent equations, you represent those equations as a single equation in a polynomial ring — I don't really have time to get into this, but it also lets you decrease key sizes a lot.
So this is still basically how modern fully homomorphic encryption schemes work, except there's this whole suite of optimizations that people have been slowly coming up with. The main reason why there is a big overhead is basically that the ciphertexts are matrices, right? And in order to satisfy the LWE assumptions, the ciphertexts have to have a pretty substantial length — you can think of these matrices as being something like 100 by 100 or more. And then here we're going to store them in powers-of-two form and bitify them, and so the 100 potentially blows up into 10,000, at least temporarily, and so matrix multiplication becomes this really big procedure. And if you want to be able to process circuits of substantial size, then these numbers have to have a fairly substantial bit length — we're potentially talking over 100 bits. So there's a bunch of factors that end up conspiring to make this large blowup in ciphertext and computation sizes fairly inherent, and this is the reason why this is all slower and less clean than things like elliptic curve cryptography, for example. But it is increasingly getting to the point where, for at least small computations, it is completely viable, and you can do individual computational steps on the order of milliseconds. So there you go.

Thank you, that was really cool. Thanks for the presentation. I have a question: do you know what is considered the state of the art today? There are all these different libraries out there for FHE and all these kinds of things — so what method is considered the best today? Right, there's definitely a bunch of libraries — HElib is one of them, and there are some others with different names. So there was a survey by Zvika Brakerski, the inventor of the 2011 protocol. Let's see if I can look this up right now.
Here you go: Fundamentals of Fully Homomorphic Encryption. So basically just search for "Fundamentals of Fully Homomorphic Encryption" by Zvika Brakerski. It's from 2018, and the basics are probably still similar today. So the state of the art is definitely: you start from either the tensor approach or the matrix approach, which have different pros and cons, and then you just apply a whole bunch of optimizations. Ring LWE is one of them, and another one is those various schemes for being able to stuff a whole bunch of messages into the same ciphertext. So one of the ways you can do this is, you can imagine, instead of having a ciphertext where a dot s satisfies just one message, you might say a dot s satisfies one message and a dot t satisfies another message, and then you can do kind of SIMD operations. So if you add and multiply, that adds and multiplies all the plaintexts simultaneously, and then you can do rotations, you can do permutations. And so there's this growing body of clever tricks, the same way the zero-knowledge proof space has a growing body of clever tricks, that try to compensate for the inherent inefficiencies. So that's where we're basically at right now: we're just taking this base and incrementally building upon it. Awesome, thanks. I sent a link to the survey that you referred to in the Telegram channel. I just have a quick question. You can also look up... Sorry, let's finish this one first. No, I was just going to mention that you can also look up BV 2011. There's also another protocol, BV 2012, which avoids the need to do modulus switching entirely, basically by pretending that the ciphertexts are fractional but still representing them as integers. So feel free to look up the papers for all of these as well.
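As a toy model of that SIMD packing idea, the Chinese Remainder Theorem over plain integers gives the same effect that real schemes get inside a polynomial ring: one value modulo a product of coprime moduli is simultaneously one independent slot modulo each factor, so one operation acts on all slots at once. This is my own illustrative sketch, not how any particular library implements batching.

```python
from math import prod

moduli = [5, 7, 11]      # pairwise coprime "slots" (illustrative sizes)
Q = prod(moduli)

def pack(values):
    """CRT-pack one value per slot into a single residue mod Q."""
    x = 0
    for v, q in zip(values, moduli):
        n = Q // q
        x = (x + v * n * pow(n, -1, q)) % Q  # standard CRT reconstruction
    return x

def unpack(x):
    """Read each slot back out."""
    return [x % q for q in moduli]

a = pack([1, 2, 3])
b = pack([4, 5, 6])
# One multiplication mod Q multiplies every slot simultaneously:
assert unpack(a * b % Q) == [(1 * 4) % 5, (2 * 5) % 7, (3 * 6) % 11]
# Same for addition:
assert unpack((a + b) % Q) == [(1 + 4) % 5, (2 + 5) % 7, (3 + 6) % 11]
```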
And hopefully it'll be more understandable after this. All right, thanks. Go ahead. Yeah, this is just coming back to your last thing about the switching into AL and AR matrices. Okay, so you can do this multiplication. So in the next step, right, you do it using bitify. Isn't that actually exactly the same? I haven't quite understood why you're not doing exactly the same operation and just computing it in a different way. Right, so basically the idea here is that bitify is guaranteed to always output zeros and ones. I think I might have a good intuitive answer. So first of all, powers-of-two expansion and bitify are kind of opposites in some sense, right? If you bitify something, then you can multiply by a matrix which converts it back into the pre-bitified form. So you can think of this product as being an un-bitified version of a bitified thing. And the inverse of bitifying that's inherently baked in here has this property that you can apply it to things that are not bits. Because a bitified matrix multiplied by another bitified matrix is not necessarily going to contain bits, since matrix multiplication adds a whole bunch of bits together. But you're still going to get a matrix, and because you have modular reduction, it's still going to reduce modulo whatever your modulus is. And so basically the modular reduction just kind of magically makes the higher-order terms go away.
And because you're only doing the matrix multiplication while the matrices are bitified, the process of matrix multiplication never multiplies any particular value by more than the width of the matrix. And so the error only gets multiplied by a small amount, right? So it's actually not the same operation. It's a different operation, but one which is still guaranteed to have the same consequences, specifically with respect to the size of the values being multiplied. I actually just noticed that I don't even know what this multiplication here means, because in the previous operation, okay, bitify has blown up a vector into an n times log p vector, right? So what are these AR and AL objects here? Are they tensors, or are they still two-dimensional matrices? How many of them are there? These are all matrices, right? So here's one way to intuitively think of this. Bitify(AR) has size n log p times n log p; it's a big square. AL has size small times big, so n times n log p. And think of AL as being bitify(AL) multiplied by the matrix that is the inverse of bitification, right? So the inverse-bitify matrix multiplied by bitify(AL), and that multiplied by bitify(AR). And then you take bitify(AL) and bitify(AR) and multiply them together. And then, to actually see what this means, you have your inverse-bitification matrix multiplied by the product of the bitifies. So basically you have these matrices that have the property that if you multiply them by powers of two of s, then you get the message times powers of two of s. And here you have the already-squashed matrices, so the same property holds with s in place of powers of two of s.
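If you do open up that Python console, the key identity to check first is that bit decomposition and powers-of-two expansion undo each other under the dot product. A minimal sketch of my own (illustrative modulus and sizes, not from the talk's codebase):

```python
import random

q = 2 ** 16        # ciphertext modulus (illustrative)
k = 16             # bits per value, i.e. log2(q)
n = 4              # vector length

def bitify(vec):
    """Bit-decompose each entry: length n -> length n*k, entries in {0, 1}."""
    out = []
    for v in vec:
        out.extend((v >> i) & 1 for i in range(k))
    return out

def powers_of_two(vec):
    """Expand each entry v into (v, 2v, 4v, ..., 2**(k-1) * v) mod q."""
    out = []
    for v in vec:
        out.extend((v << i) % q for i in range(k))
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b)) % q

a = [random.randrange(q) for _ in range(n)]
s = [random.randrange(q) for _ in range(n)]

# The key identity: <bitify(a), powers_of_two(s)> == <a, s>  (mod q)
assert dot(bitify(a), powers_of_two(s)) == dot(a, s)
```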
And then if you take AL and expand it out as a bitified thing, then for both of the bitified things, powers of two of s is an eigenvector, basically. And so if you multiply together bitify(AL) with bitify(AR), you get something where powers of two of s is also an eigenvector. And then if you squash it back, you get a ciphertext that's still of the same form. Right, I think, yeah. Yeah, I think I realized the part that I don't understand. I guess I'll have to look at the source. Yeah. Yeah, I'd recommend not even just looking at the source, but even just opening up a Python console and playing around with bitify and multiplying by vectors, and just seeing what properties the ciphertext and the bitified ciphertext have with respect to multiplying by s and multiplying by powers of two of s. Okay, all right. Okay, so now, okay, here's the part I didn't understand. So there's no extra operation each time; the whole time we are computing with n times n log p matrices. Yes. Aha, I see. Okay, so we changed our protocol. This is not just a modification of the multiplication itself. Right, exactly. So another way to think about this: this approach is just an optimization. You can make a less efficient but easier-to-understand protocol where your ciphertexts are n log p times n log p matrices. So imagine your ciphertexts are bitify(AL) and bitify(AR). They are big matrices that contain only zeros and ones, and they have the property that powers of two of s is an eigenvector, right? So we agree that if you just multiply bitify(AL) times bitify(AR), then powers of two of s is going to be an eigenvector of the product, right?
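The "shared eigenvector" claim above can be checked with a toy model that drops the encryption and the error term entirely: if two matrices share an eigenvector, their product has the same eigenvector, with the eigenvalues (the messages) multiplied. A hypothetical sketch of my own in plain Python, with an exact eigenvector standing in for the approximate one:

```python
def mat_vec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

v = [1, 2, 3]  # stands in for "powers of two of s"

def toy_ciphertext(m, noise_rows):
    """m * I plus rows orthogonal to v, so C @ v == m * v exactly."""
    n = len(v)
    m_identity = [[m if i == j else 0 for j in range(n)] for i in range(n)]
    return mat_add(m_identity, noise_rows)

# Every row below is orthogonal to v = [1, 2, 3].
C1 = toy_ciphertext(1, [[2, -1, 0], [0, 3, -2], [3, 0, -1]])
C0 = toy_ciphertext(0, [[-2, 1, 0], [3, 0, -1], [0, -3, 2]])

assert mat_vec(C1, v) == [1 * x for x in v]   # eigenvalue = message 1
assert mat_vec(C0, v) == [0 * x for x in v]   # eigenvalue = message 0
# Multiplying ciphertexts multiplies the messages (an AND on bits):
assert mat_vec(mat_mul(C1, C0), v) == [0, 0, 0]
# Adding ciphertexts adds the messages:
assert mat_vec(mat_add(C1, C0), v) == v       # eigenvalue 1 + 0 = 1
```

In the real scheme the eigenvector relation only holds up to a small error term, which is exactly why the bitify/squash bookkeeping above is needed to keep that error from blowing up.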
But basically what this operation does is preserve the invariant that your ciphertexts contain just zeros and ones, right? After you multiply two matrices together, they're not necessarily zeros and ones. And so what we're going to do is basically an inverse-bitify operation, so squash it, and then bitify it again. And the squashing and the re-bitifying preserve the eigenvector, but they force the values in the ciphertext to just continue being zeros and ones. Cool. And you mentioned you can do operations on the order of milliseconds. That's for one-bit operations essentially, like an AND gate or something? Right. I have a question. Oh yeah, go ahead. Yeah, first, thanks for a fantastic presentation. And the question is: you've talked so far only about multiplications of plaintexts and ciphertexts, but do you think any of this technique translates to more complex functions, like polynomials of ciphertexts, something more native than just computing them operation by operation? Yes, that's a good question. So one of the challenges is, first of all, this technique is theoretically compatible with messages being not just zeros and ones but arbitrary field elements. The problem is that because you have this m times ER term, if your messages are not zeros and ones, it makes the error blow up, right? So you can potentially, instead of working in the ring modulo two, work in the ring modulo some small number, maybe up to a hundred or something.
But working over things that are not bits just seems intrinsically hard, because multiplying by things that are not small numbers causes the error to blow up very quickly. Sorry, I meant something different: not non-bit field elements, but instead of just one multiplication, say multiplying three different elements, or some polynomial of four elements of small degree or something. Right, so the trivial way to do any of this is: if the individual messages are bits, just zero or one, then multiplying by three doesn't really mean anything, right? So if your messages are zeros and ones, you pretty much have to do everything using binary circuits. And the nice thing with binary circuits is that multiplication is just an addition of a bunch of shifted values, and so you can do it. And I mentioned before, in the optimization section, that you can potentially have a ciphertext which contains multiple plaintexts, and you can do shifts. And so you could do large parts of things like addition and multiplication circuits operating over many bits simultaneously. But I guess in general, operating over binary circuits is the most fundamental thing that this ends up allowing. There are definitely optimizations. So the main challenge here, right, is that to keep the error from blowing up, your messages have to always be zero or one, but there are optimizations that have to do with computing more complicated gates in a way that ensures the output is still zero or one, without having to do as much work as if you just did it naively with simple gates.
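As a sketch of the "add and multiply on bits gives you every logic gate" point, here is the standard construction in plain Python, with the ciphertexts replaced by bare bits, so this shows only the plaintext logic, not the homomorphic operations themselves:

```python
# Logic gates from just add and multiply (all arithmetic mod 2):
def AND(x, y): return x * y
def XOR(x, y): return (x + y) % 2        # addition mod 2
def NOT(x):    return (1 + x) % 2        # add the constant 1
def OR(x, y):  return NOT(AND(NOT(x), NOT(y)))   # De Morgan

# A one-bit full adder built only from these gates; chaining n of them
# gives the binary addition circuit mentioned above.
def full_adder(x, y, carry):
    s = XOR(XOR(x, y), carry)
    c = OR(AND(x, y), AND(carry, XOR(x, y)))
    return s, c

assert full_adder(1, 1, 0) == (0, 1)   # 1 + 1 = 0 carry 1
assert full_adder(1, 1, 1) == (1, 1)   # 1 + 1 + 1 = 1 carry 1
assert full_adder(0, 1, 0) == (1, 0)
```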
So there's nothing super mathematically elegant, but there are these bags of tricks that let you optimize a lot sometimes. Okay, thank you. One question I have is that people often say that functional encryption is related to FHE. Do you know how to go from FHE to functional encryption, or how they're related? Right, so functional encryption. In terms of the definition, it basically says: instead of being able to go from Enc(x) to Enc(f(x)) for arbitrary f, you can go from Enc(x) to just having f(x), but only for one specific f. So it's a kind of different, and potentially more powerful, primitive. In terms of how these functional encryption protocols work, I'll admit I haven't fully figured this out yet. They're definitely considerably more complicated than FHE. Okay, I mean, I guess one easy way, well, quote-unquote easy conceptually, is to go through obfuscation: bootstrap with FHE and then go back down to functional encryption. Right, exactly, but then obfuscation protocols often in practice end up being built on top of functional encryption, so... Okay. That's it. Yeah, there are a lot of different paths to try to get to obfuscation, and none of them fully satisfies people yet, and they do end up having overhead. One really nice thing is that if you have an obfuscator that works for circuits of constant size, then you can turn that into an obfuscator that works for circuits of arbitrary size. And the way you do that is to obfuscate a big program where you first homomorphically encrypt the input, then you have the circuit itself provided in homomorphically encrypted form, and so basically your function is going to be circuit evaluation.
And so you're going to evaluate the encryption of x with the encryption of the circuit, and then you get an encrypted form of the output, and you generate some proofs. You can think of it as being a SNARK or something, but you can make proofs that play more nicely together with the homomorphic encryption scheme, so you have less blowup. So basically then what you do is: you take your encryption of f(x), and you take your proof that you actually computed f and not some different function, and then you have a fixed-size obfuscated program that checks the proof and then decrypts, and then you have your f(x), right? So that way you can obfuscate programs of arbitrary size, but it still depends on a fixed-size obfuscator. And I think pretty much the exact same technique works for something called correlation-intractability hashing. The idea there is that you can have a provably secure Fiat-Shamir, as opposed to a heuristic Fiat-Shamir. And if you have this hash function which has this very special property for small circuits, you can boost it to arbitrary circuits. And I think some of the lattice people managed to build a non-interactive zero-knowledge proof which is secure in the standard model with that Fiat-Shamir. I guess a question I have... sorry, go ahead, what were you going to say? No, I was just going to say that sounds right and interesting. And the other question I have is on performance. Someone said that, roughly speaking, over the last decade FHE was optimized by an order of magnitude every single year, and I was wondering if you think we've hit a wall, or whether this is going to continue, continue for four or five years, and then we'll see in four or five years. Right.
Yeah, so I kind of took you through the journey over the years, where homomorphic encryption was just getting massively better with new discoveries every two years, right? So over here, you have only partial homomorphism, so bounded degree and bootstrapping is impossible. Then you have this really complicated scheme from Craig Gentry that realized bootstrapping for the first time, but the error grows exponentially. Then suddenly the error only grows linearly, and then the complexity of doing a multiplication went down to a matrix multiplication. So here we're seeing huge speedups. I think after these protocols the speedups have definitely started slowing down, and do I expect the speedups to keep slowing down over time? I don't know enough to say honestly what a reasonable lower bound ends up looking like. My instinct would say at this point that we'll probably see more speedups from optimizations than from very fundamental changes to how ciphertexts work, so from coming up with gadgets that let us operate on many bits at the same time more effectively. I don't know. Right. Okay. Do we have more questions? One thing I'm kind of curious about is the ultimate cryptographic primitive for the long-term future, say when we have quantum computers. It's kind of unfortunate that, from what I understand, right now we can't really do SNARKs with lattices. And I was wondering if you think this is a fundamental thing, or whether we're going to need FRI and lattices working in parallel. Right. So you can do bulletproofs with lattices, right? Because you can just use ciphertexts the same way you use Pedersen commitments. As for the rest, it's very possible that someone will come up with some lattice-based protocol for zero-knowledge proving. I don't know.
I think the reason why it seems fundamentally harder in the lattice world is because you have these errors, and so you can't really do things like equality tests as easily, right? So for example, one of the big problems or challenges in this style of lattice cryptography is: can you even come up with a zero-testing key? Because, by linearity, if you have a zero-testing key you can turn that into an equality-testing key, and that would just immediately give you a multilinear map. And it turns out that all of the approaches people have come up with for letting someone zero-test a ciphertext using some key, without allowing them to decrypt everything else, inevitably leak information that breaks the security, in the sense that you can extract everything else and recover the message. So yeah, we might end up coming up with something, I don't know. It definitely is possible that we'll end up needing both FRI and lattices in the long term, and that the two things just naturally specialize in different areas. The zero testing is very interesting. Yeah, I mean, it's one of those ten things where, if we had it, then we would just solve everything. So sad. Thank you so much. This was really interesting. Okay, well, thank you. Yep, thanks very much.