Okay, glad to be here in Paris. I have too many slides and I'll rush us through some of them; I hope that you will get the general picture without too many of the details.

Network coding is an approach to routing in communication networks in which one achieves increased capacity by using some linear algebra. It is a very elegant theory. The typical example is given in this figure. You have a multicast situation with a source and two targets, Y and Z, and the source has two bits, b1 and b2, and needs to transmit both of them to Y and Z. In this network, if every node can only send one bit, it is basically impossible to move both b1 and b2 to both targets: as on the right, if you really want both bits to reach the targets, you must send two different bits between W and X. This achieves what is called a rate of one and a half, because the two nodes get three bits in total. However, if instead of just forwarding bits you are also allowed to combine the packets that you received, then W will send to X not b1, not b2, but rather b1 XOR b2, and that will allow both Y and Z to get the two bits. So using this simple XOR operation you get a better rate.

The general approach in network coding, or linear network coding, is that information is represented as a vector of bits. A source sends a bunch of vectors into the network, and then every node that receives a set of vectors takes a linear combination of those vectors and sends this linear combination to its neighbors. Eventually stuff will reach the targets, which, after receiving enough vectors, will be able to reconstruct the original information.

With a little bit more detail: we will be talking about random linear network coding, which is as the previous figure showed, except that the coefficients are taken at random from a field. So basically the source has some file and breaks the file into blocks; each block we think of as a vector over
some given field F. Then to every source vector it will attach m coordinates, where m is the total number of vectors in the file. (I have to remember to do it here.) So it will attach a vector of m coordinates, which at the beginning will be a unit vector, and after all the combinations done in the network it will become some more general vector of m coordinates over the field F.

I will denote the original information by v, with a bar, v̄, for the original vectors. The composed vectors, with coding coordinates and information coordinates, I will call w, and again, if I refer to the original w's, they will have a bar. So it's m vectors, each one of dimension n, over a field F.

Now, intermediate nodes will get sets of vectors and will combine them by choosing coefficients, which in this case are taken at random from the field F. Target nodes will wait until they receive enough vectors. The condition that is needed here concerns the vectors that come to the target, which have a u component for the coding coordinates and a v component for the data coordinates: if the u coordinates, which are the first m
coordinates, form a set of linearly independent vectors, or in other words, if the matrix that represents these coordinates is invertible, then one can prove that the target is able to recover the original information.

Now, since the whole thing is done at random, with random coefficients, there is the question of what is the probability that targets will eventually get enough linearly independent vectors to recover the original file. This is a well-studied problem in the network coding literature. What is important for us is that the probability of reconstruction depends, of course, on the graph and the network, but it doesn't depend on the specific field; it only depends on the size of the field. In most practical situations people use fields of size around 2^8, I mean 8-bit fields, and that will be enough for most applications. We will come back to that.

Now, if we do this random linear network coding, we have a problem. The problem is that an attacker that sits, for example, in this node can simply receive this bit b2 from S and send the complement of b2: it flips the bit. You can see that it's enough to flip one bit for the targets to get different, wrong information. So a single bit that you flip in this setting may be enough to basically destroy all the information that is being sent in the network. And this is not a theoretical threat; it's a very serious one, because it's a trivial attack to mount.

Okay, so what do we do? We want some form of authentication. Of course, signing the original file at the source is not enough, because end-to-end authentication doesn't work here.
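As an illustration of the decoding condition and of the one-coordinate attack just described, here is a minimal Python sketch. It is not from the talk: the prime field of size 257 stands in for the 8-bit field purely to keep the arithmetic simple, and the function names (`augment`, `combine`, `decode`) are hypothetical.

```python
import random

P = 257  # assumption: a small prime field stands in for the 8-bit field

def augment(blocks):
    """Prepend the i-th unit vector (coding coordinates) to the i-th data block."""
    m = len(blocks)
    return [[1 if j == i else 0 for j in range(m)] + list(b)
            for i, b in enumerate(blocks)]

def combine(vectors, rng):
    """What an intermediate node does: a random linear combination mod P."""
    coeffs = [rng.randrange(P) for _ in vectors]
    return [sum(c * v[k] for c, v in zip(coeffs, vectors)) % P
            for k in range(len(vectors[0]))]

def decode(received, m):
    """Gauss-Jordan elimination mod P on m received vectors [u | v].
    Returns the data blocks, or None if the u parts are not independent."""
    rows = [r[:] for r in received]
    for col in range(m):
        pivot = next((i for i in range(col, m) if rows[i][col] % P), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)          # modular inverse (Python 3.8+)
        rows[col] = [x * inv % P for x in rows[col]]
        for i in range(m):
            if i != col and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % P for a, b in zip(rows[i], rows[col])]
    return [r[m:] for r in rows]

rng = random.Random(0)
blocks = [[10, 20, 30], [40, 50, 60]]
sources = augment(blocks)
received = [combine(sources, rng) for _ in range(2)]
while decode(received, 2) is None:                # retry in the unlikely singular case
    received = [combine(sources, rng) for _ in range(2)]

# The attacker changes a single data coordinate of a single vector in transit.
polluted = [row[:] for row in received]
polluted[0][-1] = (polluted[0][-1] + 1) % P
```

Decoding `received` gives back `blocks`, while decoding `polluted` gives something else, and the target has no way to tell which vector was bad.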
End-to-end authentication helps after a target reconstructs the information: it lets the target know that it reconstructed the right information. But if something changed along the way, then reconstruction itself will not be possible. So end-to-end authentication will not work. What we want is some form of authentication in which not only the source is able to create signatures, but intermediate nodes that generate linear combinations of valid vectors are also able to construct signatures for these combinations.

So what are the valid vectors? Anything that is in the linear span of the original vectors w̄_1 to w̄_m. Anything in that linear space is okay, because that's exactly what you get from linear combinations of the original vectors. These we will call valid vectors, and the question is how we can distinguish between vectors that are in the span and vectors that are not.

Ideally, what we want is something like this. We want the source to have a secret key / public key pair. It will use the secret key to generate, initially, signatures for all the original vectors w̄_1 to w̄_m. Then everyone else, using the public key, will be able to verify signatures, but will also be able to choose some coefficients α_i, generate a linear combination w of valid vectors w_i, and, using the valid signatures of the vectors w_i and the coefficients, generate a signature for that valid vector w. So from the valid signatures of the vectors, for any linear combination, with the public key only, you can generate the signature of that combination. By doing that, you get that invalid vectors will be thrown out of the network as soon as they appear.

This is a kind of paradoxical notion of security, in the sense that you are allowing people who have just the public key to build signatures on stuff that they didn't see before. But it is a very restricted form of forgery that you are allowing anyone to
do. So the natural tool to use here is the so-called homomorphic signature. A homomorphic signature is a signature with the property that if you give me the signature of w and the signature of w′, then I can generate by myself the signature of the linear combination αw + α′w′, and the important thing is that this signature generation can be done using just the public key, if I am given the signatures of the individual vectors. A homomorphic signature could have this form: the signature of αw + α′w′ is the signature of w to the power α times the signature of w′ to the power α′. That would be an example of a homomorphic signature that we would like to build and that will help in this case.

Before our work, all the signatures of this type that were used to solve the network coding authentication problem were based on bilinear maps. These of course have the problem of performance cost. Also notice that to use them we will be working over large fields, not 8-bit fields. Before, in network coding, we used 8-bit fields, which means that the overhead of the coding coordinates is m times 8 bits, while in these solutions the overhead will be 160 times m. So it's a 20-fold cost for the overhead of the coding.

Now the question we ask is whether we can use RSA for these signatures. It sounds like the natural thing to do, because RSA has these nice homomorphic properties. Most of the time they are not nice: they create problems, and you have to break the algebraic properties. But in this case we can really use them in our favor. So this is a natural idea.
It was tried in a couple of previous papers, and it was done wrong. Actually, there's this INFOCOM paper that is cited a lot, but even the basic arithmetic doesn't work there. So it's very secure, because the signature never verifies: you are never going to accept an invalid vector as valid, nor a valid vector as valid.

Anyway, we are going to use RSA, and there are really some technical issues here that we need to solve. So let me give you an idea of what is done here. I will skip many details and I will cheat, but this should be the general idea.

The basic signature here really uses the homomorphic, multiplicative property of RSA. We need to apply a signature to a vector, so first we will apply a hash function to the vector: an exponential hash function working over the group of quadratic residues modulo an RSA modulus, where we choose the modulus as a product of two safe primes.
We know that the quadratic residues group is then cyclic. So we will have a set of generators g_i, and we will apply this well-known hash function to the vector v_1 to v_m: the hash is the product of the g_i to the power v_i. After applying this hash function we will apply RSA. The good thing about this is that, since the hash takes us from an additive group to a multiplicative group and then RSA keeps us in the multiplicative group, overall we have this property that for any two vectors v and v′ the signature of the sum equals the product of the signatures. And if you do this with a linear combination with some coefficients α_i, then you get the α_i in the exponent.

So this is nice and simple, except that there is one problem. This equality is true only when the addition v + v′ is done modulo φ′(n). Here φ′(n) is the order of the quadratic residues group, which is φ(n)/4. The reason is that, since the v_i work as exponents of the g_i, they have to work modulo the order of these generators. But now we have a problem, because it means that we have to run network coding where the operations are done modulo φ′(n). You cannot work over a field. And what is even worse is that you cannot publish φ′(n), because if you publish it you give away the factorization of n.

So the solution, as in other instances similar to this, is basically that instead of doing the addition modulo φ′(n) we will do the addition over the integers. We will never do any modular reduction on the linear combinations of the vectors; we do everything over the integers. And in that case, where this is over Z, this equation will work. Okay, so that's fine.
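The hash-then-RSA construction and its homomorphic property can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the scheme from the paper (which, as the talk says later, is somewhat more complicated): the safe primes 23 and 47, the exponent 3, the bases 4, 9, 25 (squares, so quadratic residues, though not necessarily generators) and all function names are chosen here for illustration only.

```python
from math import prod

# Toy parameters, far too small for real use (illustrative assumptions).
P_SAFE, Q_SAFE = 23, 47                        # safe primes: 2*11 + 1 and 2*23 + 1
N = P_SAFE * Q_SAFE                            # RSA modulus n
ORDER_QR = (P_SAFE - 1) * (Q_SAFE - 1) // 4    # phi'(n) = phi(n)/4, must stay secret
E = 3                                          # public exponent, coprime to phi'(n)
D = pow(E, -1, ORDER_QR)                       # secret exponent
G = [pow(r, 2, N) for r in (2, 3, 5)]          # quadratic residues used as hash bases

def exp_hash(v):
    """Exponential hash: product of g_i ** v_i mod n."""
    return prod(pow(g, x, N) for g, x in zip(G, v)) % N

def sign(v):
    """Source only: RSA applied to the hash, using the secret exponent."""
    return pow(exp_hash(v), D, N)

def verify(v, sig):
    """Anyone: check sig ** e == hash(v) mod n."""
    return pow(sig, E, N) == exp_hash(v)

def combine_sigs(sigs, coeffs):
    """Anyone: signature of sum(coeffs[i] * v_i), computed from public data only."""
    return prod(pow(s, a, N) for s, a in zip(sigs, coeffs)) % N

v, w = [5, 0, 7], [1, 2, 3]
sig_v, sig_w = sign(v), sign(w)
mixed = [3 * a + 4 * b for a, b in zip(v, w)]  # integer combination, no reduction
sig_mixed = combine_sigs([sig_v, sig_w], [3, 4])
```

Here `sig_mixed` is built without the secret exponent and still verifies for `mixed`, because the combination of the vectors was taken over the integers.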
We can work over the integers. But that means that now we have to do all the network coding over the integers: instead of doing linear combinations over fields, we will do them over the integers. And now the question is whether network coding over the integers works. We know that it works over fields. What is the situation if we move to the integers?

So what do we mean by doing network coding over the integers? The vectors will be represented over the integers: the original file will be represented as blocks, each block being an n-dimensional vector over the integers. The transmission of the original vectors will be done as before, with a prepended set of m coordinates. Originally these coordinates are unit vectors, so they are already over the integers, and we will keep doing all operations over the integers, so these vectors will always be integer vectors. The intermediate nodes will do the mixing as before, except that now, instead of taking the coefficients from a field, they will take them from a set of integers that we call Q. And when targets receive the final vectors, they will be able to reconstruct as long as they can do the same linear algebra as before, except that the inversion of the matrix, which before was done over the field F, will now be done over the integers.

Okay, so how do we choose this set Q from which we take the coefficients?
We want it to be a small set, I mean a set of small numbers, because when we do operations over the integers and never do a modular reduction, the coordinates of these vectors keep growing. At every hop you do a linear combination, the coordinates grow, and they grow in proportion to the size of the coefficients. So you want the coefficients to be small. But if you take them too small, then you may not have a good probability of having an inverse over the integers.

This trade-off is solved nicely by a lemma that I'll call the fundamental lemma: fundamental because it is the lemma that enables this whole work, although as a theorem it is very simple. It's an observation, very easy to prove, but still very powerful here. What the lemma says is that the probability that you will be able to decode over the integers when you use coefficients from the set of numbers between 0 and q − 1, where q is a prime, is at least as good as (it may be better than) the probability of decoding when working over a field of size q. What that means is that for most practical purposes it's enough to use coefficients taken from a set of 8-bit integers.

Okay, so these are small integers. Still, every time that you compute a new linear combination you increase your coordinates by about 8 bits. But even in a network of, let's say, 25 hops we will still be fine. Remember, I told you that in existing cryptographic solutions we work with fields whose coordinates are 160 bits, which is 20 times 8. It means that here you can go to about 20 hops, actually beyond, 25, 30, and still on average have less overhead than with existing cryptographic solutions. So this is nice.
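A rough Python sketch of coding over the integers with small coefficients follows. It is only an illustration under assumptions made here: the prime 251 is picked as an 8-bit prime, each hop is modelled as a layer of nodes that re-mixes everything it received, and decoding is done with exact rational arithmetic. All names are hypothetical.

```python
import random
from fractions import Fraction

Q = 251  # assumption: a prime below 2**8, so every coefficient fits in 8 bits

def combine_int(vectors, rng):
    """Integer linear combination, coefficients in {0, ..., Q-1}, no modular reduction."""
    coeffs = [rng.randrange(Q) for _ in vectors]
    return [sum(c * v[k] for c, v in zip(coeffs, vectors))
            for k in range(len(vectors[0]))]

def run_hops(sources, hops, rng):
    """Model each hop as a layer of nodes that re-mixes all vectors from the previous layer."""
    layer = sources
    for _ in range(hops):
        layer = [combine_int(layer, rng) for _ in sources]
    return layer

def decode_int(received, m):
    """Gauss-Jordan elimination with exact fractions; None if the coding part is singular."""
    rows = [[Fraction(x) for x in r] for r in received]
    for col in range(m):
        pivot = next((i for i in range(col, m) if rows[i][col] != 0), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [x / lead for x in rows[col]]
        for i in range(m):
            if i != col and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[col])]
    return [[int(x) for x in r[m:]] for r in rows]

def coding_bits(vectors, m):
    """Largest bit length among the m coding coordinates."""
    return max(x.bit_length() for v in vectors for x in v[:m])

rng = random.Random(7)
blocks = [[3, 1, 4], [1, 5, 9]]
sources = [[1, 0] + blocks[0], [0, 1] + blocks[1]]
after_3_hops = run_hops(sources, 3, rng)
while decode_int(after_3_hops, 2) is None:     # retry in the unlikely singular case
    after_3_hops = run_hops(sources, 3, rng)
```

In this model the coding coordinates grow by roughly 8 bits per hop (plus about one bit for summing two terms), which matches the back-of-the-envelope figure in the talk: 160 / 8 = 20 hops before the overhead reaches that of a 160-bit field.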
It means that we can work over the integers: it gives us a good probability of reconstruction, and it allows an RSA-based solution. What is also interesting is that with this observation you can actually improve all previous schemes. Instead of choosing from the beginning coefficients that are 160 bits long, you can also start by choosing small coefficients, and you can show that even previous schemes are improved very much by using that fundamental lemma. For example, in the scheme of Boneh, Freeman, Katz and Waters, the cost of computing a signature at an intermediate node will be 20 times smaller by doing this, just because it's proportional to the length of the coefficients.

Okay, the RSA-based network coding signature is a little bit more complicated than what I showed before, but it's very similar. There are some technical issues that are important, because they are needed for security, for the proof to work, but I will not enter into the details here.

I will discuss the performance issues a little bit. The most crucial operation in the system is to create a signature from incoming vectors. That is, when a node generates an outgoing vector as a linear combination of incoming vectors, it has to compute a signature for this outgoing vector w, and the way it does it is simply by taking the signatures of the incoming w_i, raising them to the power of the coefficients, and multiplying them together. That's it. Now, if each coefficient is 8 bits long and you have 10 incoming vectors, it means that you will only have to do 80 multiplications. This is a very, very efficient operation.

And it's important that signature generation be fast, because verification is something that nodes can decide not to do. By the way, this whole thing is about denial of service. You can use an end-to-end signature to check that the file that you reconstruct is authentic; the problem is that the
attacker can just prevent you from reconstructing. So, verification: you can say, I will skip it, I am too busy now to do a verification; someone else down the way will do a verification and will throw out bad vectors.

In all, we have very, very efficient signature generation. Signature verification is the bottleneck here: its computation is about one to two times as many modular multiplications as the number of bits in the vector. One thing that we can do to improve that is batch verification: if you get 20 vectors, you don't have to check 20 vectors. You create a random combination and check that random combination; if that works, it means that the original signatures were correct with high probability. Also, one can use RSA with smaller moduli, like 512 bits, because, again, this is protection against denial of service: an attacker that is willing to spend time breaking 512-bit RSA has much easier ways to do denial of service in these cases.

Another good thing about this solution with RSA is that, while previous schemes used hundreds of generators (for example, for a 4-kilobyte block of data you needed something like 200 generators), here we can use just one.

A model of security for these signatures was introduced by the work of Boneh, Freeman, Katz and Waters, and we prove security in that model, in the random oracle model, based just on the plain RSA assumption, the hardness of inverting RSA. For one particular case (it doesn't matter what exactly that means here) we have a tight reduction.

So, to summarize. We see that network coding really is a very elegant and powerful alternative to traditional routing. It increases fault tolerance and capacity, and at the same time gives a completely decentralized approach to routing, since every node just takes everything it gets and does the random linear combinations.
There is no need for a central authority or anything that coordinates the actions of the nodes. It is also very adaptive to the situation in the network: if some node goes down, the network itself will resolve this just by having sufficient redundancy in the linear combinations.

Pollution attacks, which are the attacks we were talking about, are an actual threat against these networks. We said that end-to-end authentication will not solve the problem. Homomorphic signatures have been suggested before as a solution to this problem. So far we had solutions based on bilinear groups, which involve expensive operations and also incur very high bandwidth overhead relative to basic network coding. The RSA solution is conceptually simpler and has some performance advantages. This may still not be practical enough for many network coding applications, but it gets closer to something that we can hope to use. The main technical tool here is doing network coding over the integers, which in itself is an interesting mathematical problem. Actually there is more to be understood, but so far what we have is sufficient, at least for this cryptographic use.

We have more material in the paper. In particular, we show how to improve on previous schemes. And something else that is nice: we give a solution that doesn't use signatures, only homomorphic hashing. Again, this is an approach that was used before, but here the homomorphic hashing is done over composite numbers, and it has some very interesting properties, including the fact that we can use just a single generator, fixed to the number two, which is also very good because we do exponentiations to that base.

So overall it's an interesting problem with a somewhat interesting solution. Thank you.
So we have time for some questions.

If you're talking about RSA, we cannot do things ... the bilinear maps, so what do you do?

You use 8-bit coefficients, and the rest is the same as before; you don't change anything in the scheme. You just gain a bit of efficiency. At some point, if you have enough hops, you will go above 160 bits, which is the size of your prime, and from then on the coordinates will not increase anymore. So up to the first 20 hops... actually, the first 40 hops are for free, because... well, actually, everything is for free; in the first 20 hops you just have a net gain.