So the final talk of the session is by David Freeman, who's a postdoc, on homomorphic signatures.

Okay, thank you. So I'm originally from Minnesota, and when I was invited to give a talk at BWCA, I thought that meant the Boundary Waters Canoe Area, and I didn't really know what homomorphic signatures had to do with canoeing. But then I was told that it actually stands for Beyond Worst Case Analysis, and that made a lot more sense, because cryptography is, in fact, all about going beyond worst case analysis. In particular, consider the two famous characters of cryptography, Alice and Bob. Alice has some kind of key that is chosen at random from some distribution (maybe it came from some authority, or maybe she generated it herself, or maybe she got it from Bob), and she sends over some kind of secure communication that's a function of this key and of the message she's trying to send, and then Bob applies some kind of decryption or verification or whatever he wants to do. The adversary gets to see this communication and is working on breaking the security of the system. If the system is designed well, then to break it the adversary has to solve some kind of computational problem, and in particular, if the system is secure, this means that the problem is hard on average over the distribution of keys that Alice uses.

So here are some problems that we look at in public key cryptography in particular. There's the discrete logarithm problem: given an integer mod p and that integer raised to some exponent, compute the exponent. The factoring problem: decompose a number into its prime factors. And the shortest vector problem: given a lattice, compute the shortest vector, or the shortest vector up to some constant factor or some polynomial factor. These are all problems that have been used, with average case distributions, to construct public key cryptosystems. There are some connections to worst case hardness here. For the discrete log, within a fixed group (for a fixed prime p) we can reduce a worst case instance to an average case instance, but between different groups there's no known connection. For the shortest vector problem in lattices, as Luca mentioned in his talk on Monday, for certain distributions of lattices there are connections to solving lattice problems in the worst case.
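To make the average-case flavor concrete, here is a toy discrete-log instance in Python. Everything here is illustrative: the prime is tiny, so even brute force succeeds, whereas real systems use parameters that make this search infeasible.

```python
# Toy average-case discrete-log instance.  All parameters are illustrative:
# the prime is tiny, so brute force succeeds; real schemes use primes of
# thousands of bits, where no efficient recovery algorithm is known.
import random

p = 104729                        # a small prime (toy size only)
g = 5                             # base element; a real scheme picks a group generator
x = random.randrange(1, p - 1)    # Alice's secret key, drawn at random
h = pow(g, x, p)                  # the public value g^x mod p

# The adversary's computational problem: recover an exponent from (g, h, p).
recovered = next(e for e in range(p) if pow(g, e, p) == h)
assert pow(g, recovered, p) == h  # recovered exponent is equivalent to x
```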
So I want to talk about one thing in particular that we've worked on, joint work with Dan Boneh (in the back there) and also with Jonathan Katz and Brent Waters, and that's cryptography applied to network coding. Just as a review, let me describe a little of what network coding does. The idea is that we have a sender with some data, some file, that needs to be transmitted through a network of routers and reach one or more recipients. The sender breaks the file into blocks, mixes them up in some way, and sends them on to the first level of routers. The next level of routers applies its own transformation to the data and passes it on; the final level applies its own transformations and sends the result on to the recipients. These transformations are ultimately invertible, so the recipient can recover the original data. I want to mention that this applies to both online and offline applications; in particular, it's been suggested as a good way to download movies illegally. So in particular I want to look at linear network coding.

Specifically, if we want to transmit a file, we write the file as a sequence of vectors defined over some finite field: m vectors, each with n components, defined over the finite field F_p. The coding part comes in where we do what we call augmenting each vector: we take the i-th vector and append to it the i-th unit vector. This increases the length of the vectors by m, the total number of vectors, so ideally m is going to be smaller than n. Then we send these augmented vectors into the network. Each intermediate node receives some set of vectors, chooses random elements of the finite field, and forwards the linear combination of its vectors with those coefficients on to its neighbors; maybe it sends multiple vectors with different coefficients. To decode, the recipient receives a vector that has some data part w' and some coefficients in the augmentation coordinates. Since we started with unit vectors, these coordinates indicate exactly which linear combination was used to make w' from the initial vectors: it's c_1 times v_1 plus up to c_m times v_m. So if we get a full rank system, that is, m vectors whose augmentation parts are linearly independent, we can simply invert this matrix and recover the original vectors. In particular, if the ultimate recipient gets any basis of the subspace spanned by the initial vectors, it can recover those vectors. It's been shown that this technique can achieve the maximum channel capacity, and it's resilient to packet loss: if we miss a packet, we have other linear combinations and can continue to decode.

The problem comes if there is corruption, either random or malicious errors. We'll look at malicious errors, because we want to keep the worst kind of adversary in mind. If we have a corrupt router, the sender sends a vector to this router, the adversary pollutes the vector, and it gets sent on to the next level. The next level doesn't know that it's receiving polluted data and mixes it up with all the other data it's been given, and by the time it gets to the recipient, the data is completely corrupted and the recipient can't recover anything relating to the original file. So this is a problem, and the idea is to use cryptographic techniques to try to mitigate it. Here are some ideas that don't work. We could sign each of the initial vectors sent into the system; the problem is that the received vectors are ultimately different from the initial vectors, so there's no way to verify the signature. To verify a signature you need the message and the signature, and the message is gone. Another attempt would be to sign the original file, decode using the network coding, and then verify the signature. The problem is that if we receive a lot of packets and many of them are corrupt, we must try lots of subsets until we find one that contains only valid vectors; in particular, we probably have to try exponentially many subsets, so that's not a good solution either.
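Before getting to the fix, here is a minimal sketch of the honest coding and decoding just described, with toy parameters (the field size, dimensions, and function names are all illustrative, not from any particular implementation).

```python
# Minimal sketch of linear network coding over F_p: augment each data vector
# with a unit vector, let each node forward random linear combinations, and
# decode by inverting the augmentation block.  Toy parameters throughout.
import random

p = 257          # the prime field F_p
m, n = 3, 5      # m data vectors, each with n data components

def augment(vectors):
    """Append the i-th unit vector (length m) to the i-th data vector."""
    return [v + [1 if j == i else 0 for j in range(m)]
            for i, v in enumerate(vectors)]

def mix(received):
    """What an intermediate node does: a random linear combination mod p."""
    coeffs = [random.randrange(p) for _ in received]
    return [sum(c * v[j] for c, v in zip(coeffs, received)) % p
            for j in range(n + m)]

def decode(packets):
    """Recover the original data from m packets whose augmentation parts are
    linearly independent (true with high probability for random mixing;
    a real decoder would wait for more packets if this fails)."""
    # Row-reduce [C | W] mod p, where C is the m x m block of coding
    # coefficients and W the mixed data; the result is C^{-1} W.
    rows = [pkt[n:] + pkt[:n] for pkt in packets]
    for col in range(m):
        piv = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        inv = pow(rows[col][col], p - 2, p)        # inverse mod prime p
        rows[col] = [x * inv % p for x in rows[col]]
        for r in range(m):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % p for a, b in zip(rows[r], rows[col])]
    return [row[m:] for row in rows]

data = [[random.randrange(p) for _ in range(n)] for _ in range(m)]
packets = augment(data)
for _ in range(2):                      # two levels of routers
    packets = [mix(packets) for _ in range(m)]
assert decode(packets) == data
```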
So the idea that we proposed, building on prior work, is what we call linearly homomorphic signatures. The idea, to give a very basic example with two vectors, is this: if we're given two vectors v1 and v2 that have signatures on them, generated in some way, and we take the linear combination av1 + bv2, there's some kind of combine algorithm that takes the signatures and these combination coefficients and produces a signature that validates the combined vector. Using this combine algorithm, if we have signatures on all the initial vectors v1 through vm, we can obtain signatures on any vector in the space spanned by them. This would allow us to achieve hop-by-hop containment: each intermediate router can verify the signatures on the received vectors before using them in its own combinations and passing them on, and ultimately the recipient can just drop any vector that has an invalid signature.

There are some previous proposals that go toward solving this problem but have a few drawbacks. One is that the public key must be refreshed after every transmission. This isn't very useful if we want to send multiple files through the network, because it becomes a problem of sending the public key through the network as well. Also, some of these schemes are not optimal in that they generate large signatures: say, using the group of integers mod p, the signatures consist of m group elements per vector, while ideally we'd like, say, one group element per vector. So what we've done in a series of papers with Boneh, Katz and Waters is, first, we gave a well-defined security model for this network coding problem, and systems in this model can use a single public key to sign many files. I'll present two different schemes that are efficient in the sense of a minimal amount of signature size per vector signed. One is based on a discrete-log-type assumption, and it can be used to authenticate vectors defined over large finite fields; the other is based on a lattice assumption, and it can be used to authenticate vectors defined over small finite fields. Interestingly, we don't know how to do it the other way around using the other assumption. Then at the end I'll describe an extension to authenticate not only linear combinations but also polynomial functions on signed data, and this construction uses ideals in number fields.

So, a little bit about what exactly goes into a network coding signature; what are we trying to construct? We need some kind of setup algorithm that takes in a security parameter, which just says how hard the problem is for the adversary to solve, and outputs a public key and a secret key. The signing algorithm takes in the secret key, a vector, and also what we call a file identifier, this id. The idea is that the id binds together all the vectors in a single file: to sign one file, you sign each of its vectors with the same id, and this prevents the adversary from mixing vectors from different files. The security property will be that if I try to combine vectors signed under two different identifiers, the result will not verify. Then we have a verification algorithm that simply checks whether we have a valid signature. The homomorphic property lives in what we call the combine algorithm: as I outlined earlier, it takes in the linear combination coefficients and the signatures and outputs a new signature, and the idea is that if we have good signatures on v1 and v2, the new signature authenticates av1 + bv2.
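To fix the shape of these four algorithms, here is a schematic interface; the method names follow the talk, the bodies are placeholders, and nothing here is a real construction.

```python
# Schematic of the four algorithms in a network coding signature scheme.
# The bodies are placeholders; concrete instantiations are discussed below.

class NetworkCodingSignatureScheme:
    def setup(self, security_parameter: int):
        """Return (public_key, secret_key); the security parameter says how
        hard the underlying problem should be for the adversary."""
        raise NotImplementedError

    def sign(self, secret_key, file_id: bytes, vector: list):
        """Sign one vector under the identifier file_id, which binds together
        all vectors of a single file (preventing cross-file mixing)."""
        raise NotImplementedError

    def verify(self, public_key, file_id: bytes, vector, signature) -> bool:
        """Check that signature authenticates vector under file_id."""
        raise NotImplementedError

    def combine(self, public_key, file_id: bytes, pairs):
        """Given pairs [(a_1, sig_1), ..., (a_k, sig_k)] for vectors v_i,
        output a signature on a_1*v_1 + ... + a_k*v_k."""
        raise NotImplementedError
```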
Briefly, what does it mean for the system to be secure? There are two ways we could break the system, two ways to produce a forged signature. One is to produce a valid signature on some vector that the adversary has never seen before. There's a formal model in terms of a game, but at a high level this is like an ordinary public key signature scheme, where a forgery is a valid signature on a previously unseen message. The more interesting type of forgery is a valid signature that belongs to a file we have seen before, but where the vector is not in the span of the vectors that define that file. Again, the signature scheme can authenticate anything in the span of v1 through vm, and the security property is that one cannot easily authenticate anything outside of this span. In particular, this means that such a vector v* could not have been formed honestly as a linear combination of the initial vectors.

Okay, so now I'll give a very high level overview of the construction, with some building blocks, just to give a sense of how we do these sorts of things. The first ingredient we call homomorphic hashing. This is a hash function that takes an n-tuple of group elements in some finite group G, together with the vector that we want to hash (an element of F_p^n, to be thought of as our data vector), and outputs a group element. We want it to be collision resistant in the sense that, for a random choice of the tuple of group elements in G^n, it's hard to find a collision. So you think of this first argument, the tuple in G^n, as indexing the hash function, and the hash function then acts on vectors. The second property is that it's homomorphic: for a fixed index g, it's linear in the vector argument, so H(v + w) = H(v) times H(w) in the group G.
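As a minimal sketch of such a hash, here is the multiplicative-group instantiation (described in detail next), with the group mod a small prime q standing in for G and toy-sized parameters.

```python
# Toy homomorphic hash: the index is a random tuple (g_1, ..., g_n) of group
# elements (here in the multiplicative group mod a small prime q), and
# H(v) = g_1^{v_1} * ... * g_n^{v_n}.  The homomorphic property
# H(v + w) = H(v) * H(w) follows from the rules of exponents.
import random

q = 1000003                                       # toy prime; real q is huge
n = 4
g = [random.randrange(2, q) for _ in range(n)]    # random index in G^n

def H(v):
    out = 1
    for gi, vi in zip(g, v):
        out = out * pow(gi, vi, q) % q
    return out

v = [random.randrange(100) for _ in range(n)]
w = [random.randrange(100) for _ in range(n)]
assert H([a + b for a, b in zip(v, w)]) == H(v) * H(w) % q
```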
So, some examples: it's actually fairly easy to construct these things from crypto primitives. Using the multiplicative group mod p, the hash function is indexed by a tuple of elements g_1 through g_n in the group G, and the hash function raises each g_i to the power v_i and multiplies all those together. We can show that if you can find a collision for a random choice of the g_i, then you can compute discrete logarithms in this group. Another instance uses vectors over a finite field: here the group is F_p^m (think of m as smaller than n), and the hash function again computes a linear combination, v_i times g_i, summed over all i. Here we require the input vector to be low norm, say a zero-one vector; if that holds, then a collision can be used to compute a short vector in a certain n-dimensional lattice, and that's a problem we believe to be hard. I should note that the p's in these two constructions are different: in the first, p has to be very large, exponential size, while in the second construction p can be polynomial in n.

The second ingredient is a linear hash-and-sign signature. This is a signature scheme with a sign algorithm and a verify algorithm that has two properties. First, it's a secure signature scheme when we sign a message by hashing the message and then applying the sign algorithm, as long as the hash function is viewed as a random oracle, that is, as producing random output on previously unseen inputs. The second property is that if I don't hash, the sign algorithm is a homomorphism from the group to itself, or maybe to a different group. Basic RSA signatures are an example with this property: the textbook RSA signature algorithm is to take your message x and raise it to the power 1/e mod N, and if I instead hash first and then sign, I get a signature scheme that's secure in the random oracle model. As for its security: it's necessary that factoring N be hard, but the reduction is actually to the RSA problem, which is exactly the problem of recovering x from x^e mod N, and it's not known that these two are equivalent. The second example is the signature scheme of Gentry, Peikert and Vaikuntanathan. This is a lattice-based signature where the public key is a matrix A, an m by n matrix, which we think of as a short, wide matrix over a finite field, and the signature on x is a short vector s such that A times s equals x mod p. The security of this system in the random oracle model comes down to finding short vectors in a certain lattice related to the matrix A.

So we have these two ingredients, and we can put them together; at a high level, this is how the construction works (for the security proof we actually have to go into a little more detail, and we don't have a black box proof at the moment). Basically, the idea is this: we have a cryptographic hash function that hashes the file identifier into the group G^n, a homomorphic hash function that takes an element of G^n and a vector over F_p and gives a group element, and a hash-and-sign signature that signs group elements. We build our homomorphic signature by composing these three primitives: we take the file identifier, apply the cryptographic hash function, use the result as the index for the homomorphic hash function, hash the vector to be signed, and then apply the signature scheme. To verify, we simply verify the basic signature.
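Here is a toy end-to-end composition in exactly this shape, with textbook RSA standing in as the hash-and-sign layer (in the spirit of the RSA-based variant mentioned below). The key size, the way the index g_1 through g_n is derived from the file identifier, and the use of small integer coefficients are all simplifications for illustration; this is not any of the published schemes.

```python
# Toy composition of the three primitives (NOT a published scheme):
#   sign(id, v) = RSA-sign( HomHash_{ CryptoHash(id) }(v) )
# Textbook RSA plays the hash-and-sign layer; SHA-256 models the
# cryptographic hash deriving the homomorphic-hash index from the file id.
import hashlib

# Tiny, insecure RSA key for illustration (real N is thousands of bits).
p_, q_ = 1000003, 1000033             # two small primes
N = p_ * q_
e = 65537
d = pow(e, -1, (p_ - 1) * (q_ - 1))   # private exponent (Python 3.8+)
n_len = 4                             # vector length

def index_from_id(file_id: bytes):
    """Derive the homomorphic-hash index (g_1, ..., g_n) from the file id.
    Toy derivation; ignores the negligible chance of a bad g_i."""
    return [int.from_bytes(hashlib.sha256(file_id + bytes([i])).digest(),
                           "big") % N for i in range(n_len)]

def hom_hash(g, v):
    out = 1
    for gi, vi in zip(g, v):
        out = out * pow(gi, vi, N) % N
    return out

def sign(file_id, v):
    return pow(hom_hash(index_from_id(file_id), v), d, N)

def verify(file_id, v, sig):
    return pow(sig, e, N) == hom_hash(index_from_id(file_id), v)

def combine(a, sig1, b, sig2):
    """The homomorphic step: needs no secret key."""
    return pow(sig1, a, N) * pow(sig2, b, N) % N

fid = b"file-42"
v1, v2 = [3, 1, 4, 1], [2, 7, 1, 8]
s1, s2 = sign(fid, v1), sign(fid, v2)
a, b = 5, 9
w = [a * x + b * y for x, y in zip(v1, v2)]   # small integer coefficients (toy)
assert verify(fid, w, combine(a, s1, b, s2))
```

Note that combine uses only public values, so any router holding valid signatures can authenticate any linear combination it forwards.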
Because of the homomorphic property, to combine we simply compute the equivalent of the linear combination on the signatures: if the group is written multiplicatively, to combine sigma_1 and sigma_2 with coefficients a and b, we take sigma_1^a times sigma_2^b. This is correct basically because, for a fixed file identifier, the sign algorithm computes a homomorphism from F_p^n to this group G'. And this is why we require a new identifier to be chosen for each file.

So, some instantiations. In the original paper with Boneh, Katz and Waters, we used the discrete log hash function plus the Boneh-Lynn-Shacham signatures, which use pairings on elliptic curves. So we have to work in an elliptic curve group, and the security is based on the computational Diffie-Hellman problem in that group: given the three elements g, g^a and g^b, compute g^{ab}. If we can compute discrete logs then we can solve this problem; it's not known whether the converse is true. The security theorem says that any adversary that can break the signature scheme can be used to solve the CDH problem in this group. There's follow-up work by Gennaro, Katz, Krawczyk and Rabin using essentially the same hash function plus RSA signatures, and their security is based on the RSA problem. Again, these proofs model the cryptographic hash function as a random oracle; there has been some work on more complicated constructions without random oracles.

There's another instantiation we've more recently come up with using lattices: the lattice-based hash function I described earlier plus the GPV signatures. Here the security is based on the small integer solution problem: given some integer lattice (a full-rank discrete subgroup of Z^n) and some parameter, find a vector in this lattice with norm at most that parameter. Here are some reasons we looked at lattices for this construction. The first is that we can use small field sizes p: in the discrete log scheme we needed discrete logs to be hard in a group of order p, so p had to be exponential size, while with the lattice scheme we can use small p, even p = 2, which is an advantage if we want to do network coding with smaller vectors. Another advantage is that there are no known quantum attacks on the scheme, whereas there are quantum attacks for discrete log and factoring. And another advantage, of interest perhaps to the theory crowd, is that the distribution of lattices we use in the scheme admits a reduction to the worst case hardness of lattice problems, building on work of Ajtai, Micciancio and Regev.
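For the lattice side, here is the matching toy sketch of the SIS-style homomorphic hash just discussed: the index is a short, wide matrix A over F_p, the hash is H(v) = Av mod p, and linearity is immediate; a collision between two low-norm vectors v and w yields the short vector v - w in the kernel lattice of A, which is exactly a small integer solution.

```python
# Toy SIS-style homomorphic hash: the index is a short, wide matrix A over
# F_p and H(v) = A v mod p, so linearity holds by construction.  A collision
# between two low-norm vectors v, w gives the short nonzero vector v - w in
# the kernel lattice of A: a small integer solution.
import random

p = 13          # here p really can be small, even p = 2
m, n = 3, 8     # A is m x n with m < n
A = [[random.randrange(p) for _ in range(n)] for _ in range(m)]

def H(v):
    return [sum(a * x for a, x in zip(row, v)) % p for row in A]

v = [random.randrange(2) for _ in range(n)]   # low-norm (zero-one) inputs
w = [random.randrange(2) for _ in range(n)]
assert H([a + b for a, b in zip(v, w)]) == [(x + y) % p for x, y in zip(H(v), H(w))]
```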
So in, I guess, the last couple of minutes, I want to talk about an extension, a more general problem, and I'll motivate it with an example. We have Alice, who wants to sign some data: she's a professor, she has a list of grades for the students in her class, and she wants to sign each of the grades and store the data on an untrusted server. Say the first student, Ann, has a score of 91; stored alongside is a signature that authenticates that the student is Ann, that the score is 91, and that this belongs to the grades database. This tag, "grades", plays the role of the file identifier from the previous construction. So Alice does this and then she goes away. Sometime later, Bob, a student in the class, wants to query some function of the grades data; maybe he wants to know the mean, or whether it's above or below some value. So he queries the server with some function, and the server returns the value of the function along with a signature that authenticates that the function was computed correctly. So if he queries the mean and the mean is 87.3, the signature sigma would authenticate the fact that the value is 87.3, that it comes from the grades database, and that the function requested was the mean function.

So this is a more general problem: we want a signature scheme that is homomorphic in the sense that it can authenticate computations on signed data, and the question is what kinds of functions we can achieve this for. The network coding signatures I've talked about up to now solve the problem for linear functions, so if Bob wants to know the mean, we can use a network coding signature. And in work from Eurocrypt this year, we show that if we take one of the lattice constructions for network coding signatures and change the lattices to use ideals in number fields (which can be viewed as lattices, if you think about them the right way), then we can solve the problem for polynomial functions of bounded degree. So instead of just querying the mean, Bob can query the standard deviation, or maybe do least-squares fitting to the data, using the polynomial computations. And of course the million dollar question here is: can we do fully homomorphic signatures? Could we come up with a scheme for arbitrary functions of the data and have the result be authenticated? I guess with that I will stop and open it up for questions.

[In response to audience questions:] The schemes I talked about do use random oracles, but there are some network coding schemes that don't; they're considerably more complicated, sort of built on identity-based encryption ideas, but non-random-oracle constructions do exist. And yes, CS proofs can be used to solve this problem; I have a slide in a different talk where I address this. What we're looking for here is efficiency: CS proofs can achieve the same goal in terms of functionality, but our systems are much more efficient. They don't require the PCP machinery and can actually be implemented fairly efficiently.