Hi everyone, welcome to this talk, and thanks for your interest in dynamic decentralized functional encryption. This is joint work with Jérémy Chotard, Romain Gay, Duong Hieu Phan and David Pointcheval. My name is Edouard Dufour-Sans; I worked on this in part while I was at École Normale Supérieure in Paris and in part while I was at Carnegie Mellon University in Pittsburgh, and I'll be giving the talk today.

I think we can all agree that one subarea of computer science has had unbelievable growth during the last two decades or so. And unfortunately, if you're at this conference, you probably missed out, because I'm not talking about cryptography. I'm talking about machine learning. The convergence of the massive availability of many types of data, the affordability of highly parallel computing equipment, and algorithmic advances relevant to machine learning has ushered in a new wave of very successful intelligent products. Of course, it has also raised enormous privacy concerns, because these machine learning models are trained using databases of sometimes very personal data that are under the control of a few institutions. Because it's so hard to argue against the benefits of better artificial intelligence, it's unlikely that consumers would prefer a system where no data mining occurs. But perhaps cryptography can provide a solution that protects privacy without sacrificing the benefits of machine learning.

Now, one might ask, since clearly we're going to be looking at some form of computation involving encrypted data, whether this is something that's already solved by fully homomorphic encryption. And while you can certainly construct solutions to this problem from FHE, I would like to argue that it's not the ideal theoretical cryptographic primitive for our scenario. Recall that FHE is for computing on data you can't see. So in FHE we typically have a server computing on data that is owned by a client, and the client, because it knows a secret key, can decrypt the result of the computation. It
makes sense for offloading computation to a server, whether for performance reasons or because the model is too large for the client to store. But in the scenario we are describing, we want the server to run the computations but also to get back the result. FHE can't achieve that, or at least not in a non-interactive way. And to be clear, what we mean by non-interactive here is that, in the long run, you should just be sending a single ciphertext of the data you just generated, and then it's up to the server to gather the data of the other parties and to perform the computations. Non-interactivity matters when your clients are regular internet users, who will be on and off and are unlikely to have the time or resources to participate in a computationally heavy, massively multi-party protocol. So today we will be talking about something else, something that lets the server aggregate our data with that of other participants, and in a non-interactive way. That's dynamic decentralized functional encryption, or DDFE for short.

So here's the plan for today. First, we'll briefly talk about functional encryption, its history, and how it relates to DDFE. Then we'll define DDFE, both formally and with some helpful examples. And then we'll give constructions of DDFE for a few functionalities. We'll start by looking at something we call decentralized sum. In the process of building that, we'll notice that we fall a bit short of the security guarantees we want, and that it would be nice to start by adding something to our toolbox. That tool will be all-or-nothing encapsulation, and we'll show how to build it from identity-based encryption. Finally, we'll present a more complex primitive, a decentralized inner product scheme, and we'll try to give an overview of the key ideas behind our construction.

The history of functional encryption, in a sense, goes all the way back to the invention of public-key encryption. Public-key encryption is one of the most basic forms of functional encryption. In 2001 we get
identity-based encryption, which allows for some access control. And then in 2006, attribute-based encryption, which is stronger and allows for more complex forms of access control. And it all culminated in 2011 with functional encryption, which is even stronger and allows for computations on the plaintext data.

But there's something deeper about functional encryption. It's interesting because it's not just a stronger, more powerful variant of its predecessors. Functional encryption is a framework. What's great about functional encryption is that while the definition allows for schemes that support general computations and general access control structures, it still captures schemes that are much more limited. Public-key encryption, identity-based encryption and attribute-based encryption are all forms of functional encryption, but they are not instances of one another. A great feat of functional encryption is giving cryptographers a common language to describe many possibly very different schemes. All it takes is giving the functionality of the scheme, and then the definitions and security game follow immediately.

Still, functional encryption has its limitations. There are two that are relevant to what we're talking about today. One is that it doesn't really allow for computations that involve multiple parties. That's a non-starter for what we talked about earlier, aggregating data from multiple users. Multi-input and multi-client variants of functional encryption were introduced to address this point, but they inherited another issue of functional encryption.
There is a master secret key, the holder of which can basically learn anything it wants. We tried moving away from this in 2018 with decentralized multi-client functional encryption, but the primitive we defined required the set of participants to be fixed at the beginning of time, and no new members could join from that point on, so it was not dynamic. Ad hoc multi-input functional encryption came in a year later. It was dynamic, but in a sense it lacked the full definitional power that functional encryption had brought. The definitions are tailored to their constructions, and that framework cannot capture some of the functionalities we'll present here today.

Now that you have the context, I'll start talking about what DDFE is. We'll start with an example. So we have four people: Alice, Bob, Charlie and Diane. They have some pictures of themselves that they're probably storing on some device. Now, those pictures would be valuable to a company, which could use deep learning to extract valuable intelligence from the data. However, sending the data directly to the company so it can mine it could be detrimental to the privacy of our subjects. They might not agree to participate, or be unhappy about it if it happens without their consent. Lucky for them, as you may have noticed earlier, they're dealing with Not Very Evil Corp, and that firm decided it would use DDFE so it can achieve its objectives while protecting its customers' privacy. Full disclosure here: training an actual neural network is not something that's realistic to do with the cryptographic tools we have today. So this example is about what the dream application would be, what we're working towards.

So all of our friends here would encrypt their data under DDFE. There's some metadata associated with each ciphertext, namely the date and the set of participants. And the policy will be, let's say, that data can only be aggregated between participants that all agree on the date and the set of participants. So they send off those
ciphertexts to Not Very Evil Corp. And at that point these are totally opaque to the company. It might as well be your favorite flavor of IND-CPA encryption. That's because the company has no functional keys, so the company has to go back to the participants for that. In this case the participants are willing to help, so they each independently compute a functional decryption key of their own. The key is associated with the set of participants and a specific method for training a neural network. Again, all of those will have to match across all participants. And now the corporation has all it needs to start working. Using the functional keys on the ciphertexts, it can extract knowledge from the data and nothing more. It doesn't get to see the plaintext data, only the result of the training. Okay, that's it for our example.

Now let's get a bit more formal. For a DDFE scheme, there's a set of keys K and a set of messages M. A DDFE scheme will first be characterized by its functionality. It's essentially the set of functions the scheme can compute, and it also describes how different keys enable different functions. For DDFE, a functionality takes a list of (public key, functional key) pairs and a list of (public key, message) pairs. The way to think about it is that you'll be mixing a bunch of keys and a bunch of messages together to get some value out, as we saw earlier with the neural network training. And each key and each message is fundamentally tied to the identity of its creator, so that we can tell when someone is allowing computation on their own data.

Now for the flow of things. You would start with a setup, which generates some shared parameters. Maybe everyone agrees on a hash function, on elliptic curve parameters, and so on. You do this once, and then everyone can generate their public/private key pair. They advertise the public key; that's their identity for all intents and purposes here. And they keep the secret key to themselves. They'll use it for encrypting and generating functional keys.
Finally, anyone can decrypt once they have enough ciphertexts and enough functional keys. That's maybe a bit terse. Let's see how we can connect those concepts to the example we saw earlier. So in our case, you'll remember that our participants would embed into their functional key both the set of participants as they perceived it and the training algorithm. That set of participants would really be a set of public keys, and maybe that algorithm was represented as a circuit. Hence the set of keys here: the Cartesian product of the set of sets of public keys and a set of circuits. The message contained an image, a date, and again a set of participants, so the set of messages is pretty straightforward. Now, clearly we want to hide the image, but perhaps those other attributes don't need to be hidden. That's all okay. It can be expressed with the functionality, as it would be with traditional functional encryption.

Now let's take a look at the functionality. It's a bit ugly at first, but we can break it down together. The first argument is a set of keys associated with their users. Recall that the keys are composed of a set of participants and a circuit. Here the keys are not indexed by pk, so it's very simple: they are the same for all participants in the set. They contain that same set of participants and a circuit that performs neural network training. Up next we have the messages. Here again, everyone agrees on the date and everyone agrees on the set of participants, but each participant has their own image. And so the result of that evaluation will be a neural network trained on a set of images, simple as that. Notice that this is a simple functionality, in that the set of users who provide keys is the same as the set of users who provide messages, while our framework doesn't actually require that.

Now that we've had an overview of what DDFE is, we can move on to actual constructions. We'll start with decentralized sum. At a high level, decentralized sum is about computing sums. Or, to be accurate,
repeated group operations in finite abelian groups. In practice, the group A will often be that of modular integers. So our messages will contain a group element that we want to aggregate with other people's group elements. We need to specify the set of people we want to aggregate with, and we'll want to agree on a label. In practice, this is something that might be set to the date, so that if we want to aggregate different data later on, an attacker can't come in and mix and match our data from the different aggregations to learn more than they should.

What's maybe striking about this functionality is that there are no keys, only ciphertexts. That means any functionality evaluation will have an empty list for its list of keys. We denote that empty list by epsilon. Even in traditional single-input functional encryption, there's a key epsilon that serves to capture the default leakage from ciphertexts. But in a single-input setting, that's usually something limited, like the length of the plaintext. Here, in a multi-user setting, we can have more complex data leakage, depending on the set of ciphertexts that are matched together. So for DSum, if the participants agree on the set of participants and the label, the sum will be revealed.

So, how can we build a DDFE scheme for the DSum functionality? Here's a good starting point. If each party were able to compute a mask, such that the masks taken together cancel out, then we would have it mostly figured out, right? Because we can simply publish our element hidden by the mask, and given all of those ciphertexts, you would just add them up and you would immediately recover the sum of the plaintexts. But can we actually do that? Can all of those participants sample a mask that is not uniformly random
but belonging to a structured distribution, without relying on a trusted third party, without coordinating and without communicating? The answer to that is yes, in a computational sense. A solution appears in Chase and Chow 2009 and is attributed to Brent Waters. It consists in having each pair of parties compute a shared key through a non-interactive key exchange scheme. Then, from that shared key, they can compute shared randomness that is specific to a label, simply by evaluating a PRF with that key on that label. And now they can combine all the randomnesses they computed according to this formula here, where only some of them have a negative sign. Now it's easy to see that if my randomness with you has a positive sign, then your randomness with me has a negative sign. When we sum it all up together, all the randomnesses cancel out, as we were hoping for.

So is this enough to construct DSum DDFE? It's almost enough, but there's an issue that will stop you from proving security. Let's take a simple example, with Alice wanting to aggregate data with Bob, and Charlie being the one who will compute the sum. Given only Alice's ciphertext, Charlie should learn nothing, because he needs one from Bob to decrypt. Once he gets Bob's ciphertext, Charlie should learn the sum. But now, what happens when Alice generates two ciphertexts for the same label and the same pair of participants? It's not obvious why Alice would do that, but hey, nothing's stopping her from generating those ciphertexts. So here she goes. Now, what should Charlie learn here? In theory, Charlie should not learn anything, because he still doesn't have a ciphertext from Bob to evaluate with. But if we just use the mask we talked about earlier, since that mask is deterministic, Charlie can cancel the mask out and evaluate the difference between the original messages. Now, you might argue: hey, Charlie was going to learn that by linear combination the second Bob puts out some ciphertext, so what's the big deal?
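As an aside for readers of the transcript, the pairwise masking idea and the repeated-ciphertext issue just described can be illustrated with a small sketch. This is not the construction from the paper: the non-interactive key exchange is replaced by one random key per pair of parties, the PRF is stood in for by HMAC-SHA256, the sign rule (positive toward the alphabetically later name) is an arbitrary choice, and the modulus `Q`, the party names, the label and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets

Q = 2**61 - 1  # illustrative modulus for the additive group Z_Q (assumption)


def prf(key: bytes, label: bytes) -> int:
    """PRF stand-in: HMAC-SHA256 output reduced mod Q."""
    digest = hmac.new(key, label, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % Q


def shared_keys(parties):
    """Stand-in for NIKE: one random key per unordered pair of parties."""
    return {
        frozenset((a, b)): secrets.token_bytes(32)
        for i, a in enumerate(parties)
        for b in parties[i + 1:]
    }


def mask(me, parties, keys, label):
    """Signed sum of pairwise PRF outputs; each pair contributes +r and -r."""
    total = 0
    for other in parties:
        if other == me:
            continue
        r = prf(keys[frozenset((me, other))], label)
        total += r if me < other else -r
    return total % Q


def encrypt(me, x, parties, keys, label):
    """Deterministic masking of x, as in the naive scheme discussed above."""
    return (x + mask(me, parties, keys, label)) % Q


parties = ["alice", "bob", "charlie", "diane"]
keys = shared_keys(parties)
label = b"label-1"
inputs = {"alice": 5, "bob": 7, "charlie": 11, "diane": 13}

# The masks cancel, so adding all ciphertexts reveals the sum of the inputs.
mask_sum = sum(mask(p, parties, keys, label) for p in parties) % Q
ciphertexts = {p: encrypt(p, x, parties, keys, label) for p, x in inputs.items()}
recovered_sum = sum(ciphertexts.values()) % Q

# Repeated query: two ciphertexts from Alice under the same label share a
# mask, so their difference leaks the difference of the plaintexts.
c1 = encrypt("alice", 5, parties, keys, label)
c2 = encrypt("alice", 9, parties, keys, label)
leaked_difference = (c2 - c1) % Q
```

Here `recovered_sum` equals the sum of the four inputs, and `leaked_difference` equals 9 - 5 even though no other party's ciphertext was involved, which is exactly the leakage under discussion.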
But maybe Bob was never going to do that. So what we want is to have a clean security model that's easy for everyone to understand, without caveats that are added in because the cryptographic structure just doesn't quite get us the most natural thing.

To that end, we introduce all-or-nothing encapsulation. It's the DDFE functionality that will solve our problem. In AoNE, the message is some fixed-length data and, again, a set of users and a label. Much like DSum, AoNE has no keys, and decryption requires that all ciphertexts agree on the label and the set of participants. Unlike DSum, AoNE simply reveals all of the plaintext messages and their association to a participant. So, if AoNE reveals all of the data, what is it good for? Well, it only allows this reveal once all the ciphertexts have been received. That's essentially what we were missing for DSum: the ability to hide everything until everyone has contributed a message.

Now, here's a simple idea for constructing AoNE. Everyone generates an identity-based encryption key pair and advertises the public key. When you want to encapsulate data, you encrypt it under as many layers of IBE as there are participants. Each layer is under a different participant's public key, and under identity L. Finally, you also include your functional key for identity L, and that's pretty much it. Once everyone has done that, you have all the keys for identity L, so you can remove all the layers of IBE and get back all the plaintexts. Now, the problem with the solution we just described is that each individual ciphertext has size linear in the number of participants. The great news is that if you carefully instantiate the above construction with Boneh and Franklin's original IBE from 2001, that problem goes away: you can get succinct ciphertexts. If you're curious, please refer to the paper for details.

We can finally move on to our most interesting functionality: inner product. It's the most complicated one, so we'll simplify things a bit so we
can focus on the important ideas. The messages contain a scalar and, as we're used to by now, a set of users and a label. The keys are a vector over a set of participants, that is, a set tying each participant to a scalar. And the functionality is that if all messages agree on the set of participants and the label, and all keys agree on a unique vector y over that same set of participants, we should be able to compute the inner product, which is the sum of the message scalars weighted by the key scalars.

As a starting point, let's look at the construction for inner-product multi-client functional encryption we introduced at Asiacrypt 2018. Here, we're reusing DDFE notation. For those of you who might not be familiar with MCFE: basically, in MCFE there is a central authority which will assemble keys for people and hand out the functional decryption keys by relying on its global knowledge of everyone's key. In DDFE, we have to compute those keys in a decentralized way, which is much harder. So this is an MCFE that has been rewritten to try to achieve a DDFE, but the biggest challenges remain ahead of us.

The basic idea here is that encryption is somewhat ElGamal-like, but the message is placed in the exponent, because you want to achieve inner product, which is essentially an additive functionality. And instead of sampling a random exponent, we use a random oracle to compute a random element that is shared between all participants for a given label. Because you don't know the discrete logarithm of that random oracle output, you can only encrypt if you know the secret key. But that's fine, because IP-DDFE is not a public-key functionality. For the functional key, if you're trying to give out a key that will enable the user to evaluate inner products with a vector y, well, recall that we essentially have a vector s of secret keys, so we simply give out the inner product between y and s. And then there's a way to combine a bunch of ciphertexts with the functional key. It uses the structure of the
group, and you can do the math: it works out and you get back this inner product. As is common with group-based inner-product functional encryption schemes, that result is in the exponent of the group generator, so you have to make sure the result of your computation isn't too big, so that you can compute a discrete logarithm. It's a bit unfortunate, but it's really manageable in practice.

So how can we distribute the key generation in this scheme, so it is closer to being a proper DDFE? Well, we said the key is the inner product between the vector y from the functional key and the vector s of secret keys. Another totally equivalent way of looking at it is that it's the sum of the y's multiplied by the appropriate s's. If you see it as a sum, you realize that for key generation you really want a decentralized, non-interactive way to evaluate this sum. But that's what we were talking about just a few minutes ago. That's DSum, which does indeed let us decentralize key generation.

Another issue you run into is that of repeated queries, as we did earlier with DSum, because encryption is again deterministic. So you can use the group structure to figure out the difference between two plaintexts from two ciphertexts for the same set of participants and the same label. Thankfully, it turns out that the AoNE which we had used for DSum also helps here. Finally, we simplified things a bit here by having simple scalars as messages, but we can actually have vectors as messages. That actually complicates things a bit as far as repeated queries are concerned. This can be addressed by combining a layer of single-input inner-product functional encryption with another layer of AoNE on the functional keys. Again, I'll refer you to the paper for details.

Allow me to conclude by recapitulating our main contributions. First, we defined dynamic decentralized functional encryption, which is a framework for describing a variety of cryptosystems that enable a server to perform controlled computations on
data from a variety of clients. Next, we defined three interesting DDFE functionalities, for which we also provide constructions. All-or-nothing encapsulation is an extremely helpful building block for achieving a natural notion of security in our other functionalities, and we give a generic construction thereof from identity-based encryption, with a succinct variant from bilinear maps. We then define decentralized sum, for which our construction makes use of AoNE and of non-interactive key exchange. Finally, we use both of those functionalities, in addition to prime-order groups, to construct a decentralized scheme for the functional evaluation of inner products over sensitive data.

That's it for today's presentation. I'd like to thank you again for your interest, and I look forward to interacting with you and answering your questions during the conference.