Hi, my name is Michele, and I will be presenting our work on anonymous tokens. This is joint work with Ben Kreuter, Tancrède Lepoint, and Mariana Raykova. In our work, we introduce anonymous tokens, a lightweight single-use anonymous credential: basically, authorizations that can be used only once. Throughout this talk we will focus on credentials that are secret-key, meaning that whatever entity issues these tokens will also be the entity that redeems them. We will also require these tokens to allow the issuer to embed a secret bit that can be read at redemption time and is hidden from the user. This, for instance, allows for the construction of a blocklist that is not detectable by the user, at the cost of slightly diminishing the anonymity set.

So let me share a couple of stories from the real-world trenches on why this primitive is relevant. On the internet, it is difficult to identify whether requests come from legitimate users or bots. It is even more difficult to do so with privacy in mind, without logging services or third-party cookies. Generally, website protection services like Cloudflare act as a middleman that filters requests between a user and a web server, for instance a CDN. They assess the trustworthiness of a user mostly on the basis of their IP address, which leads to a lot of false positives, especially in the case of shared IPs, as with anonymity services like Tor or I2P, or also in the case of VPNs. In Cloudflare's case, if an IP was suspected of belonging to a bot rather than a legitimate user, another round of communication was added and the user was presented with a challenge. This basically meant a CAPTCHA, often more than one. Because of this, the web experience of Tor users became basically unbearable: there were #dontblocktor hashtags on Twitter, and at hacker conferences you could even see stickers like this one. A few years later, however, Cloudflare implemented and deployed a solution called Privacy Pass.
Privacy Pass is an anonymous token scheme. It comes in the form of a browser extension: after successfully solving a CAPTCHA, the user also receives a bunch of these tokens, which can later be spent instead of solving new CAPTCHAs. Because of the cryptographic properties of the tokens, the anonymity of the user is preserved, while at the same time it is not possible to spend more tokens than have been issued.

More recently, other tech companies joined the party. Brave is now using anonymous tokens in order to reward, in a privacy-preserving manner, users who receive advertisements in their browser. Facebook wants to use them in order to assess whether a user stands out as a statistical anomaly: more precisely, they would like to understand, for instance, whether a user clicked too frequently on too many ads without completing any purchase, while at the same time avoiding sharing sensitive information about users with other websites. Google, and more specifically Chromium, want to get rid of third-party cookies. Third-party cookies are cookies that can be read by different origins: for instance, they can be set by Google and then be read while I'm visiting bbc.com. They can be used to assess whether a user is fraudulent, but at the same time they allow for tracking across different websites. Ideally, we would like to be able to prevent spammy behavior without tracking an individual across the web. One solution for this is to issue tokens from popular websites and provide a service that allows redeeming them whenever we have an activity that is susceptible to abuse.

In some cases, it is also crucial to protect against adversarial learning. What this means is that, for instance, if the issuer detects malicious behavior and decides not to issue a token, this difference in response can be used to train an algorithm that understands what kind of behavior led to spam detection and what didn't.
In these cases, we want to issue credentials also to suspicious actors, and then decide at redemption time what to do with them: for instance, whether to provide a service to a user on a blocklist. At this point the issuer, if malicious, could also split the anonymity set in two; there is a trade-off at play here between functionality and anonymity, but it is still a better solution than tracking. In all of these cases, what you want is a functionality called private metadata.

So how do we formalize such a credential system? We are basically looking for two protocols: an issuance protocol, possibly of only one round, at the end of which the user gets a token, taking as input a nonce from the user and, in the case of a private metadata bit, a bit chosen by the issuer; and a redemption algorithm that allows checking whether the token is valid and reading off the bit from the token. For simplicity, we will assume that the user communicates with the issuer over an authenticated and encrypted channel, so basically over TLS all the time, and that there is no man in the middle that can steal the tokens from a user. In the paper, however, we also provide generic ways of achieving security in the face of man-in-the-middle attacks, which in this setting are called token hijacking.

If we forget about token hijacking, there are three basic security properties that these algorithms should satisfy. First, unlinkability: after interacting with multiple users, it should be difficult for the issuer to link a particular token to its issuance session. In the case of a private metadata bit, we demand that it is not possible to link two sessions as long as they have the same bit. Formally, this is achieved by letting the adversary pick even the public parameters, but then demanding the existence of an extractor that can find out the hidden bit from a token.
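The two protocols just described, a one-round issuance and a redemption algorithm that reads the bit, can be summarized informally as an interface sketch. The names and type shapes here are my own shorthand, not the paper's notation:

```python
# The names and type shapes here are my own shorthand, not the paper's notation.
from dataclasses import dataclass
from typing import Any, Optional, Tuple

@dataclass
class Token:
    nonce: bytes          # t, chosen by the user
    sigma: Any            # issuer-derived part, unlinkable to the session

class AnonymousTokensPMB:
    """One-round issuance (blind / sign / unblind) plus redemption."""

    def blind(self, nonce: bytes) -> Tuple[Any, Any]:
        raise NotImplementedError         # -> (user_state, message_to_issuer)

    def sign(self, message: Any, bit: int) -> Any:
        raise NotImplementedError         # issuer embeds the private bit

    def unblind(self, user_state: Any, response: Any) -> Token:
        raise NotImplementedError

    def redeem(self, token: Token) -> Optional[int]:
        raise NotImplementedError         # hidden bit if valid, else None
```

Any concrete instantiation, like the protocols in the rest of this talk, fills in these four methods.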
If you are familiar with blind signatures, this is somewhat close to the definition of blindness, if we forget about privacy of the private metadata bit. One-more unforgeability, instead, protects the issuer: it says that it is difficult for the user to spend L+1 valid tokens after interacting with the issuer L times. In the case of a private metadata bit, we allow the adversary to request L issuances for each bit, but ask for L+1 forgeries on the same bit. Finally, privacy of the metadata bit says that an issuance session with the bit set to 0 should be indistinguishable from an issuance session with the bit set to 1. In the indistinguishability game, the adversary is also able to observe multiple sessions, even for bits of their own choice, and then must make a guess on a challenge session. In the paper, we also deal with a verification oracle, namely an oracle that checks whether a token is valid or not; for the sake of simplicity, I'm not going to consider it for the rest of this talk, and I invite you to check the paper for the stronger security model.

And why are we doing all this? Why are we giving these definitions? Well, because people want to standardize these protocols. The W3C would like to provide a JavaScript API in the browser that allows requesting and redeeming tokens. The IETF is standardizing Privacy Pass, the protocol that I mentioned when I spoke about Cloudflare, including extensions such as the private metadata bit. And we believe that it is important that these new cryptographic primitives undergo a formal assessment before deployment. Our contribution in this work has been to formally set down these definitions and provide a number of efficient protocols that satisfy them in the random oracle model, under standard assumptions and without pairings. More precisely, we provide new protocols that efficiently implement anonymous tokens with a private metadata bit.
We also illustrate new techniques for getting rid of the zero-knowledge proofs, both in previously published protocols and in the ones that we provide ourselves. Unfortunately, there are not many options already available in this space if we want anonymous tokens with private metadata. Full-fledged anonymous credentials are just too expensive: if we think of the advertisement use case, for instance, we need an anonymous token with fast redemption. They are also public-key. There are algebraic MACs, which are being used in Signal right now and cover a similar space, but unfortunately they do not support private metadata, and they are also somewhat slower than Privacy Pass, for instance. We also have blind signatures and variations of blind signatures, for instance conditional blind signatures, that allow for private metadata. However, again, they are public-key, and as we will see later, conditional blind signatures are insecure under many parallel sessions.

The starting point of our work is Privacy Pass, a protocol without private metadata. Privacy Pass assumes that the participants share some public information: a prime-order cyclic group and a group element chosen by the issuer, of which only they know the discrete log. The protocol consists of a blinding phase, where the user blinds the nonce and sends it to the server. The server then proceeds with a signing phase, where it computes the CDH of the blinded value and the public parameter. Finally, the user unblinds the token. Verification at this point simply consists in checking whether the given group element is the CDH of the hash of the nonce and the parameter provided. Now, unforgeability of this scheme is exactly one-more Diffie-Hellman: the challenger gives out many challenge group elements via the random oracle, and the adversary has to compute one more CDH in order to produce a forgery.
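The Privacy Pass flow just described can be sketched end to end. This is a toy, assumption-laden version: it uses a small subgroup of Z_p^* in place of an elliptic-curve group, and it omits the zero-knowledge proof discussed next.

```python
# Toy end-to-end sketch of Privacy Pass issuance and redemption, using the
# order-q subgroup of squares of Z_p^* with p = 2q + 1 (real deployments
# use an elliptic-curve group such as Ristretto).  No zero-knowledge proof.
import hashlib
import secrets

p, q, g = 1019, 509, 4        # safe prime p = 2q + 1; g generates the squares

def hash_to_group(t: bytes) -> int:
    """Hash a nonce into the order-q subgroup: squaring lands us there."""
    h = int.from_bytes(hashlib.sha256(t).digest(), "big")
    return pow(h % (p - 2) + 2, 2, p)     # avoid the degenerate 0/1 cases

# Issuer key: secret x, public parameter X = g^x.
x = secrets.randbelow(q - 1) + 1
X = pow(g, x, p)

# User blinds the nonce:  T' = H(t)^r
t = b"my-nonce"
r = secrets.randbelow(q - 1) + 1
T_blind = pow(hash_to_group(t), r, p)

# Issuer "signs" by computing the CDH of T' and X:  W' = T'^x
W_blind = pow(T_blind, x, p)

# User unblinds:  W = W'^(1/r) = H(t)^x
W = pow(W_blind, pow(r, -1, q), p)

# Redemption: the issuer, knowing x, checks that W is the CDH of H(t) and X.
assert W == pow(hash_to_group(t), x, p)
```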
Unlinkability, on the other hand, is more difficult, because the server could have used a different key to compute the CDH in each session and thus link a user to a specific key. By the decisional Diffie-Hellman assumption, it is impossible for the user to tell whether the same secret key has been used. For this reason, in fact, we must add a zero-knowledge proof that guarantees that the computation of W was done correctly. By zero-knowledge, the proof can be simulated in the unforgeability game, and the proof goes exactly as before; by soundness, the protocol is now also unlinkable, because we are guaranteed that the same key has been used and because T is completely unrelated to the nonce.

Now, as I mentioned, this protocol does not have private metadata, and one trivial way to try to support it could be to have two keys and provide a proof that either one of the two has been used. During the issuance phase, we would now use one of the two keys depending on the bit that is chosen, and our proof would be an OR proof that either one of the two published keys has been used. While, very informally, this protocol achieves unlinkability and unforgeability, it does not really hide the private metadata bit. In fact, consider an attacker that starts two parallel sessions with the same nonce t and at the end receives the CDH values (and also the zero-knowledge proofs, which I am going to ignore). If the bit used was the same in both sessions, determinism will lead to the same group element; otherwise it won't. So the adversary has learned some information about the hidden bit. This is a real mistake that could happen and has happened, and it shows that we really need formal analysis and more eyes on these protocols.

The problem with the previous protocol was that it was based on a deterministic primitive, a VOPRF. So let me present a simple variant of the above protocol that is not susceptible to the same issue. The trick is basically to add another generator.
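Before describing the fix, here is the parallel-session attack on the naive two-key extension made concrete, on the same kind of toy group as before (parameters illustrative):

```python
# Toy demonstration of the parallel-session attack on the naive two-key
# variant: because issuance is deterministic, two sessions on the same
# blinded nonce collide exactly when the issuer used the same key/bit.
import secrets

p, q, g = 1019, 509, 4            # small schoolbook subgroup of Z_p^*

x0 = secrets.randbelow(q - 1) + 1         # key for bit 0
x1 = secrets.randbelow(q - 1) + 1         # key for bit 1
while x1 == x0:                           # make sure the keys differ
    x1 = secrets.randbelow(q - 1) + 1

def issue(T_blind: int, bit: int) -> int:
    """Deterministic issuance: W' = T'^{x_bit} -- no randomness at all."""
    return pow(T_blind, x1 if bit else x0, p)

# The adversary opens two parallel sessions with the *same* blinded nonce:
T_blind = pow(g, secrets.randbelow(q - 1) + 1, p)

same_bit = issue(T_blind, 1) == issue(T_blind, 1)   # collision: same key
diff_bit = issue(T_blind, 0) == issue(T_blind, 1)   # no collision: keys differ

assert same_bit and not diff_bit   # the responses reveal whether bits match
```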
Now, instead of having a single generator G, we have two generators, of which we don't know the relative discrete log. The public key is now given in these two bases. In the signing phase, we not only compute the CDH with the group element given by the user, but we also add another element, partially chosen by the issuer, and the signature is computed over this group element together with the one given by the user. Unblinding is performed as before, except that now we also apply the blinding factor to the element chosen by the issuer. As before, we must also add a zero-knowledge proof to show that the same key is being used across multiple sessions.

This protocol, despite being more complex than the previous one, has the property that the same trick of using two keys leads to a secure protocol. In fact, if we now use two different keys, a private metadata bit set to zero cannot be distinguished from a bit set to one, because via a sequence of hybrids we can find an s that would make the transcript consistent with the other bit. Actually, for privacy of the metadata bit there are small issues of malleability related to the verification oracle; but the protocol satisfies the notion of privacy of the metadata bit that I gave during this talk, and I invite you to check the paper for a stronger protocol that remains secure in the presence of a verification oracle. The protocol is also unforgeable, because we can embed a challenge in one of the two keys in a similar way as before, and make a guess on which bit the adversary will present a forgery for. And the protocol is unlinkable for the same reason as before: t' does not carry any information about the nonce, and the same key is used across sessions, provided the same hidden bit is used.

Now let me show another, completely different technique that we can use in the initial protocol to get rid of the zero-knowledge proof.
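Here is the core of this randomized two-key variant as a toy sketch. As before, the group is a schoolbook subgroup of Z_p^*, and the zero-knowledge proofs and the public-key commitments over the two generators are omitted entirely, so this captures only the token equation W = H(t)^{x_b} · S^{y_b}:

```python
# Toy sketch of the randomized two-key variant: the issuer's response now
# involves a fresh element S' of its own choice, so two sessions on the same
# nonce no longer collide.  Zero-knowledge proofs and the public keys over
# the two generators are omitted; parameters are illustrative.
import hashlib
import secrets

p, q, g = 1019, 509, 4

def hash_to_group(t: bytes) -> int:
    h = int.from_bytes(hashlib.sha256(t).digest(), "big")
    return pow(h % (p - 2) + 2, 2, p)

# One key pair (x_b, y_b) per value of the private metadata bit.
keys = {b: (secrets.randbelow(q - 1) + 1, secrets.randbelow(q - 1) + 1)
        for b in (0, 1)}

def sign(T_blind: int, bit: int):
    """Issuer: pick a fresh S', return (S', W' = T'^x_b * S'^y_b)."""
    xb, yb = keys[bit]
    S_blind = pow(g, secrets.randbelow(q - 1) + 1, p)
    return S_blind, (pow(T_blind, xb, p) * pow(S_blind, yb, p)) % p

def read_bit(t: bytes, S: int, W: int):
    """Redemption: the key pair that verifies reveals the bit (None = invalid)."""
    T = hash_to_group(t)
    for b, (xb, yb) in keys.items():
        if W == (pow(T, xb, p) * pow(S, yb, p)) % p:
            return b
    return None

# User: blind the nonce, get a signature for bit 1, unblind *both* components.
t = b"nonce-42"
r = secrets.randbelow(q - 1) + 1
S_blind, W_blind = sign(pow(hash_to_group(t), r, p), bit=1)
r_inv = pow(r, -1, q)
S, W = pow(S_blind, r_inv, p), pow(W_blind, r_inv, p)

x1, y1 = keys[1]
assert W == (pow(hash_to_group(t), x1, p) * pow(S, y1, p)) % p  # token equation
assert read_bit(t, S, W) is not None                            # token verifies
```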
In Privacy Pass, in addition to a multiplicative mask, we can also use an additive mask, shifting the blinded element by a group element of which we know the discrete log. The signing procedure stays unchanged. In the unblinding phase, we now have to shift back this group element; but because the issuer raised the element to the power x, we have to remove this quantity using the public element published by the server. Verification proceeds exactly as before, checking the CDH between the published element and the hash of the token. So the code on the server side, as you can see, is basically unchanged; actually, we are removing the zero-knowledge proof. On the client, on the other hand, there is a slight increase in computation.

The basic idea in this protocol is that if the key used for signing is different from the published one, we end up with a completely random, invalid token. So now the anonymity set is split between valid tokens and invalid tokens, and users can mitigate the risk of malicious issuers trying to trace them with invalid tokens by sending random group elements from time to time. This protocol is unlinkable provided we accept this partition of the anonymity set; unforgeability, on the other hand, is based on one-more Diffie-Hellman in the same way as before. Note that this trick is also compatible with the technique that I showed for adding a private metadata bit, so they can be combined, leading to a protocol that has private metadata and does not require zero-knowledge proofs. I invite you to check the paper for more details on how this protocol works.

Now, it takes more than nice security protocols based on solid cryptography to make something useful. So, now that we've seen a bit more formally how these protocols are constructed, let me take a couple of minutes to show what it means to bridge the gap between theory and practice. First of all, the security assumptions.
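Before that, the additive-mask trick just described, applied to plain Privacy Pass, looks like this in toy form (same illustrative subgroup as before; note that the issuer's code is literally one exponentiation, with no proof attached):

```python
# Toy sketch of Privacy Pass with an additive mask instead of a proof:
# the user shifts the blinded nonce by g^rho and later strips X^rho using
# the issuer's *public* parameter X = g^x.  If the issuer signed with a key
# different from the published X, unblinding yields a random, invalid token.
import hashlib
import secrets

p, q, g = 1019, 509, 4

def hash_to_group(t: bytes) -> int:
    h = int.from_bytes(hashlib.sha256(t).digest(), "big")
    return pow(h % (p - 2) + 2, 2, p)

x = secrets.randbelow(q - 1) + 1
X = pow(g, x, p)                          # published by the issuer

# User: multiplicative mask r AND additive mask rho.
t = b"nonce"
r = secrets.randbelow(q - 1) + 1
rho = secrets.randbelow(q)
T_blind = (pow(hash_to_group(t), r, p) * pow(g, rho, p)) % p   # H(t)^r * g^rho

# Issuer: code unchanged from plain Privacy Pass -- and no proof to produce.
W_blind = pow(T_blind, x, p)

# User: strip the additive mask with the public key, then the multiplicative one.
W = pow(W_blind * pow(X, -rho, p) % p, pow(r, -1, q), p)

# The token is valid exactly when the issuer really used the published X.
assert W == pow(hash_to_group(t), x, p)
```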
We managed to base all the security assumptions on one-more Diffie-Hellman. We do know of cryptanalytic attacks on these assumptions, namely the Brown-Gallant and Cheon attacks, which allow recovering the secret key with sub-exponential complexity. However, these attacks depend on small divisors of p plus or minus one, and for curves used in practice, for instance Curve25519, it is difficult to say whether this should be considered a real danger, because the attacks also use a lot of bandwidth.

Token hijacking: let's assume for a second that the client is sending out requests over HTTP, so unencrypted and unauthenticated. What can we do in this case? It turns out it's really inexpensive to prevent hijacking of tokens by a man in the middle: in fact, it's sufficient just to use a MAC. This was already shown by Goldberg and others, but in our paper we present a generic transformation that works on top of all the previous protocols that I mentioned. There are also engineering issues that need to be taken into account, like throttling: how many tokens can be issued at once, how often keys are rotated, and, more importantly, where all these public parameters are stored, because all users should have the same view of them.

On our side, we also provide an implementation based on the blazingly fast implementation of Curve25519 by Isis and Henry, using Ristretto on top to obtain a prime-order group. We published extensive benchmarks, and we are now working on a WebAssembly port that can be used to run demos in the browser. We are also exploring some other directions, for instance public metadata: in the real world you often have multiple data centers, each one following its own key rotation cycle, and it would be interesting to embed public metadata in order to track which data center issued which tokens. Public verifiability: a legitimate question is also whether the previous protocols can be made publicly verifiable.
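To go back to token hijacking for a moment, the MAC idea can be sketched as follows. The serialization and message encoding here are my own illustration, not the paper's generic transformation:

```python
# Sketch of the MAC idea against token hijacking: at redemption, instead of
# revealing (t, W) in the clear, the user sends t plus an HMAC over the
# request, keyed by W.  The serialization and message encoding here are my
# own illustration, not the paper's generic transformation.
import hashlib
import hmac

def redemption_tag(t: bytes, W: int, request: bytes) -> bytes:
    key = W.to_bytes(8, "big")            # serialize the group element
    return hmac.new(key, t + request, hashlib.sha256).digest()

# The server, knowing its secret key, recomputes W = H(t)^x on its own and
# checks the tag; a man in the middle who observes (t, tag) cannot replay
# the token on a different request without knowing W.
p = 1019
x, t, request = 77, b"nonce", b"GET /resource"
T = pow(int.from_bytes(hashlib.sha256(t).digest(), "big") % (p - 2) + 2, 2, p)
W = pow(T, x, p)

tag = redemption_tag(t, W, request)
assert hmac.compare_digest(tag, redemption_tag(t, W, request))
assert tag != redemption_tag(t, W, b"GET /other")  # tag is bound to the request
```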
For instance, one entity could verify the token and another one could extract the bit from it. Even just looking at blind signatures: Privacy Pass, for instance, can be seen as a blind BLS signature, so a legitimate question is whether we can publish the group element in G2 and then verify the token as a BLS signature. However, a proof of this still needs to be given. Also, in the case of the private metadata bit protocol, one might think of transforming it into a blind Okamoto-Schnorr signature. However, as we showed in a more recent paper, with the invaluable help of Fabrice Benhamouda, blind Okamoto-Schnorr signatures shouldn't really be used in practice. Finally, batching proofs: whenever we issue many tokens at once, it's natural to ask whether we can batch the proofs in order to have more efficient protocols. We know that there are generic techniques for batching sigma protocols together, but the problem arises when we issue tokens with different hidden bits. These are problems that can be isolated and treated separately.

I'd like to close with a word of hope. I think it's about time that we start deploying anonymous credentials for handling finite resources and access control in a privacy-respecting way. I'd be very curious to see if people have their own use cases for them, and I invite you all to check the standardization documents and help out. Thank you.