Hi, I'm Tianhao. I'm a PhD student at Princeton. Today I will talk about our work on the concurrent composition of differential privacy. It's joint work with Professor Salil Vadhan. I will first introduce the background of the paper and our main contributions, then I will talk about some basic definitions and properties. Then I will go through some of the proofs of our main results, and finally I will discuss the limitations and possible future work.
So differential privacy is the dominant privacy notion nowadays. The main idea of differential privacy is to carefully randomize an algorithm so that its output does not depend too much on any single individual in the dataset. That is, for a differentially private mechanism, the probability distribution of the mechanism's output on a dataset should be nearly identical to the distribution of its output on the same dataset with any single individual's data replaced. To formalize this, we call two datasets adjacent if they differ in at most one row. A mechanism, or a function, is (epsilon, delta)-differentially private if for any adjacent datasets x and x prime, and for every possible output event T, the probability that we see T when the dataset is x is at most e to the epsilon times the probability that we see T when the dataset is x prime, plus delta. The probability space is over the coin flips of the algorithm. The epsilon is typically referred to as the privacy loss or privacy budget. It could be a small constant, but not negligible. The delta is also incorporated since it turns out that it's quite useful to have one, which we will see very soon. The delta can be interpreted as an upper bound on the probability of catastrophic failure, for example, the entire dataset being published. So it is sometimes called a security parameter. In the literature, when delta is zero, the notion is often referred to as pure differential privacy, and when delta is greater than zero, the notion is called approximate differential privacy.
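As a concrete illustration of the definition above, here is a minimal Python sketch for mechanisms with finitely many outputs. It assumes the two output distributions are given explicitly as dictionaries; the function names and the toy distributions are hypothetical and only for illustration. For discrete outputs, the worst-case event T is the set of outcomes where the first probability exceeds e to the epsilon times the second, so the smallest workable delta is a simple sum.

```python
import math

def smallest_delta(p, q, eps):
    """Smallest delta with P(T) <= e^eps * Q(T) + delta for every event T.

    p and q map each outcome to its probability (finite output space assumed).
    """
    outcomes = set(p) | set(q)
    return sum(
        max(0.0, p.get(t, 0.0) - math.exp(eps) * q.get(t, 0.0))
        for t in outcomes
    )

def is_dp_pair(p, q, eps, delta, tol=1e-12):
    """Check the (eps, delta) inequality in both directions for one adjacent pair."""
    return (smallest_delta(p, q, eps) <= delta + tol
            and smallest_delta(q, p, eps) <= delta + tol)

# Hypothetical toy mechanism: output distributions on datasets x and x prime.
out_x = {0: 0.75, 1: 0.25}
out_x_prime = {0: 0.25, 1: 0.75}

print(is_dp_pair(out_x, out_x_prime, math.log(3), 0.0))   # pure DP with eps = ln 3
print(is_dp_pair(out_x, out_x_prime, math.log(2), 0.0))   # eps = ln 2 needs delta > 0
print(is_dp_pair(out_x, out_x_prime, math.log(2), 0.25))  # approximate DP
```

A full differential privacy check would repeat this for every pair of adjacent datasets; the sketch covers a single pair.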
One of the central properties of differential privacy, which we will also use in our proofs later, is that it is preserved under post-processing. That is, a data analyst without additional knowledge about the private database cannot compute a function of the output of a private algorithm M and make it less differentially private. In other words, differential privacy is robust against further processing of a mechanism's output.
Another useful property is that differential privacy provides protection for small groups of individuals. In general, (epsilon, delta)-differential privacy is designed to protect the privacy between neighboring databases, which differ in only one row. This means that no adversary, even with arbitrary auxiliary information, can reliably tell whether one particular participant is in the dataset or not. However, this also extends to the case where we want to protect databases differing in more than one row, for example, a family. A very simple hybrid argument shows that for a pair of datasets that differ in k rows, there are still privacy guarantees for the two output distributions, with some degradation in the epsilon and delta parameters.
One of the most important properties of differential privacy is that it permits the analysis of cumulative privacy loss under the composition of multiple mechanisms. If we run multiple distinct differentially private algorithms on the same dataset, the resulting composed algorithm is also differentially private, with some degradation in the privacy parameters epsilon and delta. This property is especially important and useful since in practice we rarely want to release only a single statistic about the dataset. Releasing many statistics may require running multiple, different kinds of differentially private algorithms on the same database. Composition is also a very useful tool in algorithm design. In many cases, new differentially private algorithms are created by combining several simpler algorithms.
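To make the degradation of parameters concrete, here is a small Python sketch of two standard bounds, written under the assumption that the hybrid argument is applied one row at a time. The function names are hypothetical. `group_privacy` unrolls the hybrid argument across k rows, and `basic_composition` simply adds up the parameters of the composed mechanisms.

```python
import math

def group_privacy(eps, delta, k):
    """Guarantee for datasets differing in k rows, given (eps, delta)-DP
    for adjacent datasets.

    Unrolling the hybrid argument k times gives
    P_0(T) <= e^(k*eps) * P_k(T) + delta * (1 + e^eps + ... + e^((k-1)*eps)).
    """
    delta_k = delta * sum(math.exp(i * eps) for i in range(k))
    return k * eps, delta_k

def basic_composition(params):
    """Basic composition: the epsilons and the deltas each add up.

    params is a list of (eps_i, delta_i) pairs, one per mechanism.
    """
    return sum(e for e, _ in params), sum(d for _, d in params)

# Example: a family of 4 rows under a (0.5, 1e-6)-DP mechanism.
print(group_privacy(0.5, 1e-6, 4))
# Example: three mechanisms run on the same dataset.
print(basic_composition([(0.1, 1e-6), (0.2, 1e-6), (0.3, 0.0)]))
```

Note that the delta term in the group privacy bound grows faster than linearly in k, while under basic composition it grows only linearly.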
The composition theorems can help us analyze the privacy of algorithms designed in this way. Many composition theorems already exist in the literature. The basic composition theorem says that the privacy degrades at most linearly with the number of mechanisms executed. However, if we are willing to tolerate an increase in the delta term, the privacy parameter epsilon only needs to degrade proportionally to the square root of the number of mechanisms executed. This advanced composition theorem is still not exact. It has also been shown how to compute the optimal bound for composing k mechanisms, which we will go over later in this talk. All of these composition theorems apply to arbitrary differentially private mechanisms. If we know specific properties of the underlying DP algorithms, we can further improve the composition bound. One example is the moments accountant, which is used to compute the privacy loss of differentially private stochastic gradient descent.
All of the composition theorems discussed above implicitly assume that the underlying DP mechanisms are one-shot algorithms that output only one answer. However, many useful differential privacy primitives, such as the sparse vector technique, are actually interactive mechanisms, which allow one to ask an adaptive sequence of queries about the dataset. For instance, the sparse vector technique can potentially accept an unbounded number of queries while paying a privacy cost only for queries whose answers are above a noisy threshold. And it's very useful. For example, SVT is used in adaptive data analysis to prevent overfitting. And to deploy DP in practice, people might also want to support interactive queries. A natural question regarding interactive DP mechanisms is whether differential privacy is also preserved under composition. However, there could be more than one composition operation for interactive mechanisms.
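As a brief numerical aside on the non-interactive bounds quoted above, the following Python sketch compares basic composition with the standard advanced composition theorem for k copies of an (epsilon, delta)-DP mechanism. The function names and the parameter `delta_slack` (the extra delta we are willing to tolerate) are hypothetical labels chosen for this sketch.

```python
import math

def basic_bound(eps, delta, k):
    """Basic composition of k mechanisms, each (eps, delta)-DP."""
    return k * eps, k * delta

def advanced_bound(eps, delta, k, delta_slack):
    """Advanced composition of k mechanisms, each (eps, delta)-DP.

    The epsilon term scales like sqrt(k) for small eps, at the price of
    an additional delta_slack in the delta term.
    """
    eps_total = (math.sqrt(2 * k * math.log(1 / delta_slack)) * eps
                 + k * eps * (math.exp(eps) - 1))
    return eps_total, k * delta + delta_slack

# Many small-epsilon mechanisms: the square-root scaling pays off.
print(basic_bound(0.01, 0.0, 10_000))
print(advanced_bound(0.01, 0.0, 10_000, 1e-6))
```

For a small number of mechanisms the advanced bound can be worse than the basic one, so in practice one would take the smaller of the two epsilon values.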
The most straightforward one is sequential composition, where the adversary's interactive session with one mechanism must be halted before it starts an interactive session with another mechanism. In this case, all of the composition theorems we discussed earlier still apply. However, a more realistic threat is that a single adversary could interact with multiple mechanisms concurrently, where different threads of interactive sessions can be arbitrarily interleaved with each other. That is, although the responses output by different mechanisms are generated independently, the adversary may coordinate the actions it takes in the various interactive sessions. In particular, its actions in one session may depend on messages it received in other sessions. Unfortunately, none of the existing composition theorems for non-interactive DP can be applied directly to the setting of concurrent composition. To the best of our knowledge, this work is the first to tackle the problem of concurrent composition of interactive differential privacy.
Here are our main results. We derive a bound that is similar to group privacy. For the concurrent composition of pure interactive DP, the privacy degrades at most linearly with the number of mechanisms concurrently executed, which is the same as for non-interactive DP. However, for the concurrent composition of approximate interactive DP mechanisms, the bound is worse in the delta term than even the basic composition theorem for non-interactive DP. We then characterize a pure interactive DP mechanism as a post-processing of randomized response, a non-interactive mechanism, and obtain the optimal privacy bound for the composition of pure interactive DP by taking the optimal composition of non-interactive differential privacy, since we know that DP is closed under post-processing. Unfortunately, we still don't know much about the concurrent composition of approximate interactive differential privacy.
We believe the bound we got is far from the correct answer. Based on computer simulation, we conjecture that the optimal composition bound may also hold for approximate DP.
Okay, now we talk about some formalization and basic properties of concurrent DP composition. We first formalize the notion of an interactive protocol. An interactive protocol is between two parties, A and B. We model each party as a potentially randomized function that maps its private input, its random coins, and all messages it has received so far to the next message to be sent out. We further define the view of a party in an interactive protocol, which is essentially everything the party sees, including the party's randomness, its private input, and the messages it received. In our case, A is the adversary, and B is the mechanism, whose input is usually a database. Since the adversary does not have an input in our case and we will only be interested in the adversary's view, we will drop the superscript. Now we are ready to formally define interactive differential privacy as a type of interactive protocol between an adversary and an interactive mechanism. Specifically, an interactive mechanism is (epsilon, delta)-differentially private if for every pair of adjacent datasets x and x prime, for any adversary algorithm A, and for any possible event T of the view of the adversary, the probability that we see T when the dataset is x is at most e to the epsilon times the probability that we see T when the dataset is x prime, plus delta. We use this special notation to denote the concurrent composition of mechanisms M_0 to M_{k-1}.
The main difficulty of analyzing the privacy loss of concurrent composition is the following. For sequential composition or non-interactive composition, the output messages from previous mechanisms are simply auxiliary knowledge of the adversary, and each mechanism is interacting with an adversary of a fixed strategy, so the DP guarantee applies.
However, for concurrent composition, since the adversary will receive messages from other mechanisms during the interactive session, the adversary's strategy is not fixed and can change during the interactive session. So the mechanism is essentially interacting with a different adversary for every query it receives, and the DP guarantee may not directly apply in this case.
For the convenience of our proofs, we introduce a variant of concurrent composition of interactive protocols, the ordered concurrent composition, which only accepts queries in the exact order of the k mechanisms M_0, M_1, up to M_{k-1}. That is, the first query is sent to M_0, the second is sent to M_1, the third is sent to M_2, and so on, and after M_{k-1}, the next query goes to M_0 again. We also introduce a special kind of interactive mechanism, the null-query extension. The null-query extension of a mechanism has exactly the same output distribution as the original mechanism, except that it also accepts and ignores null queries, or we can call them dummy queries. One will not get any answer from those queries. Based on these two concepts, we derive a lemma which says that in order to prove that the concurrent composition of k mechanisms is (epsilon, delta)-DP, it's enough to show that the ordered concurrent composition of the null-query extensions of these mechanisms is (epsilon, delta)-DP. The intuition is that if the first query is sent to M_i and the second query is sent to M_j, then we fill in dummy queries and send them to the mechanisms between M_i and M_j. Given this lemma, for all of the concurrent composition proofs in our paper, we assume without loss of generality that the concurrent composition is ordered. That is, the l-th query is sent to mechanism l mod k. In particular, if an adversary A is concurrently interacting with two mechanisms, we assume that the queries alternate between them.
Okay, now we try to analyze the concurrent composition of pure interactive DP mechanisms. We start with the simplest case of composing two mechanisms.
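The reduction to ordered composition described above can be sketched in a few lines of Python. This is only an illustration of the query schedule, not of the mechanisms themselves: the names `NULL_QUERY` and `pad_to_round_robin` are hypothetical, and queries are assumed to arrive as (mechanism index, query) pairs. The function fills in null queries so that the query at position l always targets mechanism l mod k.

```python
NULL_QUERY = None  # hypothetical marker for a null (dummy) query

def pad_to_round_robin(queries, k):
    """Pad an arbitrary interleaving so that position l targets mechanism l % k.

    queries: list of (mechanism_index, query) pairs in the adversary's order.
    Returns a new list of (mechanism_index, query) pairs that includes the
    null queries sent to the skipped mechanisms.
    """
    padded = []
    for target, query in queries:
        if not 0 <= target < k:
            raise ValueError(f"mechanism index {target} out of range for k={k}")
        # Send null queries until it is the target mechanism's turn.
        while len(padded) % k != target:
            padded.append((len(padded) % k, NULL_QUERY))
        padded.append((target, query))
    return padded

# The adversary queries M_1 twice, then M_0, among k = 3 mechanisms.
schedule = pad_to_round_robin([(1, "a"), (1, "b"), (0, "c")], k=3)
for position, (mechanism, query) in enumerate(schedule):
    print(position, mechanism, query)
```

Dropping the null queries from the padded schedule recovers the adversary's original sequence, which is why the null-query extensions do not change what the adversary learns.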
Given an interactive protocol where the adversary is concurrently interacting with two mechanisms M and M tilde, the idea is that the interaction with one mechanism can be viewed as the adversary and the other mechanism, combined, interacting with that mechanism. And the differential privacy guarantee still holds for this combined adversary. We call this combined adversary A star. Specifically, we can see that A star is a well-defined adversary strategy throughout the entire interactive session with M, since the randomness of A star is fixed: it is the randomness of A together with that of M tilde. And the function for computing the next query is also naturally defined. Therefore, the differential privacy guarantee applies to the view of this combined adversary A star. We also observe that, given a transcript of A star's view, we can recover the view of A through post-processing, which is also quite straightforward. Every query sent to M tilde is computed by the adversary A, and then we can compute the answers from M tilde. Because of this post-processing algorithm, for any event T, the probability that A's view is in T is exactly equal to the probability that A star's view is in the preimage of T under the post-processing algorithm. And also, as we said earlier, since A star is well-defined, the random variable of A star's view enjoys the differential privacy guarantee.
To state the final conclusion, we say that two random variables X and X prime are (epsilon, delta)-indistinguishable if for every event T, the probability that we see T for random variable X is at most e to the epsilon times the probability that we see T for random variable X prime, plus delta, and vice versa. A very simple proof shows that if X and X prime are (epsilon, 0)-indistinguishable, and X prime and X double prime are (epsilon tilde, 0)-indistinguishable, then X and X double prime are (epsilon plus epsilon tilde, 0)-indistinguishable.
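The transitivity property above can be checked numerically on small examples. The following Python sketch assumes finite distributions given as dictionaries; the function names and the three toy distributions are hypothetical. For discrete random variables, the smallest epsilon in the (epsilon, 0) case is the largest absolute log-ratio of probabilities over single outcomes.

```python
import math

def max_divergence(p, q):
    """Smallest eps with P(T) <= e^eps * Q(T) for all events T (inf if impossible)."""
    worst = 0.0
    for outcome, p_t in p.items():
        if p_t == 0.0:
            continue
        q_t = q.get(outcome, 0.0)
        if q_t == 0.0:
            return math.inf
        worst = max(worst, math.log(p_t / q_t))
    return worst

def pure_indistinguishability(p, q):
    """Smallest eps such that p and q are (eps, 0)-indistinguishable."""
    return max(max_divergence(p, q), max_divergence(q, p))

# Hypothetical toy distributions playing the roles of X, X prime, X double prime.
x0 = {0: 0.5, 1: 0.5}
x1 = {0: 0.6, 1: 0.4}
x2 = {0: 0.7, 1: 0.3}

eps_01 = pure_indistinguishability(x0, x1)
eps_12 = pure_indistinguishability(x1, x2)
eps_02 = pure_indistinguishability(x0, x2)
print(eps_02 <= eps_01 + eps_12 + 1e-12)  # transitivity: the epsilons add up
```

The inequality follows by chaining the two guarantees: the probability under X is at most e to the epsilon times the probability under X prime, which in turn is at most e to the epsilon tilde times the probability under X double prime.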
Suppose mechanism M is (epsilon, 0)-interactive DP and M tilde is (epsilon tilde, 0)-interactive DP. Because of the privacy guarantee from the definition of interactive differential privacy, we know that the views of A star against M(x) versus M(x prime) are (epsilon, 0)-indistinguishable. And because we can recover the view of A through post-processing, and DP is preserved under post-processing, we know that A's views are (epsilon, 0)-indistinguishable between concurrently interacting with M(x) and M tilde(x), and concurrently interacting with M(x prime) and M tilde(x). Symmetrically, we can obtain a similar indistinguishability result if we treat A and mechanism M as a combined adversary interacting with M tilde. Finally, we are able to bound the privacy loss of the adversary's view based on the simple transitivity property of (epsilon, 0)-indistinguishability. And we can extend the result to the case of composing more than two mechanisms by induction. This result tells us that under concurrent composition, the basic composition theorem for pure DP still holds. That is, for pure differential privacy, the privacy parameter of the resulting composed mechanism is the sum of the parameters of the individual mechanisms.
And now we turn to the concurrent composition of approximate interactive differential privacy. Here it is natural to ask whether we can bound the privacy loss using a proof technique similar to what we did for pure DP. It turns out that we can, but the resulting privacy parameters are not the same as for non-interactive DP in this case. Suppose each mechanism M_i is (epsilon_i, delta_i)-differentially private. For a particular mechanism M_i, by viewing the adversary and the other k-1 mechanisms as a combined adversary, we can obtain the following privacy guarantee by an argument very similar to the one in the proof for pure DP.
By conducting the above thought experiment for every mechanism M_i, we can bound the privacy loss through a simple hybrid argument, and we eventually obtain a concurrent composition theorem for approximate differential privacy. However, we can see that in this case, the delta term for the concurrent composition does not simply add up as it does for non-interactive DP. Moreover, it depends on the permutation order of M_0 to M_{k-1}. Therefore, we give an upper bound for the delta term which is potentially easier to work with. Also, we notice that when each M_i has the same parameters epsilon and delta, the bound is the same as group privacy, so we refer to this privacy bound as the group-privacy-like bound.
Here's another contribution of this project, where we try to make a reduction from interactive DP mechanisms to non-interactive DP mechanisms. Randomized response is a very simple non-interactive differentially private mechanism which takes a binary bit as input and outputs a randomized version of the input bit. This is the (epsilon, delta) variant of randomized response, where the mechanism simply reveals the true input bit with probability delta, and we can easily verify that this mechanism is (epsilon, delta)-differentially private. A key step in proving the optimal composition theorem for non-interactive DP algorithms is that (epsilon, delta) randomized response can simulate any non-interactive (epsilon, delta)-DP mechanism through post-processing. Since we know that differential privacy is preserved under post-processing, it follows that to analyze the composition of arbitrary non-interactive DP algorithms, it suffices to analyze the composition of randomized response. If we are able to prove a similar result for interactive differential privacy, then we will be able to extend all of the composition theorems for non-interactive mechanisms to interactive mechanisms.
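The (epsilon, delta) randomized response mentioned above can be sketched as follows. This assumes the form commonly used in the optimal-composition literature: with probability delta the input bit is revealed outright, and otherwise a noisy bit is released. The exact output encoding (the "reveal" and "noisy" labels) and the function names are assumptions made for this sketch.

```python
import math

def rr_distribution(bit, eps, delta):
    """Output distribution of (eps, delta) randomized response on input bit.

    With probability delta the true bit is revealed outright; otherwise the
    bit is kept or flipped with odds e^eps to 1.
    """
    keep = (1 - delta) * math.exp(eps) / (1 + math.exp(eps))
    flip = (1 - delta) / (1 + math.exp(eps))
    return {
        ("reveal", bit): delta,
        ("noisy", bit): keep,
        ("noisy", 1 - bit): flip,
    }

def smallest_delta(p, q, eps):
    """Smallest delta with P(T) <= e^eps * Q(T) + delta for every event T."""
    outcomes = set(p) | set(q)
    return sum(
        max(0.0, p.get(t, 0.0) - math.exp(eps) * q.get(t, 0.0))
        for t in outcomes
    )

eps, delta = 0.5, 0.1
on_zero = rr_distribution(0, eps, delta)
on_one = rr_distribution(1, eps, delta)

# The mechanism meets the (eps, delta) guarantee with no slack in delta.
print(abs(smallest_delta(on_zero, on_one, eps) - delta) < 1e-9)
print(abs(smallest_delta(on_one, on_zero, eps) - delta) < 1e-9)
```

The guarantee is tight in both parameters, which is consistent with randomized response acting as a worst-case mechanism that other (epsilon, delta)-DP mechanisms can be simulated from.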
So in this work, we find that every interactive (epsilon, 0)-DP mechanism can be simulated as a post-processing of randomized response, where the post-processing function itself is interactive. That is, for every interactive mechanism M that is (epsilon, 0)-differentially private and every pair of neighboring datasets x and x prime, we can construct an interactive post-processing mechanism T so that the view of the adversary interacting with M has exactly the same distribution as the view of the adversary interacting with T, given a random bit from (epsilon, 0) randomized response. The proof idea is to set the two distributions to be equal and then solve for T. It's mainly technical and quite involved, so I will not go into too much detail on that proof. Therefore, to analyze the privacy loss of composing arbitrary pure interactive DP algorithms, we can instead analyze the privacy loss of composing randomized responses with the same privacy parameters, since differential privacy is preserved under any post-processing. Therefore, the final bound on the privacy parameters for the concurrent composition of pure interactive DP is the same as the optimal composition bound for non-interactive DP mechanisms.
The major limitation of this work is that we still don't know much about the concurrent composition of approximate interactive DP, so we ran a few experiments to try to get some insights. We evaluate whether a two-round mechanism with one-bit messages can be simulated as an interactive post-processing of randomized response. We note that the concurrent composition of two copies of such a mechanism already has non-trivial interleaving.
In all of our trials, we find a feasible interactive post-processing mechanism. Therefore, we conjecture that the concurrent composition of approximate interactive DP mechanisms may still have the same bound as the composition of non-interactive DP mechanisms, and we may still be able to prove it through a similar construction of interactive post-processing mechanisms. This would mean that every interactive DP mechanism can be reduced to non-interactive randomized response. Okay, thank you for listening.