Hi everyone, thank you for tuning in today. This is joint work with Badih Ghazi, Rasmus Pagh, and Ameya Velingker. We study secure computation in the so-called anonymous model, which was proposed by Ishai et al. in 2006. In this model, there are n users and an analyzer (or server). Each user i receives an input xi. Using this input and possibly his or her own private randomness, each user produces potentially multiple messages; we use m to denote the number of messages produced per user. These messages are then sent to the analyzer, who would like to compute some function f on the inputs x1 to xn. The distinguishing feature of this model is that the messages sent to the analyzer are anonymous: the analyzer sees only the message contents, not the senders' identities. Equivalently, we can think of a shuffler in the middle that permutes the messages in random order before revealing them to the analyzer, which leads some to call this the shuffle model, especially in the differential privacy literature. As usual in secure computation, there are two main properties we need from the protocol. First is correctness: we want the analyzer's output to equal f of x1 to xn. Second is security. Here we use an information-theoretic notion of security, which states that for any two input sequences x1 to xn and x1' to xn' such that f of x1 to xn equals f of x1' to xn', the distributions of the analyzer's view must have total variation distance at most 2 to the minus sigma, where sigma is our security parameter. In this work, we focus on the task of aggregation, in which each of the inputs x1 to xn comes from the field F_q, and the goal is just to compute the sum x1 plus x2 and so on up to xn. This task was also studied in the original paper of Ishai et al.
For the purpose of this talk, we state prior results and our results in terms of the number of messages per user, which recall we denote by m. In Ishai et al.'s paper, they provide a protocol where m equals sigma plus log q, where sigma is again the security parameter and q is the field size. Here we give an improved analysis, which saves roughly a factor of log n in the number of messages per user: our m is only (sigma plus log q) over log n. This result was independently and concurrently obtained by Balle et al. using different methods. Furthermore, we also give a nearly matching lower bound for the task of aggregation. In particular, when the security parameter sigma is at most polynomial in n, our lower bound matches the upper bound up to a constant factor. Thanks to a reduction of Balle et al., the improved analysis of the algorithm also translates to an improvement in the setting of differentially private summation. In this setting, each user has a real number between 0 and 1, and the goal is to sum them up. As a corollary, we get an (epsilon, delta)-differentially private algorithm that incurs an error of order 1 over epsilon, which is nearly optimal, with communication of order log of (n over delta) bits per user. Since this is basically a direct corollary of that reduction, we will not go into much detail here. Instead, for the rest of this talk, we will focus on the proof outlines of our improved analysis and our lower bound. We start with our improvement on the algorithmic front. Interestingly, all the algorithmic results we mentioned use the same protocol of Ishai et al., sometimes called the split-and-mix protocol. In this protocol, when a user receives an input x, which is just an element of F_q, he or she randomly selects m minus 1 random elements of F_q and sends these as the first m minus 1 messages. Then the user lets the last message be the input minus the sum of all the previously generated messages.
The analyzer is extremely simple here: just sum up all the messages over F_q. Clearly, the sum of each user's messages is equal to his or her input, so correctness is obvious. The bulk of the work is in proving the security of this protocol. Ishai et al. were the first to prove such security, and they show that the protocol is sigma-secure when we take m to be roughly sigma plus log q. Here, we show that we can in fact take m smaller by roughly a factor of log n: it suffices to take m to be (sigma plus log q) over log n. What does it mean to prove such a statement? Let's follow the definition of security. We have to consider any pair of inputs x1 to xn and x1' to xn' whose sums are equal; denote this common sum by a. Let S_x be the distribution of the messages after shuffling on input x1 to xn. Our goal is to show that the total variation distance between S_x and S_x' is small. Here it's not too important what the actual bound is, and in fact we will not go into any detailed calculation. To make our lives a little easier, let us also consider the uniform distribution on nm elements of F_q whose sum is equal to a, and let us denote this distribution by S_a. It suffices to show that the total variation distance between S_x and S_a is small, because by symmetry the total variation distance between S_x' and S_a is then also small, and by combining the two via the triangle inequality, we get the desired bound on the distance between S_x and S_x'. So we would like to show that the total variation distance between S_x and S_a is small. How do we do this? Consider any potential sequence of messages after shuffling, which we denote by t1 and so on up to t of n times m, whose sum is equal to a. On the one hand, by definition, S_a is just the uniform distribution over such t's, so the probability mass of t in S_a is just a constant: 1 over q to the power (n times m minus 1).
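Before diving into the security proof, here is a minimal Python sketch of the split-and-mix protocol just described. The function names (randomizer, shuffle_and_analyze) are my own, not from the paper, and this is a simulation for intuition, not the paper's implementation:

```python
import random

def randomizer(x, m, q):
    """Split-and-mix randomizer: the first m - 1 messages are uniformly
    random elements of F_q, and the last message is chosen so that all
    m messages sum to the input x mod q."""
    shares = [random.randrange(q) for _ in range(m - 1)]
    shares.append((x - sum(shares)) % q)
    return shares

def shuffle_and_analyze(inputs, m, q):
    """Collect every user's messages, shuffle them anonymously,
    and let the analyzer sum everything over F_q."""
    messages = [s for x in inputs for s in randomizer(x, m, q)]
    random.shuffle(messages)          # the anonymizing shuffler
    return sum(messages) % q          # the analyzer just adds everything up

q, m = 101, 3
inputs = [5, 17, 99, 42]
assert shuffle_and_analyze(inputs, m, q) == sum(inputs) % q
```

Correctness holds for any shuffle, since addition is commutative; the entire difficulty, as discussed next, is showing that the shuffled messages reveal nothing beyond the sum.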
So we only have to figure out the probability mass of t in S_x. If we can figure this out, then we can compute the total variation distance as desired. I copy some useful information from the previous slide here. Now, to determine the probability mass of t in S_x, recall that the messages are generated per user and then passed through a random permutation before being shown to the analyzer. It turns out that this means the probability mass of t in S_x is just proportional to the number of permutations that could result in t, or that are compatible with t. What does this mean? We say that a permutation pi could result in t, or is compatible with t, if after we apply pi to t, the first m elements sum to x1, the next m elements sum to x2, and so on, with the last m elements summing to xn. For notational convenience, we let Y_{t,pi} denote the indicator variable of this event, which just means that Y_{t,pi} is 1 if all these equalities are satisfied and 0 otherwise. Under this notation, the number of permutations pi that could result in t can be written as the sum of Y_{t,pi} over all permutations pi. Now recall our original goal: we would like to bound the total variation distance between S_x and S_a, where S_a is just the uniform distribution over all such t's. What this means is that what we really want to show is that the probability mass of t in S_x is well concentrated around some value. To prove the desired concentration bound, we will use Chebyshev's inequality, where we think of the probability mass of t in S_x as a random variable when t is drawn from S_a. To apply Chebyshev's inequality, there are two things we have to do. First, we have to compute the expectation of this random variable.
And second, we have to bound the variance of the random variable. The two together give us the desired concentration result. We will not state explicitly what these bounds are, but we will now outline how to obtain them. Okay, so recall that we have to compute the expectation and bound the variance; let us start with the expectation. As a first step, note that this expectation is just proportional to the sum over all permutations pi of the expectation over t of Y_{t,pi}. So it suffices to compute the inner expectation, the expectation over t of Y_{t,pi}. To compute this expectation, it is useful to take a linear-algebraic view of the problem. To do so, let us define the matrix A_pi to be the n times nm Boolean matrix whose (i, j) entry is 1 if and only if the j-th message comes from user i. For example, if n equals 2 (two users), m equals 3 (three messages per user), and pi is just the identity, then A_pi will be this 2 times 6 matrix. Observe that by the definitions of A_pi and Y_{t,pi}, we have that Y_{t,pi} is 1 if and only if A_pi times t is the all-zero vector. What this means is that the expectation we would like to compute, the expectation over t of Y_{t,pi}, is just proportional to the number of solutions to the equation A_pi times t equals 0. But the number of solutions to this equation is just q to the power (mn minus the rank of A_pi). And it's also pretty simple to see that this matrix is full rank, so its rank is exactly n. So we can compute this expectation exactly, and this concludes the computation of the expectation. Now we move on to bounding the variance of the random variable. To do so, it suffices for us to compute the expectation of the square of the random variable.
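To make this linear-algebraic view concrete, here is a small Python sketch (my own illustration, with hypothetical helper names) that builds A_pi for the two-users, three-messages example and verifies the full-rank claim over F_q by Gaussian elimination:

```python
import random

def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over the prime field F_p,
    computed by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank, ncols = 0, len(rows[0])
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)   # modular inverse via Fermat
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                c = rows[r][col]
                rows[r] = [(a - c * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def A(pi, n, m):
    """n x nm Boolean matrix: (i, j) entry is 1 iff, under the
    permutation pi, message j is attributed to user i."""
    mat = [[0] * (n * m) for _ in range(n)]
    for j in range(n * m):
        mat[pi[j] // m][j] = 1
    return mat

n, m, q = 2, 3, 5
identity = list(range(n * m))
assert A(identity, n, m) == [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
pi = identity[:]
random.shuffle(pi)
assert rank_mod_p(A(pi, n, m), q) == n   # A_pi is full rank for every pi
```

The full-rank claim is easy to see from the code: the rows of A_pi have disjoint supports (each message belongs to exactly one user), so no nontrivial combination of rows vanishes.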
And using the equation on the top, we can once again see that this expectation of the square of the random variable is proportional to the sum over all pairs of permutations pi and pi' of the expectation over t of Y_{t,pi} times Y_{t,pi'}. So again, it suffices for us to compute just this inner expectation, the expectation over t of Y_{t,pi} times Y_{t,pi'}. Again, similar to before, we take a linear-algebraic approach. Specifically, we define the matrix B_{pi,pi'} to be the matrix obtained by stacking the two matrices A_pi and A_pi', where these matrices were defined on the previous slide. For example, when n equals 2, m equals 3, pi is the identity, and pi' is the reversal permutation, then B_{pi,pi'} is just A_pi stacked on top of A_pi', and it looks something like this, a 4 by 6 matrix. Now observe that for the product of Y_{t,pi} and Y_{t,pi'} to be 1, both of them have to be 1. This means that A_pi times t must be the all-zero vector and A_pi' times t must be the all-zero vector, so B_{pi,pi'} times t must also be the all-zero vector. What this means is that the expectation of the product of Y_{t,pi} and Y_{t,pi'}, which is what we want to compute, is just proportional to the number of solutions to the equation B_{pi,pi'} times t equals the all-zero vector. Once again, the number of solutions here is just q to the power (mn minus the rank of B). However, unlike the previous case, where the matrix A was always full rank, the matrix B here isn't necessarily full rank; in fact, its rank can take many possible values. In order to deal with this in our proof, we give a combinatorial interpretation of the rank of B, and using this combinatorial characterization, we can bound the rank of B for random permutations pi and pi'.
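Continuing the toy example, the sketch below (again my own illustration, not the paper's code) stacks A_pi and A_pi' into B and shows that the rank of B over F_q indeed varies with the pair of permutations. Note it can never exceed 2n minus 1, since the rows of each half sum to the all-ones vector:

```python
import random

def rank_mod_p(rows, p):
    """Rank over the prime field F_p via Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank, ncols = 0, len(rows[0])
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                c = rows[r][col]
                rows[r] = [(a - c * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def A(pi, n, m):
    """n x nm matrix: (i, j) entry is 1 iff message j belongs to user i under pi."""
    mat = [[0] * (n * m) for _ in range(n)]
    for j in range(n * m):
        mat[pi[j] // m][j] = 1
    return mat

def B(pi1, pi2, n, m):
    """Stack A_pi1 on top of A_pi2 (a 2n x nm matrix)."""
    return A(pi1, n, m) + A(pi2, n, m)

n, m, q = 2, 3, 5
identity = list(range(n * m))
reversal = list(reversed(identity))
# Identity vs. reversal: the two halves contain the same rows, so rank is only n.
assert rank_mod_p(B(identity, reversal, n, m), q) == n
# For random pairs the rank varies, but is always at most 2n - 1.
for _ in range(200):
    pi1, pi2 = identity[:], identity[:]
    random.shuffle(pi1)
    random.shuffle(pi2)
    assert rank_mod_p(B(pi1, pi2, n, m), q) <= 2 * n - 1
```

This variability of the rank is exactly why the variance computation needs the combinatorial characterization mentioned above.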
We will not have time to go into more detail here, but this bound eventually implies the bound on the variance of the random variable, which is the last ingredient we need in order to apply Chebyshev's inequality. And this concludes the proof outline for our algorithm. Next, we proceed to the second part of the talk, which is our lower bound. Our lower bound holds against any protocol; it doesn't have to be the split-and-mix protocol. It says that to achieve a security parameter sigma, we must use at least sigma over log(n times sigma) plus log q over log n messages. As you might have already suspected, this lower bound in fact consists of two parts. The first part is what we call the field-dependent lower bound: this is the part that says that the number of messages must be at least log q over log n. This holds even for very weak security, even when the security parameter is just one. The second part is what we call the security-dependent lower bound: this is the part that says that m has to be at least sigma over log(n times sigma). These two bounds have separate proofs. First, let us consider the field-dependent lower bound. Here we want to show that m must be at least log q over log n, and we will sketch the proof. It will be a little hand-wavy, but it contains all of the main ideas. To prove this, consider any n times m messages seen by the analyzer. Since each message could have come from any of the n users, there are n to the power nm ways to assign them back to the n users. On the other hand, these messages correspond to some output. From the security perspective, if we look at two input sequences that correspond to this very same output, then the analyzer intuitively shouldn't be able to distinguish whether the messages came from one input sequence or the other.
And for a fixed output, there are q to the power (n minus 1) possible input sequences that result in this same output. However, recall that there are only n to the power nm ways to reassign the messages back to the users. So this means we should have n to the power nm greater than or equal to q to the power (n minus 1), and indeed this implies that m is greater than or equal to roughly log q over log n, as desired. Next, we move on to prove the security-dependent lower bound. This says that if we want a security parameter sigma, then we must have m at least sigma over log(n times sigma). Proving this bound for general protocols needs a lot of notation, so for simplicity of presentation, we will only prove this lower bound for the split-and-mix protocol here. To show that m must be at least sigma over log(n times sigma), it suffices for us to show that, for each m, the security parameter is at most m times log(nm). Alright, so let's prove this. What does it mean for the security parameter to be at most m times log(nm)? By definition, it means that there must exist two input sequences, x1 to xn and x1' to xn', with the same output, in this case just the same sum, such that the total variation distance between the message distributions is at least (nm) to the power minus m. Here it will be more convenient to use an equivalent characterization of total variation distance in terms of distinguishers. Specifically, it suffices for us to give a distinguisher that takes in the n times m messages and either accepts or rejects, such that its acceptance probability when the messages come from x1' to xn' differs from its acceptance probability when the messages come from x1 to xn by at least (nm) to the power minus m. It turns out that the two input sequences and the distinguisher are very simple. For the first input sequence, x1 to xn, the inputs are just all 0.
On the other hand, we set x1' to be 1, x2' to be minus 1, and the remaining inputs to be 0. As for the distinguisher, it just randomly picks m out of the n times m messages and accepts if and only if the sum of these m messages equals 1. The point here is that if these m messages are not all from the same user, then the probability of acceptance is exactly 1 over q. On the other hand, if they are all from the same user, then in the first input sequence we will never accept, because the sum is always 0, while in the second input sequence, if these m messages are from the first user, then we will always accept, because the sum is always 1. So the difference in acceptance probabilities is at least the probability that all m picked messages come from the first user, and indeed this probability is at least 1 over (nm) to the power m, as desired. This is the entire proof of the security-dependent lower bound for the split-and-mix protocol. For more general protocols, the main idea is still pretty similar, in that the distinguisher randomly picks a few messages and accepts if and only if those messages satisfy a certain predicate. However, this predicate will have to depend on the specifics of the protocol, and we will not go into more detail here due to time constraints. So in conclusion, we give an improved analysis of Ishai et al.'s split-and-mix protocol, and we prove an essentially tight lower bound on the number of messages that holds not only against the split-and-mix protocol, but against any non-interactive protocol in the anonymous model. Nonetheless, there are still quite a few interesting open questions. First, although we achieve a tight lower bound in terms of the number of messages, we do not yet have tight lower bounds in terms of the message size, or equivalently the number of bits per message. Currently, in the upper bound, the split-and-mix protocol, each message is an element of F_q, so representing a message requires log q bits.
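As a sanity check, the distinguisher above is easy to simulate. The sketch below uses my own function names and illustrative constants; with the all-zeros input versus the input (1, -1, 0, ..., 0), the empirical acceptance probabilities should differ by roughly the chance that all m sampled messages came from the first user:

```python
import random

def randomizer(x, m, q):
    """Split-and-mix: m - 1 random shares plus one correcting share."""
    shares = [random.randrange(q) for _ in range(m - 1)]
    shares.append((x - sum(shares)) % q)
    return shares

def accept_rate(inputs, m, q, trials=20000):
    """Empirical probability that a random size-m subset of the
    shuffled messages sums to 1 mod q."""
    hits = 0
    for _ in range(trials):
        msgs = [s for x in inputs for s in randomizer(x, m, q)]
        picked = random.sample(msgs, m)  # uniform subset; shuffling is implicit
        hits += (sum(picked) % q == 1)
    return hits / trials

n, m, q = 4, 2, 101
zeros = [0] * n
spiked = [1, q - 1] + [0] * (n - 2)   # x1' = 1, x2' = -1, rest 0
gap = accept_rate(spiked, m, q) - accept_rate(zeros, m, q)
# The gap should comfortably exceed the (nm)^(-m) bound from the proof.
assert gap > (n * m) ** (-m)
```

In this toy setting the gap concentrates around 1 over (nm choose m), which is about 0.036 here, well above the (nm) to the minus m bound of about 0.016 used in the proof.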
On the other hand, the only known lower bound is the trivial one, which is log q over m, where m is the number of messages per user. So it remains an interesting open question to close this gap. Another open question is more open-ended in nature: to the best of our knowledge, all non-interactive protocols in the anonymous model involve summation in one way or another. So our question here is whether there are interesting non-interactive protocols, or problems, that are very different from summation. Alright, and with that, I would like to conclude my presentation. Thank you very much for your attention.