Welcome to the presentation of our paper "Coupling of Random Systems". We begin with a short motivation and an explanation of what we mean by a random system. A random system is an object that operates in rounds: in each round i, one submits a query x_i and receives a response y_i. Each round may depend on the previous rounds, capturing the fact that the system has state, and the system may be probabilistic, meaning that each response can be sampled with some randomness. By "random system" we really mean the behavior, and this is completely independent of how the behavior is implemented. In particular, we do not fix any computational model or any representation of such a system, say as pseudocode.

Random systems have been characterized before by a sequence of conditional probability distributions, with the understanding that, for every sequence of previous inputs and outputs and a new input x_i, one obtains a probability distribution of the corresponding output y_i. A typical example of a random system is the uniform random function R from some domain X to a codomain Y: for every fresh query x it returns an independent uniform random value from Y, except that a query asked before receives the same response as before. The uniform random permutation P on a domain X is another random system, different from R.

Our contributions are as follows. First, we present a simple theory of random systems with a characterization that differs from the conditional-probability-distribution view above. Second, within this theory, we prove a coupling theorem for random systems, which gives a tight characterization of the distinguishing advantage that is classically defined via a distinguisher in the information-theoretic setting. Third, we use this coupling theorem to prove a new indistinguishability amplification result, essentially stating that any threshold combiner is also an amplifier for indistinguishability in the information-theoretic setting.

The starting point of the new theory is what we call a deterministic discrete system (DDS). In the single-query case, this is just a function from X to Y. More generally, with multiple queries, it is a partial function from the set of sequences over X to Y: the first output y_1 is a function of the first input alone, the second output is a function of the first and the second input, and so on. This is because we want to capture what one would classically call a stateful system: the second output depends not only on the second input but also on everything that happened previously, which is why each output depends on all of the previous inputs.

Here we see the four single-query DDS on the domain and codomain {0, 1}. For example, the system "flip" answers 0 when queried with 1, and 1 when queried with 0. These are single-query systems; with more queries, the tree would simply continue to greater depth.

But of course we also want probabilistic systems, and these are not captured yet. The way we think about this is to say that a probabilistic discrete system (PDS) is simply a probability distribution over deterministic discrete systems, that is, over DDS. For example, we can define the PDS V as the uniform distribution over the four DDS we saw on the previous slide.
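To make the DDS/PDS view concrete, here is a minimal Python sketch. This is our own illustration, not from the paper; the names const0, const1, identity, flip, and sample_and_query are ours. It encodes the four single-query DDS on bits and the PDS V as a uniform distribution over them.

```python
import random

# The four single-query DDS on {0,1}: each is just a function {0,1} -> {0,1}.
const0 = lambda x: 0      # always answers 0
const1 = lambda x: 1      # always answers 1
identity = lambda x: x    # echoes the query
flip = lambda x: 1 - x    # negates the query

# A PDS is a probability distribution over DDS. The PDS V is the uniform
# distribution over the four single-query DDS above.
V = [(0.25, const0), (0.25, const1), (0.25, identity), (0.25, flip)]

def sample_and_query(pds, x):
    """Sample a DDS according to the PDS, then answer the query x."""
    r, acc = random.random(), 0.0
    for prob, dds in pds:
        acc += prob
        if r < acc:
            return dds(x)
    return pds[-1][1](x)  # guard against floating-point rounding

# The behavior of V: whatever we query, the answer is a uniform random bit.
samples = [sample_and_query(V, 0) for _ in range(100_000)]
print(sum(samples) / len(samples))  # approximately 0.5
```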
And if you look at the behavior of this PDS V, you see that no matter what we query, 0 or 1, the answer is a uniform random bit: with probability 1/2 we see 0, and with probability 1/2 we see 1. Interestingly, it turns out that different PDS can exhibit this very behavior. For example, we can define a parameterized PDS V_α, for α in the interval [0, 1/2], which is in general a different PDS, but it turns out to exhibit exactly the same behavior. This is why we introduce an equivalence relation on PDS: two PDS are equivalent if they behave equally, meaning that for every input sequence, the resulting output distribution is the same. And since, as we said before, by random system we mean the behavior, we propose to think of a random system as an equivalence class of PDS, and we write it in boldface to denote the equivalence class. Returning to the earlier example, one can show that the set of all V_α, for α in [0, 1/2], is exactly one equivalence class, meaning there exists no other PDS with this behavior.

Now one may ask: why did we do this, and what did we gain? The answer is that, since our systems are now probability distributions over deterministic systems, we can lift results, statements, and definitions that classically concern probability distributions to the level of random systems. Our main contribution does this for the distance of two systems.

The starting point is the statistical distance on probability distributions, which is arguably exactly how one wants to measure the distance of two probability distributions, essentially because of its well-known coupling interpretation. The coupling interpretation tells us that two probability distributions with statistical distance ε can be considered as being equal with probability 1 − ε, in the sense that there exists a joint distribution (a coupling) of X and Y under which the probability that X and Y are not equal is exactly the statistical distance; a small sketch of this maximal coupling follows below. Our goal is to lift this way of thinking, which is just about probability distributions, to the level of random systems.

We do this by defining a new distance on random systems: the distance of two random systems, boldface S and boldface T, is the infimum of the statistical distance of the PDS S and T, taken over all pairs of PDS S and T in the respective equivalence classes. The reason we need to take this infimum is that there exist PDS that are equivalent but have statistical distance 1, so just taking the statistical distance of some arbitrary PDS in the equivalence classes would not be a meaningful notion of distance. Going back to the systems V_α: these are all equivalent, but if you choose α = 0 and α = 1/2, the resulting PDS, although equivalent, have statistical distance 1.

Classically, of course, we already have a distance on systems: the distinguishing advantage. It is defined via a distinguisher, which is itself essentially a system that interacts with the given system and then outputs a guess bit, which we call Z here, trying to guess whether it is interacting with S or with T.
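Returning to the coupling interpretation mentioned above, here is a minimal sketch of the standard maximal coupling of two finite distributions. This construction is textbook material, not specific to the paper, and the helper names are ours: with probability 1 − SD(p, q) both variables are drawn equal from the normalized overlap, and otherwise they are drawn from the disjoint difference parts, so P[X ≠ Y] equals the statistical distance.

```python
import random

def statistical_distance(p, q):
    """SD(p, q) = 1/2 * sum of |p(x) - q(x)| over a common finite support."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def sample(dist):
    r, acc = random.random(), 0.0
    for x, prob in dist.items():
        acc += prob
        if r < acc:
            return x
    return x  # guard against floating-point rounding

def maximal_coupling(p, q):
    """Sample (X, Y) with marginals p and q such that P[X != Y] = SD(p, q)."""
    d = statistical_distance(p, q)
    support = set(p) | set(q)
    overlap = {x: min(p.get(x, 0.0), q.get(x, 0.0)) for x in support}
    if random.random() < 1 - d:
        # With probability 1 - d, draw X = Y from the normalized overlap.
        x = sample({k: v / (1 - d) for k, v in overlap.items()})
        return x, x
    # Otherwise draw X and Y from the normalized difference parts,
    # which have disjoint supports, so X != Y here.
    x = sample({k: (p.get(k, 0.0) - overlap[k]) / d for k in support})
    y = sample({k: (q.get(k, 0.0) - overlap[k]) / d for k in support})
    return x, y

p = {0: 0.7, 1: 0.3}
q = {0: 0.4, 1: 0.6}
trials = [maximal_coupling(p, q) for _ in range(100_000)]
print(sum(x != y for x, y in trials) / len(trials))  # approx SD(p, q) = 0.3
```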
Given such a distinguisher, we can define its distinguishing advantage in the usual way, and the optimal distinguishing advantage as the supremum of this advantage over all distinguishers. We consider an information-theoretic setting here, so we really go over all possible distinguishers, not only efficient ones.

The natural question is, of course, how these two distances are related. One direction is very easy to see: the coupling interpretation already gives the intuition that if two systems are equal with probability 1 − δ, then no distinguisher can achieve advantage better than δ. This trivial observation shows that the optimal distinguishing advantage cannot be larger than the distance of boldface S and boldface T. But the question is: what is the gap? Interestingly, we have a theorem showing that there is no gap. In particular, there exist PDS S and T in the equivalence classes such that their statistical distance is exactly the optimal distinguishing advantage of boldface S and boldface T.

Now, of course, we can take the coupling interpretation of the statistical distance and plug it in. What we get is that the distinguishing advantage of two systems, boldface S and boldface T, equals the probability that two coupled systems S and T are not equal, for the right choice of PDS S and T. And the interpretation is that choosing this particular pair of PDS is fine, since all PDS in a class have the same behavior anyway.

For example, if a random function F is ε-close, with respect to the optimal distinguishing advantage, to a perfect uniform random function R, then we can think of F as being equal to R except when a failure event E occurs, and this event occurs with probability at most ε. What is important is that this event E depends only on the system F and its own randomness; it does not depend on how F is queried, on which object is querying it, or on the querying strategy. And what is nice about this is that we can think of the event as being sampled even before the interaction starts. This allows us to simplify reasoning about the distance and the distinguishing advantage of two systems S and T: we sample the event at the beginning, and in particular we get rid of the complexity of the adaptive interaction.

The single-query case of our theorem can be seen essentially as a consequence of the following lemma: for any probability distributions X_i and Y_i, for i from 1 to n, there exist joint distributions X = (X_1, ..., X_n) and Y = (Y_1, ..., Y_n) with the corresponding marginals X_i and Y_i, such that the statistical distance of the joint distributions is just the maximum, over all i, of the statistical distance of X_i and Y_i. The right-hand side, the maximum, corresponds to the optimal distinguishing advantage: one can understand a single-query system as a joint distribution whose component at index i is the output that would be observed on input i, and the optimal distinguisher simply chooses the input i that maximizes this statistical distance, which is exactly the distinguishing advantage it obtains. The left-hand side is then just the statistical distance of the PDS themselves: the joint distributions X and Y correspond to the PDS S and T.
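Here is a small sketch of how such a joint coupling can be built; this is our own reconstruction under the assumption that one shared uniform value U drives the maximal coupling of every pair, and the paper's actual proof may differ. Each pair (p_i, q_i) agrees whenever U is at least its statistical distance d_i, so the tuples differ only when U falls below the maximum d_i, while every marginal stays correct.

```python
import random

def sd(p, q):
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def sample(dist):
    r, acc = random.random(), 0.0
    for x, prob in dist.items():
        acc += prob
        if r < acc:
            return x
    return x  # guard against floating-point rounding

def couple_all(pairs):
    """Couple every pair (p_i, q_i) using one shared uniform value U."""
    u = random.random()
    xs, ys = [], []
    for p, q in pairs:
        d = sd(p, q)
        support = set(p) | set(q)
        overlap = {x: min(p.get(x, 0.0), q.get(x, 0.0)) for x in support}
        if u >= d:
            # U is above this pair's distance: draw X_i = Y_i from the
            # normalized overlap (marginals: d*(p-ov)/d + (1-d)*ov/(1-d) = p).
            x = sample({k: v / (1 - d) for k, v in overlap.items()})
            xs.append(x); ys.append(x)
        else:
            # U is below: draw X_i and Y_i from the normalized difference
            # parts, which are disjoint, so this pair differs.
            xs.append(sample({k: (p.get(k, 0.0) - overlap[k]) / d for k in support}))
            ys.append(sample({k: (q.get(k, 0.0) - overlap[k]) / d for k in support}))
    return tuple(xs), tuple(ys)

pairs = [({0: 0.7, 1: 0.3}, {0: 0.4, 1: 0.6}),
         ({0: 0.9, 1: 0.1}, {0: 0.8, 1: 0.2})]
trials = [couple_all(pairs) for _ in range(100_000)]
print(sum(x != y for x, y in trials) / len(trials))  # approx max_i SD = 0.3
```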
And for the general case, what we do is essentially an induction, for example over the tree or over the number of queries, where the induction step applies a generalized form of this lemma.

Finally, we use our coupling theorem to prove some new indistinguishability amplification results. For this we introduce an object we call a (k, n)-combiner: a construction C that takes n component systems and turns them into a new system, with the understanding that each index i has an associated ideal component system I_i. We call C a (k, n)-combiner if, whenever at least k of the component systems F_i are ideal, the overall constructed system behaves as if all component systems were ideal. We then show, and this is only a simplified corollary, that if you have such a combiner and plug in component systems that are ε-close to their ideals, for all i, then the overall constructed system is on the order of ε^(n−k+1)-close to the ideal system, that is, to the construction obtained if all component systems were ideal. This generalizes the indistinguishability amplification results of MPR07, which are essentially the special case k = 1 and n = 2.

Lastly, we show different ways of instantiating such an indistinguishability amplification result; here is just one of them. Essentially, we show that one can compress, or combine, n independent random functions into fewer, say k, random functions, and at the same time amplify their indistinguishability. For the details you need to read the paper, but the construction is very simple: the idea is just to query all of the functions and to combine the outputs via what is essentially a scalar product in a finite field. A small sketch of a simplified special case follows below.

To conclude, we want to mention that there is still a lot of work to be done. First, the indistinguishability amplification results we show are not obviously perfectly tight. We believe one probably cannot improve on them substantially, but it would be interesting to know whether they are really perfectly tight or whether it is possible to do better. Second, all of the results we have shown are in a purely information-theoretic setting, and it would be interesting to see to what extent it is possible to extend, say, the coupling theorem to a computational setting. Some quite restricted results are known, for example for what are essentially stateless systems, but it would be interesting to extend, generalize, and strengthen them. And lastly, we would like to apply our results, say the coupling theorem, to obtain amplification of a different kind. So far we have only proved indistinguishability amplification results that lower the quantitative value of the distance. But there are different kinds of such results: for example, in MPR07 it is shown that one can amplify the distinguisher class, going from non-adaptive to adaptive indistinguishability. It is not yet clear to us how one could use, say, the coupling theorem, or this way of thinking, to prove such a theorem. We hope to be able to do that in the future. Thank you.
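As promised above, here is a minimal sketch of a simplified special case of a combiner: XORing the outputs of all component functions, which is a (1, n)-combiner for uniform random functions. This is our simplification for illustration, not the paper's general construction, which combines the outputs via a scalar product in a finite field to obtain k output functions; the class names are ours.

```python
import secrets

class LazyURF:
    """Uniform random function into {0,1}^out_bits, sampled lazily."""
    def __init__(self, out_bits=8):
        self.out_bits = out_bits
        self.table = {}
    def query(self, x):
        # Fresh queries get independent uniform answers; repeats are cached.
        if x not in self.table:
            self.table[x] = secrets.randbits(self.out_bits)
        return self.table[x]

class XorCombiner:
    """Query all component systems on x and XOR their outputs."""
    def __init__(self, components):
        self.components = components
    def query(self, x):
        result = 0
        for f in self.components:
            result ^= f.query(x)
        return result

# Usage: even if only f2 is an ideal (uniform) random function, independent
# of the others, the combined system F behaves as a uniform random function.
f1 = LazyURF()  # stand-in for a possibly imperfect component
f2 = LazyURF()  # the ideal component
F = XorCombiner([f1, f2])
print(F.query(0), F.query(1), F.query(0))  # a repeated query is consistent
```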