Hello, today I will be talking about formal verification of probabilistic programs. These programs have the ability to sample values from distributions during execution, and they arise in a variety of contexts. They are used in cryptography, differential privacy, and more generally in security. Within cryptography, randomness is used to provide the level of entropy required to protect secrets against powerful adversaries. In differential privacy, randomness is used to noise the output of queries in order to protect individual privacy. In both cases, formal verification is desirable. However, many security and privacy properties are relational, that is, they refer to executions of two programs. Today, we will be looking at program logics for carrying out relational verification of probabilistic programs. As a disclaimer, let me point out that probabilistic programs are different from probabilistic programming: probabilistic programming languages additionally allow for conditioning, which we will not consider in our setting. Now let me turn to relational properties. Consider the simple setting of two programs, P1 and P2, which both terminate, and let me focus on the class of input-output properties of programs. In this case, we want to ensure that program executions map related inputs to related outputs, for suitable relations between inputs and outputs. These relations can be modeled using relational preconditions and postconditions, which are relations on memories, and we require that running programs P1 and P2 on inputs related by the precondition yields outputs related by the postcondition. This simple setting already captures many properties of interest. The simplest property is equivalence. Two programs are equivalent if they have the same input-output behavior. This can be captured by requiring equality of memories, both as precondition and as postcondition. A generalization of equivalence is non-interference.
Non-interference is a security property that guarantees that executing a program does not leak data. Non-interference is based on a notion of low equivalence, which relates memories that are equivalent from the point of view of the adversary. Non-interference states that running a program on low-equivalent memories yields low-equivalent memories, or equivalently, that executing a program does not leak secrets. In this case, the precondition and the postcondition are just low equivalence of memories. A variant of non-interference is leakage resilience. This property is motivated by physical side-channel attacks. In this setting, we assume that the adversary can observe leakage that results from the physical execution of the program, for instance, timing, and we require that this leakage does not reveal information about the secrets. This is achieved by taking as precondition low equivalence between memories and as postcondition equality of leakage. To conclude, let me mention a slightly different property: robustness. This property assumes that the input and output spaces are equipped with a distance, and requires that programs do not make the distance between inputs grow too much. For example, one might require that if we take two inputs at distance smaller than one, the outputs of the computation are at distance smaller than k. This property is known in the literature as k-Lipschitz continuity. These properties are stated in a deterministic setting, but they can naturally be lifted to a probabilistic setting. However, the probabilistic setting favors richer properties. In particular, these properties can be quantitative: rather than requiring equality, I might require almost equality. In this setting, the postcondition becomes a quantitative relation on distributions. This raises a number of questions. How do we model such properties? How do we verify them? Can we use standard verification tools to achieve practical verification?
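As an illustration of non-interference as a relational input-output property, one can run a toy deterministic program on two memories that agree on the adversary-visible part and check that the visible outputs agree. This is a minimal sketch; the `low`/`high` memory split and the example programs are hypothetical, not part of the formalism above:

```python
def program(mem):
    # A non-leaky program: the low output depends only on the low input.
    return {"low_out": mem["low"] * 2, "high_out": mem["high"] + 1}

def low_equivalent(m1, m2, low_vars):
    # Two memories are low-equivalent if they agree on adversary-visible vars.
    return all(m1[v] == m2[v] for v in low_vars)

def check_noninterference(prog, m1, m2, low_in, low_out):
    # Relational check on one pair of runs: low-equivalent inputs
    # must yield low-equivalent outputs.
    if not low_equivalent(m1, m2, low_in):
        return True  # precondition fails, so the judgment holds vacuously
    return low_equivalent(prog(m1), prog(m2), low_out)
```

For instance, `program` passes the check on two memories with equal `low` but different `high`, while a program returning `mem["high"]` into `low_out` fails it.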
In our work, we develop a solution based on an old concept from the theory of Markov chains called probabilistic couplings. Probabilistic couplings were introduced in 1938 by Doeblin to study the convergence of Markov chains. The basic idea of coupling is that when you have to compare two probabilistic processes, it is better to build a single probabilistic process that emulates the behavior of the two. Similarly for distributions: when you want to relate two distributions, it is better to build a single distribution that emulates the behavior of the two. And how is this done? This is done by exhibiting a distribution over the product space such that the first and the second marginals correspond respectively to the first and the second distribution. Couplings always exist, and there are many of them. As an example, let us consider couplings of the uniform distribution over coins, that is, the uniform distribution over the set {0, 1}. We have at least three distributions which can be used as couplings of this distribution with itself. The first one is the trivial (independent) coupling, which assigns probability one quarter to every element of the product set. The second is the equality coupling, which assigns probability one half to (0,0) and (1,1), and probability zero to the other elements. Finally, the last one is the so-called inequality coupling, which assigns probability one half to (0,1) and (1,0), and zero to the other elements. It is easy to check that these are valid couplings of the uniform distribution. However, these couplings satisfy different properties. To capture properties of couplings, we actually use a more elaborate notion, called R-couplings. R-couplings can be understood as an enrichment of the notion of coupling with a notion of postcondition. So suppose we are given a relation R on A1 × A2, where mu1 is a distribution over A1 and mu2 is a distribution over A2.
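The three couplings of the fair coin described above can be written out and checked concretely. A minimal sketch, using Python dicts as finite distributions (an illustrative encoding, not something from the talk):

```python
# The fair coin: uniform distribution over {0, 1}.
coin = {0: 0.5, 1: 0.5}

# Three couplings of (coin, coin), each a distribution on {0,1} x {0,1}.
trivial    = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}  # independent
equality   = {(0, 0): 0.5, (1, 1): 0.5}                      # forces x1 == x2
inequality = {(0, 1): 0.5, (1, 0): 0.5}                      # forces x1 != x2

def marginals(mu):
    # Project a joint distribution on pairs to its two marginals.
    m1, m2 = {}, {}
    for (a, b), p in mu.items():
        m1[a] = m1.get(a, 0.0) + p
        m2[b] = m2.get(b, 0.0) + p
    return m1, m2

def is_coupling(mu, mu1, mu2):
    # mu is a coupling of (mu1, mu2) iff its marginals are mu1 and mu2.
    m1, m2 = marginals(mu)
    return m1 == mu1 and m2 == mu2
```

All three pass `is_coupling(_, coin, coin)`, yet they enforce very different relations on their support, which is exactly what R-couplings make precise.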
Then mu is an R-coupling for (mu1, mu2) if mu is a coupling for (mu1, mu2) and the support of mu is included in R, which means that every element with non-zero probability under mu satisfies R. The basic intuition behind R-couplings is that R is a valid postcondition for the coupling. Let us go back to fair coins. The trivial coupling is a ⊤-coupling, so the trivial relation ⊤ is a valid postcondition for it. In contrast, the equality coupling is an equality-coupling, so equality is a valid postcondition for it, and the inequality coupling is an inequality-coupling, which means that inequality is a valid postcondition for it. This notion of R-coupling, and more generally the notion of coupling, is used in the literature for a number of purposes. It is used in particular to reason about the statistical distance between two distributions. Given a coupling mu between mu1 and mu2, one can always upper bound the statistical distance between mu1 and mu2 by the probability of the event x1 ≠ x2 under mu: this probability is always an upper bound on the total variation distance. Furthermore, this bound can be attained, by the so-called optimal coupling. This result is often used to prove convergence of Markov chains. Another application of couplings is stochastic dominance. Stochastic dominance lifts to the probabilistic setting the notion of being bigger than. In order to reason about stochastic dominance, we must assume that the underlying set is partially ordered. Then one can show that a distribution mu2 stochastically dominates mu1 if and only if there exists a ≤-coupling for (mu1, mu2). These two applications of couplings are in fact rather different from the application we will use in our work. Our main application is based on the so-called fundamental theorem of couplings, and our goal here is to reason about the probability of an event E1 under the distribution mu1 and the probability of an event E2 under the distribution mu2.
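The coupling bound on statistical distance, and the optimal coupling that attains it, can be sketched for finite distributions. The construction below is the textbook one (keep as much mass as possible on the diagonal, pair up the leftover mass off-diagonal); the dict encoding is an illustrative assumption:

```python
def total_variation(mu1, mu2):
    # Statistical (total variation) distance between finite distributions.
    keys = set(mu1) | set(mu2)
    return 0.5 * sum(abs(mu1.get(k, 0.0) - mu2.get(k, 0.0)) for k in keys)

def prob_differ(mu):
    # Pr[x1 != x2] under a joint distribution mu on pairs.
    return sum(p for (a, b), p in mu.items() if a != b)

def optimal_coupling(mu1, mu2):
    # Put the overlap min(mu1, mu2) on the diagonal, then distribute the
    # leftover mass of mu1 against the leftover mass of mu2 off-diagonal.
    keys = set(mu1) | set(mu2)
    overlap = {k: min(mu1.get(k, 0.0), mu2.get(k, 0.0)) for k in keys}
    mu = {(k, k): p for k, p in overlap.items() if p > 0}
    leftover = sum(mu1.get(k, 0.0) - overlap[k] for k in keys)
    if leftover == 0:
        return mu  # identical distributions: the diagonal coupling suffices
    for a in keys:
        for b in keys:
            p1 = mu1.get(a, 0.0) - overlap[a]
            p2 = mu2.get(b, 0.0) - overlap[b]
            if p1 > 0 and p2 > 0:
                mu[(a, b)] = mu.get((a, b), 0.0) + p1 * p2 / leftover
    return mu
```

For any coupling, `prob_differ` is at least the total variation distance; for the optimal coupling it is exactly the total variation distance.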
And our goal will be to establish an inequality between these two probabilities. In order to establish such an inequality, the fundamental theorem of couplings says that it suffices to come up with an R-coupling for an appropriate relation R: if the relation R satisfies the condition stated on the slide and an R-coupling exists, then the inequality holds. Instances of the fundamental theorem of couplings are the bridging-step and failure-event lemmas, which are widely used in cryptography. The first allows you to show equality between the probabilities of two events. The second allows you to bound the difference in probability between two events by the probability of a third event F, called a failure event, which is assumed to happen very seldom. These lemmas will be used in many of our applications, in particular in cryptography. Because these results rely on the existence of R-couplings, it is important to know how to build such R-couplings. There are some general results for showing the existence of R-couplings. A beautiful result is Strassen's theorem, which shows that in order to establish the existence of an R-coupling between two distributions mu1 and mu2, it suffices to show a universal statement: for every event X over A1, the probability of X under mu1 is upper bounded by the probability, under mu2, of the image of X under R. This theorem is beautiful because it reduces the existence of an R-coupling to a universal statement. Another way of establishing the existence of R-couplings, which has been widely studied in the context of probabilistic bisimulation, is based on optimal transport: in order to show the existence of an R-coupling between distributions mu1 and mu2, it suffices to show that the maximum flow is one in a flow network that represents the distributions mu1 and mu2 on the source and sink sides, respectively. These results allow us to show the existence of R-couplings.
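Strassen's condition can be checked by brute force on small finite distributions. A sketch, where the relation is encoded as a set of pairs (an illustrative encoding, exponential in the support size and only meant for tiny examples):

```python
from itertools import chain, combinations

def strassen_holds(mu1, mu2, r):
    # r is a set of pairs (a1, a2). Strassen's theorem: an R-coupling of
    # (mu1, mu2) exists iff mu1(X) <= mu2(R(X)) for every event X over A1.
    a1 = list(mu1)
    subsets = chain.from_iterable(combinations(a1, n) for n in range(len(a1) + 1))
    for xs in subsets:
        image = {b for (a, b) in r if a in xs}          # R(X)
        lhs = sum(mu1[a] for a in xs)                   # mu1(X)
        rhs = sum(mu2.get(b, 0.0) for b in image)       # mu2(R(X))
        if lhs > rhs + 1e-12:
            return False
    return True
```

For example, the condition holds for the fair coin against itself with the equality relation, and it certifies stochastic dominance of a biased coin over a fair one via the ≤ relation.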
However, the most important point about R-couplings is that they can be built compositionally. Indeed, one can look at the standard sequential composition of probabilistic mappings and show that R-couplings compose sequentially: if we have distributions mu1 and mu2 over a set A, related by an R-coupling, and mappings M1 and M2 taking elements of A to distributions over B1 and B2, such that for every pair related by R the resulting distributions are related by an S-coupling, then we can build an S-coupling for the sequentially composed distributions. This very important property is the basis of our relational verification technique. R-couplings also satisfy other important properties, such as closure under relational composition and under convex combinations, and we occasionally use these properties in our verification as well. Let me now explain how we use coupling-based verification in the concrete setting of an imperative probabilistic language. We consider a core imperative language with sampling from probability distributions and procedure calls. Our procedure calls include both oracle calls, that is, calls to procedures for which we have the code, and adversarial calls. Adversaries are procedures for which we do not have the code; we just know that they may call some oracles. Such adversarial procedures arise commonly in cryptography and therefore must be handled if we want to apply our verification framework to this setting. Now, to reason about relational properties of probabilistic programs written in this language, we develop a probabilistic relational Hoare logic. Judgments of our logic are similar to those of standard relational Hoare logic: the precondition and the postcondition are relations on memories, and c1 and c2 are programs. However, c1 and c2 are probabilistic programs.
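The sequential composition property can be illustrated concretely: composing a coupling with a coupling-valued kernel through monadic bind again yields a coupling of the two composed computations. A minimal sketch with finite distributions; the dict encoding and the "flip both sides" example are illustrative assumptions:

```python
def bind(mu, f):
    # Monadic bind for finite distributions represented as dicts.
    out = {}
    for a, p in mu.items():
        for b, q in f(a).items():
            out[b] = out.get(b, 0.0) + p * q
    return out

def marginals(mu):
    # Project a joint distribution on pairs to its two marginals.
    m1, m2 = {}, {}
    for (a, b), p in mu.items():
        m1[a] = m1.get(a, 0.0) + p
        m2[b] = m2.get(b, 0.0) + p
    return m1, m2

coin = {0: 0.5, 1: 0.5}
eq_coupling = {(0, 0): 0.5, (1, 1): 0.5}   # equality coupling of (coin, coin)

def flip_kernel(pair):
    # For each related pair, a (deterministic) coupling of "negate the sample"
    # run on both sides; its support stays inside the equality relation.
    a1, a2 = pair
    return {(1 - a1, 1 - a2): 1.0}

# Coupling of the two composed computations, built by bind.
composed = bind(eq_coupling, flip_kernel)
```

The marginals of `composed` are exactly the binds of the original marginals with the negation map, so `composed` is a coupling of the composed programs, and its support still satisfies equality.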
Therefore, in order to define the notion of validity for this judgment, one must say what it means for the outputs of executing probabilistic programs on memories to be related by the postcondition. We use the existence of a coupling to give meaning to the postcondition. Concretely, we say that the judgment is valid if, whenever we start from two memories related by the precondition, the output distributions are related by the postcondition — or, more precisely, there exists a coupling for the output distributions that is valid for the postcondition. This forms the semantic basis of our logic. One can then show that the standard rules of Hoare logic remain valid, including the sequence rule and the conditional rules. Note that for a relational logic, we have both so-called two-sided rules and one-sided rules. Two-sided rules are rules where the top-level shape of the program is the same on both sides; so for conditionals, both programs will be if-then-else statements. Note that the two-sided rule for conditionals requires that the two programs have the same control flow, as embodied by the precondition b1 = b2 in the conclusion. However, we can also reason about programs which do not have the same control flow, by using so-called one-sided rules. In such rules, we do not require that the two programs have the same top-level construct. In fact, if you look at the last rule, you will see that the program on the right-hand side, c2, is arbitrary. This allows us to do a case analysis on the guard of the first program without caring about the structure of the second one. We also need rules for random assignments. Again, we have two-sided and one-sided rules. The two-sided rule simply requires the existence of an appropriate coupling between the two distributions from which we are sampling. The one-sided rule treats random sampling as a non-deterministic assignment.
Again, this rule is sound with respect to our interpretation of judgments. The last and most challenging rule is the rule for adversaries. Remember that an adversary is a procedure for which you do not have the code, but which you know calls specific oracles. Here, I am assuming that the adversary A can call a single oracle O. What I would like to show is that an adversary call preserves an invariant, and there are some conditions for this. In order for an adversary call to preserve an invariant, the invariant must be preserved by the oracle itself. Moreover, the read and write effects of the adversary must be constrained suitably, so as to interact well with the invariant. Again, it is possible to prove soundness of this rule under our interpretation, by induction on the structure of the code of the adversary. This gives us a logic for reasoning about relational properties of adversarial probabilistic programs. We have used this logic for a number of purposes. A first application is provable security of cryptographic constructions. Security of cryptographic constructions usually relies on reductionist arguments: we do not prove absolute security, we prove security relative to a hardness assumption. Therefore, security statements are of the following form. We first assume that we have an inverter I that tries to break RSA. Then we assume that we have an attacker A that tries to break a cryptographic construction, say RSA-OAEP, which is a famous encryption scheme. What we want to show is that if the adversary A is able to break the encryption scheme, then either the inverter breaks RSA, which we believe to be impossible, or some bad event happens. Moreover, we prove that this bad event has very low probability. This allows us to give an upper bound on the probability of an adversary breaking the encryption scheme, which is given by the inequality below.
The inequality at the bottom of the slide essentially states that the advantage of an adversary in breaking RSA-OAEP encryption is not significantly bigger than the advantage of an inverter in breaking RSA. The expressions on the right-hand side depend on the number of calls that the adversary can make to the decryption and hash oracles. This is an instance of a quantitative property that we need to handle. Another application, also involving quantitative verification, is differential privacy. The basic idea of differential privacy is to add probabilistic noise in order to protect individual privacy. A challenging example in differential privacy is the sparse vector technique. This algorithm tries to find the elements of a list that are above a threshold, and it returns a bounded number of such elements. The sparse vector algorithm achieves something of a magic trick, because it achieves very strong privacy with a small budget. If you look at the code, you will see a number of random samplings called Lap, the Laplace mechanism, which adds noise to computations in order to make them differentially private. Each call to the Laplace mechanism incurs a budget of epsilon, so you would expect that at the end, the algorithm would be n·epsilon differentially private. However, sparse vector manages to achieve a better privacy budget, m·epsilon, through a complex argument. In order to show the correctness of this algorithm, we rely on a generalization of probabilistic relational Hoare logic called approximate relational Hoare logic, which is based on approximate couplings, and we use new proof rules which are quite intricate and rely on additional properties of approximate couplings. A final application of coupling-based methods is stability of stochastic gradient descent.
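A minimal sketch of the sparse vector (above-threshold) idea may help fix intuitions. The noise scales used here (2/ε for the threshold, 4c/ε per query) follow one standard presentation of the technique and are an assumption, not necessarily the variant on the slide:

```python
import math
import random

def laplace(scale, rng):
    # Sample Lap(scale) by inverse-transform sampling.
    u = rng.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def sparse_vector(queries, threshold, epsilon, c, rng):
    # Above-threshold sketch: report the indices of at most c queries that
    # exceed a noisy threshold. Only c answers are released, which is why
    # the privacy cost scales with c rather than with len(queries).
    noisy_t = threshold + laplace(2.0 / epsilon, rng)
    answers = []
    for i, q in enumerate(queries):
        if q + laplace(4.0 * c / epsilon, rng) >= noisy_t:
            answers.append(i)
            if len(answers) == c:
                break
    return answers
```

Note that fresh noise is drawn per query but only the comparison outcome is used; the subtle interplay between these noise draws is exactly what the approximate-coupling proof tracks.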
The notion of algorithmic stability is used in machine learning to guarantee that an algorithm does not depend too much on any single example. This ensures that the learning algorithm does not overfit. Using a similar logic, we have shown that stochastic gradient descent satisfies epsilon differential privacy. This logic uses expectation-based couplings, another variant of couplings. All these logics have been implemented in a tool called EasyCrypt, which is a domain-specific proof assistant for relational verification of probabilistic programs. The specificity of EasyCrypt is to achieve both control and automation by combining an interactive proof engine à la Coq/SSReflect with a backend to SMT solvers. We have used EasyCrypt for many case studies, including encryption, signatures, and many other constructions from cryptography. Some of our most recent applications are the verification of the SHA-3 standard of the National Institute of Standards and Technology, the verification of voting schemes, and the verification of the key management service of Amazon AWS. We have also used EasyCrypt for verifying the security of constructions against side channels. We have used EasyCrypt for verifying many differentially private algorithms, and for showing algorithmic stability and other properties based on expectations, including convergence of population dynamics. In the remainder of the talk, I will consider the higher-order case. It is reasonable to carry out verification in an imperative language, because imperative languages are close to algorithmic descriptions. However, there are also some strong arguments for considering higher-order languages. First of all, many problems are inherently higher-order: they involve functions. In addition, there is a practical motivation for considering higher-order languages.
Specifically, if I want to build a secure programming framework for differential privacy or for cryptography on top of a higher-order language, then I need a verification infrastructure that is applicable to this language. The main challenges in going to the higher-order case are dealing with structurally different expressions and dealing with adversarial computations. We have been working on this problem for a number of years and still have not completely reached a satisfactory solution. Our first attempt was with relational refinement types. Relational refinement types are very effective when the expressions are structurally similar, but become very cumbersome whenever we have to reason about programs that do not have the same structural shape. A better alternative is therefore to use a relational higher-order logic, which works well for structurally different expressions. Relational higher-order logic starts from a simply typed lambda calculus, which is assumed to be terminating, and then considers logical judgments on top of it. Here, judgments consist of a simply typed context gamma; a set of assertions psi; and, on the right of the turnstile, two expressions t1 and t2 which we want to relate, of respective types T1 and T2, together with an assertion phi which relates t1 and t2. Note that phi is allowed to contain two distinguished variables, r1 and r2, which refer to t1 and t2 respectively. For people who know specification languages like JML, you can think of r as res. This forms a basic logic for reasoning relationally about higher-order terms. A benefit of this logic is that we can have both two-sided and one-sided rules. If we look at the two-sided rules for abstraction and application, they are just the standard rules, where we produce in the conclusion either two abstractions or two applications.
The more challenging rules are the one-sided rules, which fortunately, in the case of relational higher-order logic, are easy to formulate. If you look at the one-sided rules for abstraction and application, you will observe that in both cases the term on the right-hand side is arbitrary. On the left-hand side we have an abstraction or an application, respectively, but we can consider an arbitrary term on the right-hand side. This gives us the flexibility to reason about structurally different programs. In order to achieve relative soundness and completeness with respect to higher-order logic, we use a strong subsumption rule. This subsumption rule guarantees that our logic is sound and relatively complete, so it has very good guarantees. The question is then: how can we lift this logic to the probabilistic setting and support adversarial computations? It is reasonably easy to extend the logic to the probabilistic case by considering a coupling modality, which essentially captures the existence of a coupling with respect to the postcondition. The slide presents the rules for unit and bind, and it is easy to see that these rules are relatively straightforward. The challenging case is that of the adversary. Remember that an adversary is a stateful probabilistic computation that has access to oracles. Therefore, it is intuitive to think of an adversary as a second-order function that takes as input an oracle and produces an output. So here on the slide, I indicate that A has type (tau → M(sigma)) → M(sigma'), where M is a combination of the state and probability monads. This informal type does not suffice, however. Remember that in the imperative case, the rule for adversaries was making use of the read and write capabilities of the adversary. We have to do the same in the higher-order setting. Therefore, we have to use a graded monad, where the grading reflects the read and write capabilities of the computation.
In this setting, an adversary then becomes an expression with a universally quantified type, where the universal quantification ranges over the read and write effects of computations. The idea is that the read and write effect of the adversary is a combination of its own read and write effect and the read and write effect of the oracle it invokes. With this stronger typing, it is possible to prove soundness of a proof rule for adversaries. However, the soundness proof is fairly challenging and in particular requires establishing results about effect parametricity. Unfortunately, this is only part of the story. When we want to reason about quantitative properties, we need quantitative proof rules, and these quantitative proof rules need to know how many times the adversary calls its oracles. In order to keep track of how many times an adversary calls an oracle, we need to enrich the type system with bounded exponentials. This leads to a more complex adversary rule for which soundness remains to be established. In conclusion, coupling-based verification is powerful and practical. However, there remain a number of theoretical challenges. A first challenge is to deal with conditioning. Coupling-based reasoning in the presence of conditioning is possible but challenging, and further developments are required to establish appropriate proof principles in this setting. Another theoretical challenge is to consider more general notions of adversaries. Our notion of adversary as a second-order expression is motivated by their use in cryptography, where adversaries are essentially second-order. However, there are more complex notions, including meta-reductions, which require higher-order notions of adversaries; developing sound proof principles for such notions of adversaries is left for future work. There are also a number of practical challenges. Although EasyCrypt has been widely used for a number of applications, it still requires a lot of human effort.
In order to alleviate this human effort, it is important to develop domain-specific automation, in particular to deal with specific classes of cryptographic constructions. This is possible, and one can even automatically synthesize secure constructions within a given class, but further work is required to systematize this approach beyond the basic constructions that have been considered in the literature. Finally, there remains a lot of work to be done on applications. Many new areas, including algorithmic stability, but also other expectation-based notions, still remain to be investigated. Thank you for your attention.