So I'm happy to introduce Muthuramakrishnan Venkitasubramaniam, Muthu for short. He did his B.Tech at IIT Madras, did a PhD at Cornell, did a postdoc at NYU, and is now faculty at the University of Rochester. And he's done a lot of very important work on zero knowledge, and he's going to talk about zero knowledge proofs.

Thanks Manoj, and thanks really for inviting me. I mean, besides getting an opportunity to come to India, it's also good to talk to our crowd. I'm going to talk about zero knowledge proofs. And as Manoj said, feel free to stop me. I'm going to try to keep this at as high a level as possible, without too many technical details. So do stop me and ask questions if you have any. I come from the University of Rochester. I put a big banner over here to encourage you to go and look at our university. I'm always actively looking for PhD students, so do come and talk to me. I'm going to be here for the week if you want to learn more about Rochester and my PhD program. So, zero knowledge proofs. Let's start with an example, what I call a motivating example. It's been 30 years since this concept was created. So what's the example? A student starts a PhD here and wants to solve something big. The Clay Mathematics Institute has a list of these big open problems that have very big rewards if you solve them. And particularly important to computer science students is the P versus NP problem. I'm not going to talk about it, but I just wanted to say that a lot of you would be wanting to solve something big. And here are some examples. Let's say that a student is thinking very hard about a problem and comes to the instructor and says, I have solved P versus NP. Now the instructor asks how? And the student has written the proof and presents it to the instructor.
Now if this instructor was honest, the instructor would verify the proof and guide the student towards his or her glory, towards receiving the prize. But instead imagine that there was a malicious instructor. And this instructor sees the proof and verifies the proof and then he says, I'm not going to tell this to the student, I'm just going to go deliver this proof to the Clay Institute and get the prize. So this is not a good situation. Now can we avoid this? So the question we are asking is, can I convince someone of the validity of something? I prove some statement; can the student explain to someone that something is true without revealing the proof? And in some sense, can I reveal zero knowledge about the proof? And this is sort of the question we're going to try to solve during this hour. But at the end, I'll probably leave you on a slightly disappointing note, saying that we haven't quite solved this problem yet. We will solve many other problems, but maybe not this one, or not today. So to start towards even this, let's start with the basic question of a proof system. What is a proof? Starting from high school math, the general idea of proofs, at least from a computer science point of view, is that we consider a language of true statements. And these true statements come with a proof, derived by deduction rules from some axioms. But to truly understand what a proof is, a proof doesn't make sense until we talk about how it is verified. And what's more important is that a proof system makes sense only if you can efficiently verify it. If you can't even verify it in your lifetime or something, it is not quite a proof. So given the language L, the goal is to prove X is in L. A proof system for a language L starts with a verification algorithm V. Now a proof system should satisfy two basic properties.
One is completeness: for every true statement, meaning for every statement in the set of true statements L, there is a proof that will convince the verifier. Meaning there is a proof such that when the verifier is run with the statement and the proof, it will output accept. This alone is not sufficient for a proof system. It also needs to be sound. What does soundness mean? The verifier must reject false proofs. So for statements that do not have a proof, the statements that are false, no matter what proof you give to this verifier, it must reject it. So a proof system for a language must satisfy at least these two properties. And when we look at it from a computer science perspective, there is one more general requirement for a proof system: you need the verification to run in polynomial time in its input instance, otherwise it's not going to be meaningful. So classical proofs, this is what you learn in high school math. In computer science, I'm going to call them the languages in NP. If you don't know what that is, it's fine. I'm going to just define what it is in the context of today's zero knowledge. So a language with a classical proof system, or as I want to call it a language in NP, is a set of statements for which there is a relation R such that this relation can be verified in polynomial time, and for every true statement X there exists a witness or a proof Y such that (X, Y) is in this relation. And the class NP is the set of all languages that satisfy this. And in my view, or at least in general, the view is that this is what we refer to as a classical proof system. Now interactive proofs, which were introduced by Goldwasser, Micali and Rackoff, are a generalization of this classical notion of proofs.
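The classical notion just described can be written down compactly. The notation below (the polynomial bound on the witness length, the symbols V and y) is the standard textbook form and is an assumption here, not something shown on the speaker's slides.

```latex
% A language L is in NP if there is a polynomial-time checkable relation R with
L = \{\, x \;:\; \exists\, y,\ |y| \le \mathrm{poly}(|x|),\ (x, y) \in R \,\}

% Taking the verifier to be V(x, y) = 1 \iff (x, y) \in R gives:
\text{Completeness: } x \in L \;\Rightarrow\; \exists\, y:\ V(x, y) = 1
\qquad
\text{Soundness: } x \notin L \;\Rightarrow\; \forall\, y:\ V(x, y) = 0
```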
Instead of thinking of a proof as a static object, where you have a set of axioms and deduction rules for a statement, you want to think of a proof as an interaction between an entity called the prover and a verifier. So the prover and the verifier start with some statement X. And the goal here is that they should talk with each other, and at the end of the computation the verifier either accepts or rejects. So this deviates slightly from the standard notion of the proof being a static object. So what are the two new ingredients compared to a classical proof? Well, first, we are going to allow the entities to use randomness. In particular, the verifier can toss coins, and we are going to make a slight relaxation: the verifier only needs to be correct with a certain probability, which means it could output the wrong answer, accept or reject in error, with some probability. And the second thing is interaction. As I said, classical proofs are static, but in an interactive proof there is interaction. In some sense you should think of it as a challenge and response. The verifier is trying to ask the prover to answer certain challenges regarding the proof. So that's why it's an interaction. I want to point out that this is a generalization, in that classical proof systems do lie in this framework. If I want to make a classical proof an instance of this framework, I'm just going to say that there is no interaction and the verifier doesn't toss coins. Then it becomes a deterministic thing, just like classical proof systems, and when I say no interaction, it becomes a static object. So classical proofs do lie in this framework. So an interactive proof system for L is an interactive protocol between two entities, that I call the prover and the verifier, that satisfies three conditions.
First, as in a regular proof system, you want completeness, which means that there is an algorithm for the prover such that for every true statement, meaning X in L, the interaction between the prover algorithm and the verifier algorithm will result in the verifier accepting the proof, and accepting with probability 1. The soundness condition, on the other hand, is going to say that when you start with a false statement, X not in L, no matter what the prover tries to do, the verifier should accept only with some bounded probability. So when I say no matter what, I'm going to think of the prover as malicious, as an adversary. No matter what strategy the prover tries, it will not be able to convince the verifier of a false statement with probability more than a half. And as before, we are going to require that the verifier's algorithm is efficient. And efficient, I'm going to come to this in a little bit. It means something that runs in polynomial time in its input instance and can toss random coins. Now, soundness of a half is slightly undesirable. I mean, if a verifier is going to err with probability half, it's too much. But with interactive proofs, if you start with a soundness error that is some constant bounded away from one, you can reduce it to an arbitrarily small probability by repeating the proof many times. You ask the prover for several proofs, and you can reduce this to an arbitrarily small probability. Now, a slight variant of interactive proofs, and I don't know why the names came this way, is called an interactive argument. And in an interactive argument, you're only going to require that soundness holds against efficient provers. So in general for proofs, when I say in an interactive proof that no strategy can convince the verifier, I'm also considering strategies that can run in unbounded time, that can run in exponential time.
But now if I restrict, and ask that soundness hold only against provers or malicious parties that run in polynomial time, then it becomes an interactive argument. And as the name sounds, it's actually going to be weaker than an interactive proof. So, I've been talking a lot about interactive proofs. Let me try to give an interactive proof for the language of true statements that contains pairs of graphs that are isomorphic. This language is called the language of graph isomorphism. So what is this? I have two graphs and they look different. In fact, even the labels are different in these two graphs, but in fact they are isomorphic. So what does that mean? Let's say what these graphs are. A graph is defined by a vertex set and an edge set. And if I want to say that two graphs are isomorphic, it means that there is a mapping from the vertices of one graph to the other in such a way that every edge is mapped to a corresponding edge in the other graph. And in fact, these two graphs are isomorphic. I've not given the mapping, but you can think about it and convince yourself that these are isomorphic. So we want to now construct an interactive proof for statements of the form: there are two graphs, and one is isomorphic to the other. So now you can think about what one means by true statements and false statements. A true statement is a pair of graphs that are isomorphic, and false statements are pairs where they are not. And we're going to come up with an interactive proof for this, but first I want to tell you why even go to interactive proofs. There is a very simple way of giving a proof for graph isomorphism. If I want to convince you that two graphs are isomorphic, what do I have to do? Just give the mapping. So it's simple, it's actually a classical proof. But I'm going to do something stupid and I'm going to make it interactive.
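The classical proof mentioned above, just giving the mapping, amounts to a polynomial-time check. Here is a minimal sketch of that check; the graph encoding (vertices numbered 0 to n-1, edges as pairs, the mapping as a list) and the names `is_isomorphism`, `G0`, `G1` and `PHI` are illustrative assumptions, not from the talk.

```python
def is_isomorphism(n, g0_edges, g1_edges, phi):
    """Check a claimed witness: does phi map graph G0 onto graph G1?

    Assumed encoding: vertices are 0..n-1, edges are (u, v) pairs,
    and phi[v] is the vertex of G1 that vertex v of G0 is sent to.
    """
    # phi must be a bijection on the vertex set.
    if sorted(phi) != list(range(n)):
        return False
    # Every edge of G0 must land on an edge of G1, and nothing may be left over.
    mapped = {frozenset((phi[u], phi[v])) for u, v in g0_edges}
    target = {frozenset(e) for e in g1_edges}
    return mapped == target


# Two differently labelled paths on four vertices (hypothetical example data).
G0 = [(0, 1), (1, 2), (2, 3)]
G1 = [(2, 0), (0, 3), (3, 1)]
PHI = [2, 0, 3, 1]  # 0->2, 1->0, 2->3, 3->1
```

With this data, `is_isomorphism(4, G0, G1, PHI)` accepts, while the identity mapping `[0, 1, 2, 3]` is rejected. The check touches each edge once, so it runs in time polynomial in the size of the graphs.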
So let's say that I have a pair of graphs and I'm constructing an interactive proof between a prover and a verifier. And let's say that the prover knows this mapping from G0 to G1, call it phi. Now what is the prover going to do? The prover is going to pick one of these two graphs. Let's say it picks G0, and then it's going to construct another graph H, which is isomorphic to G0. Now this process is easy to do. If I want to construct a graph isomorphic to a given one, I just have to pick a random mapping, a random bijection, and apply it. So the prover picks one of the two graphs, picks a random bijection, call it rho 0, gets a new graph H, and sends that as the first message of the protocol. Then the verifier is going to send one challenge bit, which is either 0 or 1. Let's say that it's this bit B. Now the prover, in order to convince the verifier, needs to produce a mapping in the last round showing that the graph GB maps to H. So depending on what challenge the verifier gives, the prover needs to show either that G0 is isomorphic to H or that G1 is isomorphic to H. Now showing G0 is isomorphic to H, if the challenge was 0, is easy, because that's how the prover computed H in the first place. But for G1 to H, the prover doesn't quite have the isomorphism. This is not hard to compute, though, because the prover knows the mapping from G0 to G1. So in order to get the mapping from G1 to H, all you have to do is apply phi inverse and then rho 0. You compose these two bijections and you get a mapping from G1 to H. And so no matter what challenge the verifier gives, whether 0 or 1, the prover will be able to convince the verifier at the end of the computation. So far it still looks stupid. I don't know why I am doing this. But let's nevertheless check that this is a valid interactive proof system for the graph isomorphism problem.
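One round of the protocol just described can be sketched as follows. This is only an illustration under assumed conventions: vertices are numbered 0 to n-1, the prover always starts from G0, and the names `run_round`, `apply_perm`, `invert`, `compose`, `rho0` and `PHI` are hypothetical.

```python
import random


def apply_perm(edges, perm):
    """Relabel every edge (u, v) as {perm[u], perm[v]}."""
    return {frozenset((perm[u], perm[v])) for u, v in edges}


def invert(perm):
    """Return the inverse bijection of perm."""
    inv = [0] * len(perm)
    for v, image in enumerate(perm):
        inv[image] = v
    return inv


def compose(outer, inner):
    """Return the mapping v -> outer[inner[v]]."""
    return [outer[inner[v]] for v in range(len(inner))]


def run_round(n, g0, g1, phi, rng):
    """One round; phi is the prover's secret mapping from G0 to G1."""
    # Prover, first message: H is a random relabeling of G0.
    rho0 = list(range(n))
    rng.shuffle(rho0)
    h = apply_perm(g0, rho0)

    # Verifier: a single random challenge bit.
    b = rng.randrange(2)

    # Prover, last message: a mapping from G_b to H.
    if b == 0:
        rho_b = rho0                        # G0 -> H, known directly
    else:
        rho_b = compose(rho0, invert(phi))  # G1 -> G0 -> H

    # Verifier: accept iff rho_b is a bijection taking G_b onto H.
    g_b = g1 if b else g0
    return sorted(rho_b) == list(range(n)) and apply_perm(g_b, rho_b) == h


G0 = [(0, 1), (1, 2), (2, 3)]
G1 = [(2, 0), (0, 3), (3, 1)]
PHI = [2, 0, 3, 1]  # maps G0 onto G1
```

Calling `run_round(4, G0, G1, PHI, random.Random(0))` returns `True`, and it does so for every seed, which is the completeness property: an honest prover who knows the mapping is accepted with probability 1.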
Now completeness says that when the two graphs are isomorphic, the prover convinces the verifier with probability 1, which is exactly what I just argued. But now let's talk about soundness. What if the graphs G0 and G1 were not isomorphic to begin with? It means there exists no mapping between G0 and G1. This is what not isomorphic means. In particular, what does this mean? Given any graph H, it can only be the case that at most one of these mappings exists. Either G0 is isomorphic to H, or G1 is, or neither. At most one of them can be isomorphic to any other graph H, because G0 and G1 are not isomorphic to begin with. So now what do I have to argue for soundness? I need to argue that no matter what strategy the prover uses, the verifier will reject with probability at least a half. So let's look at it. If we are thinking of an arbitrary prover strategy, this prover strategy could construct H in an arbitrary way. Now the verifier is going to give 0 or 1. It tosses a coin and gives 0 or 1, each with probability half. But we know that no matter what H the prover gives, there is at most one graph that the prover can prove H is isomorphic to. In other words, there is at least one value of B for which the prover will fail, and this essentially means that the prover will fail with probability at least a half. So this is an interactive proof system for graph isomorphism. We did something convoluted, but nevertheless it satisfies completeness and soundness, and if you stare at it long enough you'll also see that the verifier's algorithm runs in polynomial time. The only computation the verifier does is at the end, where it checks that the mapping rho B demonstrates that GB is isomorphic to H, and this can be done in polynomial time in the sizes of the graphs. Now let's move on. We have defined a new notion, a generalization of proof systems that involves interaction. Now we are going to talk about zero knowledge interactive proofs.
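The key fact in the soundness argument, that for non-isomorphic G0 and G1 any graph H answers at most one challenge, can be checked by brute force on tiny graphs. The sketch below is illustrative only: the brute-force `isomorphic` test is exponential and is not part of the protocol, and the example graphs `PATH` and `STAR` and all function names are assumptions. `soundness_error` records the effect of repeating the round k times with fresh coins.

```python
from itertools import combinations, permutations


def apply_perm(edges, perm):
    """Relabel every edge (u, v) as {perm[u], perm[v]}."""
    return {frozenset((perm[u], perm[v])) for u, v in edges}


def isomorphic(n, ga, gb):
    """Brute-force isomorphism test; only feasible for very small n."""
    target = {frozenset(e) for e in gb}
    return any(apply_perm(ga, p) == target for p in permutations(range(n)))


def answerable_challenges(n, g0, g1, h):
    """Challenge bits b for which some mapping from G_b to H exists at all."""
    return [b for b, g in enumerate((g0, g1)) if isomorphic(n, g, h)]


def all_graphs(n, num_edges):
    """Every graph on n labelled vertices with exactly num_edges edges."""
    return [list(es) for es in combinations(combinations(range(n), 2), num_edges)]


def soundness_error(k):
    """Cheating probability after k independent rounds, each with error 1/2."""
    return 0.5 ** k


PATH = [(0, 1), (1, 2), (2, 3)]  # a path on four vertices
STAR = [(0, 1), (0, 2), (0, 3)]  # a star on four vertices, not isomorphic to PATH
```

For every three-edge graph H on four vertices, `answerable_challenges(4, PATH, STAR, H)` has at most one entry, so whatever H a cheating prover sends, a random challenge bit catches it with probability at least a half. Repeating the round ten times drives the error down to `soundness_error(10)`, which is 1/1024.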
Now before I talk about zero knowledge interactive proofs and the motivating example that I told you about, what is knowledge? This is a question that's as old as humanity. It was mostly studied in the branch of philosophy called epistemology, and also in other fields, like medicine, where it is important. But today knowledge is important in computer science. We need to know how to capture it, how to manipulate it, how to communicate it. Now we want to formally capture it, at least in our context of proofs. We want to know what it means to capture knowledge. And I want to say that this work, at least in the field of cryptography, was initiated by Goldwasser and Micali, and they won the Turing Award in 2012, in the spirit of answering this question of what it means to capture knowledge. And just a little bit of history on this. A basic idea or flavor of this notion of knowledge did appear in their work on probabilistic encryption, but then it matured in their work on zero knowledge proofs, where they defined what knowledge means. And roughly speaking, the intuition they had is that you can say that you possess some knowledge only if you can do something with it. I know only what I can feasibly compute. And what does feasibly mean here? Feasibly, at least from a computer science perspective, means that I can run in polynomial time and toss random coins. When I say I can compute something, I should be able to compute it within my lifetime in some sense. It should not be that my computation goes unbounded. Then I don't quite call it knowledge. So it is anything that I can compute in polynomial time and with randomness. PPT stands for probabilistic polynomial time Turing machines, but you can think of any algorithm that runs in polynomial time and can toss random coins. And this is how they captured what knowledge means.
But now let's move back. I'm not going to talk about their theory, but I'm going to go very specifically to what zero-knowledge proofs are. And this is sort of our motivating example, where we have a prover that wants to convince a verifier that a particular statement is true without revealing any information about it. So a zero-knowledge interactive proof is going to satisfy three conditions. The first two are the conditions that we have already stated for interactive proofs: completeness and soundness. And the third property is zero-knowledge, which is going to say that no efficient verifier learns anything more than the validity of the statement X. I'm going to formally capture what this means, but zero-knowledge here conveys that no matter what a malicious verifier does, an adversary trying to verify the proof, just like in our example of an instructor verifying the proof, the verifier should not learn anything more than merely the fact that the statement X was true. In particular, this verifier, after the fact, should not be able to prove it to someone else. This is just an example. Now I want to say that the stupid thing that we did for graph isomorphism actually has this sort of a property. I'm going to formally prove it, but first I want to show you why it has this property. Let's look at it from the verifier's perspective. What does the verifier see during the interaction of this proof? Well, it sees H, then the verifier gives a coin, which is either zero or one, and then the prover gives a mapping that shows that the chosen graph, G0 or G1, is isomorphic to H. And on the face of it, if you look at what the verifier has learned in this interaction, it is that there are graphs G0 and G1, there is some third graph H which seems totally unrelated, and one of G0 or G1 is isomorphic to H. In particular, if you look at this, the verifier didn't learn anything about the relation between G0 and G1.
I mean, there was nothing in the conversation that was talking about it. So we are going to formally show that this in fact satisfies what we call the zero-knowledge property. So how does one define what zero-knowledge means? I'm going to give a specific definition, but one can also think of zero-knowledge as an instance of what Manoj talked about this morning. It can be thought of as an instance of a multi-party computation. I'll talk about it a few slides down, but here I'm just going to give a definition that is specific to this application of zero-knowledge. So in zero-knowledge you are trying to protect something against a malicious verifier, and the zero-knowledge property requires that for every malicious verifier that is trying to interact with the prover and trying to learn some information about the statement or the witness or the proof, there exists another algorithm, that we are going to call a simulator, that can essentially learn the same information on its own. So pictorially, what I'm trying to say here is this. In a computation, as I mentioned with graph isomorphism, what does the verifier see? Well, the verifier sees the messages that were exchanged, and the verifier also tossed a few coins as part of this computation. So if you think of what the verifier sees, the view of the verifier, it is just the random coins it tossed and the messages it received during the interaction. Now to say that this does not convey any extra information, that the verifier didn't learn anything new or anything that it did not know before, I'm going to say that for every malicious verifier there is a simulator that can generate this view on its own, without interacting with anyone. So in some sense, if this verifier is claiming, oh, I learnt something new, I'm going to come and say, no you did not, because whatever you learnt here, I could have just run the simulator, produced this view, and then learnt the same information from it.
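Written out, the requirement just described looks roughly as follows. The symbols used here (V* for the malicious verifier, S for the simulator, View for the verifier's coins together with the messages it receives) are the conventional ones and are an assumption; the talk itself states the definition only in words.

```latex
% Zero-knowledge: every efficient verifier's view can be produced without the prover.
\forall\ \text{PPT } V^{*}\quad \exists\ \text{PPT } S\quad
\forall\, x \in L,\ \forall\, w \text{ with } (x, w) \in R:
\qquad
\Big\{ \mathrm{View}_{V^{*}}\big[\, P(x, w) \leftrightarrow V^{*}(x) \,\big] \Big\}
\;\approx\;
\Big\{ S(x) \Big\}

% Here \approx means the two distributions are identical (perfect zero-knowledge)
% or cannot be told apart by any efficient algorithm (computational zero-knowledge).
```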
So I'm going to sort of say what my advisor said of his advisor. Silvio Micali is my advisor's advisor, and he is one of the founders of zero knowledge, and he says this: you go to a street, go up to a person and ask; in terms of computation, what is free is that you're free to do polynomial time computation and you're free to toss coins. This is something that I can do with my laptop; you can do it with your cell phones these days. You can run any algorithm that tosses coins. So whatever you can learn in polynomial time and with randomness is for free. This is not something I can protect. So in the graph isomorphism example, if there is something about these graphs that you can learn in polynomial time, that's for free. But what I want to say here is that whatever the verifier could have learnt during the interaction, I am claiming that if I establish this property, the verifier could have learnt it on its own by running that algorithm. Now, when I talk about adversaries: in cryptography, at least in theoretical cryptography, when we model adversaries we model them as computationally bounded entities, which means adversaries cannot do more than run in polynomial time and toss random coins. So we want to say that whatever the verifier learnt in the interaction, it could have learnt on its own by running the simulator, which means the simulator also has to be probabilistic polynomial time. So what is the rationale, as I said? The malicious verifier learns nothing that it could not generate by itself by running the simulator. What this means is that whatever it can learn is something that it could have learnt by a probabilistic polynomial time computation. So as I mentioned, it can be thought of as an instance of MPC. There are a couple of caveats, but I just want to say that you can think of this as a two-party computation between a prover and a verifier.
Let's say that there is some NP language L with relation R, and let's say that the statement that the prover is trying to prove to the verifier is X, and let's say W is the witness. Then you can think of this as securely computing a function that evaluates the NP relation R on (X, W). So it's a secure computation where you apply the relation, which is a polynomial time computation, on (X, W) and just deliver the answer to the verifier. Now when you run the relation on this you get either 0 or 1, and that is delivered to the verifier. So if you look at this as a secure computation, the verifier just learns either 0 or 1 at the end of the computation and learns nothing more; it just learns the truth or the validity of the statement. There are a couple of caveats that I don't want to go into, but you can think of zero knowledge more generally as an instance of a multi-party computation. So let's try to formally prove that the interactive protocol that I talked about, between the prover and the verifier for graph isomorphism, is zero knowledge. So what does this mean? I said that whatever the verifier sees in this computation, there is a simulator that can generate this view, and in fact the simulator is going to generate this view using the verifier itself. Now the simulator wants to generate these messages as though it was interacting with the verifier. This malicious verifier could do arbitrary computations, it need not follow the protocol, but I'm still going to say that a simulator can generate this view of the verifier without knowing whatever the prover knew. In fact, even without knowing the isomorphism between G0 and G1, I'm going to show that one can generate this view. Now if you think about it for a minute, this is probably slightly non-trivial, because remember how the prover convinced the verifier.
One of the challenges was easy; for the other one, it had to know the mapping between G0 and G1 to get the other mapping. The simulator does not have it. This simulator is saying that I can generate this view out of nothing. So let's think of a strategy to do this. What is the first step that the prover did? Well, the prover picked one of the two graphs, picked a random isomorphism of that graph, and constructed a new graph H. Now a simulator can do the same thing; it does not need to know the isomorphism between G0 and G1. It's just picking G0, taking a random mapping and constructing H, and it can feed it to the verifier. Think of this verifier, this adversary, as an algorithm for now. So the simulator gives H to the verifier. What does the verifier respond with? It might do some arbitrary computation and then it'll give either 0 or 1. Let's say it gives 0. Now it's easy: the simulator does know the mapping from G0 to H, so it can complete the conversation. Great, now I've got the view. But this is not what happens all the time. It could be unlucky and get 1. In this case the simulator cannot complete the conversation to get a view of the verifier. So what is the simulator going to do? It's going to say, well, you know what, this H was no good. So I'm going to go back, and now instead of constructing H from G0, I'm going to take G1, construct a new graph, call it H prime, and give it to the verifier. Now what is the verifier going to do? It's going to do some computation and give 0 or 1. If it gave 1, it's easy, but it could give either answer with some probability, and if it gives 0, again I cannot complete. Well, rewinding and retrying seemed like a good strategy, but it could be that I get stuck. But what you can actually prove is that the probability with which the simulator fails on each attempt is exactly a half.
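The rewinding strategy described above can be sketched like this. It is a simplified illustration under stated assumptions: the malicious verifier is modelled as a deterministic function from the first message H to a challenge bit (its coins are treated as fixed, so they are left out of the view), and the names `simulate`, `sneaky_verifier` and `max_tries` are hypothetical.

```python
import random


def apply_perm(edges, perm):
    """Relabel every edge (u, v) as {perm[u], perm[v]}."""
    return {frozenset((perm[u], perm[v])) for u, v in edges}


def simulate(n, g0, g1, verifier, rng, max_tries=1000):
    """Produce an accepting view (H, b, rho) without knowing any G0 -> G1 mapping."""
    graphs = (g0, g1)
    for tries in range(1, max_tries + 1):
        # Guess which challenge the verifier will send and build H from that graph.
        guess = rng.randrange(2)
        rho = list(range(n))
        rng.shuffle(rho)
        h = apply_perm(graphs[guess], rho)

        # Run the verifier on H to see which challenge it actually sends.
        b = verifier(h)
        if b == guess:
            return (h, b, rho), tries  # rho maps G_b onto H, so the view verifies
        # Wrong guess: discard this attempt (rewind the verifier) and try again.
    raise RuntimeError("simulation did not finish within max_tries attempts")


def sneaky_verifier(h):
    """A verifier that derives its challenge from H instead of tossing a fair coin."""
    return sum(min(edge) for edge in h) % 2


G0 = [(0, 1), (1, 2), (2, 3)]
G1 = [(2, 0), (0, 3), (3, 1)]  # isomorphic to G0, but the simulator never uses the mapping
```

For instance, `simulate(4, G0, G1, sneaky_verifier, random.Random(7))` returns a view in which `rho` really does map the challenged graph onto `H`, even though the challenge depends on `H`. When G0 and G1 are isomorphic, each attempt succeeds with probability one half, so the loop ends after two attempts on average; the `max_tries` cap is only a safeguard for this sketch.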
The point is that the simulator is guessing which one of these two things the verifier is going to choose, and with probability at least a half the simulator will succeed. What this means is that the number of times I have to rewind before I succeed is going to be 2 in expectation. If you do a standard probability calculation, you will see that within an expected number of two attempts, where each time I pick a random graph and give H to the verifier, I will be lucky and can complete the conversation. Now this is a very high level proof. There is a lot more formal probabilistic calculation that one needs to do, but I just want to convey the high level idea. One point that I missed in this analysis is this: what if the verifier, which is a malicious entity trying to learn something that it should not, could tell from the graph H that the simulator gives, oh, this H was constructed from G1 and not G0, and asks for the other one all the time? In that case rewinding is not going to help; every time I rewind I'm going to get stuck. But I claim this cannot happen. The point here is that whether you construct this first message H as a random isomorphism of G0 or as a random isomorphism of G1, the distribution of H is going to be identical. Why is it going to be identical? Because G0 and G1 are isomorphic to begin with, which means that whether I permute this graph or that one, the distribution of the graphs H is identical in either case. In our case this means that the H that the simulator gives to the verifier will not reveal from where it was constructed. So we can truly say that the simulator will succeed with probability at least a half. Okay, now there's another point that I didn't talk about, and which Manoj also didn't want to talk too much about, which is that we say that there is a simulator that can generate a view that is the same as what it was in a
real interaction. Now one should note that this is a random process. There are random coins being tossed, and one cannot say that this should equal that. When it comes to distributions, you want to say that they both are identically distributed, that the probability mass on all these things is the same. Okay, now for this case you can actually show that they will be identical, but in general you can only get what is referred to as computational indistinguishability. You can say that these two distributions, the distribution output by a simulator and the distribution observed in the real interaction, cannot be told apart by any efficient adversary. Now if you... yes? So are you asking in terms of the probability distribution, that it will be different, or are you just saying that the simulator gets to rewind but the prover does not? So the point is that this is precisely the advantage the simulator has, and I'll tell you what this in some sense means. In an interaction, I'm saying it's online: the verifier talks to the prover, they exchange messages, and the computation ends. Now I want to say that this verifier didn't quite learn anything more, because these messages could themselves be generated in a different way. In some sense I can generate them out of the verifier's head itself. So when I think of the verifier as an algorithm, I can generate this view. You should think of it as an algorithm that I can run and that I can rewind, and the claim I'm making is that whatever the verifier saw is something that was already in its head, or whatever it's claiming that it knows is already in its head, and that's why I can rewind. But a more subtle point is, as I said, that I should show that these distributions are the same, which requires some more effort. But this is precisely the advantage that the simulator has. I won't call it an advantage; it is more to say that this is why the protocol doesn't carry any more information, because whatever information it actually conveyed is
something that was already in the verifier's head. Yes? Yes, what you said is the intuition, and I just want to add something to what you said, which is that when you say that someone learns something, it means that it can come out of what is seen in an interaction, and here that is just the messages and the random coins tossed by the verifier. If I can generate them identically, then whatever I'm trying to learn from the pieces of information seen in a real interaction, I could do the same by first running the simulator, generating the transcript and randomness, and then running whatever learning algorithm I want on that. So the point here is that as long as I can generate the information that was in the protocol in a different way, you can't learn anything more. And at some point down the line you're also going to see a definition of secure computation, in the flavor that Manoj described, which will do the same, which is to say that the messages actually don't carry any information. Even in this case, by demonstrating a simulator you're showing that the messages don't carry any additional information beyond one fact. Remember, the only fact I used to construct the view, and also to prove that it is identical, is the fact that G0 and G1 are isomorphic. If they weren't, this argument is not going to work, and it wouldn't make sense either. It works only when G0 and G1 are isomorphic, that is, when the statement is true. And we want to say that for true statements the verifier cannot learn anything more; it will only know whether the graphs were isomorphic or not. Maybe I should also make one point here. I always say that polynomial time and randomness are for free. Now what if graph isomorphism was solvable in polynomial time? Then actually you don't need to give a zero knowledge proof. You just don't give any proof, because the verifier can compute the answer on its own, and that's zero knowledge. So the point is that zero knowledge, in some sense, makes sense only for things that
cannot be computed feasibly, and graph isomorphism, at least till today, is a candidate for such a problem.
So what can you prove in zero knowledge? Well, you can prove any classical proof in zero knowledge: for any NP statement, one can construct a zero-knowledge interactive proof. In fact, you can prove more than NP in zero knowledge. It was shown that anything that has an interactive proof system can be converted to a zero-knowledge interactive proof system. Now, I say this is more because I have to ask the question: what can you prove using an interactive proof? We know what we can prove using a classical proof; the class of statements with classical proofs essentially is NP. But what can you prove using an interactive proof? This was shown in the 90s by Shamir and Lund et al.: every language in PSPACE has an interactive proof. If you have taken any complexity theory, PSPACE is the set of all problems that can be solved by an algorithm that is limited to using polynomial space. It can take any amount of time, but it's limited to polynomial space. This is a huge class, or at least is believed to be so, and you can construct an interactive proof for all of PSPACE. I just want to contrast here that you cannot give a classical proof, or at least we don't know how to give a classical proof, for all of PSPACE, but you can give an interactive proof. This is actually something quite phenomenal, in the sense that in an interactive proof the verifier is polynomial time. So we are saying that, using interactive proofs, a polynomial-time verifier can check statements that go far beyond what it can compute on its own.
Anyway, what I want to focus on next is constructing zero knowledge for all of NP, and then also giving an application. I just don't know how I am doing with time. I am not doing well with time. All right, so I am going to give a classical approach. You are going to see later in the sessions other ways of
constructing zero knowledge. I want to construct a zero-knowledge proof for classical proof statements. This is done in two steps. There is this class of problems called NP-complete problems. If you don't know them, that's fine; it's just that any instance of a language in NP can be transformed to an instance of an NP-complete language, which means that if I can solve an NP-complete language, I can solve any problem in NP. There are many such NP-complete problems, and there are these translations that take an instance of the NP language and translate it to an instance of an NP-complete problem. So the first step is to construct a zero-knowledge proof for an NP-complete language, and then, for any other NP statement, I will reduce it to the NP-complete language and give the zero-knowledge proof for that. Those are the two steps. I am not going to talk about the reduction step. I am just going to consider an NP-complete problem, which is going to be graph three-colouring, and I am going to give a zero-knowledge proof for this NP-complete language. As for the problem of graph three-colouring, I don't know how many of you know it, and I did not make a slide on it. Basically, given a graph with vertices and edges, you have to colour every vertex using at most three colours in such a way that no edge is incident on two vertices of the same colour. If you can colour the vertices this way, then it's a true instance of three-colouring, and if you cannot, then it's a false statement.
For constructing this zero-knowledge proof we will need some cryptography, and in particular we will need a primitive called commitments. Commitments are objects that are useful in auctions, for example, but I am just going to define what a commitment scheme is. A commitment scheme is an interaction between a sender and a receiver. Only the sender has an input, a value V, and it
proceeds in two phases. In the first phase they interact, and the receiver receives a commitment to the value V. You should think of this as a sealed envelope, if you want a physical analogue: the value V is inside an envelope, and that is given to the receiver. What is required of this phase is that it does not reveal the value V, so it should hide the value V, which, if you think of an envelope, it does. The second phase is a decommitment phase, where the sender opens the commitment and gives some information to verify that the commitment phase was indeed a commitment to the value V. The property that we need of this second phase is that no cheating sender can, after the commitment phase, open it to two different values. In other words, it has to be binding. These are the two properties that a commitment scheme needs to satisfy. One can construct this based on the hardness of factoring, and on a lot of other cryptographic assumptions. I am not going to get into that, but this can be constructed in practice based on some assumptions. We are going to use these commitment schemes in constructing the zero-knowledge proof, and I will explain how hiding and binding are useful there.
So I want to construct zero knowledge for all of NP, but as I mentioned, it's enough for me to construct it for the graph three-colouring problem. So what is the instance? Well, the instance is a graph, and let's say that this graph is three-colourable; those are the true statements. That means that the prover also has the witness, which is a valid three-colouring of the graph. Now the prover wants to convince the verifier that the graph is three-colourable, but without revealing the three-colouring or any information that it has based on this witness. So what is the prover going to do?
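Before the protocol itself, the two phases of a commitment scheme described above can be sketched in a few lines. This is only an illustration under assumptions that are not from the talk: it uses a hash-based commitment with SHA-256 rather than the factoring-based constructions mentioned, and the names `commit` and `verify_opening` are hypothetical. Hiding here relies heuristically on the random nonce, and binding on the collision resistance of the hash.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

def commit(value: bytes) -> Tuple[bytes, bytes]:
    """Commit phase: return (commitment, nonce).

    Only the commitment is sent to the receiver (the sealed envelope);
    the sender keeps the nonce private until the opening.
    """
    nonce = secrets.token_bytes(32)  # fixed-length fresh randomness
    commitment = hashlib.sha256(nonce + value).digest()
    return commitment, nonce

def verify_opening(commitment: bytes, value: bytes, nonce: bytes) -> bool:
    """Decommit phase: the receiver recomputes the hash and compares."""
    expected = hashlib.sha256(nonce + value).digest()
    return hmac.compare_digest(commitment, expected)

# Assumed example value, chosen only for illustration.
c, nonce = commit(b"red")
assert verify_opening(c, b"red", nonce)        # honest opening is accepted
assert not verify_opening(c, b"green", nonce)  # opening to another value fails
```

Committing to the same value twice gives two different-looking commitments because of the fresh nonce, which is the envelope intuition: the receiver sees the envelope but not what is inside.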
Well, the prover is going to take the colours of each of the vertices, let's say there are n vertices, and it's going to give a commitment to each of these colours. So imagine we have the graph, and the prover has the colours which make it three-colourable. It takes these colours, commits to them, and gives the commitments to the verifier. As such, this does not reveal anything to the verifier. These are commitments; remember, the hiding property says that after the commitment phase the receiver will not know what is inside the commitment. Now the verifier wants to check that the graph is three-colourable. So what is the verifier going to do? Well, the verifier is going to pick an edge in the graph and give it to the prover, saying: please reveal the colours at the endpoints of this edge. Let's say the edge connects vertices i and j; it says, please show me the colours of these two. Now remember, my commitment scheme also has an opening phase where I can open the value in the commitment, so the prover just opens these two values. And the verifier says: I'll accept if these two colours are different. Remember, if it was a valid three-colouring, they have to be different, and the verifier will accept only if these two colours are different.
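The round just described (commit to every colour, challenge on one edge, open two commitments, compare) can be sketched as follows. This is a minimal illustration, not a secure implementation: the hash-based `commit`, the function names, and the example graph are all assumptions introduced here, and prover and verifier are simulated inside one function.

```python
import hashlib
import secrets

def commit(colour: int):
    """Hypothetical hash-based commitment to one colour in {0, 1, 2}."""
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + bytes([colour])).digest(), nonce

def opens_to(commitment, colour, nonce):
    """Verifier's check that an opening matches the earlier commitment."""
    return hashlib.sha256(nonce + bytes([colour])).digest() == commitment

def run_round(edges, colouring):
    """Simulate one round; return True if the verifier accepts."""
    # Prover: commit to the colour of every vertex, send only commitments.
    nonces, commitments = {}, {}
    for vertex, colour in colouring.items():
        commitments[vertex], nonces[vertex] = commit(colour)
    # Verifier: challenge with a uniformly random edge (i, j).
    i, j = secrets.choice(edges)
    # Prover: open only the two commitments at the endpoints.
    colour_i, colour_j = colouring[i], colouring[j]
    # Verifier: both openings must be valid and the colours must differ.
    return (opens_to(commitments[i], colour_i, nonces[i])
            and opens_to(commitments[j], colour_j, nonces[j])
            and colour_i != colour_j)

# Assumed example: a 4-cycle 0-1-2-3-0 with a valid colouring.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
valid = {0: 0, 1: 1, 2: 0, 3: 1}
assert run_round(edges, valid)  # an honest prover is always accepted
```

A full protocol would also randomly relabel the three colours before committing in each round; that step is left out here to mirror the description above.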
Let's verify the three properties of completeness, soundness and zero knowledge for this protocol. Completeness: since there is a three-colouring of the graph, if the prover just acts honestly, the verifier will accept the proof all the time, because every edge is incident on two different colours. Soundness, on the other hand, is going to use the fact that my commitment scheme is binding, the second property. If the graph was not three-colourable, it means that no matter what colours you give, some edge is going to fail the three-colouring test. Since the colours were committed in the first round, with some probability the verifier is going to pick an edge whose two endpoints have equal colours. Since the prover has to open the values that it committed to, it cannot cheat and open a different value, and these values are the same, which means the verifier will reject the proof. Now you can ask, what is the probability with which the verifier will catch a cheating prover? Well, there is at least one edge that is wrong, so it will catch it with probability at least one over the number of edges. That is a weak guarantee, I mean, you want the prover's chance of cheating to be much smaller than this, but this is the probability with which it will be caught. As I mentioned, you can repeat the protocol many times to bring the soundness error down, which I'll talk about in a minute, so this is fine.
But then what about zero knowledge? I have to construct a simulator. I'm not going to go over the details. I'm just going to say that you can use the same rewinding strategy: if I, as the simulator, can guess the edge E that the verifier is going to ask for, I can put two different colours on the endpoints of that edge, because those are the only two things I open, and it doesn't matter what colours I commit to for the remaining vertices. So I can keep rewinding until my guess is right. The proof is much more complicated, but it essentially works. Was there a question?
You have to use some one-way function for that. All right, so I'm going to skip this in the interest of time. I wanted to say how to bring the soundness error from half to anything small, but I want to go to the application of zero knowledge; you can come and ask me later what it is. Okay, so what we have done now is we have shown a way of constructing a zero-knowledge proof for any NP statement: for any NP language we can construct a zero-knowledge proof. The proof that we saw just now is for three-colouring. There have been other instances of this. There's the Hamiltonian circuit problem, another NP-complete problem, for which we know how to construct a zero-knowledge proof, and this has also been shown for the problem of Boolean satisfiability. Now, all of these constructions use this two-step process: you first construct it for an NP-complete language, and then for any NP statement you do a Karp reduction to it to construct a zero-knowledge proof. And this is the theorem statement corresponding to it: assuming one-way functions, there exists a zero-knowledge proof for all of IP, everything that can be proved using an interactive proof. But what I want to point out here is that these instances use the two-step process, whereas you're actually going to see a mechanism where one can just start with the NP relation and construct a zero-knowledge proof directly. This uses the beautiful idea of MPC in the head that Yuval will be talking about on Wednesday, and I just wanted to give a link to his talk from here. In practice this is very important, because if you want to use this in reality for certain applications, doing the Karp reduction is very inefficient. So you would want a mechanism that directly uses the NP relation and constructs a zero-knowledge proof, and this work is the first to do this in a very nice way. More on Wednesday.
All right, zero knowledge has numerous applications. I don't
have time to talk about them. What I want to talk about in the last five minutes is one application of zero knowledge that ties it back to multi-party computation, as this workshop is about multi-party computation. Zero knowledge is used to amplify or boost the security of protocols. Manoj talked about passive security and active security. Active security is what we want: we want to protect against adversaries that arbitrarily deviate from the protocol, and zero knowledge is useful for this. How do you do it? You start with a passively secure protocol, which means security is only against honest-but-curious entities, entities that follow the protocol to the word. Then you're going to compile it into a protocol that is also secure against bad adversaries that deviate from the protocol. The two things that you're going to use are coin tossing and zero-knowledge proofs, and hopefully I can give this compilation in the next seven minutes.
All right, so what is the general high-level idea? I have a passively secure protocol. Passive security says that if the parties follow the protocol to the word, then my protocol is secure: it gives all the nice privacy and correctness features that I want. Now, if I also want to protect against people who deviate from the protocol, I need some mechanism in the protocol to enforce that, at least in the protocol messages, the adversary does not deviate. There must be some enforcing mechanism, and this is going to require three things. First, I need to force the adversary to use a fixed input, meaning every message it sends is computed according to one input X that this party uses. The second thing is that a lot of these protocols require randomness, and furthermore privacy in many of these situations requires that the adversary pick a uniformly random string as its random tape, as the random coins it uses. It should not manipulate the sequence of random coins for the protocol. So we need to force the adversary to use a uniform
random tape. I call it a tape because it's a remnant of Turing machines; if you don't follow that, that's fine, just think of a string of randomness that needs to be uniformly generated. And finally, you need to force the adversary to follow the protocol instructions exactly. If I can do these three things, then I get security against arbitrary, active adversaries.
All right, so the first one, using fixed inputs, is going to be easy: we're just going to make the adversary commit to its input. We already know what commitments are. In a two-party or multi-party computation, every party can commit to their input at the beginning of the protocol. To generate a random tape, you're going to use what is called a coin-tossing protocol, which I'm going to talk about in the next slide. And finally, to force the adversary to follow the instructions, we're going to use zero knowledge.
What is coin tossing? The goal is that we want to fix the random tape of each party, and we're going to use commitment schemes to do this. The protocol is very simple. Alice picks a random R1 and commits to it. Bob picks R2 and sends it. Then Alice opens R1, and they're going to say that the uniform random string that they generated is R1 XOR R2. There are some security properties that you can show, but take it from me that this will be uniformly generated as long as one of the two parties is honest. Now, this is not quite going to be enough for generating a random tape, primarily because we don't want both parties to know the random tape. See, Alice and Bob both need random tapes for the secure computation, but neither should know the random tape of the other party. So this is slightly tricky: I want to enforce that they use a uniformly generated random tape but not reveal it to each other. It's a slight fix, but that can be done. What you do is remove the last message of this protocol. Now what
happens is that only Alice can compute R1 XOR R2, and she's going to fix that as her random tape. Bob sees the commitment to R1, and he knows R2. He cannot compute R1 XOR R2, but he has a commitment to the coin toss, because with R2 and the commitment to R1 all the details about the coin toss are fixed. This follows from the binding of the commitment. If you didn't follow that, that's fine, come and ask me later, but this is sufficient to generate the random tape: they exchange these messages, and that is the random tape of Alice. I'm sorry, there's a small typo here: ignore the third message on the slide, it should not be there; I copy-pasted incorrectly.
So I've done the commitment to the input, and I have generated the random tape. After these messages are exchanged, what has happened is that both parties have a commitment to both the input and the random tape of the other party. Now, once these are fixed, how the passively secure protocol is going to proceed is fixed, right? If X is fixed and the random tape is fixed, how each party acts in the passively secure protocol is fixed; it's a deterministic function of these things, given the messages received so far. Now all I need to do is enforce that both parties follow this deterministic procedure. Roughly, they're going to execute the protocol, and after each message, the sending party is going to prove that it computed the message correctly.
So what does this mean? Let's say that Alice is generating a message. Following the message, she is going to prove that this next message is correct. What is correct? Correct basically means that the message was generated according to the protocol specification. It's a well-defined thing; in fact, you can express it as an NP statement. What is the statement? The statement is the transcript that has happened so far. The witness is the input, the random tape, and the randomness used for the
entire protocol, including the commitments. The relation basically says that the instructions of the passively secure protocol were followed, and that the commitments in the first round are correct commitments to that input and randomness. Even if you didn't follow all of it, all I'm saying is that the claim that this next message is correct is an NP statement with a witness that Alice knows, and similarly for Bob when he is proving it. But giving this proof just like that, as we know, is bad, because it gives your input to the other party, which is what you want to protect. So instead of giving the proof as it is, you use zero knowledge. And this is it. The basic idea is that they commit to their inputs and do coin tossing for the random tapes, and then they execute the passively secure protocol. At every step, when Alice sends a message she gives a zero-knowledge proof that it's correct; Bob sends his message and proves that his message is correct. This compilation is referred to as the GMW paradigm for converting anything that has passive security to active security.
Now, the state of the art. I'm not going to talk about it much, but hopefully the rest of the tutorial will cover what the state of the art is for active MPC. In theory, this says that one only needs to construct passively secure protocols, because you can boost the security with zero-knowledge proofs. But in practice people use other techniques: they take whatever they want to compute and directly construct an actively secure protocol, instead of doing this two-step thing of passive security and compiling. There are a couple of popular approaches. You'll definitely see MPC in the head; I'm not quite sure if cut-and-choose is going to be covered, maybe it will. These techniques inherently do use some idea of zero knowledge. But the point is, if you want to do something efficiently, you just construct it directly, because you know where the bottlenecks are going to be.
All right, finally, this is my last slide. I just
want to say that we haven't quite solved the original problem of the student coming and telling me a proof. All through what I've talked about here, zero knowledge concerns a single instance: a single prover convincing a single verifier about a statement, and I explained how to make that zero knowledge. But in reality it could be like this: there is an adversary that is receiving many proofs from a prover and is giving many proofs to a verifier. If you think about it in my example, if the student comes and gives me a proof, I could simultaneously be giving the proof to the Clay Institute. Zero knowledge protects the student from me reusing the proof after the interaction has finished, but there is nothing stopping me from doing it simultaneously. Even if the student comes and gives a zero-knowledge proof to me, I could just act as a man in the middle between the student and the Clay Institute, and the standard definition of zero knowledge doesn't protect against this kind of attack. This is an issue regarding concurrency, and I'll talk about it in my talk tomorrow. I just wanted to give you a preview: in a stand-alone setting, where there is a single interaction, everything is fine, but in reality it's not like that, and we need to do something more to make it stronger against this class of adversaries.
Zero-knowledge proofs are one of the cornerstones of the modern definition of security, and as Manoj mentioned, they are actually the birth of how we even think about defining security. This is how formalizing cryptographic things started. People knew how to construct very cool protocols, but formalizing the definition of security using this real/ideal paradigm, or using a simulator, came about because of zero-knowledge proofs. In a sense it is a very important building block; even more recently it's used for things like Bitcoin. Thank you.
So all of this zero-knowledge proof system, all of this is only valid for some of this idea translated
into the unbounded?
So, as I mentioned, zero knowledge can help you prove any language that has an interactive proof. For instance, if you consider PSPACE languages, we cannot, or at least we don't know how to, solve them in polynomial time, and you can do this for them. But what it means is that the prover strategy is going to be unbounded: the verifier is polynomial time, but the prover is unbounded. If you ask whether this is useful in practice, you would probably have to find some application, but we can't execute this in practice because the prover's algorithm will be unbounded. But you can do zero-knowledge proofs for languages that are not efficiently solvable.
If the verifier was also unbounded, in that case can we consider something analogous, can we have a proof system?
Probably, but if you increase the verifier's complexity, you of course have to change your definition of what efficient means. As I said about what is for free: if the verifier itself can run for a long time, whatever it can compute in that time is for free. So you can probably prove things that are outside this class.
Yes, so just a related question: there is property of the languages, property of the... yes, verifier is unbounded, you let them find it out, but there are other instances where to be useful. For instance, we have some instances where there is a noisy channel between a prover and a verifier, every little bit of verifier is unbounded, but I am going to use something about how I use this noisy channel, so it says that this is a statement answer I am going to make. Now I wonder if I am going to say about some way it isn't, okay, but the usual setting of verifier is competition and non-verifier, verifier really is good by themselves. By the way, there are computations which are very inefficient, like problems like, what about which are very, they are not of time?
They can be proved using zero knowledge. As long as your
computation is polynomial time, you can prove it. I mean, even if the polynomial is huge, you can prove it in zero knowledge.
I am asking, do you know correctly, like many, that is cryptography, that is, it is very inefficient and their proofs are not... like in research papers, many researchers write that we do not know the proofs of these techniques, but we are saying that they are secure, and until now we don't know of these things.
So I actually don't quite get your question.
Sir, like in many of these ways cryptography takes...
So I think the kind of proofs here, to be honest, are not quite the proofs here. We know what it means to prove: it is a very simple verifier that we make. Around the kind of mathematical proofs we are talking about, we don't actually specify a verifier given process. Mathematical proofs, mathematical proofs, let's take it out.
Yes, in the problem of graph three-colouring here, why doesn't the verifier ask for every edge of the graph, and then we can check?
So it won't be zero knowledge, and the reason it won't be zero knowledge is that the prover is giving all the colours, right? Then the verifier knows all the colours. Let's say that I give you the three-colouring of the graph; after this you can prove to someone else that it is three-colourable. I just want to say that we open only one edge because, if you look at the interaction, all that the verifier learnt at the end of the computation is two colours, and this doesn't reveal anything.
But as you said, to increase the probability we can iterate, so why don't we iterate?
I did not talk about it, but you have to give a new three-colouring every time. What does that mean? If the colours are 1, 2, 3, you make 1 into 2, 2 into 3, and 3 into 1, which means it's sort of a new instance every time. So it is not the same colours on the same edge, and that's important. So that's time to read
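The relabelling in that last answer, together with the effect of iterating, can be sketched as below. Everything named here is an assumption made for illustration: `relabel`, `is_proper`, `soundness_error`, and the triangle example are hypothetical, and the bound in `soundness_error` assumes perfectly binding commitments and independent random challenges in each round.

```python
import secrets

_rng = secrets.SystemRandom()

def relabel(colouring):
    """Apply a uniformly random permutation of the colours {0, 1, 2}."""
    perm = [0, 1, 2]
    _rng.shuffle(perm)
    return {vertex: perm[colour] for vertex, colour in colouring.items()}

def is_proper(edges, colouring):
    """True if no edge joins two vertices of the same colour."""
    return all(colouring[u] != colouring[v] for u, v in edges)

def soundness_error(num_edges, rounds):
    """Upper bound on a cheating prover passing every round.

    Each round catches a bad colouring with probability at least
    1 / num_edges, so all rounds pass with at most (1 - 1/num_edges)**rounds.
    """
    return (1.0 - 1.0 / num_edges) ** rounds

# Assumed example: a triangle with a valid colouring.
edges = [(0, 1), (1, 2), (2, 0)]
colouring = {0: 0, 1: 1, 2: 2}
for _ in range(5):
    fresh = relabel(colouring)
    assert is_proper(edges, fresh)  # relabelling keeps the colouring valid
    print(fresh[0], fresh[1])       # colours opened for edge (0, 1) this round

print(soundness_error(len(edges), 50))  # shrinks as the number of rounds grows
```

Because the permutation is fresh each round, the pair of colours opened for a given edge is a random pair of distinct colours, so opening many edges over many rounds does not let the verifier piece together the original colouring.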