Hi everyone, I'm here to talk about our work on black-box zero-knowledge. This is joint work with my advisor, Omkant Pandey. Zero-knowledge is such an important primitive that it requires no introduction. In this work, we focus on black-box zero-knowledge. By black-box, we mean that the construction of the zero-knowledge protocol only makes use of the underlying primitive in a black-box way. So why are we interested in black-box zero-knowledge? Because the efficiency of the zero-knowledge protocol is not affected, no matter how you implement the underlying building blocks. Over the years, our community has developed many black-box zero-knowledge protocols. However, once you want to use the zero-knowledge protocol to prove a statement involving some cryptographic gadgets, things change. Even though the zero-knowledge protocol itself is black-box, once you run it on such a statement, you are bound to make non-black-box use of the underlying primitive involved in the statement. This is actually quite a common situation. For example, think of the Naor-Yung paradigm or the GMW compiler. In these situations, you need to run the zero-knowledge proof on an encryption or on a commitment, so you have to know the code of the encryption or the commitment scheme. In this work, we are particularly interested in this simple language. This language captures range membership for a one-way function f. The language itself is black-box, because to test membership, if someone gives you the witness, which is the pre-image, you only need black-box access to the one-way function f. As we know, zero-knowledge can be constructed from one-way functions in a black-box way. Now we have this language that also makes only black-box access to the one-way function f. So a natural question is: can we have a zero-knowledge protocol for this language that makes only black-box access to the one-way function f?
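For readers without the slide, the range-membership language described above can be written out as follows. This is a reconstruction from the spoken description only; the names L_f and R_f are notation assumed here and may differ from what the slide shows.

```latex
\[
  L_f \;=\; \{\, y \;:\; \exists\, x \ \text{such that}\ f(x) = y \,\},
  \qquad
  R_f(y, x) = 1 \iff f(x) = y .
\]
```

Checking R_f(y, x) takes a single oracle call to f, which is the sense in which the language itself is black-box.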
To capture this task formally, Rosulek defines the notion of functionally black-box zero-knowledge, or FBB-ZK. Let's look at the range membership language we defined before. FBB-ZK requires the existence of a protocol where both parties have only black-box access to this f. The statement is the image y, and the prover additionally holds the pre-image x as the witness. The prover wants to prove that y is indeed in the range of f. Completeness and soundness are defined in the natural way. The zero-knowledge property, as you can see, is also the same as the traditional one. The only difference is that now both parties have black-box access to f in the real execution. I want to point out that it is not necessary to require that the simulator makes only black-box access to this one-way function f. You can choose to give the code to the simulator. But the parties in the real execution must have only black-box access to this one-way function. Unfortunately, Rosulek shows that functionally black-box zero-knowledge doesn't exist even for this simple range membership language. This is actually quite discouraging, because as we mentioned before, we usually need to run ZK for a complicated statement. But now, even for this simple range membership language, we cannot do that. Therefore, in this work, we ask the question: can we somehow bypass this lower bound? So the first question is, why can we even hope to bypass this lower bound? If you look at Rosulek's lower bound carefully, it says that for a given one-way function f, you cannot have a black-box protocol for the range membership of this one-way function. Our idea here is to construct a new one-way function from the given one-way function, and then try to construct a zero-knowledge protocol for the new one-way function in a black-box way. I will elaborate on this in the next slide. More formally, we are looking for an oracle algorithm F and an oracle protocol Pi.
If we instantiate them with a one-way function f, then F^f will be a one-way function, and Pi^f will be a zero-knowledge protocol for the range membership of the new one-way function F^f, that is, for this new black-box language. It is important to notice that both F^f and Pi^f make only black-box access to the given one-way function small f. We call this primitive a proof-based one-way function. We can also define proof-based PRGs and collision-resistant hash functions in a similar way. Then, why does this approach have the potential to bypass Rosulek's lower bound? There are two reasons. First, given an input, this capital F can make many oracle queries to the small f. It can also perform pre-processing or post-processing on the intermediate oracle answers. Second, the protocol Pi, even though it cannot use the code of the small f, can use the code of the capital F. If this idea can really help us bypass Rosulek's lower bound, then our hope is that in applications where you need to run a zero-knowledge proof on a one-way function, PRG, or hash function, we will just use the proof-based version instead. Now, the question is, is this proof-based notion really possible to achieve? In this work, we provide results on both sides. On the negative side, we show that proof-based one-way functions are impossible. Actually, due to some technical reasons, we are only able to show that proof-based PRGs are impossible. But this is already strong evidence of the limitations of the proof-based notion. Next, we show that if we give the verifier some control over the input, then the proof-based notion actually becomes possible. I'll explain later what we mean by some control over the input. Let me first give a very brief overview of our negative result. Here, the strategy is that we assume the existence of a proof-based PRG, and then we try to derive a contradiction. Given any one-way function f, a proof-based PRG consists of two parts.
First, we have this oracle algorithm G^f, which is the PRG. We also have this oracle protocol Pi^f, which is a ZK protocol for the range membership relation of the PRG. To derive the contradiction, we first consider the honest execution of the protocol, where both the prover and the verifier have oracle access to the one-way function f, and we have an honestly generated pre-image and image pair, x and y. In this hybrid, by the completeness of the ZK protocol, the verifier will accept with probability close to one. Now we consider another hybrid, where we resample the oracle but keep the most likely queries of the verifier. Let's look at the real execution here. During the execution, the verifier can make multiple queries to the oracle f. Now we collect all the query-answer pairs that are obtained by V with high probability, and we call this set Q_easy. We keep this Q_easy set fixed, but we resample every other point of the oracle f. We call this new oracle f'. Now intuitively, because we keep all the queries asked by V with high probability, these two executions shouldn't differ too much from the verifier's point of view. Therefore, in hybrid one, the verifier should also accept with probability close to one. Actually, there will be an additional one-over-poly loss, but in this talk, we don't need to pay much attention to this. Then in the paper, we prove a technical lemma, which says the following: if we start with a y that is in the range of G^f, and then we replace this f by the composition of f' and the Q_easy set, the probability that y is still in the range of this G instantiated with the new oracle is no larger than one half. This gives us a contradiction, because in hybrid one, the verifier still accepts with probability very close to one. This technical lemma is our major contribution in this part, but due to time constraints, I won't talk about it here. More details can be found in our paper if you are interested.
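The resampling step in this hybrid argument can be pictured with a small Python sketch. It is only an illustration under loud assumptions: the oracle is modeled as a lazily sampled random function, `toy_verifier` is a hypothetical stand-in for the real verifier, and the frequency threshold of 0.5 is an arbitrary choice for "queried with high probability". None of these names or parameters come from the paper.

```python
import os
from collections import Counter


class LazyRandomOracle:
    """Random function sampled lazily; `fixed` pins chosen query/answer pairs."""

    def __init__(self, out_len=16, fixed=None):
        self.out_len = out_len
        self.table = dict(fixed or {})

    def __call__(self, x: bytes) -> bytes:
        if x not in self.table:
            self.table[x] = os.urandom(self.out_len)
        return self.table[x]


def likely_queries(run_verifier, oracle, trials=200, threshold=0.5):
    """Estimate Q_easy: pairs (q, f(q)) for points queried in at least
    a `threshold` fraction of the sampled executions."""
    counts = Counter()
    for _ in range(trials):
        counts.update(set(run_verifier(oracle)))
    return {q: oracle(q) for q, c in counts.items() if c / trials >= threshold}


def toy_verifier(oracle):
    """Hypothetical verifier: one fixed query plus one fresh random query per run."""
    points = [b"fixed", os.urandom(8)]
    for q in points:
        oracle(q)
    return points


f = LazyRandomOracle()
q_easy = likely_queries(toy_verifier, f)
# f' agrees with f on Q_easy and is freshly sampled everywhere else.
f_prime = LazyRandomOracle(fixed=q_easy)
```

In this toy run the fixed query lands in Q_easy, so f' answers it exactly as f does, while the rarely asked points get fresh answers. That is the sense in which the verifier's view should barely change between the two hybrids.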
We just showed that proof-based PRGs are impossible, and this is also evidence for the impossibility of proof-based one-way functions. So we want to ask the question: can we somehow relax the proof-based notion such that it is still meaningful and achievable? This is our original definition of a proof-based one-way function. Here, we draw some inspiration from the famous Goldreich-Levin theorem. That is, instead of viewing the input as a single string x, we split it into two parts, x and r. We require that F is one-way on its first input for arbitrary r. Then in the protocol, we allow the verifier to pick the r part. Intuitively, this allows the verifier to hold some part of the input, which he can use to verify the range membership. And as we will show later, this relaxation indeed allows us to have a construction. But here I want to make two remarks. First, since the verifier chooses the r part, to have a meaningful notion of one-wayness, we want to say that no matter what r is, the function should be one-way. This is exactly why we stipulate that one-wayness should hold for arbitrary r. Second, this r part is picked by the verifier. It is not part of the public statement, so it becomes a little bit trickier to define zero-knowledge. Here we have to switch to a 2PC-flavored definition. More formally, our new definition of a proof-based one-way function works as follows. As we said before, we want one-wayness to hold for all r. Now for completeness and for ZK, we are going to use a 2PC-flavored definition. We compare a real execution and an ideal execution. In the real execution, the sender has x as input, and the receiver has r as input. These are the two parts of the input we split as x and r. These two parties run the protocol Pi defined by the proof-based one-way function. At the end of the execution, the sender learns x and r, which are supposed to be the input of the one-way function F(x, r).
The receiver learns the output y, which is supposed to be the image of F on (x, r). In the ideal world, there is a trusted third party computing F^f. The sender provides the x part of the input. The receiver provides the r part. The trusted third party just returns r to the sender honestly, computes the output y, and gives it to the receiver. Once we are in this simulation paradigm, ZK can be defined by requiring security against a malicious receiver. To define soundness, we require that whenever the receiver outputs a value y, that is, a non-abort output, there must exist a pre-image such that this equation holds. If you are interested, you can pause the video and look at this more formal version of soundness. Now we are ready to talk about our construction. I will present a first attempt that doesn't quite work, but it gives us inspiration. Remember that a proof-based one-way function consists of two parts, the function and the protocol that proves its range membership. Here I will describe the protocol first, because from the description of the protocol, it is easy to derive what the function should look like. Let's recall our setting first. We have the sender and the receiver, both having black-box access to a one-way function f. The sender holds the x part of the input, and the receiver picks the r part of the input. Our first idea is inspired by the cut-and-choose technique. Concretely, we ask the sender to pick n pre-images, query the oracle, and learn the corresponding y1 through yn. The sender sends the images to the receiver. Now the receiver has no idea of the corresponding pre-images. But he will pick a random size-t subset r, where t is a constant fraction of n. The receiver sends this r. The sender reveals the pre-images in the positions specified by the size-t random subset. The receiver, of course, checks whether the positions in r are opened honestly.
If the check passes, he will output all the images, together with the pre-images for the positions in r and the subset r itself, as the final output. From this protocol, it is not hard to see that the corresponding one-way function should look like this. On input x and r, it just outputs the same output as the receiver. Let's look at what we achieved. Here are n positions. The receiver checks a size-t random subset. This ensures that a malicious sender can cheat on at most k positions out of n, where k is a constant fraction of n. This is not good enough. Recall that the soundness of the proof-based one-way function requires that, if the receiver accepts, there must exist a pre-image. However, this protocol only ensures that there exists some string that is (1 minus delta)-close to a real pre-image. As we said before, that is because the malicious sender S* can still cheat on a delta fraction. This inspires us to think of the following idea. Can we somehow extend f to F such that all the strings within a delta-ball of a real pre-image are also valid pre-images? Now, I'm going to talk about how to incorporate this delta-ball into the domain of the one-way function. Here, we need a new idea called pre-image editing. Concretely, we modify the previous one-way function in the following way. On input x and r, the function parses x as x1 through xn and an additional beta part. As we will see later, this beta part is the key component that allows us to incorporate the delta-ball. Then, as before, the function queries the oracle to compute y1 through yn, and as before, it parses r as a size-t subset of n. Now, to compute the output y, we consider two separate cases. The first case is the non-editing case. It is identified by this condition. This condition just says: if you collect the p_i parts from beta, they form a subset of positions. If this subset of positions overlaps with the positions specified by r, the function just operates as before. That is, it puts all of y1 through yn in the s part.
The more interesting case is the editing case. This case is identified by the complement of the previous condition. In this situation, for all the positions specified by the P set, we set s_i to the value specified by the corresponding y'_{p_i}. For all the other positions, we just use the y returned by the oracle. This might be confusing if you look at it for the first time, but the main takeaway is this: if the editing case is triggered, then for the positions specified by this P set, the corresponding s_i values already appear in the input. Anyone who holds this input can tell these values directly. Let's see how this resolves the previous problem. Recall that the previous problem is that we may have k bad positions in the pre-image. Now, by making use of this editing case, we want to correct all these k bad positions using the y'_{p_i} values contained in this beta part. I think it will help if we run through this mini example. Let's say, after the execution of the protocol, the receiver learned a y of this form, where this s4 is a bad position. Also, this y tells us that the set r consists of 1 and 3. Now, I claim that this x is a valid pre-image. Let's take a closer look at this x. Here, we have x1, x2, x3, and x5. These are valid pre-images of s1, s2, s3, and s5. Note that these pre-images exist because the only bad position is s4. For s1, s2, s3, and s5, there must exist corresponding pre-images. So the only problematic position is s4. Now in the pre-image we constructed, we just put a zero string as the pre-image of s4. We make the beta part the pair (4, s4). This s4 comes from the y. To see why this x is a valid pre-image of y, let's try to evaluate the function F on x and r following the definition on this slide. By definition, we know that the r set is {1, 3}, and the corresponding P set is simply {4}. Therefore, these two sets do not overlap, so we trigger the editing condition.
Therefore, it doesn't matter what value we put here, because this position is a position specified by P. In the output, we will always put s4 as the corresponding output. Therefore, if you follow the steps listed on the left, you can verify that (x, r) is a valid pre-image for this y. Here I want to make one more remark. These x1, x2, x3, and x5 are pre-images of the corresponding s values, or y values. They may not be efficiently computable, but in the proof of security, we do not really need to compute them. The only thing we need is their existence. So far, I have just shown you the one-way function construction. To have a proof-based one-way function, we also need a zero-knowledge protocol computing the function we just defined. In this slide, I will first give you an interactive proof protocol, and later I will show how to add the zero-knowledge property. If you stare at this construction for a second, it is not hard to see how we can have an interactive protocol computing it. First, the sender parses x into the corresponding x1 through xn and the beta part. Then it makes oracle queries to learn y1 through yn, and sends y1 through yn to the receiver. Since we also have an additional beta part, the sender also sends beta. The third step is to compute this random subset. In the protocol, we simply ask the receiver to send the set r as a challenge. At this moment, the sender knows both x and r, so he can perform step 4, as in the description of the function. Now the sender simply sends these values to the receiver. At the end of the protocol, you can verify that the receiver can output this value, and the sender simply outputs his input x and the r part he learns from the receiver. Now the only thing left is to make this protocol zero-knowledge. Up to now, this protocol is only a simple cut-and-choose protocol together with our pre-image editing.
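The pre-image editing function and the mini example can be replayed in a short Python sketch. This is a toy model under stated assumptions, not the construction from the paper: SHA-256 stands in for the oracle f, positions are 1-indexed as in the talk, beta is modeled as a list of (p_i, y'_{p_i}) pairs, and the output is simplified to the pair (s_1..s_n, r), leaving out anything else the real function outputs. The names `F`, `beta`, and `s4` are chosen here for illustration.

```python
import hashlib


def f(x: bytes) -> bytes:
    """Stand-in for the oracle f (SHA-256 is used only for illustration)."""
    return hashlib.sha256(x).digest()


def F(oracle, xs, beta, r):
    """Toy F on input x = (xs, beta) and a subset r of 1-indexed positions.

    beta is a list of (position, value) pairs, i.e. the (p_i, y'_{p_i}) entries.
    """
    ys = [oracle(x) for x in xs]
    p = {pos for pos, _ in beta}
    s = list(ys)
    # Editing case: P and r do not overlap, so the values planted in beta
    # overwrite the oracle answers at the positions in P.
    if not (p & set(r)):
        for pos, value in beta:
            s[pos - 1] = value
    # Non-editing case: P overlaps r, so s is just y1..yn.
    return tuple(s), frozenset(r)


# Mini example from the talk: n = 5, r = {1, 3}, position 4 is bad.
r = {1, 3}
s4 = b"value the sender cheated on"  # assumed to have no known pre-image
y = ((f(b"x1"), f(b"x2"), f(b"x3"), s4, f(b"x5")), frozenset(r))

xs = [b"x1", b"x2", b"x3", b"\x00", b"x5"]  # zero string at position 4
beta = [(4, s4)]                            # plant s4 through the beta part
```

Evaluating `F(f, xs, beta, r)` reproduces `y`, even though nobody knows a pre-image of `s4` under `f`. If beta instead names a position inside r, for example `[(1, b"junk")]`, the editing case is not triggered and the output is simply the oracle answers.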
As you can see, this cut-and-choose step does not reveal any extra information regarding the pre-image of the one-way function. So the only information we need to hide to achieve ZK is actually whether we are in the editing case or the non-editing case. To do that, we will use black-box commit-and-prove. Recall that to see whether we are in the editing case or the non-editing case, we need to use both the beta part and the y values. And this information is embedded in the first-round message. To hide this information, instead of sending the first-round message in the clear, we use the commit stage of the black-box commit-and-prove. In the third step, the receiver can still check the consistency between the pre-images and images on the positions specified by the set r. However, there is no guarantee that the y values are indeed the y values committed at the beginning. So we will ask the sender to give a zero-knowledge proof using the prove stage of the black-box commit-and-prove protocol. Importantly, this black-box prove stage only proves statements about y and s. It has nothing to do with the relation between the pre-image x and y. Actually, the consistency between the x and y pairs is checked by the receiver outside of this black-box proof. Therefore, this black-box prove stage has nothing to do with the oracle one-way function small f. Now let me briefly summarize our construction of the proof-based one-way function. We start with a simple cut-and-choose protocol. This protocol is not good enough because we want to guarantee the existence of a pre-image. However, if we use this protocol, there could still be a delta fraction of bad positions in the pre-image. So our idea is to embed the delta-ball into the domain of the one-way function. To do that, we introduce a new idea called pre-image editing. The editing case will not happen in the real execution of the protocol. That is because there is an r part controlled by the verifier.
If the verifier picks the r part randomly, the editing case won't happen except with negligible probability. But now, the security reduction can make use of this editing case. Intuitively, it plants the k values from the output y into the pre-image x. Since there are at most k bad positions, we can effectively correct all of them. To add the ZK property, we rely on a black-box commit-and-prove. It is important to notice that this prove stage does not make use of the code of the oracle small f. Now we have shown how to construct a proof-based one-way function. A proof-based PRG can be built in a similar way, so we will not talk about it. As mentioned at the beginning of this talk, we also want to construct proof-based hash functions. However, it turns out that this construction requires a new idea. Let me explain why our technique for proof-based one-way functions doesn't work directly. Recall that in the proof-based one-way function construction, we introduced this pre-image editing idea. It essentially introduces multiple pre-images for a single y. This actually violates the collision resistance requirement if you want a CRHF. To see why, first notice that the protocol is fine, because in the execution of the protocol, there is an r part controlled by the receiver. If the receiver picks this r part uniformly at random, the editing case will not be triggered, except with negligible probability. So a malicious sender cannot make use of this. However, remember that our proof-based notion also includes the construction of the function. The function is where the problem happens. More specifically, a PPT algorithm can easily construct two inputs such that both of them trigger the editing condition and hash to the same value. An adversary can do this because, when evaluating the function, there is no receiver. The adversary alone controls the whole input of the function. To solve this problem, we embed a random string z in the public hashing key, and we change the condition for the editing case.
We add one more condition. This condition says you can trigger the editing case only if you know a pre-image of z under the oracle hash function. As we know, a PPT adversary cannot figure out a pre-image of z. Therefore, it cannot make use of the editing case. However, our security proof can always make use of this editing case. This is how we create the asymmetry between the adversary and our security reduction. Unfortunately, I cannot talk more about our construction for hash functions. If you are interested, please refer to our paper. Let me give a summary of this talk. In this work, we generalize the notion of functionally black-box zero-knowledge proposed by Rosulek. We show that Rosulek's impossibility result still holds for our generalized notion. Specifically, we show that black-box proof-based PRGs are impossible. Next, we relax the proof-based notion by splitting the input and letting the receiver control the r part. Then we show that for this split-input proof-based notion, one-way functions, PRGs, and collision-resistant hash functions become possible. Our work actually raises more interesting questions. As we mentioned before, we are only able to show the impossibility of proof-based PRGs, I mean without splitting the input. We take this as strong evidence for the impossibility of proof-based one-way functions, but it would be interesting to see an actual proof of this impossibility. Also, this work inspires us to ask a more general question. That is, can we give black-box zero-knowledge proofs for any polynomial-size circuit? More concretely, consider this example. Let's say we have a circuit C that can be decomposed into three parts. The input x first goes through a sub-circuit C1 that is crypto-free, and then a one-way function gate is evaluated on the output of C1. The output of the one-way function goes through another crypto-free sub-circuit C2 to give us the final output y.
It would be interesting if we could have a zero-knowledge protocol for the satisfiability of this circuit while making only black-box access to the one-way function. Rosulek's impossibility result says this is impossible. That is because we cannot even have black-box zero-knowledge for this one-way function gate, let alone the whole circuit. But our proof-based notion gives us some hope. The idea here is to replace the one-way function by its proof-based alternative. The result in our work gives a black-box zero-knowledge proof for this crypto gate. It would be very interesting to extend our idea to eventually give a black-box zero-knowledge proof for the whole circuit. Now we are at the end of this talk. Thanks for your attention, and more details can be found in the full version of this paper here.