I'd like to talk about some recent progress that I'm excited about on a classical problem called private information retrieval, or PIR for short. As a motivating example, let's consider private DNS. Private DNS has recently received quite a lot of attention from companies like Cloudflare, Apple, and Mozilla. For example, the new Firefox browser supports private DNS, and what they're doing is essentially DNS over HTTPS. So in their solution, you still have to trust the server. It would be very nice if we could accomplish this without having to trust any central server at all.

More concretely, imagine we have a paranoid cryptographer. Hypothetically, imagine he likes to play Minecraft, but he doesn't want the DNS server to know about this. In this setting, the database is the DNS repository. The database itself is public; it is the query that we want to protect. And this is exactly the PIR problem. PIR was first proposed by Chor, Goldreich, Kushilevitz, and Sudan back in 1995, so it's an almost three-decade-old problem.

More abstractly, let's imagine the database contains n bits. The user wants to retrieve the bit at some position x, where x is also called the index, and the user doesn't want to reveal x to the server. In this talk, we will focus on the two-server scenario: assume there are two servers that are non-colluding, and we want to make sure that from each individual server's view, nothing is leaked about the user's query. In comparison with the single-server setting, the two-server setting can often result in more efficient schemes.

Let me quickly tell you the landscape of this area. In the very beginning, researchers mostly considered a class of schemes which I call classical PIR: these are schemes without preprocessing. In this setting, there's good news and there's bad news. The good news is that we know how to construct schemes with polylog bandwidth. Here, when I write Õ, it hides polylog factors, so Õ(1) means polylog, and that's great. However, the bad news is that each query incurs a linear amount of server computation: basically, the server needs to look at every single position in the database. And in fact, Beimel and others proved a lower bound showing that linear server computation is indeed necessary. Intuitively, the reason is that if there are some bits the server doesn't look at, that leaks the fact that the client is not interested in those positions.

Unfortunately, linear server computation per query is quite bad. In practice, when the database is large, this is not particularly scalable; in private DNS, the database can easily be several hundred gigabytes. So this motivated a promising new direction, namely PIR in the preprocessing model. The preprocessing model was first suggested by Beimel, Ishai, and Malkin, and they showed that by using preprocessing, we can overcome the linear server computation lower bound.

In this talk, we'll focus on a particular type of preprocessing, which I call the subscription model. Imagine that each client who wants private DNS service subscribes with the server. During the subscription, the client downloads and stores some hint locally. And afterwards, with the help of the hint, hopefully we can answer each online query with sublinear computation. In most conceivable applications, we always want the ability to support an unbounded number of queries after the one-time preprocessing, so in this talk, we care about the unbounded-query setting.
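To pin down what we're after before surveying the results, here is the shape of such a subscription-model scheme as a small, hypothetical Python interface. The class and method names are my own illustration, not from any paper; the rest of the talk fills in an actual construction.

```python
# Hypothetical interface for two-server PIR in the subscription model.
class SubscriptionPIR:
    def subscribe(self) -> None:
        """One-time preprocessing: the client talks to a server,
        then downloads and stores a hint locally."""
        ...

    def query(self, x: int) -> int:
        """Return database bit x with sublinear online computation.
        Neither of the two non-colluding servers individually learns
        anything about x, and the same hint must support an unbounded
        number of queries."""
        ...
```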
For preprocessing PIR, the state of the art is an elegant work by Corrigan-Gibbs and Kogan, which won a best paper award at EUROCRYPT 2020. They showed how to support each online query with only square root n online time, assuming the existence of one-way functions. To get these results, they need to assume roughly square root n client-side storage, and in this talk, we are going to assume the same square root n client-side storage. To get the sublinear online computation, they have to make a pretty significant sacrifice: their online bandwidth per query is blown up to square root n, which is much worse than the classical PIR schemes. Also, to avoid confusion, I want to quickly point out that part of the Corrigan-Gibbs and Kogan paper also considered single-shot schemes, in other words, schemes that support only a single query after the preprocessing, and they were able to get better asymptotics in the single-shot setting. But in almost all conceivable applications, we always want unbounded queries, and that's the setting we care about in this talk.

So given this landscape, what's the most natural and obvious question? We are asking: can we get the best of both worlds? That is, can we get sublinear online computation with preprocessing, but still preserve polylog bandwidth, just like the classical PIR schemes? In our work, we show that indeed we can achieve this. To get this result, we need to assume that LWE is hard, which is the standard lattice-based hardness assumption. Also, due to an elegant lower bound proven by Corrigan-Gibbs and Kogan, our scheme is optimal up to polylog factors in its online computation, assuming the client has square root n storage and assuming that the servers store the database in its original format and do not perform any encoding on it. So essentially, square root n online time is the best you can hope for with this type of scheme.

So I've told you about our results. Before I tell you our scheme, let me mention that having a truly practical PIR is something we've always dreamed of for the past three decades. So are we there yet? In the past 10 years, I've spent a lot of time working on a seemingly related but different primitive called oblivious RAM, or ORAM for short. With ORAM, we really made it quite practical: you can implement an ORAM scheme on a small secure processor chip. But in comparison, PIR schemes are not quite so practical yet. First, the classical schemes with linear server computation are what I call asymptotically impractical: because of the poor asymptotics, they're unlikely to scale to large databases. For this reason, the preprocessing model seems more promising if we want to eventually have a practical PIR scheme. Our scheme is actually conceptually rather simple, as you'll see later. But it's still not practical, because we need a cryptographic object called a privately puncturable PRF. This is the only crypto we need, and essentially the rest of the scheme gives statistical guarantees. It turns out that the only known construction of a privately puncturable PRF is from lattices, and that construction is of a theoretical nature, so I wouldn't recommend implementing the current scheme in its original form. But our work hopefully does open up a new avenue towards eventually getting a practical PIR.
So if there's a way to somehow make the constants in the big O small, maybe we can hope for something like a truly practical PIR somewhere down the road. And maybe one way to get closer to this goal is to think about how to construct a concretely efficient privately puncturable PRF. This, I think, is an exciting future research direction.

So without further ado, let me tell you how to get our results. I'll start with an inefficient strawman, which is a variant of the strawman scheme described by Corrigan-Gibbs and Kogan. Then I'll tell you how to make the strawman efficient by using a new cryptographic primitive called a puncturable pseudorandom set.

Here's the strawman. Recall that we have two servers; I'll call them the top server and the bottom server, respectively. Let me start with the preprocessing phase. In the preprocessing phase, the client talks only to the top server. Here's what the client does during preprocessing. It samples a random set of indices: for every index from 1 to n, where n is the size of the database, the client flips a random coin and includes the index in the set with probability 1 over square root n. For example, imagine that S1 is the set of indices sampled in this manner. It's easy to see that in expectation, the set is of square root n size. The client now writes down the set. It also sends the set S1 to the top server. The top server looks up the database at the specified positions and computes the parity of all these bits. The resulting parity is denoted parity(S1) here, and it turns out to be 1 in this case. The server now sends the parity bit back to the client, and the client writes it down and remembers it.

The client needs to repeat this process, roughly speaking, square root n times polylog n times. As a result, it will write down square root n times polylog n sets locally, along with a parity bit for each set. Throughout this talk, for simplicity, I won't care too much about polylog factors, so sometimes when I say square root n, I may actually mean square root n times polylog n. Note that not all of these sets will be of the same size: because every index is sampled independently at random, the set size itself is a random variable, but it's very well concentrated around square root n.

At the end of the preprocessing, the whole state that the client is storing is called the hint. At this moment, the hint is still pretty large; in fact, it's more than n in size. And you may think this is pretty useless, right? Because if the client has such large space, it can simply store the whole database itself. But don't worry: later on, we'll see how to compress things, and in fact, that's the technically interesting aspect of our work, how to compress these things.
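To make the strawman concrete, here is a minimal sketch of the preprocessing phase in Python, under the simplifying assumption that we can call the top server's parity computation as a local function; all the names here are mine, not from the paper.

```python
import math
import random

def sample_set(n: int) -> set[int]:
    """Client: include each index of [0, n) independently w.p. 1/sqrt(n)."""
    p = 1 / math.sqrt(n)
    return {i for i in range(n) if random.random() < p}

def parity(db: list[int], s: set[int]) -> int:
    """Top server: XOR of the database bits at the requested positions."""
    b = 0
    for i in s:
        b ^= db[i]
    return b

def preprocess(db: list[int], n: int, num_sets: int) -> list[tuple[set[int], int]]:
    """Subscription: the client ends up holding num_sets (set, parity)
    pairs, where num_sets is about sqrt(n) * polylog(n); this is the hint."""
    hint = []
    for _ in range(num_sets):
        s = sample_set(n)   # sampled and stored by the client
        b = parity(db, s)   # computed by the top server
        hint.append((s, b))
    return hint
```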
Now we are done with preprocessing; let's see how to support online queries. In this example, imagine that the client wants to fetch index 6. Note that the client did not know the index 6 earlier, during preprocessing; the query is generated in an online fashion. During the online phase, the client interacts with the bottom server. To be able to fetch the database at the desired index 6, the client first needs to find a set it has stored locally that contains the index 6. In this case, S2 serves the purpose because it contains 6. And the client also has stored the parity of the set S2.

To obtain the bit at index 6, the client only needs to find out the parity of all the remaining bits in S2, that is, all the bits except the one at position 6. So the most naive approach is for the client to just remove 6 from the set S2; call the remaining set S2'. The client sends S2' to the bottom server, and the bottom server computes the parity of S2' and returns it to the client. The client can simply XOR the parity of S2 and the parity of S2', and it obtains the bit at position 6. The bottom server doesn't know what sets the client downloaded from the top server during preprocessing, so it seems like the bottom server is just seeing some random set, and intuitively, this approach shouldn't leak too much information. But I want you to think more carefully: is this really secure? It turns out there's actually a problem. This random-looking set S2' that the bottom server sees definitely does not contain the desired index 6. And that leaks information, right? Because the server learns that anything contained in the set must not be what the client wants.

So this isn't too great, but it turns out there's an easy way to fix the problem. Here's the fix. Instead of removing the desired index 6 from S2, the client will instead resample the decision of whether 6 belongs to the set. And here I'm using this particular notation to denote the resampling operation. In other words, the client flips a fresh new coin and decides whether to include the index 6, and it's included with probability 1 over square root n, the same as before. As a result, with high probability, the index 6 will indeed be removed, but with some small, non-negligible probability, 6 will still be included. Same as before, the client now sends the bottom server the resampled set S2', in which the desired index 6 has been resampled. The bottom server returns the parity of the bits at the specified locations, and the client computes the XOR of the two parities, hoping to get back the bit at position 6. Well, this will only give a correct result if the resampling indeed removed the index 6. If the resampling fails to remove the index 6, the client may obtain an incorrect answer. In other words, in order to fix the privacy problem, we've introduced a small error probability.

But it turns out the small error probability is easy to fix: we can basically use k-fold parallel repetition and take the majority among the k answers. This is a standard technique, and we can amplify the correctness to essentially 1 minus negligible. Because the k-fold parallel repetition is simple, for the rest of the talk, I will mostly just focus on a single instance, and I'll assume that for a single instance, it's OK to incur a small but non-negligible error probability.

Before I talk about how to make the strawman scheme efficient, let me quickly mention that so far, I've focused on making only one online query. It turns out that to make the scheme sustainable and support an unbounded number of queries, this is not enough. We need another operation called refresh: after you make a query, you have to perform a refresh operation, because making the query consumes one entry in the hint, and you have to replenish that entry. I won't have time to talk about the details of the refresh, so please read the paper for those.
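Setting the refresh operation aside, here is a sketch of the strawman's online phase, continuing the preprocessing sketch above, including the resampling fix and the k-fold parallel repetition; again, the function names are illustrative, not from the paper.

```python
def resample(s: set[int], x: int, n: int) -> set[int]:
    """Flip a fresh coin for x: include it with probability 1/sqrt(n)."""
    s2 = set(s)
    s2.discard(x)
    if random.random() < 1 / math.sqrt(n):
        s2.add(x)  # the small chance that x stays is what hides the query
    return s2

def online_query(hint, db, x: int, n: int) -> int:
    """One instance; wrong with small probability if resampling keeps x."""
    # Client: with high probability, some stored set contains x.
    s, b = next((s, b) for (s, b) in hint if x in s)
    s2 = resample(s, x, n)
    b2 = parity(db, s2)  # computed by the bottom server
    return b ^ b2        # equals db[x] whenever x was indeed removed

def query_amplified(hints, db, x: int, n: int) -> int:
    """k-fold parallel repetition: hints holds k independently
    preprocessed hints; run k instances and take the majority vote."""
    votes = [online_query(h, db, x, n) for h in hints]
    return int(2 * sum(votes) > len(hints))
```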
So far, we've seen an inefficient strawman. The good news is that we've managed to achieve square root n online time, because the server only needs to compute the parity of a set that's roughly square root n in size, so it only has to look at the database at square root n positions. The problem is that the client space is larger than the database itself. Also, the online bandwidth is square root n, because the client is sending the whole set to the bottom server during an online query. What we want is to reduce the client space to roughly square root n rather than linear, and to reduce the online bandwidth to roughly polylog, while keeping the square root n online time. One thing I want to mention is that we want both the client's and the server's online computation to be small, basically upper bounded by square root n.

So how can we achieve this kind of compression? This brings us to a cryptographic primitive called a puncturable pseudorandom set. At a high level, what we want is the following. Let's start with the preprocessing phase. Instead of storing the entire set S1, imagine we compress it down to a small pseudorandom key, denoted k1. If we do this for each of the roughly square root n sets in the hint, we compress the storage down to roughly square root n. Similarly, instead of sending the entire set S1 to the server, the client can simply send the compressed key k1, and the server can take this key, expand it into the entire set, compute the parity of those bits, and send it back to the client. So at the end of the preprocessing phase, the client now stores roughly square root n keys rather than square root n sets, and its space is roughly square root n.

That was the preprocessing phase, and that part is easy. Now let's look at the online phase, where something more interesting happens. In the online phase, imagine the client wants to look up the index 6. As before, the client needs to identify a set in its local hint that contains the index 6. In other words, it wants to find a key ki such that 6 is included in the set generated by ki. Remember that in the strawman scheme, the client would resample the decision of whether 6 is included in the set and then send this resampled set to the bottom server. Correspondingly, here we replace the resampling operation with a so-called puncture operation. The client punctures the key ki at the point 6, and this gives a punctured key, denoted ki'. I want you to think of the punctured key as follows: the set generated by the punctured key ki' is basically the set generated by the original ki, but with the point 6 resampled. I'm going to tell you how to construct something like this, but for the time being, imagine that we indeed have a puncturable pseudorandom set like this. Now the client can just send the key ki' to the bottom server. The bottom server enumerates the set generated by the punctured key ki', looks up those bits in the database, computes the parity, and returns it to the client. Basically, everything is just like the strawman scheme, except that we are compressing the sets into succinct keys, and importantly, we need the ability to puncture a point from a key that represents a certain set.
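In code, the compressed scheme changes only what's stored and what's sent. Here's a sketch of the new online phase against a hypothetical puncturable pseudorandom set object, reusing the parity helper from the strawman sketch; the method names mirror the algorithms described next, and all the names are mine.

```python
class PPRS:
    """Hypothetical puncturable pseudorandom set over [0, n)."""
    def gen(self) -> bytes: ...                           # sample a short key for a set
    def enum(self, key: bytes) -> set[int]: ...           # expand a (punctured) key into its set
    def member(self, key: bytes, x: int) -> bool: ...     # fast membership test
    def puncture(self, key: bytes, x: int) -> bytes: ...  # resample x's membership, hiding x

def online_query_compressed(pprs: PPRS, keys, parities, db, x: int) -> int:
    """The hint is now ~sqrt(n) short keys plus one parity bit per key."""
    i = next(j for j, k in enumerate(keys)
             if pprs.member(k, x))           # ~sqrt(n) tests by the client
    k_punct = pprs.puncture(keys[i], x)      # plays the role of resampling x
    b2 = parity(db, pprs.enum(k_punct))      # bottom server expands and XORs
    return parities[i] ^ b2
```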
A cryptographic object like this is called a puncturable pseudorandom set, and what it allows you to do is the following. First, we want to be able to sample a key that represents a set. With this key, we have a set enumeration algorithm that expands the key into a pseudorandom set. And then we can puncture a key at a certain point, which effectively resamples whether the punctured point belongs to the set or not.

To use such a puncturable pseudorandom set scheme in our PIR, we want it to satisfy some nice properties. First, we want to make sure the punctured key hides the point being punctured. This is because the client will be sending the punctured key to the bottom server during the online query, and due to the privacy requirement of PIR, it must be that the punctured key doesn't reveal which index the client wants. If a puncturable pseudorandom set scheme satisfies this private puncturing property, we also call it a privately puncturable pseudorandom set. Second, we want a fast membership test algorithm. Specifically, given a key or a punctured key, we should be able to determine whether a given index x belongs to the set in as little as polylog time. Why do we need this requirement? Recall that whenever a new query comes, the client must find a key ki in its local hint such that the desired index, in this case 6, is included in the set generated by ki. The way you would do this is essentially to scan through every key stored in the local hint and test whether the desired index belongs to the set. Because there are as many as square root n keys to check, we want to make sure there's a fast membership test algorithm; this ensures the client's computation is upper bounded by roughly square root n. Finally, we want a fast set enumeration algorithm. That is, given a key or a punctured key, one should be able to enumerate the entire set in time roughly proportional to the size of the enumerated set, basically square root n. This is because whenever the server receives a key or a punctured key, it needs to enumerate the entire set in time roughly square root n, so that it can look up the database at those positions and return the parity to the client. Just to quickly recap, we want these three nice properties from the puncturable pseudorandom set scheme: the first is a security property, and the other two are efficiency properties.

So how can we construct such a puncturable pseudorandom set? In some sense, Corrigan-Gibbs and Kogan, in their paper, encountered the same technical challenge; they were kind of stuck here, and that's why they didn't have an optimal scheme. So let's think about how to construct a privately puncturable pseudorandom set. From prior works, we know how to construct a privately puncturable PRF, and this seems pretty close to what we want. So what is a privately puncturable PRF? It is essentially a PRF scheme, but with an additional puncture operation. The puncture operation produces a punctured key, and this punctured key lets you evaluate the PRF at any point except the point being punctured. Moreover, the punctured key must hide the point that is punctured.

Here's the most naive approach. Say we want to know whether an element 6 is in the set or not. We can evaluate the PRF at the point 6 and check whether the outcome has (log n)/2 trailing zeros at the end. In this way, 6 gets included in the set with roughly 1 over square root n probability. Puncturing a point is also simple: if we want to puncture a point from the set, we can simply call the puncture function of the underlying PRF. Recall that if we puncture some point and then use the punctured key to evaluate at exactly the punctured point, it just gives a fresh pseudorandom outcome. So this is as if we freshly resampled whether that particular element belongs to the set or not.
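Here's the naive approach in code. Since writing down a real privately puncturable PRF would take us too far afield, this sketch stands in a keyed hash for the PRF, so it only captures the membership rule, not the puncturing or its privacy.

```python
import hashlib

def prf(key: bytes, x: int) -> int:
    """Stand-in for the privately puncturable PRF (a keyed hash here)."""
    digest = hashlib.sha256(key + x.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")

def naive_member(key: bytes, x: int, n: int) -> bool:
    """x is in the set iff PRF(key, x) has (log n)/2 trailing zero bits,
    which happens with probability roughly 1/sqrt(n)."""
    half_log_n = (n.bit_length() - 1) // 2  # assume n is a power of two
    return prf(key, x) % (1 << half_log_n) == 0
```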
OK, so now I want you to take a moment and think about whether something like this would work. The problem with this approach is that the set enumeration algorithm takes too long. If I want to enumerate all the elements in the set generated by some key, I essentially have to perform n individual membership tests, taking linear time. And remember, we want square root n rather than linear. So we've seen a strawman approach, and we've learned that guaranteeing fast membership tests and fast set enumeration at the same time seems pretty hard.

This brings us, finally, to our approach. At a very high level, our key insight is the following. We will sample the set according to a carefully crafted distribution: each element is included in the set not completely independently; there's a tiny bit of correlation between certain elements. With this carefully crafted distribution, we can simultaneously have fast membership tests and fast set enumeration. There's just a small caveat that comes with it: the correlation between elements is going to break the puncturing just a little bit, and it turns out this introduces a small error probability for the puncturing. But we can cope with a small error probability using parallel repetition, so it doesn't turn out to be a big problem.

So let's see how to actually instantiate a scheme given these insights. Imagine the client has in mind a query for the index 38. The client wants to find a key, among all the keys it has stored, such that the set generated by the key contains 38. To do this, we want to test whether the element 38 belongs to the set or not. First, let's express this number in binary format; essentially, this is the base-2 representation. Next, we will pad the string with 2 log log n zeros at the front; I'll explain why this padding is needed later. Then we take the last (log n)/2 bits and compute the PRF of these bits. When I write h here, it doesn't mean a hash; it actually means the puncturable PRF scheme. And not only that, I will repeat the same for every suffix of the string that's at least (log n)/2 bits long. Now, the element belongs to the set if and only if all of these tests return 1. And here, I've just copied the scheme to the upper-left corner so we can keep looking at it.

One question I haven't answered is how to puncture a point from the set. That's pretty easy: basically, we puncture all the relevant suffixes from the PRF scheme. This does require a PRF that supports puncturing at log n positions, and that's not a problem, because we know how to get it from a standard puncturable PRF that supports puncturing at a single point. Furthermore, if the underlying PRF is privately puncturable, then the resulting puncturable pseudorandom set scheme is also privately puncturable.

So what about efficiency? It's not hard to see that the set size is still roughly square root n. The membership test is fast: the description of the scheme itself provides a fast membership test algorithm. Specifically, a membership test requires computing the PRF on roughly log n points, so it takes time at most polylog. Finally, I also claim that with this kind of structure, set enumeration is also efficient and takes time roughly square root n. I won't have time to give the details of the set enumeration algorithm, but at a very high level, you start with all the suffixes of length (log n)/2, and then you keep expanding them. At any point in time, you keep only the suffixes that hash to 1 and throw away the rest, and there should only be roughly square root n of them that you have to keep. Please read the paper for details.
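Here's a sketch of both the membership test and the set enumeration for this distribution, once more with a keyed hash standing in for the single-bit privately puncturable PRF; treat it as an illustration of the sampling structure, not of the cryptography.

```python
import hashlib

def prf_bit(key: bytes, s: str) -> int:
    """One pseudorandom bit per binary string (keyed-hash stand-in)."""
    return hashlib.sha256(key + s.encode()).digest()[0] & 1

def padded_bits(x: int, n: int) -> str:
    """Base-2 representation of x with roughly 2 log log n zeros in front."""
    log_n = n.bit_length() - 1        # assume n is a power of two
    pad = 2 * log_n.bit_length()      # about 2 log log n
    return "0" * pad + format(x, f"0{log_n}b")

def member(key: bytes, x: int, n: int) -> bool:
    """x is in the set iff every suffix of length >= (log n)/2 maps to 1."""
    s = padded_bits(x, n)
    half = (n.bit_length() - 1) // 2
    return all(prf_bit(key, s[-length:]) == 1
               for length in range(half, len(s) + 1))

def enum(key: bytes, n: int) -> set[int]:
    """Grow suffixes one bit at a time, keeping only those that map to 1.
    Roughly sqrt(n) candidates survive each round, so the total time is
    about sqrt(n) up to polylog factors."""
    log_n = n.bit_length() - 1
    pad = 2 * log_n.bit_length()
    half = log_n // 2
    live = [format(v, f"0{half}b") for v in range(1 << half)]
    live = [s for s in live if prf_bit(key, s) == 1]
    for _ in range(log_n + pad - half):
        live = [b + s for s in live for b in "01"
                if prf_bit(key, b + s) == 1]
    # Survivors that start with the zero padding are exactly the padded
    # representations of the set's elements.
    return {int(s, 2) for s in live if s.startswith("0" * pad)}
```

As a sanity check, enum(key, n) returns exactly the x for which member(key, x, n) is true, while only ever touching about square root n candidate suffixes per level, instead of testing all n positions.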
So there's only one last question I haven't answered yet: why do we need these zeros at the front? To answer this question, recall that whether two elements belong to the set or not is not completely independent; there's a bit of correlation. In this example, x and y share a sufficiently long suffix. Recall that to test whether x or y is in the set, we need to compute the PRF on every suffix that's sufficiently long. Therefore, if x is included in the set, then y is more likely to be included as well. In other words, x and y have a little bit of correlation. This correlation results in a small problem: if we try to puncture the point x from the set with the intention of removing x, we might accidentally end up removing y as well. This isn't the intended behavior. If puncturing x causes any other element to be removed too, it incurs an error in the PIR scheme. But remember what I said about errors: we can deal with small error probabilities through parallel repetition. So it turns out that as long as we can keep the error probability small, we are fine. And this is the reason for the zeros at the beginning. We pad with these 2 log log n zeros in the front because it makes the probability that any given element is included in the set polylogarithmically smaller, such that with more than 99% probability, puncturing x is not going to cause any other element to be removed along the way. Our paper contains a detailed analysis.

To summarize, to get our PIR scheme, we just need an efficient privately puncturable pseudorandom set, and constructing such a puncturable pseudorandom set scheme with the desired efficiency requirements is non-trivial. As mentioned, our key insight is to sample the set according to a carefully crafted distribution that resolves the tension between fast set enumeration and fast membership tests. The slight correlation this introduces is going to break the puncturing just a little bit, introducing a small error probability, but we can deal with that through parallel repetition. Our construction is conceptually very simple, but the proofs are non-trivial, especially the correctness proof. Also, although the construction is conceptually simple, if we want to implement it, we do need a concretely efficient privately puncturable PRF scheme.

There are quite a lot of things that I didn't have time to cover; please see our paper if you are interested. On a concluding note, I want to call your attention again to this long-time dream question that's near and dear to my heart: can we get a truly practical PIR? This question is still open, and I'm very happy to discuss it with you if you are also excited about it. Thank you very much.