Hello, and welcome to my TCC talk. My name is Ari Karchmer, and I'll be presenting a paper titled Covert Learning: How to Learn with an Untrusted Intermediary. This is joint work with Ran Canetti. To illustrate the topic of this talk, I'll start with a story. Let's say there's a biologist named Alice who wishes to develop a so-called structure-activity model. A structure-activity model describes how a certain set of molecular features influences activity. Here, features could be any attributes of the molecule, like its components, size, or shape, while activity is basically whether or not the molecule in question binds to a specific fixed protein or cell. Now let's say that Alice, being an expert in her field, has some advanced domain knowledge that allows her to hypothesize some subset of structure-activity models out of all possible ones. To actually learn the best model, she encodes her hypothesis into a specific set of experiments that she will conduct and then process into her final structure-activity model. So Alice runs her experiments and reads the results, and maybe, based on these results, Alice designs some new experiments and runs those.

Now, the complication in this story is that Alice has a lab mate called Eve. Eve is quietly spying on Alice, so she observes all the experiments and their results. The main question I'll consider in the majority of this talk is: how can Alice continue to run her experiments in the lab so that she obtains a good model, while preventing Eve from using the experimental design to gain any knowledge about Alice's initial hypothesis? Furthermore, I'll consider how Alice can prevent any knowledge leakage about even the actual structure-activity relationship that is revealed by the experimental results. I'll note that what makes these questions interesting is that the molecules aren't traditional computational entities; we can't just establish a secure channel with them, which would take Eve out of the equation.

In the second part of this talk, I'll further complicate the story. Now imagine that after starting her experiments, Alice is contact traced and must isolate, making her unable to read the experimental results. Therefore, Alice has to ask Eve to read the results and report them to her. So in the second part of this talk, I'll consider how Alice can verify the results reported by Eve, in addition to maintaining the privacy of her hypothesis and the structure-activity relationship.

Let me now distill this story into a short list of informal goals, which together form the covert verifiable learning model. First up is learning: if Eve reports the experimental results truthfully, Alice learns a good model from her experiments. Then there's concept hiding, which is the guarantee that no, or maybe little, information about the molecular relationship is leaked, at least when Alice's random coins remain hidden. Then there's hypothesis hiding: no information about Alice's hypothesis, or the domain knowledge used to form that hypothesis, is leaked; again, this holds given that Alice's random coins remain hidden. And finally, verifiability, which is the guarantee that if Eve tampers with the results, she can't deceive Alice into learning a faulty model. Here we might let Alice have prior access to some random ground truth experiments, which gives her some information to leverage against Eve.
Now, certain aspects of this problem setting have been studied before, which I'll very briefly highlight. First, a recent work of Goldwasser, Rothblum, Shafer, and Yehudayoff examines a related model called PAC verification, which only considers analogs of the learning and verifiability goals. On the other hand, we have another recent work of Ishai, Kushilevitz, Ostrovsky, and Sahai, which considers a problem called cryptographic sensing. Cryptographic sensing focuses on analogs of learning and concept hiding.

Now, to prevent any confusion, let me quickly tell you what this talk is not about. First off, the model of differentially private learning considers the privacy of the owners of the data: the learned hypothesis should not depend too much on any individual example in its training data, roughly speaking, whereas our model considers the privacy of the learner itself, that is, its intentions and what knowledge is actually gained. Additionally, you might see a relation between our verifiability goal and verifiable computation. However, this is again a separate issue because, first, verifiable computation is an interaction between two computational entities, and second, it is concerned with the computational steps themselves, not with whether a good hypothesis is actually produced.

Let me now move on to describing some real-world use cases of covert verifiable learning, one of which is lab leaks. It turns out that although the story about Alice and Eve might have seemed a bit magical, it might not be too far from what actually goes on in the real world in the drug discovery industry. I'll point out a natural correspondence: Alice corresponds to a drug company, call it A; Alice's hypothesis and experimental design correspond to intellectual property owned by the drug company; and finally, Eve and the biolab correspond to some hostile environment such as an experiment outsourcing facility. Put in these terms, our questions are: how does A prevent trade secrets from being revealed by the experimental design? How does A discourage the facility from double-selling data to a competitor? And finally, how does A discourage the facility from returning faulty data?

Now, I promise this is the last time I'll bring this up, but I really want to hammer home the point that we are not looking to execute a protocol with the outsourcing facility. This is essential to understanding our work. In some way or another, some experiments desired by the drug company must be disclosed to the facility, so our algorithm can only interact with nature, that is, the structure-activity relationship. In other words, we must encode hiding and verifiability into the experiments themselves. And again, this is what makes the problem interesting: if nature were a computational entity, then we could just establish a secure channel and be done.

You can also imagine other applications of covert learning, not just the scientific discovery example, but more generally any time you want to hide what you are learning but can't work with any other entities. One example of this is a connection to model extraction attacks, which consider the problem of a malicious client trying to reverse engineer some model that is hosted in the cloud as machine learning as a service. The client does this by simply querying the model through the cloud interface. In this setting, a commonly proposed defense is to have the machine-learning-as-a-service provider monitor the incoming queries per client and try to classify them as malicious or benign.
So essentially, covert learning could provide an attack avenue here by hiding the intentions of the client from these query monitors. I'll now informally state our main contributions, which are found in our paper. First, we conceptualize a formal learning model that takes into account the hiding goals; this is the basic covert learning model. In this basic model, we give covert learning algorithms for noisy parity functions and for decision trees in the agnostic learning case. Then we describe how to augment the basic model with the verifiability guarantee; this entails extending the hiding guarantees to accommodate an active adversary. In this model, we show how to augment the previous algorithms with verifiability guarantees. Next, we show how to weaken the assumption that Alice has private access to random ground truth examples for verifiability, so that there is instead a set of publicly known ground truth examples. How to do this was previously unknown even in the PAC verification model of Goldwasser et al., which does not consider hiding guarantees at all. And finally, we explore other settings where we can achieve stronger privacy and verifiability guarantees. In this talk, we will focus on the first four. In fact, the main goal is to convey an intuitive version of these new learning models, and I will also give some high-level descriptions of the algorithmic ideas and proof techniques we use to construct learning protocols within them.

Let's now move on to the covert learning model. The first observation to make when trying to model covert learning is that Alice's story hints at a relationship to the learning-with-membership-queries setting. You can view the structure-activity relationship as a Boolean function whose domain is the set of all assignments of n binary features and whose co-domain is a binary label for whether or not the molecule in question displayed activity with some fixed protein. From this perspective, any experiments or screens are essentially membership queries to this function, while an initially hypothesized set of models corresponds to a specific choice of hypothesis class in the agnostic PAC learning setting. It is important to note that we strongly care about working in the agnostic learning model in order to motivate hypothesis hiding.

I will now go over some of this PAC learning terminology. We consider a hypothesis class, indexed by a number n, as a subset of Boolean functions. Similarly, a concept class, also indexed by n, is a set of joint distributions over n-bit strings and single bits, and when drawing from a concept distribution, an example consists of an input and a label. Finally, a loss function is a function that evaluates the quality of a hypothesis for a certain concept; for example, the probability that a randomly sampled data point from the concept is correctly labeled by the hypothesis in question.

Now let's review agnostic PAC learning. Fixing some hypothesis class and concept class, very roughly, an algorithm is an agnostic PAC learner if, with respect to some fixed loss function, it receives a sufficiently large set of random examples and produces a hypothesis which is competitive with the best hypothesis in the fixed class. When membership queries are allowed, the learning algorithm can also query the concept on specific points; in other words, the learner can bypass the marginal distribution over inputs and receive a label on any input of its choice.
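To make this setting concrete, here is a minimal Python sketch, with hypothetical names of my own, of the agnostic-PAC-with-membership-queries interface: a concept oracle that hands out random examples or answers membership queries, and the agnostic guarantee phrased as competing with the best hypothesis in the target class under the zero-one loss.

```python
import random

class ConceptOracle:
    """Hypothetical wrapper for a concept: a target function f with a uniform marginal over inputs."""
    def __init__(self, f, n):
        self.f, self.n = f, n

    def sample(self):
        # random example: x drawn from the marginal distribution over inputs
        x = tuple(random.randint(0, 1) for _ in range(self.n))
        return x, self.f(x)

    def query(self, x):
        # membership query: the learner chooses x itself, bypassing the marginal
        return self.f(x)

def zero_one_loss(h, oracle, m=2000):
    # estimate Pr[h(x) != y] over random examples from the concept
    return sum(h(x) != y for x, y in (oracle.sample() for _ in range(m))) / m

def agnostic_guarantee_holds(h, hypothesis_class, oracle, eps):
    # the agnostic PAC requirement: h competes with the best hypothesis in the class
    best = min(zero_one_loss(g, oracle) for g in hypothesis_class)
    return zero_one_loss(h, oracle) <= best + eps
```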
And so why is this called the agnostic model? Well, the hypothesis class doesn't need to contain a perfect hypothesis for the concept. In fact, the hypothesis class could be chosen independently of the concept class, perhaps when no or few assumptions can be made about the concept class. Therefore, it's clear that the choice of the hypothesis class is a very important aspect of agnostic learning; the choice might greatly affect the outcome.

In modeling covert learning, we largely stick to this PAC setting, but we depart in an important way: we consider a collection of hypothesis classes rather than a single fixed one. For example, the collection could be many sets of monomials, where each set only contains monomials over variables within some secret subset of variables that are hypothesized to be relevant, important, or useful. In this case, the secret subset would be a hypothesis. As in usual PAC learning, the covert learning algorithm takes as input accuracy and confidence parameters, but in addition it takes some target hypothesis class within the collection. As you can see from the figure, we have an interaction with an oracle to the concept; here the learner may request labels on synthesized inputs. At the end, the learner outputs a hypothesis that carries the agnostic PAC learning guarantee with respect to the input target hypothesis class.

So what's the catch? Well, there's also an adversary who views the entire interaction between the learner and the oracle. That is, it observes all membership queries and their results, and then tries to obtain some information about either the concept or the chosen hypothesis class. Recall that our goal is no information leakage, but with respect to concept hiding, there are obviously some concept classes for which we can't hope to achieve no leakage. For example, if the concept is always a constant function, then the moment a single example is shown to the adversary, all information is leaked. So, as you probably suspected, we cannot hope to prevent all leakage. A bit more generally, the adversary could apply a technique called Occam learning. Very roughly, an Occam learning algorithm takes any set of examples and outputs a succinct hypothesis that agrees with all those examples; here, succinct means that some natural measure of its size is bounded by a parameter of the learning algorithm. Now, it is known that Occam learning is essentially equivalent to PAC learning, in that one implies the other. This equivalence gives us a nice way both to accept the fact that not all concepts can be hidden and to neatly put together a privacy definition.

In that definition, we claim that the natural choice is to allow leakage of random examples from the concept. As a result, we define hiding as the requirement that there is a probabilistic polynomial-time simulator which takes a set of random examples from the concept and whose output forms a distribution that is computationally indistinguishable from the real interaction transcripts. The interaction transcript here consists of all the membership queries and responses. The simulator crucially works without knowledge of the target hypothesis class and without query access to the concept. So only random examples from the concept are given as input to the simulator, and this forms the only leakage, at least to an efficient adversary.
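As a rough illustration of this simulation-based definition, here is a hedged Python sketch with hypothetical names (learner.make_queries and the simulator interface are my own placeholders): the real transcript records the learner's membership queries and responses, while the simulated transcript is produced from random examples alone, with no query access and no knowledge of the target hypothesis class.

```python
def real_transcript(learner, oracle, target_class):
    # the learner's queries may depend on the secret target class and its random coins
    transcript = []
    for x in learner.make_queries(target_class):
        transcript.append((x, oracle.query(x)))
    return transcript

def simulated_transcript(simulator, oracle, num_examples):
    # the simulator sees only random examples from the concept -- this is the allowed leakage
    random_examples = [oracle.sample() for _ in range(num_examples)]
    return simulator(random_examples)

# Hiding: for every efficient adversary there is a simulator such that the two
# transcript distributions are computationally indistinguishable.
```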
A few more reasons why we allow the simulator to access random examples: first, we obtain an interesting zero-knowledge-style guarantee here, in that a covert learning interaction reveals only as much as could already have been learned from a public set of random examples from the concept. Furthermore, this is good because we have a strong idea that learning problems exist where leakage of random examples does not give much information about the concept. Specifically, decision trees, small-depth circuits, and parities with noise are a few examples. The reason is that these problems are thought to be computationally hard to learn from random examples. On the other hand, they are tractable when given membership queries, and this is the sweet spot that gives us a chance to do covert learning. Finally, it isn't simply an artifact of this definition that we leak random examples: at a high level, the equivalence of PAC learning and Occam learning means that, in some sense, the leakage of random examples is the minimum amount of leakage anyway.

This definition admits a trivial case, which will not be the focus of this talk. Let me explain how PAC learnable classes are also trivially covertly learnable; this naturally characterizes those classes that we can't hide and therefore will not focus on. In particular, if we assume every hypothesis class in the collection is learnable, that is, we can apply an efficient learning algorithm to any large enough set of random examples from the concept, then we have a covert learning algorithm as follows: first, get a sufficiently large set of random examples, then run the PAC learner. We can see that if the simulator returns the random examples given as input, then the privacy guarantee of covert learning is obviously satisfied, in fact not just computationally but perfectly, and yet it won't hide anything at all, due to the existence of the PAC learner. Some examples of PAC learnable classes are constant-term DNFs and parities without noise.

Okay, so with the covert learning model established, we're ready to take a look at a first example of covert learning. Again, parities without noise are trivial, but what about the noisy case? I'll now introduce the first learning algorithm, and while discussing it, I'll ask you to focus on the concept hiding guarantee, mainly for simplicity, but also because the instance of covert learning that I'm about to show you has concept hiding in a very strong way. So what is the learning problem? First, we need some background. The following is the LPN distribution, the study of which was initiated in a work of Blum, Furst, Kearns, and Lipton. The problem at its core asks to solve a system of noisy linear equations. Another good way to describe it is as a distribution. In this distribution, first an n-bit secret s is drawn, where each bit is sampled from a Bernoulli random variable with mean p. Second, a noise bit e is sampled from the same random variable. A uniformly random vector a is then sampled, and the inner product of a and s, XORed with e, is returned along with a. This constitutes one sample, and over many samples s remains persistent while a and e are freshly sampled each time. The classic problem associated with this distribution is search LPN, which asks to find s given some bounded number of samples, and it's not too difficult to see that search LPN is equivalent to solving the noisy system of linear equations. This problem is conjectured to be computationally hard.
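Before moving on, here is a small, hedged Python sketch of sampling from the LPN distribution just described; the names are mine, and the low-noise variant discussed next would set p = 1/sqrt(n).

```python
import random

def bernoulli(p):
    return 1 if random.random() < p else 0

def lpn_secret(n, p):
    # each secret bit is Bernoulli with mean p (p = 1/sqrt(n) in the low-noise variant)
    return [bernoulli(p) for _ in range(n)]

def lpn_sample(s, p):
    # one sample: uniform a, fresh noise bit e ~ Bernoulli(p), label <a, s> xor e
    n = len(s)
    a = [random.randint(0, 1) for _ in range(n)]
    e = bernoulli(p)
    label = (sum(ai & si for ai, si in zip(a, s)) + e) % 2
    return a, label
```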
In fact, the hardness of search LPN implies that the decisional analog is also computationally hard, which is particularly useful. The problem we will actually consider now is the low-noise variant due to Alekhnovich, which he used to give the first public-key encryption scheme from LPN of any noise rate. The low-noise variant draws the secret and noise from a Bernoulli random variable of mean one over root n instead of some constant. So our question will be: is there a covert learning algorithm for learning the secret s with respect to this low-noise distribution? The reason we are working with low noise will become clear later, but doing covert learning with respect to the standard LPN distribution is a very interesting open problem, because it would have implications for what kinds of primitives we know to exist from standard LPN.

Indeed, the answer is that there is, and I've stated this theorem on the screen as relying on the assumption that low-noise search LPN is hard, which in some sense is the minimal assumption because it keeps the question non-trivial and relevant. Note that the theorem is actually unconditionally true, because if the search LPN problem is tractable, then, as we already saw, the class is easily covertly learnable.

Now, to understand the covert learning algorithm, it's best to first understand the leaky way to learn the parity. To do this, we simply query each bit of the secret parity one by one, using the standard basis vectors as queries; then, by applying repetition and majority voting, we can remove the noise. But it's obvious that if Alice chooses to do this in the lab, then Eve can easily do the same majority voting and learn the noisy parity as well. The basic idea of the covert algorithm is to transform the queries of the leaky algorithm into pseudorandom queries. If we can learn from these, then the simulator is easy: it only has to return the random examples given as input. One way to do this is to generate a pseudorandom mask for each query by using the LPN distribution itself. To do this, we take one random n-by-n matrix and then a fresh secret and noise vector for each query. Once we have the masks, we use them as one-time pads for each query. Pictured at the top is the first masked query; below it, you can see the result of querying the oracle on this masked query, where the secret parity is in purple. Next, we also need to query the columns of the random blue matrix, which will help us later for the learning task.

Before analyzing how to extract the secret from these queries, let us quickly see why our membership queries are pseudorandom. The query transcript is at the top, and indeed it follows directly from our decisional low-noise LPN assumption that the joint distribution is pseudorandom, since we have used fresh secrets each time. The total number of queries is 2n, so a simulator needs to return just 2n random queries from its input. However, I want to highlight that the concept hiding here is very strong: it's not too hard to see that the oracle responses on these queries are also pseudorandom, due to the low-noise LPN assumption. So in fact, no information is leaked here at all; the simulator does not need to access any random examples and could just return random bits.
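To illustrate the query-generation step just described, here is a hedged Python sketch (names and structure are my own simplification): one shared random n-by-n matrix B, and, for each standard basis vector, a fresh sparse LPN secret and noise vector that produce a one-time pad; the columns of B are queried as well, giving 2n queries in total.

```python
import math
import random

def sparse_vec(n, p):
    return [1 if random.random() < p else 0 for _ in range(n)]

def covert_parity_queries(n):
    p = 1.0 / math.sqrt(n)
    B = [[random.randint(0, 1) for _ in range(n)] for _ in range(n)]  # shared random n-by-n matrix
    masked, lpn_secrets = [], []
    for i in range(n):
        s = sparse_vec(n, p)       # fresh sparse LPN secret for this query
        eta = sparse_vec(n, p)     # fresh sparse noise vector for this query
        # one-time pad: B*s xor eta, computed row by row over F_2
        pad = [(sum(B[r][c] & s[c] for c in range(n)) + eta[r]) % 2 for r in range(n)]
        e_i = [1 if j == i else 0 for j in range(n)]                  # leaky query: basis vector
        masked.append([a ^ b for a, b in zip(e_i, pad)])              # pseudorandom masked query
        lpn_secrets.append(s)
    columns = [[B[r][c] for r in range(n)] for c in range(n)]         # also query the columns of B
    return masked, columns, lpn_secrets
```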
Now, let's talk about how we can leverage the LPN secrets to find the secret parity function. We'll need the XOR lemma written at the top, which allows us to measure the bias of inner products of binary vectors. Pictured, I have the result of the first masked query, XORed with the inner product between the results on the columns of our blue matrix and the LPN secret used to mask the basis vector. This can be simplified to the inner product between the unmasked query, the basis vector, and the secret parity, plus three noisy terms, which are circled. In the case of the first term, note that both the red and purple vectors consist of bits that are independently drawn from the Bernoulli random variable with mean one over root n. With this in mind, the XOR lemma shows that the inner product is biased towards zero by a constant. The same argument gives the same conclusion for the inner product between the orange vector and the yellow vector. Finally, the last noise term, the bit in the orange square, is one with subconstant probability. Overall, since each noise term is independent, we can conclude that the final result is the desired inner product with high probability. From here, we can prescribe the usual course of repetition and majority voting, decode the secret parity bit by bit, and then we are done.

So, moving on, the question is: can we covertly learn more interesting concepts? What I will now show you is how to conduct Fourier-analysis-based learning techniques covertly. For background, let me review some Boolean Fourier analysis and the associated learning techniques. To set the stage, let f be a function from F_2^n to the reals, but in particular we will consider the range of f to be the set {-1, 1}, so you can interpret f as a Boolean function. Now, for some subset S of the n indices of f, we define a character function chi_S, which just computes the parity of the input on the bits indexed by the set S, but notice how zero is mapped to one and one is mapped to negative one. It turns out that this set of character functions forms an orthonormal basis over the functions f, so we can uniquely represent each f by its Fourier expansion, a linear combination of the characters. We call each coefficient the Fourier coefficient on chi_S, and we refer to it as heavy if its magnitude is some noticeable function of n. Also, the degree of the coefficient is the size of the set S.

Now, it's also important to understand why learning Fourier coefficients is a useful task. First, by definition, the Fourier coefficient on a character measures the correlation between that character and the function; here the expectation is taken over a uniformly random n-bit x. That means that if we can find some heavy Fourier coefficients, we get a weak predictor for the function just by taking the corresponding parity function as the predictor. Since Fourier coefficients measure correlation with respect to uniformly drawn inputs, this shows how learning Fourier coefficients interacts nicely with learning with respect to a uniform distribution over inputs. I'll also note that a single Fourier coefficient can be efficiently approximated by estimating the expectation at the top of the slide. And it turns out that an important application of this is that it allows us to learn polynomial-in-n size decision trees just by obtaining the subsets of log n degree that are heavy.
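Since that estimation step is simple, here is a short, hedged Python sketch of approximating a single Fourier coefficient f_hat(S) = E_x[f(x) * chi_S(x)] from uniformly random inputs; f is assumed to output -1 or +1, and the names are illustrative.

```python
import random

def chi(S, x):
    # character function: parity of x on the indices in S, mapped to {+1, -1}
    return -1 if sum(x[i] for i in S) % 2 == 1 else 1

def estimate_fourier_coefficient(f, n, S, samples=10000):
    # empirical average of f(x) * chi_S(x) over uniform x approximates the coefficient
    total = 0
    for _ in range(samples):
        x = [random.randint(0, 1) for _ in range(n)]
        total += f(x) * chi(S, x)
    return total / samples

# Example: if f is itself the character on S = {0, 1}, the estimate concentrates near 1.
```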
For a refresher, a decision tree is a representation of a Boolean function by a binary tree where each inner node is labeled by an index of an input bit, each outgoing edge is labeled by zero or one, and the leaves are labeled by real numbers. On any input, the decision tree follows a computation path down the tree, at each node going in the direction given by the assignment of the variable at the labeled index. The size of the decision tree refers to the number of leaves, so a polynomial-size decision tree has polynomially many leaves. The class of polynomial-size decision trees is a very expressive class for which no efficient PAC learning algorithm is known. Plus, it's especially relevant to us because decision trees are a standard way to represent structure-activity models. I won't get into this further, but for these reasons I'll now explain how to do covert learning of heavy, low-degree Fourier coefficients.

So let me explain the problem setting further. Here's the type of concept we are trying to learn. When getting a sample from the concept, first an input is sampled uniformly at random, and then a fixed target function labels the input, so the output of the distribution is always a uniformly random x together with f(x). In this setting, to make a membership query, one simply bypasses the marginal distribution over inputs to obtain a label on any input. So, in case it is still not crystal clear, our task is to obtain Fourier coefficients of f while satisfying the covert learning privacy guarantees with respect to this concept distribution I just described.

Now, before going on to describe how we do it, I want to highlight the hypothesis hiding guarantee for this problem, which I skipped over in the first result for noisy parities. To give an example, let's say the learner has some expert domain knowledge about this function. For example, in the drug discovery application, domain knowledge could be a certain subset of molecular features which are highly relevant. In particular, let's say the domain knowledge is that all heavy Fourier coefficients are on subsets S which are contained in some subset T of the n indices. This knowledge is valuable, as the learner could make more targeted queries to learn the function while reducing query complexity and computation time. However, the learner does not want its queries to reveal this valuable set T. So the big hypothesis hiding question is: how can we prevent this leakage? Well, this is exactly what is formalized by the hiding guarantee. We can do covert learning for a collection of hypothesis classes where each class is indexed by the subset T which encodes the prior information, and we have the guarantee that no information about T is leaked to an efficient adversary, due to the guaranteed existence of the simulator.

What we're actually going to do algorithmically is use a similar technique to before, the masked queries, but we will do it with an LPN variant due to Yu and Zhang. The interesting thing about this variant is that it can be conjectured to provide superpolynomial hardness even when secrets are drawn from distributions with log-squared min-entropy. It will turn out that this allows us to use sparse secrets, which at the end will be crucial for the success of the protocol. And finally, the hardness of this variant is implied by a somewhat better known assumption on the standard LPN distribution, namely subexponential hardness.
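As a hedged illustration of why masking helps with hypothesis hiding, the sketch below (hypothetical names; sample_pad stands in for a pad drawn from the LPN-style distribution) contrasts a naive targeted query, whose support over many queries would reveal the secret relevant set T, with its masked version, which looks pseudorandom to an observer.

```python
import random

def targeted_query(n, T):
    # leaky version: only coordinates inside the secret set T are varied,
    # so an observer of many such queries can read off T
    q = [0] * n
    for i in T:
        q[i] = random.randint(0, 1)
    return q

def covert_targeted_query(n, T, sample_pad):
    # masked version: XOR the targeted query with a fresh pseudorandom pad
    # (sample_pad is assumed to draw a length-n pad from the LPN-style distribution)
    pad = sample_pad(n)
    return [a ^ b for a, b in zip(targeted_query(n, T), pad)]
```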
Graphically depicted is the masked query technique, with the squared-log LPN distribution used to generate the one-time pads and with the labels given by the function f. With this technique, we can get hiding out of the way quickly: it is again quite easy to see, because it follows straight from the squared-log LPN assumption, using a security-under-polynomial-composition property. All the simulator needs to do is again just return the random examples that it has as input.

Okay, so with this in mind, let me begin to overview why this process works for us. Again, the plan is to mask queries using this LPN distribution as a one-time pad. The question then becomes: what can we say about the results returned by the oracle? The key to the protocol is an analysis showing that post-processing the results, by reintroducing a dependency on the LPN secret for each respective query, gives a type of noisy access to the low-degree, heavy Fourier coefficients of the function f. In particular, the result of the randomized mapping phi applied to the result of the masked query has a significant correlation with the parity function induced by any heavy, log-n-degree Fourier subset, when evaluated on the unmasked query. That's, in words, what the lemma on the screen states.

Indeed, this setting may be somewhat familiar: it can be seen as a Goldreich-Levin environment, so to speak. The Goldreich-Levin reduction gives a method for extracting parity functions given a noisy predictor and using membership queries. What we have really shown here is that the randomized mapping on the oracle's result for the masked query is such a noisy predictor for any heavy, log-n-degree parity function on the unmasked query. So we can update our algorithmic idea to the following: run the Goldreich-Levin algorithm in the masked regime, and do this in a way that focuses on the secret subset of indices T that we are interested in. Naturally, the hypothesis hiding property, given by the pseudorandomness of the queries, hides T. Perhaps an interesting dynamic to highlight here is that we don't actually make any specific queries directly to the function; instead, we simulate access to a noisy predictor for some of the heavy coefficients of f. It is natural to ask: why do we only get log-n-degree coefficients? One needs to get into the proof of the key lemma to see why, but it boils down to a basic counting argument showing that coefficients of higher degree are affected more by the noise, which eventually becomes too much to handle. And finally, I'll recall the utility of this algorithm: as mentioned, we can use knowledge of the heavy, log-n-degree Fourier subsets to produce a hypothesis which is competitive with the best polynomial-size decision tree for any target function, with respect to the uniform distribution over inputs. We formalize this in our work as a covert learning theorem for decision trees.
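To make the noisy-predictor view concrete, here is a hedged, simplified Python sketch: instead of the actual Goldreich-Levin search, it just brute-forces all candidate subsets of degree up to log n inside the secret set T and keeps those whose empirical correlation with the post-processed predictions on the unmasked queries is noticeably large; all names are illustrative, not the paper's.

```python
import itertools

def chi(S, x):
    return -1 if sum(x[i] for i in S) % 2 == 1 else 1

def heavy_subsets_in_T(unmasked_queries, predictions, T, max_degree, threshold):
    # predictions[j] in {-1, +1} is the post-processed (noisy) prediction for query j;
    # a subset S is kept when its empirical correlation with chi_S is noticeably large
    heavy = []
    for d in range(max_degree + 1):
        for S in itertools.combinations(sorted(T), d):
            corr = sum(p * chi(S, x) for x, p in zip(unmasked_queries, predictions)) / len(predictions)
            if abs(corr) >= threshold:
                heavy.append(S)
    return heavy
```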
So it now looks like we've made it to the second part of this talk: it's time to introduce verifiability. The best way to think about the verifiable model is as an interaction with an adversarial intermediary, rather than a passive one, which I'll now just call the adversary. In this case, the adversary not only monitors the queries and responses, but is actually allowed to corrupt responses. This resembles an interactive learning setting where the learner is not requesting queries from an oracle but from the adversary, and the adversary labels the queries as it pleases. I'll note that this is on top of the already agnostic setting, so the adversary may mislabel the concept, which may itself be an arbitrary function.

The verifiability guarantee we want to obtain is essentially that if the adversary corrupts any oracle responses, then the probability that the learner's output does not satisfy the learning requirements is low, given that the protocol was not aborted by the learner. In order to achieve this, we allow the learner to access a secret set of uncorrupted random examples of the concept. Now, it may be useful to compare this requirement to soundness in an interactive proof, while the learning requirement corresponds to completeness, and that is essentially how we formalize verifiability: for any concept in the class, any hypothesis class in the collection, any secret set of random examples, and any adversary, the probability that the algorithm fails is low, at least given that it did not abort the interaction altogether.

Now, it will be necessary to extend the privacy requirements to capture the active nature of the adversary. In order to do this, we follow the real-versus-ideal-world security paradigm and separate the adversary from an external distinguisher who tries to distinguish between the real and ideal worlds. In the real world, we allow the distinguisher to essentially choose the conditions of the learning task: the concept within the class, the hypothesis class within the collection, and the learning parameters. Then a randomly sampled set of examples is drawn from the concept; this set is given to the learner along with the relevant learning parameters, while the adversary gets just epsilon and delta. Then the learning interaction happens, and the adversary outputs some arbitrary string based on it. Finally, the output of the experiment is the learning parameters, plus the string output by the adversary, plus the secret set of random examples given to the real learner. In contrast, the ideal world goes as follows: the distinguisher still chooses the conditions of the learning problem, then another set of random examples is drawn from the concept, and the simulator gets the same inputs as the real learner, but also this set of extra random examples. Finally, the simulator interacts with the oracle in the adversary's head, and the adversary can output any string. The output of the experiment is analogous to the real world. I'll note that we have a desirable property here: security holds even after the leakage of the secret set S.

We can now move on to discussing how to transform the basic covert learning algorithm for learning Fourier coefficients into a verifiable version. The idea is very simple. From the covert learning algorithm, we know that the learner's queries are pseudorandom. This essentially gives the learner the ability to hide from the adversary whether it is running the covert learning algorithm or faking it, say by sending random examples. So this is exactly what we will do: we will randomly conduct either a learning phase or a testing phase, where each happens with probability one half. In pseudocode, it looks a bit like this: the learning phase consists of running the basic learning algorithm for heavy Fourier coefficients, while the testing phase sends a portion of the secret set of random examples and then checks for consistency. Remember that the concept in this case always labels every example in the same way, so a consistency check is possible.
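Here is a hedged Python sketch of that learn-or-test structure (all names hypothetical): each iteration secretly flips a coin and either runs the covert learning algorithm through the intermediary or replays a few of the secret uncorrupted examples, aborting on any inconsistent answer. The pseudorandomness of the learning-phase queries is what keeps the two phases indistinguishable to the intermediary.

```python
import random

def verifiable_covert_learn(covert_learner, secret_examples, ask_intermediary, rounds):
    # ask_intermediary(x) returns the (possibly corrupted) label reported for query x
    results = []
    for _ in range(rounds):
        if random.random() < 0.5:
            # learning phase: run the basic covert algorithm; its queries are pseudorandom,
            # so the intermediary cannot tell this phase from a testing phase
            results.append(covert_learner(ask_intermediary))
        else:
            # testing phase: re-send known uncorrupted examples and check consistency
            batch = random.sample(secret_examples, k=min(10, len(secret_examples)))
            for x, y in batch:
                if ask_intermediary(x) != y:
                    raise RuntimeError("abort: inconsistent label from the intermediary")
    return results
```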
Now, our main claim is that if the adversarial intermediary can, with high probability, corrupt even one oracle response without causing the learner to abort, then the squared-log-entropy LPN problem can be solved efficiently. This can be seen by observing that in the testing phase, a dishonest adversarial intermediary is always caught, so any cheating advantage of the adversarial intermediary can be converted into a distinguishing advantage against the supposedly pseudorandom queries of the underlying covert learning algorithm, which in turn breaks the LPN assumption. Ultimately, what this means is that either the adversary is caught and the interaction is aborted, or, with high probability, at least one of the r iterations is an uncorrupted learning phase. Then, from the learning guarantee of the covert learning algorithm, the verifiability guarantee follows. And finally, this gives theorems in the verifiable setting.

This nearly concludes my talk, but I'd like to mention a few interesting questions. First are some smaller questions, like how to extend these algorithms to other interesting regimes, such as functions over the real numbers or other large fields. Perhaps other learning problems are also interesting; specifically, the exact learning of graphs is particularly relevant to the computational problems in the application to molecular biology, so maybe this could be a fruitful ground for future research. Then some bigger questions we have are very general inquiries, like: is there a compiler that transforms any PAC-learning-with-membership-queries algorithm into a covert learning algorithm? This concludes my talk. Thank you for listening.