Ladies and gentlemen, good evening. On behalf of the Albert Einstein Society and the University of Bern, I would like to welcome you to the second lecture in this cycle of Einstein Lectures. Tonight we are starting a little earlier than usual; tomorrow it will again be at half past seven, as always. I wish you a pleasant evening. Ms. Tretter will once more briefly introduce the speaker of these lectures, for those who were not here yesterday evening.

Ladies and gentlemen, on behalf of the Mathematical Institute and the University of Bern, I would like to welcome you to the second Einstein Lecture 2019, some of you for the first time and some of you for the second time. For those of you who weren't here yesterday, I would like to briefly introduce our eleventh Einstein Lecturer, Professor Shafi Goldwasser, one of the most distinguished computer scientists of our time. Today I will mention only a few of the many impressive distinctions she has received. In addition to her two professorships, one at the Weizmann Institute and one at MIT, in January 2018 she became director of the Simons Institute for the Theory of Computing. The most important entry on this long list of awards is the last one, which you see in red: jointly with Silvio Micali, she received the 2012 A.M. Turing Award, which is considered to be the Nobel Prize of computer science. She was elected to several national academies. She was awarded two honorary doctoral degrees, one from her alma mater Carnegie Mellon University and one only recently from the University of Oxford. And, very remarkably, she was invited to the US Congress for a briefing on cryptography. The purpose of this was to convince politicians of the key importance of unrestricted fundamental research in subjects like mathematics and computer science. Here "unrestricted", which by the way also appears in the mission of the Simons Foundation, means without a prescribed purpose and without attempts to measure it in publication numbers or impact factors, driven just by the curiosity and ambition of scientists. Factorization into prime numbers and mathematical proofs, as we learned yesterday in Shafi's talk, are very good examples of that. And I think we also understood that we can no longer ignore their importance for our present and future security in a data-driven world. Shafi, we are now very much looking forward to your second, maybe more scientific, Einstein Lecture.

So thank you, Christian, and all of you, for an incredible hospitality during this short stay here. And thank you, those of you who were here yesterday, for coming again. Today's lecture is more about mathematics than about the general impact of cryptography on the world. The title is "Pseudo-deterministic algorithms and proofs", and I'll explain during the talk what I mean by these things and what we know about them. It's based on joint work with a student from Weizmann, Eran Gat, with collaborators Oded Goldreich and Dana Ron, and with graduate students at MIT, Ofer Grossman and Dhiraj Holden, and Professor Michel Goemans. Where appropriate, I will attribute the paper. So basically, the starting point is the observation that computer science, a field that essentially started in the early 70s and goes on until today, has focused on computation: trying to find fundamental ideas about the nature of computation, and coming up with the design of fast algorithms.
And probably some of the highlights you could credit to the field are discovering and studying nondeterminism, randomness, interaction, synchronization, parallelism, locality, fault tolerance. These are all big topics, each of which has been at the central focus of study by computer scientists. And we all know the field has had great impact on technology and science. Can you see this? It jumps. In particular, fast search algorithms brought us the Google algorithm and Akamai; algorithms for fault tolerance in networks enable routing of messages in the Internet even in spite of congestion; electronic commerce, which I discussed yesterday; computational biology and fast algorithms in genetics; quantum algorithms. So there's been a lot of impact. Apparently on the video it's better to have this one, so I'll try to control it; I'm just a little unfamiliar with this pointer.

So these ideas, which really are ideas on paper, have also made a lot of impact. And the idea that I'm concerned with in this talk, or will focus on, is randomness: the use of randomness in computation in order to make some computations which are impossible possible, to make the solution of other problems more efficient, and so forth. Randomness has been a big topic in computer science, and there's a huge body of research dedicated to the construction of probabilistic algorithms, really starting from 1976; I'll say a few words about that. And also, as you saw yesterday, I talked about probabilistically verifiable proofs, starting from 1986. Those of you who were here at the lecture yesterday will recall we had this interactive proof, where the one who was verifying the proof was tossing coins: he used randomness to generate the questions that he poses to the one who was proving. And embedded in this notion of a proof there is some notion of error: we are willing to accept the proof while realizing that there might be some small chance we are mistaken. Besides interactive proofs, I mentioned in a word that there is something called probabilistically checkable proofs. So the study of probabilistically verifiable proofs is actually quite an extensive one, and it is studied all over theoretical computer science.

But today, where is the pointer? I'm afraid you're not going to see it on the video, and they're not going to see it in the audience. Okay, let's try just a couple more slides. Today I will talk about randomized algorithms that have an extra property. They're not just probabilistic algorithms; they're what I would call pseudo-deterministic probabilistic algorithms, and proofs. So I have to define what I mean by pseudo-deterministic algorithms. First of all, just to make sure, for those of you who are not computer scientists. You want to change it? Okay, yeah, I think you're right. So we're going to talk about pseudo-deterministic algorithms, which are going to be a special kind of probabilistic algorithm, and pseudo-deterministic proofs. These are notions which I will define. In some sense, I think the best outcome of this talk would be that you understand what I mean by these notions and that maybe I convince you they are interesting. Unfortunately, in order to fill an hour, I'm also going to show you some results.
So, some mathematical claims and some ideas of proofs; it will be a little more technical from that point of view. Just to make sure we're all on the same page: when I say a deterministic algorithm, we really mean just an algorithm or program that takes an input and produces an output, where both the output and the running time, how many steps the algorithm took, are determined as a function of the input. Once the input is fixed, the output is fixed and the running time is fixed. And in fact, throughout the talk I will talk about this class P. P is the class of problems which can be solved by deterministic algorithms that are also efficient, efficient meaning polynomial time; P stands for polynomial time. This is a class that was defined early in the history of theoretical computer science, by Edmonds. And as these things usually go, even though we study these abstract classes, they start from one problem. It was the problem of matching that Edmonds was interested in; he came up with an algorithm for graph matching, and noticed that this algorithm has a much better running time than the known algorithms for finding large cliques in graphs, even though the two problems look very similar to each other. And then he defined the whole class of problems that have efficient algorithms, calling it P. In any case, that's a deterministic algorithm. It's just a program: you run it, you're always going to get the same output, and it's always going to run the same number of steps.

A probabilistic algorithm is different. A probabilistic algorithm not only has an input like before; in addition to executing steps, it has the ability to toss coins. How one tosses coins is not the subject of this lecture; it's a very interesting subject. We assume abstractly that there is an ability to toss a truly fair coin as many times as the number of steps of the algorithm, so at every step you could toss a coin. Now, it could be a very interesting talk in itself to discuss where this randomness comes from. Is there randomness in principle? And if there is, is it really unbiased, or is there maybe just some entropy you can work with? There are a lot of results on how to extract fair-looking coins from biased sources in nature, and then on how to use deterministic programs to stretch them so you can get more and more bits that look random enough for the purposes of an algorithm. But for the purpose of this talk, let's take it as a given that we have a source of randomness: it exists and we are able to use it. The thing to point out here is that in this case the output of the algorithm, as well as its running time, are not just functions of the input; they are also functions of the random bits that have been tossed. In other words, if, say, your coin came up heads every single time, heads, heads, heads, heads, it might be a slow algorithm with one kind of output; and if it came up all tails, it might be a faster algorithm with a different output. Essentially there's no guarantee: it could be that for every sequence of coins there is a different output and a different running time. And usually, when we talk about the running time of a randomized algorithm, we may average it over the coin tosses: the expected running time.
And when we prove properties about such an algorithm, they have to be proved with respect to this distribution of coins. So this is a fundamental difference between deterministic and probabilistic algorithms. In the same way that we had this class P, the problems that can be solved by deterministic algorithms, there's a class called BPP, bounded-error probabilistic polynomial time. In words, it's those problems that can be solved by this type of algorithm efficiently, and solved correctly with high probability. Essentially, what it means is that it will give you a correct output: for every input, you have a guarantee over the coin tosses that the output is correct. And in fact, the problems one talks about with BPP are yes-no questions: is something true about x or not true about x? The standard way to write this down formally is below.
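For reference, the standard textbook formalization of this class (this is my transcription, not a slide from the talk) reads:

```latex
L \in \mathrm{BPP} \;\iff\; \exists \text{ a probabilistic polynomial-time algorithm } A \text{ such that }
\begin{cases}
x \in L \;\Rightarrow\; \Pr_r\big[A(x;r)=\text{yes}\big] \;\ge\; 2/3,\\[2pt]
x \notin L \;\Rightarrow\; \Pr_r\big[A(x;r)=\text{yes}\big] \;\le\; 1/3,
\end{cases}
```

where r denotes the coin tosses. The constant 2/3 is arbitrary: running the algorithm many times and taking a majority vote drives the error down exponentially.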
So what's an example of such a problem? In fact, this is the example that started the whole interest in probabilistic algorithms, and it was the question: given a number n, can you test quickly whether this number is prime or not? Yesterday we talked about the fact that if you're given a number n, I would like to factor it, and I said this is a hard problem: classical computers take a long time, while quantum computers in principle can do it quickly. But now I'm talking about a related but different problem: I just want to tell whether the number is prime or composite. If it's composite, I don't want to factor it; I just want to be able to tell. So this is seemingly a simpler problem, just being able to tell whether something is a prime number or not. Five is a prime, six is not. But when the numbers get large, it's not such an easy question. If you just take the number and try to divide it by the numbers smaller than it, that would take a very long time; that would be an exponential-time algorithm. So it was a beautiful question: can you tell quickly whether a number is prime or not, even though you don't know how to factor quickly? This was around the early seventies, actually at Berkeley. There were two papers, one by Solovay and Strassen, and the other by Mike Rabin and Gary Miller. Different algorithms, but both with the same following property. Given a number, is it prime, they would say yes or no, and they ran quickly; they were probabilistic algorithms that terminate fast. And they were always correct when they said the number is composite. But when they said the number is prime, they could be wrong, because essentially what the algorithm was doing was searching for evidence that the number is composite. When it found such evidence, it said "composite". And if it didn't find any after a long search, it said: I haven't found a proof; a proof probably doesn't exist; this number is probably prime. Of course, this informal argument was formalized, and one could show that the probability of making a mistake when you assert the number is prime can be made exponentially small. You run enough trials so that, if you are confident the coins that were used are really fair coins, then you trust that the probability of error is small.

So this in itself is a beautiful notion; I don't know how many people are familiar with it. This idea that you are searching, in a well-defined manner, for a proof of some fact, and if you don't find it, you say that with high probability the fact is false: that is what stands behind this algorithm. So this was in the 70s. And since then people were like, wow, we don't know how to do this deterministically, but with coin tossing we can very quickly make these assertions, composite, prime, and we know that the correctness of our assertion holds with high probability. And this brought on the question: can you actually do this without coins? If we can do this with coins, are coins fundamental, or can we replace the probabilistic algorithm by a deterministic one that doesn't have coins? That would be better: then we don't have to worry about the randomness, whether it exists, whether it's good enough, and so forth. Actually, let me just say that this brought on a general question, not just whether you can test primality with a deterministic algorithm, but in general: is randomness necessary? Or can you solve everything you can do with coins also without coins; maybe it's just easier with coins, and we just happened to find the solution that way, but in principle we could do it without. That's been a major ongoing effort, what we call the de-randomization effort, trying to take the randomness away. For algorithms, it's still open; we don't know the answer. Using these letters that I introduced before, which computer scientists love using, I don't know why, to tell you the truth, but I've inherited this: is BPP equal to P? Is the class of problems that can be solved by a randomized algorithm the same as the class of problems that can be solved without randomness?

The primality problem in particular attracted lots and lots of beautiful work. One of my own works that I like most, with a graduate student at the time, Joe Kilian, used the theory of elliptic curves to devise a new primality test, not like the Rabin-Miller one: a test of whether a number is prime or composite that is always correct. It has no probability of error: if it says composite, it's composite; if it says prime, it's prime. However, it still uses randomness, which raises an interesting question: what is randomness good for if there's no error? Here, randomness is used to look for the proof. It may take time to find the proof; using randomness guides your search so that it is faster. But in the end you will find a proof in case it's prime, and you will find a proof in case it's composite. This doesn't mean that you can factor. Any questions? All right. But the most important result came in 2002, when the mathematicians Agrawal, Kayal and Saxena, who got a lot of accolades for it, showed that you can actually tell whether a number is prime or not without any randomness whatsoever: not for the correctness of the answer, not for the runtime, a totally deterministic algorithm. A beautiful algorithm, using some algebra and some very interesting ideas. So in a sense, the natural problem that motivated this whole study of randomized algorithms, telling whether a number is prime or not, is not an example anymore; we know how to solve it without randomness. Still, the general question remains, and it is a fascinating, fundamental one: do you need randomness to speed up computation, or not?
Okay, so the truth is that, in some sense, this is now a question short of examples, because we only know one other natural decision problem of this kind, which is testing whether a polynomial is identically zero. For that we know a randomized algorithm: set the variables of the polynomial at random and test whether it evaluates to zero or not. A nonzero polynomial has few roots in a suitably large range, so if we do this repeatedly, selecting numbers in that range, we will quickly find an assignment that makes it nonzero; and if we never do, then with high probability the polynomial is identically zero. So there's only one problem left. Well, not quite. It turns out that things change when you talk about what I call search problems.

So what's a search problem? So far, this prime question, is it a prime or is it not, was a yes-no question. But usually we are interested in much bigger questions. We're not asking yes or no: we have an input x and we have lots of possible outputs. This is what we call a search problem, and I'll use a little notation for it in the talk, though I'll try not to overdo it. On an input x, the search algorithm outputs a y; y now can be a long string, not just zero or one. But what has to be true is that some relation holds between x and y, whenever such a y exists. For example, x could be a logical formula, a formula from logic, and you want to know whether it is satisfiable; an assignment to the variables of the formula will show you that it is. So the relation between x and y is true if, in fact, the assignment y satisfies the formula x. For those of you who are not familiar with logical formulas, think of x as a graph, and we're looking for a short path in the graph, or for some kind of substructure in the graph; if it exists, any one of these substructures is a solution. Very natural: I've enlarged the scope from yes-no questions to general search questions. And that's going to be our interest in this talk.

There are two examples that will be especially interesting to us. The first example still has something to do with primality. Given a number, I know how to tell quickly whether it's prime or not, using some algorithm which doesn't use coins. But how about this problem: I give you a size, like 1000 bits or 2000 bits, and I ask you to give me a prime of that size. Just give me one. It turns out that actually finding a prime is not such a simple problem. The best we know how to do is to choose a number at random, say between 1000 and 1001 bits, and test whether it's prime. So we choose a string at random, and then we run this test that exists for telling whether it's prime or not. And we know that the density of primes is such that there are a lot of primes in these intervals, so we will quickly find one. But we don't know a deterministic procedure that will hand you a prime. Why this matters, we'll see later; but take it on faith that this is an interesting problem, how to generate primes of a given size. We really don't know: we can do it probabilistically, as I said, choose one at random and test it; we don't know how to do it deterministically. And that's a search problem. A sketch of the probabilistic procedure is below.
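Here is a minimal sketch of that probabilistic procedure: sample numbers of the requested size at random and test each with a probabilistic primality test. The subroutine used here is the standard Miller-Rabin test (the Rabin-Miller algorithm mentioned above); the function names and the choice of 40 rounds are illustrative, not from the talk.

```python
import random

def is_probably_prime(n, rounds=40):
    """Miller-Rabin compositeness test.  'Composite' answers are always
    correct; a 'prime' answer is wrong with probability <= 4**(-rounds)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # write n - 1 = 2^s * d with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # random candidate witness
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                  # found evidence: definitely composite
    return True                           # no evidence found: probably prime

def random_prime(bits=1000):
    """Sample odd numbers of the given size until one passes the test.
    By the prime number theorem, a random 1000-bit number is prime with
    probability about 1/ln(2**1000), roughly 1 in 700, so this loop
    terminates quickly."""
    while True:
        n = random.getrandbits(bits) | (1 << (bits - 1)) | 1  # fix top and bottom bits
        if is_probably_prime(n):
            return n

print(random_prime(256))  # smaller size so the demo runs in a blink
```

Note that this is precisely a randomized search algorithm in the sense of this talk: run it twice on the same input size and you will almost surely get two different primes.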
What's another problem that I will talk about in this talk? You're given a graph, and you're asked: find a perfect matching in the graph. So what do I mean by a perfect matching? A graph: you've got nodes, and you've got edges between the nodes; I'll show a picture in a couple of slides. A matching pairs nodes together: if two nodes have an edge between them, you can match them; if they don't have an edge, you can't. A perfect matching makes sure that every single node in the graph is matched to somebody. People usually think about it, in the States at least, in terms of medical schools and applications to medical schools, where you want to match a student to a medical school; but a school takes more than a single student, so a better example is maybe a room full of people, and you want to match them up so that everybody has one partner. It could be boys and girls, girls and girls, boys and boys, whatever you like. In any case, in a perfect matching everybody gets matched, and everybody gets a unique partner. This matching problem is another one of these fundamental problems; as I told you, it was the problem that started Edmonds on his quest for polynomial-time algorithms. And, for example, we know how to solve it with randomization even in parallel, extremely quickly, and parallel is another notion I'll get to later, but we don't know how to do this deterministically. So the question of whether you need randomness extends from efficient algorithms to efficient parallel algorithms. And there are many of these outstanding problems, for which we know a randomized solution but not a deterministic one.

All right. So I claim that there are randomized search algorithms everywhere; we use them all the time. Obviously we use randomness in statistics, where without randomness there is no field. But what I mean is in the context of computer science, where there's an input and there's an output. There are sequential and parallel algorithms, as I mentioned already, sublinear algorithms, distributed algorithms, streaming algorithms, space-bounded algorithms, routing, load balancing, optimization, cryptography, learning, game theory. Randomness is crucial in each one of these fields; one would have to go through each of them to show you examples, and we don't have time for that. And the reason people usually go from deterministic to randomized comes down to three things. First, sometimes it's simpler. Designing a randomized algorithm might be extremely simple where the deterministic one is overly complicated; if you're thinking about writing a program, the deterministic one might be so complicated that you would be afraid of mistakes in the code, where the randomized one is simple. Second, randomized algorithms are often faster, sometimes much faster, than deterministic ones. And third, sometimes you can even prove that a problem is impossible to solve unless you have randomization; deterministic solutions don't exist. I won't show an example, but I'll say it in words. The well-known problem is this: suppose you have a network of people who can speak to each other, and they want to agree. What does it mean, to agree? It's usually called the consensus problem. Each of you has a vote, you either want to vote Republican or Democrat, some kind of zero-one vote, and we want the following property: if all of you want Republican, Republican gets elected.
If all of you want Democrat, Democrat gets elected. They call this the Byzantine generals problem: all the Byzantine generals have to fire at once. But one of you is faulty. The faulty one tries to confuse; he doesn't follow the protocol. He might say to one person, I'm for Democrat, and to another, I'm for Republican. Is there any way for the rest to agree? It turns out there is no deterministic way to agree if there's no common clock. If there is a common clock, there are things you can do; but if there's no clock and we're just sending each other messages, hoping that at some point we converge, you can show that it's impossible to converge. However, if you use randomness, you can converge very quickly. So there's something about the ability to toss a coin that breaks ties and enables such a large distributed system to reach a decision. Anyway, this is all about the nature of randomness: sometimes it helps because it's simpler or faster, and sometimes you have no choice.

And randomness is also everywhere in cryptography, by the way, which is my field, which is probably why I'm so interested in randomization. There are a lot of problems we need to solve to implement cryptosystems, like finding primes, the problem I mentioned before. We need primes in order to generate those big numbers n from yesterday; remember, n was a product of two primes, p times q. How do you generate primes? How do you find generators of cyclic groups, quadratic non-residues? Again, this is for those of you familiar with these objects; otherwise the slide is really just a step on the way: finding irreducible polynomials, square roots, q-th roots, points on elliptic curves. These are basic steps in a lot of cryptographic constructions, and we only know randomized algorithms for them. We don't know how to do them deterministically; these are open questions.

Okay, so the usual stance people take is to say: we don't like randomness. Why don't we like randomness? It's a very good question. I'm interested these days in the topic of fairness of algorithms, and we had a big meeting with people from social science and law about fairness. They talk about cases like this: there are two students with perfectly identical folders, let's say both applying to Harvard or to Bern, and you have to decide which student to accept. If they're identical, then to me as a computer scientist the right thing to do is to toss a coin and choose one of them; that is the fair answer. But to anybody else, at least the people I talk to in social science, this seems like a terrible idea. This is not fair! Why should a coin toss make the decision? You see my point. Anyway, okay. So in any case, it would be better if there were no coins, apparently, according to a lot of people. And von Neumann specifically has a quote: anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin. So the question is, where are these coins coming from? Do we have them? Do we not have them? That's a difficult physics question, and so forth. So the traditional goal is: let's de-randomize. Let's replace probabilistic algorithms by deterministic ones. They might be more complicated, they might take more time, but at least we know we don't have to worry about whether coins exist, or whether the coins we are using are biased. But for this lecture I actually want to ask a different question, and that is: I like randomized algorithms.
I'm going to assume there are coins I can use, but I want to improve randomized, or probabilistic, algorithms so that they give the same guarantees as deterministic algorithms. So what do I mean by the same guarantees? What are the guarantees a probabilistic algorithm gives you, in general, not for one particular problem? That's the goal. So, what are those guarantees? Let's put it in a table and compare deterministic algorithms with probabilistic ones: what is it that we don't get when we have a randomized algorithm? We look at the runtime and we look at the output. And there are two different types of randomized algorithms, and people give them names: there are Monte Carlo algorithms and there are Las Vegas algorithms. Both places have casinos, so both kinds use coins, but they have different properties. For the deterministic algorithm, we know the runtime is going to be fast, because we're looking at polynomial time, and it's always going to be correct. That's the guarantee when we say we've designed an algorithm for a problem: it's a correct algorithm, and it runs in a certain amount of time, we claim. If it's randomized, again the runtime is going to be efficient; that's essentially true for both types (for Las Vegas it's the expected runtime, but never mind), both of them are fast. However, Monte Carlo may make an error, and it will tell you the probability that there's a mistake, whereas Las Vegas will always be correct, or else it says "I don't know": it outputs bot. So these are the types of randomized algorithms we have, and the guarantees they give on running time and correctness look pretty good: if the probability of error is really small, it's never going to happen; if the probability of saying "I don't know" is really small, it's never going to happen. So what's the difference? It seems like this gives you everything you want from an algorithm anyway.

The difference is the following. When you run a deterministic algorithm, not only is it efficient, running in polynomial time, but you also have the guarantee that on the same input it will always produce the same output. You run it again: same output. You run it again: same output. Now, I know that a lot of people in this audience are scientists, physicists, chemists, biologists and so forth, well, for the biologists I don't know, I haven't met them yesterday, but in any case, the idea of running an experiment and always getting the same answer is pretty fundamental. In fact, I would say that's how you verify that experiments are correct: they are predictable, reproducible. So a deterministic algorithm: same input, same output. A randomized algorithm, whether of the Las Vegas type or the Monte Carlo type: every time you run it, even on the same input, you may get a different output. This is the nature of the algorithm, because, remember, it depends not only on the input but also on the coin tosses, and if the coin tosses are different, the output can be completely different. So what I'd like to say is: I don't like this. I'd like to have a new type of randomized algorithm that runs efficiently, doesn't make mistakes, except with some vanishingly small probability of error, and always produces the same output on the same input. Okay, good. Why is this important?
Well, again, in science it's obvious why it's important, but from the point of view of an algorithms person, the reasons are these. For debugging: suppose you have a bug in your probabilistic program. Every time you run it, in order to debug it, you'd like to know how it behaves if you change a line in the program; but if every run gives you a different answer anyway, it is hard to debug. For distributed algorithms: say you have an algorithm where many processors are working together to solve a problem. Having a deterministic solution, with everybody coordinating toward the same solution, is very important. For cryptography: think of my slide listing all these steps in cryptography we need, generating primes, generating points on elliptic curves, and so forth; often there's a system somebody put in place, with parameters, like the prime, or the generator, or the irreducible polynomial, that everybody is using. And who generates these parameters? Often it's an authority, a government, an agency, and you would like to make sure they haven't introduced what we call a trapdoor, a side door that lets them break the system or run some other method of surveillance. So this is a problem we always talk about: how do you generate system-wide parameters if you distrust whoever is generating them? If you're using a probabilistic algorithm, it in some sense lends itself to choosing one output over another, because one output might be more to your liking, say one with a trapdoor. So we would like unique outputs. There are lots of reasons to want a unique output.

And that gives me a new definition. This is the work with Eran Gat from Weizmann. Let's define randomized algorithms in a new way: you take a randomized algorithm and you restrict it further. It's still a probabilistic algorithm, or randomized, I keep alternating between the two terms, but we'll say it's pseudo-deterministic, not deterministic, pseudo-deterministic, it still uses coins, if the following is true. For every input, again, the runtime is fast, and the correctness requirement is that it solves the search problem with good probability; usually we say greater than two thirds. Two thirds doesn't sound like very much, but you repeat and take a majority to increase the probability of correctness. The new thing is that there is a canonical answer: for this search problem, there exists some answer that keeps coming up. In the notation, x is the input and r is the randomness the algorithm uses, and the requirement is that the probability, over the randomness, of getting this canonical output is more than, let's say, two thirds. These two thirds are arbitrary; any probability bounded away from one half will do. The point is, there is a majority answer, an answer that comes up more often than not, and that means I can amplify these probabilities. So another way to think about it, if you don't like the two thirds: these algorithms give an answer which is correct with extremely high probability, and furthermore the answer is unique; there's some small chance that a different answer comes up, but it's exponentially small. Formally, the condition reads as follows.
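In symbols, and this is my transcription of the definition rather than the slide itself: a probabilistic polynomial-time algorithm A solves the search problem R pseudo-deterministically if

```latex
\forall x \;\;\exists\, y_x \text{ with } (x, y_x) \in R \;\text{ such that }\;
\Pr_r\big[\,A(x;\, r) = y_x\,\big] \;\ge\; \tfrac{2}{3}.
```

Repeating the algorithm and outputting the most frequent answer then makes the probability of ever seeing anything other than the canonical y_x exponentially small.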
So why do I call this pseudo-deterministic? Because, let's say there are two different algorithms, and you are the judge, and I'm behind a curtain, either using a deterministic algorithm or using a pseudo-deterministic one of the kind I defined here, which uses coins. You give me an input, I give you an output. You give me another input, or even the same input again, and I give you an output. I want you to be unable to tell whether I'm using the deterministic one or the pseudo-deterministic one. If I were using just a regular randomized algorithm, after a few trials on the same input I would give you different outputs. But with the pseudo-deterministic one, I always give you the same answer. So it's indistinguishable from the deterministic algorithm in terms of its input-output behavior.

Questions? Yes. Ah, you want to know how the primality test works, and also the deterministic one? Okay, the deterministic one I probably cannot give you on the fly. But for the probabilistic one, the idea is the following. You choose a number x less than, let's call it p, the number you're trying to test for primality. You choose x at random such that it has no common factor with p; of course, if it shares a factor with p, you know p is not prime. So you choose x at random from one to p, and you raise it to the power (p-1)/2. And it turns out that if your number p is prime, then the result should be either 1 or -1 modulo p; and if not, it may well be something other than 1 and -1. So you keep doing this, and, I'm simplifying, it's not completely correct, but essentially correct, as soon as you find a randomly chosen x whose (p-1)/2 power is different from 1 and -1, you know p is composite, because if p were prime the result would always be 1 or -1. The reason is this: if p is prime, there are p-1 elements smaller than p that are relatively prime to it, and p-1 is the order of the multiplicative group modulo p; raising x to the order of the group gives you back 1, so x to the (p-1)/2 is a square root of 1, and modulo a prime the only square roots of 1 are 1 and -1. This is really an aside, we can talk about it later, but it's a simple calculation, this exponentiation: a simple equation you test to see whether you get 1 or -1 or something else, and if you get something else, you know p is not prime, provided you choose these x's at random. In code, the test looks roughly as follows.
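Here is a minimal sketch of the test exactly as just described, with the same simplification Shafi flags: the full Solovay-Strassen test also compares the value of the exponentiation against the Jacobi symbol, which this stripped-down version omits.

```python
import math, random

def euler_compositeness_test(p, trials=50):
    """The simplified test described above: if p is prime, then for every
    x coprime to p, x**((p-1)//2) mod p is 1 or p-1 (Euler's criterion).
    Any other value is hard proof of compositeness.  As noted in the
    talk, this is a simplification of the full Solovay-Strassen test."""
    if p < 4:
        return "prime" if p in (2, 3) else "composite"
    if p % 2 == 0:
        return "composite"
    for _ in range(trials):
        x = random.randrange(2, p)
        if math.gcd(x, p) != 1:
            return "composite"          # x shares a factor with p
        if pow(x, (p - 1) // 2, p) not in (1, p - 1):
            return "composite"          # evidence found: p cannot be prime
    return "probably prime"             # searched, found no evidence

print(euler_compositeness_test(561))     # Carmichael number: caught as composite
                                         # with overwhelming probability
print(euler_compositeness_test(104729))  # the 10000th prime: "probably prime"
```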
Okay. In any case, in my algorithm I'm not looking at decision problems, I'm looking at search problems. I want to find not just a yes or no, but something more general; yes-no questions are included as a special case, but this is much more interesting for general problems. And the issue now is that I require a canonical answer. So, do such algorithms exist? Obviously they exist, because a deterministic algorithm is a special case of a randomized algorithm, and deterministic algorithms have unique answers. The question is whether they go beyond deterministic algorithms: is there some problem that requires randomness and can still guarantee unique, canonical solutions? This has been studied in several contexts, sequential and parallel, I'll tell you a little about it, and pseudo-deterministic algorithms have been discovered for natural problems in number theory, algebra, and graph algorithms. On the other hand, as usual when we ask whether a notion extends deterministic algorithms, we ask two things. On one hand, can you show examples where we don't know how to do it deterministically? And on the other hand, can you actually show separations? Can you really show that there are, inherently, things that can be done deterministically, and things that require randomness but can still give you canonical answers? So this is a more general study, and it's an interesting question how you come up with these algorithms for problems where it's not obvious; I'll show you some examples.

Now, a little parenthesis here. I'm teaching one more notion, which will be useful for explaining some separation results, and that is what we these days call a sublinear algorithm. What's a sublinear algorithm? It's just like the program from before, it gets an input and gives an output, but with one extra property that makes it sublinear: the algorithm is not allowed to read the entire input x. It's only allowed to look at a sublinear number of bits of the input. For example, if you think about the cosmos or something, you can't look at everything; you can sample it. A beautiful example is the dinosaurs: you don't find a full dinosaur, you find some bones, and yet people build theories that this dinosaur was a meat eater or a herbivore. In other words, just from examining parts of the input, you try to give an answer about all of it. Those are sublinear algorithms, and they are a huge field at this point. These algorithms are always randomized: randomness is extremely important, because you randomly choose which bits of the input to look at. So one question is, can you make these sublinear algorithms pseudo-deterministic, so that they always give the same answer even though they're using randomness? In general, of course, where you look in the input determines the solution you give, so it seems like there's no hope for a canonical solution. It turns out that's not true: sometimes you can get a canonical solution if you do it in a clever way. But what you can show, and this is the part that matters for this point of the talk, is that there is a separation. There are search problems where, if you are allowed randomness without the restriction of a unique solution, you can solve them with a constant number of queries into the input, just looking at a constant number of places; if you are allowed randomness but require pseudo-determinism, a canonical output, you need an intermediate number of queries, sublinear, not n, but still not constant; and if you want to solve the problem deterministically, with 100% correctness, you need to look at the entire input. In other words, there's a separation between these three types of algorithms: randomized, randomized with canonical answers, which is this talk's notion of pseudo-deterministic, and deterministic. So at least within one field, sublinear algorithms, this is a genuinely different beast: requiring canonical answers is a strong requirement that separates it from randomized, but it can still do better than deterministic. Okay. So how do you design such an algorithm? If you really want canonical answers, how do you get them? There are roughly two methods.
One method is what I call canonicalize. You take a regular randomized algorithm, one that doesn't promise unique solutions, you somehow find one solution, and then you arrange things so that, from all the solutions you might find depending on your randomness, a small search takes you to a unique one. Each problem is different, but this is the blueprint: first you find a random solution, then you reduce it to a canonical one. The more interesting path, and now come things which are a bit more technical, is the realization that you can reduce every such search problem to a decision question. Remember, decision is a yes-no question; search may output a longer answer. It turns out that if I give you a general search problem, and you can show how to reduce it to decision questions that can be solved by randomized algorithms, then your search problem can be solved by a canonical, pseudo-deterministic, randomized algorithm. So this is a characterization of these new pseudo-deterministic algorithms: they look essentially like deterministic procedures that make calls to a yes-no question, which can itself be decided by a randomized algorithm. In those boxes I had, with input coming in and output coming out, suppose you had a deterministic box, except that there was one thing it didn't know and had to ask, whether something is true or false, and there was a randomized algorithm to decide that: then you can make the entire thing find canonical solutions. The idea really is that you find the canonical solution by finding its value bit by bit: the first bit, the second bit, and so forth. Anyway, it's a characterization. What it means is this. I told you that P versus BPP, whether they are equal, is an open question for decision problems. You can ask the same question about the search versions. And essentially this characterization tells you that if somebody were to resolve the decision question, and, you know, yesterday somebody asked me about P versus NP, where people believe P is different from NP; here, even though it's been open for thirty-some years, most people would bet that randomized and deterministic algorithms are equivalent for decision questions, then the same would follow for search: every pseudo-deterministic search algorithm could be made deterministic. So there is a very tight relationship between the de-randomization question for decision problems and the de-randomization question for search problems, and it goes through this concept of pseudo-deterministic search. A small sketch of the bit-by-bit idea follows.
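Here is a minimal sketch of that bit-by-bit reduction. The decision oracle `exists_solution_with_prefix` is hypothetical: in the real characterization it would be a BPP algorithm whose error has already been amplified down to negligible; in the demo below, a toy deterministic oracle stands in for it. Granting correct oracle answers, the output is the lexicographically smallest solution, the same one on every run, which is exactly the pseudo-deterministic guarantee.

```python
def pseudo_deterministic_search(x, n_bits, exists_solution_with_prefix):
    """Recover the lexicographically smallest solution of length n_bits,
    one bit at a time, using only yes/no queries.  Each query may be
    answered by a randomized (BPP-type) decision algorithm; as long as
    every answer is correct, which holds with overwhelming probability
    after amplification, every run returns the same string."""
    if not exists_solution_with_prefix(x, ""):
        return None                              # no solution at all
    prefix = ""
    for _ in range(n_bits):
        # prefer the 0-extension: this is what makes the answer lex-smallest
        if exists_solution_with_prefix(x, prefix + "0"):
            prefix += "0"
        else:
            prefix += "1"
    return prefix

# Toy demonstration: the "hard" search problem is just membership here.
solutions = {"0110", "1010", "0111"}
oracle = lambda x, prefix: any(s.startswith(prefix) for s in solutions)
print(pseudo_deterministic_search(None, 4, oracle))   # always prints 0110
```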
All right, let's do examples. It's Tuesday, and it's late; I think you understand the concept, so let's do some examples with some pictures. I'm going to do an example about matching, in the parallel setting. When I say parallel, again, there's another class of problems we study in theoretical computer science. The idea is that there are a lot of processors, each of which can send messages to the others, and you're hoping that by having many processors rather than one, you cut the time down to logarithmic, or polylogarithmic: where before it took n-squared steps, you would now hope for something like log-squared steps. You really want to be much faster, because you have all these computers at your disposal, but it's not so clear how: which subproblem do you assign to each processor? How do you coordinate to take advantage of having lots of them? So there's a class called NC: those problems that can be solved quickly, in polylogarithmic parallel time. RNC is those problems for which you need coin tosses to solve in polylogarithmic parallel time. The real issue with these randomized parallel algorithms is that they require a lot of coordination, and that's what I'll talk about from here on.

Suppose you have a picture, nice, finally a picture. So here's the picture. The picture is a graph. These are the boys, say, and these are the girls; or these are the medical schools and the students. But let's say we have boys and girls, and this is an edge between them: there's an edge between this boy and this girl, and this boy and this girl, but these two don't have an edge between them. An edge means they can be matched. And the matching question here is: how do you find a way to match boys to girls so that every boy is matched to a single girl? Here, for example, the red edges are the matching: we chose a subset of the edges and matched along them. This is a problem which is not so simple to solve; it's one of these problems that has been studied, not to death, but with very interesting algorithms. And I've restricted attention here to bipartite graphs: everyone is either a boy or a girl, and there are edges only between boys and girls, not among the boys or among the girls. In a general graph, any pair of circles could have a line between them. Okay. So for this problem you can ask the decision question: is there a way to match everybody, every boy to a single girl, with nobody left unmatched? For some graphs there isn't: somehow there are conflicts that make it impossible to match everyone. And then there is the search question: find such a matching. Does one exist, and find one, and these are fundamentally different questions.

And it turns out you can state all this as an algebra problem. Here's the graph I had before, maybe slightly different. Let's give names to the boys: one, two, three, and to the girls: the first, the second, and the third girl. Now I can look at a matrix, defined like this: the entries are either zeros or variables. There'll be a zero if there's no edge, so between boy two and girl two there's no edge, and that's why the two-two entry is zero. And if there is an edge, say between boy three and girl three, I put a variable there, x33. It's just a symbolic matrix, which I can define; written out, it looks as follows.
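This is the standard Edmonds matrix of a bipartite graph; the display below is mine, not the slide:

```latex
M_{ij} \;=\; \begin{cases} x_{ij} & \text{if boy } i \text{ and girl } j \text{ are joined by an edge},\\ 0 & \text{otherwise}, \end{cases}
\qquad
\det(M) \;=\; \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} M_{i,\sigma(i)} .
```

Each permutation σ is a candidate pairing of boy i with girl σ(i), and its term in the determinant survives exactly when every edge it needs is present in the graph.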
And it turns out that if you look at this matrix and write out its symbolic determinant, then: if there is no perfect matching in this graph, the determinant is identically zero; every single term has some zero multiplied into it. But if there is a way to match everybody successfully, then there is at least one term that doesn't go away, that isn't multiplied by zero. In particular, you see here: the matching where three goes to two, two goes to three, and one goes to one gives a term that doesn't disappear, so there is a perfect matching in this graph. In fact, the number of perfect matchings is exactly the number of nonzero terms. So why did I do that? Well, first of all, it's always nice in mathematics to see equivalences between objects of different kinds, even when it's this simple, especially when it's this simple. And this tells you that deciding whether there's a perfect matching in this graph is equivalent to telling whether this polynomial is identically zero or not. Identically zero: no perfect matching. Not identically zero: there is a perfect matching. So how do you tell whether it's identically zero? You could try all possible assignments, but good luck. Instead, you choose random values for the variables and see whether the whole thing cancels out and gives you zero or not. You plug in and check. And indeed, that's the procedure: plug random values in for the x's and check whether the determinant is zero. In fact, you can show that by choosing these x's in the right range, this is a very quick procedure, and not only that, it can be done in parallel: computing a determinant parallelizes well, so many processors can carry out the computation distributively and very quickly see whether the determinant evaluates to zero or not. So the decision problem has a fast randomized parallel algorithm.

Great, but what about finding the matching? All I know now is that one exists; I don't know which term witnesses it. So here's one observation, which is nice in general: if there were a unique perfect matching, if only one term were nonzero, then you could use this decision procedure, testing whether a matching exists, to actually find one. Why is that true? Why is there a reduction here from finding to testing? The idea is essentially this: for each edge, in parallel, decide quickly and randomly, this is randomized parallel, whether there is still a perfect matching when you take that edge away. If you take the edge away and there's still a matching, the edge wasn't necessary. Then take another edge: is this one necessary? If I remove it, is there still a matching? Yes: it's not necessary. If I remove an edge and there is no longer a matching, I know this edge is necessary. So, if the matching was unique, very quickly I'm left with exactly the edges that participate in the matching, and I'm done: an edge is disposable if, when I take it away, there is still a matching. This only works, however, if there's a unique matching. Because if there's more than one, it could be that I remove things and end up with an inconsistent set of edges; I think that's clear. Both steps are sketched in code below.
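Here is a minimal sketch of both steps: the randomized decision procedure (substitute random field values into the Edmonds matrix and check the determinant, done here over a prime field so arithmetic stays exact), and the decision-to-search trick that is correct only when the perfect matching is unique. The structure follows the description above; the parameter choices and names are mine.

```python
import random

P = (1 << 61) - 1          # a large prime; all arithmetic is done mod P

def det_mod_p(mat):
    """Determinant by Gaussian elimination over the field Z_P."""
    m = [row[:] for row in mat]
    n, det = len(m), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c]), None)
        if pivot is None:
            return 0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det % P                       # row swap flips the sign
        det = det * m[c][c] % P
        inv = pow(m[c][c], P - 2, P)             # inverse via Fermat's little theorem
        for r in range(c + 1, n):
            f = m[r][c] * inv % P
            for k in range(c, n):
                m[r][k] = (m[r][k] - f * m[c][k]) % P
    return det

def has_perfect_matching(n, edges, trials=10):
    """Randomized decision: plug random values into the Edmonds matrix.
    A nonzero determinant proves a perfect matching exists; repeated
    zeros mean that, with high probability, the symbolic determinant is
    identically zero and no perfect matching exists (Schwartz-Zippel)."""
    for _ in range(trials):
        m = [[0] * n for _ in range(n)]
        for (i, j) in edges:
            m[i][j] = random.randrange(1, P)
        if det_mod_p(m):
            return True
    return False

def find_matching_if_unique(n, edges):
    """Decision-to-search: keep exactly the edges whose removal kills
    the matching.  Correct only when the perfect matching is unique."""
    return [e for e in edges
            if not has_perfect_matching(n, [f for f in edges if f != e])]

edges = [(0, 0), (1, 2), (2, 1)]          # a graph with exactly one perfect matching
print(has_perfect_matching(3, edges))     # True
print(find_matching_if_unique(3, edges))  # recovers [(0, 0), (1, 2), (2, 1)]
```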
So, are we lucky? Is it true that there's always a unique perfect matching? The answer is no. Not only is there not always a unique one; there can really be an exponential number of matchings. So this reduction from decision to search doesn't work as is. Now remember, this whole talk is about finding unique solutions. So where am I heading? I want an algorithm that, given a graph, finds a matching in parallel, and always finds the same one: same graph, same matching. But first I have to tell you that I do know how to find some matching, if I'm allowed randomness and don't require uniqueness. This came from a very beautiful paper by Karp, Upfal and Wigderson, and the version I'm showing is by Mulmuley, Vazirani and Vazirani. It's a beautiful idea which I think is useful elsewhere; probably anybody can find a use for this theorem, even if only to give your high-school student a riddle. It goes as follows. We said: if there were a unique matching, I could find it, because I can tell whether one exists and do this trick of removing edges. So here is what they say. We have a graph; it has an exponential number of matchings. How are we going to isolate one matching, the one we should all be thinking about? The idea is: just put numbers on the edges at random. These are what we call weights. Pick random weights in some interval; so where before there were just lines, now the cost of this line is six, the cost of this line is ten, the cost of this line is four. I put numbers on the edges at random, from some range. And now, for a matching, I can talk about the weight of the matching, which is just the sum of the weights of its edges: I can sum up six plus six plus three plus three, plus four, I guess, I don't know why that four is there, anyway, I sum up the weights of the edges in the matching. And say I choose, among all matchings, the one of minimum weight. Then you can prove that, with very high probability over the choice of these weights, there exists a unique minimum-weight matching. It could happen that two matchings have exactly the same weight and it's the smallest, but one can show that this is unlikely. So if you choose random weights, with very high probability there's a unique minimum-weight matching. Why is that good? Because, remember, we had a reduction from decision to finding in case the answer is unique. So what they essentially do is reduce the problem of finding a perfect matching to finding the matching of minimum weight; and since that one is unique, the reduction from decision to search works. They call it the isolation lemma: how you isolate one solution among many, using this randomized trick, which is a general trick for all kinds of structures, here applied to graphs. And this is the lemma: choose the weight of each edge at random from this universe, and with high probability, if the graph has at least one perfect matching, there is a unique minimum-weight perfect matching. And so that's an algorithm to find one: choose a random weight assignment, find the unique minimum-weight matching. Stated precisely, the lemma is below.
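This is the Mulmuley-Vazirani-Vazirani isolation lemma; the range {1, ..., 2m} below is one standard choice of universe:

```latex
\text{Let } G \text{ have } m \text{ edges and at least one perfect matching. If each weight } w(e)
\text{ is drawn independently, uniformly from } \{1,\dots,2m\}, \text{ then }
\Pr_w\big[\,\text{the minimum-weight perfect matching of } G \text{ is unique}\,\big] \;\ge\; \tfrac{1}{2}.
```

Repeating with fresh weights, or enlarging the range, pushes the failure probability as low as you like.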
The thing is, does this solve my problem of a randomized algorithm with a unique answer? No. Because a randomized algorithm with a unique answer, the pseudo-deterministic algorithm I'm after in this talk, means: same input, always the same output; same graph, always the same matching. But notice that this procedure has the step where you choose random weights, and every time the weights are different, you isolate a different minimum-weight matching. So it doesn't actually solve our problem: I would like, regardless of these random weights, to always get the same output.

So the idea is the following. For many years, people worked on trying to de-randomize this: instead of coming up with random weights, to find some clever algebraic way to assign weights that isolates a unique minimum-weight matching. They were able to do it for special types of graphs, but not, say, for general bipartite graphs. Then there was a paper in 2016 which showed how to de-randomize it, but it used more than a polynomial number of processors, so it wasn't as fast as what we want. What we did in this work with Ofer Grossman was not to de-randomize, but to unify: somehow, out of all these matchings, find a unique one, every time, regardless of what randomness was used. And the idea is roughly this; I'm giving it here because it hints at how one might do this for other problems. First, we construct some weight assignment deterministically: where before we chose it at random, we now have an algebraic way to assign weights. I won't be able to prove that there is a unique minimum-weight matching with respect to these weights. But what I will be able to prove is the following property. Suppose I say to myself: okay, maybe there's more than one minimum-weight matching; let's take the union of all the edges that participate in some minimum-weight matching. Some edges participate, some don't, so the graph becomes smaller; and with respect to the weights we assigned deterministically, I can prove this union is quite small. And essentially the only randomized step is computing this union of minimum-weight matchings. The point, from the 20,000-foot view, is that this is a new kind of algorithm. It is still randomized; there is a randomized step. But even though you're using randomness to find the union of all these minimum-weight perfect matchings, and there may be lots of them, their union is a unique object. There's only one union, and you're building it using randomization. So even though you use randomness, the structures you build, these unions, are unique; and you do this again and again, the union becomes smaller and smaller, until you're left with a single minimum-weight matching, and that's the one you output. All right, too much detail; we'll skip. This is just the weight assignment, very simple: essentially you assign names to the vertices and you take powers. A high-level sketch of the whole loop is below.
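And here is a high-level sketch of the shape of that loop, not the actual algorithm of the paper: `toy_weights` is a stand-in for the real algebraic weight assignment, and the union step, done here by brute force, is exactly the part the paper shows how to carry out in randomized NC. What the sketch tries to convey is that the union is a unique object, so every run shrinks the graph the same way.

```python
from itertools import permutations

def toy_weights(n):
    """Stand-in for the paper's algebraic weight assignment ("names and
    powers"); this particular formula is mine, chosen only so the demo
    runs, and carries none of the paper's guarantees."""
    return lambda i, j: (n * i + j) ** 2

def union_of_min_weight_matchings(n, edges, w):
    """Brute force over all perfect matchings (exponential; the paper's
    point is to do this step in randomized parallel time).  The union
    itself is unique: any correct run returns the same edge set."""
    pms = [set(zip(range(n), p)) for p in permutations(range(n))
           if set(zip(range(n), p)) <= edges]
    if not pms:
        return set()
    weight = lambda m: sum(w(i, j) for (i, j) in m)
    best = min(weight(m) for m in pms)
    return set().union(*[m for m in pms if weight(m) == best])

def canonical_matching(n, edges):
    """The shrinking loop described above: keep only edges lying on some
    minimum-weight perfect matching, and repeat until one matching is
    left.  (In the real algorithm the weights are refined between
    rounds, which is what guarantees the union keeps shrinking.)"""
    w = toy_weights(n)
    while True:
        u = union_of_min_weight_matchings(n, edges, w)
        if len(u) <= n or u == edges:   # isolated a single matching (or toy is stuck)
            return u
        edges = u

edges = {(0, 0), (0, 1), (1, 0), (1, 1)}   # two perfect matchings
print(canonical_matching(2, edges))        # always the same one: {(0, 1), (1, 0)}
```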
There have been other studies of this concept of a randomized algorithm with a unique answer. For example, you can look at algorithms whose restriction is that they use only a small amount of space. These algorithms are studied quite extensively, in lots of contexts where you don't have a lot of space, and it turns out that randomization is pretty important for them. So you can again ask: even though the algorithm is randomized, can you guarantee a unique output? And here the answer is interesting. We don't know how to guarantee a unique output, but we can guarantee a small list of possible outputs. So instead of outputting a unique solution, the algorithm says: here is a bunch of candidates, but there are few of them; regardless of what randomness you use, it always outputs a short list, and all these lists have one answer in common.

What else would I like to say? What about that prime finding problem that I mentioned in the beginning? Wouldn't it be nice if I give you 1000 bits, 2000 bits, and you always give me the same prime, and you can use randomness as far as I'm concerned? Can you do that, give me always the same one using randomness? That is still an open problem. So for anyone here interested in number theory, that's a beautiful, very clean problem: how do you find a unique prime, always the same one, using randomness? We don't know how to do it. There is a recent theorem by Oliveira and Santhanam from Oxford showing that there is a subexponential pseudo-deterministic algorithm: it takes more than polynomial time, but less than what the deterministic procedure would take. So there is something one can do, but you can't go all the way. (The naive attempt, and where it falls short, is sketched after this passage.)

You can also ask about approximation. Suppose you want to approximate some problem, and you would like the same approximate solution to always come up, or at least a solution that achieves the same approximation factor. So again there is uniqueness, but not of the exact same solution; it's just that some property of the solution is guaranteed, like the approximation factor. And that has also been studied.
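On the prime question, here is the naive pseudo-deterministic attempt, an illustrative Python sketch of my own (Miller-Rabin is the standard randomized primality test). The randomness lives only inside the primality test, so with high probability every run outputs the same canonical prime; the catch is that the running time depends on the gaps between consecutive primes, so it is not known to be polynomial, which is exactly why the problem is still open.

```python
import random

def miller_rabin(n, rounds=40):
    """Randomized primality test; errs with probability <= 4**(-rounds)."""
    if n in (2, 3):
        return True
    if n < 2 or n % 2 == 0:
        return False
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def first_prime(bits):
    """Scan upward from 2**(bits-1): randomness is used only inside the
    primality test, so with high probability every run returns the same
    canonical prime; the worst-case time depends on prime gaps."""
    n = 2 ** (bits - 1)
    while not miller_rabin(n):
        n += 1
    return n

print(first_prime(64))   # same output on every run, with high probability
```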
Now I'm going to spend five more minutes and then I'm done, okay? So far I talked about finding a unique solution; now I'm switching gears. We have a new type of algorithm, but what about just verifying that a solution is the unique one? Suppose it's too hard to find the unique solution using randomness, but say that I am all powerful: I can enumerate all possible randomness, see all the possible answers, and take, let's say, the lexicographically smallest one. I know that one is unique, because it's the lexicographically smallest, but it takes me a long time to find it. Can I prove to you quickly, can you verify quickly, that this is the lexicographically first one? You don't have time to look at all the possible answers, so how can I prove to you that something is unique? This is in the line of what I talked about yesterday, having a prover and a verifier; here I want to give a solution and prove that the solution is the unique one. This is joint work with Ofer Grossman and Dhiraj Holden, both students at MIT. So, just to illustrate, and to have some pictures for comic relief at the end.

The prover, for example: let's say there is a graph, and she says to the verifier (she is all powerful, we don't care about her; we only care about verifying quickly): I can color the vertices of this graph with red, white and green so that no edge has the same color at both ends. She gives him the coloring, he checks it, and everything is fine. At the end he is assured (we're not trying to do zero knowledge or anything here) that the graph is three-colorable, and he knows the coloring. But can he be assured that this coloring is canonical in some sense? As it stands, he would accept any valid coloring. Can you somehow guarantee that this is the coloring with, say, the fewest reds, or whatever property you can define, so that you always get that canonical coloring? That's the question we are asking.

Another problem: say you have these two graphs, two molecules, whatever, and they are the same, isomorphic to each other; that means essentially it's the same graph drawn differently. One of the big problems in computer science is: given these two graphs, can you tell whether they are really isomorphic to each other? Often there are lots of isomorphisms between such graphs (not in this one, but often). And certainly, if she is really hardworking, she can find a way to map the vertices, say three to two, two to one, five to four, one to five, and you can verify that this is an isomorphism. (Mechanically checking such a certificate is easy; a small sketch follows this passage.) But is it unique? Is it the lexicographically smallest one? Not clear; there are lots of isomorphisms.

So we want to define a new kind of proof system; let's just do it with a picture. The two of them can ask questions back and forth, and at the end the verifier outputs a solution or rejects. The property is: if on a given input there is a solution (forget about the notation here), then the verifier will output one, and furthermore it will be a canonical one. So there is a unique solution that comes up with high probability; it is very unlikely that you can convince the verifier to output two different solutions. There is really only one he will accept. In the case of isomorphism, it might be that you can convince him not only that there is an isomorphism, but that it is the lexicographically smallest one. And we show essentially how to do that.

Why is this important? If one wants to stick to applications, like for the Congress: if I am the government, or the National Institute of Standards, and I publish some cryptographic parameters, you may suspect that I have a trapdoor. If I can prove to you that no matter what procedure, what randomness I used, I always give you the same parameters, the same prime, then you are happy. So this is again the ability to convince someone that what you've given them is a canonical answer, rather than an arbitrary answer that may have been chosen maliciously. And that, I think, is it; though there are still lots of open problems here, about finding unique shortest vectors, and the prime question.
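As an aside, here is what the easy half of the verifier's job, mechanically checking such certificates, amounts to (a two-function sketch of my own, not code from the work; the substance of the proof system, convincing the verifier that a certificate is the canonical one, needs the interaction and is not captured here).

```python
def check_coloring(edges, color):
    """The easy half: verify that a claimed 3-coloring is proper."""
    return (len(set(color.values())) <= 3
            and all(color[u] != color[v] for u, v in edges))

def check_isomorphism(edges1, edges2, pi):
    """The easy half: verify that pi maps graph 1 onto graph 2 edge-for-edge."""
    to_set = lambda es: {frozenset(e) for e in es}
    return to_set((pi[u], pi[v]) for u, v in edges1) == to_set(edges2)

# toy usage: a 3-colored triangle, and a triangle relabeled by pi
print(check_coloring([(1, 2), (2, 3), (3, 1)],
                     {1: "red", 2: "white", 3: "green"}))   # True
print(check_isomorphism([(1, 2), (2, 3), (3, 1)],
                        [(5, 4), (4, 6), (6, 5)],
                        {1: 5, 2: 4, 3: 6}))                # True
```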
Another question that people have looked at, or that I think they should look at at least (I have looked at it), is whether you can use this to define some sort of stability for learning algorithms. You all know that machine learning is a big topic, and there is the question of how you come up with a stable learning algorithm, and how you even define stability. This is one possible way to define it: that regardless of the randomness, regardless, say, of the weights that the machine learning model has, you get the same outputs. And finally, obviously, for scientists the idea of reproducible solutions is key; it's fundamental, almost analogous to truth. And I guess an interesting question is whether you can relax this notion of pseudo-deterministic algorithms in order to verify scientific modeling and simulation done with computers, where randomness is also a big component; if you get different solutions every time, how do you know whether a result is correct or not? So getting the same solution again and again is, once more, very important. And you can ask the same question in other domains, not just for standard textbook algorithms. Thank you.