What if you could predict the future? What if we all could? I'm here today to tell you that you can. We all can. We have the power to predict the future. The bad news is that we're not very good at it. The good news is that even a bad prediction can tell us something about the future. Today, we will predict. Today, we will learn. Today, we will discover why Bayes is Bae. Introducing our protagonist: this is Thomas Bayes. Thomas was born in 1701. Maybe. We don't exactly know. He was born in a town in Hertfordshire. Possibly. We can't know for certain. We don't actually even know what Bayes looked like. What we do know is that Bayes was a Presbyterian minister and a statistician. We also know that his most famous work, the paper that gave us Bayes' rule, was not published until after his death. Before this, he published two other papers: "Divine Benevolence, or an Attempt to Prove That the Principal End of the Divine Providence and Government is the Happiness of His Creatures." Yes, that is one title. As well as "An Introduction to the Doctrine of Fluxions, and a Defence of the Mathematicians Against the Objections of the Author of The Analyst." I like my titles a little bit shorter, but everybody has different preferences. So why do we care about this? Well, Bayes contributed significantly to probability with the formulation of Bayes' rule. And again, even though it wasn't published until after his death, let's travel back and put ourselves in the mind of a commoner of the era. So the year is 1720. Sweden and Prussia just signed the Treaty of Stockholm. Anna Maria Mozart, the mother of the person who wrote the Requiem that we just enjoyed, Wolfgang Amadeus Mozart, so not Mozart himself, but his mother, was born in 1720. And statistics and probability are all the rage. At the time, we can do things like say: given we know the number of winning tickets in a raffle, what is the probability that any one given ticket will be a winner?
In the 1720s, Gulliver's Travels was published. This is 45 years before the American Revolution, 45 years before we went to battle with Britain and gained our independence. And also in the 1720s, Easter Island is "discovered." People knew it was there before, but the Dutch didn't. And I don't know if you know this or if you've seen this, but there's actually a lot more to the statues. There's a lot more underneath the surface, which is also very true of probability as well. See, we knew how to get the probability of a winning ticket; what we didn't know how to do was the inverse. Inverse probability says: okay, well, if we draw 100 tickets and we find that 10 of them are winners, what does that say about the probability of drawing a winner? Well, in this case, it's pretty simple. 10 are winners, we drew 100 tickets, it's about 10%. But what if we have fewer samples? What if we have one sample? We drew one ticket and it was a winner. Well, does that mean that 100% of tickets are winners? Is that what we're gonna guess? The answer is no. We wouldn't guess that. You're like, oh, well, maybe it's a really weird raffle, but I've not found any raffles that are like that. And the reason why you were able to correctly answer that is because you can predict the future. Even if that prediction's wrong, not dead on, it's still better than making no prediction at all. This was Bayes' insight: that we can take two probability distributions that are related, and even if they're both inaccurate, the result will be more accurate. We can do things with this, such as machine learning and artificial intelligence. I'll be focusing on artificial intelligence in this talk. But I wanna take a second and introduce myself. My name is Schneems. It's pronounced like schnapps. It's got the little fun "sch" at the beginning. I maintain Sprockets, poorly. I have commits to Rails as well as Puma.
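To make the raffle intuition concrete, here's a small sketch (my own illustration, not code from the talk) contrasting the naive frequency estimate with Laplace's rule of succession, which is one classic Bayesian way to avoid jumping to "100% of tickets are winners" from a single draw:

```python
# A hedged sketch (illustrative, not from the talk): Laplace's "rule of
# succession" is one Bayesian answer to the inverse-probability question.
# Instead of the raw ratio k/n, it estimates (k + 1) / (n + 2), which a
# single sample can never push all the way to 0% or 100%.

def naive_estimate(winners, drawn):
    """Raw frequency: 1 winner out of 1 ticket gives 100%."""
    return winners / drawn

def rule_of_succession(winners, drawn):
    """Bayesian estimate assuming a uniform prior over the win rate."""
    return (winners + 1) / (drawn + 2)

print(naive_estimate(10, 100))      # 0.1
print(rule_of_succession(10, 100))  # ~0.108 -- still close to 10%
print(naive_estimate(1, 1))         # 1.0 -- surely wrong
print(rule_of_succession(1, 1))     # ~0.667 -- a hedged, recoverable guess
```

With lots of samples the two estimates agree; with one sample, the Bayesian version stays appropriately unsure.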
And I'm also doing a Masters in CS at Georgia Tech through their online program. I went there for my bachelor's degree in mechanical engineering and absolutely hated it. It was brutal and not very much fun. But they're only charging me seven grand for the entire program. So it's pretty cheap, you know, not a bad deal. So I work full-time for a time-share company. It's time-share with computers. That's what we do. Hopefully some of you already know what Heroku is. So instead of pitching or explaining Heroku, I'm gonna explain some new features you might not have heard of. We introduced a thing called Automatic Certificate Management. This will provision a Let's Encrypt SSL cert for your app and automatically rotate it every 90 days, which is pretty sweet. We also have SSL for free, on all paid and hobby dynos. The SSL that we offer for free is what's known as SNI SSL. And I don't know if you heard about the legislation that went through Congress that was like, hey, FCC, you cannot protect people's privacy. Anybody hear about that? Okay, yeah. So adding SSL onto your server is going to help your clients get a little bit of protection. The free version of SSL that we have, which is SNI, does leak the host name to your ISP. But we also have an NSA-grade SSL, which is an add-on that you have to add, and then you also have to provision and maintain your own certificate. We have Heroku CI, which is continuous integration. It's in beta; you can give that a shot. Review apps, which I absolutely, positively love; try these if you haven't. Every time you make a pull request, Heroku will automatically deploy a staging server just for that pull request. So you're like, hey, I fixed this CSS bug. It's like, did you really, did you? The person reviewing can click through, see an actual live deployed app, and verify that. So that's it for the company I work for.
Typically this would be the time when I do a little bit of self-promotion. And typically I would do something like promote the service that I run called CodeTriage, which is the best place to get started contributing to open source. But since I'm not going to be talking about CodeTriage, instead what I want to talk about is the biggest problem our country, and especially the state of Texas, where I come from, faces: gerrymandering, which is awful. And unlike CodeTriage, gerrymandering is very bad. Anyway, so this is gerrymandering. Basically, given a population, you could represent it perfectly and say, okay, well, there are more blue squares than red squares, so we should have more blue districts than red districts. But if you look all the way over on the side, you can create those districts in such a way that, oh, magically, now there are more red districts. So this is where I live. This is the district in Texas that stretches from San Antonio to Austin. I don't know if you know, but that's really far away. Yeah, I mean, just look at it. Seriously. So yeah, gerrymandering kind of takes away your voice, diminishes the power of your vote. And so I think we need country-wide redistricting reforms, and it's not just me who thinks this. My district was actually ruled illegal by the Texas judicial branch. Unfortunately, an illegal district will not deter the people in charge of redistricting in Texas, and they're refusing to hear any bills on the issue. And you might say, wow, that's a really important issue, okay, but what can I do? I highly recommend looking up your state representatives. You have a house representative and a senate representative. Find them. Mine are Kirk Watson and Eddie Rodriguez. I have their phone numbers in my phone. Call them and let them know: hey, I care about redistricting and I care about gerrymandering, and I want this to be an issue that we push.
You might say, oh, well, is there more that I can do? Well, there are local organizations. For example, in Texas, there's DeGerrymander Texas, which is a really long Twitter handle. And they give guides and talk about current legislation and those types of things. So yeah, I just think that gerrymandering is very unpatriotic, un-Texan, un-Arizonan, and it really just takes away the freedom to elect people who represent us. So okay, yeah, back to Bayes. So, artificial intelligence. For this talk, I'm gonna be talking about some examples from the grad course that I've been taking at Georgia Tech, where we've been using Bayes' rule for artificial intelligence with robotics. If you're not familiar, this is what a robot looks like. "Robots. The world is very different ever since the robotic uprising of the mid-90s. There is no more unhappiness. Affirmative." Okay, can I get the audio just a little bit louder? Okay, there we go. So when we have a robot and we need to get that robot somewhere, we need two things. We need to know where the robot is, and then we also need to have a plan for how to get it there. Robots don't see the world the same way we see it. They see it through sensors, and those sensors are unfortunately noisy, so they don't see the world perfectly clearly. So say we have a really simple robot that can move, let's say, just right and left. If we take a measurement, it will tell us roughly where it is. We can represent this by putting it on a graph, and this is a normal distribution. So here we have a robot. It's at position zero, but we don't know for sure that it's at position zero. It could be further away. It could be all the way over at point six, but this is a lot less likely. It's not very probable. The more accurate our measurement, the steeper our curve will be.
At this point in time, it's almost impossible that it would be at point six, and it's much more likely that it would be a lot closer to point zero. So a robot is an example of a low-information-state system. We could take thousands or hundreds of measurements of that robot as it's just sitting there and average them together, but what if our world is changing, or there are other things impacting our sensors, or our robot needs to move and do things? One of the things that we can do is use Bayes' rule. We can make a prediction, and with that prediction, increase the accuracy of the estimate of where the robot is. So previously we thought we were at position zero, plus or minus some error. Well, then we can predict what the world would look like if we were to drive forward by 10 feet. If we did that, it would look something kind of like this. We were at zero, now we're at 10. But we wanna be sure, so we take a measurement, and it says, oh, we're not at 10, it's showing that we're at five. So what do we do? Our measurement and our prediction disagree. So probably a good guess might be somewhere right in between the two. We can take our measurement and our prediction and make a convolution, which is a really fancy way of saying the product of two functions. The result is actually more accurate than either of our guesses individually. So even though our measurement was noisy, we don't actually know if we're at five, and our prediction was noisy, we're not actually at 10, the end result is more reliable. And this gives us a Kalman filter. A Kalman filter can be used any time you have a model of motion and some noisy data, and you want to produce a more accurate prediction. So how good is a Kalman filter, you might ask? This is an example of a homework assignment that was given to us. The green represents an actual robot's path, and all of the little red dots are the noisy measurements.
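That combining step can be sketched in a few lines (my numbers are illustrative, not the homework's): multiplying two Gaussian beliefs, a prediction and a measurement, yields a new Gaussian whose mean sits between the two and whose variance is smaller than either input, which is exactly the "more accurate than either guess" claim.

```python
# Sketch of combining a Gaussian prediction with a Gaussian measurement.
# The product of two Gaussians (renormalized) is another Gaussian:
# its variance is always smaller than either input variance, so the
# combined estimate is more certain than either guess on its own.

def gaussian_update(mu1, var1, mu2, var2):
    """Mean and variance of the (renormalized) product of two Gaussians."""
    new_mu = (var2 * mu1 + var1 * mu2) / (var1 + var2)
    new_var = 1.0 / (1.0 / var1 + 1.0 / var2)
    return new_mu, new_var

prediction = (10.0, 4.0)   # we predicted we drove to 10, with variance 4
measurement = (5.0, 4.0)   # the sensor says 5, also with variance 4

mu, var = gaussian_update(*prediction, *measurement)
print(mu, var)  # 7.5 2.0 -- between the two, and more certain than either
```

With equal variances, the result lands exactly halfway; with unequal variances, it leans toward whichever belief is more certain.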
And it's so noisy that if you just take two subsequent points, two measurements, you can't tell which direction the robot is moving in, because the second point might actually be way behind the first point. So it's incredibly, incredibly noisy. And this is part of the class. You can actually go to Udacity and take the course for free, and this is the final thing that they do in the course. If you end up going to Georgia Tech, there's a little bit more involved. But to make things even more interesting, not only do you have to figure out where the robot is, you have your own robot that moves slightly slower than the one you're trying to find, and you have to chase it. So you have to predict where it'll be a time step or two into the future, and then be there. And sorry for anybody who's colorblind. They picked the colors, not me. So what does this look like? Well, if we apply a Kalman filter, we end up with something kind of like this. Before, our red dots were virtually unusable; as I mentioned, given two points, we can't even determine the direction. But with this correctly implemented, we can see our chaser robot getting closer and closer. So, I like a little bit of audience participation. Who here likes money? Okay. All right, I think some people didn't raise their hands. It's okay. Before we look at what a Kalman filter looks like, let's look at some cold hard cash. This is a 1913 Liberty Head Nickel. It was produced without the approval of the US Mint, and as a result, they only made five of them. Only five of these got into circulation. As a result, it's incredibly, incredibly rare, and if you find one, it's worth $3.7 million. So yeah, I'd say that's a pretty penny. I'll be here all week, folks. This is not a Liberty Head Nickel.
This is a trick coin that your coin-collecting friend happens to have; it has two heads instead of being an actual Liberty Head Nickel. This coin-collecting friend also has a $3.7 million coin, and for some strange reason, they put the two coins into a bag, shake it up, and draw one. So we have one fair coin and one trick coin in our bag. They say, hey, you know what? Do you wanna play a game? Do you wanna make $3.7 million? And so they take a coin out, they flip it, and they say, oh, okay, it landed on heads. From here, they might try to make some sort of a wager or a bet. Like, okay, well, if it's the $3.7 million coin, you can keep it, but otherwise you have to, I don't know, mow my lawn or something. I mean, it's fairly equivalent, right? But would that be a good bet or not? In order to know, we have to know: what is the probability, given that it landed on heads, that we have our fair coin? To do this, we can use Bayes' rule. So this is what it looks like. To explain a little bit of the syntax: the P stands for probability, and we are saying the probability of A given B. So this is the probability that we have the $3.7 million coin given that we know it was heads. That's the information. That's all we knew. So we can flesh this out piece by piece. First, the probability of heads. Well, what is the probability of heads? We have three total chances of getting heads and one chance of getting tails. So we have a three out of four, or 75%, chance of getting heads. Another way that we can do this is to say, well, there's a 50% chance that we get our fair coin, and if we get that fair coin, there's a 50% chance that it's heads. We can add that to a 50% chance of getting our trick coin, and if we get our trick coin, there's a 100% chance that we are gonna get heads. And when you do that, you end up with the exact same result.
This is just the more mathy way of achieving that instead of intuition, because later on, I tried to teach my program intuition, and it didn't work out too well. Also, this is a talk on artificial intelligence, and I have to admit I don't know a whole lot about artificial intelligence, or I would have written an artificial intelligence to write my talk. So, thank you. Okay, so we're gonna add this onto our equation and keep moving. So now we wanna know: what is the probability of A? The probability of getting that $3.7 million coin. Well, we know we have two different cases, and they're equally probable. We have a 50% chance of getting that coin, and we can add this back to our equation. The last piece is the probability of heads given that we have the fair coin, this $3.7 million coin. In that case, assuming that we have the fair coin, we flip it, and there's only a one out of two chance that we get heads. So that's 50%; we can add it here. When we put all of that together, we end up with a one-in-three, or 0.33, chance of owning a multi-million dollar 1913 Liberty Head Nickel. So one in three. It's not great, but it's not nothing. This is what we can do with Bayes' rule. Given two related probabilities, in this case, what is the probability we'll get heads, and what is the probability that we'll draw our money coin, we can accurately predict that relationship. Khan Academy has a really good resource on Bayes' rule. What we just did is the very mathy way; another way to look at this is with trees. So here's essentially that. To answer this question, we need only rewind and grow a tree. The first event: he picks one of two coins. So our tree grows two branches, leading to two equally likely outcomes, fair or unfair. The next event: he flips the coin. We grow again. If he had the fair coin, we know this flip can result in two equally likely outcomes, heads and tails.
While the unfair coin results in two outcomes, both heads. Our tree is finished, and we see it has four leaves, representing four equally likely outcomes. The final step: new evidence. He says, heads. Whenever we gain evidence, we must trim our tree. We cut any branch leading to tails, because we know tails did not occur. And that is it. So the probability that he chose the fair coin is the one fair outcome leading to heads divided by the three possible outcomes leading to heads, or one third. All right, so whether we use trees or we use Bayes' rule, we get the same outcome. I'm not an expert in probability, but that's probably a good thing. One element I mentioned, but didn't dwell on, was total probability. Also, I'm terribly sorry, I lied about Bayes' rule. That isn't all of Bayes' rule. It actually looks a little bit more like this. So this is the expanded form, and to see both side by side, the version with the total probability expanded is on the bottom. So what exactly is total probability? If we look at our problem another way, we can say, all right, well, we have a 50% chance of our actual coin or the $0 trick coin. And in this problem space, if we land on heads, heads is going to completely take up the trick coin case. If we have the trick coin, there's a 100% chance of heads. However, heads only half takes up the $3.7 million coin. If we land on tails, tails falls entirely inside of the $3.7 million coin, and we have a 100% chance that that is that coin. Now, what we actually wanna know is this section. What we want to know is the total probability of getting heads, and in order to do that, we can calculate it by adding up this section along with this section, and that will give us the total probability.
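Both routes, Bayes' rule and the tree, can be checked in a few lines of Python. This is a sketch I've added; the coin setup matches the talk's one fair coin and one two-headed trick coin, with a single flip landing heads.

```python
from fractions import Fraction

# Route 1: Bayes' rule.  P(fair | H) = P(H | fair) * P(fair) / P(H)
p_fair = Fraction(1, 2)
p_heads_given_fair = Fraction(1, 2)
p_heads_given_trick = Fraction(1, 1)   # two heads: it always lands heads

# Total probability of heads over both coins (the denominator).
p_heads = p_heads_given_fair * p_fair + p_heads_given_trick * (1 - p_fair)
bayes = p_heads_given_fair * p_fair / p_heads

# Route 2: grow the tree of four equally likely leaves, then trim tails.
leaves = [("fair", "H"), ("fair", "T"), ("trick", "H"), ("trick", "H")]
heads_leaves = [coin for coin, flip in leaves if flip == "H"]
tree = Fraction(heads_leaves.count("fair"), len(heads_leaves))

print(bayes, tree)  # 1/3 1/3 -- the two methods agree
```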
To write it out long form, we have the probability of heads given that we have our fair coin times the probability of the fair coin, plus the probability of heads given the trick coin multiplied by the probability of getting that trick coin. So it's just this summation, and we did this previously when I showed you this slide, but I didn't explain exactly why we did it or where we're getting that math from. So that's where it came from. We can make this a little bit tougher, though. What if we flipped the coin twice and it landed on heads both times? For that, it's actually a little simpler if we use the expanded form. I'm not gonna dwell on exactly where we got all of the numbers from as much, but here the subscript i indicates each of the different cases, so we could have the fair coin or we could have the trick coin. So the probability of landing on heads twice given our fair coin: you flip it, it's a 50% chance of heads; you flip it again, it's a 50% chance of heads; multiply those two together. The probability of getting that fair coin hasn't changed. It never will. There's always a 50% chance of getting one out of two coins. And then we can flesh out this summation at the bottom. So it's 0.25 times a half, plus, if we have the trick coin, heads is a 100% probability, so it's one times the probability of getting the trick coin, which is a half. You all with me? Okay, all right. So if you add all this together, you end up with a fifth, which is 0.2. Now, Bayes' rule doesn't claim certainty. You know, our values are going down. It is more and more and more likely that we do not have the fair coin, but it's never actually gonna reach zero. And that's a really important part, because if it did reach zero and then we flipped again and it turned out to be tails, well, the way Bayes' rule is written, it would never recover from that.
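The two-flips arithmetic above can be sketched the same way (my own illustration), and generalizing it to n heads in a row makes the "never actually reaches zero" point concrete:

```python
from fractions import Fraction

# Two flips, both heads:
# P(fair | HH) = P(HH | fair) P(fair) / sum_i P(HH | coin_i) P(coin_i)
p_coin = Fraction(1, 2)                       # either coin, 50/50
p_hh_fair = Fraction(1, 2) * Fraction(1, 2)   # 0.25
p_hh_trick = Fraction(1, 1)                   # trick coin always lands heads

total = p_hh_fair * p_coin + p_hh_trick * p_coin   # total probability of HH
posterior = p_hh_fair * p_coin / total
print(posterior)  # 1/5 -- i.e. 0.2

# Generalizing: after n heads in a row, the posterior keeps shrinking
# but never hits zero, so a later tails can always pull it back up.
def p_fair_given_n_heads(n):
    ph_fair = Fraction(1, 2) ** n
    return ph_fair * p_coin / (ph_fair * p_coin + 1 * p_coin)

for n in (1, 2, 5, 10):
    print(n, p_fair_given_n_heads(n))  # 1/3, 1/5, 1/33, 1/1025 -- never 0
```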
Mathematically, it would never recover from that. So, sorry to get a little bit mathy, but we need it. Is anybody ready for a break from math? All right, so we are gonna take a break from math with some more math. For that, I'm gonna put on my math jacket. I do appreciate you all bearing with me. So if we look back at Bayes' rule again, one way to represent it would be splitting the equation out. This is exactly what we had before, but on one side, we basically have a constant. The probability of getting our fair coin was exactly the same every single time. So this is gonna be called our prior. Without any information at all in the system, that's the probability of getting our coin. This other section is after we have information. So it's a posterior, as in post-information. And even if our prior is 0.5, in the case where we got a tails, our posterior is so large that it actually pulls that 0.5 all the way up to 100% and says we definitively have the fair coin. So: a Kalman filter is a recursive Bayes estimation, and I can guarantee you that all of those are words. Previously, we looked at a graph and we had a prediction. That is actually gonna be our prior. We also had a measurement, and that's gonna be our posterior. This is the thing that updated after we got new information. And our convolution, we're gonna be somewhere in between. We don't exactly know where. So that's where actually implementing a Kalman filter comes in. The next example comes from Simon D. Levy. I have a link to this resource. It goes through step by step and really explains the math. I know your heads might be hurting a little bit, but I'm barely skimming the surface, and some of it's really interesting. He also has a fairly unique and fairly simple example, and I'm gonna walk through how to implement it in a Kalman filter. So let's say we've got a plane. And this plane is really simple.
All it can do is land, apparently. And the way you control it is by multiplying your current altitude by some other value. In this case, it's 0.75. And this gives us a nice steady landing. Towards the end, it's moving in smaller and smaller and smaller increments until eventually we kinda touch down. Unfortunately, our measurements are really, really, really noisy. So this is that line, but with 20% noise. And we're actually going below the ground here. We're getting negative measurements. So according to our measurements, we're repeatedly slamming into the ground. And I know, visually, mentally, you're just like, oh yeah, there's a nice little line in there. But if you are writing a system that depends on those measurements, we need it to be a nice smooth line instead of this jagged thing that sometimes indicates we're below the ground. So we're gonna actually program this in a Kalman filter. We're gonna start off with our rate of descent, just 0.75, our initial position, and our measurement error. We're then gonna just make a guess. We're gonna say, well, let's just assume you were at the very first position that you were measured at. And we also introduce a new thing called P, which is our estimation error. This is our prediction error. It's gonna be a value between zero and one that we're gonna use to remember how we adjusted our robot back and forth. Is it closer to the prediction? Is it closer to the measurement? That's how we're gonna do that. To get started, we pull a measurement off of our measurement array. Oh, and I do apologize. This is in Python. Yeah, I assume everybody here's a polyglot. Luckily, all of the code is identical to what it would be in Ruby, except for the very top line, the for k in range(10). All right, so we start off with our guess. We multiply where we currently were by our constant, so 0.75. That's now where we think we are.
We then wanna build into our system some way of saying: if we move just a little teeny tiny bit, our prediction's probably pretty accurate, but if we move a whole lot, our prediction's not that accurate. So we're going to multiply our motion by our prediction error. And the reason we do this twice is that prediction error is actually represented as sigma squared. So it's error squared. You don't really need to know that; just multiply twice. So that's the prediction phase. Then after we've predicted, we have to update with our measurement. I'm gonna skip this gain line and instead go straight to the actual update. So we have our guess of where we currently are. Then we add to it a mysterious gain number times the current measurement minus the previous guess. The way that we can think about this gain is that it's sort of the ratio of our last measurement and the prediction. If our prediction error is really low, like really, really low, then our gain is really, really low. And if it's so low that it gets pretty close to zero, we can approximate it as zero. And when that happens, we can actually eliminate this entire term. And that means that we should just ignore our noisy measurements altogether. Our last prediction was so good, so good, that we don't even need our new measurements. Either that, or our new measurements were so bad that they're not helping us in any way, shape, or form. If the prediction error is high, then it means we have a really high gain. When that happens, we end up approaching one. And when we do this, we have an X guess and then we also have a negative X guess, and those two terms cancel each other out, and we end up just guessing whatever our measurement is. This means that we throw out our previous prediction and just use our measurement. You might wanna do this in a case where it turns out that your sensor is really, really, really accurate, but your prediction model is not.
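Putting the predict, gain, and update steps together looks roughly like this. This is a minimal sketch following the structure just described (after Simon D. Levy's tutorial); the variable names, starting altitude, and noise level are my own illustrative choices, not the talk's exact code.

```python
import random

def kalman_1d(measurements, a=0.75, r=100.0, p=1.0):
    """Scalar Kalman filter for the landing-plane example.

    a -- motion model: next altitude = a * current altitude
    r -- measurement error (variance of the sensor noise), assumed here
    p -- initial estimation (prediction) error
    """
    x = measurements[0]            # first guess: wherever we were first measured
    estimates = [x]
    for z in measurements[1:]:
        # Predict: apply the motion model, growing the error with the motion
        # (multiplied twice because the error is a variance, sigma squared).
        x = a * x
        p = a * p * a
        # Update: blend prediction and measurement by the Kalman gain.
        g = p / (p + r)            # g near 0: trust prediction; near 1: trust sensor
        x = x + g * (z - x)
        p = (1 - g) * p
        estimates.append(x)
    return estimates

# Simulate a descent from 1000 feet plus sensor noise, then filter it.
random.seed(7)
truth = [1000.0 * 0.75 ** k for k in range(20)]
noisy = [alt + random.gauss(0, 10) for alt in truth]
smooth = kalman_1d(noisy)
```

Plotting `smooth` against `noisy` reproduces the picture described next: the jagged measurements get pulled onto a smooth descent curve.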
So a way to visualize that: if our prediction is less certain, less accurate, its curve is kind of flat, and our robot would lean towards the measurement. If our prediction is more certain, a little bit more peaky, then our robot is gonna lean more towards the prediction. You put all of this together, you recursively update your prediction error, and you end up with a graph that looks a little bit like this. The jagged line represents our very noisy measurements. The blue line represents the actual value of the plane, and the little green squares are what we are predicting. Now, it's not dead on. Again, we're not perfect at predicting the future, but we're pretty close. We're a lot better than what we had previously, and given this, hopefully our plane won't crash into the ground repeatedly. So that's pretty much the simplest case of a Kalman filter. We can get a lot, lot deeper. There's a lot more scenarios and situations. One of the more common things is having a Kalman filter in matrix form. For example, in this case, we only had altitude, but what if we also had engine speed, and barometric pressure, and the angle of our flaps, and the angle the pilot is pulling back on the controls? If those are related, then instead of individually writing a Kalman filter for each of them, we put them all in one Kalman filter, and it actually ends up being much, much, much more accurate for the entire system. And so, yeah, this looks pretty similar, but there's a little bit more going on that we don't necessarily have time to get into. The other case where a Kalman filter gets into trouble is motion that isn't linear. So previously, yes, we had a nice gentle curve, but each step itself was linear. Each step was just based on a constant multiplied by the previous step. But there are cases where we have sinusoidal motion, or logarithmic, or just not linear.
And when that happens, we end up having two different probability distributions, and when we put them together, in order to add two probability distributions together, they have to be on the same plane. And here we're kind of estimating and making a bad estimation. Granted, this is still likely better than just taking the noisy measurements, but I would recommend not doing this. Instead, there are other ways. There's an extended Kalman filter. There's an unscented Kalman filter. And this is kind of the way I think of extended Kalman filters: it rotates the plane of our probability distribution. It still has to be on a line, and both distributions still have to be on the same line, but we can approximate our curve by rotating it. All right, okay, so that's it for, sorry, that's it for the Kalman filter. I did want to go back a little bit to Bayes' rule and touch on the two most important parts. First, the prediction. If we never predict the future, then we can't know if we're right or wrong. This is why scientists start with a hypothesis. If the hypothesis is wrong, we're forced to reevaluate our underlying assumptions. And then, whenever we get new information, we have to update. We have to update our own set of beliefs. The interesting thing about this is we can never be too sure of ourselves. No matter how many times we get heads, we can never be 100% sure that it is a trick coin unless we actually investigate it. And that's why this is probability. If you just make that claim, if you say, oh, there's a zero percent chance this could ever happen, Bayes' rule will not help you; your system can never recover. I already gave the example previously: if you've claimed zero percent and then you get tails, sorry, Bayes' rule tells you there's a zero percent chance, and you cannot recover.
So no matter how sure of yourself you are, you always need to remain a little bit skeptical. You might think that there's a zero percent chance, or that there's a hundred percent chance, of the sun coming up tomorrow. That'd be a pretty good bet. And for most days, you'd be right. But if it turns out that tomorrow is the day that our sun turns into a red giant and consumes the earth, hopefully your millennia of prior experience with the sun coming up every day doesn't cause you to accidentally die. On that note, it always pays to have good information. And good guesses. We don't have to wait until our sun explodes. We can actually take a look at other stars and see what happens to them. We can compare our situation to theirs. Maybe it's not exactly the same, but it'll give us a better prediction than we would have otherwise. And so the more data and the more predictions that we make, the better our outcomes will be. Let that sink in. So, I highly recommend a book called Algorithms to Live By. I think it's a book every programmer should read. It's very narrative, and it has an entire chapter on Bayes' rule. It's very easy to read. It doesn't get into the math nitty-gritty like I did. I also have, I see some people taking photos, I'm gonna leave it up here and speak to delay the next slide. Okay, good, good. I also highly recommend The Signal and the Noise. This is a book written by Nate Silver. It's about probability. Nate Silver runs FiveThirtyEight. He successfully predicted that our 45th president had a one-in-five chance of winning and would likely lose the popular vote. He did not predict the magnitude by which he would lose the popular vote. Just saying. The audio I played is Mozart's Requiem in D Minor. And from earlier, the Kalman tutorial: you can go to bit.ly slash kalman dash tutorial. This is Simon D. Levy's resource.
And then also, if you're really into Kalman filters and you wanna see a lot more, unscented Kalman filters, extended Kalman filters, this is a great resource. It's just bit.ly slash kalman dash notebook. And unfortunately, all of this is also in Python. But, I mean, if you know Ruby, it's pretty easy to read. You can also check out Udacity and Georgia Tech. And if you didn't know, Bae is not short for baby. It's African American vernacular, and it stands for "before anyone else." So Richard Price built on top of Bayes' theory and developed special cases for when we truly have no prior estimate: what should we do then? Laplace took Bayes' work further, and actually much of what we know as Bayes' rule and Bayes' theorem, the nice pleasant polished thing that it is, comes from Laplace. So before there was Price, before there was Laplace, Bayes was Bae. Thank you very much.