Okay, is the sound good? Yes? Okay, good. All right, so just to recap where we are. We started a couple of days ago to develop the time-domain version of control theory. We began with some heuristic methods, PID control and things like that, and realized that this was going to get too complicated for larger systems. Then yesterday we proposed a more systematic approach known as optimal control, where you come up with some scalar, non-negative quantity that we call a cost, which supposedly encapsulates what you want to do. And if you want to do several competing things at once, one strategy is just to take a linear combination of quadratic forms. I should say this is not the only approach; there's a more formal framework called multi-objective optimization, which uses the notion of Pareto frontiers and so forth. The simplest way to handle competing goals is just to put everything in the same units, say some dimensionless number, and then add the terms with weighting coefficients that express how much you care about, say, controlling your state versus how much control effort is required. That was the trade-off we were looking at last time. So optimal control gives a systematic approach. We even did it for a nonlinear system, swinging up a pendulum. And for linear systems we developed a complete solution, which was even a feedback solution that allowed us to calculate the gains from these cost functions. That converts the problem of tuning the gains into the problem of choosing the cost function, and the hope is that one has more insight into the costs, since they are supposed to express what you care about. So today I want to look at the contrasting part of this, which is the problem of how you go from observations to an estimate of the state.
This is the notion of observability, which says that this is possible; we talked about that on Tuesday. On Tuesday we even gave a kind of solution, which was the notion of an observer system. You may remember that we created a shadow dynamical system that runs in parallel with the physical system, and the idea was to put in coupling and feedback so that the two systems would eventually synchronize. We'll come back to this in a couple of minutes, but we had a way of doing that, and today we're going to revisit it. In particular, we'll think about the effects of noise. When we did it before, that feedback coupling was a little bit arbitrary, and it wasn't clear, for example, why you wouldn't use huge observer gains to synchronize the systems arbitrarily quickly. We'll see that noise tells us this is a bad idea. In order to do this, we need to take a step back and think about introducing noise into this picture of control that we've been developing. So today is devoted to stochastic systems. I'll start with a claim: the only reason to use feedback, as opposed to some kind of feedforward, is some kind of uncertainty. Why is that the case? Well, feedback is always reacting to something, and in a causal system reactions always take some time. This can be formalized in things like the Kramers-Kronig relations, or, as I said, there's a polar version involving the magnitude of a response function and the phase delay, which tells you the minimum lag that causality imposes. More informally: any time you're reacting, it takes some time to gather the information, think about it, and then do something about it. Whereas if you know what's going to happen, you can just implement the right control.
So if you know your dynamics perfectly, and there are no disturbances or anything like that, you just compute, in the way that we did yesterday, the control signal u(t) that you need to apply; then you apply it and it does what you want. Of course, this doesn't work in the real world: the models you use are never perfect, and there are always external disturbances and so forth. So you do need feedback, but the feedback is really associated with this uncertainty. It's always good to realize that the things you are certain about, or know something about, should be incorporated directly; you should be using feedback only for dealing with uncertainty. Having said that, one other note. I mentioned that feedback is causal in the sense that something unexpected happens and then you react to it. The interesting thing about control theory, I think, and one that isn't sufficiently appreciated in physics, is how, in a way, the control can be acausal. I might have mentioned this at the very beginning, but think about the problem of controlling the temperature of a room like this one. If you use feedback to control it, then whenever the temperature goes down in the evening (it's not quite the season, but give it a month or two), the heater comes on and warms up the room. On the other hand, we know, and the heater can easily learn, that regularly around sunset, which is a predictable event, temperatures are going to go down. So if you want to keep the temperature in the room constant, it's better to start heating before the sun goes down, ramping up gradually, just enough to keep the temperature constant. You can do a better job of control by incorporating, in some sense, information from the future. Of course, it's information that you have from other sources.
The way I'd like to look at this is that control systems are open to the flow of information, in the same way that, when we think about energy, energy is conserved but we have open systems where energy can flow in and out. Here we have control systems where there is overall causality, but information can flow in different ways. I think this is something for the theorists to think about formalizing better. We've been having lectures on information theory; we won't have time to try to connect information theory to control, but it's a natural connection to make, and there have been some efforts in this direction, though I think there can be a lot more. And by the way, many actual thermostats these days, if you buy a more modern one, have a learning mode, where basically they make the ansatz that tomorrow will have the same disturbances as today, or as an average of the last few days. A thermostat doesn't have to be that smart to learn a daily rhythm like that, and once it has learned it, the control can start to anticipate. Okay. So what I want to develop is something formally called a Kalman filter. The algebra around it can be a little bit forbidding, so I want to build up the notion step by step, starting from really simple examples. The issue is that we're now going to consider systems that have, in some sense, two noise sources. First, the system itself will be driven by some kind of noise source; in the kinds of experiments that I do, this can be the thermal noise from a bath connected to a colloidal particle, a protein, or other kinds of molecules. So you have a system at finite temperature, subject, in the case of a colloidal particle, to direct hits from the surrounding water molecules.
But at the same time, in a control system you're making measurements: if you are going to use feedback, you have to measure, for example, the position of a particle in order to apply some force. And that measurement will always have noise; there's always some uncertainty associated with it. In the case of the colloidal particle, the measurement systems typically involve lasers. You focus light, and there are various techniques, but in the simplest version you form an image on a camera, perhaps, and you track the particle. There's noise associated with the illumination, particularly when the illumination has low power: you're using a very weak light source, or you're trying to do very fast measurements. If you're tracking at high speed, then the shot noise in the laser, the fluctuating number of photons detected, becomes important. In fact, in the extreme limit, people have done tracking where every single detected photon is used to update the estimate. Anyway, the measurements have noise, and these are two independent noise sources. One of the challenges will be that you have the system responding to noise, you have measurements with noise, and you have to sort everything out. I've tried to sketch the situation here: here's our physical system. It has inputs, both deliberate ones, because you want to control it, and inputs from noise, and they can enter either in similar ways or in different ways; we'll come to that. On the output side, there's measurement noise that makes the output different from the state, even in simple cases where you think you can observe the full state vector. Remember, we put functions between the internal state and what we observe.
Up until now I've been focusing on the idea that you may have an n-dimensional state vector but observe only one quantity; you observe it as a function of time, though, so if you have enough observations you can use information from the past to reconstruct the state now. Here we'll see that even in a simple one-dimensional system, where we're going to start, with one-dimensional observations (the dimension of the observations equal to the dimension of the system), there's still a problem, because you have noise in the measurement. Even in this simple case, where you think you're observing everything, you still have the problem of going from a noisy measurement to an estimate of the state. Does that make sense? That's the setup of the problem. In my book I have an example where I sketch out why you might want to be more sophisticated about this process of estimating the state. Think about, for example, a noisy measurement of a harmonic oscillator; in fact, it can even be a deterministic harmonic oscillator, with noise only in the measurement. It would look something like this. As I've mentioned, if you were to try to estimate the velocity of the oscillator by naive finite differencing, then because you have noise, you're going to get a really bad result: the finite difference for the velocity will look something like that. We'll see that if you use this technique based on an observer, you get a much smoother estimate, because you're implicitly averaging over the measurement noise. In some sense this is an obvious strategy. If you have a static quantity that you're measuring with noise, this is like a first-year lab exercise: you're told to measure it N times, so that the uncertainty gets reduced by a factor of the square root of the number of independent measurements.
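The point about naive finite differencing can be made concrete with a small numerical sketch. This is illustrative only: the oscillator signal, sampling interval, and noise level below are invented for the demonstration, not taken from the lecture.

```python
import math
import random

# Sample a clean harmonic signal x(t) = sin(t), add Gaussian measurement
# noise, and estimate the velocity by naive finite differencing.
random.seed(0)
dt = 0.01          # sampling interval (assumed)
sigma = 0.05       # measurement-noise standard deviation (assumed)
t = [k * dt for k in range(1000)]
y = [math.sin(tk) + random.gauss(0.0, sigma) for tk in t]

# Naive finite difference: v_k = (y_{k+1} - y_k) / dt.
v_est = [(y[k + 1] - y[k]) / dt for k in range(len(y) - 1)]
v_true = [math.cos(tk) for tk in t[:-1]]

# The RMS error is dominated by the noise term sqrt(2)*sigma/dt:
# differencing amplifies the measurement noise by a factor 1/dt.
rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(v_est, v_true)) / len(v_est))
print(f"RMS velocity error: {rms:.2f} (true velocities are at most 1)")
```

With these numbers the error on the velocity is several times larger than the velocity itself, which is why a smarter, observer-based estimate is worth the trouble.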
We'd like to do the same for a measurement of, say, a harmonic oscillator. The problem is that the thing you're trying to measure is moving. If you naively took the last N measurements and averaged them, you'd be averaging noise, but you'd also be averaging motion, and that's not optimal. A better strategy is the observer, where you have two systems that are eventually more or less doing the same thing, and then the differences are something you can start to average. That's the strategy we'll be pursuing. Okay, so let's see how this goes in a simple example. The thing we'll be considering is a Brownian particle: it's just diffusing, without a potential or anything like that. I'm using slightly more general notation than I need; in a moment it will just be simple diffusion, but in general the particle obeys linear equations. Here I'm actually giving it two inputs. There's the normal input u, which couples through B, and there's the noise ν, the thermal noise from the surrounding bath, which in principle couples into the system in a different way, through B′. Strictly I shouldn't write it like this; I should really have one giant matrix and two inputs. I'll sometimes abuse this notation. Sometimes the B's are the same, sometimes they're different; it just depends on the system: u is the deliberate forcing you apply to the particle, and ν is the force exerted on it by the water molecules.
In the case of a one-dimensional particle, the inputs couple in the same way, but you can imagine systems, say a higher-dimensional internal system, where your control enters in one way, and the noise can enter perhaps the same way as the control, but perhaps in other ways too. So in principle they can couple in differently; I'm just trying to be a little general here, but they can also be the same. In one dimension they're the same. Then we're going to set up the observer structure, and you remember the idea was to make a shadow dynamical system with the same dynamics and the same input. We can't put the noise into the observer, since we don't know what the noise is, but what we can do is put in feedback between the observation and the prediction. So we have this dual structure: the real physical system here, and the observer system here, whose state is the estimate x̂. The x here generates a measurement: this is our usual relationship, and now we're adding some noise. The coupling between the observer and the physical system enters through this term here: it involves x̂, and it involves y, which involves x. Now, a lot of what I'll do involves going back and forth between discrete dynamics and continuous dynamics. There are some subtleties about exactly the right way to do this, but let's be informal and think of naively discretized systems. So if we want to describe the motion of a Brownian particle, a diffusing particle, a big colloidal bead moving in water just by diffusion, we describe things at discrete times k Δt (sometimes I'll write T_s for the interval), where k is an integer that indexes which time point we're looking at.
So x_{k+1} is equal to x_k, which means that in the absence of noise the particle just sits there; it's stationary. But then there's a noise ν_k, which is the amount of thermal fluctuation that accumulates and affects the particle in the interval Δt. Similarly, y_k: in this case we'll assume we're directly measuring position, but with some noise ξ_k. Because everything is one-dimensional there's no C matrix, or if there were a C it would just be the constant 1. So x_k is the actual position, y_k is the measured position, and because of noise they're different. The only force affecting the particle is this ν_k. Now, a brief word on what I mean by these noises. They're stochastic processes, and for our purposes their properties are defined by their statistics. The averages of ν_k and ξ_k are zero; in some sense that's almost the definition of noise, because if there were a constant part we'd absorb it into the dynamics. So they always have zero mean. Then we can look at their covariances. The covariance between ν and ξ, at different times or at the same time, doesn't matter: they're all zero, because we're assuming the two noise sources have nothing to do with each other. This is reasonable because ν arises from the fluctuations of the water molecules, while ξ arises from photon-number fluctuations in the laser and the optics and so forth; they're really two separate physical origins. On the other hand, each has a variance. For ⟨ν_k ν_{k′}⟩: if k and k′ are the same, so we look at the noise at one particular time point and square it, then on average it has some value, which I'll call ν². However, if we look at the noise at different times, the noise in this interval and in that interval are separate, so the covariance vanishes.
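These statistics can be checked directly by simulating the discrete model. A minimal sketch, with the two noise variances chosen arbitrarily for illustration:

```python
import random

# Simulate x_{k+1} = x_k + nu_k, y_k = x_k + xi_k with independent
# zero-mean Gaussian noises, and check the assumed statistics.
random.seed(1)
N = 200_000
nu_sq, xi_sq = 0.01, 0.04       # variances <nu^2> and <xi^2> (illustrative)
nu = [random.gauss(0.0, nu_sq ** 0.5) for _ in range(N)]
xi = [random.gauss(0.0, xi_sq ** 0.5) for _ in range(N)]

x = [0.0]
for k in range(N - 1):
    x.append(x[k] + nu[k])      # the particle just diffuses
y = [xk + xik for xk, xik in zip(x, xi)]

# Sample statistics: zero means, the stated variances,
# and zero cross-covariance between the two noise sources.
mean_nu = sum(nu) / N
var_nu = sum(v * v for v in nu) / N
cross = sum(a * b for a, b in zip(nu, xi)) / N
print(f"<nu> = {mean_nu:.4f}, <nu^2> = {var_nu:.4f}, <nu xi> = {cross:.5f}")
```

The sample averages come out as zero mean, variance close to ν², and negligible cross-correlation, matching the assumptions in the lecture.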
I should say, by the way, that there are subtleties here that arise, for example, if you use cameras that integrate over a long time: then this interval and that interval share common values, so this can be not so straightforward in practice. But for our purposes we'll assume it holds: the thermal noises are independent at different times, and likewise the measurement noise, which has amplitude ξ². I just keep ξ² constant because I'm not assuming the variance depends on time, but of course, in a more general formulation, the variance could vary with time. For example, the measurement variance depends on the brightness of the laser, so if the laser intensity were changing for whatever reason, the variance of the measurement noise would change too. When I say average and use the brackets, I mean an ensemble average: the integral, over all values of the random variable, of the variable times its probability density function. For a Gaussian distribution, the case we'll be taking today, p(ν_k) looks like this. And another notation, which people like Raphael will prefer: ν_k is distributed as a normal distribution with mean zero and variance ν²; that's the ~N(0, ν²) notation. Any questions? Okay. Just a comment on the actual case of a Brownian particle, where we know a little bit more: the ν² for the thermal noise in some sampling interval, which I'm calling T_s, is 2D T_s, where D is the diffusion coefficient. The square root of this is the typical displacement in a time T_s. And we know from the Einstein relation that D is related to temperature: it's k_B (Boltzmann's constant) times temperature, divided by γ, where γ is the friction coefficient.
For the case of a sphere moving in water, this friction coefficient is set by Stokes' law: γ = 6π times the viscosity of the fluid times the radius of the sphere. This is the coefficient that connects the force you exert on a sphere with its velocity: in the completely overdamped case, you pull with a constant force and the sphere moves at a constant velocity, and γ is the proportionality constant between them. Again, for the experimentalists, just a note that this specific result is for a sphere in an infinite ocean. If you put the sphere near a surface, the presence of the surface adds more drag, and the coefficient becomes somewhat larger than 6π. That's a fun hydrodynamic problem for the theorists to solve; if you're an experimentalist, it's just annoying. Okay, so that's the physical setup. Now, when we go to discrete times, we have to think a little about timing, and here there are various possibilities. Today I want to use one convention and tomorrow another; I apologize for switching, but there are reasons for the choice. So here's the question. You make a measurement, you do a computation, and then you take some action. In this case we're not yet doing control, so let's just say you make a measurement, then you have to get that measurement into your computer, the computer calculates, and eventually you do something. The question is: how long does all that take compared with the time interval between two observations? You could have a case where you can measure and figure things out in a time much shorter than that interval, and that's the case we'll be talking about today.
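The Stokes-Einstein numbers are easy to plug in. A numerical sketch for a colloidal bead; the parameter values (bead radius, temperature, viscosity, sampling interval) are illustrative choices, not from the lecture:

```python
import math

# Stokes-Einstein estimate for a 1-micron-diameter bead in water.
kB = 1.380649e-23      # Boltzmann constant, J/K
T = 293.0              # room temperature, K (assumed)
eta = 1.0e-3           # viscosity of water, Pa*s
r = 0.5e-6             # bead radius, m (assumed)

gamma = 6 * math.pi * eta * r     # Stokes drag coefficient, kg/s
D = kB * T / gamma                # Einstein relation, m^2/s

Ts = 1e-3                         # sampling interval, s (assumed)
rms_step = math.sqrt(2 * D * Ts)  # typical displacement in one interval
print(f"gamma = {gamma:.3e} kg/s, D = {D:.3e} m^2/s")
print(f"rms displacement per {Ts*1e3:.0f} ms: {rms_step*1e9:.1f} nm")
```

For these values D comes out around 0.4 μm²/s, and the bead wanders roughly 30 nm in a millisecond, comparable to typical laser-tracking noise floors, which is exactly why the estimation problem is nontrivial.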
Alternatively, it could take a significant time; in particular, many experiments are designed so that this processing time is precisely one time interval, so you're always one time step behind and have to think a little bit into the future. That's what a typical experiment would do. Why? Because some of the time in there is calculation, but a lot of it is just transfer time from your sensor into your computer. And, as you'll see, the ability to control something requires up-to-date information. If the processing time were short compared to the sampling interval, you'd acquire some information, twiddle your thumbs for a long time until the next sample, and only then act; so why wouldn't you acquire data more rapidly? You're naturally driven, in an experiment, to work in a limit where all these times are comparable. A simple way to design it is to say: at this moment I acquire some data, I start bringing it into the computer and processing it, and I'm completely done just as the next sample is coming in. So there's a kind of leapfrogging, or pipelining, of information. That can get a little complicated, so here we'll look at the case where we assume we can acquire the information and make use of it instantaneously. Even so, this complicates the formalism a little, because now we should distinguish between two kinds of estimates. There's the estimate we can make of the position at time k+1, where we are predicting where the system will be at time k+1 given the best estimate that we have at time k; and we can compute this immediately at time k.
At time k we have the measurement y_k, we can do all our calculations, and then we can make a prediction for where the system will be at time k+1. In the case of diffusion this is trivial, because in the absence of noise it doesn't go anywhere. But if there were deterministic forces, we could say: these forces will carry the system in this direction by this amount; if it's here, they will carry it over to there, because we know the forces and the friction and so forth. So we can immediately predict where the system will be at the next time step. Then, after the interval, the new measurement comes in, and we can update the estimate. So what are the quantities? x̂ is the estimate; the subscript k+1 is for time k+1; and the superscript minus, in this notation (it's arbitrary how you define it), will mean that we're making a prediction without yet having the measurement. Without the minus, it means we have now incorporated the measurement. These are actually two quite different quantities, even though they look almost the same. [Student:] So you're saying that you measure, and you have a result y_k? [Reply:] Yes: you have a result y_k, you use it to make a best estimate x̂_k, and then you can predict where the system will be at time k+1. That's what I'm calling x̂⁻_{k+1}. The minus means it's lacking the latest information. You could argue that instead I should have put a plus, or a star, on the updated one; those are notations that some people use. It is what it is. So this is the prediction, here. Then, once you get the measurement y_{k+1}, you can incorporate it. In our case things are a little bit simple, and some of this degenerates, because we're just doing diffusion.
I could use a simpler notation, because x̂⁻_{k+1} is just equal to x̂_k: there is no drift in the system, it's just diffusion, so apart from the random forces it just sits there. These are equal here, but when there are deterministic forces they won't be. And because we're assuming we have a perfect measurement plus noise, the predicted observation ŷ⁻_{k+1} is just x̂⁻_{k+1}. So ŷ⁻_{k+1} is the predicted observation: since we can predict the state, and the state generates an observation, we can predict the observation as well; in this simple case they're the same. But remember, before we had y = Cx, some matrix (or vector) relationship between the n-component state vector and, say, the one-component observation. We can apply that to the prediction too. Using that, the observer then gives x̂_{k+1} when you incorporate the measurement. [Student:] You said that without the measurement the estimate would just be a prediction? [Reply:] I'm saying that the prediction is just propagated from the previous estimate. The drift is the deterministic part of the dynamics: basically, x̂⁻_{k+1} is A times x̂_k, plus whatever else. If A is just 1 then there's no drift, but we've looked at A for a harmonic oscillator, and you can calculate different cases. So A carries whatever drift there is; in addition, if there's an input, that also contributes.
For the moment there's no input, so we're not doing any feedback, and there are no deterministic dynamics: it's just a particle in one dimension on its own, moving only because of the noise. And the noise is what we don't know. So x̂⁻_{k+1} is the prediction. And now we update our estimate after we have made the actual measurement y_{k+1} and compared it with the predicted measurement. If the actual measurement is numerically equal to the prediction, then we don't do anything: that basically says we predicted the system to be here, we predicted an observation, and we got exactly that value, so there's no need to change anything. But to the extent that it does differ, we correct with this feedback term, with L (in one dimension just a coefficient) saying how much we update x̂ on the basis of the new information. We still have to choose L; that's going to be the point of the story, but for now it's just some coefficient. We saw before, when differences came not from statistics but from different initial conditions, that we could choose L in a way that would make everything synchronize, so that this difference would eventually go to zero. With noise, it's not going to go to zero, but it will go to some steady-state value. So we're going to adopt an approach very much like optimal control, but here our cost function is going to be the variance of the observer error. Again, we want a systematic way to choose L. In one dimension you can just tune it and see what works best, but remember that for an n-dimensional state vector L will itself be n-dimensional, so again we want a rational way to choose the gain.
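The predict-and-correct loop for the diffusing particle can be sketched in a few lines. The noise amplitudes and the fixed gain L below are arbitrary illustrative values; choosing L optimally is exactly what comes next in the lecture.

```python
import random

# Observer for a diffusing particle:
#   prediction:  xhat_minus = xhat           (no drift, pure diffusion)
#   correction:  xhat = xhat_minus + L * (y - xhat_minus)
random.seed(2)
N = 50_000
nu_std, xi_std = 0.1, 0.3    # process / measurement noise std (assumed)
L = 0.3                      # observer gain (assumed, not yet optimized)

x, xhat = 0.0, 0.0
err_sq_obs, err_sq_raw = 0.0, 0.0
for _ in range(N):
    x = x + random.gauss(0.0, nu_std)          # true state diffuses
    y = x + random.gauss(0.0, xi_std)          # noisy measurement
    xhat_minus = xhat                          # prediction step
    xhat = xhat_minus + L * (y - xhat_minus)   # correction step
    err_sq_obs += (x - xhat) ** 2
    err_sq_raw += (x - y) ** 2

print(f"observer error variance:  {err_sq_obs / N:.4f}")
print(f"raw-measurement variance: {err_sq_raw / N:.4f}")
```

Even with this unoptimized gain, the observer's error variance comes out well below the raw measurement variance ξ², because the correction implicitly averages over past measurements.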
And we'll see that we will have one. So let's form the error between the actual state of the system, x_k, and our best estimate x̂_k, which incorporates the information at time k. This error e_k is just the difference, and I want to use the ensemble average of its square, the variance of the error, as the cost function. We want to choose L in such a way that our estimates are as close as possible to the true state on average. And of course we have to square it, because the estimate will be unbiased but will always fluctuate around the true value; it's the square that captures the size of those fluctuations. This is very much in the spirit of predictor-corrector methods. Let me now change notation slightly, in a way that's not very natural for physicists, but it's what you'll find if you ever look at the control-theory literature. Instead of writing ⟨e²_{k+1}⟩, I'll call this P_{k+1}. So P will be my covariance. (I'm leading up to tomorrow's material; in work we submitted to a physics journal, one of the referee comments was: why the heck do you use P? How about Σ, they said, since physicists use sigma for a covariance.) So P_{k+1} is the average of the square of the error: P_{k+1} = ⟨(x_{k+1} − x̂_{k+1})²⟩. And we'll have a whole family of these P's: P⁻_{k+1} will be the same quantity, but with x̂⁻_{k+1} in place of x̂_{k+1}. That's the covariance of the prediction.
Okay, so P⁻_{k+1} is going to be bigger than P_{k+1}, because hopefully incorporating the observation reduces the uncertainty. We can express e⁻_{k+1} in terms of the dynamics: x_{k+1} = x_k + ν_k, so x_{k+1} − x̂⁻_{k+1} = (x_k − x̂_k) + ν_k = e_k + ν_k. Squaring and taking the ensemble average: because the noise is independent of the state (the state at this time is set before the noise of the interval acts), the cross term vanishes, and we get P⁻_{k+1} = P_k + ν². This is precisely how much the uncertainty grows due to the diffusion of the particle: you have an estimate to within some precision, and as you sit for a time interval T_s, the uncertainty becomes larger. After making the measurement y_{k+1}, the error becomes: x̂_{k+1} is the old prediction plus the correction; we can take y_{k+1}, express it as x_{k+1} plus the measurement noise ξ_{k+1}, and gather terms, relating the new error to the predicted error and the measurement noise: e_{k+1} = (1 − L) e⁻_{k+1} − L ξ_{k+1}. Now we see that the observer gain enters both terms, and when we square we get a quadratic result: P_{k+1} = (1 − L)² P⁻_{k+1} + L² ξ². So we can see that the observer can increase our uncertainty, because we're amplifying measurement noise, but it also decreases the previous uncertainty, because we're incorporating new information. It can be both helpful and harmful.
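Iterating this variance recursion numerically shows the trade-off directly. A sketch with assumed noise variances; scanning a grid of fixed gains and reading off the minimizer illustrates that an optimal L exists between the two extremes:

```python
# Error-variance recursion for a fixed gain L:
#   P_minus = P + nu_sq                       (diffusion adds uncertainty)
#   P_new   = (1 - L)**2 * P_minus + L**2 * xi_sq   (measurement update)
nu_sq, xi_sq = 0.01, 0.09    # assumed process / measurement variances

def steady_state_P(L, iters=10_000):
    """Iterate the recursion to its steady state for a fixed gain L."""
    P = 1.0
    for _ in range(iters):
        P = (1 - L) ** 2 * (P + nu_sq) + L ** 2 * xi_sq
    return P

# Scan gains between 0 and 1: too small a gain ignores good information,
# too large a gain amplifies measurement noise; the minimum sits between.
gains = [k / 100 for k in range(1, 100)]
best_L = min(gains, key=steady_state_P)
P_best = steady_state_P(best_L)
print(f"best gain ~ {best_L:.2f}, steady-state error variance ~ {P_best:.4f}")
```

For these numbers the minimum lands near L ≈ 0.28, and the resulting steady-state variance agrees with the gain one would get from L = P⁻/(P⁻ + ξ²), the form the optimal (Kalman) gain takes.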
And that means that there is an optimal value of the observer gain: you are balancing the incorporation of new information — to the extent that it is good information, it improves your estimate; to the extent that it is noise, it worsens it. So we now have a way to balance the two. Again, to point at the expression: e⁻_{k+1} appears with a factor (1 − L), and the measurement noise appears with a factor L, so the gain enters in both places. And the conceptual part is more important than the algebraic. To the extent that there were no measurement noise, the new information would improve your estimate of where the particle is. It even gives you information about the thermal noise during the interval, because you are observing the state after that noise has acted. So that part is helpful. That the observation is noisy is harmful, because it gives you a y that is unconnected to the system, and that worsens your estimate — that is the L²ξ² term. And the other part is there because you are learning about the thermal fluctuations that happened during the time interval. So there are two noise sources: one we gain information about, and one whose effects we are exposed to, and we balance them. I have to say, we don't have too far to go to reach, in some sense, the key point. So I really want to stop and make sure that questions at this stage get answered, and let me give a brief summary of what we have been talking about. Okay.
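To make the trade-off concrete, here is a minimal numerical sketch of the variance recursion just derived; the values of ν² and ξ² are illustrative, not from the lecture:

```python
# Scalar error-variance recursion for the freely diffusing particle,
#   x_{k+1} = x_k + nu_k,   y_k = x_k + xi_k.
# nu2 = process-noise variance (nu^2), xi2 = measurement-noise variance (xi^2).
nu2, xi2 = 1.0, 4.0  # illustrative values

def variance_step(P, L, nu2=nu2, xi2=xi2):
    """One predict/update cycle of the error variance.

    Predict:  P_minus = P + nu2          (diffusion grows the uncertainty)
    Update:   (1-L)^2 * P_minus + L^2 * xi2
              (the gain shrinks the old error but injects measurement noise)
    """
    P_minus = P + nu2
    return (1.0 - L) ** 2 * P_minus + L ** 2 * xi2

# A huge gain is NOT better: it copies the measurement noise into the estimate.
P_small = variance_step(P=1.0, L=0.3)
P_huge = variance_step(P=1.0, L=1.5)
```

This is exactly why the large observer gains that looked tempting on Tuesday are a bad idea once noise is in the picture.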
So, we want to estimate the state of a dynamical system — in this case a particle diffusing in one dimension — and we are making observations of it. The problem is that there are two types of noise. The particle itself is moving randomly because of thermal fluctuations: it is diffusing on a line. But the observations you make of it are also noisy, so you are not getting precise information about the state of the system. If you had no observation noise, you would simply read off: it's here, it's here, it's here. The problem is that the observations differ from the actual state. So you are trying to make inferences about the state of the system, using what you know about the dynamics — which in this case is just that the particle has no deterministic motion, only thermal noise — plus the noisy observations. Our strategy is to define an observer, so we have two parallel systems running together. That is a little bit of overkill in this special case: the observer is really designed for systems whose state is also moving in some deterministic way, and here the state just sits there deterministically, so it is heavy machinery — but in general we will need that structure. So we have this observer, which is trying to synchronize with the system: the true state of the system and the state of the observer system, X and X̂, marching along in time. The observer learns about the system through the observations. If the observations were perfect, then as the state moved randomly the observer would simply track it. But they are not perfect: they give imperfect information about it.
And so the observer sort of tracks it — if the noise is very small, it will mostly be tracking, with a little bit of fuzziness. And we can evaluate the variance between our estimate and the actual state in a theoretical way: we cannot know the noise realizations in practice, but we can take ensemble averages and know statistically what to expect. So we have an expression — we got down to here — for the variance of our estimate at time k+1, in terms of the prediction, the properties of the noise, and the observer gain. L is still a number that we have to choose, and now we are going to choose it to minimize that variance. This is just calculus: set the derivative to zero, solve for L, and you can take the second derivative and show it is positive, so we have found a minimum. Then, having the optimal value L*, substitute it back into the previous formula, which depended on L. Going through some algebra, we get in the end that P_{k+1} = ξ² L*_{k+1}. So there is an optimal value of the gain, which can differ at different times — that is why it is indexed by k. However, if you keep iterating this, and if your system is stationary in the sense that the statistics are independent of time, then it goes to a steady-state value. And we can find that value, because everything here is explicit: we iterate and solve for the steady state using these equations.
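As a sketch, the iteration to the steady state can be carried out in a few lines (ν² and ξ² are illustrative values):

```python
# Iterate the optimal-gain recursion to its steady state:
#   P_minus = P + nu2                (prediction: diffusion adds variance)
#   L = P_minus / (P_minus + xi2)    (optimal gain, from dP/dL = 0)
#   P = xi2 * L                      (updated variance, = (1 - L) * P_minus)
nu2, xi2 = 1.0, 4.0  # illustrative process- and measurement-noise variances
P = L = 0.0
for _ in range(200):
    P_minus = P + nu2
    L = P_minus / (P_minus + xi2)
    P = xi2 * L

# At the fixed point, the gain solves L^2 + alpha*L - alpha = 0,
# with alpha = nu2/xi2 the signal-to-noise ratio discussed next.
alpha = nu2 / xi2
residual = L**2 + alpha * L - alpha
```

The iteration converges geometrically, so after a couple of hundred steps the quadratic is satisfied to machine precision.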
And when you substitute in, you get an explicit solution for the steady-state value of this gain. I have defined a kind of signal-to-noise ratio α — sometimes people call it the signal-to-noise ratio squared — which is the ratio of the thermal-fluctuation variance to the measurement-noise variance. So one noise is our "signal": the thermal fluctuations are the actual movement we are trying to learn about. The other, the observation noise, really is noise. Their ratio is the right dimensionless quantity to look at. So we take the positive root of the quadratic and get an explicit expression for L*. It is interesting to look at this in different limits. In the limit where α is much larger than one — thermal fluctuations very large, measurement noise very small — L* in this formula goes to one, and P goes to ξ², the measurement-noise variance. This is the usual situation that people implicitly assume, the naive situation where you basically just have measurements: you assume your measurement is the actual state of the system, just blurred by some noise. But it is more subtle when the two are comparable. And in the limit where α is much less than one — very small thermal fluctuations, huge measurement noise — your observations are almost always statistically the same, and there is just a little bit of underlying motion buried underneath. Then it says you should trust your model more than the observations: you pretty much want to ignore the observations and go with the prediction.
So what we have is a weighting between them. In the limit of infinitely noisy observations — if you knew your observations were infinitely noisy, meaning useless — you would just ignore them completely. What this exercise shows, though, is that for finite amounts of noise you should base your estimate to some extent on the prediction, which comes from the model, and to some extent on the measurement, which is the naive thing one might use alone — and that incorporating the prediction from the model improves on what you would get in the naive version. There is also a nice analytic result in this limit: L* goes to √α, and P*, the variance, goes to ξ²√α = νξ. Going back to real units for diffusion, the uncertainty in position is a kind of geometric mean between the thermal-diffusion scale and the measurement noise. Remember, the thermal scale was already a square root, √(2DΔt), so the position uncertainty carries a one-quarter power of the diffusivity. What is interesting is that the uncertainty scales only very weakly with the diffusivity of the particle: increase D by a factor of 10,000 and you only lose a factor of 10. And that opens up the possibility of tracking very rapidly diffusing objects. That has been exploited in a number of experiments, particularly biophysical experiments where you are tracking fluorescent labels: you want to keep the illumination low, because with photobleaching the molecule gets damaged. So we have very noisy observations.
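The two limits, and the weak quarter-power dependence on the diffusivity, are easy to check from the closed-form root (a sketch; the α values are arbitrary):

```python
import math

# Closed-form steady-state gain: the positive root of L^2 + a*L - a = 0,
# with a = nu^2 / xi^2 the signal-to-noise ratio of the lecture.
def L_star(a):
    return (-a + math.sqrt(a * a + 4.0 * a)) / 2.0

# Limit checks: L* -> 1 for a >> 1, and L* -> sqrt(a) for a << 1.
big, small = L_star(1e6), L_star(1e-6)

# Quarter-power scaling: position uncertainty sqrt(P*) = xi * sqrt(L*)
# ~ xi * a**0.25 for small a. Multiplying the diffusivity (hence a)
# by 10^4 should multiply the uncertainty by only ~10.
xi = 1.0
a0 = 1e-8                                    # illustrative, deep in the low-SNR regime
sigma0 = xi * math.sqrt(L_star(a0))
sigma1 = xi * math.sqrt(L_star(a0 * 1e4))    # diffusivity (hence alpha) x 10^4
ratio = sigma1 / sigma0
```

The ratio comes out close to 10, which is the factor-of-10 loss for a 10,000-fold faster particle mentioned above.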
And nonetheless, in this limit, people have been able to track — it is pretty amazing — individual dye molecules diffusing in water. That is a very high diffusion coefficient, around 400 µm² per second, for those who know these numbers. And yet one can actually track them while detecting only about 1000 photons per second. One can, of course, plot these results: this is the one limit here, this is the other limit here, with some curve connecting them; and one can also plot the corresponding variance in units of ξ². Now, another point of view is not to discretize explicitly the way we just did, but to think of the system as having hybrid dynamics. The real system is continuously diffusing, and we are getting information about it periodically. Our first approach was to take the equations of motion and figure out some way to discretize them. But you could also treat it as a continuous stochastic process and track, for example, the growth of the variance with the time interval between the last measurement and the next. This is nice if you are doing things like measuring photon by photon and reacting to every photon: the intervals between photon arrivals are random, so you do not have periodically spaced measurements — they come whenever the next photon happens — and you can handle that by keeping track of the underlying continuous process more carefully. This would involve solving the Fokker-Planck equation for the probability density of the state, which in this case is just a diffusion equation. So that is for those who are a little more expert.
And one thing, as I mentioned, is that this is a kind of averaging, and it is useful to relate what we are doing here to familiar ways of computing averages. So let's throw away the dynamical part and just imagine that we want to measure something constant, and we repeatedly measure it. We know from first-year lab courses that the uncertainty goes down as we average, so this is a good thing to do. Let's see how this arises in the formalism. In this case we have constant dynamics — just stationary, no process noise — but we do have measurement errors. So we can use the same results that we just derived, except that now we set ν = 0, which means α goes to zero. In that limit, I'll spare you the details of the algebra, but we can rewrite the estimator update, and what we get is x̂_{k+1} = [k/(k+1)] x̂_k + [1/(k+1)] y_{k+1}. So this is an explicit expression with a time-dependent gain: before, we solved for an infinite amount of time and looked at the steady state, whereas here we keep track of the L_k. And we see that L*_k = 1/(k+1). It is a time-dependent gain that eventually goes to zero — which is what the steady-state formula gave us for α = 0 — but at any finite time it takes a finite value. To get some insight into where this comes from, note that averaging is essentially the right thing to do here. We can write the average over k+1 measurements as the sum of all the measurements divided by k+1. But I want to write this recursively: split the sum into the first k terms and the last one, and multiply and divide the first part by k, so that it becomes the old average.
So then I can rewrite this as: x̂_{k+1} is the old average, plus a correction term with amplitude 1/(k+1), times the new information — the new measurement y_{k+1} minus the old estimate. So what we have is a recursive algorithm, and this is something you can always do with these linear problems: you can rework any linear estimation algorithm either as a batch method or as a recursive algorithm. Another context where you are used to the batch form is curve fitting. People have all done this: take some data, fit a straight line through it, and get an estimate of slope and intercept — again, a standard lab-course exercise. You might not know it, but you can do that algorithm recursively too. Imagine you take a whole bunch of data points, fit a line through them, and get the best slope and intercept. Now you take another point. You could just go back and redo the fit with n+1 points instead of n. But it turns out you can reformulate the fitting algorithm in much the same way we have been doing here, so that you take your old slope estimate and add a correction based on the new data point, relative to what the old fit predicted it would be. So you can go back and forth between batch algorithms, where you take all of the measurements at once and form an estimate, and recursive algorithms, where you keep your best estimate and update it as new information comes in. And with linear algorithms, the two forms are always related in this way. So this is a special case.
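The equivalence of the batch average and the recursive update with gain 1/(k+1) can be verified directly (the data values are arbitrary):

```python
# Recursive form of the running average: the batch mean over k+1 samples
# equals the recursive update with time-dependent gain L_k = 1/(k+1).
ys = [2.0, 4.0, 9.0, 1.0, 4.0]  # illustrative measurements of a constant

x_hat = 0.0
for k, y in enumerate(ys):
    # correct the old estimate toward the new sample with gain 1/(k+1)
    x_hat = x_hat + (y - x_hat) / (k + 1)

batch_mean = sum(ys) / len(ys)
```

The two agree to machine precision, which is the batch/recursive equivalence just described, in its simplest form.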
And now I want to go briefly through the general case — again, not paying too much attention to the algebra, which is here if you want to go through it, but hitting the main results. It is really the same picture, just with more notation. Any questions before that? Nothing conceptual is going to change here. Okay, so in the general case we have an n-dimensional state vector, an n-by-n dynamics matrix A, and a coupling coefficient B for the inputs — a vector for one input, or a matrix if there are several inputs. We have outputs y that are measured through a matrix C acting on x. And again there is noise: the averages of the noise are zero, and the noise sources are independent. Their magnitudes are now set by matrices, because for the thermal noise, in principle we must specify the noise on each component of the state vector. So we take ν_k times its transpose and form a matrix, ⟨ν_k ν_l⊤⟩ = δ_{kl} Q^ν. The delta again says that noise components at different times are independent; at equal times, they are related by some matrix, which in general has to be symmetric and positive semidefinite — it could be just a diagonal matrix with a variance for each component. And similarly for the observations: Q^ν is the covariance matrix of the thermal noise, and Q^ξ the covariance of the observation noise. These were just the numbers ν² and ξ² before; the matrices written here generalize those scalars.
Okay, so it is slightly more general, but the way to proceed is exactly the same: we form a prediction. The prediction is again just this: you take your old estimate and the known inputs, and you update to get x̂⁻_{k+1} = A x̂_k + B u_k. The dynamics are no longer the trivial ones we had for diffusion, but we are taking only the known, deterministic parts of the equation. You are saying: we know the dynamics, so we predict using only the dynamics — because we do not know the noise, and since the noise has zero mean, that is the best prediction. This gives us x̂⁻_{k+1}, and then we update it when we make a new measurement, in the same way: we take the prediction and correct it by something proportional to the difference between the actual observation and the observation we predict from the predicted state. So we take C times x̂⁻_{k+1} — multiply by C to make an output — and that is the output we are expecting. We compare it to what we actually measured, and that difference is what we use as feedback to make a better estimate. Someone asks: is C times x̂ to be multiplied by y? No — y equals C times x. So ŷ is what we are predicting the measurement to be. There is the state that we are predicting, but the state is not the measurement. Remember, x̂ might be n-dimensional while y is one-dimensional; y might be one component, or some function of all these variables. It is a different object, so it gets a different name.
So ŷ is what we expect the measurement to be at the next time, having taken into account all of the deterministic parts that would affect it. For a harmonic oscillator, say, measuring the position: the mass is connected to a spring that makes it oscillate, and ŷ is where the spring dynamics are expected to carry the position. But then there is both thermal noise on the system and observation noise, which make y differ from what is expected. And L here, if you are keeping track, now has to be a column vector — an n-dimensional column vector for a one-dimensional observation — or an n-by-p matrix if we have p observations. We'll stick with one. Okay, so the rest of the calculation is really very much the same, and again I don't want to go through all of the algebra. We form a P_k, which is now the covariance of the error: we take the difference between x_k and its estimate and turn it into a covariance matrix by multiplying by its transpose — an outer product. This gives us a matrix P_k which is symmetric by construction. And we also have the predicted version, P⁻_{k+1}, where instead of x̂_k we use the prediction. Then we can use the dynamics to simplify, and get a relationship for the prediction: the predicted covariance at time k+1 is the old covariance, dynamically evolved by the matrices, plus the covariance of the process noise, P⁻_{k+1} = A P_k A⊤ + Q^ν. That is a fancy way of saying what we had before — before, A was just one, so we didn't have to worry about it, and it was the old variance plus the variance of the disturbance.
So now there are some matrices and some dynamics, but otherwise it is the same thing. And then you do the incorporation of the measurement. Again, the details get complicated, but the concepts are the same: we have the prediction error, and we correct it; we want a best estimate written as the prediction plus a correction. It is useful to give a name to the difference between the actual measurement and the predicted measurement: the innovation. If we define that as a variable, we can go through and evaluate its covariance. It is complicated in detail, but it really involves just the same terms I was talking about before, so I think I'll skip to the summary — that may be easier. Okay, so to gather all of these together: this is the mean of the state and how we make the prediction; this is what we predict for the observation; this is the state covariance that we predict; this is the covariance of the innovations — the difference between the observation and the predicted observation has errors, so it has a covariance; and then there is a cross covariance between the state and the observation, which we will also need. We put all of these together and choose L by taking a derivative in the same way as before. That gives an expression for the optimal gain, which is now a vector (or possibly a matrix), and using that value we get the update for the state itself, and also for the covariance — and the covariance update is the one that is interesting to look at. It has a term A times the old covariance times A⊤: that is just the uncertainty time-evolved by the dynamics.
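A minimal sketch of one predict/update cycle of this general filter, with illustrative matrices — a two-dimensional state observed only through its first component; none of these numbers come from the lecture:

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.9]])   # dynamics matrix
B = np.array([[0.0], [0.1]])             # input coupling
C = np.array([[1.0, 0.0]])               # observe the first component only
Q_nu = 0.01 * np.eye(2)                  # process-noise covariance
Q_xi = np.array([[0.25]])                # measurement-noise covariance

def kalman_step(x_hat, P, u, y):
    # Predict with the known deterministic dynamics
    x_pred = A @ x_hat + B @ u
    P_pred = A @ P @ A.T + Q_nu
    # Innovation: measurement minus predicted measurement
    innov = y - C @ x_pred
    S = C @ P_pred @ C.T + Q_xi          # innovation covariance
    L = P_pred @ C.T @ np.linalg.inv(S)  # optimal gain
    x_new = x_pred + L @ innov
    P_new = (np.eye(2) - L @ C) @ P_pred # measurement shrinks the covariance
    return x_new, P_new

x_hat, P = np.zeros((2, 1)), np.eye(2)
x_hat, P = kalman_step(x_hat, P, u=np.array([[0.0]]), y=np.array([[1.0]]))
```

Note that the covariance update does not depend on the measured values at all, only on the noise statistics — which is why it can be iterated offline to a steady state.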
So if the position could be here, or here — which it might be, because of the uncertainty — then the force here differs from the force there, and after a certain time that spread may grow or shrink, depending on the dynamics. The dynamics take a spread of x values and change it over time, because different forces act at different positions. That is the meaning of the A P A⊤ term. Then there is another contribution from the disturbances: the diffusion process simply spreads the distribution over time. On the other hand, when we make an observation, we reduce the uncertainty, because we now have new information — so that enters as a negative contribution. Okay. And so this gives us, again, a set of time-dependent covariances, which can go to a steady state; when they do, everything becomes P*, and the k and k+1 indices go away. Now, what you might not pick up on immediately — because L is itself related to the P's — is that this is, in slightly disguised form, the same kind of equation we had for the control problem last time. And if you remember the discussion we had about duality: a system being controlled is dual to one being observed, where A goes to A⊤ and the input matrix goes to the output matrix, so B goes to C⊤, and the control weights go to the noise covariances. If you make this correspondence, it is formally exactly the same problem, and so in some sense we had to end up with the same thing. You can write this out in a slightly more explicit form for a continuous system, and then you can really see it if you compare this equation to what we had for the S matrix.
Last time — you'll see it is the same equation once you make the substitutions A → A⊤, C → B⊤, and so forth. So the problem of estimating the state is dual to the problem of controlling it. And this goes back to the idea that when you have a system, you observe it in the past in order to control it in the future; if you flip around the direction of time, the control part becomes the observation part. So in retrospect it should not be surprising that you end up with the same thing — but if you haven't thought about it, you can be completely surprised that you get the same complicated equation: the best way to use information from observations to estimate the state turns out to be basically the same problem as finding the best gains to minimize some cost function for the state and the control. So you can now put this all together and try to control the system: you have a system, and you are making observations of it. Yesterday, for example, we assumed that you simply observe the state. But now we have a way of taking a sequence of observations and turning it into an estimate of the state, even in the presence of thermal noise and measurement noise. And then we also have the problem of finding the control, using that estimate as our state. I want to save time for a couple of other things, so I'll go quickly, because the end result is very simple: for a linear system, these two problems decompose and can be treated separately. You can do exactly what I just said — run this Kalman filter, which is the name for the procedure I just described, and use it to form an estimate of the state at each time.
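To make the duality explicit, here are the two continuous-time steady-state Riccati equations side by side, in a standard textbook form (the symbols Q and R are assumed to match last lecture's LQR cost weights; the lecture's own S equation may use slightly different sign conventions):

```latex
% LQR control Riccati equation (cost matrix S) and
% Kalman-filter Riccati equation (error covariance P):
\begin{align}
  0 &= A^{\mathsf T} S + S A - S B\, R^{-1} B^{\mathsf T} S + Q, \\
  0 &= A P + P A^{\mathsf T} - P C^{\mathsf T} (Q^{\xi})^{-1} C P + Q^{\nu}.
\end{align}
```

One equation maps onto the other under A ↔ A⊤, B ↔ C⊤, R ↔ Q^ξ, Q ↔ Q^ν, S ↔ P, which is the correspondence just described.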
And then you can do what we did yesterday: assume that at every moment in time you know the state, and design a feedback law for it. It is not obvious — and in general not true — that these problems decouple: that you can separately estimate the state, use that estimate as if it were a perfect measurement, and have that be the optimal thing to do. In general, it fails for a nonlinear system; for linear systems these problems really do separate, and that is the separation principle. I go through this a little in the notes, but since the result is quite simple, we'll skip past the algebra. In particular, I go back to the one-dimensional example and try three different strategies. The first assumes that we have perfect state information, as before, and this leads to an expression for the variance. The overall question is: how well can we keep the particle at a given position, given that it is subject to disturbances, that we measure it only at intervals Δt — so we are not controlling it during those times — and that there is observation noise? If there is no observation noise, you can compute the variance directly. The second strategy uses the observations naively — what I call naive feedback: the feedback is just to apply a negative control, based on the raw measurement, to push the particle back toward the origin. You can explicitly model what that does, and you get another expression for the variance, which is higher; and when you look at the optimal value of the control gain, it is lower, corresponding to that higher variance.
And then, third, you can do it with the whole observer structure that we just sketched out, and show that it is optimized with the same feedback gain that you would get in the perfect-information picture, although the variance is still higher. That illustrates the separation principle; and of course you can go through the general formulas — the algebra is not so pleasant, but the end result is quite simple. Here, by the way, is an example where the separation principle would not work, just to convince you that it is a special result for linear systems. Imagine you are trying to control the position of a particle that naturally relaxes to the origin. We can control it by pushing it around, and there are also thermal fluctuations. But now suppose somebody puts up a little screen — which I've sketched in blue — so that you cannot see the particle at all in this interval, but you can see it when it is out here or here. Without going through a formal calculation, you can see that the best thing to do — particularly if you care about minimizing the amount of control that you use — is to do nothing while the particle is inside the screened region, because you have no information at all about where it is, and to apply control only when you can actually see it. So in this case, the feedback mechanism has to know about the screen, and the observations tell it when to apply the feedback or not. So it violates the separation principle. The point is just that this idea — make your best estimate and then use it naively, as if it were perfect information — only works for linear systems.
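As a sketch of the linear case, where separation does hold, here is a small simulation comparing naive feedback on the raw measurement with feedback on the filtered estimate, for the diffusing particle; all noise levels, gains, and run lengths are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
nu, xi, g, N = 1.0, 2.0, 0.8, 20000   # noise std devs, feedback gain, steps

# Steady-state optimal observer gain for these noise levels (same recursion)
P = L = 0.0
for _ in range(200):
    P_minus = P + nu**2
    L = P_minus / (P_minus + xi**2)
    P = xi**2 * L

def closed_loop_variance(use_estimate):
    """Simulate x_{k+1} = x_k + u_k + nu_k, observed as y_k = x_k + xi_k."""
    x = x_hat = 0.0
    total = 0.0
    for _ in range(N):
        y = x + xi * rng.standard_normal()
        x_hat = x_hat + L * (y - x_hat)          # measurement update
        u = -g * (x_hat if use_estimate else y)  # feedback on estimate vs raw y
        x = x + u + nu * rng.standard_normal()   # true dynamics
        x_hat = x_hat + u                        # prediction with the known input
        total += x * x
    return total / N

var_naive = closed_loop_variance(use_estimate=False)
var_filtered = closed_loop_variance(use_estimate=True)
```

Acting on the raw measurement feeds the full observation noise back into the dynamics, so the naive loop's variance comes out visibly larger than the filtered loop's.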
And otherwise you have to think about it on a case-by-case basis. Okay. We just have a couple of minutes, and I want to sketch a couple of generalizations and one fun example. So let's go back to just the control problem and forget about measurement noise, to simplify life, but now think about situations where the dynamics we are interested in are nonlinear in some way. We have ẋ = f(x, u) plus some noise, which has the usual properties: in continuous time, independence of the noise at different times means ⟨ξ(t)ξ(t′)⟩ ∝ δ(t − t′), so the noise is correlated only at the same instant. And one can go back to the Bellman principle that we talked about last time, and do it for a stochastic system. Remember that one of our concepts was the cost-to-go function: the cost to reach the final goal at time τ, starting at some time t before it — so we are interested in the cost of the final part of the interval. Before, we expressed it as the terminal cost plus the running cost. Now, in a stochastic system, we define the cost-to-go as the expectation value of this, averaged over many trials. And in the same way as last time, we try to choose the u(t) that minimizes this cost-to-go. How to do it is not entirely obvious, but of all the possible controls you could apply between now and then, you choose the one that minimizes — we call the result J*, where we have done this search over all possible controls on that interval. That is a continuous infinity of functions, so it is a lot.
The idea is that when you add noise to the system, it produces some extra diffusion in the state: a mean-square displacement that grows like the noise strength times the time increment. So we should expand the expectation of J about the current position to second order in Δx, because the noise contributes a Δx of order √Δt over a small time interval; keeping second order in Δx means keeping everything to first order in the time increment. The upshot is that we end up with the Hamilton-Jacobi-Bellman equation with one extra term, proportional to the second derivative of J, which comes from that second-order expansion due to the noise. This term represents the diffusion that's happening because of the noise in the dynamics, and the result is known as the stochastic Hamilton-Jacobi-Bellman equation. I just want to show what this can lead to, so let's do one problem, to give a taste of what can happen in more complicated situations. Imagine a particle in two dimensions: in one dimension it drifts deterministically, and in the other it diffuses. It's a little artificial, but the dynamics are simple. So it's diffusing up and down, and we can also apply a force on it, up and down. The goal is to make it pass through one of two holes in a wall when it gets to the end. It takes a deterministic amount of time to drift from here to the wall, and while it does, it will be wandering up and down. If we do nothing, it will probably just hit the wall; if we apply some force, it can go through either this hole or that one. So the goal is to make sure it goes through one of the two holes, we don't care which, using the minimum amount of force.
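Written out in the notation above (again my own, with noise correlation ⟨ξ(t)ξ(t′)⟩ = ν δ(t − t′) assumed), the extra second-derivative term makes the equation look like this:

```latex
% Dynamics: \dot{x} = f(x, u) + \xi, with \langle \xi(t)\,\xi(t') \rangle = \nu\,\delta(t - t').
% Since \Delta x \sim \sqrt{\Delta t}, expanding E[J^*] to second order in \Delta x
% keeps everything to first order in \Delta t, and yields the stochastic
% Hamilton-Jacobi-Bellman equation:
-\frac{\partial J^*}{\partial t}
  = \min_u \left[ L(x, u, t) + f(x, u)\,\frac{\partial J^*}{\partial x} \right]
  + \frac{\nu}{2}\,\frac{\partial^2 J^*}{\partial x^2}
```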
In some sense, one strategy would be to wait until you're right next to the screen and then use a huge force to push the particle to one of the holes in an instant. That's a possible strategy, but if control costs you, say with a cost proportional to u squared, it's not a good choice. So the particle starts in the middle, midway between the holes, and reaches the wall after a known time; you can put the diffusion constant in to make that dimensionless, but that's not so important. It's diffusing like this, and you can push it up and down as well. A question from the audience: is it possible that the particle misses the holes entirely? Yes, if you don't do anything, it will diffuse and randomly hit the wall. In fact, we're going to make the holes infinitesimally small, so with probability one it would hit the wall. Okay. So: at time zero the particle is at x = 0. The motion is given by this equation, so the velocity is whatever you put in as the control, plus the diffusion term; the holes we want it to reach are at precisely plus or minus one. The noise follows this diffusion law, and our cost is, with some coefficient r, one half r u squared. The cost-to-go is then the integral from time t to τ of this function. So let's put this into the stochastic Hamilton-Jacobi-Bellman equation. This is our running cost, written in terms of the J* that we're trying to figure out: we have a first-derivative term times u, which we had before, and now a second term involving the noise strength. The first thing to do is minimize with respect to u, which is what we were doing last time.
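As a sanity check of the claim that the uncontrolled particle almost surely hits the wall, here is a minimal Euler-Maruyama simulation of the transverse dynamics with the control switched off. All parameter values (noise strength, final time, slit positions, tolerance) are my own illustrative choices, not values from the lecture:

```python
import numpy as np

# Uncontrolled transverse dynamics: dy = u dt + dxi with u = 0,
# and noise variance <dxi^2> = nu * dt. All parameters are illustrative.
rng = np.random.default_rng(0)
nu, tau, dt = 1.0, 2.0, 0.01       # noise strength, time to reach the wall, step
n_traj = 20_000
y = np.zeros(n_traj)               # every particle starts midway between the slits

for _ in range(int(tau / dt)):
    y += rng.normal(0.0, np.sqrt(nu * dt), size=n_traj)  # Euler-Maruyama step

# Terminal spread grows diffusively, as sqrt(nu * tau):
print(y.std())

# Fraction of trajectories ending within eps of either slit at y = +/-1;
# as the slits shrink (eps -> 0) this fraction goes to zero, i.e. with
# probability one the uncontrolled particle hits the wall.
slits, eps = np.array([-1.0, 1.0]), 0.05
near = np.min(np.abs(y[:, None] - slits[None, :]), axis=1) < eps
print(near.mean())
```

With these numbers the terminal standard deviation is about √2, and only a few percent of uncontrolled trajectories happen to end near a slit.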
There are no constraints or anything, so you can do the minimization directly, by taking the derivative with respect to u and setting it to zero. Substituting that u back in, you get this awful-looking partial differential equation: a ∂J*/∂t term, but now also a (∂J*/∂x)² term, plus a first-derivative and a second-derivative term. So this looks kind of hopeless. But then there's a miraculous change of variables that you can do, the Hopf-Cole transformation; I'll let Rafael answer any questions about when and where it helps, but it works for an equation of this form. We let J = −λ log ψ, where ψ is our new variable, and in terms of the new variable we get a linear diffusion equation. So we've taken this nonlinear partial differential equation, applied a nonlinear coordinate transformation, and it becomes a linear equation that we can actually solve. And again, as Rafael was explaining, exactly which classes of equations admit this wonderful transformation is a separate story; it works for this one, but not for all nonlinear equations. Here, a little bird told us to do it, and we're happy. So now we just have to solve the diffusion equation. Notice that there's a minus sign, so, as is usual in these problems, we solve it backwards in time from the goal: a diffusion problem run backwards from a final condition given by a sum of delta functions at the holes. That's something one can solve for ψ and then turn back into an expression for J. The interesting thing is the result: if you plot J* and the corresponding control u, which is basically its derivative, the cost goes from a quadratic-looking shape to a quartic to a double-well shape.
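To make the Hopf-Cole step concrete, here is a small numerical sketch in my own notation, assuming slits at y = ±1 and illustrative values ν = r = 1: backward diffusion out of the two delta functions makes ψ a sum of Gaussian kernels whose variance ν(τ − t) grows with the time remaining, J* = −λ log ψ with λ = νr, and the optimal control is u* = −(1/r) ∂J*/∂y:

```python
import numpy as np

NU, R = 1.0, 1.0          # noise strength and control weight (illustrative)
LAM = NU * R              # lambda = nu * r makes the transformed PDE linear
SLITS = (-1.0, 1.0)       # slit positions (assumed, not from the slides)

def psi(y, t_to_go):
    """Backward diffusion from delta functions at the slits: a sum of Gaussians
    of variance nu*(tau - t). Normalization is dropped; it only shifts J by a
    constant and cancels in the control."""
    s2 = NU * t_to_go
    return sum(np.exp(-(y - a) ** 2 / (2 * s2)) for a in SLITS)

def J_star(y, t_to_go):
    """Optimal cost-to-go via the Hopf-Cole (logarithmic) transformation."""
    return -LAM * np.log(psi(y, t_to_go))

def u_star(y, t_to_go, h=1e-4):
    """Optimal control u* = -(1/r) dJ*/dy, by central difference."""
    return -(J_star(y + h, t_to_go) - J_star(y - h, t_to_go)) / (2 * h * R)

def curvature_at_midpoint(t_to_go, h=1e-3):
    """Curvature of J* at y = 0: positive for a single well, negative for a
    double well (the quadratic-to-quartic-to-double-well sequence)."""
    return (J_star(h, t_to_go) - 2 * J_star(0.0, t_to_go)
            + J_star(-h, t_to_go)) / h ** 2

# Far from the end the cost is a single well (do almost nothing); near the
# end it is a double well (steer toward the nearest slit).
print(curvature_at_midpoint(4.0), curvature_at_midpoint(0.1))
```

With these numbers the curvature at the midpoint is positive when the time-to-go is large and negative when it is small, and u* at y = 0.5 near the end is positive, i.e. pushing toward the nearer slit at +1.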
It's exactly like a free energy in a phase transition. And the feedback law associated with it goes from a linear feedback law, like the ones we've been talking about, to something quite nonlinear. What this means for our problem is that the right strategy, at first, is to do essentially nothing: just weakly confine the particle around the midline. You confine it weakly enough that the typical diffusive spread is comparable to the separation of the slits, so that by diffusion it might happen to wander close to one slit. Then, at a critical time before the end of the protocol, you switch strategies and say: whichever slit I'm closest to, I will steer toward, to make the particle go through it. This strategy is advantageous because, since the variance is comparable to the separation of the slits, a reasonable fraction of the time the particle will by chance be fairly close to one slit and you won't need to steer it much. If you confined it more tightly, you'd always have to steer it about half the distance between the slits; if you confined it too loosely, it could end up far from both. There's just the right amount of slop to give it, and then you go to the closest slit. But the really interesting thing, from my point of view, is that there is a non-analytic switch in strategies, kind of like a phase transition, at a specific time when you change modes from just waiting, with the right amount of slop, to steering toward the nearest target. The point of introducing this example, and going way too fast through the formalism, was to say that in these nonlinear kinds of problems you get things like phase transitions in strategy. That's a new qualitative feature that arises in the nonlinear problem. We'll see another example of this tomorrow, but I just wanted to give a first one that comes out of this kind of calculation.
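A back-of-envelope way to locate the critical switching time (my own estimate, following from the Gaussian-sum solution sketched above with slits at y = ±a): the curvature of J* at the midpoint changes sign when the diffusive variance matches the slit separation.

```latex
% With \psi a sum of two Gaussians at y = \pm a of variance \nu(\tau - t),
% the curvature of J^* = -\lambda \log\psi at the midpoint is
\left.\frac{\partial^2 J^*}{\partial y^2}\right|_{y=0}
  = -\lambda \left( \frac{a^2}{\nu^2 (\tau - t)^2} - \frac{1}{\nu (\tau - t)} \right),
% which changes sign at \nu(\tau - t_c) = a^2. Before t_c the cost is a single
% well (wait, weakly confine); after t_c it is a double well (steer to the
% nearest slit).
```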
So again, this is a case where we're ignoring any observations. A question from the audience: the switch, where it changes from waiting to choosing a slit, what does the time at which it happens depend on? It depends on how much you weight the control: if r is large, so that control is very costly... I don't have the details here, I'm sorry, but the switching time is set by how much you weight the control and by the diffusion, how far the particle is likely to wander relative to the slits. The idea is: now that the end is approaching, I'm going to have to apply some control, and when I start depends on how costly it is. Another question: I'm trying to combine the first part of the lecture with the second. Say you want to control a system with an optimal protocol: you first design it using the techniques you've explained, and then you implement it on your system and measure with the estimation strategies from the first part, right? Yes, but the thing I was trying to say with the example of the particle behind the cover, where you couldn't see it in the center but could see it outside, is this. The natural thing to do is just what you said: solve the estimation problem so you can estimate your state, and then, given that estimate, apply the control that would be right for that state. That's only fully optimal in the case of linear dynamics.
Often you're doing nonlinear things, and then you have to check how good that is; there may be some coupled estimation-control algorithm that does better, so you'd have to prove that the separated approach is the right thing to do. So that's what I had for today. Just a word about tomorrow. Tomorrow is going to be a little bit different. In the first half I want to revisit this estimation problem from the point of view of Bayesian inference. It will lead to the same thing, but the viewpoint is different. The advantage of doing things the way I just did is, first of all, that it's the historical way they were developed, and second, it's a very seat-of-the-pants approach: we have a covariance of the errors that we make, and we're just trying to tune the gain to minimize it. So it's a pretty simple approach; while the algebra is complicated, the concepts are simple. The Bayesian approach is maybe a little more abstract, but once you get the hang of it, I think it's a more elegant way to look at it, and it also suggests how to do this kind of filtering, or state estimation, in cases where the dynamics are not linear. We've just done it for a linear system, but you can do the same kind of thing for nonlinear systems as well, so that's the way to go, and we'll talk a little bit about that in the first part. Then in the second part I'll shift gears into more of a seminar mode and present some work that we've just done that uses all of this. We're almost there; again, I'm sorry for all the glitches. Thank you very much for one more great lecture. I would just like to remind all in-person participants, for tomorrow's session and for the session later today, that we should try to be on time because of the lectures.