So in this lecture I want to cover the fourth approach to data assimilation, to implementing the forecast-analysis cycle, that we're going to discuss, which is the particle filter. The particle filter corresponds to that fourth case of uncertainty propagation that we discussed, which is running a Monte Carlo approach to propagating uncertainty. The idea there is that you run a large enough ensemble that you can approximate the posterior probability distribution just by samples from that distribution. This is the same approach that we typically take in Bayesian MCMC: we fit a model and what we get out are samples from the posterior. Here we take those samples and run them forward in the forecast, so we now get a sample of forecasts, with thousands to tens of thousands of ensemble members involved, which gives us a good approximation of the distribution.

The key advantage of the particle filter is that we have gotten rid of the need to make distributional assumptions about the shape of the forecast, which can be really powerful. It is also very easy to relax a lot of the normality assumptions of the Kalman filter, which gives you a lot more flexibility in your choice of data model. It likewise makes it a lot easier to propagate other sources of uncertainty and to update things like parameters, which can be harder to do in the classical Kalman approaches.

That said, the problem we encounter with the Monte Carlo approach to uncertainty propagation arises when we come to the analysis step. At the analysis step we get new data, and we need some way of updating our forecast with this new information. But our forecast in this case is just a set of ensemble predictions, which we'll call particles. Each ensemble member is a particle, so how do we update that particle? That's the key problem.

The first key insight of the particle filter is that we do this by assigning weights to those particles. For example, when I first start out, I sample all of my particles from the initial conditions, and they all have equal weight, because they were sampled randomly and so are all equally likely. If I run them forward into the future and then observe some data, some of those particles will be more compatible with the data and some will be less compatible. We can make that formal using the likelihood, the probability of the data given the model, and we can calculate it independently for each particle. Every single forecast gets compared to the data, and from that I get back the likelihood of that individual particle. I can then weight each particle, each ensemble member, in proportion to that likelihood: I sum up the likelihoods across all of the particles and divide each particle's individual likelihood by that sum to get its weight, so the weights sum to one. When I do that, particles that are more consistent with the data get a higher weight and particles that are less consistent with the data get a lower weight. So now every particle carries two pieces of information: a state, the thing that we actually forecast, and a weight. To make use of these weights, we can apply a lot of the same concepts that we use when processing MCMC output.
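To make the weighting step concrete, here is a minimal sketch in Python. The setup is hypothetical: I assume a scalar state, a forecast ensemble stored in `particles`, and a Gaussian observation error model with a known standard deviation `obs_sd`; none of these names come from a particular package, and your own data model may differ.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# Hypothetical forecast ensemble: each particle is one ensemble member's
# prediction of the state at the observation time
n_particles = 5000
particles = rng.lognormal(mean=2.0, sigma=0.3, size=n_particles)

y_obs = 8.0    # the new observation that arrives at the analysis step
obs_sd = 1.0   # assumed Gaussian observation error standard deviation

# Likelihood of the data given each particle, evaluated independently
likelihood = norm.pdf(y_obs, loc=particles, scale=obs_sd)

# Normalize: divide each particle's likelihood by the sum of all the
# likelihoods, so the resulting weights sum to one
weights = likelihood / likelihood.sum()
```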
For example, if I want to know the mean after the analysis, I can calculate the mean of those particles, but unlike in MCMC, where all the samples count equally, I now have to calculate a weighted mean. Likewise, if I want to calculate the variance of my forecast after the analysis, I can do that as the weighted variance. Similarly, I can do a weighted histogram, or a weighted calculation of the quantiles, to get a confidence interval. So we can do all of the posterior statistical analyses that we want on the output of the particle filter; we just need to use the weighted versions, which I'll sketch in code below.

This idea of weighting the particles actually leads to a second problem that we need to overcome in using the particle filter: as we assimilate data sequentially, weight keeps accumulating on certain particles. I start out, I observe some data, and I add some weight; some parameters become more likely and some become less likely, some states become more and less likely. Then each time I observe more data, I add more weight to certain particles and take weight away from others. That creates a problem called degeneracy, which shows up in two ways that are really two sides of the same coin. The first is that when I put a lot of weight on the small number of ensemble members that are most consistent with the data, I'm effectively reducing my sample size; I'm no longer doing a good job of approximating the histogram. In the limit, if all the weight ends up on one particle, then my histogram is just a point estimate at that one particle, and it's not really a good histogram if I only have one data point in it. I've lost what I started with, which was the attempt to represent the distribution of the forecast just by samples from that forecast. On the flip side, I have particles that keep getting lower and lower weights, at which point I'm doing all the computation of continuing to make forecasts with those particles while they contribute almost nothing to my forecast, because I then multiply them by a weight that downweights them considerably.

The standard solution to this problem of degeneracy, of weight collapsing onto a small number of particles, causing a small effective sample size and wasted computation on particles that aren't contributing anything, is what's called resampling. Resampling is simply the process of resampling with replacement in proportion to the weights. I will resample the particles that have a high weight more often, I will resample the particles that have a low weight less often, and the ones that have extremely low weight might not be resampled at all; they might be lost from the set of particles we're considering. So in a sense, we're pruning out the particles that aren't contributing to our forecast and replicating those that are near the center of our distribution. For this to work, we rely on having a forecast model that includes process error. If I have a model whose forecast is completely identical every time I give it the same inputs, then I'm not really gaining anything by resampling a particle. What I need is a forecast model where, if I resample a given set of initial conditions and make a forecast with it, I get slightly different predictions each time, so that the duplicated particles spread back out in that part of forecast space. This idea of resampling itself has some subtleties that need to be addressed.
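Going back to the weighted summary statistics described above: continuing the sketch from earlier, here is one way to compute a weighted mean, weighted variance, and weighted quantiles. The `weighted_quantile` helper is a hypothetical utility written for this example, not a standard library function.

```python
def weighted_quantile(x, w, probs):
    """Quantiles of the values x under normalized weights w."""
    order = np.argsort(x)
    x_sorted, w_sorted = x[order], w[order]
    # Interpolate the sorted values against the cumulative weights
    return np.interp(probs, np.cumsum(w_sorted), x_sorted)

# Weighted mean and variance of the analysis
post_mean = np.sum(weights * particles)
post_var = np.sum(weights * (particles - post_mean) ** 2)

# Weighted 95% interval from the weighted quantiles
ci_lo, ci_hi = weighted_quantile(particles, weights, [0.025, 0.975])
```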
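And here is a minimal sketch of the resampling step itself, again continuing the same example. We draw particle indices with replacement in proportion to the weights and then reset all the weights to be equal. The multiplicative lognormal jitter at the end is purely illustrative: it stands in for the process error that a real forecast model would add when the duplicated particles are run forward.

```python
# Resample with replacement in proportion to the weights: high-weight
# particles tend to be duplicated, very low-weight ones tend to be dropped
idx = rng.choice(n_particles, size=n_particles, replace=True, p=weights)
particles = particles[idx]

# After resampling, every particle carries equal weight again
weights = np.full(n_particles, 1.0 / n_particles)

# In a real application the forecast model's process error is what lets
# duplicated particles spread back out; this jitter just stands in for
# that spread for illustration
particles *= rng.lognormal(mean=0.0, sigma=0.05, size=n_particles)
```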
So if I resample too often, then I will lose particles just by random drift. Imagine I had all the particles weighted equally and I resampled: I would eliminate some of them just by chance. And if I keep resampling, I'll keep eliminating some just by chance, so even if all of them were equally likely, eventually I would converge on just one of them. On the flip side, if I don't resample, then I encounter the degeneracy problem of all the weight falling on a small subset anyway. So what we're going to do is employ a set of heuristics, which often need a bit of tuning for each project, to decide how often we want to resample, and set a criterion for that. One common criterion is to calculate the effective sample size of the set of particles; when that effective sample size drops below some threshold, that triggers a resampling. For example, if the effective sample size is less than half the total number of particles, that might trigger a resampling event; I'll sketch that criterion in code at the end of this section.

This idea of resampling only works when the particles are able to diverge from each other as we move forward in time. That's easy to see for the states of the model, where process error will cause different ensemble members, different particles, to make slightly different predictions even if they start from the same initial conditions. Where this remains a problem is with the model parameters. If I resample the model's parameters and run the model forward, at the end of the forecast the parameters are still the same. Parameters don't have this process of diverging as the forecast runs, so they don't fill in parameter space on their own. Because of that, there are a few variants of the particle filter that attempt to address this problem by creating ways for the parameters to spread out a bit, but the details of those are beyond this short video.

So, to wrap up our discussion: the particle filter has many advantages. It is relatively simple to implement, very general, very flexible, and very easy to parallelize if you have access to computers where you can run different ensemble members on different CPUs or different nodes. It also has the flexibility to include not just your state variables but also your parameters and other sources of uncertainty. The main disadvantage of the particle filter is its computational cost. You need to run a much larger ensemble, because you need enough ensemble members to continually approximate the full distribution, and then you need to continually keep an eye on that distribution, using resampling and resample-move, to make sure that the effective sample size remains large enough to approximate it.
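Finally, here is the effective-sample-size trigger promised above, continuing the earlier example. The formula N_eff = 1 / Σ w_i² is the standard effective sample size for normalized weights; the N/2 threshold is just the example heuristic from the lecture and would typically be tuned per project.

```python
def effective_sample_size(w):
    """Effective number of particles for normalized weights w: 1 / sum(w_i^2)."""
    return 1.0 / np.sum(w ** 2)

# Heuristic trigger: resample only when fewer than half the particles
# are effectively contributing to the approximation
if effective_sample_size(weights) < n_particles / 2:
    idx = rng.choice(n_particles, size=n_particles, replace=True, p=weights)
    particles = particles[idx]
    weights = np.full(n_particles, 1.0 / n_particles)
```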