So the idea of hierarchical models is that they represent a middle ground between fitting a model independently to different groups of data and fitting that same model to the whole data set all at once. These approaches are useful because our data often have a lot of structure in how they are collected and organized. For example, we might have repeated measurement campaigns over time, so there may be year-to-year variability or census-to-census variability we're trying to capture. We might also have variability among measurement units: if I put out plots, I would want some measure of variability from plot to plot, or watershed to watershed, lake to lake, island to island, whatever our measurement unit is and however those measurements are structured. Often it's not just the measurement unit itself but how it is structured: I might be measuring individuals within a plot, and those plots might be measured at multiple sites, so there is structure to how the data are collected. One reason that's important is that observations within one of these observational units are not necessarily independent of each other, so we want to account for the fact that observations within a plot (or other measurement unit), or observations within a year, are likely to be more similar to each other than to the rest. Like I said, at a high level hierarchical models are an attempt to write down models that describe the variability in the parameters of our models. To take a very simple example, suppose I'm trying to write down a model for the mean of a process. Right now I might have two options: fit one mean to all of the data, or fit a separate mean to every single data set uniquely.
But a hierarchical model might say: I have a mean for every site, or a mean for every year, but I also have another layer in the statistical model that describes the site-to-site or year-to-year variability, where what I'm trying to capture is the variability in that parameter, the mean, from site to site or year to year. They're called hierarchical because at one level I have a model of the data, and at another level I have a model describing the variability in the parameters of the first model, for example the variability in the mean. One thing this hierarchical approach does is allow us to partition the uncertainty in a process into multiple terms. Instead of just having one overall residual error, I might be able to say: here is the residual error, here is the unexplained year-to-year variability, and here is the unexplained plot-to-plot variability. Another thing that is really useful about hierarchical models is that they allow us to do what we call borrowing strength. As an example, imagine I have measurements on, say, five plots; I have a whole lot of data on four of them, but one happens to be very data poor because, for whatever reason, I have less sampling there. I might be able to get well-constrained estimates of what's going on in the heavily measured plots, but I have very little constraint on what's going on in the less measured plot.
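The partitioning idea above can be sketched numerically. This is a minimal simulation with made-up numbers (plot, year, and residual standard deviations are my own illustrative choices, not from the text): observations share a plot effect and a year effect, and the overall spread decomposes exactly into within-plot and between-plot pieces.

```python
import numpy as np

# Illustrative simulation (toy numbers): one measurement on each of 5 plots
# in each of 10 years, where plot-to-plot, year-to-year, and residual
# variability enter as separate additive terms that a hierarchical model
# could, in principle, estimate and partition.
rng = np.random.default_rng(42)
n_plots, n_years = 5, 10
sigma_plot, sigma_year, sigma_resid = 3.0, 2.0, 1.0

plot_effect = rng.normal(0, sigma_plot, n_plots)        # shared within a plot
year_effect = rng.normal(0, sigma_year, n_years)        # shared within a year
resid = rng.normal(0, sigma_resid, (n_plots, n_years))  # leftover residual error
y = 10.0 + plot_effect[:, None] + year_effect[None, :] + resid

# Law of total variance: the overall spread decomposes exactly into the
# average within-plot spread plus the spread of the plot means.
within_plot = np.mean(y.var(axis=1, ddof=0))
between_plot = y.mean(axis=1).var(ddof=0)
total = y.var(ddof=0)   # equals within_plot + between_plot
```

The same decomposition applied along the other axis separates out the year-to-year piece, which is exactly the "multiple terms" partition described above.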
So if I have a model at the hierarchical level describing the site-to-site variability, then in a Bayesian sense that model acts as the prior for each of the site-level calibrations, and it essentially allows the well-informed sites to generate an informative prior on the less constrained site. Except we're not doing this as a sequential operation: we do it all at once as we fit the whole model, updating the fits at each site at the same time that we update the across-site constraint. One important special case of hierarchical models is what are called random effects models. Random effects models rewrite this idea of a model describing the parameters into one component that describes the overall mean value of those parameters, and another set of parameters that describes each site's deviation from that overall mean. This is mathematically equivalent to writing down a collection of means plus an overall mean, but it has the advantage that we can often write these hierarchical effects as additive combinations of a global value plus an anomaly for each specific effect. If multiple things structure the variability, that makes it easier to write down multiple terms: I can write an overall mean, plus the anomaly for site-to-site variability, plus the anomaly for, say, interannual variability, without having to structure this as sites nested within years or years nested within sites and create multiple layers of hierarchy, which starts getting complicated. If we have models with random effects and we combine them with more traditional covariates, which we call fixed effects, that combination of random effects and fixed effects gives us what is called a mixed effects model.
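Borrowing strength can be made concrete with the standard normal-normal shrinkage formula. This is a hedged sketch with toy numbers (the within-site variance `sigma2`, site-to-site variance `tau2`, and across-site mean `mu` are assumed known here for illustration; in a real fit they would be estimated jointly with everything else): each site's estimate is a precision-weighted average of its own sample mean and the across-site mean, so the data-poor site gets pulled hardest toward the global value.

```python
import numpy as np

# Toy numbers, not from the text: variances and the across-site mean are
# treated as known so the shrinkage weights have a closed form.
sigma2 = 4.0   # within-site (residual) variance
tau2 = 1.0     # site-to-site variance
mu = 20.0      # across-site mean

# Five sites: four data-rich (n = 50), one data-poor (n = 2).
n = np.array([50, 50, 50, 50, 2])
ybar = np.array([22.0, 19.5, 21.0, 18.5, 25.0])  # site sample means

# Precision weighting: w is the weight a site puts on its own data.
# Lots of data -> w near 1 (trust the site); little data -> w small
# (lean on the across-site prior).
w = (n / sigma2) / (n / sigma2 + 1 / tau2)
shrunk = w * ybar + (1 - w) * mu

pull = np.abs(shrunk - ybar)   # largest for the data-poor site (index 4)
```

The data-poor site's weight is (2/4) / (2/4 + 1) = 1/3, so its estimate moves two-thirds of the way from its noisy sample mean toward the across-site mean, while the data-rich sites barely move. That is the "informative prior from the well-informed sites" in action.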
Mixed effects models have both traditional regression coefficients and covariates plus some set of random effects, so a linear model with random effects is a linear mixed model, and a generalized linear model with random effects is a generalized linear mixed model. Those are really common special cases of hierarchical models. Beyond using hierarchical models to describe processes, and to partition out the variability when we're trying to understand a process, they can also be very useful when we're trying to make predictions. One thing they do is allow us to partition variability when we make predictions, and that often gives us a way of thinking very formally about the difference between in-sample and out-of-sample predictions. For example, I described the idea of fitting a model for a bunch of different sites along with a model describing the site-to-site variability. If I want to make a prediction within a site, I use the model that I fit for that site and the parameters that are appropriate to that site. By contrast, if I want to make a prediction for a new site, I don't have constrained parameters for that site, but I do have an estimate of the global, across-site mean of those parameters and an estimate of the site-to-site variability. So if I want to make a prediction for a new site, I need to be able to integrate over that uncertainty about the site-to-site variability.
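The mixed model structure described above can be written down directly. This is a simulation sketch with invented numbers, not a fitting routine: a fixed intercept and slope shared by all sites, plus an additive random intercept per site drawn from the site-to-site distribution.

```python
import numpy as np

# Sketch of a linear mixed model: y = X @ beta + site_effect[site] + eps.
# beta holds the fixed effects (shared intercept and slope for covariate x);
# site_effect holds the random effects (one additive anomaly per site,
# drawn from the site-to-site distribution with sd tau). Numbers are toy.
rng = np.random.default_rng(7)
n_sites, n_per_site = 4, 25
beta = np.array([2.0, 0.5])            # fixed effects: intercept, slope
tau = 1.5                              # sd of the random site intercepts
site_effect = rng.normal(0, tau, n_sites)

site = np.repeat(np.arange(n_sites), n_per_site)   # site label per observation
x = rng.uniform(0, 10, site.size)                  # covariate (fixed effect)
X = np.column_stack([np.ones_like(x), x])
y = X @ beta + site_effect[site] + rng.normal(0, 0.3, site.size)

# Ordinary least squares that ignores the grouping still runs, but its
# residuals are correlated within a site -- exactly the non-independence
# the random effect is there to absorb.
betahat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Fitting the random effects properly (rather than ignoring them as the OLS line does) is what packages for mixed models do; the point here is only the additive global-plus-anomaly structure.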
So this very naturally leads to greater confidence about forecasts made within sites that we're already studying, and less confidence when we make predictions at new sites. That formalizes what I think is a very important intuition: when we make a prediction at a new site, we should be less confident about it, but as we start making measurements at that site, we can borrow strength and get greater constraint fairly quickly. In summary, the idea here is that we use hierarchical models and random effects to account for the unexplained and unmeasured variables, with the hope that those eventually become explained and measured variables. But, like I said earlier, we don't always have the liberty of measuring every possible thing, so there will always be some amount of unmeasured or unexplained variability.
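The in-sample versus out-of-sample contrast has a simple variance arithmetic behind it. In this hedged sketch (known-variance normal model, toy numbers), a forecast at a studied site carries only the residual variance, while a forecast at a brand-new site adds the site-to-site variance, since we must integrate over which kind of site it might be.

```python
import numpy as np

# Toy, known-variance illustration of predictive spread.
sigma2 = 4.0   # residual variance (assumed known)
tau2 = 9.0     # site-to-site variance (assumed known)

sd_known_site = np.sqrt(sigma2)        # forecast sd at an already-studied site
sd_new_site = np.sqrt(sigma2 + tau2)   # forecast sd at a brand-new site

# Approximate 95% interval half-widths under normality
half_known = 1.96 * sd_known_site   # 3.92
half_new = 1.96 * sd_new_site       # ≈ 7.07
```

In a full Bayesian fit the parameter uncertainty at each site would widen both intervals further, but the ordering is the same: the new-site interval is always wider by the site-to-site term, which is the formal version of "be less confident at a new site."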