Let's now look at some diagnostics for Poisson regression models. I will briefly introduce residual analysis. We have already discovered, based on the log-likelihood ratio test statistic and the Pearson chi-square test statistic, that the model may not hold, but let's look at what that does to the residuals and introduce the types of residuals available for Poisson regression models. Later on we can extend our model to a more sophisticated one, in particular by including a categorical variable or another explanatory variable.

So let's continue with our example from the previous session about the recall of stressful events. Residuals represent variation in the data that is not explained by the model. Residual plots can help us to understand our model better and to diagnose particular problems: they can be used to discover patterns, outliers and misspecification of the model. Ideally we would like to see a random pattern in our residual plots. If there are more systematic patterns, we can identify particular problems, and that can help us to reformulate the model. If the residuals exhibit no pattern, that is a good indication, because it implies that the model is probably appropriate for the data at hand.

I would like to introduce three different types of residuals for Poisson regression models. The raw residuals are just the difference between the observed and the expected values.
The Pearson residuals, or standardized residuals, are the raw residuals divided by the square root of the expected values. Then there are the adjusted residuals, which are particularly helpful for the diagnostics and the plots we are going to look at. They are defined as the observed minus the expected values, divided by the standard deviation of observed minus expected. These are the residuals we would like to use for our residual plots.

If H0 is true, the adjusted residuals have, at least for large samples, approximately a standard normal distribution, with mean zero and variance one. For the recall of stressful events example, that means that for each category, that is for each month, we compute the adjusted residual and check whether it is larger than 1.96 or smaller than minus 1.96, comparing it with the normal distribution that would hold under H0. Adjusted residuals larger than 1.96 or smaller than minus 1.96 give us an indication of divergence from H0. If the adjusted residuals do follow the normal distribution, only about 5% of them should fall outside these limits, so here we would expect roughly one such large or small adjusted residual.

Looking at the actual data, we saw that months 1, 3 and 4 had positive adjusted residuals larger than 1.96, and months 16 and 17 had negative adjusted residuals smaller than minus 1.96. So we see that respondents are more likely to report more recent events.
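To make the three residual definitions concrete, here is a minimal Python sketch for the equiprobable model, in which every month has the same expected count. The function names and the counts are hypothetical and made up for illustration; they are not the data from the lecture. The sketch assumes the common expected count is estimated from the data, in which case the standard deviation of observed minus expected is the square root of expected times (1 - 1/c), with c the number of months.

```python
import math
from statistics import NormalDist


def equiprobable_residuals(counts):
    """Raw, Pearson and adjusted residuals under an equiprobable Poisson model."""
    c = len(counts)
    expected = sum(counts) / c  # same fitted mean for every month
    # Assumption: the mean is estimated from the data, so
    # Var(observed - expected) = expected * (1 - 1/c).
    sd_diff = math.sqrt(expected * (1 - 1 / c))
    rows = []
    for month, observed in enumerate(counts, start=1):
        raw = observed - expected
        rows.append({
            "month": month,
            "raw": raw,                              # observed - expected
            "pearson": raw / math.sqrt(expected),    # raw / sqrt(expected)
            "adjusted": raw / sd_diff,               # raw / sd(observed - expected)
        })
    return rows


def flag_large(rows, cutoff=1.96):
    """Months whose adjusted residual lies outside +/- cutoff."""
    return [r["month"] for r in rows if abs(r["adjusted"]) > cutoff]


def expected_number_beyond_cutoff(n_categories, cutoff=1.96):
    """How many residuals we expect outside +/- cutoff if they are standard normal."""
    tail_probability = 2 * (1 - NormalDist().cdf(cutoff))
    return n_categories * tail_probability


# Hypothetical monthly counts, NOT the lecture's data.
counts = [12, 9, 10, 8, 7, 6, 2, 2]
rows = equiprobable_residuals(counts)
for r in rows:
    print(r["month"], round(r["raw"], 2), round(r["pearson"], 2), round(r["adjusted"], 2))
print("flagged months:", flag_large(rows))
print("expected number flagged under H0:", round(expected_number_beyond_cutoff(len(counts)), 2))
```

Comparing the number of flagged months with the number expected under H0 is the informal check described above: many more flagged months than expected suggests the model does not hold.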
Positive residuals mean that the observed count is larger than the expected count, so respondents are more likely to report a stressful event in the months immediately prior to interview. So there is probably a time trend in our data set that is not captured by the equiprobable model. We want to improve our model, and we will see in the next session how to do that.

Looking at the plot of adjusted residuals per month, with the adjusted residuals on the y-axis and the months on the x-axis, we also see a downward trend. So again it is not a random pattern, and what we see is probably the time trend.

Another way of looking at residuals is the normal QQ plot. These are probability plots that plot the quantiles of one distribution against the quantiles of another distribution; the Q stands for quantile, so it is a quantile-against-quantile plot. Here we would like to compare the distribution of the observed adjusted residuals with the distribution expected under H0, that is the standard normal. So we plot the observed quantiles of the adjusted residuals against the expected quantiles of the standard normal. The points should lie on the straight line y = x, at least if the adjusted residuals follow the normal distribution, which is true under H0. So again we can assess divergence from H0.

Looking at the QQ plot for this data set, we can see that in the tails, at the upper end and at the lower end, there is divergence from the straight line, so again we would conclude that there is some kind of time trend in our data set that the model misses.
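The coordinates of such a normal QQ plot can be sketched in a few lines of Python. This is an illustration rather than the procedure used to produce the lecture's plot: the function name is hypothetical, the residual values are made up, and the plotting positions (i - 0.5)/n are one common convention among several.

```python
from statistics import NormalDist


def normal_qq_points(residuals):
    """Pairs (theoretical normal quantile, observed residual quantile)."""
    n = len(residuals)
    ordered = sorted(residuals)  # observed quantiles
    std_normal = NormalDist()    # mean 0, standard deviation 1
    # Assumption: plotting positions (i - 0.5) / n; other conventions exist.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical adjusted residuals, NOT the lecture's values.
adjusted = [2.02, 0.81, 1.21, 0.40, 0.0, -0.40, -2.02, -2.02]
for expected_q, observed_q in normal_qq_points(adjusted):
    # Points far from the line y = x indicate departure from normality.
    print(round(expected_q, 2), round(observed_q, 2), round(observed_q - expected_q, 2))
```

The pairs can be passed to any plotting library as x and y coordinates, with the line y = x drawn for reference.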
So the conclusions are clear: there is divergence from the straight line in the tails, and overall there is strong evidence that this model does not fit the data well, which maybe isn't too surprising. Respondents are more likely to report recent events, and such a tendency would result if they were more likely to remember recent events than distant ones. So again there is strong evidence that we should be using some other model, and we are now going to explore the Poisson time trend model, a Poisson model with a covariate.