Let us start. Today's material is a continuation of where we stopped last time. We had started the analysis of a data set that includes both the Household Hunger Scale (HHS) and the Food Insecurity Experience Scale (FIES). Today I want to continue that analysis and demonstrate one way in which the two tools can be integrated. As usual, the first lines just load the libraries that are needed, and we will use `haven` to load the data from the SPSS format in which it was shared with us. The data for this session is again, thanks to Nicolas, data collected in the Food Security and Nutrition Monitoring Survey in South Sudan, and I am using only round 23. As we commented last time, this data set includes the FIES questions, labeled here from X01 to X08, and the Household Hunger Scale data, labeled from G01 to G06. The HHS has been collected by first asking whether the experience occurred in the past 30 days and then, only if the answer was yes, following up with a question on how often this happened, coded as one, two, or three depending on the frequency. The same applies to the two other questions, on the experience of going hungry and the experience of going a whole day without eating. We can start by selecting the FIES items, which can be achieved by subsetting the columns from X01 to X08. Then, because of the way the data have been coded, every time a response is neither zero nor one, we must set it to missing. I achieve this with a simple line: wherever the data are greater than one, I set them to NA. The raw score is obtained by summing along the rows, and in this data set we see the vast majority of respondents with a raw score of eight, meaning they responded yes to all questions.
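The recoding just described (values greater than one set to missing, then a row sum for the raw score) can be sketched as follows. The lecture's own analysis is in R; this is a minimal Python/pandas equivalent, and the column values are made-up toy data, not the actual survey responses.

```python
import pandas as pd
import numpy as np

# Toy stand-in for the FIES columns X01..X08 (here only two, with invented values)
fies = pd.DataFrame({
    "X01": [1, 0, 1, 8],   # 8 = a "don't know"-style code, to become NA
    "X02": [1, 1, 0, 1],
})

# Anything greater than 1 is treated as missing
fies = fies.where(fies <= 1, np.nan)

# Row sums give the FIES raw score (pandas skips NAs in the sum; in the
# real analysis, cases with missing items may need separate handling)
raw_score = fies.sum(axis=1)
print(raw_score.tolist())   # -> [2.0, 1.0, 1.0, 1.0]
```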
For convenience, I rename the columns with the standard names we assign to FIES data. So now we have a data set coded as we need it, with only ones, zeros, and NAs: one when the answer was yes, zero when it was no. Notice that here we have a reference period of 12 months, and this is something we need to take into account. The traditional way of using FIES data is to run the dichotomous Rasch model, which we do with the command `RM.w` from the `RM.weights` package. You will see that I am running a very simplified analysis: I am not worrying about sampling weights or other refinements, because for this demonstration I just want to guide you through the principle of the analysis. The results we obtain are therefore not results that could be considered a final assessment of food insecurity based on these data. The result, as we already noted in one of the previous lectures, is that one item has a slightly high infit, 1.39. In principle we should have dropped this item and conducted the analysis with only seven items, but since this is just a demonstration of the method, allow me to continue with the eight-item scale in this particular case. So far, nothing really new. What is new in this data set is that it also includes the HHS data, which can be selected by extracting the columns from G01 to G06. Now, instead of carrying two columns for each experience, I consolidate them. Essentially, I want to merge, for example, G01 and G02, so that I have only one variable coded 0, 1, 2, and 3, where 0 corresponds to no, 1 to rarely, 2 to sometimes, and 3 to often. So I just need to merge these two cells while maintaining the frequency of the response. There may be many ways of doing this; the way I do it is to replace the values directly.
Every time the value in G01 is 0, I put a 0 in the corresponding record of G02. So line 22 here initializes to 0 the records in G02 whenever G01 is 0, and I do the same for G03 and G04 and for G05 and G06. Then I keep only the three newly compiled variables. I define a new data set for the Household Hunger Scale with only those three columns, which I rename run out, hungry, and whole day, and you see these are coded from 0 to 3, with only one column per question. Again, we can compute the raw score. In this case the raw score runs from 0 to 9, because each question can receive a score of 0, 1, 2, or 3; the maximum is reached when all three questions are answered with often, giving 3 times 3, a score of 9. And now you see an already interesting comparison that we commented on last time: whereas from the point of view of the FIES the majority of respondents said yes to all questions, so the highest frequency sits at the greatest possible raw score, with the HHS the situation is different, and the largest number of respondents actually have a raw score of 0, with varying numbers of respondents from raw score 1 to 9. This is in itself evidence that the HHS is capturing a more severe phenomenon than the FIES. Okay, so now comes the innovation. What I am suggesting is that the HHS data can be analyzed using a version of the Rasch model that allows responses to be scored not simply as 0/1 but on a graded scale, in this case 0, 1, 2, and 3. The important assumption we impose is that frequency reflects severity, meaning that a higher frequency is equivalent to higher severity, so that frequency and severity can be collapsed onto the same underlying latent trait.
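The consolidation of each occurrence/frequency pair into a single 0-3 variable can be sketched like this (Python rather than the lecture's R, with invented responses for one HHS item):

```python
import numpy as np

# Hypothetical occurrence (0/1) and frequency (1/2/3) columns for one item
g01 = np.array([0, 1, 1, 1])          # did the experience occur?
g02 = np.array([np.nan, 1, 2, 3])     # frequency, asked only after a "yes"

# One variable coded 0-3: 0 when the experience did not occur,
# otherwise keep the reported frequency
no_food = np.where(g01 == 0, 0, g02)
print(no_food.tolist())               # -> [0.0, 1.0, 2.0, 3.0]
```

Repeating this for the other two pairs gives the three consolidated items, whose row sum is the 0-9 HHS raw score.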
If this assumption is correct, we can use the so-called partial credit model, which is discussed in the book by Bond and Fox, so those of you who have read the book have encountered it. It has been implemented by Sara Viviani in the `RM.weights` package through the command `PC.w`, which stands for partial credit. What I do is simply apply this command to the new matrix I created, which contains three columns, each coded from 0 to 3. We can look at the results, and as with the typical Rasch model, they can be analyzed in terms of item severity parameters, infit statistics, outfit statistics, et cetera. The only difference is that the item parameters now have a more complicated structure, which for each question contains three columns. This is because with a graded response there is a probability of responding yes at the first frequency, or the second, or the third, so we can estimate several different thresholds. What is listed in column X2 is the severity level at which a respondent would switch from saying yes rarely to saying yes sometimes, and the last column is the switch from yes sometimes to yes often. What we focus on here is the first column, which is the so-called Rasch-Thurstone threshold, the one comparable to the threshold of a typical dichotomous Rasch model. This is why I select the first column, X1, of the `PC.w` result as the severity parameter associated with each item. This is just to accommodate the way in which Sara decided to program the partial credit command. Here you see we have three numbers showing that the first item, run out, is actually a little more severe than the second one, and these two are somewhat less severe than the third one.
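To make the threshold structure concrete: in the standard partial credit model, the probability of each response category for an item is driven by the item's thresholds, and two adjacent categories are exactly equally likely when the respondent's severity equals the corresponding threshold. Here is a small Python sketch of the PCM category probabilities; it illustrates the model itself, not the internals of `PC.w`, and the `theta` and `taus` values are made up.

```python
import numpy as np

def pcm_probs(theta, taus):
    """Category probabilities of the partial credit model for one item.

    theta: respondent severity; taus: thresholds tau_1..tau_K.
    P(X = k) is proportional to exp(sum_{j <= k} (theta - tau_j)).
    """
    taus = np.asarray(taus, dtype=float)
    # cumulative sums, with the empty sum (category 0) prepended
    logits = np.concatenate([[0.0], np.cumsum(theta - taus)])
    expl = np.exp(logits - logits.max())   # stabilized
    return expl / expl.sum()

# Probabilities of answering 0, 1, 2, 3 for an item with thresholds
# at -1, 0, 1, for a respondent of average severity (theta = 0)
p = pcm_probs(theta=0.0, taus=[-1.0, 0.0, 1.0])
print(p.round(3))   # -> [0.134 0.366 0.366 0.134]
```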
But of course the numbers in this particular case are not standardized to have mean zero, and this is something we need to take into consideration: if we now want to compare the result of estimating the model on the HHS data with the result of the Rasch model on the FIES, we cannot compare these severity parameters directly. We need to conduct a calibration of the measures so that they are defined on the same unit of measure, if you wish. One way of conducting this calibration is to assume that the three items of the Household Hunger Scale, which I labeled run out, hungry, and whole day, are equivalent to the last three items of the FIES, which are also labeled run out, hungry, and whole day. So what I do now is compute standardized values of the item parameters `b`, but also of the respondent parameters `a` and of the standard errors of the respondent parameters. The calibration is obtained simply by centering them so that they have a mean of zero; I am not changing the standard deviation. If I now show the mean of the standardized HHS `b` parameters, it is zero, because I have shifted all the measures by the average of the three numbers. And just for future reference, on this scale, which is an arbitrary scale, the standard deviation of the three severity parameters is 0.63. So what I am doing is defining a reference scale that will be used for comparison, calibrated around the mean of the HHS partial credit model severity parameters and preserving their standard deviation. If I want to compare, for example, the results of the FIES analysis with the results of the HHS analysis, I need to standardize the severity parameters of the FIES in the same way, so that the mean and the standard deviation of the common items are the same. To achieve this, I center the original scale around the mean of the last three parameters.
This is where I make the assumption that the last three items, six, seven, and eight, of the FIES are equivalent to the three items of the Household Hunger Scale. In this case I also standardize by rescaling, so that they now have the same standard deviation as the three HHS parameters, and I have to apply the same transformation to the respondent parameters and to the standard errors. Just to demonstrate what I did, I can show the equivalence of these parameters: I plot here the last three parameters of the FIES against the three parameters of the Household Hunger Scale, and then I add the standardized values of the FIES parameters. This is the equivalence between the two scales, in which of course the common items are the last three. As you can see, it is far from perfect, because run out and hungry appear in reversed order in the two scales, but the distance between them is not very large compared with the relative distance from the other items. What is particularly interesting is that whole day turns out to be quite a bit more severe than run out and hungry, a feature that is common to the FIES too. But more than this equivalence, what is important now is to compare the severity associated with the different raw scores on the two scales. My next chart plots the raw scores corresponding to the FIES scale and the raw scores corresponding to the HHS scale, and this picture should demonstrate quite clearly how the two scales, so to say, explore different ranges of severity. The FIES is like a ruler that measures relatively less severe conditions, whereas the Household Hunger Scale extends the measurement of severity on the severe end. Yes, Firas, you are raising your hand. Yeah, thank you, Carlo. Allow me to go back to the beginning of the process. Can you elaborate more on why we choose the first column of the partial credit results and not the third?
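The centering and rescaling described in the last two paragraphs can be sketched as follows. The three HHS thresholds and the last three FIES parameters are the numbers quoted later in this session; the first five FIES values are made up for illustration.

```python
import numpy as np

# HHS Rasch-Thurstone thresholds and FIES item severities (logits); the
# last three FIES values are those quoted in the session, the first five
# are invented placeholders
b_hhs = np.array([-3.90, -3.60, -2.75])
b_fies = np.array([-2.0, -1.4, -0.8, -0.2, 0.1, 0.28, 0.46, 1.47])

# Reference scale: HHS parameters centred on their own mean
b_hhs_std = b_hhs - b_hhs.mean()

# Assume the last three FIES items are the common items: centre the FIES
# scale on their mean and rescale to the standard deviation of the HHS items
common = b_fies[-3:]
b_fies_std = (b_fies - common.mean()) / common.std() * b_hhs_std.std()

# The common items now share the same mean (zero) and standard deviation
```

The same shift and rescale would be applied to the respondent parameters and their standard errors.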
And to me, the values of the third column might be more consistent with the last three items of the FIES, because those are positive. Can you explain again? Thank you. Yes, I understand the question, and it is a legitimate one. The position of the columns X1, X2, X3 is not related to any consideration of whether they are comparable to the dichotomous version of the Rasch model or not; it is only a consequence of a decision made in coding this result into the `RM.weights` package. What I wanted to insist on is that X1 is a different type of threshold from X2 and X3, and this is a feature of the partial credit model. X1 is the threshold that represents the severity level of a respondent who has a 50% probability of switching from saying no to saying yes, and for that reason it is conceptually comparable with the threshold of the dichotomous Rasch model. If I had written the code, I would probably have given different names, or put this set of values in a different structure, reserving the name `b` for only the first column, but this was a decision made by Sara. As long as we know which parameters we need to refer to, it makes no difference. What we need to do whenever we apply the `PC.w` command is to refer to the first column, X1, which contains the thresholds comparable to the dichotomous Rasch model. Then, to conduct the equivalence, of course I cannot compare numbers like minus 3.9, minus 3.6, and minus 2.75 with the original parameters of the Rasch model, which would be 0.28, 0.46, and 1.47. We cannot compare them because they are defined on different scales: in the dichotomous model, the average of the item parameters is zero by construction, whereas in the partial credit model it is the average of all nine potential thresholds that equals zero. So they are defined on two different scales, and in order to compare them I need to bring them onto the same scale, as we do whenever we conduct a calibration.
The question here is: how do I identify the equivalent, the common items? We are not comparing two applications of the FIES; we are comparing the FIES with the HHS. One assumption is that there is full equivalence between the three items of the HHS and the last three items of the FIES. So I impose that these are the common items, and calibration therefore amounts to ensuring that the common items have the same mean and the same standard deviation, which is what I achieved with this rescaling. And the result is that a raw score of zero on the Household Hunger Scale is equivalent to roughly a raw score of two or three on the FIES, and even a raw score of eight on the FIES is equivalent to only a raw score of three or four on the HHS. But this is exactly why the Household Hunger Scale was developed: to discriminate better among more severe conditions. Now let me show you one problem we have with the data as collected in South Sudan: there was no attempt to impose coherence between the Household Hunger Scale and the FIES. They were administered as two separate modules, and this means there may be some inconsistency. In fact, if we look at the distribution of responses to, for example, the run out item in the FIES against the corresponding item in the HHS, we see a small number of cases of people who said no to the FIES question but yes to the HHS question. Let's go back to the actual questions. The FIES question was: during the last 12 months, was there a time when your household ran out of food because of lack of money? And yet, among the people who said no to this question, when asked "in the past four weeks, was there ever no food to eat of any kind in your house because of lack of resources to get food?", 259 said yes rarely, 46 said yes sometimes, and two said yes often.
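This kind of consistency check can be sketched with a simple cross-tabulation of the two items; the values below are invented, not the South Sudan counts.

```python
import pandas as pd

# Hypothetical yes/no FIES "run out" item and 0-3 HHS "no food" item
fies_runout = pd.Series([0, 0, 1, 1, 0, 1])
hhs_nofood = pd.Series([0, 1, 0, 2, 3, 1])

# Rows: FIES answer; columns: HHS frequency. Cells with FIES = 0 but
# HHS > 0 are the inconsistent responses discussed above.
tab = pd.crosstab(fies_runout, hhs_nofood)
inconsistent = int(((fies_runout == 0) & (hhs_nofood > 0)).sum())
print(inconsistent)   # -> 2
```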
These are not very large numbers, but it is still an incoherence between the two data sets. So somebody has to make a decision: should I trust the responses to the FIES question, or the responses to the HHS question? Or maybe the confusion arises because the questions were asked in slightly different ways. In the FIES we asked, was there a time when your household ran out of food; in the HHS we asked, was there ever no food to eat of any kind in your house. It is not exactly the same terminology, and I do not know how these have been translated into the local languages in South Sudan. Maybe the inconsistent cases could be traced to particular enumerators or particular regions. But the point is that the inconsistency is there, and if we want to conduct a proper integrated analysis of the two scales, rather than analyzing them separately, we need to resolve it. There are two ways of resolving this incoherence. One is to borrow the yes answers from the Household Hunger Scale and neglect the last three answers of the FIES, compiling a new scale that puts together the first five columns of the original FIES and the three columns of the HHS. This will be a data set that contains five dichotomous items and three polytomous items, so if we want to use it, we again need the partial credit model. Sorry, I need to run this first and then this. So now you will see that the partial credit model can also be run when not all the questions are dichotomous, and maybe in this way the difference between column X1 and columns X2 and X3 in the results will also become clearer. If we look at the result of this partial credit version, you will see that the scale has eight items, but only the last three items have more than one threshold.
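Compiling this combined scale is a simple column-binding step: the first five dichotomous FIES columns plus the three polytomous HHS columns. A sketch with made-up responses:

```python
import numpy as np

# Hypothetical matrices: FIES (0/1, 8 items) and consolidated HHS (0-3, 3 items)
fies = np.array([[1, 1, 1, 0, 0, 1, 0, 0],
                 [1, 1, 1, 1, 1, 1, 1, 0]])
hhs = np.array([[1, 0, 0],
                [3, 2, 1]])

# Keep the first five dichotomous FIES items and replace the last three
# with the polytomous HHS items; raw scores now run from 0 to 5 + 9 = 14
combined = np.hstack([fies[:, :5], hhs])
print(combined.sum(axis=1).tolist())   # -> [4, 11]
```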
The dichotomous items, by contrast, have only one threshold, because in all cases we can determine the severity level at which the probability of saying yes and the probability of saying no are both 50%. With a polytomous answer we also have a threshold for switching from yes rarely to yes sometimes, and from yes sometimes to yes often, because these are additional severity levels. Now, again, I want to compare this new integrated FIES-HHS scale with the previous two results, so I need to do the same standardization, making sure that the mean and the standard deviation of the three common items, the last three, are the same. I do the same standardization as before, and I can add to the same chart the raw scores that now correspond to this integrated FIES-HHS scale. You see that the raw scores now go from zero (one value is not showing here; it is probably even further to the left) up to 14, and they span a larger range of severity. One other alternative is, instead of keeping the three items polytomous, to dichotomize them, so that whenever the answer was yes rarely, sometimes, or often, we simply set it to one. In this case we can run the simple Rasch model, because this is now a dichotomous scale, no longer a polytomous one; of course, the raw scores will then only go from 0 to 8. But you see that having integrated the answers from the HHS, in which more people said yes to the three severe questions, the entire scale is less severe than the original FIES. Remember that the severity of an item depends on the frequency of responses, and we are increasing the frequency of yes answers because we are borrowing these yeses from the HHS. We are reducing the number of no answers, and therefore the new scale measures a phenomenon that appears to be less severe.
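The dichotomized alternative just described collapses any positive frequency back to a plain yes; for example (toy values):

```python
import numpy as np

# Hypothetical polytomous HHS items coded 0-3
hhs = np.array([[0, 1, 0],
                [3, 2, 1]])

# Any positive frequency (rarely / sometimes / often) counts as "yes"
hhs_dichotomous = (hhs >= 1).astype(int)
print(hhs_dichotomous.tolist())   # -> [[0, 1, 0], [1, 1, 1]]
```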
Finally, another way of resolving the incoherence between the FIES and the HHS questions is to make a different assumption: that those answers should have been no, simply because the people had already said no to the FIES question. If I did not experience a condition at all in the last 12 months, I cannot have experienced it in the last 30 days, because the last 30 days are part of the last 12 months; a yes there is logically inconsistent. So the alternative is to trust the answers to the FIES question more. I define an extended FIES that is equal to the original FIES, but whenever the answer to run out, hungry, or whole day was yes, I borrow the frequency of the response from the HHS. Now I have another scale with eight items, five of which are dichotomous and the last three polytomous, and I can run the partial credit model. I call this the extended FIES, because it is fully consistent with the original FIES but with the addition of the frequency responses from the HHS. In general this is my preference; this would be my preferred way of integrating the two scales when the data have been collected separately. Again we need to standardize the results if we want to compare them, and we can verify that all these different scales are such that the mean and the standard deviation of the last three items are indeed the same. Now we are ready to calculate the prevalence of food insecurity under each of the alternatives. There are actually five options that I have created (sorry, I coded only four of them), and because I have standardized the severity parameters of the respondents and the standard errors, I can define the same threshold for all of them and obtain, in principle, comparable prevalences.
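One plausible implementation of the extended FIES, for one of the last three items, is sketched below. Note that the lecture does not say what to do when the FIES answer is yes but the HHS frequency is 0; setting such cases to at least 1 ("rarely") is my assumption here, not something stated in the session.

```python
import numpy as np

# Hypothetical FIES answers (0/1) and HHS frequencies (0-3) for one item
fies_item = np.array([0, 1, 1, 1])
hhs_item = np.array([2, 0, 1, 3])

# Trust the FIES yes/no; where the FIES answer was yes, borrow the HHS
# frequency. A FIES "yes" with HHS frequency 0 is set to 1 ("rarely") --
# this tie-breaking rule is an assumption, not spelled out in the lecture.
extended = np.where(fies_item == 1, np.maximum(hhs_item, 1), 0)
print(extended.tolist())   # -> [0, 1, 1, 3]
```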
And the way we do it, as usual, is by assuming that around each raw score there is a normal distribution centered on the standardized severity parameter, with a standard deviation equal to the standardized standard error. This gives each respondent a probability of being beyond the threshold, and the prevalence is simply the weighted sum of these probabilities, using the relative distribution of cases by raw score. Here I call prevalence one (I need to run all the lines, otherwise it will not work) the prevalence generated by the HHS data alone, and prevalence two the prevalence generated by the FIES data alone. You see that these are quite similar, 53% and 57%. Then, if I use one of the two versions of the combined, integrated scale, I get a lower prevalence of 42%; this is the one where I kept the FIES answers and took only the frequency question from the HHS. And finally prevalence four, in which I actually trust the HHS results more, gives a higher prevalence of 53%, very similar to the prevalence obtained from the HHS alone. Now, and this completes the presentation for today: how do we set the threshold? It is an arbitrary decision, a convention that we should choose depending on the purpose of the analysis. If this were an analysis intended to compute the SDG indicator for South Sudan, we would probably need to set the threshold at a much lower level, because the moderate-or-severe food insecurity level used for the SDG is quite mild compared, for example, with the severity levels typically used in the IPC acute food insecurity classification, or with the severity level that informed the use of the Household Hunger Scale in its most traditional application. But here, with the combined HHS-FIES scale, we can actually set thresholds over a very large range of values.
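The prevalence machinery described at the start of this passage, a normal distribution around each raw score's person parameter and a weighted probability of exceeding the threshold, can be sketched as follows. All numbers are illustrative; they are not the 42% to 57% figures from the actual data.

```python
import numpy as np
from math import erf, sqrt

def norm_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

# Hypothetical standardized person parameters (a), standard errors (se),
# and the relative frequency of each raw score in the sample
a = np.array([-1.5, -0.5, 0.4, 1.2])
se = np.array([0.9, 0.7, 0.7, 0.9])
weights = np.array([0.40, 0.25, 0.20, 0.15])   # shares, summing to 1

threshold = 0.0   # arbitrary cut-off on the calibrated scale

# P(latent severity > threshold) for each raw score, assuming a normal
# distribution centred on a with standard deviation se
p_beyond = np.array([norm_sf((threshold - ai) / si) for ai, si in zip(a, se)])

# Prevalence = probability of being beyond the threshold, weighted by
# the raw-score distribution
prevalence = float(np.sum(weights * p_beyond))
print(round(prevalence, 3))
```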
And so, for example, using this integrated FIES-HHS scale in the context of IPC food insecurity classifications, we might even explore different thresholds that could correspond to IPC Phase 3, IPC Phase 4, and IPC Phase 5. Please send me comments, questions, and observations if you have any. Thank you, everyone, and see you next week.