Thank you very much indeed for the invitation to present this research at this conference. I'm very pleased to be able to spend some time discussing work I've been doing recently in the World Bank's research department with my colleague, Hai-Anh Dang. As was already announced, the title of this paper is Welfare Dynamics Measurement: Two Definitions of a Vulnerability Line and Their Application. Let me start with a few words about where this research originated. It started in the summer of 2012, when there were intensive discussions at the World Bank around a new goal the Bank was contemplating pushing forward: the suggestion that the development community should focus not just on ending global poverty, which is one of the new goals the World Bank has articulated, but in addition on boosting shared prosperity. This goal was being circulated at the Bank at the time, and in the research department we were discussing intensively amongst ourselves how we might think about and define this idea of boosting shared prosperity. As it happens, the discussion eventually converged on the decision to define shared prosperity, and the idea of boosting it, as a simple exercise of boosting income growth of the bottom 40% of the population. That decision was driven, I think, by a number of considerations, including ease of communication and the degree to which it related to earlier measures of growth, such as growth of GDP per capita. The idea was to find a new goal that would resonate with some of the traditional, established procedures of measuring growth, but also shift attention a little bit more to the lower end of the income distribution, hence the bottom 40% of the population. 
But what I wanted to do today was talk a little bit about some of the other ideas that were circulating at the time, before we converged on this decision to define shared prosperity in terms of the growth of the bottom 40%. One of the ideas we were thinking a lot about was linking shared prosperity to notions of vulnerability: the idea being that a society comprises those who are poor, those who are vulnerable, and those who can be considered secure, and that boosting shared prosperity could also be thought of as increasing the share of the population that is neither poor nor vulnerable. That presupposes some idea of how we might want to measure vulnerability, and that's really what I wanted to talk about today. This idea of thinking of the secure, the non-vulnerable non-poor, as the prosperous might also resonate with some of the other concepts floating around in popular discussions about the middle class. For example, there was a recent series of articles in the Financial Times about the fragile middle class, reflecting this notion that at least some of the non-poor population is vulnerable. Again, those are the kinds of ideas that are going to resonate in the discussion I wanted to have today about vulnerability measurement. So that's really the motivation. Of course, vulnerability is also a natural extension of thinking about poverty more generally. When we do poverty assessments, we often think it's important to identify those who are in need of assistance, and we start in the first instance by thinking about the poor as being those who need assistance. 
But in fact, it makes a lot of sense to also think about those who may not be currently poor but who might be vulnerable to falling into poverty, for example when we're thinking about the design of safety nets and social protection measures. So knowing who the vulnerable are is also interesting from the more generic perspective of thinking about anti-poverty policies and interventions on the social protection front. If we want to make a distinction between the population that's considered poor, the population that's considered vulnerable, and the population that's considered secure, or the prosperous, or whatever we want to call them, then clearly we need some idea of how we want to define a vulnerability line. Now, in the World Bank, the experience has been that if you look at Bank reports on poverty, there's very often a discussion about vulnerability as well. People often say: this is the population that's poor, and this is the population that's vulnerable. But when you look carefully at where the definition of vulnerability in those reports implicitly comes from, it's very often a quite ad hoc, rather arbitrary designation of a certain segment of the population as vulnerable: simply by taking the established poverty line, multiplying it by a factor of two, or scaling it up by 50%, or something like that, and then saying that the population between that higher line and the official, notional poverty line can be called the vulnerable population. But that's done in a fairly arbitrary way. 
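The ad hoc convention just described amounts to nothing more than a one-line calculation. A minimal sketch, where the 50% markup and the dollar value of the line are purely illustrative choices of mine, not taken from any particular country's practice:

```python
poverty_line = 13_305          # an illustrative poverty line, dollars/person/year
markup = 0.50                  # arbitrary scaling factor, e.g. "add 50%"
vulnerability_line = (1 + markup) * poverty_line

# Households with income between poverty_line and vulnerability_line are
# then labelled "vulnerable" -- with no reference to any measured risk
# of actually falling into poverty, which is the gap the paper addresses.
print(vulnerability_line)
```

The point of the paper is precisely that nothing in this calculation ties the `markup` to observed poverty dynamics.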
And so you see, for example, in India there was a discussion around vulnerability in the population, and the proposal in that discussion was to simply scale the poverty line up by somewhere between 25% and 100%, and to designate the population between the actual poverty line and this vulnerability line as vulnerable; a kind of arbitrary scaling of the poverty line. In Vietnam recently, a decision was taken to scale the poverty line up by 30%, and to call the population sandwiched between this higher line and the official national poverty line the vulnerable. But that's clearly a fairly arbitrary decision. What we're trying to do in this paper is give a little bit more substance to this exercise of coming up with a higher line that we could call a vulnerability line. What we propose is to set a vulnerability line that looks explicitly at the risk of falling into poverty: at the likelihood that population groups face of falling back into poverty, designating as vulnerable those who in some sense face a higher risk of falling back into poverty, even if they're currently not counted amongst the poor. The idea is not entirely new by any means; there have been discussions around vulnerability and its measurement that go back quite some years. Indeed, there's a paper by Lant Pritchett from 2000 that was thinking very much along these lines, and that specified the vulnerability line as that level of income below which a household experiences a greater-than-even chance of poverty in the near future. But in that particular framework proposed by Pritchett, both the poor and the non-poor could be considered vulnerable, i.e. 
as long as you face a risk of being poor in the next period above a certain cutoff, you would be considered vulnerable whether or not you were poor today. What we wanted to do in our paper was to distinguish between the poor and the vulnerable, i.e. to define the vulnerable as those who are currently not poor but who face a heightened risk of being poor. Those who are currently poor are by definition already, in some sense, vulnerable. Our idea is to link vulnerability to the notion of susceptibility to something harmful that has not yet occurred: we're trying to identify those who are currently not poor but who face a heightened risk of being poor in the next period. What we propose is a vulnerability line that builds very explicitly on the poverty line and that separates the population into three groups: those who are poor, those who are vulnerable, and those who are secure, or prosperous, or the middle class if you want; we could think of any number of labels for that group at the top. Those above the vulnerability line might therefore be called the middle class, or the resilient, and so on. There has been attention in the literature to these ideas, but there really isn't a consensus. What we're proposing here is one contribution to that debate, which may hopefully eventually result in some kind of consensus. We have two approaches, two ways of thinking about how we might come up with a vulnerability line that lies above the poverty line. The first approach is to designate some acceptable risk of falling into poverty; anything higher than that we would consider unacceptable. In this approach we identify the population whose risk of falling into poverty in the next period is at that threshold level or lower. Anything higher would be considered unacceptable. 
We would define the vulnerability line as the lower-bound income level of this population. We look at this population, we see that it has a certain risk of falling into poverty, we set a maximum acceptable level of risk, and then we find the vulnerability line as the lowest income level that separates this secure population from the vulnerable population. That's where the vulnerability line would be located, and the population lying between that vulnerability line and the poverty line would then be designated as the vulnerable. We can think about this in the context of a little stylized picture. In the upper panel we have three population groups, and what we're trying to do is come up with this vulnerability line: how should we define where it lies? In this first approach, we look at the top population group, whose risk of falling into poverty is quite low. Clearly, if we let this vulnerability line slide closer and closer down to the poverty line, the risk amongst this whole top group will start increasing. We have a certain threshold in mind above which we would feel the risk of falling into poverty is unacceptably high, and we set the vulnerability line at the lowest income level that separates the group with acceptable risk of falling into poverty from the rest of the population. That's how we define where the vulnerability line occurs; that's the first approach. The second approach is the analog, or the dual, of the first: we again specify an acceptable risk of falling into poverty, a risk above which we would consider unacceptable. 
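As a concrete illustration, the two line-setting rules (the first just described, and its dual, elaborated below) might be sketched roughly as follows. This is a sketch under stated assumptions, not the paper's implementation: I assume a two-period panel of incomes `y0` and `y1`, a poverty line `z`, and a chosen risk threshold `p`; all function and variable names are mine.

```python
import numpy as np

def vulnerability_line_approach1(y0, y1, z, p):
    """Approach 1: the lowest candidate line v >= z such that, among
    households with period-0 income at or above v, the share that is
    poor in period 1 (y1 < z) is at most the threshold p."""
    for v in np.sort(np.unique(y0[y0 >= z])):
        above = y0 >= v
        risk = np.mean(y1[above] < z)  # group's risk of falling into poverty
        if risk <= p:
            return v
    return None  # no candidate line satisfies the threshold

def vulnerability_line_approach2(y0, y1, z, p, min_group=500):
    """Approach 2 (the dual): slide a trial line v up from z and track
    the risk of period-1 poverty among households sandwiched between z
    and v in period 0; stop once that group's risk has fallen to p.
    min_group mirrors the requirement of enough households in the
    interval (500 in the talk's example) for a meaningful risk estimate."""
    for v in np.sort(np.unique(y0[y0 > z])):
        between = (y0 >= z) & (y0 < v)
        if between.sum() < min_group:
            continue  # interval still too thin to estimate a risk
        if np.mean(y1[between] < z) <= p:
            return v
    return None
```

With either function, households between the poverty line `z` and the returned line would be labelled vulnerable, and those above it secure.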
Then what we would do is start at the poverty line and incrementally raise it, looking at the group that's sandwiched between the slightly raised line and the actual poverty line, and at the risk of falling into poverty amongst that group, and keep raising that tweaked poverty line until we reach the acceptable threshold of vulnerability. That would then allow us to define the vulnerability line in terms of this second group. Looking back at the picture, what we have in mind is the middle group: we start with a vulnerability line placed very close to the actual, original poverty line, we look at the population group between the poverty line and this vulnerability line, and we measure its risk of falling into poverty in the next period. Clearly, since the line is so close to the poverty line, the risk of falling into poverty will be quite high amongst this group. But as we let that vulnerability line slide further up, the population group lying between the poverty line and the vulnerability line changes in composition, with a declining risk of falling into poverty in the next period. At some point we reach a vulnerability line such that the risk of that group falling into poverty in the next period is at the threshold level we've designated as acceptable. So that would be the second approach to defining the vulnerability line. The two approaches we have in mind are both simply ways to calculate where that vulnerability line should be. As I said earlier, to date, many of the practical studies that talk about vulnerability set that vulnerability line 
quite arbitrarily: they just scale the actual poverty line by some amount and define vulnerability as occurring there. What we're proposing is that we should define the vulnerability line by appealing, in some sense, to the data: looking at the population, empirically assessing the risk of falling into poverty in the next period, and defining the vulnerability line as a function of what we observe in terms of that risk. And as I described, we have two alternative approaches for doing so. One feature of this approach is that, by defining a threshold level of risk of falling into poverty, we can compare two very different situations: say, two situations over time, or two different countries. They might have very different poverty rates and poverty levels, but by defining vulnerability in terms of the same acceptable threshold of risk, we can still make comparisons across these very different settings and compare these societies in terms of their degree of vulnerability. In the practical examples I'll be turning to, we'll be doing exactly that. The other point to emphasize is that the actual empirical implementation of this method is very straightforward: it's a simple task of looking at the data to work out what the vulnerability line should be, using either one of the two approaches, whichever appeals, and I'll be describing both of them empirically in a few minutes. The other feature I want to emphasize, and one of the reasons we feel this method is promising, is that because it is very light in its data demands, we can also implement it in settings where we don't have panel data 
because, as is implicit in the description so far, the approach relies on the availability of panel data: the need to look at the risk of falling into poverty across two periods implies that we have data on households and their particular situations in multiple periods, and that presupposes panel data. Now, panel data, certainly in the developing-country context, are not widely available, and certainly not at a nationally representative level, so this is not something one could readily implement in many empirical settings. But because the calculation is very straightforward, we argue that it can also be implemented using some of the methods introduced in the literature recently for producing synthetic panels, i.e. in a setting where we have two cross-sections of data, where we do not follow the same households over time, we can construct synthetic panels and implement this particular procedure with those synthetic panels. I'll describe in a few minutes how that synthetic panel methodology works and how one could implement it. We have an example here implementing the same procedure both for countries with panel data and for a country with synthetic panel data, and we can show that it works reasonably well in both cases. So if those arguments are considered acceptable, then this opens up the opportunity to look at issues of vulnerability, income mobility, and transitions out of and into poverty even in settings where we don't actually have panel data. The method can therefore be contrasted with other approaches to measuring vulnerability, for example the work by Shubham Chaudhuri, which makes much more intensive use of the underlying data, or the work by Luis Felipe Lopez-Calva and colleagues, which 
relies very strongly on the existence of panel data. Both of these alternative approaches make much heavier, more restrictive use of data, and are less likely to be implementable in the more realistic data setting comprising a series of cross-sections. One important caveat is that the method is intended to work with aggregate numbers. The proposal is not to identify individual households and to track and examine their vulnerability; much of the precision, or the reliability, of the numbers we're producing relies on the fact that we're aggregating across households. That's inherent in the imputation procedures we're going to be using. As a result, it isn't appropriate to use this method, or the line of argument we're describing here, to try to track and examine the vulnerability of individual households over time; but it should be informative and useful for policymakers who want at least some statements about general tendencies in the population as a whole. As was implicit, I believe, in the description I gave based on the figure I showed earlier, the two approaches can be quite easily calculated. In the first approach, you set some designated threshold level of vulnerability that you consider acceptable, and then you solve for the vulnerability line from an expression that is simply the share of the population that is poor in period one, having not been poor in period zero, relative to the share of the population that was not poor in period zero; you solve for the vulnerability line such that that expression is equal to the designated threshold level you specified ex ante. The second approach is very similar: we again designate a threshold level of vulnerability that we deem acceptable, and then we find the 
vulnerability line that solves the analogous expression, setting it equal to the threshold level of vulnerability: the probability of being poor in period one, having been somewhere between the poverty line and the vulnerability line in period zero, relative to the percentage of the population that is in that interval between the poverty line and the vulnerability line in period zero. That's the calculation with panel data. It's also easily implemented with synthetic panel data; because of the nature of synthetic panels there's a whole imputation exercise that first has to be done, but it is relatively straightforward once the synthetic panel has been constructed, and I'll describe both of those procedures. To look at the implementation of this method first with actual panel data, we look at two countries, the US and Vietnam. For the US we look at PSID data for 2005, 2007 and 2009, a panel data set with 5,335 households. For Vietnam we look at the panel consumption component of the VHLSS living standards survey for 2004, 2006 and 2008. The VHLSS has a rotating panel component in which half the sample comprises households that were surveyed in the preceding survey year; so with three rounds of data, you've got 25 percent of the final-round data set that was actually surveyed in both of the preceding two rounds of data collection. We have 1,800 households that were sampled in all three years, and 3,700 households that were sampled in, say, 2006 and 2008 only. We're then going to look subsequently at a setting where we don't actually have panel data but where we have cross-sectional data: the NSS national sample survey data for India from 2004 and 2009. We will create a synthetic panel out of those two cross-sectional rounds of data and then 
implement the same procedure to measure vulnerability in India over that same interval. OK, so, approach one, just to revisit the idea: you start with the population that's not poor and you ask what the risk of falling into poverty is for that population. In the US, the poverty line was $13,305 per person per year, and at that line there is an 11 percent poverty rate, i.e. 89 percent of the population is not poor. What we see is that this non-poor population faces a risk of about four percent of falling into poverty in the subsequent period; I'm doing this for 2006 and 2008, so about four percent of the population that was non-poor in 2006 had fallen into poverty by 2008. If that were the acceptable threshold of vulnerability, there would be no need to specify a vulnerability line, because that's the percentage of the population that's vulnerable at the poverty line itself. But suppose we think a meaningful threshold of vulnerability is something lower: say, that people should be considered non-vulnerable only if their risk of falling into poverty in the subsequent period is one percent or less. Then we would have to raise the line from $13,000 to $61,000, and the population above that vulnerability line would have a risk of falling into poverty of one percent or lower. The population between that vulnerability line and the poverty line of $13,000, with some risk between four and one percent of falling into poverty in the next period, is then what we would designate 
as the vulnerable population. In Vietnam the calculation is rather similar. At the official poverty line of 2,560,000 dong we have a poverty rate of just under 16 percent, and therefore a non-poor population of 84 percent. The vulnerability of that non-poor population is around six percent: these 84 percent of the population had a risk of around six percent of falling back into poverty between 2006 and 2008. If we thought the threshold should be one percent, and we want to define the vulnerability line such that the population above it has a one percent risk of falling into poverty, we would have to raise the vulnerability line above the poverty line to 5,320,000 dong; we'd have to essentially double the poverty line. So again, we could consider the vulnerable population to be the population lying between that higher vulnerability line and the poverty line. That's how we would obtain the vulnerable share of the population using the first approach. If we wanted to implement the second approach, again using the panel data for the US and Vietnam: here we start at the poverty line, tweak it up by some epsilon, look at the population lying between the poverty line and this slightly tweaked line, and look at that group's risk of falling into poverty. The one thing we have to be careful about is that the interval can't be too small: we have to have a sufficient number of households between the poverty line and the tweaked line to make a meaningful calculation of the risk of falling into poverty. So in our paper, we've somewhat arbitrarily said we're going to look at intervals 
of 500 households or more in the survey data. What we find in the US is that if we take the 500 households lying just above the poverty line and look at their risk of falling into poverty, that risk is about 19%: this little group of households between the poverty line and this vulnerability line has a 19% risk of falling into poverty, and we've had to raise the poverty line by 59%, from $13,000 to $21,000, in order to capture those 500 sample households and make a meaningful calculation. Now, if we consider that too high a risk of falling into poverty and wanted instead to set the threshold at 10%, then we would have to raise the line not to $21,000 but to $36,000: a 177% increase in the poverty line. That would give us a population group between the poverty line and the vulnerability line with a risk of falling into poverty in the next period of 10% or higher. That's how we would implement the second approach; it's the dual, or the analog, of the first. In the first approach we look at the population above the vulnerability line; in the second we focus on the population between the vulnerability line and the poverty line. In Vietnam the same exercise applies: taking the 500 households above the existing poverty line and calculating their risk of poverty, we see that we would have to raise the poverty line by 30% to capture this group, and that would be associated with a 22% risk of falling into poverty in the next period. Again, if we wanted the risk of falling into poverty to be no more than 
10%, then we would have to raise the poverty line in Vietnam by 114%. In that case, around 45% of the population in Vietnam would lie between the poverty line and the vulnerability line; similarly, in the US we'd be looking at about 26% of the population lying in that interval. So that's how the approach would be implemented in the US and Vietnam using the panel data. Now, once we've defined the vulnerability line, and we've done that using the 2006 and 2008 data, we can take the interval 2004 to 2008, because we do actually have data for all three countries that roughly cover that same interval; it's interesting to do a comparison across the US, Vietnam and India over this period from 2004 to roughly 2008 or 2009. What we do is take the vulnerability line we've defined and, using CPI data, treat it like an absolute line: in the same way that we would scale an absolute poverty line by cost-of-living adjustments to make comparisons over time, we scale the vulnerability line so that we can compare 2004 and 2008. We can then look at the population in these countries and at what has been happening to the poor, the vulnerable, and the prosperous. This is what we find, focusing at this point still only on the panel data results. In the US, poverty increased between 2004 and 2008 from just under 9 percent to 10 percent, vulnerability increased from 26 to 28 percent, and the middle class, or the prosperous, or the secure, or whatever we want to call them, declined somewhat from 65 percent to 62 percent. This resonates 
very much with the popular story people have been telling about distributional patterns in the US during the 2000s. In Vietnam, in contrast, we have an interesting pattern of declining poverty, declining vulnerability, and a significantly increasing middle class: a very contrasting story of what's been going on, where it's not just poverty that's falling but even vulnerability during this time period. So those are the findings we get based on these two panel data exercises, and as I hope to have illustrated, the procedure is really quite straightforward and easy to implement where panel data are available. Now, in India we have to deal with the fact that we don't have panel data. In parallel research I've been doing at the Bank in recent years, we've been trying to develop a method to construct synthetic panels out of cross-section data, because clearly, if we're able to do that in a satisfactory way, this opens up the opportunity to do this kind of vulnerability analysis even in settings where we don't have panel data. These procedures are based on imputation models. There's some contrast with the imputation models we were discussing yesterday in the panel, which were very much imputation exercises carried out at the aggregate level, national-level Ginis, imputing from one year into another year, and so on. What we're proposing in the work here is imputation as well, but at the very micro level, working with unit-record household survey data. We've had a fair amount of success, in the sense that we've been validating these procedures, and I'll illustrate some of those validation exercises here, showing that these methods can replicate the kinds of results we get in actual, true panel data. So we've tested the synthetic panel method by looking to see whether we can replicate the kind of 
findings that we would get in those settings where we actually have true panel data, to see whether we can mimic the kinds of results we would get with true panel data where those data exist. Then, if we feel the method is promising enough, we take it to settings where there are no panel data. Clearly we can't validate those results, but we rest on the fact that, in those settings where we can make the comparisons, the method seems to have worked quite well. So I'm going to very quickly (I'm not sure how much time I have left) outline the basic synthetic panel method, in the hope of convincing you that this is a reasonable way to proceed. The idea starts very much with the kinds of ideas I was associated with when I was first working on efforts to produce poverty maps: imputing consumption from a household survey into a population census, and then using the census data to calculate poverty rates at the subnational level. Very similar ideas are at work here. Suppose we have two rounds of household survey data; the approach and the ideas can be extended to settings with multiple rounds, but in this case let's confine ourselves to just two rounds of survey data, which are cross-sectional data, not panel data. But we do have characteristics in those household surveys. I have in mind an LSMS type of household survey, which has lots of information about income and consumption, but also lots of information about household characteristics, occupations, demographic outcomes, education levels, and so on. So we have a number of variables in these two cross-sectional data sets where the x variables, the characteristics of the households, 
are covered in both of the household surveys. Among those characteristics there will be some that don't change over time. An easy one to think of would be the race of the household head: the head of a household in survey round one has a particular race, and even though we don't observe that household in the next round of the data, that household's race would not have changed over time. It is a time-invariant characteristic. A number of variables in our household survey data have this time-invariant feature, including characteristics of the household head that don't change across rounds of data, and we can also include time-varying characteristics if the survey asks retrospective questions about them. So there is a set of variables in the household survey that are time invariant, and we work with those variables to produce these synthetic panels. The idea is that, with these two rounds of household survey data, we work essentially with projections of consumption onto these explanatory variables, the household characteristics. We estimate one model in the round-one data and another model in the round-two data. The characteristics in the two data sets are the same, not in the sense of being the same for each individual household, since the households are different, but the variables are the same, and we confine ourselves to characteristics that don't change over time. In the most stylized version, think of a single explanatory variable such as religion, ethnicity or race. What we are ultimately interested in are calculations such
as this: the probability that you are poor in period two given that you were not poor in period one, that is, the likelihood of having fallen into poverty. That is the kind of calculation we ultimately want to estimate. As for the method, I'll start with a very stylized non-parametric approach and then narrow it down to a more parametric approach subsequently. In the stylized non-parametric approach, we take the sample of households observed in the first round of the data and regress their consumption on these characteristics that don't change over time. We take the parameter estimates from that model, the beta hats, and we also take the residuals, and we carry those to the next round of the data. In the next step, for each household observed in round two, we take a random draw with replacement from the empirical distribution of those residuals, and we combine it with the parameter estimates and the known x variables to estimate round-one income or consumption for each household in round two. So effectively, in the round-two data we have a directly observed measure of round-two consumption for each household, and what we try to do is predict what that household's consumption was in round one. We predict it by appealing to the model, estimated in the round-one data, that relates consumption to these time-invariant characteristics. Since those very characteristics are time invariant, we can apply the parameter estimates from the round-one model to predict each round-two household's round-one consumption. So for each household in the second round of the data we essentially produce a
predicted level of period-one consumption. That is the basic idea. We then calculate movements into and out of poverty by working with observed consumption in round two and predicted consumption in round one for each household in the round-two data. We do this many times, because each time we draw afresh from the residuals calculated in the round-one data and add those to the x-prime-beta component of the prediction; we then take the mean over the replications as our estimate of the poverty transition, for example the percentage of the population that stayed poor in both periods. Now, this procedure, as I have very quickly outlined it, is consistent only under very specific circumstances. The first condition we need is that the underlying population sampled in the two rounds of data is the same. Essentially, we are assuming that the households in period two have similar characteristics to those of the households in period one, so that for a household with certain characteristics in period one we would be able to predict its consumption in the other period. That first condition is perhaps not so controversial when we are looking at a particular country, only a few years apart, and the surveys are properly done. What is much more difficult to satisfy is the second condition: that the residuals from the two models are independent across rounds. That is only true under very extreme circumstances, for example if there were no fixed effects at all in the data, or if the shocks to consumption or income were only transitory. These are not likely to be true: there is likely to be a fixed effect in the
error term, and shocks are likely to be non-transitory, so it is not likely that the conditions I was describing will hold. So the poverty transition that we could calculate using this procedure would not be the accurate poverty transition. What it would be is an upper bound on mobility, because we are assuming that the errors have absolutely no correlation with each other, when in fact there probably is some correlation. We expect this condition to be violated, and as long as the errors in these models are positively correlated, what we are getting is an upper bound: we will be overstating mobility, obtaining an upper bound on movements into and out of poverty. If the correlation of the errors were negative, the procedure wouldn't work at all, but that is unlikely, and we actually demonstrate in the work we've done with true panel data that you never seem to find cases where these correlations are negative; it is almost always a positive correlation that you observe in the residuals. So that second point has not, in practice, been found to be a real problem. In the same way that we produce an upper-bound estimate, as I've just described, we can also tweak the procedure a little to produce a lower-bound estimate. For the lower bound we also estimate the model in period two, and rather than drawing from the residuals of the period-one regression, we take each household's observed residual from the period-two regression and add it to the prediction of period-one consumption. This is tantamount to assuming that there is perfect correlation in the residuals over time, and as a result it produces a lower-bound estimate of mobility, a lower-bound estimate of transitions into and out of poverty. So the procedure I've described here allows us to
produce bounds on mobility, and when we look at this with data and actually compare it to true panel data results, we find that the method works reasonably well. Let me, in the interest of time, move quickly on to that. Take the case of Indonesia, one of the countries where we have actual panel data and can test the pseudo-panel approach against it. The true percentage of the population that is poor in both periods is around 4 percent, with a 95 percent confidence interval ranging between 4.7 percent and 0.7 percent; that is the percentage of the population poor in both periods in the Indonesian case between 1997 and 2000. With our bounds, the lower-bound estimate of mobility puts that share somewhere between 10 and 12 percent, the upper-bound estimate of mobility puts it somewhere between 2 and 3 percent, and the true value lies somewhere between 4.7 and 7 percent. So what we find is that these bounds do sandwich the true poverty transition estimate, and we found this to be the case in Vietnam as well; subsequent work using data for Peru, for Nicaragua and a variety of other countries finds similarly that these bounds approaches do sandwich the true estimates. Of course, one thing we do worry about is that the bounds can be reasonably wide, and that is something we would like to improve on. So one of the things we have been working on in recent years is to ask whether we can build on this methodology. The methodology so far has been very non-parametric in many respects, or semi-parametric at best, but we could also implement a more parametric approach, and that involves making a few additional assumptions. The first is that the epsilons, the residuals, have a bivariate normal distribution, and we assume that the
correlation between these error terms is positive. Once we have that correlation, the non-parametric procedure I've outlined is tantamount to saying that the correlation is either perfect or absent altogether: this rho term, the correlation term, is either one or zero. But once we have the parametric framework, we can experiment with values of rho that lie between zero and one, and we can narrow the bounds on our transition probabilities as a function of the degree to which we allow rho to take a narrower range. Most recently, we have been working on an approach that allows us to estimate rho directly from the data, using a cohort-level analysis that looks at the correlation of consumption across birth cohorts. That is what we implement when we apply this method to the Indian data: we take a parametric version of this procedure to construct synthetic panels and measure vulnerability in the Indian data, and we validate it with the Vietnam data. But just to give you the final punchline, because I'm running out of time: once we have implemented this with the NSS data for 2004 and 2009, we can put this picture alongside those for Vietnam and the US, and we find that over the same interval, 2004 to 2009, there is a decline in poverty in India, an increase in vulnerability in India, and an increase in the size of the middle class, the prosperous, in India. So the experience of these three countries contrasts over this time period, with the vulnerability group experiencing a different evolution over time in each of the three countries. I'm very sorry for going over my time, but I'll stop there. Thanks very much.
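For concreteness, the non-parametric bounds procedure described in the talk might be sketched roughly as follows. This is a minimal stylization under my own simplifying assumptions (plain OLS of log consumption on time-invariant characteristics, a single common log poverty line z, invented variable names), not the authors' actual code:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def ols(X, y):
    """OLS coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def poverty_bounds(X1, y1, X2, y2, z, n_reps=100):
    """Bound the share of the population poor in both rounds, using only
    two independent cross-sections.

    X1, y1: time-invariant characteristics and log consumption, round 1
    X2, y2: the same variables for the (different) round-2 sample
    z:      log poverty line (assumed common to both rounds here)
    """
    beta1, resid1 = ols(X1, y1)      # round-1 consumption model
    _, resid2 = ols(X2, y2)          # round-2 model (for the lower bound)
    poor2 = y2 < z                   # observed poverty status in round 2

    # Zero-correlation imputation: impute each round-2 household's round-1
    # consumption as X2 @ beta1 plus a residual drawn with replacement from
    # the round-1 empirical distribution, averaging over replications.
    # This overstates mobility, so it UNDERSTATES persistence.
    reps = [np.mean(((X2 @ beta1 + rng.choice(resid1, size=len(y2))) < z) & poor2)
            for _ in range(n_reps)]
    persist_lo = float(np.mean(reps))

    # Perfect-correlation imputation: reuse each household's OWN round-2
    # residual, which understates mobility and OVERSTATES persistence.
    persist_hi = float(np.mean(((X2 @ beta1 + resid2) < z) & poor2))
    return persist_lo, persist_hi
```

The zero-correlation draws understate the share poor in both rounds, reusing each household's own round-two residual overstates it, and the true transition should lie between the two, which is the sandwiching property the talk reports validating against the Indonesian and Vietnamese panels.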
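The parametric refinement, with bivariate-normal residuals and a correlation rho that can be set, or estimated from birth cohorts, anywhere between zero and one, might look like the following sketch. The function and argument names are placeholders of mine, and the 2x2 layout (round-1 status in rows, round-2 status in columns) is my own convention:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def transition_matrix(x, beta1, beta2, sigma1, sigma2, rho, z1, z2):
    """2x2 poverty-transition probabilities for a household with
    time-invariant characteristics x, under bivariate-normal residuals
    with correlation rho.  a_t = (z_t - x'beta_t) / sigma_t is the
    standardized distance to the round-t poverty line."""
    a1 = float((z1 - x @ beta1) / sigma1)
    a2 = float((z2 - x @ beta2) / sigma2)
    biv = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
    pp = float(biv.cdf([a1, a2]))          # P(poor in round 1 AND round 2)
    p1, p2 = norm.cdf(a1), norm.cdf(a2)    # marginal poverty probabilities
    # rows: round-1 status (poor, non-poor); columns: round-2 status
    return np.array([[pp,       p1 - pp],
                     [p2 - pp,  1.0 - p1 - p2 + pp]])
```

Values of rho near zero or near one approach the non-parametric upper and lower bounds, intermediate values narrow them, and a cohort-based estimate of rho pins down a point estimate; averaging these matrices over the round-2 sample would give population-level transition shares.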