Poverty Dynamics in India between 2004 and 2012: Insights from Longitudinal Analysis Using Synthetic Panel Data

Recent National Sample Surveys point to significant poverty reduction in India since 2004/05, with a marked acceleration between 2009/10 and 2011/12. This paper enquires into important aspects of income mobility between 2004/05 and 2011/12, based on new statistical methods to convert the three pertinent National Sample Survey rounds into synthetic panels. The analysis draws on the synthetic panels to derive a vulnerability line for India that can be used to separate out a population subgroup comprising non-poor households facing a heightened risk of falling into poverty. The paper documents a strong pattern of upward mobility out of poverty and vulnerability into the middle class, with a noticeable acceleration between 2009/10 and 2011/12. The paper further undertakes a careful investigation into the comparability of the survey rounds, prompted by the observation that fairly significant modifications had been made to survey questionnaires. The findings suggest that changes in questionnaire design have not compromised the comparability of the data.


Policy Research Working Paper 7270
This paper is a product of the Poverty and Inequality Team, Development Research Group. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at hdang@worldbank.org.

I. Introduction
Poverty has steadily decreased in India over the past decade. Since India accounts for a quarter of the world's poor (i.e., those living under $1.25 a day), which is roughly half again its share of the world's population (17 percent), reducing poverty in this country would not only improve its own welfare but would also register a significant impact on global poverty estimates. 1 What is particularly striking is the acceleration of poverty reduction that appears to be taking place.
Between 2004/5 and 2009/10 poverty declined from 37.7 percent to 29.9 percent. Over the subsequent two years, poverty declined by a further 10 percentage points, to 20.0 percent. These achievements in poverty reduction have been widely remarked on and celebrated. 2 The aim of this paper is to consider the recent experience of poverty decline from two perspectives that have not historically received a great deal of attention in the Indian context, most likely because of the scarcity of nationally representative panel survey data. 3 First, we ask whether there is any suggestion that those who are currently non-poor remain at a heightened risk of falling into poverty. Second, we investigate to what extent one can discern a core subset of the population that remains stuck in poverty and is somehow unable to participate in the processes of upward mobility.
1 We use the poverty rates and population data respectively from the World Bank's PovCalNet database (http://iresearch.worldbank.org/PovcalNet/index.htm) and Development Indicators database. All figures are estimated averages for the two years 2011 and 2012.
2 There have nevertheless been some concerns raised about the credibility of the most recent episode of poverty decline; we return to this discussion in the next section. Unless otherwise noted, all the poverty rates are based on the national poverty lines.
3 Smaller panel surveys have been fielded for India, but none of these provide nationally representative data; see Dercon and Shapiro (2007) for a recent review. For recent studies that use these panel surveys, see, e.g., Munshi and Rosenzweig (2009), Krishna and Shariff (2011), and Dercon, Krishnan, and Krutikova (2013), respectively, for analysis of the REDS panel between 1982 and 1999, the NCAER panel between 1993-94 and 2004-05, and the ICRISAT panel between 1975 and 2006. While panel surveys allow more in-depth analysis of mobility, Rosenzweig (2003) discusses potential issues that can bias these surveys (that are not nationally representative) such as split-offs or attrition. A new nationally representative panel survey (IHDS) fielded by the University of Maryland and NCAER promises much improvement over the previous panels (http://ihds.umd.edu/index.html). But note that compared to the NSSs, the IHDS has less than half the sample size and collects a much reduced version of household consumption data (i.e., 47 consumption items in the latter vs. more than 400 items in the former).
Our analysis studies the period between 2004/5 and 2011/12, and draws on three "thick" rounds of National Sample Survey data referring to 2004/5, 2009/10, and 2011/12. As noted above, these data sources indicate a substantial decline in poverty over the entire 7-year period, with a sharp acceleration occurring after 2009/10. Our analysis suggests that during the first episode of poverty reduction between 2004/5 and 2009/10, poverty decline was accompanied by an increase in the share of the population that can be considered vulnerable, or facing a heightened risk of falling into poverty. Between 2009/10 and 2011/12, poverty decline accelerated further, and this was now accompanied also by a decline in the fraction of the population that was particularly at risk of falling into poverty.
We show further that aggregate trends in poverty reduction mask a considerable degree of entry into, and exit out of, poverty, but that a substantial core of the poor have remained poor over the duration of the study period. We document some of the key household characteristics of those who have managed to escape poverty and vulnerability, and contrast these with those who have fallen into this undesirable welfare status during this period.
Before turning to a detailed discussion of our empirical findings, we confront in this paper several methodological challenges that have typically held back investigations of the kind we are attempting here. The key difficulty is that analysis of poverty transitions and of the likelihood of escaping, or falling into, poverty depends on the availability of panel data that permit the analyst to follow households over time. Yet in India, as in many other countries, nationally representative panel data are not available. The existing data sources underpinning poverty analysis, the NSS surveys, are high-quality cross-sectional data sources that offer at best a snapshot of living conditions at specific moments of time.
In order to overcome this limitation, we implement in this paper a methodology for converting the NSS cross-sectional surveys into synthetic panels. The approach we follow has been recently introduced into the literature (Dang, Lanjouw, Luoto and McKenzie, 2014a; Dang and Lanjouw, 2013) and a number of studies which validate the method have generally yielded encouraging findings (Dang et al., 2014a; Dang and Lanjouw, 2013; Cruces et al., 2014; Martinez, 2015). We highlight the assumptions imposed by the methodology and discuss their applicability to the Indian context. 4 The methodology for constructing synthetic panels is predicated on strict comparability of the underlying cross-section surveys. It has already been noted that India's NSS surveys are generally regarded as high-quality data sources. We also focus our attention here on the "thick" rounds that involve larger sample sizes and that are designed to be representative at the rural/urban and state level. Nonetheless we investigate whether the 2009/10 and 2011/12 rounds are strictly comparable, since the possibility of a breakdown in comparability is prompted by the remarkable rate of poverty decline as well as evidence that there are some noticeable changes in the design of the consumption questionnaire between these two years. We note that there had been intensive and contentious debate around the comparability of the 1999/00 round of the NSS survey with earlier NSS rounds, after a certain number of changes and adjustments had been made to the questionnaire (Deaton and Kozel, 2005). We ask, therefore, whether there is any call for similar disquiet with the surveys examined in this paper.
4 Synthetic panels constructed using the Dang et al. (2014a) and Dang and Lanjouw (2013) methods have been applied to study poverty dynamics in various settings including multi-country analysis for Latin America (Ferreira et al., 2013; Vakis, Rigolini, and Lucchetti, 2015), South Asia (Rama et al., 2015), and Europe and Central Asia (Cancho et al., 2015). Specific country case studies using synthetic panels investigate countries including the Kyrgyz Republic (Bierbaum and Gassmann, 2012), Bhutan (World Bank, 2014), and Senegal (Dang, Lanjouw, and Swinkels, forthcoming). Another promising use of synthetic panels is to evaluate program impacts (Garbero, 2014).
We tackle this question with an imputation-based method recently explored in Dang, Lanjouw and Serajuddin (2014b) that builds on a number of earlier studies (Elbers, Lanjouw, and Lanjouw, 2003; Tarozzi, 2007). 5 Our findings suggest that the 2009/10 and 2011/12 survey rounds do not appear to suffer from serious comparability issues. The observation of a sharply accelerated poverty decline after the 2009/10 round, from 29.9 to 20 percent in 2011/12, seems robust. We also appear to be on solid footing with respect to the data underpinnings for converting these three NSS rounds into synthetic panels.
We start, in Section II, with a brief discussion of poverty trends during the late 2000s and explore further the question of whether the 2009/10 and 2011/12 NSS rounds are comparable.
Section III describes our efforts to assess the comparability of the 2009/10 and 2011/12 surveys.
Besides offering validation evidence for the recent poverty decline, these two sections also describe the preparatory data work required to construct synthetic panels with which to study poverty dynamics. Section IV then implements our approach to convert the three most recent NSS rounds between 2004/05 and 2011/12 into synthetic panels. We also describe and implement in this section an approach to construct "vulnerability lines" that permits a richer mobility analysis by allowing us to identify a particular portion of the population that is non-poor but that faces a heightened risk of falling into poverty. We implement a procedure proposed in Dang and Lanjouw (2014) for specifying vulnerability lines anchored explicitly to the observed incidence of non-poor households falling into poverty. This procedure makes light demands on data and can be straightforwardly applied to synthetic panel data.
5 Elbers et al. (2003) provide a method that imputes household consumption from a survey into a population census. Adapting this approach for survey-to-survey imputation, Christiaensen et al. (2012) impute poverty estimates using data from several countries including China, Kenya, Russia, and Vietnam; other studies analyze data from Morocco (Douidich et al., 2014) and Uganda (Mathiassen, 2013). See also Tarozzi and Deaton (2009) and Rao (2003) for other studies on survey-to-census imputation.
We then turn in Section V to a discussion of mobility between the three population segments that derive from this analysis: the poor, the vulnerable, and the middle class, and we produce some basic profiles of the population in different transition categories. We end in Section VI with concluding remarks.

II. Poverty Trends and Data
Steady GDP per capita growth has helped drive down poverty rates in India in the late 2000s. 6 In particular, GDP per capita increased by almost half (47 percent) during the period 2004(World Bank, 2015, and poverty decreased by 21 percent over the same period. The country's continued economic growth resulted in a further increase of GDP per capita over the subsequent two years, by almost one-fifth (19 percent) in 2011/12. While this robust growth rate should be expected to bring more poverty reduction, the contemporaneous fall in poverty rates turned out to be much larger than expected. To quite a few observers, the fall in poverty has been startling. 7 Figure 1 plots the annual growth rate of GDP per capita (left axis) and the headcount poverty rate (right axis) between 2004 and 2012. Since a large share of the labor force is employed in agriculture, the figure also displays the annual growth rate of the value added per worker of the agricultural sector. The disconnect between GDP per capita growth rates and poverty reduction is brought out sharply where, despite a remarkably weaker growth of the former, the slope of the line representing the latter is much steeper in the second period than that in the first period. An even weaker growth of the agricultural sector further highlights this difference.
6 See, e.g., Datt and Ravallion (2011) and Ravallion (2011) for comprehensive discussions on economic growth and poverty in India for earlier periods.
7 For example, Dutta and Panda (2014) observe that there is much controversy around the (arbitrariness of the) specification of the poverty line. Saxena (2013) points out a couple of inconsistencies, such as that the share of the population that needs food subsidies and the slum population in major cities are much larger than the reported poverty rate, and that the specified poverty lines may be too low and may potentially be distorted due to political motives. In addition to these last two issues, Himanshu (cited in Rao, 2013) voices the concern that imputed spending values for certain social transfer programs may not be calculated correctly. See also the BBC (Limaye, 2013), New York Times (Gupta, 2013), and Washington Post (Lakshmi, 2012) for related discussion on the debates on poverty in this period.
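The acceleration discussed above can be made concrete with simple arithmetic on the headcount rates quoted in the Introduction (37.7, 29.9, and 20.0 percent). The sketch below is purely illustrative; the five-year and two-year spans are assumed here to be the nominal gaps between survey rounds, and the function name `pace` is hypothetical.

```python
# Headcount poverty rates (percent) quoted in the Introduction.
rates = {"2004/05": 37.7, "2009/10": 29.9, "2011/12": 20.0}

def pace(start, end, years):
    """Return (percentage-point drop, relative drop in percent, points per year)."""
    drop = rates[start] - rates[end]
    return drop, 100.0 * drop / rates[start], drop / years

# Assumed spans: five years between the first two rounds, two years between the last two.
first = pace("2004/05", "2009/10", 5)
second = pace("2009/10", "2011/12", 2)
print("2004/05-2009/10: %.1f points, %.1f%% relative, %.2f points/year" % first)
print("2009/10-2011/12: %.1f points, %.1f%% relative, %.2f points/year" % second)
```

Under these assumptions the relative decline in the first episode rounds to the 21 percent cited in the text, and the annual pace of decline in the second episode is roughly three times that of the first.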
Despite the various arguments for or against this swift fall in poverty, one simple but perhaps not unreasonable hypothesis is that the questionnaire design of the consumption module in the 2011/12 (68th) round of the NSS is not comparable to that in the 2009/10 (66th) round (and the 2004/05, or 61st, round), which in turn leads to inconsistently constructed and incomparable consumption data. Indeed, there are several major changes to the questionnaire in the 68th round that include: i) changing the consumption code, ii) aggregating consumption items in broader groups, iii) disaggregating consumption items in smaller groups, iv) using or providing somewhat different item names, v) dropping some consumption items in previous rounds, and vi) adding new consumption items. These changes may not be harmless in affecting the comparability of the consumption data over time. 8 To further investigate whether these changes may lead to different consumption aggregates over time, we explore the raw item-by-item consumption data at the household level and examine a variety of alternative consumption aggregations over time. The results shown in Appendix 1, Table 1.1 confirm that these questionnaire revisions could be a source of concern. While most consumption groupings make up rather similar shares in total household consumption, the share of the items with some change in code (grouping number 2) is two percentage points lower, and the share of the new items added in the 68th round is three percentage points higher than that of the items in the 66th round that are dropped. 9 While these differences may balance out on average, and may not result in any significant change to the total consumption aggregate, they may also point to potentially deeper comparability issues with the consumption data.
Moreover, even if mean values are not much affected, these changes could affect different parts of the consumption distribution differently, and could thus still have a bearing on poverty estimates.
The discussion above evokes a similar, but much larger, poverty debate that took place in India in the early 2000s. In the late 1990s, the National Sample Survey Office (NSSO) revised the questionnaire of the NSS in 1999/2000 (55th round) in an attempt to bring estimates of household consumption from the survey in line with those from national accounts. In particular, these revisions include changing the recall period for household durables and education expenses from a 30-day interval to a 365-day interval, and using both the traditional 30-day recall period as well as a new 7-day recall period for food items. The Government of India published estimates showing that the headcount poverty rate fell by 10 percentage points between 1993/1994 and 1999/2000. Independent researchers, however, noted the possibility of non-comparability of the published consumption data, and applied a variety of methods to adjust for this. A variety of estimates were produced with some suggesting a rate of decline ranging from only somewhat lower than the official estimates (Deaton and Dreze, 2002; Tarozzi, 2007) to one estimate suggesting a mere three percentage point decline in poverty during the decade of the 1990s (Sen and Himanshu, 2005; see also Kijima and Lanjouw, 2003). As is powerfully argued in the book "The Great Indian Poverty Debate" (Deaton and Kozel, 2005), concerns about comparability can greatly complicate assessments of poverty trends.
We describe in the next section a method for gauging comparability between the 2009/10 and the 2011/12 rounds of the NSS.

III. Predicted Poverty Trends Using Imputation
We provide here a brief overview of the survey-to-survey imputation method described in Dang et al. (2014b) before discussing results. Further discussion of technical details and estimation procedures is available in the cited paper.

III.1. Overview of the Imputation Method
Let xj be a vector of characteristics that are commonly observed between the two surveys, where j indicates survey round, j= 1, 2. 10 These characteristics can include household variables such as the household head's age, sex, education, ethnicity, religion, language, occupation, household assets or incomes, and other community or regional variables. Household consumption (or income) data exist in one survey round but are missing in the other survey round, thus without loss of generality, let (survey) round 1 and round 2 respectively represent the survey round with and without household consumption data, and y1 represent household consumption in round 1.
Alternatively, we can also refer to round 1 as the base survey, and round 2 as the target survey.
To further operationalize our estimation, we assume that the linear projection of household consumption on household and other characteristics (x) in both survey rounds, if such consumption data were also available in period 2, is given by a cluster random-effects model 11
$$y_j = \beta_j' x_j + \mu_j + \varepsilon_j \qquad (1)$$
where $\beta_j$ is the vector of coefficients, and the cluster random effects $\mu_j$ and the error term $\varepsilon_j$ are assumed uncorrelated with each other and to follow a normal distribution, conditional on household characteristics. Equation (1) thus provides a standard linear random effects model that can be estimated using most available statistical packages. Let z2 be the poverty line in period 2. If y2 existed, the (headcount) poverty rate P2 in this period could be estimated with the following quantity
$$P(y_2 \le z_2) \qquad (2)$$
where P(.) is the probability (or poverty) function that gives the percentage of the population that are under the poverty line z2 in round 2.
10 To make notation less cluttered, we suppress the subscript for each household in the following equations.
11 This assumption implies that the returns to the characteristics xj in both periods are captured by equation (1) and precludes the (perhaps exceptionally) rare situations where there could be no correlation between these characteristics and household consumption due to unexpected upheavals in the economy or calamitous disasters. Contexts where there are sudden changes to the economic structures (e.g., overnight regime change) may also introduce noise into the comparability of the estimated parameters.
Assume that the sampled data in round 1 and round 2 are representative of the population in each respective time period, such that estimates based on the same characteristics x in these two survey rounds are consistent and comparable over time (Assumption 1). And assume further that given the estimated consumption parameters from round 1, the changes in the distributions of the explanatory variables x between the two periods can capture the change in poverty rate in the next period (Assumption 2). 12 Given these two assumptions, Dang et al. (2014b) propose an approach to impute the poverty rate for round 2, where the parameter estimate $\beta_1$ and the distributions of the cluster random effects and the error term estimated from data in round 1 can be imposed on the data in round 2. Note that the standard errors of the imputation-based estimates can in fact be even smaller than those of the true (or design-based) rate if there is a good model fit (or the sample size in the target survey is larger than that in the base survey; see, e.g., Matloff, 1981).
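The imputation logic above can be illustrated with a deliberately simplified sketch. It is not the estimator of Dang et al. (2014b): it assumes a single covariate, drops the cluster random effects, and uses simulated data, and all names and numbers (`fit_ols`, `imputed_poverty`, the poverty line `z`) are hypothetical. It only shows the idea of estimating parameters in the base survey and imposing them on the characteristics observed in the target survey.

```python
import math
import random

def norm_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def fit_ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, residual sd)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, math.sqrt(rss / (n - 2))

def imputed_poverty(x_target, a, b, sigma, z):
    """Average over target households of P(a + b*x + e <= z), e ~ N(0, sigma^2)."""
    return sum(norm_cdf((z - a - b * xi) / sigma) for xi in x_target) / len(x_target)

random.seed(0)
# Simulated (hypothetical) data: x is one welfare correlate, y is log consumption.
# Round 2 has a better x distribution but the same coefficients as round 1.
x1 = [random.gauss(5.0, 2.0) for _ in range(5000)]
y1 = [1.0 + 0.2 * xi + random.gauss(0.0, 0.5) for xi in x1]
x2 = [random.gauss(6.0, 2.0) for _ in range(5000)]
y2 = [1.0 + 0.2 * xi + random.gauss(0.0, 0.5) for xi in x2]  # unobserved in practice

z = 1.8  # hypothetical poverty line in log consumption
a, b, sigma = fit_ols(x1, y1)                      # estimate on the base survey
p2_imputed = imputed_poverty(x2, a, b, sigma, z)   # impose on the target survey
p2_actual = sum(yi <= z for yi in y2) / len(y2)    # available only in this simulation
print(round(p2_imputed, 3), round(p2_actual, 3))
```

Because the simulated coefficients are identical in both rounds, Assumption 2 holds by construction and the imputed rate should track the actual rate closely; with real data that is precisely what needs to be validated.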
If consumption data are available from both the base and target surveys, we can use an Oaxaca-Blinder type decomposition to formally test for Assumption 2 to shed further light on model selection. In particular, the change in poverty between the survey rounds can be broken down into two components, one due to the changes in the estimated coefficients (the first term in square brackets in equation (3)) and the other due to the changes in the x characteristics (the second term in square brackets in equation (3)). Furthermore, if we make a stricter assumption that the error term in equation (1) follows a standard normal distribution, that is, $\varepsilon_j \sim N(0, 1)$, we can estimate equation (1) by a random effects probit model instead of the linear random effects model.
But the standard modelling tradeoff holds: if our stricter assumption is correct, estimation results are more accurate and vice versa. For comparison purposes, we will later present estimates using both the linear random effects and random effects probit models. 13 Following the estimation procedures in Dang et al. (2014b), our empirical implementation involves a two-stage process. First, we apply the estimated parameters from the 2004/05 round on the 2009/10 data to impute poverty for the latter. Since the questionnaires remain the same over these two survey rounds, their consumption data are comparable, and we can thus validate these estimated poverty rates against those based on the actual consumption data for the 2009/10 round.
Second, we produce imputation-based poverty estimates for 2011/12 using the same (model) specifications as with the first step, but with the estimated parameters from the 2009/10 round on the data from the 2011/12 round. Put differently, the first step would offer further validation that this imputation method works in the context of India, as well as provide the appropriate specification to use for the imputation; these two steps would satisfy the two assumptions discussed earlier.
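For readers unfamiliar with Oaxaca-Blinder type decompositions of a poverty change, one common way to write such a decomposition is sketched below. The notation is assumed here and is not taken from the paper: $y_2^{1} = \beta_1' x_2 + \mu_1 + \varepsilon_1$ denotes a counterfactual consumption that pairs round-1 parameters with round-2 characteristics, and $z$ denotes a common real poverty line.

```latex
P(y_2 \le z) - P(y_1 \le z)
  = \big[ P(y_2 \le z) - P(y_2^{1} \le z) \big]
  + \big[ P(y_2^{1} \le z) - P(y_1 \le z) \big]
```

In this form the first bracket holds the characteristics fixed at their round-2 values and varies the parameters, while the second bracket holds the parameters fixed at their round-1 values and varies the characteristics. Read this way, Assumption 2 would correspond to the first bracket being close to zero.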

III.2. Estimation Results
Since changes in household (heads') characteristics may indicate the corresponding changes in household consumption, it can be useful to examine as a preliminary check the distributions of household characteristics across the two survey rounds in 2009/10 and 2011/12. The summary statistics provided in Appendix 1, Table 1.2 show that these changes appear rather negligible with most of the differences being not statistically significant. Some characteristics that are associated with higher levels of household welfare (e.g. heads with completed post-graduate education, household members with regular salary incomes, or urban residents receiving regular wages) show a statistically significant improvement over time; but others that have opposite effects (e.g., backward classes and radio ownership) also have statistically significant changes. 14 The picture provided from considering the pairwise changes in the distributions of these variables over time thus seems mixed at best.
We then proceed to impute poverty for the target survey in 2009/10, using the estimated parameters from the base survey in 2004/05. Assumption 1 on survey comparability is satisfied since the questionnaires (and sample design) for these two survey rounds remain the same. To satisfy Assumption 2, we can then consider five different household consumption model specifications where the changes in the distributions of the explanatory variables x between the two periods can capture to varying degrees the change in poverty over time. These specifications are built on a cumulative basis for comparison purposes (and robustness checks), with later specifications sequentially adding more variables to earlier specifications.
Specification 1 is the most parsimonious specification and consists of household size, household heads' age, gender, and dummy variables indicating whether the head is Hindu or Muslim, whether the head belongs to a scheduled tribe, a scheduled caste or backward classes, whether the head is literate, and the head's education levels. Specification 2 adds to Specification 1 household demographics such as the shares of household members in the age ranges 0-14, 15-24, and 25-59 (with the reference group being those 60 years old and older). Specification 3 adds to Specification 2 employment variables, which include dummy variables indicating whether the household has any member working for a regular salary, whether the head is self-employed in the agricultural sector or the non-agricultural sector (for rural residents), and whether the head works for regular wage, is self-employed or engaged in casual work or other type of work (for urban residents).
Specification 4 adds to Specification 3 a variable indicating home ownership. Finally, Specification 5 adds a more detailed list of asset variables, which include the energy sources for lighting and cooking, whether the household has a radio, television set, electric fan, sewing machine, freezer, air conditioner, bicycle, motorbike, and a car. However, slightly more than 5,000 and 1,000 households are missing these asset variables in the 2004/05 and 2009/10 rounds, respectively. Full model specifications and regression results are provided in Appendix 1, Table 1.3.
Estimation results using the linear random effects model are shown in Table 1. It is useful to note that the standard errors for the imputation-based estimates are progressively smaller in the normal linear regression models and random effects probit models than that of the design-based poverty estimate. This is consistent with our earlier discussion since, assuming the specification is correct, a good model fit can help bring down the standard errors. Similarly, the random effects probit models make a stricter (modelling) assumption on the error term than the linear random effects models, and their standard errors are consequently smaller.
As a further check on the model specification, we provide a decomposition test of the changes in poverty due to the changes in the household characteristics and the estimated coefficients. We turn next to imputing poverty for 2011/12 with the estimated parameters from the 2009/10 survey round. 15 We have preferred specifications for analysis but we also show estimates for all the other specifications for comparison in Table 3. Our preferred specifications show that the imputation-based poverty estimates range from 22.9 percent (Specifications 3 and 4, the probit model) to 25 percent (Specification 2, the linear regression model). Interestingly enough, except for Specification 5 that could be excluded due to overfitting concerns, all other estimates, including even Specification 5 with the probit model, fall within this range.
These imputation-based estimates are larger than the design-based estimate of 22 percent, and the differences are statistically significant (outside the 95 percent confidence interval of the latter).
However, considering all specifications together, the difference between the probit estimates and the design-based estimate is between one and two percentage points, while that between the normal linear regression estimates and the design-based estimates is between two and three percentage points. Thus according to our imputation-based estimates, while the design-based estimate may underestimate poverty in 2011/12, it appears that this underestimation may in practice be not that large.
15 We use estimated parameters from the 2009/10 round, rather than the 2004/05 round, to impute poverty in the 2011/12 round since these parameters may change over time. Indeed, the null hypothesis of the equality of the estimated parameters in these two survey rounds is rejected with a significantly large value from a Wald test (results available upon request). More generally, survey rounds that are closer in time are more appropriate for imputation.

IV. Constructing Synthetic Panels
Our findings in the previous section suggest that the sharp decrease in poverty rate between 2009/10 and 2011/12 is reasonably captured by the 66th and 68th rounds of the NSSs. Put differently, these two survey rounds provide comparable consumption data for most practical poverty measurement purposes, which is a prerequisite for constructing synthetic panel data. We next provide a brief overview of the methods that will be used.

IV.1. Overview of the Synthetic Panel and Vulnerability Analysis Methods 16
Let xij be a vector of household characteristics observed in survey round j (j= 1 or 2) that are also observed in the other survey round for household i, i= 1,…, N. These household characteristics include variables that may be collected in only one survey round, but whose values can be inferred for the other round. Then let yij represent household consumption or income in survey round j, j= 1 or 2. The linear projection of household consumption (or income) on household characteristics for each survey round is given by
$$y_{ij} = \beta_j' x_{ij} + \varepsilon_{ij} \qquad (5)$$
Let zj be the poverty line in period j. We are interested in knowing such quantities as
$$P(y_{i1} < z_1 \text{ and } y_{i2} > z_2) \qquad (6a)$$
which represents the percentage of households that are poor in the first period but nonpoor in the second period, or
$$P(y_{i2} > z_2 \mid y_{i1} < z_1) \qquad (6b)$$
which represents the percentage of poor households in the first period that escape poverty in the second period. In other words, for the average household, quantity (6a) provides the joint (unconditional) probabilities of household poverty status in both periods, and quantity (6b) the conditional probabilities of household poverty status in the second period given their poverty status in the first period.
If true panel data are available, we can straightforwardly estimate the quantities in (6a) and (6b); but in the absence of such data, we can use synthetic panels to study mobility. To operationalize the framework, we make two standard assumptions. First, we assume that the underlying population being sampled in survey rounds 1 and 2 is identical such that its time-invariant characteristics remain the same over time. More specifically, coupled with equation (5), this implies the conditional distribution of expenditure in a given period is identical whether it is conditional on the given household characteristics in period 1 or period 2 (i.e., $x_{i1} = x_{i2}$ implies $y_{i1}|x_{i1}$ and $y_{i1}|x_{i2}$ have identical distributions). Second, we assume that $\varepsilon_{i1}$ and $\varepsilon_{i2}$ have a bivariate normal distribution with positive correlation coefficient ρ and standard deviations $\sigma_{\varepsilon_1}$ and $\sigma_{\varepsilon_2}$ respectively. Quantity (6a) can be estimated by
$$P(y_{i1} < z_1 \text{ and } y_{i2} > z_2) = \Phi_2\left(\frac{z_1 - \beta_1' x_{i2}}{\sigma_{\varepsilon_1}},\; -\frac{z_2 - \beta_2' x_{i2}}{\sigma_{\varepsilon_2}},\; -\rho\right) \qquad (7)$$
where $\Phi_2(.)$ stands for the bivariate normal cumulative distribution function (cdf) (and $\phi_2(.)$ stands for the bivariate normal probability density function (pdf)). In equality (7), the parameters $\beta_j$ and $\sigma_{\varepsilon_j}$ are estimated from equation (5), and ρ can be estimated using an approximation of the birth-cohort-aggregated household consumption between the two surveys. Note that in equality (7), the estimated parameters obtained from data in both survey rounds are applied to data from the second survey round (x2) (or the base year) for prediction, but we can use data from the first survey round as the base year as well. It is then straightforward to estimate quantity (6b) by dividing quantity (6a) by $\Phi\left(\frac{z_1 - \beta_1' x_{i2}}{\sigma_{\varepsilon_1}}\right)$, where $\Phi(.)$ stands for the univariate normal cumulative distribution function (cdf). 18 Using the given poverty lines zj, quantities (6a) and (6b) classify the population into two groups, one is poor and the other nonpoor.
But we can obtain richer analysis by further disaggregating the nonpoor group into two additional groups: the vulnerable (those that are nonpoor but still face a significant risk of falling into poverty) and the middle class (the remaining group with higher consumption levels). The dividing vulnerability line $v_j$ that separates these groups can be derived from a specified vulnerability index P, which is defined as the percentage of the non-poor population in the first period that fall into poverty in the second period. This vulnerability index can be anchored to, say, social protection targets, within the bounds given by the data (Dang and Lanjouw, 2014). We will further discuss the vulnerability line in the next section.

18 Further asymptotic results and formulae for the standard errors are provided in Dang and Lanjouw (2013).
Given $v_j$, we can extend expression (6a) to analyze the dynamics for these three categories: poor, vulnerable, and middle class. For example, the percentage of poor households in the first period that escape poverty but still remain vulnerable in the second period (joint probability) is

$$P(y_{i1} < z_1 \ \text{and} \ z_2 < y_{i2} < v_2) = \Phi_2\left(\frac{z_1 - \beta_1' x_{i2}}{\sigma_{\varepsilon_1}},\ \frac{v_2 - \beta_2' x_{i2}}{\sigma_{\varepsilon_2}},\ \rho\right) - \Phi_2\left(\frac{z_1 - \beta_1' x_{i2}}{\sigma_{\varepsilon_1}},\ \frac{z_2 - \beta_2' x_{i2}}{\sigma_{\varepsilon_2}},\ \rho\right)$$

[...]

Do we also have lower vulnerability for the longer period 2004-11? In general, this is an empirical question, since a longer period is likely to be associated with larger vulnerability indexes unless household consumption grows fast enough to offset this trend (Dang and Lanjouw, 2014). Note that we provide more detailed estimation results for India for the period 2004-09 in this paper than those in our other paper (Dang and Lanjouw, 2014). Our estimates are also different from those in the latter, which deflate all numbers to a population-weighted monthly national poverty line instead.

[...] 2004-09 and 2009-11 on the same graph, and adds that for the period 2004-11 as well. 20 The curve for the period 2009-11 lies everywhere below that for the period 2004-09 and is closer to the origin, which provides a graphical illustration of lower vulnerability in the latter period as discussed above. The curve for the period 2004-11 lies below both those for the other two periods, even though far more so compared with that for the period 2009-11, thus indicating that vulnerability is lowest for this period.

IV.2. Setting Vulnerability Lines
What, then, is an appropriate vulnerability line to use? A common, but rather ad hoc, approach is to arbitrarily scale up the poverty line by a certain factor to obtain the vulnerability line. In [...] Bank, 2012). This approach has the advantage of being simple and easy to understand, but appears to be based on no underlying welfare-theoretic framework.
The recent approach proposed in Dang and Lanjouw (2014) instead derives the vulnerability line from a specified vulnerability index P in the spirit of vulnerability to poverty. 21 This vulnerability index can in turn be obtained in several different ways. One way is to identify the percentage of the population that can be supported with the available social transfer budget, which can provide a guideline to the associated level of vulnerability. 22 Another way is to consider the highest risk of falling into poverty, say, 15 percent, that is deemed socially acceptable; another is to put forward a social protection target that aims to reduce vulnerability to below a certain level (which can be similar to the recent goal of reducing the global headcount poverty to 3 percent or less by 2030 proposed by the World Bank). We will employ a vulnerability index of 20 percent and the associated vulnerability line for our welfare analysis in the next section, but we will also provide, for comparison purposes, some estimates that are based on twice the national poverty line.

20 As discussed earlier, we consider the cohorts that range from 25 years old in the first year in each period, which is 2004 for the periods 2004-09 and 2004-11, and 2009 for the period 2009-11. While we may also adjust the age range so that the cohorts studied for the period 2009-11 are the same as those for the period 2004-09 (i.e., considering the cohorts that range from 30 years old in 2009), this setup is more natural if we are to compare mobility in these two periods (since it keeps ages and other associated characteristics fixed for each period).

21 See Hoddinott and Quisumbing (2010) for a recent review of approaches to measuring vulnerability.

22 For a (very) simplistic example, assume that the available social transfer budget can help, say, 30 percent of the vulnerable population from falling into poverty; we can then identify the vulnerability index and vulnerability line
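The step from a chosen vulnerability index to a vulnerability line can be sketched as a simple grid search. The example below is a minimal sketch under several assumptions: it uses a simulated two-period panel rather than synthetic panel estimates, it computes the index over households lying between the poverty line and the candidate line in the first period, and it raises the candidate line in five-rupee increments. The function names, the simulated consumption distribution, and the 450-rupee poverty line are all hypothetical.

```python
import math
import random


def vulnerability_index(panel, z1, z2, v1):
    """Share of households with z1 < y1 <= v1 that are poor (y2 <= z2) in period 2."""
    at_risk = [(y1, y2) for y1, y2 in panel if z1 < y1 <= v1]
    if not at_risk:
        return None
    fallen = sum(1 for _, y2 in at_risk if y2 <= z2)
    return fallen / len(at_risk)


def find_vulnerability_line(panel, z1, z2, target, step=5.0, max_factor=3.0):
    """Raise the candidate line in fixed steps until the index drops to the target."""
    v1 = z1 + step
    while v1 <= max_factor * z1:
        index = vulnerability_index(panel, z1, z2, v1)
        if index is not None and index <= target:
            return v1, index
        v1 += step
    return None, None  # target not attainable within the search bounds


# Simulated two-period panel of monthly consumption in rupees (hypothetical).
rng = random.Random(0)
rho = 0.7  # assumed correlation of log consumption across the two periods
panel = []
for _ in range(50000):
    e1 = rng.gauss(0.0, 1.0)
    e2 = rho * e1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    panel.append((math.exp(6.3 + 0.5 * e1), math.exp(6.4 + 0.5 * e2)))

z = 450.0  # hypothetical poverty line, the same in both periods
line, index = find_vulnerability_line(panel, z, z, target=0.20)
print(f"vulnerability line: {line}, index at that line: {index:.3f}")
```

The search works because widening the band above the poverty line adds better-off households that face a lower risk of falling into poverty, so the index tends to decline as the candidate line rises. A target that is lower than the risk faced by the whole nonpoor population cannot be reached, which is one way to read the remark that the index must lie within the bounds given by the data.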

V. Welfare Dynamics Analysis
Having discussed the changes in poverty over time in the previous section, we focus in this section on the other dynamics involving vulnerability. We start by showing the welfare transitions for the whole population before delving further into population groups.

V.1. All Population
The welfare transition matrices for the three consumption groups for the two periods 2004-09 and 2009-11 are respectively shown in Panel A and Panel B in Table 5 (Table 4). 23 As noted earlier, we restrict the data to households whose head's age is between 25 and 55 in the first survey round and adjust accordingly for the second survey round (e.g., age ranges 25-55 for 2004/05 and 30-60 for 2009/10 in the period 2004-09) to keep household units stable. This results in slight differences between poverty rates based on these data and those based on the full data. We are using the first and second survey rounds respectively as the base and target surveys for constructing the synthetic panels. For these reasons, the estimated poverty rate for 2011/12 changes slightly from 23.7 percent (Tables 5 and 6) to 25 percent (Table 7) below. This difference appears small in practice and is consistent with the imputation-based estimates of poverty in section III.

[...] seems to continue in the second period 2009-11, but with a faster shrinkage of the poor and growth of the middle class and almost no change to the vulnerable. Specifically, the fall in poverty rises from 14 percent (i.e., 1 - 31.9/36.9) during the first period to 22 percent during the second period, while the middle class growth increases from 24 percent to 28 percent over the same time interval.

In terms of absolute numbers, the vulnerable category decreases by roughly ten percentage points and shrinks from making up more than half of the population in 2004-09 to less than half of the population in 2009-11; the middle class is two and a half times as large in the latter compared to the former (i.e., 30.6/12.2).
Another useful way to gauge welfare mobility in the two periods is to look at the percentage of the population that changes welfare status over time. Estimation results are shown in Table 6, where we keep Panel A the same as in Table 5 [...]. Another difference is that the vulnerable group now expands in the second period, but at a slower rate (3 percent) than in the first period (5 percent). In terms of absolute numbers, the vulnerable group is slightly larger, but the middle class is around 35 to 50 percent larger in the second period. With respect to mobility, the population as a whole is still better off and more mobile in this period, with 27 percent and 17 percent of the population moving up and down one or two welfare categories respectively. These results are qualitatively similar to those obtained from keeping the vulnerability index fixed at 20 percent for both periods, as shown in Table 5.
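The distinction between joint transition probabilities and the shares of the population moving up or down can be made concrete with a small calculation. The sketch below uses a purely hypothetical three-by-three joint transition matrix (the numbers are not taken from Tables 5 to 7) and assumed function names; it sums the cells above and below the diagonal to obtain upward and downward mobility, and divides each row by its total to obtain conditional probabilities.

```python
# Hypothetical joint transition matrix (percent of population), NOT taken from
# the paper's tables. Rows: status in the first year; columns: second year.
STATES = ["poor", "vulnerable", "middle class"]
joint = [
    [22.0, 11.0, 2.0],   # poor in the first year
    [9.0, 30.0, 12.0],   # vulnerable in the first year
    [1.0, 5.0, 8.0],     # middle class in the first year
]


def mobility_shares(matrix):
    """Return (upward, downward, immobile) shares from a joint transition matrix."""
    n = len(matrix)
    up = sum(matrix[i][j] for i in range(n) for j in range(n) if j > i)
    down = sum(matrix[i][j] for i in range(n) for j in range(n) if j < i)
    stay = sum(matrix[i][i] for i in range(n))
    return up, down, stay


def conditional_matrix(matrix):
    """Divide each row by its total: P(status in year 2 | status in year 1)."""
    return [[cell / sum(row) for cell in row] for row in matrix]


up, down, stay = mobility_shares(joint)
cond = conditional_matrix(joint)
print(f"moved up: {up:.0f}%, moved down: {down:.0f}%, unchanged: {stay:.0f}%")
for state, row in zip(STATES, cond):
    print(state, [round(p, 2) for p in row])
```

The joint cells sum to 100 percent over the whole matrix, whereas each row of the conditional matrix sums to one, which is why conditional probabilities tend to be larger numbers for groups that make up a small share of the population.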
We now turn to the welfare transition over the longest interval, 2004-11, and provide estimation results in Table 7.

V.2. Profiling of Population Groups
Keeping the vulnerability index fixed at 20 percent, Figure 3 plots the percentage of the poor or vulnerable in the first year that move up one or two welfare categories in the second year in the two periods 2004-09 and 2009-11. 25 The transitions are disaggregated by education levels (i.e., less than primary education, primary education, middle education, secondary education, and college), occupation (which is further broken down into two categories of residence: i) rural areas: self-employment in non-agriculture, agricultural labor, other labor categories, self-employment in agriculture, remaining categories, and ii) urban areas: self-employment, wage workers, and remaining categories), and socio-ethnic groups (i.e., scheduled tribe, scheduled caste, other backward groups, and remaining groups). 26

Two remarks are in order for Figure 3. First, higher educational attainment, urban residence, wage work, and belonging to social groups other than the scheduled or backward groups are associated with higher-than-average chances of upward mobility. For example, for the period 2009-11, the orange dots representing these probabilities lie above the orange dashed line that represents the national average. Second, the period 2004-09 shows qualitatively similar, albeit weaker, mobility than the period 2009-11. For example, ceteris paribus, moving from a middle education to a secondary education is associated with a one-percentage-point higher chance of upward mobility in the first period, but with a six-percentage-point higher chance in the second period. This generally concurs with our earlier findings that the period 2009-11 exhibits more mobility than the period 2004-09.

Factors that are positively correlated with upward mobility are in general related to those associated with escaping downward mobility, but this may not always hold (see, e.g., Dang, Lanjouw, and Swinkels (forthcoming) for an analysis of mobility in Senegal). We thus produce two figures for downward mobility for the same population groups (Figures 5 and 6). Interestingly, for India it is generally true that the same factor can be associated with both increasing upward mobility and decreasing downward mobility. For example, out of all occupational categories, wage workers living in urban areas have the largest chance of upward mobility and the smallest chance of downward mobility. 27

25 We show the conditional, rather than the joint, probabilities for Figures 3 to 6 since this provides larger numbers that help bring out more clearly the transition patterns for the different population groups. For example, only a small percentage of the population with secondary or higher education is usually found in poverty or vulnerability in the first period to start with; consequently, their transitions to higher income categories are smaller.

26 An additional assumption required for producing these graphs is that mobility for each population group/profile should generally follow that for the whole population.

VI. Conclusion
We investigate in this paper the poverty and vulnerability dynamics in India between 2004/05 and 2011/12 using three rounds of the NSSs. In the absence of actual panel data, we construct synthetic panels using statistical methods that were recently developed by Dang et al. (2014a) and Dang and Lanjouw (2013). We present analysis using vulnerability lines that correspond to a vulnerability index of 20 percent and that are also close to twice the national poverty line.
Estimation [...] Factors including higher educational attainment, urban residence, wage work, and belonging to socio-ethnic groups other than the scheduled or backward groups are associated with higher-than-average chances of upward mobility and lower-than-average chances of downward mobility.
Our paper also presents a two-step analysis procedure, where careful checks should be done in the first step to ensure data comparability across survey rounds before synthetic panels can be constructed in the second step. This procedure may be relevant to quite a few other contexts, since situations where data are not comparable across survey rounds (leading to, for example, the recent debate on poverty decline in 2011/12 in India) appear to occur more frequently than one might think. We discuss a statistical method (Dang et al., 2014b) that can be employed for this checking purpose. Estimation results show that the poverty decline between 2009/10 and 2011/12 is not severely over-estimated (or equivalently, the design-based poverty estimate using the 2011/12 survey round is practically comparable to those from previous rounds).
Our methods hold promise for richer analysis of welfare dynamics that can further exploit the richness of the NSS data. For example, future research can provide more disaggregated analysis within each state, and analyze either more survey rounds to study transition trajectories between more than two periods or survey rounds that are farther apart to investigate longer-term transitions.
Another direction is to make better use of the "thin" rounds, in addition to the "thick" rounds, to build a more comprehensive picture of these dynamics over time.

[...] price for all rural India, which can be inflated to those in 2009 and 2011 respectively with a scale factor of 1.51 and 1.83. Relative increases of the vulnerability line from the poverty line for each country are shown under the columns "Increase" (columns 4 and 7). All numbers are estimated with synthetic panel data and weighted with population weights. Household head's age range is restricted to between 25 and 55 for the first survey and adjusted accordingly for the second survey in each period. Estimation sample sizes are 91,751 and 76,479 households for the first period, and 73,681 and 75,159 households for the second period. The incremental value for iteration is five rupees. The exchange rate is US$1 for 45.3 rupees in 2004 (World Bank, 2015).