Gender, Geography and Generations: Intergenerational Educational Mobility in Post-Reform India

India experienced sustained economic growth for more than two decades following the economic liberalization in 1991. While economic growth reduced poverty significantly, it was also associated with an increase in inequality. Does this increase in inequality reflect deep-seated inequality of opportunity or efficient incentive structure in a market oriented economy? This paper provides evidence on economic mobility in post-reform India by focusing on educational attainment of children. We use two related measures of immobility: sibling and intergenerational correlations. We analyze the trends in and patterns of educational mobility from 1992/93 to 2006, with a special emphasis on the roles played by gender and geography. The evidence shows that family background plays a strong role; the estimated sibling correlation in India in 2006 is higher than the available estimates for Latin American countries. There is a persistent gender gap in rural and less-developed areas. The only group that experienced substantial improvements is women in urban and developed areas. Men experienced little or no upward mobility. Almost 70 percent of the variance in children’s education can be accounted for by parental education and geographic location. We provide possible explanations for the apparently puzzling improvements for urban women in a country with strong son preference.


(1) Introduction
The increasing inequality in income distribution at a time of considerable economic growth during the last couple of decades has rekindled interest in intergenerational mobility in both developed and developing countries. 2 Following wide ranging economic liberalization in the early 1990s, India experienced sustained high economic growth; per capita GDP grew at a 4 percent rate over the two decades after liberalization. The evidence indicates that while growth led to a significant poverty reduction, it was also associated with a rise in inequality (World Bank (2011)). 3 There is increasing concern that the benefits of economic growth were not shared broadly, and remained especially concentrated in urban areas, thus widening the rural-urban gap (Bardhan (2007, 2010), Dreze and Sen (2011), Basu (2008)). 4 The estimates of top incomes by Banerjee and Piketty (2005) show that the share of the top 0.01, 0.1, and 1 percent in total income has increased substantially from a trough in the mid-1980s, and this increase coincided with the move away from 'Socialist' to more market oriented economic policies. According to their estimates, in 1999-2000, the per capita income gap between the 99th and 99.5th percentiles was four times as large as the gap between the median and the 95th percentile. Another recent study finds that between 1996 and 2008, the wealth holding of the Indian billionaires increased from 0.8 percent of GDP to 23 percent, before declining to 14 percent in 2010 (Walton (2010)). 5 However, the relevant question is whether the observed increase in cross-sectional inequality is a natural outcome of an efficient incentive structure in a liberalized and market oriented economy that rewards hard work and entrepreneurial risk taking, or whether it is primarily due to inequality of opportunity arising from differential access, for example, to education and markets. The rise in cross-sectional inequality becomes a serious concern when it is primarily a result of inequality of opportunity, i.e., the inability of children born in poorer families and disadvantaged social groups to move beyond their parents' position on the economic ladder by their own effort and choices. 6 An immobile society may require policies, public investments and reforms to ensure both efficiency and equality of opportunity. 7 Understanding the trends in, and levels and patterns of, intergenerational mobility during the post liberalization period has thus become important for academics and policy makers (Bardhan (2010), Banerjee and Piketty (2005)). 8
This paper provides evidence on intergenerational economic mobility in India during the post liberalization period by focusing on the educational attainment of children. Education is used as an indicator of economic status in the absence of suitable data on permanent income. 9 There is a broad consensus in the literature that education is among the most important avenues for the poor to escape from poverty traps and climb up the economic ladder (for recent surveys, see Orazem and King (2008), Strauss and Thomas (1995)). The role of education may be especially important in post-reform India where growth has been concentrated in skill intensive sectors: the software industry and call centers being iconic examples (Kochhar et al. (2006), Bardhan (2010), Kotwal et al. (2011)). 10 The goal of this paper is to analyze the trends in and levels and patterns of educational mobility over a period of almost a decade and a half after the liberalization in 1991 (1993-2006), with a special focus on possible gender and spatial differences (rural vs. urban and developed vs. less-developed states).
2 Among the developing countries, China and India are two prominent examples where impressive economic growth has been accompanied by an increase in inequality. The recent decline in intergenerational mobility in USA and UK has also attracted a lot of attention; see, for example, Deparle in New York Times (January 4, 2012) and Mazumder (2012) on USA, and Dearden et al. (1997) and Blanden et al. (2005) on UK.
3 For evidence on rising inequality in India after 1991, see Ravallion (2000), Deaton and Dreze (2002), Sen and Himanshu (2004). A recent survey of the available evidence shows that consumption inequality has increased slightly, but the income inequality in India is much higher than what is usually thought of (close to Brazil) (World Bank (2011)). It is now widely appreciated that the available estimates of consumption and income inequality may be significantly biased downward, because the household surveys fail to cover the top income households.
4 Dreze and Sen (2011) argue that Indian economic reform has been an "unprecedented success" in terms of economic growth, but an "extraordinary failure" in terms of improvements in the living standards of general people and social indicators. Basu (2008) comments that "A certain amount of inequality may be essential to mitigate poverty….But the extent of Inequality in India seems to be well above that".
5 The volatility in the wealth of billionaires reflects the volatility in the stock market. The common perception about a significant increase in inequality is reinforced by spectacular conspicuous consumption by the super-rich: Mukesh Ambani, the chairman of Reliance Industries in India, owns and lives in the first billion dollar house in the world (Woolsey, M, Forbes.com, April 30, 2008), and in the mega wedding of two sons of Subrata Roy, the 'chief guardian' of Sahara Group, $250,000 was spent on candles alone (Srivastava, S, BBC online, February 11, 2004)! The popular perception that rural areas in India have been largely left out of the recent economic growth is, at least partly, shaped by the reports of farmers' suicides, among other things.
We use two related measures of educational mobility: (1) sibling correlation in educational attainment and (2) persistence in educational attainment across parents and children. The standard approach to the study of intergenerational educational mobility is to estimate the parent-offspring association in educational attainment. 11 It has, however, been well appreciated in the literature that the influence of family background on children extends much beyond what is implied by parents' education (Corcoran et al. (1990), Mazumder (2008, 2011, 2012), Bjorklund, Lindahl and Lindquist (2010), Bjorklund and Salvanes (2010)). Sibling correlation is a much broader concept that provides a summary measure of all common family and community background factors that affect child outcomes but are not chosen by children themselves. 12 A significantly higher sibling correlation implies greater influence of family and community backgrounds on economic outcomes, which in turn indicates that the role one's own effort and choices can play is limited. To the best of our knowledge, there is no study in the literature that exploits estimates of both sibling correlations and intergenerational correlation to trace out the levels, trends in and patterns of intergenerational mobility in a developing country.
6 Higher inequality of opportunity is likely to lead to a higher cross-sectional inequality (Atkinson (1981)).
7 In an immobile society, many high ability children from poor families may not be able to go to school and thus fail to realize their productive potential.
8 For a broader discussion on the importance of equity in economic opportunity for development, see Equity and Development (World Development Report (2006)).
9 Reliable data on children's and parents' income over the life cycle are not available in a developing country such as India. As emphasized in the recent literature, one needs good quality income data over a number of years at the appropriate phase of the lifecycle to tackle the attenuation bias in the estimated intergenerational correlation in income (Solon (1999), Mazumder (2003)). The analysis of intergenerational persistence in income in India is also complicated by the fact that a majority of the population, especially in the parents' generation, was engaged in family farming as self-employed workers, making it difficult to attribute income to individual members.
10 This is in contrast to the Chinese experience where growth has been dominated by agriculture and labor intensive manufacturing. Bardhan (2010) and Datt and Ravallion (2010) emphasize low and unequal human capital as an important constraint on poverty reduction in India.
11 For a survey of this literature, see Black and Devereux (2010), Bjorklund and Salvanes (2010) for developed countries, and Hertz et al. (2009) for developing countries.
There is now a large and mature literature on intergenerational economic mobility in developed countries, most of which focuses on intergenerational correlation between parents' and children's incomes (for reviews, see Solon (1999, 2002), Black et al. (2010)). 13 However, economic analysis of intergenerational mobility in the context of developing and transitional countries remains a largely unexplored area of research (among the available contributions, see Murgai (2008), Hnatkovska et al. (2011), Dahan and Gaviria (2001), Emran and Shilpi (2011), and Emran and Sun (2011)). Also, the existing economic literature on sibling correlation in education focuses primarily on a set of developed countries that includes USA, UK, Norway and Sweden. The only exception known to us is Dahan and Gaviria (2001), who provide estimates of sibling correlations for 16 Latin American countries. They find that El Salvador, Mexico, Colombia and Ecuador are the least mobile countries, with sibling correlation explaining almost 60 percent of the variation in educational outcomes. The available evidence on developed countries shows that factors common to siblings explain from 40 to 65 percent of variation in educational outcomes (Bjorklund and Salvanes (2010)). In contrast, intergenerational correlation between parents and children (the traditional measure of intergenerational persistence) explains from 9 to 21 percent of the variation in children's educational outcomes. An interesting finding from these studies is that gender or geographic location (as measured by neighborhood effect) does not exert any significant influence on the intergenerational persistence in children's educational outcomes. Are gender and geography also largely irrelevant for educational mobility faced by children in developing countries? One can argue that the role of gender and geography might be much more prominent in a developing country such as India, because gender bias against women is more common and stronger, geographic mobility is lower, and many areas (especially rural) are not integrated with the urban growth centers because of underdeveloped transport infrastructure. 14 On the other hand, the high tide of rapid economic growth in the Indian economy for more than two decades may have lifted all the boats, improving economic mobility across the income distribution, irrespective of gender and geographic location.
12 It is, however, important to recognize that parent-children correlation is not a proper subset of the sibling correlation in representing the effects of family background on economic outcomes. The intergenerational link between the parents and a child captures genetic similarities that may not be shared by the siblings, except for identical twins.
13 See, among others, Arrow et al. (2000), Dearden et al. (1997), Mulligan (1999), Solon (1999, 2002), Birdsall and Graham (1999), Fields et al. (2005), Bowles et al. (2005), Blanden et al. (2005), World Development Report (2006), Mazumder (2003), Hertz (2005), Bjorklund et al. (2006), and Lee and Solon (2009).
The data used in this paper come from the 1992/93 and 2006 rounds of the National Family Health Survey (NFHS) in India. The first period of our sample nearly overlaps with the period of economic liberalization (1991-1992), and the second period is about 15 years after liberalization. For both survey rounds, our analysis focuses on the same age cohort (16 to 27 year olds) who constitute the bulk of new entrants into the labor force. 15 To examine the spatial aspects in depth, the empirical analysis is done separately for families residing in different areas such as rural vs. urban areas and relatively developed vs. less developed states. To discern any possible gender bias, we implement the empirical analysis separately for male and female samples. We use the mixed effects model to estimate the sibling correlation. An advantage of this approach is that both the family and community level covariates can be included in the analysis to examine their relative influence on sibling correlation (Mazumder (2008), Bjorklund et al. (2011)). We examine the influence of two sets of covariates on sibling and intergenerational correlations: the first set relates to caste and religion of the household, which are identified as important determinants of educational attainment in India, and the second relates to the common neighborhood environment faced by all children growing up in a village/community. 16
Our estimates of sibling and intergenerational correlations suggest no significant change in the intergenerational persistence in educational attainment for a large proportion of the population in India from 1992/93 to 2006. Sibling (and intergenerational) correlations in our full sample have declined only marginally from 0.64 (0.57) in 1992/93 to 0.62 (0.54) in 2006 respectively. 17 However, the aggregate picture hides important gender and spatial differences. While the evidence indicates that the sibling correlation among men (brothers) has remained effectively unchanged (it increased slightly from 0.614 in 1993 to 0.624 in 2006), it experienced a moderate decline for women (sisters), from 0.780 to 0.696. In terms of geographic pattern, we find that sibling correlation remained essentially unchanged in rural areas and declined marginally in urban areas. The sibling correlation also declined slightly in the developed states, but increased in the less-developed states. Perhaps the most interesting trends and patterns emerge when we partition the data using both gender and geography. The sibling correlations among men (brothers) in rural areas and less-developed states have increased a bit, but the correlation has in fact declined marginally in urban areas and remained virtually constant in developed states. In contrast, the sibling correlations among women (sisters) registered a decline irrespective of geographic partitioning of the data. However, geography matters for women also: only the women in urban areas and developed states experienced a substantial decline in sibling correlations. As a result, the gender gap in sibling correlation has disappeared in urban areas though it remains virtually unchanged in rural areas. We also find that among the urban women, it is the lower caste women who experienced the largest decline in the sibling correlation. The evidence on improvements in educational mobility of women is similar to the available evidence on China and Malaysia (see Emran and Sun (2011) on China and Lillard and Willis (1994) on Malaysia). 18 The broad trends in and patterns of educational persistence discussed above are also observed in the estimates of intergenerational correlations in education between parents and children.
14 There is evidence that geographic location may be important for economic opportunities faced by households in developing countries. For example, Jalan and Ravallion (1999) show that there are geographic poverty traps in China. Emran and Hou (forthcoming) find that better access to markets increases household consumption in rural China in a significant way. They also find that the effects of domestic market centers are much larger than those of international market access.
15 Our conclusions, however, do not depend on this particular age cohort. In the robustness checks section, we provide evidence using an alternative age cohort sample.
16 The recent evidence using NSS data shows that the influence of caste and religious identity on the strength of the intergenerational link between parents and children has become much weaker (Hnatkovska et al. (2011)). Our results also show a similar pattern. It is, however, important to note that the direct effect of the lower caste and Muslim dummies on children's educational attainment remains significantly negative.
17 Note that a formal test of equality of the estimates in 1993 and 2006 rejects the null because of very small standard errors due to the large sample sizes (number of observations is 34000 in 1993 and 38000 in 2006). However, statistical precision is largely irrelevant here, because the difference in the numerical magnitude of the estimates between 1992/93 and 2006 is very small in most of the cases, suggesting the lack of any substantial change in intergenerational mobility over a period of almost a decade and a half of impressive economic growth.
In contrast to the evidence from developed countries, the majority of the variation in sibling correlations in India can be explained by two factors: parental education and the neighborhood effect.
The estimates indicate that a decade and a half after the economic liberalization in 1991, the absolute magnitudes of sibling and intergenerational correlations in India in 2006 are still very large, falling above or near the upper bound estimates available for developed and Latin American countries (for sibling correlations) and Asian countries (for intergenerational correlations). The influence of family and community backgrounds is especially dominant for rural women: about 70 percent of variations in sisters' schooling levels can be explained by common family and community factors shared, but not chosen, by them. After more than two decades of impressive economic growth, a large proportion of Indian population experienced no significant change in their educational opportunity; place of residence and gender still play a large role in a child's educational attainment and thus his/her economic fortunes.
The absence of a positive effect of economic growth on educational mobility, especially for men, is, however, not peculiar to the Indian experience following liberalization. Recent evidence shows that educational mobility of men in rural China has in fact worsened during the high growth post reform period (1988-2002) (Emran and Sun (2011)).
18 The positive evidence on women may seem puzzling given the fact that son preference is prevalent in all three countries. We provide a set of explanations for the observed trend later in the paper.
The rest of the paper is organized as follows. The conceptual framework underpinning the empirical work is described in section 2. Data and empirical strategy are elaborated in section 3. Section 4, organized in subsections, presents the main empirical results along with robustness checks. Section 5 concludes the paper.

Sibling Correlations
For the estimation and interpretation of sibling correlations, we adopt a conceptual framework that has been utilized widely in the empirical literature on sibling correlations (see Solon et al. (1991), Bjorklund et al. (2002), Bjorklund and Lindquist (2010), Salvanes (2010), Mazumder (2008, 2011)). Let $S_{ij}$ be the years of schooling of sibling $j$ in family $i$. It can be expressed as:

$$S_{ij} = a_i + b_{ij} \tag{1}$$

where $a_i$ is a family component which is common to all siblings in family $i$ and $b_{ij}$ is the individual specific component for sibling $j$ which captures $j$'s deviation from the family component. Assuming these two components are independent, the variance of $S_{ij}$ can be expressed as the sum of variances of the family and individual components as:

$$\sigma_S^2 = \sigma_a^2 + \sigma_b^2 \tag{2}$$

The sibling correlation in education then can be expressed as:

$$\rho_s = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_b^2} \tag{3}$$

The sibling correlation depicts the share of variance of years of education that can be attributed to common family background effects. Thus sibling correlation can be thought of as a summary statistic measuring the importance of common family and community effects, which include anything and everything shared by the siblings. It is useful to distinguish among different types of family and community factors that are commonly experienced by siblings. The family level variables include observable factors such as parental education and occupation as well as unobserved factors such as common genetic traits, parental aspirations, child rearing ability and style, cultural inheritances and interaction among siblings. The community effects include factors such as school availability and quality as well as peer effects within the neighborhood. Though sibling correlation captures most of the family background influences, it does not capture all of them.
For instance, genetic traits not shared by siblings, differential treatment of siblings and time dependent changes in family and neighborhood factors will show up in the individual component of outcome variance, though they might be part of family background. As a result, the estimate of sibling correlations can be taken as a lower bound estimate of the total influence of the common family background on children's education outcome (for a discussion on this point, see Bjorklund and Salvanes (2010)).
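As a rough illustration of the variance decomposition described above, the following Python sketch simulates schooling as the sum of a family component and an individual component and recovers the sibling correlation as the family share of total variance. It is not the estimator used in this paper: it uses a simple balanced one-way ANOVA (method of moments) estimator, and the function names, sample sizes and standard deviations are all hypothetical choices made for the example.

```python
import random
from statistics import mean

def simulate_families(n_families, n_sibs, sd_family, sd_individual, seed=0):
    """Simulate schooling = family component + individual component (hypothetical data)."""
    rng = random.Random(seed)
    families = []
    for _ in range(n_families):
        family_component = rng.gauss(0.0, sd_family)
        families.append(
            [family_component + rng.gauss(0.0, sd_individual) for _ in range(n_sibs)]
        )
    return families

def sibling_correlation(families):
    """Balanced one-way ANOVA estimate of var(family) / (var(family) + var(individual))."""
    n_sibs = len(families[0])
    family_means = [mean(f) for f in families]
    grand_mean = mean(family_means)
    # Within-family mean square estimates the individual variance.
    msw = sum(
        (s - m) ** 2 for f, m in zip(families, family_means) for s in f
    ) / (len(families) * (n_sibs - 1))
    # Between-family mean square estimates n_sibs * var(family) + var(individual).
    msb = n_sibs * sum((m - grand_mean) ** 2 for m in family_means) / (len(families) - 1)
    var_family = max((msb - msw) / n_sibs, 0.0)
    return var_family / (var_family + msw)

# Illustrative values only: equal family and individual variances imply a
# population sibling correlation of 0.5.
families = simulate_families(5000, 3, sd_family=1.5, sd_individual=1.5)
rho_sibling = sibling_correlation(families)
print(round(rho_sibling, 3))
```

With unbalanced family sizes or covariates, a mixed effects estimator of the kind discussed later in the paper would be the more natural choice; the ANOVA version is shown only because it makes the two variance components explicit.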

Intergenerational Correlation
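It is instructive to look at the difference between sibling correlations and intergenerational correlations as measures of intergenerational persistence in economic outcomes. The standard regression model to estimate intergenerational correlation between parents and children can be written as:

$$S_{ij} = \beta P_i + e_{ij} \tag{4}$$

where $S_{ij}$ is the years of schooling of sibling $j$ in family $i$, $P_i$ is the parental years of schooling in family $i$, and $\beta$ is the intergenerational regression coefficient. Because the individual component $b_{ij}$ in equation (1) is orthogonal to the family component $a_i$, one can express the family component as:

$$a_i = \beta P_i + z_i \tag{5}$$

where $z_i$ denotes family factors that are orthogonal to parental education. It follows from equation (5) that:

$$\rho_s = \frac{\sigma_a^2}{\sigma_S^2} = \beta^2 \frac{\sigma_P^2}{\sigma_S^2} + \frac{\sigma_z^2}{\sigma_S^2} = \rho_{IG}^2 + \frac{\sigma_z^2}{\sigma_S^2}$$

where $\sigma_S^2$, $\sigma_P^2$ and $\sigma_z^2$ are the variances of children's schooling, parental schooling and $z_i$ respectively, and $\rho_{IG} = \beta \sigma_P / \sigma_S$ is the intergenerational correlation in education. The above equation shows clearly that sibling correlation is a broader measure of the impact of family background than the squared intergenerational correlation. Also, the intergenerational correlation parameter ($\rho_{IG}$) is different from the intergenerational regression coefficient ($\beta$).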
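The relationship between the two measures can be checked numerically. The sketch below is only an illustration under assumed parameter values: it simulates a family component made up of a part linear in parental schooling and an orthogonal part, then compares the sibling correlation with the squared intergenerational correlation plus the variance share of the orthogonal family factors. The variable names (`beta`, `sd_p`, `sd_z`, `sd_b`) and all numbers are hypothetical, and the orthogonal family factor is observed here only because the data are simulated.

```python
import random
from statistics import mean, pvariance

rng = random.Random(1)
beta, sd_p, sd_z, sd_b = 0.6, 3.0, 1.0, 2.0  # illustrative values, not estimates
n_families = 20000

parents, z_vals, first, second = [], [], [], []
for _ in range(n_families):
    p = rng.gauss(0.0, sd_p)          # parental schooling (demeaned)
    z = rng.gauss(0.0, sd_z)          # family factors orthogonal to parental schooling
    family = beta * p + z             # family component shared by both siblings
    parents.append(p)
    z_vals.append(z)
    first.append(family + rng.gauss(0.0, sd_b))
    second.append(family + rng.gauss(0.0, sd_b))

def corr(x, y):
    """Pearson correlation using population moments."""
    mx, my = mean(x), mean(y)
    cov = mean((u - mx) * (v - my) for u, v in zip(x, y))
    return cov / (pvariance(x) * pvariance(y)) ** 0.5

rho_sib = corr(first, second)          # sibling correlation
rho_ig = corr(first, parents)          # intergenerational correlation
var_s = pvariance(first + second)      # variance of children's schooling
implied = rho_ig ** 2 + pvariance(z_vals) / var_s

print(round(rho_sib, 3), round(rho_ig ** 2, 3), round(implied, 3))
```

In this setup the sibling correlation exceeds the squared intergenerational correlation by roughly the variance share of the orthogonal family factors, which is the sense in which the sibling correlation is the broader measure.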

Estimating Equations
To estimate the sibling correlations, we extend the regression model in equation (1) and specify the following mixed effects model:

$$S_{ij} = X_{ij}'\gamma + a_i + b_{ij} \tag{6}$$

where $X_{ij}$ is a vector of control variables. To estimate the intergenerational correlation in education, we augment equation (4) to estimate the following regression specification where the education variables of both generations are standardized:

$$\tilde{S}_{ij} = X_{ij}'\delta + \rho_{IG} \tilde{P}_i + e_{ij} \tag{7}$$

where $\tilde{S}_{ij}$ and $\tilde{P}_i$ denote the standardized years of schooling of the child and the parent. Equations (6) and (7) can be estimated as soon as the vector $X_{ij}$ is specified. Following Bjorklund et al. (2010) and Mazumder (2008, 2011), we take a sequential approach in introducing variables to the vector $X_{ij}$.
All regressions in this paper include age dummies; gender dummies are added whenever relevant. In addition, we introduce two sets of explanatory variables sequentially. 19 The first set includes dummies for caste and religion. Evidence from India suggests that educational outcomes vary systematically across different caste and religion groups, which motivates us to include them as controls in the regressions.
Next, we add a village/neighborhood level fixed effect as a part of the control vector to capture any common neighborhood effect faced by the children growing up in the same locality. A comparison of sibling correlations estimated using alternative specifications can shed light on the importance of caste and religion as well as the neighborhood effect. 20 As noted in earlier studies (summarized in Bjorklund and Salvanes (2010)), if households are sorted across neighborhoods according to their attributes (well-off families living in better neighborhoods), then the estimate of the neighborhood effect is biased upward. So the comparison will provide an upper bound estimate of the neighborhood effect. In contrast, the estimate of intergenerational correlation can be biased upward (due to correlation in genetic traits) or downward (due to measurement error).
We compare the estimated sibling correlations with the estimates of intergenerational correlations and neighborhood effects. This allows us to deduce the extent of sibling correlations that can be accounted for by the parent-child link and the neighborhood effect. The part of sibling correlations that remains unaccounted for by these two factors is mainly due to common family environment such as family structure (e.g. divorced/separated parents) and parental skills and patience in child rearing etc.
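The logic of comparing sibling correlations across specifications can be sketched with simulated data. The example below is a simplified stand-in for the approach described above, not the paper's estimation code: it removes a neighborhood effect by demeaning schooling within each simulated village (a crude fixed effect) and compares the ANOVA-based sibling correlation before and after. All names, sample sizes and variances are hypothetical.

```python
import random
from statistics import mean

def anova_sibling_correlation(families):
    """Balanced one-way ANOVA estimate of the family share of total variance."""
    n_sibs = len(families[0])
    family_means = [mean(f) for f in families]
    grand_mean = mean(family_means)
    msw = sum(
        (s - m) ** 2 for f, m in zip(families, family_means) for s in f
    ) / (len(families) * (n_sibs - 1))
    msb = n_sibs * sum((m - grand_mean) ** 2 for m in family_means) / (len(families) - 1)
    var_family = max((msb - msw) / n_sibs, 0.0)
    return var_family / (var_family + msw)

rng = random.Random(7)
n_villages, families_per_village, n_sibs = 100, 40, 2  # illustrative sizes
raw, demeaned = [], []
for _ in range(n_villages):
    village_effect = rng.gauss(0.0, 1.0)       # shared by every family in the village
    village_families = []
    for _ in range(families_per_village):
        family_effect = rng.gauss(0.0, 1.0)    # shared by siblings only
        village_families.append(
            [village_effect + family_effect + rng.gauss(0.0, 1.0) for _ in range(n_sibs)]
        )
    village_mean = mean(s for f in village_families for s in f)
    raw.extend(village_families)
    # Subtracting the village mean plays the role of a neighborhood fixed effect.
    demeaned.extend([[s - village_mean for s in f] for f in village_families])

rho_raw = anova_sibling_correlation(raw)
rho_within = anova_sibling_correlation(demeaned)
share_neighborhood = (rho_raw - rho_within) / rho_raw
print(round(rho_raw, 3), round(rho_within, 3), round(share_neighborhood, 3))
```

The drop from `rho_raw` to `rho_within` is the part of the sibling correlation attributable to the simulated neighborhood effect. In real data, sorting of families across neighborhoods would make this an upper bound, as noted in the text.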
Note that if the strong sibling correlation observed in the data is due mainly to intergenerational correlations in education and common neighborhood effects, then it indicates higher inequality in opportunities than if it were due to parents' child rearing skills. 21
19 The NFHS 2006 dataset has more detailed information about some of the family background variables such as mother's age at first marriage, mother's health, domestic violence faced by mothers as well as birth order of children. Our analysis indicates that many of these variables are influenced significantly by the education level of the mother and her spouse, and when we add parents' education in the regression, these variables lose much of their explanatory power. In this paper, we thus focus on caste and religion related variables which are mostly exogenous to parents' education.
20 This approach follows Mazumder (2008, 2011) and Bjorklund et al. (2010). The basic idea is that if the estimated sibling correlation is primarily driven by factors such as neighborhood effects, caste and religion, then the estimate would decline significantly once these factors are included in the regression.
21 Bjorklund, Lindahl and Lindquist (2010) find a sibling correlation of around 0.21 for Sweden. Almost 70 percent of sibling correlation can be explained by parental involvement in school work and mother's patience (willingness to postpone benefits into the future and propensity to plan ahead). Intergenerational correlations in education as well as neighborhood effects are found to have small influence on sibling correlations. Sweden, however, is characterized by nearly universal access to quality education, generous child care assistance and low income inequality.
Finally, equation (7) can be estimated using simple OLS regression. For the estimation of sibling correlation in equation (6), the family and individual components need to be estimated. The available literature on sibling correlations relies on two alternative estimation methods. Mazumder (2006, 2011) uses the Restricted Maximum Likelihood (REML) method, which has better small sample properties under the normality assumption. Bjorklund et al. (2010) instead utilize a mixed effects model to estimate the family and individual components. The procedure in Bjorklund et al. (2010) can be implemented as a two-step procedure similar to the method employed by Solon et al. (1991) and Bjorklund et al. (2002). The weakness of this procedure is that its small sample properties are not well understood. We implemented both procedures. Given the large size of the samples used in this paper (more than 34,000 in 1993 and 38,000 in 2006), both procedures produce nearly identical parameter estimates, and for the sake of brevity, we report estimates from the procedure suggested by Bjorklund et al. (2010). This choice is based on the fact that the Bjorklund et al. (2010) approach allows us to cluster standard errors at the family level. The estimates using Restricted Maximum Likelihood are available from the authors. The estimates of sibling correlations presented in this paper are from the mixed effects model estimated with the GLLAMM procedure in Stata. As noted before, all standard errors are corrected for clustering at the family level. 22

(3) Data and Empirical Issues
The data for our analysis come from the National Family Health Survey ( While sample sizes of the NFHS are comparable to that of National Sample Surveys (NSS) in India, the data from NFHS offer two distinct advantages for our analysis. First, all children up to 17 years of age in the NFHS are matched to their co-resident parents (not just to the household head). This is particularly important in the Indian context where joint families are still common. This matching of children to parents allows us to estimate both sibling and intergenerational correlations for the same sample of children. The second advantage of NFHS is that education data are more detailed. Instead of reporting education level in discrete intervals as is done in the case of NSS, NFHS collected information on years of schooling for both parents and children, which facilitates more precise estimates of sibling and intergenerational correlations.
To define the estimation sample, we follow the literature and restrict our sample to closely spaced young adult siblings between the age of 16 and 27 years. The argument for estimating sibling correlations from closely spaced siblings rests on the fact that there may be important changes in the family structure as well as shocks to family life over a longer time horizon diluting the already conservative estimate of family background on children's outcome. To check the sensitivity of our results, we report the estimates of sibling and intergenerational correlations for other age groups also.
While the sample for estimating the effects of family background on children's outcomes should ideally focus on co-resident children, this restriction may bias the estimates of intergenerational correlations in education between parents and children. If, for example, among older children, the best educated ones tend to leave the household earlier than less educated children, it may bias the intergenerational regression coefficient downward, but may not necessarily bias the intergenerational correlation coefficient, because such exit of better educated children from the household would also reduce the variance in children's education, thus offsetting the decline in the intergenerational regression coefficient.
The problem of not observing all of the children as co-residents in the household is more prevalent in the case of older age cohorts, particularly for women, who usually leave their natal household upon marriage. In the case of women, if educated women delay marriage and we have a better probability of observing them as co-resident children, then the estimate of intergenerational correlations from our sample may be biased upward. On the other hand, if marriage timing follows birth order and there is a substantial birth order effect as reported in Black, Devereux and Salvanes (2005) and Booth and Kee (2009), then estimates from our sample will be more on the conservative side. We address the issue of non-coresident children in two ways. First, we keep all singleton households in the sample. This is likely to reduce the bias in the estimate of intergenerational correlations by allowing the two opposing factors discussed above to offset each other. This also improves the precision of the estimate of the individual component in the case of sibling correlation. Second, we check the robustness of our results by estimating both sibling and intergenerational correlations from a sample of a younger age cohort (16-20 years) where the possibility of having non-coresident children is lower. Note also that for older age cohorts, the co-residency pattern changes, as it is the parents who co-reside with children at old age. If parents tend to co-reside with better educated and well off children, as is the usual custom in developing countries, then intergenerational regression coefficients for older cohorts will be biased upward. Tracking the same younger age cohort (16-27) between years has the added advantage that our estimates are comparable and are not unduly influenced by changes in co-residency pattern over the life cycle.
While we are not aware of any paper that provides direct estimates of sibling and intergenerational correlations in education for India, some indirect evidence on intergenerational persistence in education can be found in two recent studies. Our empirical approach, however, differs in some important ways from that of the existing studies. Jalan and Murgai (2008) use the NFHS 1998 data to estimate intergenerational regression coefficients for different age cohorts. 24 Hnatkovska, Lahiri and Paul (2011)

As noted before, our main sample consists of all children in the age group 16-27 who are co-resident with the mother. 26 Estimation was carried out for all children and separately for brothers and sisters. Since an important objective of our study is to uncover spatial differences in intergenerational mobility, we also estimate the sibling and intergenerational correlations for sub-samples defined on the basis of geographical location, such as rural and urban areas, and developed and less developed regions/states. The number of observations for different sub-samples is reported in Table 1.

The estimated regression coefficients are in general different from the intergenerational correlations, which take into account the changes in the variance of the children's education. Trying to uncover trends in intergenerational correlations on the basis of estimates from different age cohorts is problematic when the co-residency pattern of children and parents changes over the life cycle. As noted above, the coefficients tend to be underestimated for younger age cohorts in the presence of a birth order effect in education and tend to be overestimated when parents co-reside with better educated children. Thus intergenerational regression coefficients may suggest a spurious decrease in intergenerational persistence across age cohorts simply due to changes in the co-residency pattern over the life cycle. 25

narrowed considerably between these two survey years. 27 A similar trend can be detected in mothers' and fathers' education as well, though the gender gap in the parents' generation remained substantial. The average education of fathers increased from 5.33 years to 6.43 years between the two survey years, while that of mothers increased from 2.63 years to 3.75 years. The improvements in years of education were associated with a decline in the standard deviation of education levels between the survey years. Consistent with international evidence in Hertz et al. (2009), the variances of education levels are higher in the parents' generation than in the children's generation in both survey years. This decline in variance implies that relying on the intergenerational regression coefficient alone to understand intergenerational mobility may be misleading.
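The caution about relying on the regression coefficient alone can be illustrated with a small numerical sketch. It assumes the textbook identity between the two measures (correlation = regression slope multiplied by the ratio of the parents' to the children's standard deviation of schooling); the paper's own equations are not reproduced in this section, so this form is an assumption. The schooling values are hypothetical and were chosen only so that both child series share the same regression slope.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std(values):
    """Population standard deviation."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def regression_slope(parent, child):
    """OLS slope from regressing child schooling on parent schooling."""
    mp, mc = mean(parent), mean(child)
    cov = sum((p - mp) * (c - mc) for p, c in zip(parent, child)) / len(parent)
    return cov / std(parent) ** 2


def intergenerational_correlation(parent, child):
    """Slope rescaled by the ratio of standard deviations (assumed identity)."""
    return regression_slope(parent, child) * std(parent) / std(child)


# Hypothetical years of schooling (not data from the paper).
parent = [0, 2, 4, 6, 8]
child_wide = [5, 3, 8, 5, 9]          # more dispersed children's schooling
child_narrow = [4.5, 4, 7, 6, 8.5]    # less dispersed children's schooling

for label, child in [("wide", child_wide), ("narrow", child_narrow)]:
    beta = regression_slope(parent, child)
    rho = intergenerational_correlation(parent, child)
    print(f"{label}: slope = {beta:.3f}, correlation = {rho:.3f}")
```

In this toy example both series have a slope of 0.5, yet the correlation is higher for the less dispersed series. A falling variance in children's schooling can therefore change the correlation even when the regression coefficient does not move.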
The summary statistics for the rural sample are also reported in Table 2. As expected, average education levels are lower in rural areas compared with our full sample. Consistent with national trends, average years of schooling have increased for both boys and girls in rural areas. The gender gap in education has also narrowed though the gap is still larger in rural areas compared with our full sample.
Summary statistics for other sub-samples also confirm improvements in the educational attainment of children during this period. The trends in education levels reported here are consistent with those reported in other studies (ASER reports, World Bank (2011)).
In addition to education levels, Table 2 provides summary statistics for age and caste and religion composition of our sample. Overall, the samples from two years appear to be comparable to each other in terms of age and caste-religion composition. In the following section, we present our estimation results based on samples from these two survey years.
27 Similar convergence in educational attainment between boys and girls over the reform period is observed in China (see, for example, Behrman et al. (2008)).

(4) Empirical Results
Equations (6) and (7) form the basis of empirical estimation of sibling and intergenerational correlations respectively. To estimate the individual and family components of equation (6), we followed the two-step procedure suggested by Bjorklund et al. (2011). 28 Unless otherwise noted, all standard errors are clustered at the family level. All sibling pairs are given equal weights in all estimation results presented in this paper.
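To make the notion of a sibling correlation concrete, the following is a minimal sketch and not the two-step procedure of Bjorklund et al. (2011) used in this paper. It assumes that the sibling correlation is the share of the total variance in schooling attributable to a component shared by siblings, and it estimates that shared component from the products of deviations over all sibling pairs, with every pair weighted equally. The function name and the data are hypothetical, and no control variables are included.

```python
from itertools import combinations


def sibling_correlation(families):
    """Pairwise moment estimate of the sibling correlation.

    families: list of lists, each holding the years of schooling of the
    co-resident children in one family. Singleton families contribute to
    the total variance only, since they form no sibling pair.
    """
    all_values = [y for family in families for y in family]
    grand_mean = sum(all_values) / len(all_values)
    total_var = sum((y - grand_mean) ** 2 for y in all_values) / len(all_values)

    # Products of deviations for every distinct sibling pair, equally weighted.
    pair_products = [
        (a - grand_mean) * (b - grand_mean)
        for family in families
        for a, b in combinations(family, 2)
    ]
    if not pair_products:
        raise ValueError("at least one family with two or more children is needed")
    family_var = sum(pair_products) / len(pair_products)
    return family_var / total_var


# Hypothetical families (years of schooling); not data from the paper.
families = [[10, 14], [4, 8, 6], [8], [15, 11], [9, 5]]
rho_sibling = sibling_correlation(families)
print(f"sibling correlation: {rho_sibling:.3f}")
```

In very small samples this simple estimator can fall outside the unit interval, which is one reason a model-based procedure is preferable in practice.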
(4.1) Results from the Full Sample
Table 3 reports the results for the full sample. The sibling and intergenerational correlations estimated from our simplest specification of equations (6) and (7)

The third row in Table 3 reports the estimates of the intergenerational correlations between children and parents in education. We define the parent's education variable as the maximum of father's and mother's years of schooling. We, however, note that the results and conclusions in this paper are not sensitive to alternative definitions of parental education such as the average of mother's and father's years of schooling. The intergenerational correlations reported in panel A are estimated from a simple specification that controls only for age and gender. The estimates for all children show a slight decline in

28 Equation (6) can be estimated directly (without the two-step procedure) using the Stata GLLAMM procedure when the set of control variables is small. However, it becomes unmanageable when we introduce neighborhood fixed effects. For the sake of comparability, we report results from the two-step procedure in this paper. The results from single-step estimation do not differ from those of the two-step procedure when applied to specifications that do not include neighborhood-level fixed effects.
29 The highest estimate among 16 Latin American countries is 0.60, for El Salvador (Dahan and Gaviria (2001)). Among developed countries, sibling correlations are found to be highest in the USA. The estimates range between 0.6 (Mazumder (2008), for biological siblings in the same household for the age cohort born during 1957-1969) and 0.63 (Conley and Glauber (2008), for siblings with the same biological mother for the age cohort 1958-76). The average estimate for Nordic and European countries is around 0.4 (see Bjorklund and Salvanes (2010)).

Gender and Intergenerational Mobility in Education
To understand any possible gender bias in the intergenerational educational mobility, we report estimates of sibling correlations for brothers and sisters separately in columns 3 to 6 of We also analyze the trend in intergenerational correlations between parents and children across gender (columns 3-6,  (2008)).
As discussed in the conceptual framework above, the square of the intergenerational correlation provides an estimate of the share of total variance in schooling that can be explained by parents' education alone. The estimates (5th row in Table 3) show that parents' education alone can explain between 27 and 29 percent of the variation in years of education for men (brothers) and between 31 and 39 percent for women (sisters). In contrast, for developed countries, parental education explains only 10 to 20 percent of the total variation in years of schooling (see Bjorklund and Salvanes (2010)).

(2006), Aslam et al. (2011)). In the next specification of our regressions, we include dummies for SC, ST and other backward castes. We also include a dummy for households whose head is a Muslim, as Muslims are among the most economically lagging groups in India (World Bank (2011)). The effect of the caste and religion dummies on the estimated sibling correlation is minimal; the estimates in panel B of Table 3

33 The largest estimate for neighborhood correlation is 0.15 for the USA (Solon et al. (2000)).
34 The relatively larger role of location for women probably reflects lower geographic mobility among them.

To provide a sense of the relative importance of the intergenerational correlation between parents and children as well as neighborhood effects in explaining the variation in children's education, we use estimates from Table 3 and plot them in Figure 1. Figure

For women, intergenerational persistence has declined but neighborhood correlations remain nearly unchanged.
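The variance-share interpretation stated at the start of this discussion (the squared intergenerational correlation as the share of variance explained by parental education) can be checked with simple arithmetic. The sketch below backs out the approximate correlations implied by the shares quoted in the text. These are arithmetic implications of rounded percentages, not values read from Table 3, and the dictionary and function names are illustrative.

```python
import math

# Shares of variance in schooling explained by parental education,
# as quoted in the text (lower and upper bounds for each group).
reported_shares = {
    "men (brothers)": (0.27, 0.29),
    "women (sisters)": (0.31, 0.39),
}


def implied_correlation(share):
    """Correlation whose square equals the given variance share."""
    return math.sqrt(share)


implied = {
    group: tuple(round(implied_correlation(s), 2) for s in shares)
    for group, shares in reported_shares.items()
}

for group, (low, high) in implied.items():
    print(f"{group}: implied correlation between {low} and {high}")
```

By the same arithmetic, the 10 to 20 percent range cited for developed countries corresponds to correlations of roughly 0.32 to 0.45.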

(4.2) Geography of Intergenerational Educational Mobility
The evidence on strong neighborhood effects in sibling and intergenerational correlations discussed above brings the focus onto geographic location as an important factor in understanding educational mobility in post-reform India. This raises the question of whether the levels, time trends and gender patterns of sibling and intergenerational correlations differ significantly across geographic areas: for example, are there significant differences between rural and urban areas, or between less developed and more developed states? The recent academic literature and reports in the popular press in India give a strong impression that rural areas and certain lagging states such as Bihar and Uttar Pradesh (UP) have been largely bypassed by the positive effects of economic liberalization and the strong economic growth that followed (World Bank (2011)). In this subsection, we provide a more in-depth analysis of the role of geographic location in intergenerational educational mobility.

Table 3, we present estimates from three different specifications of equations (6) and (7) in three panels (A, B and C) of Tables 4 and 5. These specifications correspond exactly to the specifications in Table 3 and are not discussed here again for the sake of brevity. Using estimates from Table 4

(4.2.2) Intergenerational Educational Mobility across Regions
Living standards in India vary widely across states. The incidence of poverty among poorer states in India is among the highest among developing countries. On the other side of the spectrum, many states such as Punjab have low poverty rates that are comparable to richer countries (e.g. Turkey) (World Bank (2011)). The NFHS 1992/93 identifies the "backward" districts. 35

Bengal. These states are among the poorest in terms of income in 1993/94 (four of them belong to the so-called BIMARU), and also suffer from poor educational attainment and infrastructure indicators (Kingdon (2007), Deaton and Dreze (2002)). 36 We take the districts in these five states as less developed regions, with the rest being lumped under the developed category. The estimation results for these samples are reported in Tables 6 and 7. Tables

35 The Ministry of Health and Family Welfare, Government of India, has defined backward districts as those having a crude birth rate of 39 per 1,000 population or higher, estimated on the basis of data from the 1981 Population Census.
36 Note that there may not be a one-to-one correspondence between backward districts in our sample and backward states according to income and education indicators, because there are backward districts even in developed states. The estimates of the Head Count Ratio by Deaton and Dreze (2002) show that Rajasthan has a lower incidence of poverty, but the evidence reported in Kingdon (2007) shows that it lags in terms of schooling indicators, even though it has made substantial strides in primary schooling. West Bengal, on the other hand, suffers from higher poverty, but has better schooling indicators. As noted by Deaton and Dreze (2002), West Bengal belongs to the same "slow growth" group as the BIMARU states.

(4.2.3) Intergenerational Educational Mobility across Caste groups and Geographic Space
Our main results presented in Tables 3-7 suggest no substantial differences in sibling and intergenerational correlations across caste groups. Readers may be curious if this conclusion holds true across geographical areas and gender groups. Table 8

(5) Robustness checks
We check the sensitivity of our estimates in two ways. First, as already mentioned, as children grow older they tend to leave the parental household because of marriage (especially for girls), jobs and higher education. If the children who exit the household are better educated, then this may bias the estimates of intergenerational correlations. For instance, if there is a substantial and negative birth order effect on children's education, and marriage timing follows birth order, then our estimates of intergenerational correlations may be biased downward. On the other hand, if better education among women delays marriage, the bias would be in the opposite direction. To check the sensitivity of our estimates, we repeat our entire analysis for a younger age cohort (16 to 20 years). The possibility of children having exited the household is much smaller for this age cohort than for the 16-27 year age cohort. Table 9 reports the results for the full sample. The estimates of sibling correlations for this age cohort are larger in magnitude for both men and women. This is consistent with the evidence in the literature, which finds higher sibling correlations among closely spaced children compared with widely spaced children (Bjorklund and Salvanes (2010)). Such higher correlations arise from the fact that, for more widely spaced children, family background may change substantially over time.
In the case of intergenerational correlations, we find no significant differences in the estimates for any of the sibling groups reported in Table 9 from those reported in Table 3 for any of the survey years.
This is reassuring as it suggests that exit of the relatively older children from households is approximately random. The changes in the sibling and intergenerational correlations between 1992/93 and 2006 implied by Table 9 are similar to those implied by Table 3. Consistent with Table 3, the estimates from Table 9 suggest large gender differences in sibling and intergenerational correlations. It also highlights the importance of parent's education and common neighborhood environment in explaining the sibling correlations. We omit the results from regional analysis for this younger age cohort to save space, but overall conclusions from our analysis based on 16-27 year age cohort hold true for 16-20 year age cohort too.
Second, the results presented above took the maximum of the father's and mother's education as the relevant metric of parental educational attainment. A reader might wonder if the conclusions reached earlier depend on this specific definition of parental education. To allay such concerns, we use the average of the father's and mother's education as an indicator of parental education and re-estimate all of the regressions. The results in Table 10 show that, if anything, the magnitudes of the estimates are slightly larger in this new formulation. Thus our estimates of intergenerational correlations presented in the previous tables can be taken as conservative estimates of the effects of family background. The gender and geographic patterns and trends also remain unchanged with this alternative definition of parental education.
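The two definitions of parental education compared in this robustness check can be constructed as in the following sketch. The household records are hypothetical, the helper name `pearson` is illustrative, and the raw correlation computed here omits the age and gender controls used in the paper, so the numbers carry no empirical meaning.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical households: (father's years, mother's years, child's years).
households = [(10, 4, 9), (5, 0, 4), (12, 12, 14), (0, 0, 2), (8, 10, 11), (6, 2, 7)]

parent_max = [max(f, m) for f, m, _ in households]      # main definition
parent_avg = [(f + m) / 2 for f, m, _ in households]    # alternative definition
child = [c for _, _, c in households]

rho_max = pearson(parent_max, child)
rho_avg = pearson(parent_avg, child)
print(f"max definition: {rho_max:.3f}; average definition: {rho_avg:.3f}")
```

Which definition yields the larger correlation depends on the data; the sketch only shows how the two measures are built from the same records.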

Our main empirical results indicate that sibling and intergenerational correlations in education in India remained largely unchanged over a period of almost a decade and a half (1993-2006) after the economic liberalization in 1991. The only group that experienced a significant decline in the sibling and intergenerational correlations is women in urban areas. These results seem to contradict the evidence presented by Jalan and Murgai (2008), who find substantial improvements in educational mobility in India over time. They use 1998-99 NFHS data and find that the magnitude of the intergenerational regression coefficient declines substantially for younger age cohorts. We discussed earlier the pitfalls in relying on cohort-based analysis when the data consist of only the co-resident children. In fact, Jalan and Murgai (2008) are well aware of the limitations of the cohort analysis and discuss many of the same points we raised earlier. Our results thus can differ on two grounds: (i) intergenerational correlations take into account the declining variance in education in the children's generation, and (ii) we compare the same age cohort (16-27 years) across two surveys, instead of relying on different age cohorts in a single survey round. It is, however, important to check whether we get estimates similar to those of Jalan and Murgai (2008) from a cohort-based analysis. We replicate the analysis in Jalan and Murgai (2008) and report the estimated intergenerational regression coefficients for both rural and urban areas for three age cohorts (15-19, 20-24 and 25-29 years) in Table 11. It is interesting that the estimates show a declining effect of parental education for the younger age cohorts in both survey years, which is consistent with the estimates in Jalan and Murgai (2008). This seems to justify the worry that the estimates for younger cohorts may be biased downward.
As noted before, the estimated intergenerational regression coefficients are likely to be smaller for the younger cohorts simply because of the fact that some of the children have not completed their education. Also, since geographic mobility has increased over time, and better educated children usually migrate first, the downward bias in the estimate would be more pronounced in younger cohorts. On the other hand, estimates tend to be biased upward for older age cohort when parents co-reside with better educated children. This highlights the need for using separate survey rounds and multiple measures to understand the trends in intergenerational persistence given the data constraints in developing countries.

(6) Toward an Understanding of the Trends in and Pattern of Educational Persistence
The focus of this study is on providing robust evidence on the trends in and pattern of educational mobility in post-reform India, with special emphasis on the roles played by gender and geography. In other words, the goal here is to help establish the "facts" about educational mobility in India in the decade and a half after extensive economic liberalization. In this section, we attempt a first pass at understanding the observed trends in and pattern of educational mobility in post-reform India delineated in earlier sections. We, however, hasten to add that our discussion is only the first small step in a major research program that needs to be undertaken to understand the nature of educational mobility in post-reform India. We also emphasize a caveat widely understood in the literature: although the estimates of sibling and intergenerational correlations are important for tracing out the changes in educational mobility over time, they do not imply causality. Among other things, the literature has emphasized the difficulties in causal interpretations because of correlations in genetic endowments (ability) and preferences among the siblings and also between the parents and children (see, for example, Bjorklund and Salvanes (2010)). 37 We, however, note that the changes observed over time are not likely to be driven primarily by changes in genetic correlations among siblings and between parents and children, as a decade and a half is a short span of time for any significant changes in genetic correlations. Thus when one observes large changes over a relatively short period of time, as we do in the case of women in urban areas, for example, it is more likely that they reflect changes in the 'environmental factors' in the household and the community.
The evidence in this paper can be helpful in narrowing down the search for potential causal factors. It thus constitutes an essential first step toward policy-relevant economic analysis of educational mobility. Our results indicate that the focus of a causal analysis of the observed educational persistence should primarily be on geographic location, parental education and their correlates. The large impact of geography, including the neighborhood effect, points to the importance of differences in school availability and quality and in access to urban markets (returns to education). The importance of parental education, on the other hand, suggests credit constraints and role model effects as potential causal channels. 38 The recent literature has underscored the importance of schooling expansion and returns to education as major factors in determining trends in educational mobility (and more generally economic mobility, including income mobility). 39 The evidence indicates that educational mobility improves when the government invests heavily in educational infrastructure to ensure access at low cost. The lack of a significant improvement in educational mobility in post-reform India thus immediately raises the issue of access to schools and their quality.

(2006)).

The evidence also indicates that many public schools are plagued with teacher absence and fail to offer quality education, and thus the learning outcomes are very poor (ASER Report (2006), Das and Zajonc (2010)). The growth in private schooling has taken place more in those places where public school quality is poor. While the recent expansion of primary schooling has been successful in achieving near universal enrollment, improvements beyond primary schooling remain limited. The returns to secondary and tertiary education have experienced the largest increase, but inequality in access to secondary schooling remains high. 40 The increasing role of private schools and private tutoring has raised concerns about inequality in educational opportunity (see, for example, Kingdon (2007)). 41 A private market for education can be especially inequalizing in a developing country such as India, where the credit market is underdeveloped in general and the student loan market is almost non-existent.

37 It is, however, important not to push the distinction between "nature" and "nurture" too far, because there are important interactions between the two, a point emphasized in the Behavioral Genetics literature (see, for example, Plomin et al. (2001)). For interesting discussions on the limitations of the nature vs. nurture debate, see Goldberger (1979) and Manski (2011).
38 For evidence that parents may be important role models, especially for women in Nepal, see Emran and Shilpi (2011).
39 See, for example, the discussion by Blanden et al. (2008) on the role of inequality in access to higher education and the increasing wage premium for higher education in explaining the observed decline in mobility in the UK.
40 Inequality of access is measured as the difference between the top and bottom quintiles of the income distribution. See World Bank (2006).

(6.1) Rural-Urban Gap: Understanding the Higher Correlations in Urban Areas
The magnitudes of sibling and intergenerational correlations are in general larger in urban areas. This is true in 1993 for both men and women, and in 2006 for men. This seems puzzling, because the schooling infrastructure and financial sector are expected to be more developed in urban areas. However, there are a number of factors that may help explain the observed higher persistence in educational attainment in urban areas compared to rural areas.
Most of the schools in rural areas, both primary and secondary, are public schools and thus tuition free. The public primary schools also provide mid-day meals. The absence of tuition costs and the provision of mid-day meals help poorer households (parents with lower education) to send their children to school, especially primary school. Moreover, the private market for supplementary tutoring is not developed in rural areas. A private market for quality tutoring could potentially give an advantage to the children of richer (and more educated) parents, creating inequality in educational opportunities. Taken together, these factors weaken the link between parental income and children's educational attainment in rural areas. In contrast, there has been dramatic growth in private schools and supplementary private tutoring in urban areas in India in the last couple of decades (Kingdon (2007), World Bank (2009)). According to one estimate, the share of enrollment in private secondary schools in urban India was about 30-40 percent in 2002. Thus parental income and access to credit have become increasingly important for children's education in urban India, creating a more prominent role for parental education and family background. This raises the worry that inequality in access to education may accelerate in urban areas in the coming years.
Another important factor is the difference in returns to education. The available estimates for 1993/94 show that while returns to primary education were higher in rural areas, returns to higher education were higher in urban areas (Duraisamy (2002)). The recent estimates indicate that the rural-urban gap in returns to education has increased after the liberalization (Aslam et al. (2011)). The return to one more year of schooling in 2007 for the self-employed is estimated to be 9.8 percent in rural areas and 34 percent in urban areas. For wage employment for men, it is 6.3 percent in rural areas, but 32 percent in urban areas. For female wage employment, the corresponding returns are 8 percent (rural) and 44 percent (urban) (see Aslam et al. (2011)). 42 As noted by many observers, the economic growth in India after economic liberalization has been both skill-biased and urban-biased, driven by service sector growth including information technology (Kotwal et al. (2011), Bardhan (2010)). 43 The higher returns to education in urban areas make the investment in children's education more attractive for all parents.

41 Banerjee et al. (2007) show that private tutoring improves learning outcomes in India. For an analysis of the role of private schools in decreasing mobility in the UK, see Green et al. (2010).
The potential positive effect of higher returns to education on children's educational mobility is, however, counteracted by an important dynamic interaction between parental education and higher returns following liberalization. After the liberalization, the more educated parents could take advantage of the emerging opportunities in the urban labor market and they experienced higher income growth. The higher income allowed them to invest in children's education to reap the benefits of increasing returns to education. But the poor (and relatively uneducated) parents were less successful in taking advantage of the skill intensive growth process. Thus while the children of relatively educated parents in urban India continued to receive more and better education, the children of less educated parents failed to move beyond their parent's ranks, resulting in persistence between children and parent's education in the urban areas. Our evidence suggests that this is especially true for men in urban areas, but the experience of urban women requires additional explanations as they had substantial improvements in educational opportunities in the face of the forces discussed above. We turn to possible resolution to this puzzle in the next section.

(6.2) The Curious Case of Urban Women
Although the factors discussed above are expected to tighten the link between parental education and children's education in urban areas, the evidence in this paper shows that women in urban areas experienced substantial improvements in educational mobility from 1992/93 to 2006. This comes across as especially counterintuitive in the context of a country where son preference is strong and social norms against women's work are conservative. However, note that even though the sibling correlation among sisters has gone down the most over the sample period, even in 2006 the magnitude of both sibling and intergenerational correlations remain significantly higher for women, indicating lower educational mobility compared to men.
The apparently puzzling improvements in the educational mobility of women in urban areas during the last two decades can be explained in terms of relevant economic and social forces. Urban parents in general experienced higher income growth after liberalization compared to rural parents; thus they can afford to invest in the education of daughters to reap the benefits of high returns. 44 Interestingly, contrary to the conventional view, son preference in education also implies that as incomes grow (and/or credit market access improves), parents find it acceptable to invest in a daughter's education. To see this, note that given the high perceived returns to a son's education (family lineage, old age support, dowry, social prestige etc.), parents try to invest in a son's education even if they face poverty. They start to invest in lower perceived-return assets such as a daughter's education only when they have more income and/or face lower credit constraints. It is thus only natural that urban parents began to invest more in girls' education when their income grew following the liberalization. As noted above, the returns to education for women in urban areas have increased substantially over the reform period, which makes it more attractive to invest in daughters' education.

42 The skill-biased nature of economic growth can also be seen from the growth of wages across different schooling levels. The average wage in 1999/2000 for someone with a college degree was 73 percent higher than for someone with a high school degree, and 67 percent higher for someone with a high school degree compared to someone with middle school education (based on NSS data).
43 A substantial part of the readymade garments industry is located in large cities including Delhi and Bangalore.
The age at marriage for girls is also higher in urban areas, implying that the parents might be able to recoup some of the financial investment in the form of income support from working daughters before they get married. 45 Also, there are indications that the trade-off between dowry and investment in education of daughters has started to tilt in favor of education in urban India, and thus parents might find investing in education as better option than accumulating savings for dowry (Mishra (2011)).
Also, the force of the social norm against women's labor market participation is much weaker in urban areas, because parents are better educated and the life expectations of young women are influenced by peer effects and access to better information. A related important point emphasized by Munshi and Rosenzweig (2006) is that women may be able to adapt better to the new occupations, as they are not expected to follow in their father's footsteps. The sons inherit the social and occupational network of the father, and most in the fathers' generation are not employed in new occupations such as information technology. This 'freedom by neglect' actually helps the daughters achieve better occupational mobility in post-reform India, which feeds into higher educational attainment. Using survey data from Mumbai, Munshi and Rosenzweig (2006) find that the sons were channeled into local language schools with established parental networks and entered into parental occupations, thus reproducing the caste-based allocation of talents. The daughters, on the other hand, enrolled in English medium schools and were better prepared to take advantage of non-traditional jobs, especially those with a premium for English proficiency. Our evidence also indicates that, among urban women, the lower caste women experienced the largest decline in educational persistence as measured by the sibling correlation. This may in part reflect lower social constraints on their labor market participation, which allowed them to take advantage of the opportunities in post-reform India.

44 As noted before, this is due to the urban and skill-biased nature of economic growth in post-reform India.
45 In the context of Malaysia, one explanation for the higher educational mobility of daughters during the New Economic Policy is that parents invest in an older daughter's education in the expectation that she in turn will finance the education of younger brothers by sending remittances from high-paying urban jobs. In an environment of low financial deepening and negative real interest rates, this strategy may make a good deal of economic sense, especially when the returns to women's education are high in urban areas, as is the case in India (see Lillard and Willis (1994) on Malaysia).
A straightforward implication of the above discussion is that the same set of factors is likely to be responsible for the lack of improvement in educational mobility for women in rural areas. Low parental income, low returns to education, lack of skilled jobs, and stronger social norms against women's participation in the labor market can together result in little or no improvement in educational mobility for rural women. Also, note that among the few socially coveted jobs in rural areas for educated women are public sector jobs, for example, in schools and health clinics. However, hiring in public schools and health clinics was frozen or curtailed after the 1991 liberalization as part of the fiscal reform, and there was no compensating private sector growth. This might have reinforced the disincentives for investing in daughters' education in rural areas.

Conclusions
The Indian economy has grown at a robust pace since its economic liberalization in 1991 and has achieved a significant reduction in poverty. At the same time, the evidence indicates an increase in inequality (World Bank (2011), Deaton and Dreze (2002), Datt and Ravallion (2010)). This paper examines the trends in  (2009)). We find that the neighborhood environment accounts for a larger share of the sibling correlation in rural areas, where access to schooling is more limited and labor market returns to education are lower. In urban areas, where returns to education tend to be higher, parental education accounts for a larger share of the sibling correlation. In contrast to the evidence from developed countries, most of the variation in sibling correlations in India can be explained by two factors: parental education and the neighborhood effect.

Robust t statistics in parentheses. Standard errors corrected for clustering at family level. * significant at 10%; ** significant at 5%; *** significant at 1%.