Human Capital and the Changing Structure of the Indian Economy

Using panel data for the fourteen major states of India over the 1980-2000 period, the authors estimate the effect of human capital endowment on the performance of the state economies. They find that greater availability of skilled workers had a positive and significant impact on output in the service sectors. They do not find any such effect for the manufacturing sectors. The paper shows that the differential effect on services and manufacturing arises because service sectors are more skill intensive.


Introduction
The Indian economy has undergone remarkable changes since the early 1980s. India's GDP per capita grew at an annual rate of 3.8% during the 1980-2000 period as compared to 1.7% during the 1950-80 period. Much of this expansion is attributed to the services sector which grew at 5.1% per annum (in per capita terms) increasing its share in GDP from 38% in 1980 to 49% in 2000. Some of the key development policy questions today are whether India's improved performance is sustainable and whether other developing countries can emulate India. While growth is a complex and elusive phenomenon, it may be helpful to identify specific factors that help explain India's growth pattern.
The present paper focuses on the role of the highly skilled. Using panel data for the fourteen major states of India over the 1980-2000 period, we examine the performance of per capita GDP and value added in agriculture, manufacturing (registered and unregistered) and services. Controlling for other factors, and using the system-GMM and the traditional instrumental variable methods to address endogeneity concerns, we find that greater availability of skilled workers had a statistically significant and positive impact on per capita output of the aggregate services sector, but no significant impact on agriculture, and more surprisingly, no impact on manufacturing. One reason for the much stronger impact on services could be that, relative to manufacturing sectors, the service sectors are more skill intensive and therefore more likely to benefit from an expansion in human capital. We test for this idea using data at the disaggregated level (within services and registered manufacturing sectors) and find strong evidence in its favor.
That skilled labor has played an important role in India's overall performance and especially so in the services sector has provoked much discussion but only limited rigorous analysis. Rodrik and Subramanian (2004) note that India's productivity growth in the last two decades has benefited from its stock of the highly educated but provide no formal evidence. Kochar et al. (2006) show that India's share of output in skill-intensive industries is higher than that of China and comparable to that of much richer countries like Malaysia and Korea. Gordon and Gupta (2004) highlight a number of factors behind India's "services revolution" but human capital is not one of them. Bhide and Shand (2000) and Ahluwalia (2000) do include measures of human capital to analyze variations in growth rates across Indian states in the post 1980 period but both these studies rely entirely on literacy rates. A second problem with these two studies is that they do not address the problem that endowments of human capital could be endogenous.
The remainder of the paper is organized as follows. Section 2 describes the data and the empirical methodology. Section 3 provides our main results at the aggregate level. Section 4 contains results at the disaggregated level. Some extensions of the main results are contained in section 5. We summarize our findings in the conclusion and suggest scope for future work.

Empirical methodology
Our econometric analysis is based on estimating a standard reduced form equation (1). One problem with estimating (1) directly is that data on the stock of skilled labor ($H_{st}$) at the state level are not available. What we do have is data on the number of students enrolled in higher education (colleges and universities), which is a flow measure since current students add to the stock of human capital in the future. To use this flow measure we make two assumptions and modify equation (1) to an appropriate form.
First, we assume that, from the date of enrollment, it takes about four years for a student to complete her higher education and find a job. This is a reasonable assumption since the data on enrollments that we use below cover students at the undergraduate level (Bachelor of Arts, Science and Commerce), which is a three-year program in all the states.³ The second assumption relates to inter-state migration of the highly skilled. If such migration were significant then it would weaken the case for using enrollments as a measure of human capital flows at the state level. Using available evidence from the five-yearly rounds of the National Sample Survey, we estimated the likely magnitude of inter-state migration; the estimates are reported in Table 3.⁴ As a percentage of students completing their higher education and entering the labor market, inter-state migration of the highly skilled varies between 0.48% (Andhra Pradesh) and 3.01% (Haryana). Since annual data on inter-state migration are not available, we cannot control for migration-related effects in our regressions. However, given its relatively small magnitude, the assumption that there is negligible inter-state migration is unlikely to affect our results significantly.⁵
Using the two assumptions above, we can rewrite (1) as a "4th difference per capita" equation as follows: (2) ...
where $P_{st}$ is the total population of state s in year t, and the human capital term is the addition to the stock of the highly skilled in the labor market in state s over the period t-4 to t, which equals the total number of students enrolled in Bachelor of Arts, Science and Commerce courses in state s in year t-4. In the remainder of the paper, we will use the terms human capital, enrollments and skilled workers interchangeably for this measure expressed per capita.
¹ In the regressions, the dependent variables are appropriately deflated by the total population of the states.
² The list of the 14 major states is provided in Table 3.
³ Further, enrollments in a given year relate to the academic year September to August while data on the outcome variables (GDP per capita, etc.) relate to the financial year beginning in April, implying an additional lag of about 4 months before the full impact of new graduates is felt on output. If we assume that it takes about 5-6 months to search for a job, we get that it takes about four years from the date of enrollment before a student's skills have any significant effect on output.
⁴ These estimates are based on ongoing work on inter-state migration in India by Utsav Kumar, Aaditya Mattoo and Arvind Subramanian.
⁵ Other studies also suggest that inter-state migration in India is quite small. See, for example, Cashin and Sahay (1995) and Topalova (2004).
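The construction described above can be illustrated with a minimal sketch. It assumes that the change in output between t-4 and t, and enrollments in t-4, are both divided by the population in year t; the exact form of equation (2) is not reproduced here, so this reading is an assumption. All names and numbers are hypothetical.

```python
# Toy annual series for one hypothetical state; every number is invented.
LAG = 4  # assumed years from enrollment to labor-market entry, as in the text

# Output in millions of rupees, population and enrollment in millions of persons.
services_output = {1980: 5000.0, 1981: 5150.0, 1982: 5320.0, 1983: 5540.0, 1984: 5800.0}
population = {1980: 92.0, 1981: 94.0, 1982: 96.0, 1983: 98.0, 1984: 100.0}
enrollment = {1980: 0.4, 1981: 0.42, 1982: 0.45, 1983: 0.47, 1984: 0.5}


def change_per_capita(series, population, year, lag=LAG):
    """Change in the series between year - lag and year, per head in `year`."""
    return (series[year] - series[year - lag]) / population[year]


def lagged_enrollment_per_capita(enrollment, population, year, lag=LAG):
    """Students enrolled in year - lag, per head of population in `year`.

    Under the two assumptions in the text, this stands in for the addition
    to the stock of the highly skilled over the period year - lag to year.
    """
    return enrollment[year - lag] / population[year]


year = 1984
print("change in services output per capita:",
      change_per_capita(services_output, population, year))
print("human capital measure:",
      lagged_enrollment_per_capita(enrollment, population, year))
```

With annual data for 1980-2000 and a lag of four years, the first usable observation under this construction is 1984.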

Description of main variables
A formal definition of all the variables used in the regressions is provided in Table 1 along with the data sources. Summary statistics are reported in Tables 2 and 3. All nominal variables used in the regressions are expressed in real terms (1993-94 prices).
Enrollments at the national level averaged $4.2 \times 10^{-3}$ per capita over the entire period, with a standard deviation of $1.9 \times 10^{-3}$. Over time, the level of enrollments varies between 2.9 and 6.6 per thousand population. The best performing states here are Karnataka, Maharashtra and Gujarat while the worst is Rajasthan (column 1, Table 3). The figure for Karnataka is more than twice that for Rajasthan. In general, the traditionally laggard states are far behind the others in the level of enrollments. Figure 1 shows the relationship between enrollments per capita and change in service sector value added per capita over t, t-4. The relationship is positive, as our regression results will later confirm. Figure 2 shows the relationship between enrollments and output in the registered manufacturing sector while Figure 3 does the same for agriculture. Both these relationships appear to be weak, as is confirmed later by our empirical results.
The relationship between human capital and output measures depicted in the figures discussed above cannot be interpreted as truly causal due to possible reverse causality and omitted variable bias problems. Although these problems cannot be ruled out completely, we believe that they are less severe with our estimation than is otherwise the case for the following reasons. Second, current enrollment levels in higher education are largely determined by past investments in educational infrastructure (number of colleges, universities, etc.) and are unlikely to be affected by current changes in income and sectoral output.⁶ Finally, in the panel data estimation, we use the system GMM method which utilizes lagged values of the potentially endogenous variables as instruments. In the cross-section, we use the traditional instrumental variables (IV) estimation strategy. Hence, reverse causality is unlikely to be a significant problem for our estimation.
A relatively more serious concern could be the omitted variable bias problem. The controls in our specification include measures of infrastructure availability, quality of institutions, size of the states measured by total population, and development expenditure. For infrastructure we use total installed power generating capacity ($Power_{st}$) and total road length ($Roads_{st}$) at the state-year level. There is no readily available direct measure of the quality of institutions at the state-year level for India and we follow Kochar et al. (2006) in using a proxy measure, namely transmission and distribution losses of state electricity boards as a percentage of total power generated (TDL). India has one of the highest rates of TDL in the world, averaging around 20% (see Table 3), and it is widely known that these losses are largely due to theft and pilferage, reflecting poor quality of governance at the state level.⁷ However, our results for TDL should be treated with caution because more work is needed to assess how well the variable proxies for the quality of the institutional environment. Total population of the state ($Population_{st}$) captures possible increasing returns to scale and also the availability of (unskilled) labor.
⁶ In section 5 we look at the relationship between current enrollment levels and sufficiently lagged values of the number of colleges at the state level. Our findings show a strong positive relationship between the two. Further, we do not find any evidence of a direct effect of colleges on current changes in output levels.
States that invest more in human capital are also likely to spend more on other development activities such as the provision of health care. These factors could have a direct effect on output, which we filter out by controlling for development expenditure. Our measure of human capital is most highly correlated with power availability (correlation of .373), followed by development expenditure (.233).⁸

Regression results for the aggregate sectors
In this section we report regression results for the aggregate sectors. That is, in separate regressions, the dependent variable in equation 3 equals the per capita change in: GDP, value added in the services sector, value added in total manufacturing, registered manufacturing and unregistered manufacturing, and value added in agriculture.
Regression results for the services sector are reported in Table 4. Without any additional controls, the estimated effect of human capital on services output is positive and significant at less than 1% level (column 1, Table 4). Controlling for power and development expenditure, which are most highly correlated with the human capital variable, we find that the estimated coefficient of human capital declines only marginally but remains positive and significant at less than 1% level (columns 2 and 3, Table 4). Controlling for the remaining variables in the specification does not change our results qualitatively, although the estimated coefficient of the human capital variable does decrease in magnitude. The estimated coefficients of the remaining variables do not show any robust effects on changes in services output with the exception of the lagged dependent variable. Its coefficient is positive (.715) and significant at less than 1% level, implying a sharp tendency towards divergence. Table 4 also reports test statistics for the Sargan-Hansen test of overidentifying restrictions and for second order serial correlation. For the validity of the instruments, it is important that both these test statistics are small (statistically indistinguishable from zero).
Our estimation results show no evidence of second order serial correlation but the Sargan-Hansen test statistic is weakly significant at close to 10% level (p-value of .096) without any additional controls (column 1, Table 4). However, as evident from Table 4, even this weak significance level disappears when we add our basic controls to the specification.
⁸ The correlations between human capital and the other variables in equation 3 are as follows: -.191 with roads, .015 with transmission and distribution losses and -.133 with population.
We experimented with a number of alternative specifications but found that the positive and significant impact of human capital on services sector output remained intact. First, it is possible that the state and year fixed effects may filter out much of the variation in our enrollments variable. If this is indeed the case then our final results above may not be robust to small alterations in the sample or the specification. We estimated equation 3 without the state and year fixed effects. Second, in the specifications above we used change in the quality of institutions as an explanatory variable. To the final specification (column 4, Table 4), we added the level of institutional quality as another explanatory variable under the assumption that change in output may be affected by the level of institutional development rather than annual changes in it. Third, expansion of the services sector has been particularly strong in the state of Maharashtra. Change in services output per capita equaled Rs 403 per annum in the state compared to the next highest figure of Rs 291 per annum for the state of Gujarat. We dropped the state of Maharashtra from the sample to ensure that our results above are not driven by a single state. Our main findings reported above were robust to all these changes.
Regression results for the remaining outcome variables are reported in Table 5. Panel A contains results for the effect of human capital on the outcome variables without any additional controls. In Panel B we report results with the various controls discussed above included in the specification. Without additional controls, we find a positive effect of human capital on GDP, total manufacturing and registered manufacturing, significant at less than 5% level. However, these statistically significant effects do not survive our robustness checks as evident from Panel B of Table 5 (columns 1-3). For unregistered manufacturing and agriculture we do not find any significant effect (at 10% level or less) of human capital with or without additional controls (columns 4 and 5, Table 5).
As for the services sector, we find the estimated coefficient of the lagged dependent variable is positive and statistically significant at less than 1% level for the registered manufacturing sector. GDP, unregistered manufacturing and agriculture show only weak significance of the lagged dependent variable (significant between 5-10% level). None of the remaining explanatory variables show any significant effect (at 5% level or less) on the outcome variables in a robust way. The only exception here is the positive and significant effect of development expenditure on output in the unregistered manufacturing sector.⁹ Our main finding in Table 5 of a statistically insignificant effect of human capital on the outcome variables survives a number of robustness checks such as dropping the state of Maharashtra from the sample, removing the state and year fixed effects from the specification and adding the level of institutional quality as an explanatory variable.
⁹ One reason for this could be that development expenditure is primarily targeted towards the poorer sections that are more likely to seek employment in the unregistered manufacturing sector.

Disaggregated results
In this section we look at the disaggregated sectors within registered manufacturing and services. These sectors are quite diverse and fitting a single regression at the aggregate level does not allow us to exploit the diversity to better understand the effect of human capital on output. For example, we found above that the highly skilled had a strong effect on the services sector but a weak one on the other sectors. The most natural explanation of this is that the service sectors are the most skill intensive sectors and hence likely to benefit more from a greater availability of human capital. Testing for this requires differentiating across sub-sectors by their respective skill intensities which is the main goal of this section.
We pool data for the 5 main services sectors and 16 registered manufacturing sectors at the ASI 2 digit level. These sectors are listed in Table 6. Consistent data on output of these sectors are available for the time period 1980-1997.
Diversity across sectors is captured using factor intensity ratios which reflect the relative importance of the explanatory variables across sectors. These factor intensity ratios are computed using the input-output matrix for India. To appropriately map the sectors in our data onto the ones in the input-output matrix we had to group our initial set of 16 manufacturing sectors into 12, which left us with a total of 17 sectors.¹⁰
In equation (4), subscript i denotes the sector, IFE are the industry (sector) fixed effects, and Serv is a dummy variable equal to 1 if the sector is a services sector and 0 otherwise. Equation (4) is analogous to equation (3) with industry fixed effects and the interaction terms added. Variation in skill intensity across sectors is captured by Skill_Int, which equals the total remunerations of the highly skilled as a percentage of the total remunerations of all the workers (skilled plus unskilled). These skill intensities are reported in Table 7. The coefficient $\beta_{11}$ measures how the effect of human capital varies across sectors depending on their skill intensity. As mentioned above, we expect the coefficient to be positive, implying a greater impact of human capital on the relatively more skill intensive sectors.
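As a small illustration of the skill intensity measure and the interaction term it enters, the sketch below computes Skill_Int as the skilled share of total labor remunerations and multiplies it by a per capita enrollment figure. The sector names, remuneration figures and enrollment value are hypothetical and are not taken from Table 7.

```python
def skill_intensity(skilled_pay, unskilled_pay):
    """Skilled remunerations as a percentage of total labor remunerations."""
    total_pay = skilled_pay + unskilled_pay
    if total_pay <= 0:
        raise ValueError("total remuneration must be positive")
    return 100.0 * skilled_pay / total_pay


# Hypothetical (skilled, unskilled) remunerations in arbitrary units.
sectors = {
    "hypothetical_textiles": (10.0, 90.0),
    "hypothetical_banking": (60.0, 40.0),
}

# Hypothetical enrollments per capita, lagged four years.
enrollment_per_capita = 0.004

for name, (skilled, unskilled) in sectors.items():
    intensity = skill_intensity(skilled, unskilled)
    # Interaction regressor: human capital times the sector's skill intensity.
    interaction = enrollment_per_capita * intensity
    print(name, "Skill_Int =", intensity, "interaction =", interaction)
```

Because Skill_Int varies only across sectors while enrollments vary across states and years, the interaction term is what allows a common human capital measure to have sector-specific effects.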

We also experimented with a number of additional interaction terms. Some of the important ones are included in equation (4) and the remaining ones are discussed briefly below. The intensity ratios for roads (Road_Int) and power (Power_Int) are defined as the total input use of the factors as a percentage of the total output of the sector (reported in Table 7). We pay special attention to how the effect of population and development expenditure varies across sectors. It is possible that scale economies may be more important for manufacturing relative to services sectors. Development expenditure (on health, basic education, etc.) may complement skilled labor in affecting output. Due to data limitations, it is difficult to assess the importance of these two variables across the narrow sectors. Hence, we take a general approach here by controlling for their differential effect on output across the broad groups of manufacturing and services sectors. This is done by interacting the population and development expenditure terms with the dummy (Serv) for the services sectors.
¹⁰ The exact mapping is stated in Table 6.
Regression results for equation (4) are reported in Table 8. Without any additional controls, the estimated coefficient of our main interaction term ($\beta_{11}$) is positive and significant at less than 1% level (column 1, Table 8), implying that the effect of human capital on output increases sharply with the skill intensity of the sectors. The result is robust to controls for population, power, roads, development expenditure and the quality of institutions (column 2, Table 8) and the remaining controls discussed above (columns 3 and 4, Table 8). For all the specifications reported in Table 8, the Sargan-Hansen and the second order serial correlation test statistics are statistically insignificant, implying that the exogeneity of the instruments cannot be rejected.¹¹ We experimented by allowing the effect of institutions to vary across the group of manufacturing and service sectors. This was done by including the interaction of the TDL term in equation (4) with the dummy for the service sectors (Serv) but this did not change our main results much either here or elsewhere in the paper. Further, we did not find any significant difference in the effect of institutions across service and manufacturing sectors.
¹¹ The Sargan-Hansen test for overidentifying restrictions is weakly significant at close to 10% level (p-value of .097) with the human capital variables alone (column 1, Table 8). However, even this weak significance disappears when we control for power, roads, etc. (columns 2-4, Table 8). This result is similar to what we found for the aggregate service sector in section 3.
The overall effect of human capital on output ($\beta + \beta_{11} \cdot Skill\_Int_i$) varies significantly across sectors. It is negative but statistically insignificant at the 10% level for the least skill intensive sectors (all manufacturing) and positive and significant at less than 5% level for the most skill intensive sectors (banking and insurance, telecommunication services, business services and real estate services).
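To make the sector-specific effect concrete, the following sketch evaluates the sum of a direct coefficient and the interaction coefficient times skill intensity. The two coefficient values are hypothetical, chosen only so that the effect is negative at low skill intensity and positive at high skill intensity; they are not the estimates in Table 8.

```python
# Hypothetical coefficients, for illustration only (not the Table 8 estimates).
BETA = -0.5       # direct coefficient on human capital
BETA_11 = 0.0625  # coefficient on human capital x Skill_Int


def overall_effect(skill_int, beta=BETA, beta_11=BETA_11):
    """Effect of human capital on output for a sector with the given Skill_Int."""
    return beta + beta_11 * skill_int


def break_even_skill_intensity(beta=BETA, beta_11=BETA_11):
    """Skill intensity at which the overall effect changes sign."""
    if beta_11 == 0:
        raise ValueError("the interaction coefficient must be non-zero")
    return -beta / beta_11


for skill_int in (4.0, 8.0, 40.0):
    print("Skill_Int =", skill_int, "overall effect =", overall_effect(skill_int))
print("break-even Skill_Int =", break_even_skill_intensity())
```

Whether the combined effect is statistically significant for a given sector depends on the variances of both estimates and on their covariance, so the point values alone do not settle significance.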
Lastly, the coefficient of the lagged dependent variable is positive and significant at less than 1% level implying a strong tendency for output changes to diverge from the mean. The divergence is similar to the findings in the previous section. For the remaining variables, we do not find any significant effects with all the controls in place.

Long-run effects
In this section, we briefly report on regression results using the traditional two stage instrumental variables (IV) estimation strategy. We apply this strategy to equation (4) after averaging both sides of the equation over time.¹² The motivation for averaging is twofold. First, it is difficult to instrument for year-to-year variations in the explanatory variables. Averaging removes the time dimension from the equation completely, making our task of finding appropriate instruments relatively simpler. Second, averaging allows us to analyze the long-run impact of human capital (and other explanatory variables), which is helpful if some of our explanatory variables affect output slowly over time.
We note that averaging over time implies that we are left with a pure cross-section at the state-sector level. It also implies that all the explanatory variables above, apart from the interaction terms, are absorbed into the state and industry fixed effects. Thus, all state-year and sector-year variants are fully controlled for by these fixed effects. IV regression results are reported in Table 9. Without any additional controls, the estimated coefficient of our main interaction term is significant at less than 1% level (column 1, Panel A, Table 9). In column 2 of Table 9, we report regression results controlling for state and sector fixed effects to filter out all the non-interaction terms. The estimated coefficient of the human capital-skill intensity interaction term here remained positive and significant at less than 1% level. Adding the remaining controls to the specification, we find the estimated coefficient of the main interaction term remains significant at less than 5% level (column 3, Table 9). For the remaining variables, we find that the effect of development expenditure is much higher for the services sectors than for the manufacturing sectors (significant at less than 1% level). The remaining terms do not show any significant effect on output change. In all the specifications reported in Table 9, our instruments performed reasonably well as reflected by the overidentification test statistics reported in Panel A and the Shea partial correlations in Panel B.
¹³ For Roads*Road_Int we use Highway*Road_Int as an instrument, where Highway is the value of total length of highways per capita (average over the 1970-79 period). Data on road length (other than highways) for the period show significant year-to-year jumps for some of the states due to changes in data coverage. Excluding roads completely from the specification does not change our main results much.
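The two steps described in this section, averaging over time and then instrumenting, can be sketched as follows. This is a simplified illustration with one regressor and one instrument (the just-identified IV estimator), not the multi-instrument two stage specification with fixed effects reported in Table 9. The unit labels, data and instrument are all hypothetical.

```python
from collections import defaultdict


def average_over_time(records):
    """Collapse (unit, year, value) records to one time-averaged value per unit."""
    totals, counts = defaultdict(float), defaultdict(int)
    for unit, _year, value in records:
        totals[unit] += value
        counts[unit] += 1
    return {unit: totals[unit] / counts[unit] for unit in totals}


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n


def ols_slope(y, x):
    """OLS slope of y on x (with an intercept)."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(y, x, z):
    """Just-identified IV slope of y on x using z as the instrument."""
    return covariance(z, y) / covariance(z, x)


# Hypothetical two-year panel for four state-sector units A-D.
x_panel = [("A", 1980, -3.0), ("A", 1981, -1.0), ("B", 1980, -3.0), ("B", 1981, -1.0),
           ("C", 1980, -1.0), ("C", 1981, 1.0), ("D", 1980, 3.0), ("D", 1981, 5.0)]
y_panel = [("A", 1980, -4.0), ("A", 1981, -2.0), ("B", 1980, -6.0), ("B", 1981, -4.0),
           ("C", 1980, -2.0), ("C", 1981, 0.0), ("D", 1980, 8.0), ("D", 1981, 10.0)]
# Hypothetical pre-period instrument, already one value per unit.
z_cross = {"A": -3.0, "B": -1.0, "C": 1.0, "D": 3.0}

x_bar, y_bar = average_over_time(x_panel), average_over_time(y_panel)
units = sorted(x_bar)
x = [x_bar[u] for u in units]
y = [y_bar[u] for u in units]
z = [z_cross[u] for u in units]

# The toy data satisfy y = 2x + u, with u correlated with x but not with z.
print("OLS slope:", ols_slope(y, x))
print("IV slope:", iv_slope(y, x, z))
```

In this constructed example the OLS slope is pulled away from the true value of 2 by the correlation between the regressor and the error, while the IV slope recovers it because the instrument is uncorrelated with the error by construction.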

Conclusion
We sought to identify some of the causal factors in the impressive performance of the Indian economy since the early 1980s. We find that the endowments of the highly skilled had a significant effect on the services output of individual states. However, manufacturing and agriculture have been largely immune to the availability of the highly skilled. The results also show that the key reason for this differential impact is the skill intensity of the sectors, with the more skill intensive services sectors being the primary beneficiaries of greater skill availability.
A key issue is the relative role of history and human capital. It is increasingly believed that the root cause of backwardness in many parts of the world lies in their history. For example, the quality of a country's institutions may be related to its (colonial) history. Many institutions are slow to change, which prompts pessimism about the likelihood of coming out of the poverty trap. An interesting question is whether building effective educational institutions is feasible in the medium term with a combination of domestic policy action and external assistance. If this is indeed the case, then our findings suggest that investments in higher education could provide a way of alleviating the constraints of inadequate institutional development in the medium term.

Inter-state migration: The estimates shown are average rates for the 1980-2000 period and based on ongoing work on inter-state migration in India by Utsav Kumar, Aaditya Mattoo and Arvind Subramanian. The interpretation of these numbers is that if inter-state migration of the skilled were taken into account then the net addition to the stock of human capital as measured in our data would change (in absolute value) by an amount equal to the migration rates listed above. These estimates are derived using NSSO data for the number of skilled migrants in various states.
p-values in brackets. Significance levels are denoted by *** (1% level or less), ** (5% level or less) and * (10% level or less). "Mfg" is Manufacturing.
Source: Authors' own calculations based on the input-output matrix of India from the Global Trade Analysis Project (GTAP), Version 6. Skilled labor (column 1) denotes remunerations of skilled labor in the sector as a percentage of total labor remunerations. Intensity ratios for power (column 2) and roads (column 3) equal total expenditure incurred by firms in the industry on the two respective inputs, expressed as a percentage of total value added in the sector.