Specialization, Diversity, and Indian Manufacturing Growth

This paper examines the specialization and diversity of manufacturing industries within Indian districts. Prior to India's recent economic growth and liberalization, specialization levels in 1989 were substantially higher than similar metrics calculated for the United States. From 1989 to 2010, average specialization levels for Indian districts declined to a level that is now quite comparable to the United States. Diversity levels similarly increased. Specialization and diversity levels in India are becoming more persistent with time. Manufacturing plants display higher productivity in districts that display both properties. From 1989 to 2010, manufacturing employment growth was higher in districts that were more specialized at the start of the period.


Introduction
The continued economic development of India has the capacity to better the lives of over one billion people. It is widely recognized that a key aspect of this development process is the improvement of India's industrial organization, perhaps in close connection with increased urbanization and/or greater formal organization of business. India is emerging from an industrial past with extensive distortions due to government controls and regulations, and many scholars today argue that India still contains extensive misallocation of activity across plants and regions (e.g., Hsieh and Klenow 2009, Ghani et al. 2013a, Desmet et al. 2011).
Despite the recent progress of researchers and policy makers towards understanding these issues, many big questions remain unexplored.
This paper takes on one such topic: the specialization and diversity of Indian districts. It has been over two decades since Glaeser et al. (1992) brought the connections between city specialization/diversity and urban growth to the forefront of urban economic analysis. Yet, at this point, we are not aware of any single study that attempts to even measure these specialization and diversity traits for India. We aim to fill that void and to document the connection between these district traits and two key economic outcomes: the productivity of establishments and the employment growth of districts over time. We build on a solid body of research for advanced economies (e.g., Duranton and Puga, 2000), but our analysis is also focused on a country where it is unclear how many of these lessons will continue to hold. India's emergence from its economic liberalization in the early 1990s leaves it at quite a different place from the United States' declining manufacturing sector post-1980 that is most often studied.
The theoretical foundation for the link between industrial composition and growth is forged via technological externalities, that is, the spillovers of innovation and improvements from one firm to another without full compensation. Although there is general agreement on this particular channel, it is not clear whether diversity or specialization is more likely to lead to these externalities. Three theories of dynamic externalities link specialization/diversity and economic development, as depicted by Glaeser et al. (1992). Marshall-Arrow-Romer (MAR) type externalities hold that spillovers come from within industries, but only when concentration is high. Porter (1990) also holds that spillovers emerge from within industries, but only in the presence of competition. Thus, these first two theories favor specialization over diversity for innovation and growth. Jacobs (1969), on the other hand, argues that knowledge spills across industries and that competition is crucial to innovation. Thus, according to Jacobs, industrial variety and diversity are conducive to growth.
The theory as to whether specialization or diversity most enhances growth is therefore ambiguous, and the empirical literature for the United States is nuanced and sometimes conflicting. This makes empirical measurement for India all the more important. Previous studies emphasize that urban diversity fosters employment growth (Glaeser et al., 1992), new and innovative industries, and the production of less-standardized or non-traditional items (Henderson, 1997a). Specialized cities are often thought better suited for mature industries and the production of standardized and export-oriented products. Some of the channels linking diversity/specialization to economic development are the roles that diversified cities play in fostering innovation (Duranton and Puga, 2001, Feldman and Audretsch, 1999), increasing employment (Glaeser et al., 1992), promoting entrepreneurship (Jacobs, 1969), or reducing costs (Lall et al., 2003). Duranton (2013a) provides a recent review in the developing country context. This paper documents a detailed set of trends and outcomes for the Indian manufacturing sector. We measure patterns of industrial specialization and diversity in Indian districts from 1989 to 2010. We document that average specialization levels for Indian districts are decreasing over time, from very high levels before liberalization, and are now on par with levels observed in the United States. Specialization in urban areas is particularly strong and follows a similar path.
While the levels are comparable to the United States, the persistence of district-level values in India remains very weak. That is, India's spatial distribution of manufacturing is still quite in flux, though stability is increasing with time. Modern industries are quite spatially concentrated, and traditional industries form the highest specialization for most districts. In terms of economic consequences, local specialization and diversity are jointly associated with stronger production functions, and the marginal impacts of these benefits are mostly located outside of the industry leading the local specialization. Perhaps most important, we document a very strong link within India for the initial specialization of districts and their employment growth in manufacturing.
The link between diversity and growth appears more non-linear.
The remainder of this paper is organized as follows: Section 2 describes the data used for this paper and the calculation of our specialization and diversity indices. Section 3 provides an extensive description of the specialization and diversity in Indian manufacturing districts and the trends over time. Section 4 estimates the role of these district traits in manufacturing production functions and the employment growth of districts over the 1989-2010 period. The final section concludes and provides some thoughts about future work on these important topics.

Indian Manufacturing Data and Index Calculations
This section begins with a description of the Indian and U.S. manufacturing data that we use in our study. We then outline how we calculate our indices of district-level specialization and diversity and describe some of their important empirical properties.

Indian Manufacturing Data
We employ repeated cross-sectional surveys of manufacturing establishments carried out by the government of India for the fiscal years of 1989, 1994, 2000, 2005, and 2010. In all cases, the survey was undertaken over two fiscal years (e.g., the 1994 survey was conducted during 1994-1995), but we will only refer to the initial year for simplicity. The organized and unorganized sectors of Indian manufacturing are surveyed separately, as described next. In every period except the last one, our surveys for the two sectors were undertaken contemporaneously.
In the last period, we combine the 2009-2010 survey for the organized sector with the 2010-2011 survey for the unorganized sector. We will again refer to this period as 2010 for simplicity.
The organized sector comprises establishments with more than 10 workers if the establishment uses electricity. If the establishment does not use electricity, the threshold is 20 workers or more. These establishments are required to register under the India Factories Act of 1948. The unorganized manufacturing sector is, by default, comprised of establishments which fall outside the scope of the Factories Act. The organized sector accounts for over 80% of India's manufacturing output, while the unorganized sector accounts for over 80% and 99% of Indian manufacturing employment and establishments, respectively (Ghani et al., 2013a).
The organized manufacturing sector is surveyed by the Central Statistical Organization through the Annual Survey of Industries (ASI). Our data for the unorganized sector come from the National Sample Statistics (NSS). These surveys are used for many published reports on the state of Indian businesses and government agency monitoring of the Indian economy. The typical survey collects data from over 150,000 Indian establishments. In this respect, the surveys are comparable to the Annual Survey of Manufactures conducted in the United States, with the Indian sampling frame being about three times larger.
Establishments are surveyed with state and four-digit National Industry Classification (NIC) stratification. The surveys provide sample weights that we use to construct population-level estimates of total establishments, employment, and output by district. Districts are administrative subdivisions of Indian states or territories that provide meaningful local economic conditions. The average district size is around 5,500 square kilometers, roughly twice the size of a U.S. county, and there is substantial variability in district size (standard deviation of ~5,500 square kilometers). Indian districts can be effectively considered as self-contained labor markets and, to some degree, economic units.
Our surveys record economic characteristics of plants like employment, output, and raw materials. When we estimate plant-level production functions, we use the reported traits directly.
Most of our analysis considers aggregated measures of manufacturing activity in locations. For this purpose, we sum the activity of plants up to the district or district-industry level, combining the organized and unorganized sectors and using sample weights. We use the two-digit level of the NIC system for calculating industrial specialization and diversity for districts. This level of aggregation contains 22 manufacturing industries. 1 Our core sample contains 429 districts. This sample is smaller than the total number of districts in India of 630, but it accounts for almost all of the plants, employment, and output in the manufacturing sector throughout the period of study. The reductions from the 630 baseline occur due to requirements that manufacturing employment be observed in every period for the district (i.e., we have balanced panels of districts from 1989 to 2010). Even with these requirements, some districts have a small number of observations, and this could be worrisome given that our data do not constitute a complete census of Indian businesses and have state-level survey stratification. We consider several checks below (e.g., excluding smaller districts) to verify that the results discussed are robust to these considerations.
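The aggregation step described above can be illustrated with a minimal Python sketch. It assumes a hypothetical plant-level record layout (the keys district, industry, employment, and weight are illustrative names, not the survey's actual field names) and simply pools plants from both sectors before applying the sample weights.

```python
from collections import defaultdict

def aggregate_employment(plants):
    """Sum weighted plant employment to the district-industry level.

    `plants` is an iterable of dicts with hypothetical keys:
    district, industry (two-digit NIC code), employment, weight.
    Organized and unorganized plants are pooled in the same list.
    """
    totals = defaultdict(float)
    for p in plants:
        # Each sampled plant stands in for `weight` plants in the population.
        totals[(p["district"], p["industry"])] += p["employment"] * p["weight"]
    return dict(totals)

# Made-up records for illustration only.
plants = [
    {"district": "A", "industry": "17", "employment": 12, "weight": 3.0},
    {"district": "A", "industry": "17", "employment": 2, "weight": 50.0},
    {"district": "A", "industry": "24", "employment": 40, "weight": 1.5},
]
district_industry = aggregate_employment(plants)
```

Summing these district-industry totals over industries would give the district-level totals used for the employment shares.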

United States Manufacturing Data
It is useful when discussing the levels of Indian specialization and diversity to benchmark them against other countries. Surprisingly, there is very little tabulation of these metrics internationally. We will compare our Indian metrics with those that we calculate for the United States using 2011 manufacturing data from County Business Patterns (CBP). Our index calculations take the same approach, and we have two CBP data issues to highlight. First, we utilize the three-digit level of the North American Industry Classification System (NAICS). At this level of industry aggregation, NAICS contains 21 manufacturing industries, which is very comparable to the 22 industries at the NIC two-digit level. The NAICS and NIC industries are defined somewhat differently, and to a limited degree this could influence the measures. For example, a more evenly balanced distribution of employment across one set of industries would reduce the likelihood of a very high specialization value occurring. Nevertheless, given the broad comparability of manufacturing industry definitions, we believe this issue to be very minor.
Second and more important, CBP data do not disclose values if tabulations risk violating the confidentiality of individual plants. That is, CBP will not provide the employment for the chemicals industry in a location if there are only one or two chemical plants, as this would be very informative about the underlying plants themselves. This issue is very prominent at the county level, and it becomes increasingly less important with higher levels of spatial aggregation. We thus consider states and metropolitan areas (which we refer to as cities for simplicity). CBP reports the total manufacturing employment for cities, and thus we can calculate the share of the total employment that has been allocated over industries. We report below tabulations that use four levels of aggregation (observation count): states and District of Columbia (51), the top 100 cities in terms of manufacturing employment (100), cities with at least 90% of manufacturing employment allocated (193), and cities with at least 60% of manufacturing employment allocated (746).

Specialization and Diversity Indices
We follow Duranton and Puga (2000) in the indices that we use to measure specialization and diversity for Indian districts. Before providing the formulas, we outline the key building blocks required. We index districts by d and industries by j. Let $s_{d,j}$ denote the share of district d's manufacturing employment in industry j, and let $s_j$ denote industry j's share of national manufacturing employment. We measure the relative specialization of a district through the formula:

$$\text{Specialization}_d = \max_j \left( \frac{s_{d,j}}{s_j} \right).$$

In this formula, we look across industries within each district to find the highest value of the employment share ratio. By definition, the specialization index of a district must be greater than or equal to one. To see this, note that if a district exactly mirrors India as a whole, the ratio $s_{d,j}/s_j$ is equal to one for every industry; thus the maximum ratio observed in the district is also one. If any employment share is then reallocated from one industry to another, then one of those two industries will have a ratio that exceeds one, yielding a maximum value that is greater than one. The maximum value of the district specialization index has the properties of the individual ratios described above.

We also measure the relative diversity of a district through the formula:

$$\text{Diversity}_d = \frac{1}{\sum_j \left| s_{d,j} - s_j \right|}.$$

In this formula, we first calculate the absolute difference between $s_{d,j}$ and $s_j$ to measure the degree to which a given industry is over- or under-represented in the district on a share basis. We then sum across industries. This sum represents the share of the district's employment that would need to be reallocated across industries in order for the district to have the same industrial employment proportions as India does nationally (double counting the deviations).
We then take the inverse of this sum such that a larger value of the diversity index indicates that less employment needs to be reallocated in order for the district to resemble India as a whole. Considering the extreme values of the index can again illustrate its properties. If a district has all of its manufacturing employment in one industry that is very small in size nationally, the denominator of the index becomes large, starting to approach 200%. In such cases, the diversity index as a whole takes a very small value that approaches zero. On the other hand, if the district exactly mirrors India as a whole, then the denominator of the index becomes very small, starting to approach 0%. In these cases, the diversity index as a whole takes a very large value, indicative of substantial spread in the employment of a district across industries. 2
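To make the two index definitions concrete, the following is a minimal Python sketch of the maximum share ratio and the inverse summed absolute share deviation. The function names and the three-industry employment figures are hypothetical and chosen only for illustration. The sketch assumes that every industry present in the district also appears in the national totals.

```python
def shares(employment):
    """Convert an {industry: employment} mapping into employment shares."""
    total = sum(employment.values())
    return {j: e / total for j, e in employment.items()}

def specialization_index(district_emp, national_emp):
    """Maximum over industries of the district share divided by the national share."""
    s_d, s = shares(district_emp), shares(national_emp)
    return max(s_d.get(j, 0.0) / s[j] for j in s if s[j] > 0)

def diversity_index(district_emp, national_emp):
    """Inverse of the sum over industries of |district share - national share|."""
    s_d, s = shares(district_emp), shares(national_emp)
    gap = sum(abs(s_d.get(j, 0.0) - s[j]) for j in s)
    # A district that exactly mirrors the nation has a zero denominator.
    return float("inf") if gap == 0 else 1.0 / gap

# Made-up employment counts for illustration only.
national = {"textiles": 50, "food": 30, "chemicals": 20}
district = {"textiles": 20, "food": 30, "chemicals": 50}

spec = specialization_index(district, national)  # chemicals: 0.5 / 0.2
div = diversity_index(district, national)        # 1 / (0.3 + 0.0 + 0.3)
```

In this example the district's largest over-representation is in chemicals, and 60% of employment (double counting deviations) would need to move for the district to mirror the national distribution. A district identical to the nation yields a specialization value of one and an unbounded diversity value, matching the limiting cases discussed in the text.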

Empirical Application of the Specialization and Diversity Indices
The next section provides descriptive statistics of these metrics for Indian manufacturing, and we highlight first three important issues for our empirical application. First, it is important to note that the indices are related to each other but also not redundant. variations and trends in the data. When we move to productivity and growth estimations, we primarily report results that use a ten-point scale for the decile in which a district's specialization or diversity falls compared to the whole set of Indian districts. For example, a district receives a value of 3 if its specialization level is between the 30th and 39th percentile for India. This approach makes the scales and variances of our indices more comparable and aids in interpretation, although we derive quite similar results with other approaches.
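The decile-based scale can be sketched as follows. This is a minimal Python illustration, not the procedure used for the reported estimates: the percentile rank is taken here to be the share of districts with a strictly lower index value, which is an assumed tie-handling rule, and scores run from zero to nine.

```python
def decile_scores(values):
    """Map each index value to a 0..9 score based on its percentile rank.

    Assumption: the rank is the share of districts with a strictly lower
    value, so a district at the 30th-39th percentile receives a 3.
    """
    n = len(values)
    scores = []
    for v in values:
        below = sum(1 for x in values if x < v)
        # Integer arithmetic avoids floating-point edge cases at boundaries.
        scores.append(below * 10 // n)
    return scores

# Twenty hypothetical districts with distinct index values 1.0, ..., 20.0.
example = decile_scores([float(v) for v in range(1, 21)])
```

With twenty distinct values, each score from zero to nine is assigned to exactly two districts, which is the equalization of scale and variance that the transformation is meant to provide.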
Finally, it is important to note that the indices do not directly relate to or build upon other properties of districts (e.g., size, income per capita, etc.). Both measures are calculated over industry distributions within districts and thus do not build upon these features specifically. This is not to say, however, that the indices are orthogonal to these properties either. For example, it is increasingly difficult for large districts with lots of employment to have a very undiversified industrial base, while it is easier for large districts to maintain specialization in one industry.
Thus the connections of our indices to these properties are intriguing and may differ from those in advanced economies, and we quantify these relationships below.

The Evolution of Indian Specialization and Diversity
This section describes the levels and trends in the specialization and diversity of Indian manufacturing. As we are the first to depict these patterns for India (and generally among the first studies conducted outside of an advanced economy 4), we devote extra attention to this section. We also calculate the change in specialization and diversity for a district in relative terms using the formula $(\text{AverageValue}_{2000-2010} - \text{AverageValue}_{1989-1994}) / \text{AverageValue}_{1989-2010}$. The last two columns provide summary statistics for these metrics. 5 We focus our descriptive analysis on 2005 due to changes in industry codes between the 2005 and 2010 surveys that make 2010 less comparable to earlier periods (discussed further in Table 6). Looking at 2005, the average specialization value is 6.65, and the median value is 4.49.
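The relative change formula can be illustrated with a short Python sketch. It assumes that each period average is a simple mean over the survey years falling in that period (1989 and 1994 for the early period; 2000, 2005, and 2010 for the late period), and the index values used are made up.

```python
from statistics import mean

SURVEY_YEARS = (1989, 1994, 2000, 2005, 2010)

def relative_change(index_by_year):
    """(late average - early average) / full-period average for one district."""
    early = mean(index_by_year[y] for y in (1989, 1994))
    late = mean(index_by_year[y] for y in (2000, 2005, 2010))
    overall = mean(index_by_year[y] for y in SURVEY_YEARS)
    return (late - early) / overall

# Hypothetical specialization values for a single district.
example = {1989: 10.0, 1994: 8.0, 2000: 6.0, 2005: 6.0, 2010: 6.0}
change = relative_change(example)  # (6.0 - 9.0) / 7.2
```

Dividing by the full-period average rather than the initial value keeps the measure bounded when a district starts from a very low or very high level.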

Levels, Trends and Persistence of Indian Specialization and Diversity
This suggests that, for the average Indian district, the maximum degree to which one industry's employment exceeds its national share is around 600% (e.g., the industry constitutes 6% of the district's employment relative to 1% nationally). There is a wide variance in this metric, with the standard deviation equal to the mean. The lowest winsorized level in 2005 is 2.1 (e.g., a 2% local
Is a value of 6.65 high or low? specialization corresponding to increasing diversity, but the patterns are more muted, likely in large part due to the diversity index's metric design being more stable than the design of the specialization index. For both specialization and diversity, we see some re-widening of the distributions in 2010 (greater standard deviation). Table 1b shows a very similar set of patterns when considering only the urban areas of districts, with the additional observation that the specialization of urban areas tends to exceed that for the district as a whole. that higher diversity is associated with reduced specialization. The Indian correlation is weaker at -0.2 than the U.S. correlation of -0.4. Visually, the U.S. data exhibit a much more regular pattern in terms of this trade-off than the Indian data do.
4 Duranton (2013b) considers specialization and diversity in Colombia.
In advanced economies, specialization patterns of cities are quite stable over time (Henderson, 1997). The substantial trend discussed in Table 1a suggests that this finding may not hold in India. Since 1989, India has experienced rapid economic growth, major infrastructure investments, a greater openness to foreign trade, and the adjustment of many industrial policies regarding location choice (e.g., reduced effort to have industry locate in lagging regions, development of special economic zones). These and other factors may weaken the persistence of district specialization and diversity levels.
Tables 2a-3b show the limited persistence of Indian specialization and diversity levels.
For the specialization index in Panel A of Table 2a, there is a 0.2-0.3 correlation of district values across adjacent surveys that are about five years apart. By itself, this correlation over a short time period is quite small. Across two surveys, or about ten years in duration, these correlations are neither economically nor statistically meaningful. Thus, there is rapid change in specialization levels for the period that we consider. Panel B shows a similarly rapid decline for the diversity index, although there is less additional attenuation over the ten-year span. Table 2b looks quite similar when we isolate urban areas of districts. In both tables, there is some evidence that the patterns may be showing greater persistence in the later periods than earlier, which we quantify next.
Tables 3a-b show an alternative approach for measuring persistence. We develop a transition matrix to follow cohorts of districts over time. In Panel A of In words, this group is moving substantially up the distribution of districts in terms of relative specialization. By contrast, the bottom row of Panel A shows that the group with the highest initial specialization values declines to an average quintile value of 3.53 by 1994, and then to the Indian manufacturing sector that we noted above. Duranton (2013b) also observes a very low degree of persistence in the production structure of Colombian cities.  Table   1a. As described above, our index values do not depend directly on the size or economic advancement of a region. To this end, no more than 3 of the 12 districts for any of the eight lists provided are in the same Indian state. Likewise, specialization and diversity are related, but not one-for-one. Only 1 of the 24 districts that form an extreme average value on the specialization index in Panels A and B of Table 4a is also an extreme value on the corresponding diversity lists in Table 4b.
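The cohort-tracking idea behind the transition matrix can be sketched in Python as follows. This is an illustrative reading of the approach, assuming a balanced panel of districts, quintiles assigned by rank with ties broken by sort order, and hypothetical district labels and values.

```python
def quintiles(values_by_district):
    """Assign each district a quintile from 1 (lowest) to 5 (highest) by rank."""
    ranked = sorted(values_by_district, key=values_by_district.get)
    n = len(ranked)
    return {d: i * 5 // n + 1 for i, d in enumerate(ranked)}

def cohort_average_quintile(initial, later):
    """Average later-period quintile for each initial-period quintile cohort."""
    q0, q1 = quintiles(initial), quintiles(later)
    sums, counts = {}, {}
    for d, q in q0.items():
        sums[q] = sums.get(q, 0) + q1[d]
        counts[q] = counts.get(q, 0) + 1
    return {q: sums[q] / counts[q] for q in sorted(sums)}

# Ten hypothetical districts with index values 0.0, ..., 9.0.
initial = {f"d{i}": float(i) for i in range(10)}
persistent = cohort_average_quintile(initial, initial)
flipped = cohort_average_quintile(initial, {d: -v for d, v in initial.items()})
```

Perfect persistence returns average quintiles of 1 through 5 for the five cohorts, while a complete reversal returns 5 through 1. Values that drift toward 3 for every cohort, such as the decline of the top group to 3.53 noted in the text, indicate weak persistence.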

Industry-Oriented Perspective
The specialization and diversity metrics are defined for districts, but it is also useful to take an industry-oriented perspective. To do so, we calculate for an industry the weighted-average specialization and diversity index values for the districts in which the industry resides.
We weight by the employment levels of the industry across districts. Thus, if most of the employment for an industry is in districts that are highly specialized, we will measure a high average specialization value for the industry.
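As an illustration of this employment weighting, the following minimal Python sketch computes an industry's weighted-average index value across districts. The district labels, employment counts, and index values are hypothetical.

```python
def industry_average_index(industry_emp_by_district, index_by_district):
    """Employment-weighted average of a district index for one industry."""
    total = sum(industry_emp_by_district.values())
    weighted = sum(
        emp * index_by_district[d]
        for d, emp in industry_emp_by_district.items()
    )
    return weighted / total

# Hypothetical industry with 90% of its employment in a highly specialized district.
emp = {"A": 900, "B": 100}
specialization = {"A": 8.0, "B": 2.0}
avg = industry_average_index(emp, specialization)  # (900*8 + 100*2) / 1000
```

The resulting value sits close to district A's index because most of the industry's workers are located there.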
In Table 5, we report the levels of these values for industries and their changes over time.
The clearest pattern is that more advanced industries like office, accounting and computing machinery (NIC 30) and radio, television, and communication equipment and apparatus (NIC 32) are located in more specialized districts. At the bottom of the table, we group industries into "traditional" or "modern" and find that the latter is generally located in more specialized areas than the former. 8 The traditional versus modern differences are stable over time. By contrast, and showing more of the distinctions between the specialization and diversity indices, there are no differences evident with respect to the average diversity values of the districts in which these two groups of industries locate. Table 6 provides a second industry perspective. We count by survey the number of times that each industry is responsible for the specialization value of its district. We also tabulate the average of these counts for an industry across all surveys and the changes in these counts from 1989-1994 to 2000-2010. While modern industries tend to be in very specialized locations, this concentration means that on average for our 429 districts, it is a traditional industry that is responsible for the local specialization value. Roughly two-thirds of districts have their specialization in a traditional industry, and this share is climbing slightly over time. The computer and communication industries, by contrast, form the specialized industry for typically 10-15 districts over the sample period, which is a substantially smaller count than many of the larger traditional industries (average count across industries is 20).
7 A comparison point from U.S. states for 2005→2011 is a transition set of specialization of 1.2, 2.3, 3.0, 3.9, and 4.6 (lowest initial quintile to highest initial quintile). For diversity, the transition set is extremely stable at 1.1, 2.1, 3.2, 3.7, and 4.9. Unfortunately, the MSA definitions in the CBP data are not consistent enough to calculate at a lower spatial level.
The bottom of Table 6 Table 1a for comparison to the United States. Most of the remaining analyses only employ the 2010 data to calculate growth rates at the district level and do not depend upon industry definitions, thus minimizing the importance of this issue. Several important observations can be made from Table 7. First and most important, this substantial battery of district traits has fairly weak predictive power for industrial specialization and diversity. This weakness is observed in the limited number of regressors that pick up a statistically significant coefficient (often only two out of the 14 regressors modeled) and the low R-squared value overall. To an important degree, this is the outcome of the metric design that makes these traits more or less independent of factors like district size. Given this independence, we can expect to find broad stability in our upcoming regressions with and without district covariates being modeled (which will hold true).

Correlations to District Traits
Second, there are two traits of districts that consistently link to these industrial traits. The first is the literacy rate of places, which is closely linked to the initial levels of the indices and their subsequent changes. Districts with high literacy rates in 2000 start with both a more specialized and diversified industrial base.
It is important to recognize that these partial correlations are conditional on other district traits like education shares, population levels, and consumption per capita. Places with high literacy rates also shift towards less specialization over time. 10 The second trait is the manufacturing share of the district's employment in 2000, which matters for the changes observed. Areas with a high concentration of local employment in this sector are observed to have 1989-2010 shifts toward specialization and away from diversity. In the upcoming growth analyses, we will control for the traits modeled in Table 9, so that we are measuring the impact of specialization and diversity in a district's industrial base over-and-beyond these correlates. While this approach better isolates the role of specialization and diversity, we should not forget these two basic connections when using stricter econometric frameworks.
9 Appendix Tables 1 and 2 report univariate correlations between the index values and district and industry traits, respectively. Caution should be exercised about over-emphasizing any single correlation, given the lack of other controls.
By contrast, the absence of some correlations is striking. In the United States, larger cities tend to be less specialized and more diversified (e.g., Black and Henderson, 1998, Duranton and Puga, 2000). Duranton (2013b) also observes this for Colombia. For India, at least with respect to manufacturing, we do not see this pattern. These null patterns are also evident with population density. 11 It is also striking that infrastructure metrics and banking conditions tend to have limited correlation with the specialization and diversity metrics. Finally, overall urbanization levels of districts are not clearly linked.

Economic Productivity and Growth with Specialization and Diversity
This section describes our empirical exercises regarding economic outcomes. We first consider production functions for manufacturing establishments that estimate how the specialization and diversity levels of the district in which a plant is located link to its productivity. We then consider whether the growth of a district's manufacturing employment links to the district's specialization or diversity.
11 For example, we have yet to include services in our framework, and the non-traded nature of services is one reason given for the size-specialization relationship. (It is worth noting, however, that most of the informal sector employment that defines our manufacturing metrics comes in effectively non-traded forms.) Big cities, where firm headquarters and service firms are often based, tend to specialize in business services. Henderson (1997) finds that medium cities have more mature manufacturing industries (but not newer technology-centric manufacturing).

Manufacturing Production Function Estimations
We estimate a simple production function with log output Y of each establishment i in a district d and industry j as the dependent variable. The specification takes the form:

$$\ln(Y_{i,d,j}) = \beta \cdot \text{Specialization}_d + \delta \cdot \text{Diversity}_d + \theta X_i + \phi Z_d + \gamma_j + \varepsilon_{i,d,j},$$

where $\theta$ and $\phi$ are coefficient vectors on the plant inputs and district controls defined below and $\varepsilon_{i,d,j}$ is an error term. Our core regressors are the specialization and diversity measures for each plant's district. For these estimations, we use a form of indices that measures specialization and diversity on a ten-point scale ranging from zero to nine based upon the decile of a district in the overall distribution for India. This transformation makes the $\beta$ and $\delta$ parameters easier to interpret as described below. This transformation also implicitly suppresses some of the variation in the specialization index values vis-à-vis the diversity index.
We include a vector $X_i$ of plant inputs into the production function: log employees, log book values of capital, and log costs of materials. We exclude plants with missing values for these metrics, which increases the relative weight of organized-sector facilities in the estimations. We also control for other district traits described below with a vector of district-level controls $Z_d$. Regressions include three-digit industry fixed effects $\gamma_j$ to capture regular differences in production techniques and spatial locations across industries. We use establishment weights from the surveys and cluster standard errors by district.
Column 1 provides a baseline estimation of the plant-level production function before district-level conditions are incorporated. These underlying parameters for the production function, emphasizing employees and materials, are very stable across estimations. Columns 2 and 3 show that simple connections do not exist for the production function to specialization or diversity in isolation. Interestingly, Column 4 finds, however, that both specialization and diversity correlate with greater productivity when jointly modeled. The 0.015 coefficients can be interpreted as quantifying that each increase in decile of specialization or diversity (e.g., moving from the 8th to the 9th decile) is associated with a 1.5% increase in the conditional output of the plant.
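The arithmetic behind this interpretation can be checked with a short Python sketch. Because the dependent variable is in logs, a coefficient of 0.015 on a decile score corresponds to an exact proportional change of exp(0.015) - 1. The extrapolation across the full zero-to-nine scale is purely illustrative: it assumes the linear-in-deciles form holds throughout and, like the estimates themselves, describes a partial correlation with the other index held fixed rather than a causal effect.

```python
import math

def pct_effect(coef, steps=1):
    """Percent change in output implied by a log-output coefficient over `steps` deciles."""
    return 100 * math.expm1(coef * steps)

one_decile = pct_effect(0.015)      # roughly 1.5% per decile step
full_range = pct_effect(0.015, 9)   # lowest (0) to highest (9) decile, roughly 14%
```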
While we have yet to come to terms with why both features of the district's industrial base need to be modeled, we do know that this pairing is not simply reflecting other aspects of the district landscape. Column 5, for example, shows very similar results when controlling for log manufacturing employment in district per square kilometer, log manufacturing employment in plant's district-industry per square kilometer, log share of local manufacturing employment in the unorganized sector, and log share of district-industry manufacturing employment in the unorganized sector. The first two of these controls are often associated with the urbanization and agglomeration premiums of dense locations, and we use a per square kilometer normalization as Indian districts vary in spatial size. 12 Column 6 likewise finds very similar results when further modeling the manufacturing share of local employment. Controlling for these extra covariates does not impact the reduced significance for specialization or diversity when they are modeled independently of each other.
Columns 7-9 consider heterogeneity in the sample. Columns 7 and 8 again split the Indian sample by traditional versus modern sectors, finding that the specialization effect is present in both sectors, but the diversity impact is concentrated in variation over traditional sectors. Column 9 determines for each district its specialized industry and then enters a dummy variable for that industry. We further interact this dummy variable with the two specialization and diversity indices. Interestingly, the indicator variable itself suggests an upwards productivity shift for the plants in the specialized industry, although this effect is not statistically significant.
On the other hand, the interactions suggest that the marginal productivity gains to further specialization and diversity are associated with plants outside of the specialized industry itself. A similar unreported exercise introduces an indicator variable for plants in the organized sector and their interactions with district specialization and diversity. Organized plants demonstrate substantially higher productivity than unorganized plants, but most of the marginal productivity gains to further specialization and diversity are associated with plants in the unorganized sector.
We stress that these estimations only document partial correlations, and we have not identified an exogenous shifter in local specialization or diversity. The rapid changes noted in the earlier section may allow for such a metric to be identified in the future. Nevertheless, even in their current form these estimations show a potentially important link of the local industrial organization to the productivity of Indian plants not understood before.

Growth of Manufacturing Bases in Districts
Moving from the productivity regressions, Table 9 considers the potential link between specialization and growth. The topic of specialization/diversity and city growth has been often studied over the past two decades (Duranton, 2013a), and we seek to provide early evidence on this link within Indian manufacturing. We estimate a cross-sectional growth equation at the district level. Our core regressors are again the specialization and diversity measures that utilize a ten-point scale. We measure these attributes at the start of the sample (denoted t0) using the 1989-1994 data. The outcome variable is the log employment growth in manufacturing for the district from 1989 to 2010. Regressions include a vector of state fixed effects χ_s to account for differences in regional growth in manufacturing for India across the period. The vector Z_d contains the additional district-level controls that we modeled in Table 7. We weight districts by their log population in 2001 and report robust standard errors.
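The growth specification described in this paragraph can be written out explicitly. The following is a sketch assembled from the verbal description (outcome, core regressors, state fixed effects, district controls); the coefficient symbols β_S, β_D, and γ, the variable labels, and the error term ε_d are notation assumed here rather than taken from the original equation.

```latex
\Delta \ln\left(\mathrm{Emp}_{d}\right)_{1989 \rightarrow 2010}
  = \chi_{s}
  + \beta_{S}\,\mathrm{Spec}_{d,t_0}
  + \beta_{D}\,\mathrm{Div}_{d,t_0}
  + \gamma\, Z_{d}
  + \varepsilon_{d}
```

Here d indexes districts, s indexes the state containing district d, and Spec and Div are the ten-point specialization and diversity indices measured at t0.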
The first column shows a large partial correlation for the specialization index. An increase of one decile in the specialization index is associated with 9% higher employment growth (e.g., moving from a growth rate of 2.00% to 2.18%). This is a quite substantial boost that Columns 2 and 3 show is not present with the diversity levels of districts. Similar elasticities are evident without the state fixed effects, too.
Column 4 further shows that the results are robust to including district-level covariates taken from the 2001 Census discussed earlier. These covariates tend to have weak multivariate explanatory power, with the main exception being a robust relationship between education shares and higher growth. Regardless, the inclusion of these controls does not impact the specialization index's connection to growth. Columns 5 and 6 show similar results when we do not weight observations or when we exclude districts with fewer than one million people. The latter check is important given that the exclusion of small districts guards against cases where our manufacturing data start to become thin for measuring the index values, and it is noteworthy that the standard errors remain consistent despite the reduction in sample size. Column 7 shows similar results when we include a control for the expected growth based upon the industrial distribution for the district in 1989 and the national rate of growth by industry across the period.
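The expected-growth control in Column 7 combines a district's 1989 industry mix with national industry growth rates. One common way to build such a measure is a shift-share calculation, sketched below. The exact construction used in the estimations is not spelled out in the text, so the weighting by initial employment shares, the function name, and all numbers are assumptions for illustration only.

```python
def expected_growth(district_emp_1989: dict, national_growth: dict) -> float:
    """Shift-share expected growth: national industry growth rates
    weighted by the district's initial employment shares (assumed form)."""
    total = sum(district_emp_1989.values())
    if total <= 0:
        raise ValueError("district must have positive initial employment")
    return sum(
        (emp / total) * national_growth[industry]
        for industry, emp in district_emp_1989.items()
    )


# Hypothetical district: 1989 employment by industry.
emp_1989 = {"textiles": 600, "chemicals": 300, "machinery": 100}

# Hypothetical national growth rates by industry over the period.
growth = {"textiles": 0.10, "chemicals": 0.40, "machinery": 0.80}

predicted = expected_growth(emp_1989, growth)
print(f"expected growth from initial industry mix: {predicted:.2f}")
```

Including this term as a regressor asks whether specialization still predicts growth once the district's luck in hosting nationally fast-growing or slow-growing industries is accounted for.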
Column 8 replaces the ten-point measure with indicator variables for districts being in the 50th-75th percentile range or in the 76th-99th percentile range. These estimations measure coefficients relative to the bottom half of the distribution. The results indicate that the most substantial differences for growth come from districts being in the top quartile of initial specialization. We also find quantitatively similar results when considering raw index values.
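The indicator variables in Column 8 can be illustrated with a small sketch that ranks districts by an index value and flags the third and top quartiles, leaving the bottom half as the omitted reference group. The cutoff handling, tie-breaking by sort order, function name, and index values below are assumptions for illustration; the precise percentile boundaries in the estimations may differ.

```python
def percentile_indicators(values: list) -> list:
    """Return one dict per district with 0/1 flags for the third and
    top quartiles of the index; the bottom half is the reference group."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    flags = [None] * n
    for rank, i in enumerate(order):
        pct = 100.0 * rank / n  # share of districts ranked below this one
        flags[i] = {
            "third_quartile": int(50 <= pct < 75),
            "top_quartile": int(pct >= 75),
        }
    return flags


# Hypothetical initial specialization index values for eight districts.
index_values = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4]
indicators = percentile_indicators(index_values)

for value, flag in zip(index_values, indicators):
    print(value, flag)
```

Entering the two flags in place of the ten-point measure lets the growth relationship differ across parts of the distribution instead of imposing a constant effect per decile.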
Interestingly, the diversity indicator variables suggest that the general null result observed on this margin may be due to non-linear features of the data, as the third quartile's coefficient is quite strong. Finally, Column 9 shows that these results are associated with the initial specialization value and not the change in specialization over the period.
In addition to these variants, we have performed several other robustness checks. First, if using average index values across the full period, the coefficients for specialization and diversity are 0.102 (0.032) and -0.009 (0.038), respectively. Using just the 1989 value for initial specialization lowers the coefficient to 0.047 (0.022), which is to be expected given the dramatic changes observed earlier during the initial 1989→1994 adjustment. Adding an additional control for the average employment level in manufacturing across the 1989-2010 period in a district yields a specialization coefficient of 0.082 (0.031). Finally, we observe employment growth both within the specialized industry for a district and also outside of it. The coefficient when isolating the specialized industry of districts is 0.115 (0.049), while it is 0.074 (0.032) when focusing on all industries in a district other than the specialized industry.
On the other hand, there are some limits to these findings. First, while employment growth displays a clear pattern, the relationship of initial specialization to output growth is more tenuous and depends significantly on data preparation steps like deflator use. Second, when looking at changes in employment by period, we do not observe a positive relationship in all cases, and some modeling choices that do not matter when examining the whole period (e.g., the inclusion of state fixed effects) start to become important when considering each period separately. These break-outs suggest the most important growth period was the 2000→2005 period. Table 10 further extends these regressions to consider the specialization and diversity differences across urban and rural areas of districts. Interestingly, we see employment growth for the district as a whole is most closely connected to both traits in urban areas and to rural specialization. The latter may be especially related to the movement of organized plants to rural areas in many Indian districts (e.g., Ghani et al. 2012a), while the former is more closely associated with traditional city dynamics that can favor both industrial forms.
To summarize, there is a very clear and robust relationship between specialization and growth in the Indian manufacturing data over the 1989-2010 period. Similar to the productivity estimations, these relationships remain non-causal in nature. They do, however, provide new guidance as to how Indian specialization and diversity link to local growth, which we can begin to compare with other countries.

Conclusions
The industrial specialization and diversity trends documented in this paper for India are very exciting. They show evidence of a rapid change from specialized, undiversified manufacturing districts prior to India's liberalization to a pattern today that has a distribution more closely resembling that of the United States. While India's economic geography remains in flux, the very rapid shifts during the 1990s have segued to more modest adjustments during the last decade. Looking forward, modern industries are quite spatially concentrated in India, which has the corollary that most districts derive their highest specialization from a traditional industry.
In terms of economic consequences, local specialization and diversity are jointly associated with stronger production functions. The marginal impacts of these benefits are mostly located outside of the industry leading the local specialization. In terms of manufacturing employment growth, the greatest gains over the 1989 to 2010 period are observed in districts that had the highest initial specialization levels, with diversity perhaps showing a non-linear role.
We see several very promising avenues for further study. First, there remain some intriguing issues and question marks within the patterns that we have outlined. We have noted the general movements of Indian districts towards less specialization and greater diversity; we have also observed, however, that districts with higher initial specialization exhibited greater manufacturing employment growth. These facts are not necessarily at odds with each other, as much of the growth could have come outside of the initial specialization industry, just as we observed broader marginal productivity spillovers. But it is not clear why this would be the case, and it hints that very interesting patterns may exist when jointly studying districts at the regional level. For example, it is known that Indian government policy prior to the deregulations promoted industrial placement in less developed locations in the name of distributional equality.
These patterns are perhaps showing how this initial, artificial placement unwound itself, with a general movement of regional industry towards a focal point (somehow determined). Our current framework treats each district as a separate entity, and answering these types of questions will necessitate modeling the spatial distances between districts of various types. Such an extension would also allow us to use plant age data contained in our surveys to measure the degree to which maturing industries move from diversified districts to specialized locations (e.g., Duranton and Puga, 2001).
Second, we would like to evaluate the longitudinal changes highlighted in this study in terms of the impact of specific infrastructure projects (e.g., the Golden Quadrilateral project) or trade liberalizations. These factors have been shown to play an important role for Indian manufacturing (e.g., Goldberg et al., 2010a,b), but we do not know their role in terms of the specialization and diversity of India's districts. For example, highly tradable goods may have also been the goods that could have formed the basis for the most specialized districts before India's reforms. Tracing out the economic geography of these economic shocks is important, and the rich longitudinal data for India provide a unique laboratory for doing so in a developing economy. Likewise, Ellison et al. (2010) consider coagglomeration and the inter-linkages of industries within a local area. It would be interesting to quantify what happens to related industries located in specialized or diversified districts when these dramatic changes occur.
Finally, manufacturing is a natural starting point, especially given its study in many countries. India, however, has relied extensively on services growth for much of its economic development, and it is critically important in this context for us to learn how this sector's patterns are similar to or different from those in manufacturing (e.g., does services growth link to specialized districts for services?). This extension would also allow us to compare India more closely with the work on Colombia by Duranton (2013b), who finds some interesting differences between manufacturing and services performance around specialization dynamics. This added perspective will provide a richer foundation for understanding growth in Indian cities.

Table 1a. An asterisk denotes a correlation is statistically significant at the 10% level.
Notes: See Table 2a.
Notes: Estimations quantify the relationship between district employment growth in manufacturing and district specialization and diversity indices. District-level traits used for covariates are those included in Table 7's estimation. Estimations weight observations by the log of district size and report robust standard errors. + significant at 10% level; * significant at 5% level; ** significant at 1% level.
App.