Geography and Exporting Behavior: Evidence from India

This paper examines locational factors that increase the odds of a firm’s entry into export markets and affect the intensity of its participation. It differentiates between two sources of spillovers: clustering of general economic activity and that of export-oriented activity. It also focuses on the effect of the business environment and that of institutions at the spatial unit of districts in India. The study disentangles the within-industry effect from the within-firm effect. A simple logit specification is used to model the probability of entry. The analysis is based on a panel of manufacturing firms in India, which allows for the introduction of firm-specific controls and a battery of fixed effects. The findings suggest that exporter-specific clustering, general economic agglomeration, and institutional factors affect firms’ export behavior.


1 Introduction
Policy makers consider exporting to be unambiguously good for the firm and for the economy. Although there is a lively debate about whether exporting really has a causal and positive effect on firm productivity, many national governments have set aside resources to provide domestic firms with an impetus to enter foreign markets, i.e. to start exporting. Developing countries are no different in this regard. This paper will study the decision of Indian firms to export and will analyze the factors that determine the extensive and the intensive margins of exporting. In other words, it will identify how the characteristics of a given firm, industry and location determine export participation.
In the paper, export participation is defined in two ways: the propensity of firms to start exporting, and the intensity with which they export. First, the paper tests what sorts of factors affect the probability that the firm will start to export. Factors specific to the firm, such as firm-level productivity, type, age and the size of the firm, could account for the decision to start exporting. Equally, factors specific to the industry or the location, such as agglomeration, could also play a role in reducing the sunk costs of entry. In the second half of the paper, the analysis focuses on whether these factors also play an important role in affecting the performance of the firm, conditional on entry. The paper will also disentangle the cross-sectional variation across firms and the time-series variation within firms. While the former reveals how the factors of interest affect firms within a given industry, the latter reveals how these factors affect any given firm.
There are two strands of literature that are relevant to the question of sunk entry costs in exporting, and of positive externalities associated with agglomeration. Theoretical models developed by Baldwin and Krugman (1998) and Dixit (1989) describe the presence of fixed costs faced by firms to enter into export markets. These sunk costs of entry might relate to information on foreign markets, the establishment of distribution channels, the costs of complying with new or more developed product standards etc. Theoretical models have described the scope of the benefits from industrial clustering at different levels: own-industry (Marshall 1890, Arrow 1962, Romer 1986), inter-industry (Venables 1996) and through industrial diversity (Chinitz 1961, Jacobs 1969). This paper is mainly concerned with the intersection of the predictions from these theoretical models, i.e. how the presence and scope of agglomeration economies lower the sunk costs of export entry. The follow-up question is to what extent the performance of the firm is affected at the margin after entry when it continues to export. Duranton and Puga (2004) describe microeconomic mechanisms, such as sharing, matching and learning, through which the benefits of agglomeration could flow to individual firms within a location 1. There is also a lively empirical literature on measuring export spillovers. Aitken et al (1997) find that the presence of multinational firms affects the probability of entry in export markets for Mexican firms by a factor of 0.035. Becchetti and Rossi (2000) find that geographical agglomeration significantly increases export intensity and export participation (by a factor of at least 0.02) in their study of Italian firms. Greenaway et al (2004) find a similar result for firms in the UK, whereby clustering in the same region and industry increases the probability of entry by a factor of 0.016.
Lovely et al (2005) find that domestic firms cluster in response to exports to countries with higher barriers to entry, suggesting the presence of export spillovers. Konig (2009) studies export spillovers by destination for French firms and finds that exporter-agglomeration positively affects the probability of starting to export to a given country by a factor of 0.14 and that these effects are destination-specific.
However, the findings of the literature are not conclusive as there are papers that find little or no evidence of export or other spillovers. Barrios et al (2003) find no evidence of export spillovers between exporters or multinationals for domestic firms in Spain, and Bernard and Jensen (2004) find no evidence that export or agglomeration spillovers affect export entry for firms in the United States.
This paper will directly test these hypotheses to understand what factors might affect the decision of the firm to start exporting. The paper will use a panel of heterogeneous firms, wherein firms differ with regard to characteristics such as productivity, size, age and type and with regard to participation in export markets. It distinguishes two types of spillovers: (1) those generated by agglomeration of more general economic activity within a location, and (2) those generated by exporter-specific clustering within a location. The paper will also study the effect of the business environment more generally, proxied by variables relating to levels of general infrastructure and by institutional variables. The model will control for firm and location attributes and will identify the effect of factors specific to the firm, those associated with Krugman's (1991) first and second-nature geography 2 and the general investment climate. The empirical analysis is carried out using districts as the geographical unit of study, equivalent to a county in the US or in China, a unit that coincides reasonably well with Marshall's notion of agglomeration.
The remainder of the paper is organized as follows. The next section provides a descriptive overview of the clustering of economic activity, general and export-oriented, across districts in India. Section 3 outlines the theoretical model and the estimation framework. It also describes the variables used and lists the sources of data. Section 4 presents the results of the model for the extensive margin, and Section 5 for the intensive margin of export participation. Section 6 concludes and discusses the contributions, limitations and implications of the findings.
1 ... problems. And learning refers to mechanisms based on the generation, diffusion and the accumulation of knowledge.
2 First-nature geography is when the characteristics of the natural geography determine clustering, and second-nature geography is when interactions between economic agents and increasing returns to scale determine clustering.

2 Descriptive Analysis
An important focus of this paper is to ascertain what part of firms' exporting behavior can be explained by the effect of agglomeration, in other words, whether spillovers between firms can lower the sunk costs of export entry. Whilst this study will also focus on the effect of various infrastructure and institutions within a location, at this stage it is pertinent to establish if there is any evidence of clustering. At a later stage, the analysis will focus on disentangling second-nature effects from those of natural geography and of more general sources of business-oriented advantage.
There are two phenomena that would indicate that agglomeration and exporting go hand in hand: if exporters were drawn to other exporters, and/or if exporters were drawn to industrial activity more generally. Different methods can be used to ascertain whether firms are uniformly distributed across various locations in the country, or if they show patterns of spatial concentration. Clustering in its simplest form can be shown through a bird's eye view of where economic activity is located. In this paper I look at the location of firms at the level of the district 3. In particular, I compute (1) all firms, exporting and non-exporting, as a percentage of the population 4 and (2) firms that export as a percentage of the population. It should be kept in mind that the sample of firms contains mostly medium and large-scale manufacturing units.
In other words, I look at clustering of all firms and that of exporters after having controlled for the size of the district.
I find that there is much concordance between the districts hosting general and export-oriented economic activity. Not only do firms and exporters show evidence of clustering in a few districts, they also seem to cluster in the same districts. Table 1 lists districts in descending order of the economic activity hosted. Thus, while economic activity, whether exporting or not, seems to be located in the same districts, there is evidence of some locations hosting much higher proportions of export-oriented activity.
Having established that there is evidence of clustering of exporters and economic activity in the country, this paper will now examine to what extent the characteristics of the location affect the propensity of firms to start exporting and other attributes of their exporting behavior more generally. The effect of second-nature clustering will be identified separately from that of first-nature geography and the investment climate. The latter are particularly interesting, in so far as public policy makers can directly affect the provision of infrastructure and affect institutional variables within a location. The next section will provide a brief overview of the theoretical literature and outline a few empirical studies of relevance.

3 Econometric Model
The decision to start exporting is estimated using a Logit model that controls for the specific characteristics of firms, locations and years. Consider a firm $i$ that makes a decision to start exporting. The associated profits are $\pi_i$ and the sunk cost of entering export markets is $f_i$. Since I am mainly interested in firms that begin to export for the very first time, in this model I do not consider firms that continue to export. Because there is no need to account for the export experience of a given firm, this approach has the added benefit that there is no endogeneity bias owing to the introduction of lagged export status (see Tybout 1997, and Bernard and Jensen 2004).
Following Konig (2009) it is assumed that a firm will start to export if profits associated with entry exceed the cost of entry, i.e. $\pi_i > f_i$. Thus, the probability that a firm $i$ starts to export at time $t$ is:

$$P(Y_{it} = 1) = P(\pi_{it} > f_{it}) \qquad (1)$$

Profits of a firm are assumed to be a function of productivity and other characteristics of the firm, and the sunk cost of entry is assumed to be a function of local exporting activity and agglomeration specific to a given industry ($k$) in a location ($j$). Rewriting Equation (1), the probability of starting to export is given by:

$$P(Y_{it} = 1) = P(\alpha X_{it} + \beta Z_{jkt} + \varepsilon_{it} > 0) \qquad (2)$$

where firm characteristics are included in the vector $X_{it}$, characteristics affecting the sunk cost of entry specific to the location and industry are included in $Z_{jkt}$, $\alpha$ and $\beta$ are the corresponding coefficient vectors, and $\varepsilon_{it}$ is the error term. This expression can be estimated using a Logit model under the assumption that the error term is distributed logistically. Thus, the dependent variable $Y_{it}$ is a dummy variable describing whether the firm $i$ starts to export at time period $t$. The regressions include only those firms that have entered the export market at least once; in other words, firms that have never exported over the sample period are excluded. Additionally, the dependent variable equals 1 for the year in which the firm first starts exporting and equals 0 for all other years leading up to that year. If firms continue to export, or if they switch status after having entered the export market for the first time, these observations are not included in the regressions.
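The sample-selection rule described above can be illustrated with a minimal sketch. It assumes a hypothetical input of `(firm_id, year, exports)` tuples; the function name and the toy panel are illustrative and are not taken from the Prowess data. It also assumes that a firm's first observed export year is its true year of entry, which the text does not discuss.

```python
from collections import defaultdict


def build_start_sample(observations):
    """Return (firm, year, start) rows for the entry regression.

    `observations` is an iterable of (firm_id, year, exports) tuples, where
    `exports` is True when the firm reports exports in that year.
    Assumption: the first observed export year is treated as the entry year.
    """
    by_firm = defaultdict(list)
    for firm, year, exports in observations:
        by_firm[firm].append((year, bool(exports)))

    sample = []
    for firm, rows in by_firm.items():
        rows.sort()
        export_years = [year for year, exports in rows if exports]
        if not export_years:
            continue  # firms that never export are excluded
        first_year = export_years[0]
        for year, _ in rows:
            if year > first_year:
                break  # observations after first entry are dropped
            sample.append((firm, year, 1 if year == first_year else 0))
    return sample


# Hypothetical toy panel: A enters in 2001, B never exports, C enters in 2000.
toy_panel = [
    ("A", 1999, False), ("A", 2000, False), ("A", 2001, True), ("A", 2002, True),
    ("B", 1999, False), ("B", 2000, False),
    ("C", 1999, False), ("C", 2000, True), ("C", 2001, False),
]
start_sample = build_start_sample(toy_panel)
```

In this toy panel, firm B drops out entirely, and the post-entry years of A and C are discarded whether or not the firm keeps exporting.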

Specification of Variables
The deterministic component of the function consists of the various attributes of the location that can influence the propensity of a firm to start exporting. The random component consists of the unobserved characteristics of the location and measurement errors. As mentioned above, the dependent variable is a dummy variable at time $t$. Firm-specific characteristics include $TFP_{it}$, which represents the productivity of the firm, $age_{it}$, which represents the age of the firm, $size_{it}$, which represents the size of the firm, and $type_i$, which represents the type of firm (private domestic, private foreign, public or mixed). Agglomeration (or second-nature geography) is described by $IO_{jkt}$, which represents inter-industry trading relations measured by the strength of buyer-supplier linkages, and $U_{jt}$, which represents urbanization economies in district $j$. Other economic geography variables capturing first-nature geography include $MA_{jt}$, which summarizes access to markets in neighboring districts, and $Port_j$, which summarizes the distance of a given district from the closest port. The remainder of this section provides a detailed description of each of the variables used in the model: firm-specific variables, second-nature geography, first-nature geography, infrastructure and institutional variables. For easy reference, a summary of the variables is provided in Table 2.

Firm-Specific Controls
Firm-specific controls include the productivity, size (sales), age and the type of firm (private domestic, private foreign, public or mixed). The literature suggests that exporters are usually more productive than non-exporters because of two distinct mechanisms: self-selection into export markets and learning-by-exporting. Exporters may be more productive than their counterparts, who only supply the domestic market, simply because more productive firms are able to engage in export activity and compete in international markets. The second mechanism is post-entry productivity benefits, since when firms enter into export markets they gain new knowledge and expertise, which allows them to improve their level of efficiency. While this paper is not concerned with the causal impact of exporting on productivity, it is important to control for the self-selection of more productive firms into exporting. Lagging productivity by one period effectively controls for possible endogeneity, since the decision to 'start' exporting takes place only once.
To obtain consistent production function estimates, this paper follows Olley and Pakes (1996) to compute firm-level total factor productivity ($TFP_{it}$). This approach controls for two distinct sources of bias: (1) simultaneity between outputs and inputs, which would bias the labor coefficient upward, and (2) endogenous exit of firms from the sample, which would bias the capital coefficient downward. Under fairly general assumptions, Levinsohn and Petrin (2003) show that with simple OLS estimations the labor coefficient will be upward biased and the capital coefficient will be downward biased. This would imply that productivity estimates would be upward biased for more capital-intensive firms, such as exporters. I compute the labor and capital coefficients under simple OLS assumptions and using the Olley-Pakes procedure and report these in Appendix Table A.1. I use data on the firm's total wage bill as a proxy for the labor input, and on its fixed assets 5 as a proxy for capital. These nominal values are deflated using NIC 2-digit level output and input specific price indices 6.

Agglomeration Variables
The count of other exporters within the district ($exp_{jt}$) and the count of other exporters by industry within the district ($exp_{jkt}$), each weighted by the district population, capture the effect of export spillovers. The idea is that proximity to other exporters could result in knowledge spillovers that might help non-exporters to start exporting. In addition, more general industrial agglomeration within a location would also increase the likelihood of denser interactions between exporters, no matter what the proportion of exporters in the overall cluster. Thus, not only does the specification control for the effect of other exporters, by industry and otherwise, within a district, it also includes measures of own-industry and input-output agglomeration and industrial diversity. It should be noted that clustering could also be associated with diseconomies such as congestion or increased competition. Thus, the estimations will capture the net effect of the positive and negative impacts on export participation.
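As a rough illustration of the two exporter-clustering measures, the sketch below counts exporters per district and per district-industry cell and divides by district population. The function name, district labels and figures are hypothetical, and the sketch does not exclude the firm's own observation from the count of "other" exporters, nor does it apply any percentage scaling.

```python
from collections import Counter


def exporter_density(exporters, population):
    """Exporter counts per capita, by district and by district-industry cell.

    `exporters` is a list with one (district, industry) pair per exporting
    firm; `population` maps each district to its population.
    """
    by_district = Counter(district for district, _ in exporters)
    by_cell = Counter(exporters)
    per_district = {d: n / population[d] for d, n in by_district.items()}
    per_cell = {(d, k): n / population[d] for (d, k), n in by_cell.items()}
    return per_district, per_cell


# Hypothetical toy data: three exporters in district d1, one in d2.
toy_exporters = [
    ("d1", "textiles"), ("d1", "textiles"), ("d1", "chemicals"),
    ("d2", "textiles"),
]
toy_population = {"d1": 1000, "d2": 500}
exp_jt, exp_jkt = exporter_density(toy_exporters, toy_population)
```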
Localization economies ($LOC_{jkt}$) can be measured by own industry employment in the district, own industry establishments in the district, or an index of concentration, which reflects disproportionately high concentration of the industry in the district in comparison to the country. I measure localization economies as the number of industry $k$ firms in district $j$ as a share of all industry $k$ firms in the country for a given year $t$. The variable takes a different value for each industry in a given district, and across districts. It identifies spillovers that are associated with within-industry clustering, regardless of the final markets that these firms serve. The higher this value, the higher the expectation of intra-industry concentration benefits in the district.

There are several approaches for defining inter-industry linkages: input-output based, labor skill based and technology flow based. Although these approaches represent different aspects of industry linkages and the structure of a regional economy, the most common approach is to use the national level input-output accounts as templates for identifying strengths and weaknesses in regional buyer-supplier linkages (Feser and Bergman 2000). The strong presence or lack of nationally identified buyer-supplier linkages at the local level can be a good indicator of the probability that a firm is located in that region. To evaluate the strength of buyer (supplier) linkages for each industry, I use a summation of regional (here district) industry firms weighted by the industry's input (output) coefficient column (row) vector from the national input-output account.

I use the Herfindahl measure to examine the degree of economic diversity in each district. I refer to this measure as urbanization economies ($U_{jt}$) in each district in a given year $t$. Urbanization economies are a reference to large urban areas, which are industrially diverse and enjoy access to large labour pools with multiple degrees of specialization, access to financial and professional services, better physical and social infrastructures etc. The Herfindahl Index, although it captures only the level of industrial diversity within a region, is a proxy for these larger urbanization economies. The Herfindahl Index of a district $j$ ($U_{jt}$) is the sum of squares of firm shares of all industries in district $j$:

$$U_{jt} = \sum_{k} \left( \frac{N_{jkt}}{N_{jt}} \right)^{2}$$

where $N_{jkt}$ is the number of industry $k$ firms in district $j$ in year $t$ and $N_{jt}$ is the total number of firms in the district. Unlike measures of specialization, which focus on one industry, the diversity index considers the industry mix of the entire regional economy. The largest value for $U_{jt}$ is one, when the entire regional economy is dominated by a single industry. Thus a higher value signifies a lower level of economic diversity.
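The three agglomeration measures described above (the localization share, the input-output weighted linkage sum, and the Herfindahl index of firm shares) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the `firms` dictionary of firm counts keyed by `(district, industry)`, and the coefficient values are all hypothetical, and the weights stand in for a column or row of a national input-output table.

```python
def localization_share(firms, district, industry):
    """District's share of the country's firms in the given industry."""
    national = sum(n for (_, k), n in firms.items() if k == industry)
    return firms.get((district, industry), 0) / national if national else 0.0


def linkage_strength(firms, district, weights):
    """District firm counts summed with national input-output weights.

    `weights` maps each linked industry to a coefficient (assumed here to
    come from the relevant column or row of a national input-output table).
    """
    return sum(w * firms.get((district, k), 0) for k, w in weights.items())


def herfindahl(firms, district):
    """Sum of squared industry shares of firms within the district."""
    counts = [n for (d, _), n in firms.items() if d == district]
    total = sum(counts)
    return sum((n / total) ** 2 for n in counts) if total else 0.0


# Hypothetical firm counts for one year.
firms = {("d1", "textiles"): 30, ("d1", "chemicals"): 10, ("d2", "textiles"): 70}
hypothetical_weights = {"textiles": 0.2, "chemicals": 0.5}

loc_d1_textiles = localization_share(firms, "d1", "textiles")
io_d1 = linkage_strength(firms, "d1", hypothetical_weights)
u_d1 = herfindahl(firms, "d1")
u_d2 = herfindahl(firms, "d2")
```

District d2 hosts a single industry, so its index takes the maximum value of one, while the more diverse d1 scores lower.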

First-Nature Geography Variables
In principle, improved access to consumer markets (including inter-industry buyers and suppliers) will increase the demand for a firm's products, thereby providing the incentive to increase scale and invest in cost-reducing technologies. The model uses the formulation proposed initially by Hanson (1959), which states that the accessibility at point A to a particular type of activity at area B (say, employment) is directly proportional to the size of the activity at area B (say, number of jobs) and inversely proportional to some function of the distance separating point A from area B. Accessibility is thus defined as the potential for opportunities for interactions with neighboring districts:

$$MA_{jt} = \sum_{m \neq j} \frac{S_m}{d_{jm}^{b}}$$

where $MA_{jt}$ is the accessibility indicator estimated for location $j$ in year $t$, $S_m$ is a size indicator at destination $m$ (in this case, district population) in a given year, $d_{jm}$ is a measure of distance between origin $j$ and destination $m$, and $b$ describes how increasing distance reduces the expected level of interaction 7. The size of the district of origin $j$ is not included in the computation of market access; only that of neighboring districts is taken into account. Thus, the accessibility indicator is constructed using population (as the size indicator) and distance (as a measure of separation), and is estimated without exponent values. The measure of distance is travel-time distance (in number of minutes) connecting any given pair of districts. Origin and destination points are located at the geographic center of each district, and the travel-time estimate is based on the least time-consuming path between the two. Time is computed 8 using Geographic Information Systems (GIS) as the length of the road between two points with assumptions about the speed of travel according to different road categories 9.

7 In the original model proposed by Hanson (1959), ...
The same travel-time measure is also used to compute $Port_j$, the distance of a given district to the closest of the 13 largest trading ports in the country. Access to a major merchandise shipping port should, in theory, positively impact the probability of starting to export.
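A minimal sketch of the two first-nature measures follows: market access as the sum of neighboring-district populations divided by travel time, and port access as the travel time to the nearest port. The function names, district and port labels, and all numbers are hypothetical; the exponent `b` defaults to 1 to mirror the statement that the indicator is estimated without exponent values.

```python
def market_access(origin, size, travel_time, b=1.0):
    """Sum of destination sizes discounted by travel time from `origin`.

    `size` maps districts to population; `travel_time` maps
    (origin, destination) pairs to minutes. The origin district's own
    size is excluded, as in the text.
    """
    total = 0.0
    for (j, m), minutes in travel_time.items():
        if j == origin and m != origin:
            total += size[m] / minutes ** b
    return total


def time_to_closest_port(origin, port_time):
    """Shortest travel time from `origin` to any port in `port_time`."""
    return min(t for (j, _), t in port_time.items() if j == origin)


# Hypothetical populations and travel times in minutes.
size = {"d1": 1000, "d2": 2000, "d3": 500}
travel_time = {("d1", "d2"): 100, ("d1", "d3"): 50, ("d2", "d1"): 100}
port_time = {("d1", "port_a"): 240, ("d1", "port_b"): 90}

ma_d1 = market_access("d1", size, travel_time)
port_d1 = time_to_closest_port("d1", port_time)
```

Raising `b` above 1 would penalize distant districts more heavily; the default keeps the simple population-over-time weighting.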

Infrastructure Variables
The next set of variables deals with the general quality of infrastructure within a district, since one would expect that the general business environment would have a positive impact on the probability of a firm entering export markets. Such variables are also particularly interesting to policy makers since, unlike agglomeration, targeted investments within a location can help to improve infrastructure and make a location more business-friendly. I include the density of roads, i.e. the length of roads per square kilometer within a district, as a proxy for transport infrastructure. I computed these values using ArcGIS data, and the density data is time-invariant 10. I assess quantitatively the role played by human capital by including the proportion of the population within the district with a high-school education in a given year, captured by the education variable $Ed_{jt}$. I define $X_{jt}$ as a measure of 'natural advantage' through the embedded quality and availability of infrastructure in the district. I use the availability of power (proxied by the proportion of households with access to electricity) within a location as an indicator of the provision of infrastructure. In addition, I use the proportion of households within a district with a telephone connection as an indicator of communications infrastructure. $W_{jt}$ is an indicator of labor costs in location $j$, and is given by nominal district-level wage rates (i.e. non-agricultural hourly wages). The expected effect of this variable is hard to pin down theoretically. On the one hand, if wages were a measure of input costs then one would expect export activity to be inversely related to wages. However, it is also important to control for the skill set of the workers since a positive coefficient on wages could be a proxy for more skilled labor. Although I am unable to directly control for the ability of the worker, I include 'education' as a proxy for the level of human capital within the district.
Finally, the proportion of high-income households ($WE_{jt}$) within a district is an indicator of the general level of wealth, or more specifically, consumer expenditure within a district. The variable is constructed using household consumption data and refers to those households that belong to the highest monthly per-capita consumption expenditure group 11.

Institutional Variables
I also control for the quality of institutions within the location. I include a dummy variable which is set equal to one for states with labor laws rated as pro-business by Besley and Burgess (2004). While labor regulations are mainly legislated and enforced by state governments, they also have an important effect on the cost of contracts at the district level. I also include a district-level variable on the frequency of riots and social unrest per capita across different years as a proxy for social institutions. This information is drawn from Marshall and Marshall (2008).
In summary, the firm characteristics and economic geography variables are supplemented with controls for infrastructure (transport, education, electricity and telephone), input costs (wages) and institutional variables (flexibility of labor regulations and social unrest).

Sources of Data
Firm-level data on export behavior (i.e. when a firm starts to export and, following entry into export markets, the value of exports as a proportion of sales) and on output and inputs is drawn from the Prowess database. Prowess is a corporate database that contains normalized data built on a sound understanding of disclosures of over 20,000 companies in India. The database provides financial statements, ratio analysis, fund flows, product profiles, returns and risks on the stock market etc. The Centre for Monitoring the Indian Economy (CMIE), which collects data from 1989 onwards, assembles the Prowess database. The database contains information on 23,168 firms for the years 1989-2008 12. After cleaning the data, the final dataset contains 6,296 firms. Since there is limited data for other district-level variables, the analysis is restricted to fewer years (1999-2004). The analysis is limited to the manufacturing sector (i.e. National Industrial Classification 2-digit units 14 to 36). I also exclude firms for which data on sales, gross assets and wages is missing, since these are crucial to the computation of firm-level productivity. Of the firms in the final dataset, 3,638 firms enter the export market at least once over the period of study. There is also a large degree of firm heterogeneity in terms of size and age.
However, some caveats should be mentioned here. It is not mandatory for firms to supply data to the CMIE, and one cannot tell exactly how representative of the industry the member firms are. Prowess covers 60-70 percent of the organized sector in India, 75 percent of corporate taxes and 95 percent of excise duties collected by the Government of India (Goldberg et al 2010) 13. Large firms, which account for a large percentage of industrial production and foreign trade, are usually members of the CMIE and are more likely to be included in the database. The analysis is therefore based on a sample of firms that is, in all probability, taken disproportionately from the higher end of the size distribution. As Tybout and Westbrook (1994) point out, a lot of productivity growth comes from larger plants, which are also more likely to be exporters, providing confidence in the comprehensive scope of the study.
Measures of agglomeration are constructed using unit-level data for the years 1999-2004 from the Annual Survey of Industries (ASI), conducted by the Central Statistical Office of the Government of India. The ASI covers all factories registered under the Factories Act of 1948 that employ 20 or more workers, or that employ 10 or more workers and use electricity. Although the ASI data has a large sample size, certainly larger than Prowess, and it contains data on firm-level characteristics, it cannot be used to study firm-level export behavior. This is because even though the ASI provides information on whether a firm exports or not, the database does not follow firms over time. In other words, firms are sampled afresh every year and it is not possible to create a panel of firms over time. Data on the number of units, i.e. the plant or the factory, is used in the analysis since employment-level data is often scarce or missing. As the ASI collects data primarily for the manufacturing sector, agglomeration measures do not account for the activities of services enterprises. This is a shortcoming of the analysis since service sector activity and clustering within a location might be strongly associated with the availability of essential inputs that might reduce entry costs into export markets. Data on market access is constructed using district-level population drawn from various surveys of the National Sample Survey Organisation (NSSO). Table 3.

4 Results: The Extensive Margin

Across Firms
The results of the econometric specification are provided in Table 4. The dependent variable is 'start', i.e. a dummy variable that equals one if the firm starts exporting and zero otherwise. All columns include year and industry (2-digit NIC level) controls. From left to right, the columns present estimations that include an increasing number of variables and then finally also include location-specific effects. Model specification (1) controls for firm-level characteristics only; in addition to these, model (2) includes agglomeration variables; model (3) adds first-nature geography variables; model (4) adds further infrastructure variables, and model (5) adds institutional variables. Model (6), which includes all variables (firm, second and first nature, infrastructure and institutional), also includes state fixed effects. Owing to limited data availability for infrastructure variables, as described earlier, the number of observations is considerably reduced in model (4). Due to missing data, this loss of observations is exacerbated in models (5) and (6).
As one would expect, firm-level productivity is strongly and positively associated with the decision of the firm to enter export markets, providing some evidence for self-selection of the most productive firms into the export market. Additionally, the size of the firm seems to affect the export decision negatively, suggesting that smaller firms are more likely to start exporting. Interestingly, once productivity is controlled for, the age of the firm has no statistically significant effect on the log odds of entry. Also, once infrastructure variables are included in the regressions, these effects are no longer statistically significant.
The count of existing exporters per capita, by industry and in total, within a district seems to have little or no discernible effect on the propensity of the firm to export. For example, the magnitude of the coefficient of 'Exporter Count' in model (6) can be interpreted as follows: a unit increase in the percentage of exporters within a district decreases the log odds of starting to export by 0.2857, although this variable is only significant at the 10 percent level.
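Because logit coefficients are expressed in log odds, it can help to convert them to odds ratios when reading Table 4. The sketch below applies the standard transformation to the reported coefficient of -0.2857; the function names are illustrative, and the conversion is a general property of the logit model rather than a calculation reported in the paper.

```python
import math


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase."""
    return math.exp(coefficient)


def entry_probability(log_odds):
    """Logistic transformation from log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Coefficient on 'Exporter Count' reported for model (6).
exporter_count_or = odds_ratio(-0.2857)
percent_change_in_odds = (exporter_count_or - 1.0) * 100.0
```

On this reading, a one-unit increase in the exporter measure multiplies the odds of entry by roughly 0.75, i.e. lowers them by about a quarter, subject to the same caveat about significance noted in the text.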
Other aspects of more general agglomeration within a district, i.e. localization, input-output economies and industrial diversity, have a stronger impact on the odds of entering export markets. In fact, the coefficient on localization is negative, suggesting that this variable might be capturing some aspects of competition across firms within the same industry, although the coefficient is not significant in any specification barring the one in model (4). Input linkages, i.e. access to suppliers, have a positive effect before infrastructure controls are introduced. On the other hand, proximity to buyers, i.e. those that firms sell to, seems to have a negative effect. The effect of industrial diversity is stable and negative, but statistically insignificant, across different specifications.
In Column (3) first-nature economic geography variables are introducedmarket access and access to the closest port. Neither variable seems to have any effect on the probability of starting to export.
Model (4) introduces infrastructure variables into the specification, and finds that most of these variables seem to have little or no effect on the odds of a firm entering export markets. Interestingly, however, the effect of education (i.e. the proportion of the population with a high-school degree) is positive and significant at the 10 percent level. In lengthier checks (not shown here) the introduction of road density reduces the significance of the education variable, suggesting that some of the effect of more skilled labor is explained by the availability of better transport infrastructure. Lower wages seem to reduce the costs of entry, but the effect is not significant across specifications. And finally, institutional variables at the state level (i.e. the flexibility of labor regulations) and at the district level (i.e. social unrest per capita) are introduced in model (5). The effect of business-friendly labor regulations is insignificant. The impact of riots per capita, however, is negative and significant, suggesting that more social unrest within a district lowers the odds of a firm's entry into export markets.
The last column (6) introduces location (i.e. state 14 ) fixed-effects in an attempt to control for any unobserved characteristics of the location that are not captured by the first-nature geography variables. In summary, firm-specific characteristics, namely productivity and size of the firm, have a significant effect on the odds of entry into exporting. Additionally, the agglomeration of same-industry firms within a district seems to have a negative effect, although that of exporter-specific clustering within the district is harder to pin down. Access to suppliers positively affects entry, while access to buyers does not. The level of skilled labor within a location has a positive effect, and social unrest is associated with lower odds of entry.

Within Firms
The previous regressions have been estimated at the industry-level. Including industry dummies implies that the coefficients are averaged for all firms within a given industry (and year and/or state). However, it could also be the case that a change in industry-level (or state-level) characteristics could affect firms in that industry differently, depending on the individual characteristics of the firm. For instance, Bown and Porto (2010) study the effect of a change in preferential market access for the Indian steel industry and find that some firms within the industry, such as those which historically had ties to developed markets, responded more quickly than others in order to increase their exports. Indeed, as their analysis shows, aggregating variables at the industry-level fails to capture the differences across firms, some of which are large producers who were active for a number of years prior to the shock and others that were relatively new entrants to the market 15 . And since ultimately the analysis is concerned with studying the effect of agglomeration and other characteristics of a location on the propensity of a given firm to enter export markets, this section re-runs the regressions with the introduction of firm fixed-effects.
Taking firm fixed-effects not only constrains the coefficient to be averaged within firms and not across firms, it also provides the most stringent control. It effectively controls for any possible endogeneity running from time-invariant unobservables at the level of industries and locations, and the coefficients describe the effects at the level of firms over time. It is then redundant to take account of industry or location unobservables, and the introduction of firm dummies provides a much cleaner analysis of effects at the level of the firm.
Table 5 reports the results from the specifications that include firm fixed-effects. Since convergence was not reached with the inclusion of infrastructure variables, the models were run without them 16 . Controlling for all characteristics of a given firm, the size of the firm has a negative effect on its odds of entry. Just as in the earlier section 4.1 on across-firm estimations, this suggests that smaller firms are more likely to start exporting 17 . However, the coefficient on productivity is not only larger in magnitude, it is also highly significant across all specifications. Since productivity has been lagged, this is robust evidence to support the theory that more productive firms are more likely to self-select into exporting.
There are some marked differences compared to the results of the across-firms analysis. Clustering of exporters within the district has a strong, negative and significant effect on the probability that a given firm enters export markets. In other words, being surrounded by other exporting firms, irrespective of industry type, seems to discourage entry. On the other hand, the effects of more general industrial agglomeration are much more noteworthy than in earlier models. For instance, within-industry clustering of firms seems to have a strong positive effect on the log odds of entry into export markets. Additionally, local industrial diversity seems to have no effect on the log odds of entry. Compared to earlier results, access to larger neighboring markets now positively affects the odds that a firm will enter the export market.
Although these specifications shed much light on the effect of geography and firm-level variables on the decision of the firm to start exporting, they don't say much about how these very variables might affect export participation conditional on entry. The next section will explore these effects in greater detail.

Results: The Intensive Margin
There is evidence (Das, Roberts and Tybout 2007) to show that entry costs are substantial not just with regard to the decision to export, i.e. the extensive margin, but also with regard to how much to export, i.e. the intensive margin. Thus, it could be argued that characteristics of the location affect not only the probability that a firm might start to export, but also the continued success of the firm in export markets. Firms seeking to enter foreign markets might face fixed costs of participation for every additional year of exporting. Indeed, there is some evidence (Arkolakis 2009) to show that firms begin by exporting small quantities and increase their volume of exports quickly over time. Thus, export performance could also be measured as the intensity with which firms export.
To identify the effect of geography and firm-level variables on the intensity of export participation, I regress the log of the value of exports on a set of firm, industry and location-specific characteristics, similar to those in Equation (2). Equation (3) is estimated using Ordinary Least Squares (OLS) regressions 18 . Firm characteristics are included in the vector X_it, and characteristics specific to the location and industry are included in Z_jkt. As in Section (4) above, I will identify the effects of geography and firm characteristics for firms within a given industry and location, and then for a given firm.
18 I also had the choice of regressing exports of the firm as a proportion of sales on the explanatory variables. In this case the dependent variable would be a fraction that varies between 0 and 1, and using OLS would lead to incorrectly identified coefficients. This is because the effect of any explanatory variable cannot be constant through its entire range. Additionally, the predicted values from an OLS regression often fall outside the range of 0 to 1. Papke and Wooldridge (1996) examine potential econometric alternatives and support using quasi-likelihood methods. Accordingly, I try to use fractional logit regressions, but find that these models do not converge with the introduction of firm fixed-effects.

Across Firms
The first set of results is presented in Table 6, wherein the model specifications are the same as those in Table 4. However, the dependent variable is now the log of total exports of the firm, since I am mainly interested in understanding the factors that affect the intensity of participation in export markets.
The first striking result is that lagged productivity has a negative effect on the value of exports; in fact, a 1 percent increase in productivity seems to lower exports by anywhere between 29 and 33 percent. Age is also negatively associated with export intensity, indicating that younger firms tend to export more. And, intuitively, the size of the firm is positively associated with exports.
The clustering of exporters within the district seems to affect export intensity negatively, while the clustering of exporters of the same industry within the district has a positive effect. A percentage increase in the number of same-industry exporters within the district increases the value of exports by 16 percent. In the same vein, more general clustering, i.e. clustering of firms within the same industry, has a positive and significant coefficient. Thus, there is some evidence of positive externalities of within-industry clustering on the intensity of a firm's participation in export markets. Access to suppliers has a positive effect, and access to buyers has a negative effect, although these coefficients are not statistically significant once infrastructure and other variables are controlled for.
Market access has a negative effect, although the magnitude of the effect is small. Travel-time distance to the closest port has a negative effect, suggesting that firms closer to large trading ports export more. In fact, a 1-minute increase in the travel-time distance to the closest port decreases exports by 0.11 percent (see model 6). It seems that being located close to a port does not affect the odds of starting to export (see the result in Table 4) but that it does positively affect the intensity of export participation. Firms close to a large trading port are not more likely to start exporting, but once they do start exporting they tend to export more.
Infrastructure variables seem to be statistically insignificant in all cases, except for road density: higher density is associated with more intensive exporting, although the effect is insignificant and negative with the introduction of institutional variables and location fixed-effects. And lastly, whilst social unrest might negatively affect the propensity of firms to turn to foreign markets, once they do start exporting it is no longer statistically significant. In fact, the flexibility of labor regulations now seems to be much more important: more pro-business regulations are associated with higher intensity of exports.
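When the dependent variable is the log of exports and a regressor is measured in levels (such as travel time to the closest port in minutes), a coefficient b implies a percentage change in exports of 100 * (exp(b) - 1) per unit of the regressor, which is close to 100 * b when b is small. The Python sketch below illustrates the calculation. The coefficient value of -0.0011 is a hypothetical number chosen only because it is consistent with the 0.11 percent figure quoted above; the estimated coefficients themselves are those reported in Table 6.

```python
import math


def pct_change_per_unit(coef: float) -> float:
    """Exact percentage change in the outcome for a one-unit increase in a
    level regressor, when the dependent variable is in logs."""
    return 100.0 * math.expm1(coef)


def pct_change_approx(coef: float) -> float:
    """Common first-order approximation, accurate only for small coefficients."""
    return 100.0 * coef


# Hypothetical coefficient on travel time to port (minutes); an assumption,
# not a value taken from the paper's tables.
b_port = -0.0011

print(f"exact, one minute:  {pct_change_per_unit(b_port):.4f}%")
print(f"approx, one minute: {pct_change_approx(b_port):.4f}%")
# For larger changes (here, one hour) the exact and approximate figures diverge.
print(f"exact, one hour:    {pct_change_per_unit(60 * b_port):.2f}%")
print(f"approx, one hour:   {pct_change_approx(60 * b_port):.2f}%")
```

For a one-minute change the two figures are practically identical, so reading the coefficient directly as a percentage effect is harmless; for larger changes the exact formula is the safer choice.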

Within Firms
Just as in the case of the extensive margin, the paper will now concentrate on how these factors affect the intensity of participation in export markets for a given firm. As before, the introduction of firm fixed-effects will help to deal with omitted variable bias arising from time-invariant unobservables and will indicate the effect of locational and other factors within firms. Additionally, variables that are time-invariant are not included in the analysis, since their coefficients cannot be identified separately from the firm fixed-effects; these include distance to the closest port, road density and labor regulations.
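The reason time-invariant variables drop out can be seen from the within transformation: a linear regression with firm fixed-effects is equivalent to regressing each variable's deviation from its firm-level mean, and a variable that never changes for a firm has deviations of exactly zero. The Python sketch below illustrates this on a hypothetical toy panel; the firm identifiers, variable names and values are invented for illustration and do not come from the paper's data.

```python
from collections import defaultdict


def within_transform(firm_ids, values):
    """Subtract each firm's own mean from its observations
    (the 'within' transformation behind firm fixed-effects OLS)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for firm, v in zip(firm_ids, values):
        totals[firm] += v
        counts[firm] += 1
    return [v - totals[firm] / counts[firm] for firm, v in zip(firm_ids, values)]


# Hypothetical toy panel: two firms, each observed for three years.
firm_ids = ["A", "A", "A", "B", "B", "B"]
port_minutes = [120.0, 120.0, 120.0, 45.0, 45.0, 45.0]  # constant within each firm
exporter_count = [3.0, 4.0, 5.0, 10.0, 10.0, 13.0]      # changes over time

# The time-invariant variable has no within-firm variation left to estimate from.
print(within_transform(firm_ids, port_minutes))
# The time-varying variable keeps its deviations around each firm's mean.
print(within_transform(firm_ids, exporter_count))
```

The same logic explains why variables that vary only slowly within a district can lose significance once firm fixed-effects are included: little within-firm variation remains to identify their coefficients.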
Columns (1) to (5) in Table 7 introduce the different sets of variables, and column (6) also includes year dummies along with firm fixed-effects. When the coefficient is averaged within firms and not across firms, i.e. after controlling for firm fixed-effects, some of the earlier results remain stable. Productivity continues to negatively impact export intensity, and the size of the firm has a strong positive effect. However, the age of the firm seems to have little or no impact on export intensity.
Interestingly, the count of exporters within a district affects export intensity positively, in contrast to the across-firm specifications presented in Section 5.1. Although the result is not statistically significant once infrastructure and other variables are controlled for, this does seem to suggest that accounting for the effect of firm unobservables might be important to estimate the average impact of exporter agglomeration. Averaging the coefficients within firms appears to reverse the impact of spillovers from exporter clustering. The impact of within-industry export clustering is now statistically insignificant.
The remaining economic geography variables don't have any statistically significant effects on the intensity of exports. Market access has a small and negative effect on export intensity, which disappears once year dummies are introduced. Additionally, neither the infrastructure nor the institutional variables have any impact on the intensity of exports for a given firm. The introduction of firm fixed-effects effectively seems to absorb most of the variation in the data, especially for those variables that vary only by district and year.

Conclusions
This paper investigates the factors that affect the decision of the firm to start exporting and its performance thereafter. In particular, it studies the impact of firm-specific characteristics and those of the location: agglomeration, infrastructure and institutions. It separates the effect of these factors across firms and within firms. When comparing firms within the same industry and year, the paper finds that the impact of local agglomeration of exporting firms seems to negatively affect the odds of entry into export markets, and with the introduction of firm fixed-effects the impact is negative and significant across all specifications. Within-industry clustering of firms seems to have little or no impact on the odds of entry when the coefficient is averaged across firms, but after controlling for unobservables at the level of the firm, it positively affects the log odds of entry. Educational attainment and institutional factors seem to matter. The paper also finds compelling evidence of self-selection of more productive firms into the export market. The effect of other location-specific factors, such as infrastructure and institutional controls, varies by model specification.
Once firms have started to export, the paper also models how these factors affect the intensity of participation. More productive firms are less likely to export intensively, whilst the size of the firm is an important determinant of its export participation. Controlling for unobservables at the level of the firm seems to indicate that clustering of other exporters might affect participation positively, but there is little evidence that other factors affect export intensity.
Three important contributions are made to the existing empirical literature. Evidence on factors that affect the sunk cost of entry in developing countries is rare, and the paper provides new evidence for India using firm-level data. Additionally, the paper contributes to the empirical literature on industrial development and economic geography. Clustering and agglomeration activities in the analysis are defined and studied at the level of the district, a spatial unit disaggregated enough to measure spillovers at local levels. And lastly, and importantly, to the author's knowledge, this is the first paper that explicitly separates the cross-sectional variation from the time-series variation. In other words, it shows how the same set of factors can have differential effects when aggregated across firms within a given industry and location, and when disaggregated within firms over time.
The main limitation of the paper is that, since the firm-level data do not provide any information on the final destination of exports, the paper does not analyze whether destination-specific firm agglomeration affects the sunk costs of entry to particular destinations. In fact, Moxnes (2010) finds that country-specific costs are three times the magnitude of global costs. However, it could also be argued that some sunk costs are incurred for all global export markets. For instance, the costs of international product standards common to multiple foreign markets need only be borne once (Shephard 2007).
The policy implications of these findings are relevant, not just for those wishing to encourage export participation by firms in India, but also more generally for policy makers in developing countries elsewhere. The across-firm results provide indications on the sorts of factors that affect firms within given industries. Indeed, if one were interested in providing incentives to allow a particular domestic industry to export, these results would be particularly relevant. Better education and better institutions are the most important factors. On the other hand, if one were hoping to give certain kinds of firms within particular industries a boost into export markets, then the within-firm results would be important. In other words, if all that mattered was that the best, or the most productive, firms within given industries and locations accessed foreign markets, then more general agglomeration within a location would be an important factor.
In summary, these findings suggest that if, in fact, there are positive externalities from clustering of export-oriented activity, then governments could provide incentives to encourage such co-location. However, the existence of spillovers from more general economic clustering suggests that governments may be limited in what they can achieve, since their effect on generating agglomeration economies is unclear. And lastly, investment in more general education infrastructure and in improving institutional characteristics of regions might also help to reduce the sunk costs of export entry.