Geographic clusters, regional productivity and resource reallocation across firms: Evidence from China

We link industrial clusters, regional productivity and resource reallocation efficiency using geographically and sectorally disaggregated data. Based on a county-industry level panel from 1998 to 2007 in China, we find that industrial clusters significantly increase local industries' productivity by lifting average firm productivity and reallocating resources from less to more productive firms. Moreover, we identify two major mechanisms through which resource reallocation is improved within clusters: (i) clusters are associated with higher firm turnover, with entry and exit rates increasing simultaneously; and (ii) within clusters, the dispersion of firms' markups is significantly reduced, indicating intensified local competition. These results suggest that industrial clusters in China help improve regional productivity and resource allocation efficiency through intensified competition and accelerated firm dynamics. The identification issues are carefully addressed by two-stage estimations with instrumental variables and other robustness checks.


Introduction
Economic activity in many countries around the world tends to be spatially concentrated. Dating back to Marshall (1890), theorists have highlighted the benefits of local economies of scale and inter- or intra-industry externalities arising from firms being co-located, leading to improved regional growth (Arrow, 1962; Jacobs, 1969; Krugman, 1991; Porter, 1990). While empirical estimates of the effects of spatial agglomeration on economic growth have flourished over the last few decades, the findings are far from conclusive (e.g., Cingano and Schivardi, 2004; Dekle, 2002; Feldman and Audretsch, 1999; Glaeser et al., 1992; Henderson, 2003; Rosenthal and Strange, 2003). The conflicting results suggest that the net effect of geographical agglomeration remains ambiguous and may depend on different factors of which we still have little understanding (Grashof, 2020; McCann and Folta, 2008).
Productivity growth driven by technological progress and other productivity-enhancing factors beyond traditional input factors, which is usually measured by total factor productivity (TFP),¹ has been recognized as a major contributor to economic growth, and it explains a large proportion of the variation in economic development between regions (Caselli, 2005; Galor and Tsiddon, 1997; Hall and Jones, 1999; Klenow and Rodriguez-Clare, 1997). At the same time, it is evident that resource misallocation problems can significantly reduce aggregate TFP (Bartelsman et al., 2013; Collard-Wexler and De Loecker, 2015; Restuccia and Rogerson, 2008). If agglomeration positively affects economic growth, what is the relationship between agglomeration, regional TFP and resource misallocation? Does agglomeration affect regional TFP through resource reallocation? The answers are unknown.
This study aims to fill the existing knowledge gap by linking industrial clusters and region-level TFP in China, focusing on resource reallocation among firms in a locality. Specifically, we ask whether industrial clusters improve resource allocation and thereby enhance the TFP of a local industry and, if so, through which mechanisms such effects operate locally.
To address the questions mentioned above, we employ a large firm-level panel dataset in China's manufacturing sector to create a county-industry level industrial cluster panel from 1998 to 2007, using a density-based index (DBI, à la Guo et al., 2020).² Based on both county- and firm-level data, applying the methodology of Olley and Pakes (1996), we find that industrial clusters not only increase the aggregate and average TFP but also improve the resource reallocation TFP³ of a local industry. The existence of an industrial cluster in any county-industry is associated with a 2 % increase in reallocation TFP per year between 1998 and 2007, indicating that the existence of industrial clusters can explain 8.5 % of the increase in reallocation TFP in county-industries of China.⁴ Moreover, we provide evidence on how clusters mitigate resource misallocation. We find that firm entry and exit are much more active in clusters than outside clusters, and firm markup dispersion is significantly reduced within clusters. These findings indicate that local competition is intensified within clusters, which reduces resource misallocation across individual firms, supporting the arguments of Porter (2003).
Our discoveries contribute to the research on economic geography in several ways, taking one step forward to explore the net effect of agglomeration and the potential mechanisms. Above all, this study is among the few studies that connect agglomeration and regional TFP using micro-level data, complementing Cingano and Schivardi (2004) and Greenstone et al. (2010). Because of data constraints, empirical studies on the regional effects of agglomeration have typically focused on regional employment, wage growth, or innovation (e.g., Delgado et al., 2014; Feldman and Audretsch, 1999; Glaeser et al., 1992; Henderson, 2003; Henderson et al., 1995; Porter, 2003; Rosenthal and Strange, 2003), assuming that proportional employment gains or innovation improvement in a region are the results of an overall productivity increase. Such an assumption is problematic because increases in input in one or two dimensions may be accompanied by decreases in others, and the net productivity growth may be balanced or even reversed due to the elasticity of substitution of input factors (Hicks, 1932; Hicks and Allen, 1934). Therefore, we propose that TFP, a comprehensive measurement of productivity improvement focusing on how input factors are utilized, is appropriate to capture the net effects of agglomeration. It is important to address this issue because it is a key identification problem that brings empirical research closer to the much-discussed theoretical predictions on cluster spillovers.
In addition, this work complements existing research by linking clusters and regional TFP growth with a focus on agglomeration-driven resource reallocation. Resource sharing, knowledge transfer and competition have been identified as major benefits of agglomeration (Arrow, 1962; Jacobs, 1969; Krugman, 1991; Porter, 1990). However, firms are heterogeneous in their capability to take advantage of resource sharing or knowledge transfer (Grillitsch and Nilsson, 2017, 2019; Hervas-Oliver et al., 2018; Knoben et al., 2016). Moreover, intensified competition in clusters may damage some firms through increased congestion costs and lowered incentives for innovation under certain conditions. Therefore, the net effects of agglomeration should be related to how resources are shifted among firms under the competition in clusters. Unfortunately, most empirical studies on the regional effects of agglomeration assume that firms within a cluster benefit equally from agglomeration. Such a knowledge gap may bring about conflicting findings in existing studies. This study demonstrates that certain types of clusters may help improve regional growth by enhancing the shift of resources from less productive to more productive firms, filling the existing knowledge gap and opening a new dimension for agglomeration study. Furthermore, our discoveries show that intensified competition serves as an important channel for achieving agglomeration externalities in China, where institutional constraints are significantly strong, and they call for further examinations of Porter's (1990) externalities at the regional level under different conditions. At the same time, the study makes theoretical contributions to the debates on the impacts of competition on growth.
Finally, our findings contribute to the literature on economic development and China studies. Scholars suggest that the low aggregate TFP in developing countries is mainly due to micro-level resource misallocation (Banerjee and Duflo, 2005; Caselli and Coleman, 2001; Gancia and Zilibotti, 2009; Restuccia and Rogerson, 2008). Studies have suggested that weak institutions and policies, including labor market regulations, licensing and size restrictions, and the dominance of state ownership, may potentially cause such misallocation (Hsieh and Klenow, 2009; Kochhar et al., 2006; Lewis, 2005). The organization and coordination of entrepreneurial firms within clusters in China have been identified as an institutional innovation to overcome institutional impediments (Guo et al., 2020; Long and Zhang, 2011; Xu, 2011). However, we have little understanding of the mechanisms under which clusters work in China. By providing the first evidence on the relationship between clusters, resource reallocation and regional productivity growth based on micro-level data, this study suggests that cluster-based production, at least in the context of China, may alleviate the problem of resource misallocation for firms operating within clusters through intensified competition. It enriches the evidence on the mechanisms that mitigate the resource misallocation problem.
The rest of the paper is organized as follows. The next section discusses the institutional features of industrial clusters in China and reviews the relevant literature. Section 3 discusses the data and samples and introduces how we construct the variables of regional TFP growth. Section 4 presents the empirical findings on clusters and aggregate, average and resource reallocation productivity and addresses the identification concerns using IV regression. Section 5 examines the mechanisms through which clusters affect resource reallocation efficiency, focusing on product market competition. Finally, Section 6 concludes this study.

Clusters in China under institutional constraints
The critical conditions for "clustering" in market economies are the protection of property rights and factor mobility. Under these conditions, the market prices of mobile factor inputs will affect firms' co-location decisions that are essential to forming clusters (Ellison et al., 2010; Fujita et al., 1999; Marshall, 1890). Unfortunately, these primary conditions are not met in China due to institutional constraints. Above all, private property rights were not constitutionally recognized until 2004. Moreover, law enforcement for private property rights remains weak (Guo et al., 2014). Meanwhile, almost no input factors are freely mobile or tradable. The number one issue is land ownership. According to the constitution, urban land is state-owned, whereas rural land is collectively owned by villages and is not tradable for non-agricultural usage. Associated with the government control of land is the Hukou system (residence registration system), which restricts labor mobility (Au and Henderson, 2006; Meng, 2000).⁵ Furthermore, the financial market in China is highly underdeveloped and is particularly biased against lending to private enterprises (Allen et al., 2005; Guo et al., 2014). Under the above institutional constraints, firms' co-location decisions are not made solely based on market prices. Instead, the development path and organization of industrial clusters in China differ significantly from those in market economies.
Above all, the emergence and development of industrial clusters are the consequence of the joint efforts of local entrepreneurs and governments to overcome institutional constraints. Research has shown that most of China's industrial clusters have origins in township-village enterprises (TVEs). TVEs are essentially the creatures of local governments and entrepreneurs in the face of the suppression of the private economy (Xu and Zhang, 2009). At the beginning of the economic reform, when the private sector was not recognized and restrictions on rural land use were strict, the only way for rural entrepreneurs to engage in business activities was to use the 'red hat' (Hongmaozi) strategy, i.e., to register a business as a collectively owned TVE to seek legal protection and mitigate the discrimination against the private sector (Xu, 2011). Although the TVEs differ in detail across regions, they share the following key characteristics: all were led by rural entrepreneurs, all had vague definitions of ownership at the incipient stage, reflecting certain institutional constraints, and all had close ties with local governments (Chang and Wang, 1994; Che and Qian, 1998; Qian and Xu, 1993). Since the late 1990s, when political and legal resistance to private ownership was gradually relaxed, many TVEs have become privatized. In regions where local governments provided continuous support to those subsequently privatized firms, local residents actively engaged in entrepreneurial activities and eventually formed the industrial clusters observed today (Xu and Zhang, 2009). The key role local governments have played was to help relax some of the institutional restrictions (e.g., providing the 'red hat' for private enterprises, relaxing restrictions on private lending and land usage, etc.), provide public goods and coordinate local entrepreneurs (Long and Zhang, 2011). In the past two decades, industrial clusters with a concentration of private entrepreneurial firms coordinated by local governments have emerged rapidly in vast rural areas in coastal provinces.
In addition, firms within clusters are often small in size and highly specialized. In particular, with limited access to formal financial and land resources, private entrepreneurial firms generally start small and use their profits for reinvestment (Ruan and Zhang, 2009). Production processes, which are usually integrated within a single firm in developed countries, are segmented into many small "firms," with each narrowly specialized in one production step. These specialized small firms are linked together through subcontracting networks where a collection of many specialized firms produces a final product. With repeated close interactions within a cluster, the members build trust that forms a basis for coordination and mutual support in many aspects of the business. In such a way, the requirements for input factors such as financial, technological and skilled-labor resources are lowered for entrepreneurial firms in clusters without sacrificing productivity improvement (Ruan and Zhang, 2009). With the concentration of a vast number of small and specialized firms, many cluster-centered townships have become national or international centers of specific products. These clusters often consist of a large number of privatized TVEs or their derivative companies.

Clusters, regional productivity and resource reallocation
A central idea of agglomeration economics is that firms can enjoy local economies of scale from co-locating with each other, thereby increasing productivity. The well-known Marshall-Arrow-Romer (MAR) model emphasizes the advantages of regional specialization, while Jacobs (1969) highlights the benefits of knowledge spillovers across industries caused by urban diversity. A more recent theory proposed by Porter (1990) argues that the advantage of agglomeration comes from the intense competition of firms in a locality. This competition provides significant incentives for firms to innovate, accelerating the rate of technological progress and hence productivity growth. Porter (1990) agrees with the MAR model in emphasizing the benefits of regional specialization, while he sides with Jacobs in highlighting the positive impacts of local competition on knowledge spillover.
However, empirically, the effects of agglomeration are found to be inconclusive. Many studies document the positive effects of agglomeration on local economic growth and firm productivity, though the explanations vary regarding which kind of externalities matters. For example, several studies find positive effects of agglomeration of firms from diverse industries on the growth of employment, wage, and firm productivity (e.g., Feldman and Audretsch, 1999; Glaeser et al., 1992), supporting the argument of Jacobs externalities. Some other studies, however, support the claims of the MAR model and document the positive effects of regional specialization (e.g., Cingano and Schivardi, 2004; Dekle, 2002; Delgado et al., 2014; Howell, 2017; Jofre-Monseny, 2009; Rosenthal and Strange, 2003; Van Oort and Stam, 2006). At the same time, some studies find evidence for both the Jacobs and MAR externalities, depending on the maturity of the industries (Henderson et al., 1995). However, using cross-country panel data from 70 countries, Henderson (2003) fails to observe the growth-promoting effects of agglomeration by any means. Moreover, some studies find no evidence of the positive effects of agglomeration on firm growth (Globerman et al., 2005; Van Geenhuizen and Reyes-Gonzalez, 2007), firm market value (Zaheer and George, 2004) or employment rate of firms in related industries (Beaudry and Swann, 2009).
The conflicting results from the empirical assessments suggest that the net effect of geographical agglomeration remains ambiguous and may depend on different factors of which we still have little understanding (Grashof, 2020; McCann and Folta, 2008). One of the major knowledge gaps is that the agglomeration literature is disconnected from recent studies on productivity growth and related mechanisms. Empirical studies on the regional effects of agglomeration have typically focused on regional employment, wage growth, or innovation outputs (e.g., Delgado et al., 2014; Feldman and Audretsch, 1999; Glaeser et al., 1992; Henderson, 2003; Henderson et al., 1995; Porter, 2003; Rosenthal and Strange, 2003), assuming that proportional employment gains or innovation improvement in a region are the result of an overall productivity increase. However, the substitutability of input factors is well established (Hicks, 1932; Hicks and Allen, 1934). An increase in one or two input factors may be accompanied by a decrease in the others. For instance, labor and capital demands may be reduced when technology progresses. As a result, the net productivity gain may be balanced or even reversed due to the substitution among input factors (Cingano and Schivardi, 2004). Therefore, TFP, a comprehensive measurement of productivity improvement focusing on how input factors are utilized (through technological progress or other productivity-enhancing strategies), is more appropriate to capture the net effects of agglomeration. However, due to data constraints, studies linking agglomeration and regional TFP with geographically and sectorally disaggregated data are limited, with a few exceptions focusing on agglomerations in the US (Greenstone et al., 2010; Henderson, 2003) or Italy (Cingano and Schivardi, 2004). As a result, how agglomeration affects regional TFP in developing countries is unknown.
Another knowledge gap left by the existing literature is that most studies on the effects of agglomeration, whether on firm or regional economic performance, assume (often implicitly) that agglomeration effects are homogeneous across firms within a cluster. Such an assumption is problematic. It is well documented that firms differ in resources and capability, so they may generate different outputs even with the same inputs. Indeed, some recent studies have provided evidence that the effects of clusters on firm performance depend on region- or firm-level factors (Arzaghi and Henderson, 2008; Greenstone et al., 2010; Hervas-Oliver et al., 2018; Knoben et al., 2016; Lee, 2018; Rosenthal and Strange, 2003; Speldekamp et al., 2020).⁶ Identifying which types of firms benefit or suffer from clusters based on firm-level data is essential for understanding agglomeration. However, we have little knowledge about how resources are shifted among firms within clusters and how the heterogeneous effects of agglomeration on different types of firms relate to its net effect on regional productivity growth.
Finally, Porter's (1990) competition arguments are underinvestigated in the agglomeration literature. According to Porter's theory, the most important agglomeration economies are dynamic efficiencies. The intense competition of firms in a locality provides significant incentives for firms to learn and innovate, thereby accelerating technological progress and productivity growth. However, theoretically, the relationship between agglomeration, competition and regional growth is debated. First, the intensified competition within a cluster may reduce firms' incentives to innovate. Following Schumpeter (1950), Aghion and Howitt (1992) argue that when competition is intensified, the laggard's reward for catching up with the technological leader may fall. Therefore, innovation incentives may be reduced by competition. Second, the increasing density of firms in a region may lead to congestion costs and thereby cause diseconomies of agglomeration (Prevezer, 1997). Such congestion costs may be reflected in increased transportation, labor, capital, and land costs when firms compete for such input factors (Audretsch and Feldman, 1996; Folta et al., 2006; Stuart and Sorenson, 2003). Third, firms within clusters may be trapped in regional competition and lose the capability to explore growth opportunities outside the region (Asheim and Isaksen, 2002; Boschma, 2005).
Recognizing the above knowledge gaps, we seek to examine the effects of industrial clusters on regional growth in China from a new perspective to take one more step toward exploring the net effects of agglomeration and related mechanisms. Specifically, based on a large-scale firm-level panel, we examine the effects of industrial clusters on regional TFP, focusing on resource reallocation among firms and taking firm heterogeneity into consideration. Moreover, we explore the mechanisms through which the agglomeration effects are achieved, focusing on the dynamic efficiencies of agglomeration.
The dispersion in firm productivity within the same industry or the same market is well documented. A growing literature has emphasized the role of resource reallocation across firms in explaining aggregate productivity growth (Bartelsman et al., 2013; Foster et al., 2001; Melitz, 2003). Some suggest that aggregate productivity can rise not only because the firms, on average, become more productive (usually because of upgrades in technology, improvement in management or investment in R&D) but also due to shifts in production factors from less to more productive firms (Collard-Wexler and De Loecker, 2015; Restuccia and Rogerson, 2008). Institutional changes such as deregulation or trade liberalization that lead to the exit of less productive firms or the expansion of more productive firms can improve aggregate productivity (Bernard et al., 2009; Bustos, 2011; Melitz, 2003; Pavcnik, 2002; Schmitz, 2005). On the contrary, in the presence of institutional distortions such as market imperfections, monopoly power, the lack of protection of property rights as well as the discretionary provision of production factors, highly productive firms may not have sufficient access to resources, and such restrictions on the further development of these firms could lower the aggregate TFP of the economy (Acemoglu et al., 2018; Banerjee and Duflo, 2005; Hsieh and Klenow, 2009; Restuccia and Rogerson, 2008).
In the case of China, it is well known that a substantial amount of production factors is not allocated through the market. The negative effects of resource misallocation on aggregate TFP, and the connection of such misallocation to institutional problems, are documented in the literature (Adamopoulos et al., 2017; Brandt et al., 2012; Hsieh and Klenow, 2009). For instance, Hsieh and Klenow (2009) find that productive firms are much smaller in China than they would be in an undistorted economy. At the same time, state-owned enterprises (SOEs) are much larger than the optimal level. Brandt et al. (2012) show that the differences in productivity between the entering and exiting firms explain a substantial part of the aggregate TFP growth in China between 1998 and 2005. Additionally, Adamopoulos et al. (2017) find that eliminating resource misallocation caused by restrictions on land ownership and labor mobility in rural China could increase agricultural productivity by 1.84 times. These results reflect a significant linkage between institutional distortions and resource misallocation in China.
The development of industrial clusters in China is not only an economic geography phenomenon but also an institutional arrangement. As we have discussed, the organization of production within clusters evolved with the efforts of entrepreneurs and local governments to overcome the institutional restrictions on the mobility of production factors. Specifically, clustering deepens the division of labor in the production process. As a result, it makes it possible for small entrepreneurial firms to enter the market by focusing on a narrowly defined production stage. These highly specialized entrepreneurial firms closely coordinate along the value chain within the clusters. With such a division of labor, the capital and technical barriers to entry are lowered, resulting in increased competition within the clusters (Long and Zhang, 2011; Xu and Zhang, 2009). Studies have systematically shown that these industrial clusters have significantly positive effects on local economic growth (Guo et al., 2020). Therefore, we expect to observe positive effects of clustering on the aggregate TFP of a locality. Moreover, with lowered entry barriers and a more market-oriented ecosystem, we expect to observe more efficient resource reallocation among firms within the industrial clusters in China. In particular, we expect intensified competition to serve as a major channel through which clustering improves resource reallocation among individual firms.

Data and sample
Our primary dataset is the Above-scale Industrial Firm Panel (ASIFP) from 1998 to 2007. This dataset provides detailed firm-level information, including the industry, location, age, size, ownership, and financial information of all SOEs and non-SOEs with annual sales of 5 million RMB or above. Admittedly, this dataset would miss micro firms, i.e., non-state firms with annual sales below 5 million RMB. Aware of possible data bias, we use the 2004 Economic Census Data (ECD),⁷ which covers all business entities in China in that year, to cross-check the accuracy of cluster identification. Results show that identified clusters using the DBI measurement are consistent across different datasets, including both the ASIFP and the ECD (see Appendix A).
(2003) document the decay of agglomeration effects over geographic distance. Lee (2018) finds that domestic technological leaders, which have significant technological distance from the global pioneers, benefit more from clustering. Speldekamp et al. (2020) discover that weak firms may take advantage of strong networks and urbanization to compensate for their limited internal resources. By contrast, firms with strong internal resources may benefit from urbanized clusters when they have many local partners.
⁷ The 2004 ECD is the only census data available for the examination period.
D. Guo et al.
Compared with the census data, as of 2004, the enterprises covered by ASIFP account for 90 % of the total sales of all manufacturing industries in China.⁸ Furthermore, 84 % of the firms in ASIFP are officially labelled as small enterprises, defined as having no more than 300 employees. We therefore suggest that the estimations are reliable and not biased by the data used.
A key dependent variable in our study is county-industry level resource reallocation efficiency, measured for each industry in each county. To compute reallocation efficiency, we first calculate the TFP of each firm within each county-industry. We use three TFP measures to ensure the robustness of the results. The first measure, $TFP\_ols_{it}$, is the ordinary least squares (OLS) regression residual from a log-linear transformation of the general Cobb-Douglas production function with year-fixed and industry-fixed effects. The OLS approach considers only tangible inputs while ignoring unobservable shocks, and it assumes that all types of inputs are exogenous and hence uncorrelated with the error term, which is the computed TFP itself. To account for these shortcomings, we also calculate firm-level TFP following Olley and Pakes (1996), a semi-parametric method that accounts for both unobservable production shocks and non-random sample selection. Specifically, we calculate $TFP\_op1_{it}$ with industry-fixed effects, and $TFP\_op2_{it}$ with both year-fixed and industry-fixed effects. The details of the TFP calculation are summarized in Appendix B.
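As a hedged illustration of the OLS measure, the regression can be written out under the assumption that the tangible inputs are capital $K$ and labor $L$. The input set, the symbols $Y$, $K$, $L$, $\beta$, $\delta_t$ and $\eta_{j(i)}$, and the functional form below are assumptions made for this sketch; the specification actually used is the one given in Appendix B and may differ.

```latex
% Log-linear Cobb-Douglas with year effects \delta_t and industry effects \eta_{j(i)}
% (illustrative notation, not taken from the paper)
\ln Y_{it} = \beta_0 + \beta_K \ln K_{it} + \beta_L \ln L_{it} + \delta_t + \eta_{j(i)} + \varepsilon_{it},
\qquad
TFP\_ols_{it} = \hat{\varepsilon}_{it}
```

Under this reading, firm-level TFP is the part of log output that the observed inputs and the fixed effects do not explain, which is why any correlation between input choices and the unobserved shock biases the OLS measure.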
The resource reallocation efficiency in each county-industry is obtained following the standard decomposition method of Olley and Pakes (1996). Concretely, the county-industry TFP for industry j in county k at time t, $tfp^{AGG}_{jkt}$, is calculated as the sum of each firm i's TFP in the county-industry, $tfp_{ijkt}$, weighted by the market share of this firm, $share_{ijkt}$. Olley and Pakes (1996) show that the aggregate TFP can be decomposed in the following way:
$$tfp^{AGG}_{jkt} = \sum_{i} share_{ijkt}\, tfp_{ijkt} = \overline{tfp}_{jkt} + \sum_{i} \left(share_{ijkt} - \overline{share}_{jkt}\right)\left(tfp_{ijkt} - \overline{tfp}_{jkt}\right) = tfp^{AVG}_{jkt} + tfp^{RAL}_{jkt}$$
where $\overline{tfp}_{jkt}$ and $\overline{share}_{jkt}$ are respectively the un-weighted average firm-level TFP and the average firm market share in industry j, county k, and year t.
The first component, $tfp^{AVG}_{jkt}$, is the un-weighted average firm-level TFP of the county-industry. The second component, $tfp^{RAL}_{jkt}$, measures the covariance between firm productivity and market share. Changes in the latter measure represent a reallocation of market share among firms of different productivity levels: a higher level of $tfp^{RAL}_{jkt}$ represents a higher level of resource reallocation efficiency.
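The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration for a single county-industry-year, not the authors' implementation: the function name `op_decomposition` and the input values are hypothetical, and the sketch assumes that market shares are rescaled to sum to one within the county-industry.

```python
from statistics import mean


def op_decomposition(tfp, share):
    """Split share-weighted TFP of one county-industry-year into an
    unweighted-average term and a reallocation (cross-product) term."""
    if not tfp or len(tfp) != len(share):
        raise ValueError("tfp and share must be non-empty and of equal length")
    total = sum(share)
    if total <= 0:
        raise ValueError("market shares must sum to a positive number")
    weights = [s / total for s in share]  # assumption: rescale shares to sum to one
    tfp_agg = sum(w * t for w, t in zip(weights, tfp))
    tfp_avg = mean(tfp)
    share_avg = mean(weights)
    # Sum of cross products between share deviations and TFP deviations.
    tfp_ral = sum((w - share_avg) * (t - tfp_avg) for w, t in zip(weights, tfp))
    return tfp_agg, tfp_avg, tfp_ral


# Hypothetical firm-level values, for illustration only.
agg, avg, ral = op_decomposition([1.0, 2.0, 3.0], [0.2, 0.3, 0.5])
print(agg, avg, ral)
```

In this toy case the most productive firm also holds the largest share, so the reallocation term is positive; with equal shares it would be zero and aggregate TFP would equal the unweighted average.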
Our key explanatory variable is the existence and strength of the DBI cluster in industry j, county k, and year t. As discussed by Guo et al. (2020), employing standard regional specialization or interconnectedness measurements to identify industrial clusters in China is not the most suitable method due to institutional constraints on factor mobility and location decisions of firms in China. At the onset of the post-Mao reform, all the firms were owned or controlled by the state or local governments; thus, governments made decisions on firms' locations. The situation changed gradually, but the legacy is substantial. The concentration of heavy industries in certain regions of China was primarily driven by political concerns. Regions with giant SOEs are likely to be highly specialized when measured by standard cluster measurements such as the Herfindahl-Hirschman Index (HHI), Gini coefficient, Krugman Dissimilarity Index (KDI), or location quotient (LQ). Therefore, clusters identified by applying standard indices to China are often located in regions dominated by giant SOEs.⁹
As discussed previously, the development of industrial clusters in China is characterized by the clustering of a large number of small and medium-sized firms within a region, implying that the density of firms in the industry within a locality is one of the essential features of entrepreneurial clusters in China. Hence, we apply the DBI (à la Guo et al., 2020) to measure clusters in China. The DBI counts the number of firms in the same industry within a county (the construction of the DBI is discussed in detail in Appendix A). Explicitly, we define a county to have an industrial cluster of a particular industry if the county is among the top α percentile of all counties regarding firm density for that industry, and we assign α = 5.¹⁰ We then construct a dummy variable $Cluster_{jkt}$, which equals one if firms of industry j have formed a cluster in county k in year t, and 0 otherwise. We further measure the strength of each cluster based on its relative contribution to the national total industrial output or establishment number. When measuring cluster strength based on industrial output, we first calculate the contribution of each cluster of industry j in county k to the national total industrial output of industry j at time t by $S\_V_{jkt} = Output_{jkt} / Output_{jt}$, which is a percentage. Based on $S\_V_{jkt}$ we can distinguish weak clusters from strong clusters. Specifically, we construct a categorical variable $Strength\_V_{jkt}$. It equals 0 for non-clusters, i.e., if firms from industry j have not formed a cluster in county k at time t. It equals 1 for clusters with below-median $S\_V_{jkt}$ compared with other clusters from the same industry j at time t, and equals 2 for clusters with $S\_V_{jkt}$ at or above the median.
Similarly, when measuring cluster strength based on the total establishment number, we define $Strength\_E_{jkt}$, which equals 0 if firms from industry j have not formed a cluster in county k at time t. It equals 1 for clusters with below-median $S\_E_{jkt}$ compared with other clusters from the same industry j at time t, where $S\_E_{jkt} = Establishment_{jkt} / Establishment_{jt}$ is the contribution of the cluster in county k of industry j to the total number of firms of industry j at time t. $Strength\_E_{jkt}$ equals 2 for clusters with $S\_E_{jkt}$ at or above the median.
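The cluster dummy and the strength categories described above can be illustrated with a short sketch. It is a simplified illustration, not the DBI construction of Appendix A: it assumes that firm density is already available as one number per county for a given industry and year, and the function names, the rounding of the top-α cutoff, and the handling of ties are all hypothetical choices.

```python
from statistics import median

def flag_clusters(density_by_county, alpha=5.0):
    """Return the counties in the top alpha percent by firm density (assumed cutoff rule)."""
    ranked = sorted(density_by_county, key=density_by_county.get, reverse=True)
    n_top = max(1, round(len(ranked) * alpha / 100))
    return set(ranked[:n_top])

def strength_categories(value_by_county, clusters):
    """0 = non-cluster, 1 = cluster with below-median share, 2 = share at or above median.

    value_by_county holds output (for Strength_V) or establishment counts (for
    Strength_E); shares are taken relative to the national total of the industry.
    """
    total = sum(value_by_county.values())
    shares = {c: v / total for c, v in value_by_county.items()}
    cutoff = median(shares[c] for c in clusters)
    return {
        c: 0 if c not in clusters else (2 if shares[c] >= cutoff else 1)
        for c in value_by_county
    }

# Hypothetical data: 40 counties, one industry, one year.
density = {f"c{i}": float(i) for i in range(1, 41)}
output = {f"c{i}": 10.0 * i for i in range(1, 41)}
clusters = flag_clusters(density, alpha=5.0)
strength = strength_categories(output, clusters)
```

With 40 counties and α = 5, two counties are flagged as clusters; the median split then assigns one of them to the weak category and the other to the strong category.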
Table 1a reports the summary statistics of clusters measured by the DBI. Each year, there are about 1500-2000 industrial clusters in all counties in China, accounting for 5% of the observed number of county-industries (given α = 5). These clusters comprised more than 30% of the manufacturing firms and contributed around 30-40% of the national industrial output and employment from 1998 to 2007. The summary statistics of variables related to cluster existence and strength are reported in Table 1b.
When estimating the effect of clusters on county-industry TFP, we control for a set of firm characteristics, including Average firm age, Average firm size, Average firm state-ownership, and Average firm leverage of the firms within each county-industry. We also control for the size effect of the local industry using County-industry employment, which is the total number of employees for each county-industry. We further include County per capita GDP and County total GDP in our regressions to control for the effects of regional development level and regional economic size. These data are from the China Socio-Economic Development Statistical Database.
Besides the above-mentioned firm and county-level factors, we also control for the industry diversity level of a county because the industry structure of a location may affect regional firm TFP and resource reallocation. The industry diversity level is measured by the HHI of industries' output in a given county each year.
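As an illustration of this diversity control, the sketch below computes the HHI as the sum of squared output shares across the industries of one county in one year. The function name and the input format are assumptions made for the example.

```python
def hhi(industry_outputs):
    """Herfindahl-Hirschman Index of output shares within one county-year.

    Ranges from 1/n (n equally sized industries, most diverse) to 1 (one industry).
    """
    total = sum(industry_outputs)
    if total <= 0:
        raise ValueError("total output must be positive")
    return sum((q / total) ** 2 for q in industry_outputs)

diverse_county = hhi([25.0, 25.0, 25.0, 25.0])   # four equally sized industries
specialized_county = hhi([100.0, 0.0, 0.0])      # output concentrated in one industry
```

A lower value indicates a more diversified county, which is why the text reads a lower HHI in cluster counties as lower industrial concentration.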
In addition, local governments play an essential role in shaping a region's institutions, which may affect TFP and resource reallocation in a locality. We therefore control for market-based institutions in a region. Specifically, we control for the overall marketization index score (Marketization index1) obtained from the China Marketization Index (CMI, 1998-2007). The CMI is a composite index based on an annual assessment of 25 factors across five aspects of marketization in each province, including the relationship between market and government, the development of the non-public economy, the development of the product market, the development of the factor market, and the development of legal and market services (Fan et al., 2011).
Finally, in the panel estimations, we also include year dummies and county×industry dummies to control for time trends and time-invariant heterogeneities across county-industries. Detailed definitions of our variables are summarized in Appendix C.
Our sample covers firms in more than 2800 counties and 39 two-digit industries in China from 1998 to 2007. During our sample period, some counties changed their names or jurisdictional boundaries. We identify the changes and convert the corresponding county codes into a benchmark system. In addition, China modified its industry coding system in 2002 (from GB/T 4754-1994 to GB/T 4754-2002). Therefore, we track the four-digit industry codes that have become either more disaggregated or more aggregated after 2002 and use the more aggregated codes to group the industries from 1998 to 2007.
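One way to implement the "use the more aggregated codes" rule is to treat the concordance between the two coding systems as a graph and merge every set of codes linked by a split or a merger into one harmonized group. The sketch below does this with a small union-find structure. The procedure and the four-digit codes in the example are hypothetical; the text does not specify how the grouping was carried out.

```python
def harmonize(crosswalk):
    """Group industry codes that are linked by splits or mergers across the 2002 revision.

    crosswalk: iterable of (old_code, new_code) pairs from a concordance table.
    Returns a dict mapping (system, code) to the representative of its group.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for old_code, new_code in crosswalk:
        root_old = find(("1994", old_code))
        root_new = find(("2002", new_code))
        if root_old != root_new:
            # Keep the smaller tuple as the group representative.
            parent[max(root_old, root_new)] = min(root_old, root_new)

    return {node: find(node) for node in list(parent)}

# Hypothetical concordance: code 1711 splits in two, codes 2010 and 2020 merge.
pairs = [("1711", "1711"), ("1711", "1712"), ("2010", "2011"), ("2020", "2011")]
groups = harmonize(pairs)
```

Every firm-year observation can then be assigned the representative of its code's group, so that a split or merged industry is treated as a single industry throughout 1998 to 2007.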
Table 2 reports the summary statistics of dependent and control variables for clusters and non-clustered county-industries (non-clusters). On average, clusters have significantly higher aggregate, average, and reallocation TFP than non-clusters. Furthermore, firms in clusters tend to be younger and larger, with less state ownership and a lower leverage ratio than firms outside clusters. Finally, counties with clusters are more likely than counties without clusters to have higher per capita and total GDP and a lower level of industrial concentration as measured by HHI.

Findings on industrial clusters and resource reallocation across firms
In the following subsections, we estimate cluster effects on aggregate, average, and cross-firm resource reallocation TFP. The identification issue is addressed by two-stage regressions.

Industrial clusters and reallocation TFP
Eq. (2) is our baseline regression model. We use it to estimate the effect of clusters on local-industry productivity.
$TFP_{jkt}$ is measured by $tfp^{AGG}_{jkt}$, $tfp^{AVG}_{jkt}$, or $tfp^{RAL}_{jkt}$, which are the decomposed county-industry level TFP elements. These TFP elements are calculated with eq. (1) based on firm-level TFP, measured by TFP_ols, TFP_op1, and TFP_op2, respectively. $Cluster_{jkt}$ is the DBI cluster dummy variable that equals 1 if firms from industry j have formed an industrial cluster in county k in year t, and 0 otherwise.
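For reference, the variable definitions above, together with the controls and fixed effects described in the data section, suggest a specification of the following form. This is a sketch rather than the exact eq. (2): $Z_{jkt}$, $\theta_{jk}$, and $\theta_t$ follow the notation used later for the markup regressions, while $\beta_0$, $\beta_1$, $\gamma$, and $\varepsilon_{jkt}$ are symbols introduced here for illustration.

```latex
TFP_{jkt} = \beta_0 + \beta_1\, Cluster_{jkt} + Z_{jkt}'\gamma + \theta_{jk} + \theta_t + \varepsilon_{jkt}
```

Under this reading, $\beta_1$ is the coefficient of interest, $\theta_{jk}$ absorbs time-invariant county×industry heterogeneity, and $\theta_t$ absorbs common year effects.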
The baseline regression results are reported in Table 3. The cluster variable $Cluster_{jkt}$ is always significantly associated with higher county-industry aggregate productivity $tfp^{AGG}_{jkt}$ regardless of how firm-level TFP is measured. The increased aggregate productivity in clusters seems to be derived from higher average firm productivity $tfp^{AVG}_{jkt}$ and more efficient resource reallocation across firms within clusters $tfp^{RAL}_{jkt}$. In particular, except for the case of TFP_ols (where TFP is estimated by the OLS method), $Cluster_{jkt}$ is significantly associated with higher reallocation productivity $tfp^{RAL}_{jkt}$. One possible mechanism behind this phenomenon is the expansion of higher-productivity firms and the exit of lower-productivity firms in clusters, which we study further in the next section. Models (6) and (9) indicate a 2% increase in reallocation TFP per year between 1998 and 2007. Given that the mean of the reallocation TFP (measured by TFP_op1 or TFP_op2) is approximately 0.235 during this period, industrial clusters alone can explain 8.5% of the increase in reallocation TFP in county-industries of China.
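The 8.5% figure can be read as the ratio of the cluster coefficient to the sample mean of reallocation TFP. A quick check, assuming the coefficient in Models (6) and (9) is about 0.02 (the value implied by the 2% statement; Table 3 gives the exact estimates):

```latex
\frac{\hat{\beta}_1}{\overline{tfp^{RAL}}} \approx \frac{0.02}{0.235} \approx 0.085 = 8.5\%
```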
Tables 4a and 4b report the effects of clusters with different strengths [...]. For instance, for aggregate productivity calculated using the OLS method, the coefficient of $Cluster_{jkt}$ (not differentiating strength) is about 0.222. It is larger than the coefficient of weak clusters ($Strength\_V_{jkt}\_1$, β = 0.196) but smaller than the coefficient of strong clusters ($Strength\_V_{jkt}\_2$, β = 0.283). As for reallocation productivity, no matter how TFP is measured, the coefficients of $Strength\_V_{jkt}\_1$ are always insignificant. In contrast, the coefficients of strong clusters, $Strength\_V_{jkt}\_2$, are always positive and highly significant. When productivity is measured by TFP_op1 or TFP_op2, the coefficients of $Strength\_V_{jkt}\_2$ are about twice as large as those of $Cluster_{jkt}$ (see Table 3). As shown in Models (6) and (9), the coefficient of [...]. The magnitudes of the coefficients of $Strength\_E_{jkt}\_1$ are the smallest. Weak clusters do not seem to have a significant effect on reallocation productivity. Strong clusters, on the other hand, have a positive and significant impact on the reallocation TFP, and the coefficients of $Strength\_E_{jkt}\_2$ range from 0.024 to 0.048, depending on how TFP is measured.

Table 1a
Summary statistics of identified industrial clusters and their contribution to the national economy.
Table 4a
Cluster strength (measured by output) and county-industry productivity.

The results in Table 3 and Tables 4a and 4b imply that industrial clusters, especially clusters with a strong presence in output and firm establishment, are associated with higher aggregate productivity, and that this increase comes not only from higher average firm productivity but also from improved resource reallocation efficiency across firms within the local industry. The significant correlation between clusters and improved productivity is interesting, but that alone is insufficient for inferring causality between clusters and productivity improvement. Alternative explanations for the baseline estimations remain. For example, one could argue that the existence or strength of the cluster is a result rather than the cause of the productivity improvement, although such a possibility is slim in China, given that the location choices of firms are generally constrained by institutions. Moreover, omitted variables, such as the local entrepreneurship culture or the management or production skills of the local people, may have contributed to the improved productivity and the rise of industrial clusters simultaneously.
To address such concerns, we employ two-stage estimations using two IVs to identify the effect of clusters. Specifically, the first IV we use is the per capita number of non-government organizations (NGOs) in each city in a given year. The number is the per capita sum of the stock of three types of NGOs: foundations, private non-enterprise entities, and social groups. As we have discussed, a key feature of Chinese industrial clusters is a large number of small specialized firms linked through subcontracting networks to produce the final product. With repeated close interactions within the cluster, trust is built between members, forming the basis for coordination and mutual support among firms in many ways. Social trust and social coordination are therefore essential elements in the development of Chinese industrial clusters (Weitzman and Xu, 1994; Xu and Zhang, 2009). Many empirical studies have shown substantial evidence of the positive impact of NGOs on fostering social trust and community coordination (Coleman, 1990; Putnam, 1993; Fukuyama, 1995; Knack and Keefer, 1997). In the case of China, NGOs were abandoned entirely after 1949 and before the economic reforms. However, they recovered gradually from the mid-1980s and grew dramatically from 1998 onwards when the central government relaxed the laws regulating the non-government sector (Teets, 2009). These NGOs are involved in various activities such as education, poverty alleviation, environment, health, and community development and provide services to vulnerable groups such as the rural poor, migrants, and women. Given the critical role that social trust and social coordination play in fostering the
development of industrial clusters, we expect that the presence of industrial clusters is associated with the development of NGOs in a region. On the other hand, the development of NGOs in a region should not directly influence the average firm-level TFP or resource reallocation TFP in a specific industry of a county unless the impact is achieved through the presence of industrial clusters in that industry. Above all, it is well documented that the objectives of NGOs are quite different from those of for-profit firms. More importantly, the major dependent variables in this study are the average firm TFP and reallocation TFP at the county-industry level. NGOs may be more active in areas where TFP is higher. However, this in no way suggests that NGO development in a city is directly related to the aggregate or average TFP of firms in a particular industry in a county, unless this association is achieved through an increase in the county's economic performance caused by the development of an industrial cluster in that specific industry. Moreover, it is even harder to relate the development of NGOs in a city to the resource reallocation of firms within a specific industry in a county, which is the key focus of this study. We therefore suggest that the density of NGOs in a city is a qualified IV.
The second IV we use is the per capita inter-city passenger traffic, measured by the ratio of the total number of inter-city passengers by all means of transportation (including embarking and disembarking) to the total population of each city in each year. Inter-city passenger traffic should be closely related to the presence of industrial clusters. First, the development of industrial clusters is often associated with a large wave of population migration, increasing a region's per capita passenger traffic (Fu and Gabriel, 2012; Kerr et al., 2017). Second, more business-related passenger flows will naturally occur with the increased trade activity led by industrial clusters. At the same time, we suggest that this IV satisfies the exclusion condition, that is, that per capita passenger traffic is exogenous to the error terms of the estimations for the average firm TFP or reallocation TFP. While the relationship between productivity and inter-city passenger traffic in a region is evident in both theoretical and empirical studies, all available research suggests that this relationship is achieved through agglomeration, regardless of whether the relationship is positive or negative or whether the agglomeration is defined by urbanization or specialization (Beeson, 1990; Ciccone and Hall, 1996; Eberts and McMillen, 1999; Fujita et al., 1999; Combes and Overman, 2004; Graham, 2007; Ke, 2010). Therefore, we suggest that this IV satisfies both the relevance and exclusion conditions.
The two-stage estimations for aggregate, average, and reallocation productivity are reported in Table 5. With two IVs introduced into the two-stage estimates, we provide statistical evidence for both the relevance and exogeneity of the IVs. Panel A presents the results for the first-stage regressions. Consistent with our expectation, both Per Capita Passenger Traffic and Per capita NGO are significantly and positively correlated with the existence of industrial clusters, confirming the relevance of the IVs. Meanwhile, the Sargan-Hansen overidentification tests do not reject the validity of the two IVs, which supports the exclusion condition. The second-stage regression results are reported in Panel B. In all the regressions, the coefficients of the instrumented cluster dummy are positive and significant, except for the average TFP calculated following Olley and Pakes (1996) with an industry-fixed effect. Such results confirm that industrial clusters can increase overall productivity, average productivity, and resource reallocation efficiency across firms of the local industry. In sum, the results of the two-stage estimates are consistent with those of the baseline regressions. Thus, the causal relationships between industrial clusters and increased aggregate, average, and reallocation productivity are confirmed.
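The logic of the two-stage procedure can be illustrated with a minimal, self-contained sketch. It is not the specification behind Table 5: it uses one continuous regressor, two instruments, no control variables or fixed effects, and a tiny synthetic dataset constructed so that the error term is correlated with the regressor but exactly orthogonal to both instruments. All names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(columns, y):
    """OLS coefficients of y on an intercept plus the given regressor columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)

def two_stage_least_squares(x, y, instruments):
    """Stage 1: project x on the instruments. Stage 2: regress y on the fitted x."""
    first = ols(instruments, x)
    x_hat = [first[0] + sum(c * z[i] for c, z in zip(first[1:], instruments))
             for i in range(len(x))]
    return ols([x_hat], y)  # [intercept, slope]

# Synthetic data: u is the unobserved confounder, z1 and z2 are the instruments.
z1 = [1, 1, 1, 1, -1, -1, -1, -1]
z2 = [1, 1, -1, -1, 1, 1, -1, -1]
u = [1, -1, 1, -1, 1, -1, 1, -1]
x = [0.5 * a + 0.3 * b + e for a, b, e in zip(z1, z2, u)]
y = [1.0 + 2.0 * xi + e for xi, e in zip(x, u)]  # true slope is 2

naive = ols([x], y)
iv = two_stage_least_squares(x, y, [z1, z2])
```

In this constructed example the naive OLS slope is biased upward because x and the error term move together, whereas the two-stage estimate recovers the true slope. In the paper's setting, the same two steps are applied with the cluster dummy as the instrumented variable and the full set of controls and fixed effects.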

Additional robustness checks
Using the two-stage estimates, we have provided statistical evidence for the causal relationship between industrial clusters, productivity, and resource reallocation. In this subsection, we conduct additional robustness checks to rule out alternative explanations for the estimation results and further establish causality.

Table 4b
Cluster strength (measured by establishment number) and county-industry productivity.

First, we check whether our results still hold if we include alternative variables measuring market-based institutions in a region. Specifically, we replace the original overall marketization index with two sub-indices of the CMI, i.e., the development of the factor market (Marketization index2) and the protection of property rights (Marketization index3) in a province, which are particularly relevant to a locality's productivity and resource allocation. Tables A-1a and A-1b present the estimates in which the two sub-indices are controlled, respectively. As shown in the tables, the significantly positive relationships between the existence of industrial clusters and both average firm TFP and resource reallocation TFP remain robust no matter which specific marketization index we include in the estimations.
Second, to further confirm the causal relationship between clusters and the productivity and resource allocation efficiency of a local industry, we add the interaction terms of the marketization indices and the existence of industrial clusters to the estimation. The rationale is as follows. We emphasize that industrial clusters help firms overcome various institutional constraints in China; if this is indeed the case, we should observe that the impact of industrial clusters diminishes in regions with better institutions, because firms there, whether in clusters or non-clusters, face fewer institutional constraints than those in regions with weaker institutions. So if the coefficients of the interaction terms of marketization indices and the existence of industrial clusters are negative in the estimates of average and reallocation TFP, this would support the proposed causal channel. We present the estimates in Tables A-2a, A-2b and A-2c, in which we add the interaction terms of industrial clusters and three marketization indices (i.e., the overall marketization index and the indices measuring the development of the local factor market and the protection of property rights in a province in a given year), respectively. As shown in the tables, both the existence of industrial clusters and the marketization indices remain significantly and positively related to the average firm TFP and reallocation TFP within a local industry. However, the coefficients of the interaction terms are consistently and significantly negative, supporting our propositions. Such results suggest that in regions where institutions are better, the gaps in average firm TFP and reallocation TFP between clusters and non-clusters are narrower, confirming that by overcoming institutional constraints, industrial clusters help improve local firm TFP and the resource reallocation among firms in a locality.
Third, the industrial profile varies across locations, so that some regions may have a concentration of heavy industries and others a concentration of high-tech industries. As a result, the productivity of different industries may vary considerably. As our cluster measurement is derived from the density of firms, one may be concerned with the impact of industry distribution on the productivity of localities. Although we have controlled for county industry HHI in our analysis, we further run a set of regressions in which we control for the three largest industries in each county. Specifically, for each county in a given year, we first identify the three largest industries in terms of output. We then include a set of dummy variables in the regressions indicating whether the industries concerned belong to these largest ones. The results in Table A-3 show that with the three largest industries controlled, the effects of clusters on the aggregate, average, and resource reallocation TFP remain robust.
Fourth, another concern is the impact of megacities on clusters and productivity. It is known that all Chinese megacities are located in coastal areas. Therefore, it might be possible that the cluster effect in improving productivity we discover is driven by the megacities, which are clusters at a much larger scale than those defined in our study. In our baseline and two-stage estimations, we have controlled for county×industry fixed effects. However, if the megacity effects overwhelm the county-industry effects, the effects of clusters we have observed from the baseline estimations may have been inflated. To address such concerns, we run two sets of additional regressions. First, we look at the subsample of the counties located outside the megacities (Beijing, Shanghai, Guangzhou, and Shenzhen). Second, we control for the megacity effects directly, measured by the population of the megacities. The estimates presented in Tables A-4a and A-4b show that with the effects of megacities controlled, the results we present in the baseline estimations remain robust.
To summarize, the estimations presented in Tables A-1a to A-4b confirm that clusters improve the aggregate and average productivity of firms; moreover, they reduce resource misallocation across firms by improving resource reallocation efficiency.

Mechanism: industrial clusters and local-industry competition
In this section, we explore how industrial clusters in China alleviate the problem of resource misallocation across firms and improve productivity. Previous studies on Chinese clusters suggest that clusters may have reduced entry barriers and improved competition within the local industries (Huang et al., 2008; Long and Zhang, 2011; Xu and Zhang, 2009). Specifically, clusters deepen the division of labor in the production process, which makes it possible for small entrepreneurial firms to enter the market by focusing on a narrowly defined production stage. However, previous studies are based on either case studies or cross-sectional data, and competition is not systematically measured. In the following, we systematically evaluate the pro-competitive effects of industrial clusters by investigating their relationship with firm entry and exit patterns and how they affect firm markup dispersion within the local industries.

Industrial clusters and firm entry and exit
To examine firm entry and exit within clusters, we follow Dunne et al. (1988) to calculate firms' entry and exit patterns in all county-industries in China. We then compare the statistics between clusters and non-clusters. The entry and exit statistics are defined as follows:
$NE_{jkt}$ = number of firms that enter industry j of county k between years t-1 and t;
$NT_{jkt}$ = total number of firms in industry j of county k in year t, including firms that enter industry j of county k between years t-1 and t;
$NX_{jkt-1}$ = number of firms that exit industry j of county k between years t-1 and t;
$QE_{jkt}$ = total output of firms that enter industry j of county k between years t-1 and t;
$QT_{jkt}$ = total output of all firms in industry j of county k in year t;
$QX_{jkt-1}$ = total output of firms exiting industry j of county k between years t-1 and t.
The entry rate (ER), exit rate (XR), and firm turnover rate of industry j in county k between years t-1 and t are defined from these counts; the turnover rate, for example, is $Turnover_{jkt} = (NE_{jkt} + NX_{jkt}) / NT_{jkt}$. Moreover, to measure the relative size of the new entrants and exiting firms, we calculate the average size of entering firms relative to incumbents (ERS) and the average size of exiting firms relative to non-exiting firms (XRS).
Table 6 reports the comparative statistics of firm entry and exit patterns within clusters and non-clusters based on the ASIFP data. The results indicate that firm entry and exit are more active in clusters than in non-clusters. The entry rate in clusters (ER = 0.3118) is more than twice as high as that in non-clusters (ER = 0.1429). Furthermore, the exit rate in clusters (XR = 0.1646) is significantly higher than that in non-clusters (XR = 0.0879). These results suggest a higher level of competition within industrial clusters, since firm turnover is significantly higher. Moreover, on average, the higher turnover in clusters seems to be mainly driven by small firms, given that the mean values of ERS and XRS are smaller than 1 in clusters and much lower than those in non-clusters.
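The entry, exit, and turnover measures built from the counts defined above can be sketched as follows. The turnover rate uses the formula given in the text. The expressions for ER, XR, ERS, and XRS are assumptions that follow the conventions of Dunne et al. (1988), with last year's firm count as the denominator of the rates; the function name and the example numbers are hypothetical.

```python
def entry_exit_stats(ne, nx_prev, nt, nt_prev, qe, qt, qx_prev, qt_prev):
    """Entry and exit measures for one county-industry between years t-1 and t.

    ne, qe: number and output of entrants in year t
    nx_prev, qx_prev: number and year t-1 output of firms exiting after year t-1
    nt, qt: all firms and their output in year t
    nt_prev, qt_prev: all firms and their output in year t-1
    """
    er = ne / nt_prev                  # assumed: entrants relative to last year's firms
    xr = nx_prev / nt_prev             # assumed: exiters relative to last year's firms
    turnover = (ne + nx_prev) / nt     # (NE + NX) / NT, as stated in the text
    # Assumed: average entrant size relative to average incumbent size in year t.
    ers = (qe / ne) / ((qt - qe) / (nt - ne))
    # Assumed: average exiter size relative to average non-exiter size in year t-1.
    xrs = (qx_prev / nx_prev) / ((qt_prev - qx_prev) / (nt_prev - nx_prev))
    return {"ER": er, "XR": xr, "Turnover": turnover, "ERS": ers, "XRS": xrs}

# Hypothetical county-industry: 100 firms last year, 20 exit, 30 enter, 110 this year.
stats = entry_exit_stats(ne=30, nx_prev=20, nt=110, nt_prev=100,
                         qe=300.0, qt=2200.0, qx_prev=100.0, qt_prev=2000.0)
```

Values of ERS and XRS below 1, as in this example, mean that entrants and exiters are on average smaller than the firms they are compared with, which is the pattern the text reports for clusters.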
Table 7 reports the regression results for the effects of clusters on the firm entry and exit of a county. As shown in columns (1) and (2), the existence of clusters is significantly and positively associated with the total number of new entrants and exiting firms. Meanwhile, columns (3) and (4) show a similar pattern of cluster effects on firms' entry and exit rates. On average, a county with a cluster in a given industry may have 4.2 more new entrants and 1.8 more exiting firms in the industry compared to a county without a cluster in this industry, after controlling for factors such as the average age, size, ownership, and leverage of firms within the county-industry as well as the size, development level, marketization level, and industry diversity of the counties. Similarly, on average, a county with a cluster in a given industry has a 0.029 (19.1% of the mean) higher entry rate and a 0.019 (20.6% of the mean) higher exit rate of firms than a county without a cluster in the industry. Finally, column (5) shows that the turnover rate of firms is significantly higher in counties with a cluster in a given industry than in those without one. On average, a county with a cluster in a given industry has a 0.027 (5.7% of the mean) higher firm turnover rate than a county without a cluster.
Tables 8a and 8b show the relationships between the strength of the clusters and firm dynamics. Again, the strength of the clusters, whether measured by total output or by the number of firms in the industry, is significantly and positively correlated with the number of new entrants and exiting firms and with firm turnover. Overall, the results of Tables 7, 8a and 8b confirm our conjecture that industrial clusters expose firms to greater competition and therefore facilitate the reshuffling of market shares from less to more productive firms.
Results shown in Tables 7 and 8 are based on the ASIFP data, which only contain SOEs and non-state firms with annual sales of 5 million RMB or above. So, the "entry" into the panel data may include cases where an existing non-state firm's sales grow to exceed 5 million RMB, and the "exit" may include cases where an existing non-state firm's sales fall below 5 million RMB. As a further robustness check, we utilize another firm-level dataset from the State Administration for Industry and Commerce, which contains information on the establishment date and deregistration date (if applicable) of all the registered firms in China during our sample period from 1998 to 2007. Using this dataset, we can calculate the number of newly established and deregistered firms in each county-industry. Meanwhile, the registration database provides information on the registration capital of the firm at the time of incorporation, allowing us to estimate the financial situation of the startup firms.
However, due to data limitations, we do not have information on the surviving firms during this period. Therefore, we cannot calculate the firm entry rate, exit rate, or relative size to incumbents. Instead, we calculate the growth of firm entry or exit from the current to the following year. Table 9 reports the OLS results for the effects of clusters on these measures of firm entry and exit patterns based on the firm (de)registration data. As shown in the table, the existence of clusters is significantly and positively associated with the number of new entrants and exiting firms in a county-industry between 1998 and 2007 and with the growth rate of exiting firms. Moreover, the estimations on the registration capital show that the startup capital for firms in a county with a cluster in a given industry is lower than that for firms in a county without a cluster in this industry. Such results further confirm that clusters lower the entry barriers of firms and thereby intensify the competition in a locality.
Although we use registration data to further test the robustness of the estimates for the impact of industrial clusters on resource reallocation, we note that there may be self-selection, i.e., some firms choose to deregister themselves rather than being eliminated by market competition. To address such concerns, we conduct a series of robustness checks as follows.
Above all, a central question underlying our resource reallocation argument is whether less productive firms are more likely to lose market share or even drop out in clusters than in non-clusters. To directly address the concern of which firms lose market shares or are deregistered, we compare which types of active firms lose market shares inside and outside clusters using our ASIFP data. We suggest that such an empirical strategy provides further support for our argument. At the same time, it reduces the concern of self-selected deregistration (i.e., some entrepreneurs may choose to close their businesses for reasons other than low productivity). We discuss these further estimates below.
Firstly, as shown in Table A-5, a significant and positive correlation exists between firms' TFP and their market shares in clusters (p < 0.01), indicating that more productive firms gain more market share in clusters on average. However, in non-clusters, this relationship is not clear. In particular, when TFP is measured by the OP method, there seems to be a negative association between firm TFP and market share in non-clusters. Secondly, we focus on exiting firms' TFP in clusters and non-clusters. As we use the ASIFP data, exit comprises both "real exit" and a decrease in sales to below 5 million RMB for non-state firms. The results are presented in Table A-6. It shows that no matter how TFP is measured, exiting firms in clusters always have higher TFP than their counterparts in non-clusters. That is, firms with the same TFP are more likely to lose market share if they are in clusters. Thirdly, we compare the TFP difference between survivors and exiting firms in clusters and non-clusters, presented in Table A-7. It shows that within clusters, the gap in TFP between those surviving and those exiting the ASIFP is substantially smaller than the gap in non-clusters, which, to a certain extent, indicates that the competition among firms with different TFP levels is fiercer in clusters than in non-clusters. The summary statistics discussed above provide further evidence that in counties with clusters, less productive firms lose more market share than those in regions without clusters, though they may not necessarily be deregistered. Such results support the regression analysis of the relationship between industrial clusters and resource reallocation among the firms of specific industries within a county.

Table 8a
Cluster strength (measured by output) and firm entry and exit patterns.

Table 8b
Cluster strength (measured by establishment number) and firm entry and exit patterns.

Table 9
Industrial clusters and firm entry and exit patterns using firm (de)registration data.

Industrial clusters and firm markup dispersion within the local industry
Early inquiries into the relationship between resource misallocation and aggregate TFP tend to focus on the misallocation of production factors and generally assume that all firms have the same markup within industries (e.g., Hsieh and Klenow, 2009; Restuccia and Rogerson, 2008). In contrast, recent studies in international trade and industrial organization have developed models with endogenous markups (De Loecker and Warzynski, 2012; Edmond et al., 2015, 2018). In their models, markup increases with firm size, generating the same misallocation captured by Restuccia and Rogerson (2008) and Hsieh and Klenow (2009). In this sub-section, we explore whether China's industrial clusters mitigate markup dispersion, thus reducing resource misallocation.
The discussion of the efficiency costs of markups can be traced back to the study of Lerner (1934), which shows that in a world with markup dispersion, firms with higher markups employ resources at less-than-optimal levels, while those with lower markups produce more than optimal, resulting in efficiency losses (Opp et al., 2014). Furthermore, some recent papers (Baqaee and Farhi, 2020; Edmond et al., 2018) show that for heterogeneous firms engaging in monopolistic competition, in equilibrium, more productive firms will be larger, face less elastic demand, and so charge higher markups than less productive firms. In a more competitive environment where resources or market shares can be reallocated freely from less productive to more productive firms, more productive (and higher markup) firms will produce more, reducing their markups. Similarly, lower productivity (and lower markup) firms will produce less, increasing their markups. Hence, if China's industrial clusters provide a more competitive environment, there should be reduced firm markup dispersion within clusters compared with non-clustered local industries. Furthermore, within clusters, the individual firm markup at the higher quantiles should decrease while that at the lower quantiles should increase.
For this purpose, we first look at the relationship between firm size, productivity, and measured markup in our data. Following De Loecker and Warzynski (2012) and Lu and Yu (2015), we use firm sales to measure its size and calculate individual firm markup. Details of the calculation of markup are described in Appendix D. As shown in Table 10, consistent with Hsieh and Klenow (2009) and Edmond et al. (2015, 2018), the Chinese ASIFP data do feature strong positive correlations (p < 0.001) among the three variables: firm sales value is positively associated with firm productivity (measured by the three TFP indices), which is in turn positively associated with the markup it charges.
The following is the formal test of the effect of industrial clusters on the firm markup distribution within county-industries using panel regressions. We use two measures of firm markup dispersion to ensure the robustness of our results. The first is the Theil index,

$$\text{Theil}_{jkt} = \frac{1}{n_{jkt}} \sum_{i=1}^{n_{jkt}} \frac{y_{ijkt}}{\bar{y}_{jkt}} \log \frac{y_{ijkt}}{\bar{y}_{jkt}},$$

where $y_{ijkt}$ is the markup of firm $i$ of industry $j$ in county $k$ at year $t$, $\bar{y}_{jkt}$ is the average firm markup of industry $j$ in county $k$ at year $t$, and $n_{jkt}$ is the total number of firms in that county-industry in year $t$. The second measure of markup dispersion is the relative mean deviation ($\text{RMD}_{jkt}$) of each county-industry during our sample period.

In addition to investigating the effect of industrial clusters on firm markup dispersion, we also look into markup responses at different quantiles along the distribution. Specifically, we pin down the firm markup at the 10th, 25th, 50th, 75th, and 90th percentiles of each county-industry every year and then estimate the effect of clusters on firm markup at each percentile separately. In the estimation model, $Y_{jkt}$ is $\text{Theil}_{jkt}$, $\text{RMD}_{jkt}$, or the firm markup at one of these percentiles. $\text{Cluster}_{jkt}$ is the dummy indicator of a cluster in industry $j$ of county $k$ in year $t$. $Z_{jkt}$, $\theta_{jk}$, and $\theta_t$ are the same control variables and fixed effects as defined in Section 4.
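The two dispersion measures can be illustrated with a minimal sketch. The Theil index follows the formula given in the text. The relative mean deviation is written here in one common form, the mean absolute deviation of the markup ratio from one, which is an assumption because the exact definition is not reproduced in this section. The function names and the example markups are hypothetical.

```python
import math

def theil_index(markups):
    """Theil index: mean of (y_i / y_bar) * log(y_i / y_bar) over firms."""
    n = len(markups)
    mean = sum(markups) / n
    return sum((y / mean) * math.log(y / mean) for y in markups) / n

def relative_mean_deviation(markups):
    """Assumed RMD definition: mean of |y_i / y_bar - 1| over firms."""
    n = len(markups)
    mean = sum(markups) / n
    return sum(abs(y / mean - 1.0) for y in markups) / n

# Hypothetical markups for the firms of one county-industry in one year.
example_markups = [1.05, 1.10, 1.20, 1.35]
theil_value = theil_index(example_markups)
rmd_value = relative_mean_deviation(example_markups)
```

Both measures equal zero when all firms charge the same markup and grow as markups spread out, which is why a negative cluster coefficient is read as reduced dispersion.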
Table 11 reports the regression results on the effect of clusters on markup dispersion and markup distribution. Columns (1) and (2) demonstrate that firm markup dispersion is significantly lower in clusters than in non-clustered county-industries. Given that the means of $\text{Theil}_{jkt}$ and $\text{RMD}_{jkt}$ are 0.0065 and 0.047, industrial clusters alone can explain 15.38% and 19.15% of the decrease in $\text{Theil}_{jkt}$ and $\text{RMD}_{jkt}$, respectively. The remaining columns illustrate that the cluster effects on firms' markups vary with size. For smaller firms at the lower percentiles (Columns 3 and 4), the cluster effect is significant and positive, implying an increase in these firms' markups. For larger firms at the higher percentiles (Columns 6 and 7), the effect is the opposite, i.e., significant and negative, indicating a reduction in their markups. For middle-sized firms (Column 5), the cluster effect is insignificant. These findings indicate that China's clusters offer a more competitive environment, which reduces the gap in markups between large and small firms and mitigates resource misallocation across firms within clusters.
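The percentages above can be checked with a back-of-the-envelope calculation. The sketch below assumes that each percentage is the absolute cluster coefficient divided by the sample mean of the outcome; this is the same convention under which the paper's reallocation TFP coefficient of about 0.04 against a mean of about 0.235 gives roughly 17%. The function names are hypothetical, and the implied coefficient magnitudes are derived from the quoted shares and means rather than read from Table 11.

```python
def share_of_mean(coefficient, sample_mean):
    """Absolute coefficient expressed as a fraction of the sample mean."""
    return abs(coefficient) / sample_mean

def implied_coefficient(share, sample_mean):
    """Coefficient magnitude implied by a reported share of the mean."""
    return share * sample_mean

# Figures quoted in the paper for reallocation TFP: coefficient ~0.04, mean ~0.235.
reallocation_share = share_of_mean(0.04, 0.235)              # roughly 0.17

# Figures quoted for Table 11: shares of 15.38% and 19.15%, means 0.0065 and 0.047.
theil_coef_magnitude = implied_coefficient(0.1538, 0.0065)   # roughly 0.001
rmd_coef_magnitude = implied_coefficient(0.1915, 0.047)      # roughly 0.009
```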

Discussions and conclusion
Based on a systematic analysis of county-industry panel data, we find that China's industrial clusters have significantly improved the productivity of local industries. The increased productivity comes not only from higher average firm productivity but also from more efficient resource reallocation across firms within the clusters. Additionally, we find concrete mechanisms through which China's clusters mitigate the resource misallocation problem. For example, there is a higher level of firm turnover within clusters than outside clusters, and the startup capital for firms within a cluster is lower than that outside a cluster. Moreover, firms' markups within clusters have smaller dispersion than those outside clusters. These findings imply that industrial clusters in China provide a more competitive environment that contributes to more efficient resource reallocation within clusters.
This study fills three important knowledge gaps in agglomeration research by linking regional TFP and resource reallocation among firms within clusters. The three gaps are the disconnection between the agglomeration literature and the recent regional growth literature, the lack of understanding of the heterogeneity of agglomeration effects, and the under-investigation of Porter's competition arguments in regional growth settings. The study sheds light on some challenging questions in agglomeration and development studies. Specifically, by estimating regional aggregate TFP and TFP changes driven by resource reallocation, we capture the spillover effects generated by agglomeration in the absence of structural changes in input factors, thus bringing the empirical estimates closer to the theoretical predictions of the agglomeration literature. Moreover, such results may help explain the heterogeneous effects of agglomeration (e.g., Arzaghi and Henderson, 2008; Lee, 2018; Rosenthal and Strange, 2003; Speldekamp et al., 2020). In particular, they help explain a challenging paradox found in existing studies, namely, the seemingly contradictory observations that higher performance and lowered survival rates exist simultaneously in clusters (e.g., Folta et al., 2006; Myles Shaver and Flyer, 2000; Staber, 2001). Moreover, the findings on the lowered markup dispersion and the accelerated firm turnover within clusters suggest that both entry and exit barriers are lowered within clusters in China. The study therefore enriches the evidence for Porter's (1990) competition arguments in two ways. First, it is among the few empirical studies examining agglomeration effects on regional growth by applying Porter's competition framework. Second, it provides a rigorous analysis of firm dynamics and markup dispersion among firms within a region simultaneously for the first time in the agglomeration literature. By adding this evidence, this study contributes to the existing theoretical debates on whether the benefits of competition may offset the costs of competition within clusters in certain contexts.
Finally, it complements existing discussions on economic growth by identifying agglomeration, a particular type of production organization, as a contributor to resource reallocation for the first time. It therefore adds some thoughts on why firms choose to co-locate in some cities with extremely high production costs and why the variations in economic development levels persist across regions (e.g., Banerjee and Duflo, 2005; Gancia and Zilibotti, 2009; Hsieh and Klenow, 2009; Restuccia and Rogerson, 2008).
The findings from this study have significant policy implications. Many governments worldwide try to promote local economic growth by providing incentives for firms to co-locate, believing that firms may benefit from agglomeration spillovers. However, our findings suggest that firms do not benefit equally from agglomeration. Therefore, policymakers should consider the industry structure, general environment, and organizational structure of firms in a region when designing urban or agglomeration policies.

Some limitations of this study are worth mentioning. Above all, the estimations run only up to 2007 because of the constraints of data sources. Both the institutional environment within China and global trade have experienced significant changes since then, so more recent data may yield further insights into the institutional impacts on the effect of agglomeration on regional growth in the country. However, it is important to note that the theoretical nature of the research questions addressed in this study and the corresponding empirical findings (i.e., that lowered entry and exit barriers and intensified competition within clusters may help improve resource reallocation and thereby improve regional productivity) are time- and context-independent. Therefore, we suggest that such limitations should not significantly affect the contributions of this study. In addition, the primary sample of this study does not cover micro non-state-owned firms with sales below RMB 5 million, which may have caused some potential biases. However, the robustness checks based on the 2004 ECD, which covers all business entities in China, provide evidence that this data issue does not change our findings qualitatively. Finally, we focus only on manufacturing firms in this study because of data constraints. If the effects of agglomeration on the service sector can be studied in future research, our understanding will be further improved.
Several challenging questions arising from our findings require further research. One of our major findings is that industrial clusters enhance regional productivity through improved resource reallocation in China. Our findings also suggest that accelerated firm turnover and intensified competition within clusters serve as a potential channel for achieving such effects. Alternative mechanisms, and the reasons why particular mechanisms are more prevalent than others in certain regions, require much more research. At the same time, the question of which specific resources are more likely to be reallocated under agglomeration deserves more scrutiny. Further studies on the characteristics of firms entering and exiting clusters under different conditions may enrich our understanding of why some clusters last long while others decline with age. Furthermore, a systematic analysis of which aspects of technological progress or organizational learning are more likely to be affected by agglomeration, and how such effects are related to the net effects of agglomeration, deserves more investigation. Finally, our findings indicate that the collaboration between entrepreneurs and local governments to overcome institutional constraints is important for the development of clusters in China. Therefore, it is worth investigating, through comparative analysis, whether it is also an important element in a market economy, to further contribute to the literature on economic development and institutions.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
the different probabilities of exits for small and large low-productivity firms.

B.1. OLS method explanation
The OLS method is straightforward. In the OLS regression, the TFP, which denotes the effects on total output that are not caused by the tangible inputs in production and represents technological dynamism, is estimated as the error term.
The equation below demonstrates the estimation of TFP through the OLS method.
As the residual of the OLS regression, lnA is the TFP we intend to measure. The firm-level TFP estimation includes year and two-digit SIC code fixed effects. The robustness of the estimation results is verified by relaxing the year effects.
The disadvantage of the OLS method is that it only considers tangible inputs, such as labor and capital, and not unobservable shocks. This results in a static model in which all types of inputs are assumed to be exogenous and uncorrelated with the error term (i.e., TFP). When this assumption fails, the associated coefficients are biased, which is the main limitation of the OLS method.
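As an illustration of the residual-based approach, the following is a minimal sketch that regresses log output on a constant, log labor, and log capital and takes the residual as the lnA proxy. It is not the paper's exact specification: the year and two-digit SIC fixed effects are omitted for brevity, the choice of regressors is an assumption, and all function names and data are hypothetical.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_tfp(output, labor, capital):
    """Regress ln(output) on a constant, ln(labor), ln(capital); return
    the coefficients and the residuals (the ln A proxy for TFP)."""
    y = [math.log(v) for v in output]
    x = [[1.0, math.log(l), math.log(k)] for l, k in zip(labor, capital)]
    p = 3
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    residuals = [yi - sum(b * xi for b, xi in zip(beta, row))
                 for row, yi in zip(x, y)]
    return beta, residuals

# Hypothetical firm-level data (output, employees, capital stock).
output = [120.0, 300.0, 80.0, 500.0, 210.0]
labor = [10.0, 25.0, 8.0, 30.0, 20.0]
capital = [50.0, 90.0, 40.0, 200.0, 70.0]
beta, ln_tfp = ols_tfp(output, labor, capital)
```

Because the regression includes a constant, the residuals sum to zero by construction, so lnA here measures productivity relative to the sample average rather than in absolute terms.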

B.2. OP method explanation
In the presence of selection bias and simultaneity, the OP estimation allows for the endogeneity of some input factors and unobserved productivity differences among firms. Moreover, such an estimation also considers the exit of firms from the market. Hence, the OP estimation has several advantages over the simple OLS method.
The Olley and Pakes (1996) approach is characterized by a Bellman equation and assumes that the firm constantly maximizes the expected discounted value of future profits. On this basis, the stay-or-quit and investment decisions in each period are formulated.
For estimation purposes, this study uses the Cobb-Douglas production function. In particular, gross output and value-added production functions are adopted. The equation below denotes the production function in the OP method.
where $\text{Total Output}_{it}$ is deflated by the producer price index for manufactured products, $L_{it}$ is the labor input by firm $i$ at time $t$ (either the number of employees or the total payment to employees of a firm can be a proxy for this variable), $K_{it}$ is the capital input by firm $i$ at time $t$ and is deflated by the price index of investment in fixed assets, $I_{it}$ denotes the intermediate inputs by firm $i$ at time $t$ and is deflated by the producer price index for purchasing products, $w_{it}$ is the productivity shock known by a firm when it makes its liquidation and investment decisions, and $\varepsilon_{it}$ is the true error term.
In this study, all variables in the equations are in their logarithm form, and the time trend and two-digit industry heteroskedasticity are controlled.
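For orientation, a gross-output Cobb-Douglas specification consistent with the variable definitions above might be written as follows. This is a sketch rather than the paper's own equation: lowercase letters denote the logarithms of Total Output, L, K, and I, and the coefficient names are assumptions.

```latex
y_{it} = \beta_0 + \beta_l\, l_{it} + \beta_k\, k_{it} + \beta_m\, i_{it} + w_{it} + \varepsilon_{it}
```

Under this reading, the value-added variant would drop the intermediate-input term and use value added as the dependent variable.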

B.3. Data source
The firm-level data of this study, together with associated financial information, are derived from the Above-scale Industrial Firm Panel (ASIFP). ASIFP comprises virtually all the manufacturing firms in China, including all the state-owned enterprises and non-state firms with annual sales of at least 5 million RMB between 1996 and 2007. This database covers input information, such as labor, fixed assets, and intermediate inputs, as well as other firm-specific characteristics such as location, industry, and age. The dataset is an unbalanced panel with gaps.
Real capital stock is a prerequisite for TFP calculation, and its measurement is subject to debate. The lack of firm-level capital stock data makes it difficult to construct a series of real capital stocks that is comparable across time and firms. Following Brandt et al. (2012), the perpetual inventory method (PIM) is applied. Through the PIM, the effective capital stock in production is measured as a weighted sum of previous fixed asset investments in constant price terms.

$$RCS_t = \sum_{\tau \ge 0} d_\tau I_{t-\tau},$$
where $RCS_t$ is the real capital stock in year $t$, $d_\tau$ is the efficiency of fixed assets in the $\tau$th year, and $I_{t-\tau}$ is the fixed asset investment flow $\tau$ years ago.
With the additional assumption of $d_\tau$ declining in a geometric pattern, we write the PIM equation as follows:

This study formulates fixed asset growth at the two-digit SIC code level as a recursive step back to when a firm was established. Applying the preceding PIM, together with the series of investment deflators from the China Urban Life and Price Yearbook (2009), this study constructs the series of real capital stocks. Specifically, 1978 is set as the starting point of the initial capital stock for the series calculation, and a fixed depreciation rate of 9% is applied. Finally, all the nominal values are deflated by price indices with the benchmark of 100 set in 1996.
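A minimal sketch of the perpetual inventory calculation under geometric depreciation is given below. It assumes the efficiency weights take the form (1 - delta) to the power tau with delta = 9%, in which case the weighted sum is equivalent to the recursion K_t = (1 - delta) K_{t-1} + I_t. The function names and the investment figures are hypothetical, and the sketch does not reproduce the paper's back-casting of initial capital to the establishment year.

```python
def real_capital_stock(initial_stock, investments, depreciation_rate=0.09):
    """Recursive PIM: K_t = (1 - delta) * K_{t-1} + I_t, returning the K_t path."""
    stock = initial_stock
    path = []
    for investment in investments:
        stock = (1.0 - depreciation_rate) * stock + investment
        path.append(stock)
    return path

def weighted_sum_stock(investments, depreciation_rate=0.09):
    """Weighted-sum form with geometric weights d_tau = (1 - delta) ** tau."""
    t = len(investments) - 1
    return sum((1.0 - depreciation_rate) ** tau * investments[t - tau]
               for tau in range(t + 1))

# Hypothetical real investment flows (constant prices) for one firm.
investments = [10.0, 12.0, 15.0]
path = real_capital_stock(0.0, investments)
```

Starting from a zero initial stock, the last element of `path` coincides with `weighted_sum_stock(investments)`, which shows that the two formulations agree under the geometric assumption.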
In the OP model, a firm's decision on whether or not to remain in the market must be specified. However, this information is not contained in the dataset used by this study. Accordingly, the panel data themselves are used to construct this exit variable. Using the unbalanced panel data with gaps ranging from 1996 to 2007, we define a firm as exiting the market when its observation record is not continuous. The dummy variable exit is equal to 1 if the firm exits the market in the current period and 0 otherwise.
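One way to construct such an exit dummy from the panel itself is sketched below. It assumes that a firm is flagged as exiting in year t when it is observed in t but not in t + 1, and that the final sample year (2007) is never flagged because exit cannot be distinguished from the end of the data. The exact rule used in the paper may differ, and the function name and example years are hypothetical.

```python
def exit_indicators(observed_years, last_sample_year=2007):
    """Map each observed year to 1 if the firm is missing in the following
    year (and the sample has not ended), else 0."""
    years = set(observed_years)
    return {
        year: int(year < last_sample_year and (year + 1) not in years)
        for year in sorted(years)
    }

# Hypothetical firm observed in 1998-1999, absent in 2000, back in 2001 only.
flags = exit_indicators([1998, 1999, 2001])
```

Under this rule a firm that leaves and later re-enters the panel is counted as exiting at each break in its record, in line with the continuity-based definition in the text.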

Fig. A-1. Distribution of the DBI clusters across Chinese counties.

Table 1b
Summary statistics of variables related to cluster existence and strength.

Table 2
Summary statistics of dependent and control variables by clusters and non-clusters.
*** p < 0.01.

on local-industry productivity. As defined in Section 3, we construct two categorical variables to differentiate weak and strong clusters based on clusters' output value or establishment number. The first variable, $\text{Strength\_V}_{jkt}$, equals 0 for non-clustered county-industries. It equals 1 for clusters with a below-median contribution to national total industrial output compared with other clusters from the same industry and 2 for clusters with the median or above-median contribution to national total industrial production. As shown in Table 4a, for aggregate productivity and average productivity, no matter how TFP is measured, the coefficients of $\text{Strength\_V}_{jkt}\_1$ and $\text{Strength\_V}_{jkt}\_2$ are always positive and significant. Moreover, the magnitudes of the coefficients for $\text{Strength\_V}_{jkt}\_2$ are always larger than those of $\text{Strength\_V}_{jkt}\_1$, and are also larger than those of $\text{Cluster}_{jkt}$ (see Table

$\text{Strength\_V}_{jkt}\_2$ is about 0.04. Given that the mean value of reallocation TFP is about 0.235 during our sample period, the presence of strong clusters can explain 17% of the increase in reallocation TFP in the county-industries of China. Table 4b reports similar results when cluster strength is measured by its contribution to the national total establishment number. The variable of interest, $\text{Strength\_E}_{jkt}$, equals 0 for non-clustered county-industries. It equals 1 for weak clusters and equals 2 for strong clusters. Similarly, for aggregate productivity and average productivity, no matter how TFP is measured, the coefficients of $\text{Strength\_E}_{jkt}\_1$ and $\text{Strength\_E}_{jkt}\_2$ are always positive and significant. The magnitudes of the coefficients of $\text{Strength\_E}_{jkt}\_2$ are larger than those of $\text{Cluster}_{jkt}$ (see Table

Table 3
Industrial clusters and county-industry productivity.
To save space, we do not present all the control variables in the table; standard errors are clustered at the county-industry level.

Table 5
IV regressions for the effects of industrial clusters on county-industry productivity.
Note: To save space, we do not present all the control variables in the table; standard errors are clustered at the county-industry level. *** p < 0.01, ** p < 0.05, * p < 0.1.

Table 6
Firm entry and exit patterns within clusters and non-clusters using the ASIFP data.
Note: Standard errors are clustered at the county-industry level. *** p < 0.01.

Table 7
Industrial clusters and firm entry and exit patterns.
To save space, we do not present all the control variables in the table; standard errors are clustered at the county-industry level.

Table 10
Statistical correlation between firm size, productivity and markup.

Table 11
Industrial clusters and firm markup distribution within county-industries.

...ation, we first transpose, then multiply both sides by $M_{it}/Q_{it}$, and then multiply the RHS by $P_{it}/P_{it}$. We get the following equation, where $P_{it}$ is the price of the final good and $P_{it}/\lambda_{it}$ is the markup, defined as the ratio of price over marginal cost. We can then obtain the markup. The information on the expenditure on intermediate goods and on total sales is available in the ASIFP dataset. However, the output elasticity of intermediate goods must be obtained from the estimated production function. We follow Olley and Pakes (1996) to estimate the production function; the estimation is described in Appendix B on the estimation of TFP. Specifically, the output elasticity of intermediate goods is the first-order derivative of our production function with respect to intermediate goods.
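The resulting calculation can be sketched as follows, assuming the standard De Loecker and Warzynski (2012) result that the markup equals the output elasticity of intermediate inputs divided by the share of intermediate expenditure in total sales. The function name and the numbers are hypothetical; in practice the elasticity would come from the estimated production function.

```python
def firm_markup(output_elasticity, intermediate_expenditure, total_sales):
    """Markup = output elasticity of intermediates / their share in sales."""
    expenditure_share = intermediate_expenditure / total_sales
    return output_elasticity / expenditure_share

# Hypothetical firm: elasticity 0.8, intermediate expenditure 700, sales 1000.
markup = firm_markup(0.8, 700.0, 1000.0)
```

A firm whose intermediate expenditure share is smaller than its output elasticity therefore has a measured markup above one.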

Table A-2a
Interaction effects of marketization index and industrial clusters.
To save space, we do not present all the control variables in the table; standard errors are clustered at the county-industry level.

Table A-2b
Interaction effects of marketization index (development of factor market) and industrial clusters.
Note: To save space, we do not present all the control variables in the table; standard errors are clustered at the county-industry level. *** p < 0.01, ** p < 0.05.

Table A-2c
Interaction effects of marketization index (protection of property rights) and industrial clusters.
To save space, we do not present all the control variables in the table; standard errors are clustered at the county-industry level.

Robustness check that controls the three largest industries in the county.
To save space, we do not present all the control variables in the table; standard errors are clustered at the county-industry level.

Table A-4a
Robustness check using the subsample of counties outside megacities.
To save space, we do not present all the control variables in the table; standard errors are clustered at the county-industry level.

Table A
To save space, we do not present all the control variables in the table; standard errors are clustered at the county-industry level.

The correlation between the TFP and market share of firms inside and outside clusters.

The comparison of the TFP of firms exiting the ASIFP in clusters and non-clusters.

Table A-7
The average gap between the TFP of surviving and exiting firms in clusters and non-clusters.