Industry Location and Wages: The Role of Market Size and Accessibility in Trading Networks

We investigate the geographical distribution of economic activity and wages in a general equilibrium model with many asymmetric regions and costly trade. As shown by extensive simulations on random networks, local market size better explains a region's industry share, whereas accessibility better explains a region's wage. The correlation between equilibrium wages and industry shares is low, thus suggesting that the two variables operate largely independently. The model replicates the spatial distribution of industry in Spanish data well, yet overpredicts changes in that distribution due to changes in 'generalized transport costs'. The latter had only small impacts on changes in the geographical distribution of economic activity in Spain from 1980 to 2007.


Introduction
Do market size and accessibility matter for industry location and wages? This question has attracted substantial attention in the literature ever since Krugman's (1980) and Helpman and Krugman's (1985) seminal contributions to new trade theory. The answer is 'yes', at least in simple models. It has indeed been shown that, in a world with increasing returns and costly trade, market size and 'accessibility' are locational advantages that significantly influence the geographical distribution of industry and regional factor prices. 1 Despite those fundamental insights of new trade theory and new economic geography, it is fair to say that the models in which the results have been derived rely on a number of highly restrictive assumptions. Those assumptions include, among others: (i) the existence of a costlessly tradable good; (ii) a single production factor; (iii) constant elasticity of substitution (ces) preferences; (iv) two industries only, with one producing a homogeneous good; and (v) two locations only. Though necessary to derive clear-cut results, those assumptions imply that little is known about the robustness of the results and about how they can guide empirical analysis.
Conscious of those limitations, and of the fact that a better understanding is required to push empirical work on that topic further, much subsequent work has started to relax some of those assumptions. First, Ottaviano and Thisse (2004), Picard and Zeng (2005), Zeng and Kikuchi (2009), Baldwin, Martin, Forslid, Ottaviano, and Robert-Nicoud (2003), Head, Mayer, and Ries (2002), and Yu (2005), among others, have shown that the basic insights of 'home market effects' (hme) generalize to other preference structures, such as quadratic-linear preferences, or other market structures, such as oligopolistic competition. Second, Davis (1998) and Picard and Zeng (2005) have shown that the effect of market size on industry location is strongly dampened (or even disappears) when the homogeneous good is not costlessly tradable. Davis (1998), in particular, shows that when trading the homogeneous good is as costly as trading the differentiated good, market size no longer has any bearing on regional specialization. This is also one basic message of Hanson and Xiang (2004), who argue that, in the absence of a costlessly tradable good, not all increasing returns sectors can be disproportionately present in one region, i.e., display 'home market effects'. Zeng and Kikuchi (2009), and Takahashi, Takatsuka, and Zeng (2013) derive analytical results in the case without factor price equalization (fpe), but only with two regions. 2 Behrens, Lamorgese, Ottaviano, and Tabuchi (2009) use a 'hybrid' approach, where trading the homogeneous good is costless, but where exogenous Ricardian differences in labor productivity in the homogeneous sector across countries create exogenous wage differences. Though conceptually simple and applicable to multiple countries, that approach does not allow for endogenous wage changes in response to changes in economic fundamentals.
Last, turning to multi-country extensions of those models, Behrens, Lamorgese, Ottaviano, and Tabuchi (2007, 2009) derive results for models with more than two countries. They show that the topology of the trading network matters for several of the results, and that the impact of market size on industry location arises only when differences in factor costs and in accessibility to markets are adequately controlled for. Although empirically very important, multi-location extensions of new trade models to arbitrary geographical structures have been very rare in the literature until now. 3 While all of the foregoing contributions shed some light on the role of market size and accessibility for industry location and wages, what is missing to date is more systematic evidence on what happens in more 'realistic settings' where several of the basic assumptions are relaxed simultaneously. To the best of our knowledge, there has been no systematic investigation of the case with multiple locations, several industries, and costly trade for all goods. This paper addresses precisely these issues. As there is no hope of obtaining clear-cut analytical results in the general case, we instead resort to systematic numerical simulations. More precisely, we simulate the equilibria of two different models using a large number of randomly generated networks with a large number of regions. We then run simple regressions to extract the essence of the 'comparative static' results that are out of reach of pencil-and-paper analysis. In a nutshell, our research strategy is to combine theory and numerical analysis to: (i) first prove some theorems in 'toy models'; (ii) then solve large-scale models by numerical analysis; (iii) then run a detailed statistical analysis of the numerical results, very much like engineers or physicists do; and (iv) finally confront the models with real data to use them for simulation purposes.
Our key findings can be summarized as follows. First, absolute local market size (as measured by population) and accessibility (as measured by centrality in the trading network) are crucial in explaining a region's wage. This result is due to the fact that absolute size and accessibility affect all industries in similar ways, i.e., constitute a region's absolute advantage. The effect is stronger and more systematic in models where all sectors are subject to transport costs and exhibit increasing returns to scale. Second, the relative local market size of industries (as measured by their consumer expenditure shares) is crucial in explaining a region's industrial composition. This result is due to the fact that relative spending patterns do not affect all industries in similar ways, i.e., constitute a region's comparative advantage. In a nutshell, and in line with Ricardian trade theory, absolute advantage translates into wages, whereas comparative advantage maps into specialization patterns. Third, the correlation between equilibrium wages and equilibrium industry shares is rather low, thus suggesting that both variables operate largely independently.
We then apply the models to Spanish data. Using 'Generalized Transport Costs' between regions as a measure of trade frictions, we find that the models generally predict the distribution of industries well, yet predict wages less well. A formal test does not allow us to reject the null hypothesis that the industry distribution predicted by the models is the same as that observed in the data. We then use the calibrated model for Spain to run two counterfactual exercises, the aim of which is to disentangle the impact of changes in accessibility and changes in market size on regional industry shares and wages. Holding population fixed at 1980 levels, we find that changes in transport costs between 1980 and 2007 do not explain much of the increase in regional inequalities observed in Spain during that period. The change in inequality is much better captured when we hold transport costs fixed at 1980 levels and consider changes in population shares between 1980 and 2007. Although the simulated models capture the qualitative trend towards more regional inequality in Spain, they also tend to significantly overpredict the increase in polarization observed between 1980 and 2007.
The remainder of the paper is organized as follows. Section 2 develops two different new trade 'toy models': one with a single differentiated industry and a homogeneous good industry; and one with two differentiated industries. In both models, trade is costly and factor prices are endogenous. In Section 3, we extend the models to a larger scale and discuss a set of numerical results obtained from simulating those two models for a large number of random networks. We then present, in Section 4, an application to the case of Spanish regional data, as well as results from two counterfactuals. Finally, Section 5 concludes. Technical details are relegated to an extensive set of appendices.

Models
We develop two models within which we analyze the geographical distribution of economic activity and wages. In both models, there are $M \geq 2$ regions subscripted by $i = 1, 2, \ldots, M$. Each region is endowed with $L_i$ immobile workers-consumers. The total population in the economy is fixed at $L \equiv \sum_i L_i$. Labor is the only production factor, i.e., we abstract from comparative advantage across regions.

Model 1: One differentiated sector and one homogeneous sector
Our first model builds on Helpman and Krugman (1985) and its multi-location extensions by Behrens et al. ( , 2009). There is one increasing returns to scale (irs) sector with monopolistic competition that produces a continuum of varieties of a horizontally differentiated good; and one constant returns to scale (crs) sector with perfect competition that produces a homogeneous good. In the differentiated sector, the combination of irs, costless product differentiation, and the absence of scope economies yields a one-to-one equilibrium relationship between firms and varieties.

Preferences and demands
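Preferences of a representative consumer in region $j$ are given by:

$$U_j = H_j^{1-\mu} D_j^{\mu}, \qquad (1)$$

where $H_j$ stands for the consumption of the homogeneous good; where $D_j$ is an aggregate of the varieties of the differentiated good; and where $0 < \mu < 1$ is the income share spent on the differentiated good. We assume that $D_j$ is given by a ces subutility function

$$D_j = \left[ \sum_i \int_{\Omega_i} d_{ij}(\omega)^{\frac{\sigma-1}{\sigma}} \, \mathrm{d}\omega \right]^{\frac{\sigma}{\sigma-1}},$$

where $d_{ij}(\omega)$ is the individual consumption in region $j$ of variety $\omega$ produced in region $i$; and where $\Omega_i$ is the set of varieties produced in $i$. The parameter $\sigma > 1$ measures the elasticity of substitution between any two varieties. Let $p^H_j$ denote the price of the homogeneous good in region $j$ and $p_{ij}(\omega)$ the price of variety $\omega$ produced in region $i$ and consumed in region $j$. Let $w_j$ denote the wage in region $j$. Maximizing (1) subject to the budget constraint $p^H_j H_j + \sum_i \int_{\Omega_i} p_{ij}(\omega) d_{ij}(\omega) \, \mathrm{d}\omega = w_j$ yields the following individual demands:

$$H_j = \frac{(1-\mu) w_j}{p^H_j}, \qquad d_{ij}(\omega) = \frac{p_{ij}(\omega)^{-\sigma}}{P_j^{1-\sigma}} \, \mu w_j, \qquad (2)$$

where $P_j$ is the ces price index in region $j$, given by

$$P_j = \left[ \sum_i \int_{\Omega_i} p_{ij}(\omega)^{1-\sigma} \, \mathrm{d}\omega \right]^{\frac{1}{1-\sigma}}. \qquad (3)$$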
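As a quick numerical check of this demand system, the following sketch assumes the standard ces demand $d = p^{-\sigma} P^{\sigma-1} \mu w$ and replaces the continuum of varieties by a short list of delivered prices. The function names and the price values are illustrative assumptions; $\sigma = 5$ and $\mu = 0.4$ are the parameter values used later in the simulations.

```python
def ces_price_index(prices, sigma):
    """CES price index over a finite list of delivered prices (assumed discrete)."""
    return sum(p ** (1.0 - sigma) for p in prices) ** (1.0 / (1.0 - sigma))


def ces_demands(prices, sigma, mu, wage):
    """Individual demand for each variety: p^(-sigma) * P^(sigma-1) * mu * wage."""
    index = ces_price_index(prices, sigma)
    return [p ** (-sigma) * index ** (sigma - 1.0) * mu * wage for p in prices]


# Hypothetical delivered prices of three varieties in one region.
prices = [1.0, 1.25, 1.5]
sigma, mu, wage = 5.0, 0.4, 1.0

demands = ces_demands(prices, sigma, mu, wage)
# Total spending on the differentiated good should equal the share mu of income.
spending = sum(p * d for p, d in zip(prices, demands))
```

Spending on the differentiated varieties adds up to the share $\mu$ of the wage, and cheaper varieties are consumed in larger quantities, as the demand functions require.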

Differentiated good
We first explain the workings of the irs industry. Technology is assumed to be identical across firms and regions, therefore implying that firms differ only by the variety they produce and the region they are located in. Since varieties enter preferences in a symmetric way, we henceforth suppress the variety index $\omega$ to lighten notation. Production of any variety involves a fixed labor requirement, $F$, and a constant marginal labor requirement, $c$. Denote by $x_{ij}$ the amount of a variety produced in $i$ and shipped to $j$. The total labor requirement for producing output $x_i \equiv \sum_j x_{ij}$ is given by

$$l_i = F + c x_i. \qquad (4)$$

Trade in the differentiated good is costly. Following standard practice we assume that trade costs are of the iceberg form: $\tau_{ij} \geq 1$ units must be dispatched from region $i$ in order for one unit to arrive in region $j$. We further assume that trade costs are symmetric, i.e., $\tau_{ij} = \tau_{ji}$. 4 Using the demands (2), each firm in $i$ maximizes its profit with respect to all its prices $p_{ij}$, taking the price indices $P_j$ and the wages $w_j$ as given. Because of ces preferences, profit-maximizing prices have constant markups

$$p_{ij} = \frac{\sigma}{\sigma-1} \, c \, w_i \tau_{ij}. \qquad (5)$$

We denote by $n_i$ the endogenously determined mass of firms located in $i$, and by $N \equiv \sum_i n_i$ the total mass of firms in the economy. We also denote by $\lambda_i \equiv n_i/N$ the share of firms in region $i$. Because of iceberg trade costs, a firm in region $i$ has to produce $x_{ij} \equiv L_j d_{ij} \tau_{ij}$ units to satisfy aggregate demand in region $j$. Free entry and exit imply that profits are non-positive in equilibrium which, using (4) and the pricing rule (5), yields the standard condition

$$x_i \equiv \sum_j x_{ij} = \frac{F(\sigma-1)}{c}. \qquad (6)$$

Let $\phi_{ij} \equiv \tau_{ij}^{1-\sigma} \in [0, 1]$ denote the 'freeness of trade' in the differentiated good between regions $i$ and $j$.
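The free-entry output level can be recovered in a few steps. The derivation below is a sketch that assumes the usual ces markup $p_{ij} = \frac{\sigma}{\sigma-1} c w_i \tau_{ij}$, the labor requirement $F + c x_i$ with $x_i \equiv \sum_j x_{ij}$, and shipments $x_{ij} = L_j d_{ij} \tau_{ij}$; the symbol $\pi_i$ for the profit of a firm in region $i$ is our notation.

```latex
\begin{align*}
\pi_i &= \sum_j p_{ij} L_j d_{ij} - w_i \Big( F + c \sum_j x_{ij} \Big) \\
      &= \frac{\sigma}{\sigma-1}\, c\, w_i \sum_j \tau_{ij} L_j d_{ij} - w_i \,( F + c\, x_i ) \\
      &= w_i \Big( \frac{c\, x_i}{\sigma-1} - F \Big), \\
\pi_i = 0 \;&\Longrightarrow\; x_i = \frac{F(\sigma-1)}{c}.
\end{align*}
```

Output per firm is thus independent of wages and of the firm's location, so that all adjustment to market size and accessibility runs through the masses of firms and the wages.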
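Inserting the demand (2) and the price index (3) into (6), multiplying both sides by $p_{ij}$, and using the prices (5), we get the wage equations

$$\sigma F \, w_i^{\sigma} = \mu \sum_j \frac{\phi_{ij} \, w_j L_j}{\sum_k \phi_{kj} \, n_k \, w_k^{1-\sigma}}. \qquad (7)$$

Dividing both sides by the total population, $L$, letting $\theta_j \equiv L_j/L$, and choosing, without loss of generality, units for $F$ such that $F \equiv \mu L/\sigma$, we can rewrite (7) as follows:

$$w_i^{\sigma} = \sum_j \frac{\phi_{ij} \, w_j \theta_j}{\sum_k \phi_{kj} \, n_k \, w_k^{1-\sigma}} \equiv RMP_i, \qquad (8)$$

where $RMP_i$ stands for the real market potential of region $i$ (Head and Mayer, 2004). The number of workers employed in the differentiated industry of region $i$, when it has $n_i$ firms, is

$$L^D_i = n_i (F + c x_i) = n_i \sigma F = \mu L n_i, \qquad (9)$$

where we have made use of our normalization of $F$.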
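To make the wage equations concrete, the sketch below evaluates the real market potential for given wages and firm masses. It assumes the form $RMP_i = \sum_j \phi_{ij} w_j \theta_j / \sum_k \phi_{kj} n_k w_k^{1-\sigma}$ implied by the normalization $F \equiv \mu L/\sigma$; the function names and the two-region numbers (including free intra-regional trade, $\phi_{ii} = 1$) are illustrative assumptions.

```python
def real_market_potential(i, w, n, theta, phi, sigma):
    """RMP_i = sum_j phi[i][j]*w[j]*theta[j] / sum_k phi[k][j]*n[k]*w[k]**(1-sigma)."""
    regions = range(len(w))
    total = 0.0
    for j in regions:
        # Supply-side index of market j: firms weighted by access and wage costs.
        supply_j = sum(phi[k][j] * n[k] * w[k] ** (1.0 - sigma) for k in regions)
        total += phi[i][j] * w[j] * theta[j] / supply_j
    return total


def wage_residuals(w, n, theta, phi, sigma):
    """w_i**sigma - RMP_i for every region; zero where the wage equation holds."""
    return [w[i] ** sigma - real_market_potential(i, w, n, theta, phi, sigma)
            for i in range(len(w))]


# Two identical regions (hypothetical numbers): equal sizes and equal firm masses.
phi = [[1.0, 0.3], [0.3, 1.0]]
theta = [0.5, 0.5]
n = [0.5, 0.5]
w = [1.0, 1.0]
residuals = wage_residuals(w, n, theta, phi, sigma=5.0)
```

With two identical regions, unit wages and equal firm masses satisfy the wage equations, so both residuals vanish. In an asymmetric network the same function can serve as the residual of a nonlinear solver.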

Homogeneous good
We next explain the workings of the perfectly competitive crs industry. We assume that technology is the same in all regions. Without loss of generality, we normalize the unit labor requirement to one. Perfect competition implies marginal cost pricing. Given $L^D_i$ workers employed in the differentiated good industry, the number of workers employed in the homogeneous sector equals $L^H_i \equiv L_i - L^D_i$. Inserting (9) into that expression, we can rewrite the number of workers in the homogeneous sector as follows:

$$L^H_i = L_i - \mu L n_i = L(\theta_i - \mu n_i). \qquad (10)$$

Note that (10) need not be positive, i.e., some regions may specialize in the production of the differentiated good only. We assume that trading the homogeneous good is costly. 5 Hence, factor price equalization (fpe) does not hold in general and the world mass of firms in the differentiated industry is no longer constant. 6 The price of the homogeneous good produced in $i$ and delivered to $j$ equals its marginal cost of production, the wage $w_i$, times the trade cost $\tau^H_{ij}$ between regions $i$ and $j$:

$$p^H_{ij} = w_i \tau^H_{ij}, \qquad (11)$$

where $\tau^H_{ij}$ is linked to $\tau_{ij}$ through a parameter $\xi > 0$ that captures the relative cost of trading the homogeneous good compared to the differentiated good. If $\xi = 1$, there are no cost differences. When $\xi > 1$, trading the homogeneous good is more costly than trading the differentiated good, and vice versa when $\xi < 1$. In what follows, we set $\xi < 1$ because in the opposite case there is no trade in the homogeneous good so that the only equilibrium is one where industry shares are proportional to the size of the local market (Davis, 1998). 7 Because good $H$ is homogeneous and can be produced in, and imported from, any region, its price in region $i$ must be the lowest one that can be secured from any source:

$$p^H_i = \min_j \left\{ w_j \tau^H_{ji} \right\}. \qquad (12)$$

Model 2: Two differentiated sectors

Preferences of a representative consumer in region $j$ are given by:

$$U_j = \prod_s D_{sj}^{\mu_{sj}}, \qquad (17)$$

where $D_{sj}$ stands for the ces consumption aggregate in sector $s$ in region $j$; and $0 < \mu_{sj} < 1$ is the region-specific income share for sector $s$. With two sectors, $\mu_{sj}$ is equal to $\mu_j$ in sector 1 and to $1 - \mu_j$ in sector 2.
Since expenditure shares are region specific, the relative consumption patterns differ across regions. Hence, market sizes differ due to spending patterns on top of differences in regional population sizes. The aggregator for consumption of the differentiated good, $D_{sj}$, is as follows:

$$D_{sj} = \left[ \sum_i \int_{\Omega_{si}} d_{sij}(\omega)^{\frac{\sigma-1}{\sigma}} \, \mathrm{d}\omega \right]^{\frac{\sigma}{\sigma-1}},$$

where $d_{sij}(\omega)$ is the individual consumption in region $j$ of sector-$s$ variety $\omega$ produced in region $i$; and where $\Omega_{si}$ is the set of sector-$s$ varieties produced in $i$. For simplicity, we assume that the elasticity of substitution between any two varieties, $\sigma$, is the same in both sectors. 9 Let $p_{sij}(\omega)$ denote the price of sector-$s$ variety $\omega$ produced in $i$ and consumed in $j$; and let $w_j$ denote the wage in region $j$. Maximizing (17) subject to the budget constraint $\sum_s \sum_i \int_{\Omega_{si}} p_{sij}(\omega) d_{sij}(\omega) \, \mathrm{d}\omega = w_j$ yields the following individual demands:

$$d_{sij}(\omega) = \frac{p_{sij}(\omega)^{-\sigma}}{P_{sj}^{1-\sigma}} \, \mu_{sj} w_j, \qquad (18)$$

where

$$P_{sj} = \left[ \sum_i \int_{\Omega_{si}} p_{sij}(\omega)^{1-\sigma} \, \mathrm{d}\omega \right]^{\frac{1}{1-\sigma}} \qquad (19)$$

is the ces price index in sector $s$ and region $j$.

8 Hanson and Xiang (2004) develop a model with a continuum of sectors, but their focus is on two regions only. In this section, we take a complementary approach: we focus on two sectors only, but consider a large number of regions to look at industry location and wages.
9 We could relax that assumption, but there is not much to be learned from that exercise. The same holds true for relaxing the assumption of identical technologies in the two sectors. Nevertheless, as explained in footnote 17 below, we have also studied the effects of alternative values of $\sigma$ and expenditure patterns $\mu_{sj}$ on industry shares, $\lambda_{si}$, and wages, $w_i$.

Technology and trade
For simplicity, we assume that technology is the same in both sectors. As in Section 2.1, the total labor requirement for producing the output $x_{si} \equiv \sum_j x_{sij}$ is given by $l_{si} = F + c x_{si}$. Trade in both differentiated goods is costly and trade costs are symmetric and of the iceberg form: $\tau_{sij} = \tau_{sji} \geq 1$ units must be dispatched from region $i$ in order for one unit of a sector-$s$ variety to arrive in region $j$. Using (18), a sector-$s$ firm in $i$ maximizes profit with respect to all its prices $p_{sij}$, taking the price indices $P_{sj}$ and the wages $w_j$ as given. As before, profit-maximizing prices have constant markups:

$$p_{sij} = \frac{\sigma}{\sigma-1} \, c \, w_i \tau_{sij}. \qquad (20)$$

We denote by $n_{si}$ the endogenously determined mass of sector-$s$ firms located in $i$, and by $N_s \equiv \sum_i n_{si}$ the total mass of sector-$s$ firms in the economy. Last, $\lambda_{si} \equiv n_{si}/N_s$ denotes the share of sector-$s$ firms in region $i$. A firm in region $i$ and sector $s$ has to produce $x_{sij} \equiv L_j d_{sij} \tau_{sij}$ units to satisfy aggregate demand in region $j$. Free entry and exit imply that profits are non-positive in equilibrium which, using the prices (20), yields again the standard free-entry zero-profit condition (6). Inserting the demands (18) and the price index (19) into that expression, using the prices (20), and letting $\phi_{sij} \equiv \tau_{sij}^{1-\sigma} \in [0, 1]$ denote the 'freeness of trade' in sector $s$, we get the wage equations:

$$\sigma F \, w_i^{\sigma} = \sum_j \frac{\mu_{sj} \, \phi_{sij} \, w_j L_j}{\sum_k \phi_{skj} \, n_{sk} \, w_k^{1-\sigma}}. \qquad (21)$$

Dividing both sides by world population, $L$, letting $\theta_j \equiv L_j/L$ as before, and choosing without loss of generality units of $F$ such that $F = L/\sigma$, we obtain the real market potential for sector-$s$ firms in region $i$ as follows:

$$RMP_{si} \equiv \sum_j \frac{\mu_{sj} \, \phi_{sij} \, w_j \theta_j}{\sum_k \phi_{skj} \, n_{sk} \, w_k^{1-\sigma}} = w_i^{\sigma}. \qquad (22)$$
To pin down the wages, we can impose either the labor market clearing conditions or the trade balance conditions. In what follows, we use the former as they are easier to handle given our choices of normalization. Labor market clearing in $i$ requires that $L_i = n_{1i}(F + c x_{1i}) + n_{2i}(F + c x_{2i}) = L(n_{1i} + n_{2i})$, where we have used the normalization of $F$. Hence,

$$n_{1i} + n_{2i} = \theta_i. \qquad (23)$$

Conditions (22) and (23) can be solved for the equilibrium wages and industry shares. The total masses of firms in the two sectors in the economy, $N_1 = \sum_i n_{1i}$ and $N_2 = \sum_i n_{2i}$, are not constant and vary with the spatial distribution of demand and with the structure of the trading network. Note, of course, that the total mass of firms in both sectors in the world economy is equal to one: $\sum_i (n_{1i} + n_{2i}) = \sum_i \theta_i = 1$ from (23). To solve the model, we set $w_1 \equiv 1$ by choice of numeraire. Focusing on two regions with symmetric trade costs and free intra-regional trade ($\phi_{sii} = 1$ and $\phi_{sij} = \phi$ for all $i \neq j$), Behrens and Ottaviano (2011) have proven the following analytical results for two special cases: absolute advantage, i.e., when the spending patterns of the two regions are the same but when the regions differ by population size ($\mu_{11} = \mu_{12}$ and $\mu_{21} = \mu_{22}$, but $\theta_1 > \theta_2$); and comparative advantage, i.e., when spending patterns are anti-symmetric but when the regions have the same population size ($\mu_{11} = \mu_{22}$ and $\mu_{21} = \mu_{12}$, but $\theta_1 = \theta_2$). In those two polar cases, it can be shown that (see Behrens and Ottaviano, 2011, for the proofs):

Proposition 1 (Pure 'Comparative Advantage') Assume that preferences are anti-symmetric across regions ($\mu_{11} = \mu_{22}$ and $\mu_{21} = \mu_{12}$), and that both regions are of the same size ($\theta_1 = \theta_2$). The equilibrium is such that The equilibrium relative wage satisfies $w^*_2 = 1$.
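The pure comparative advantage case can be checked numerically. The sketch below assumes that the sector-level real market potential takes the form $RMP_{si} = \sum_j \mu_{sj} \phi_{sij} w_j \theta_j / \sum_k \phi_{skj} n_{sk} w_k^{1-\sigma}$ and that the equilibrium is interior. The closed-form firm masses in `candidate_allocation` are our own solution of the wage equations at unit wages, not a formula taken from Behrens and Ottaviano (2011), and the values of the expenditure share `a` and of the freeness of trade `f` are hypothetical.

```python
def rmp(s, i, w, n, theta, mu, phi, sigma):
    """Real market potential of sector s in region i (assumed functional form)."""
    regions = range(len(w))
    total = 0.0
    for j in regions:
        supply = sum(phi[k][j] * n[s][k] * w[k] ** (1.0 - sigma) for k in regions)
        total += mu[s][j] * theta[j] * w[j] * phi[i][j] / supply
    return total


def candidate_allocation(a, f):
    """Firm masses n[s][i] solving the wage equations at w = (1, 1); derived by hand."""
    big = 0.5 * (a - f * (1.0 - a)) / (1.0 - f)
    small = 0.5 * ((1.0 - a) - f * a) / (1.0 - f)
    return [[big, small], [small, big]]


a, f, sigma = 0.6, 0.3, 5.0            # hypothetical expenditure share and freeness
theta = [0.5, 0.5]                     # equal population shares
mu = [[a, 1.0 - a], [1.0 - a, a]]      # mu[s][j]: anti-symmetric spending patterns
phi = [[1.0, f], [f, 1.0]]             # free intra-regional trade
w = [1.0, 1.0]                         # equal wages, as in Proposition 1
n = candidate_allocation(a, f)

potentials = [rmp(s, i, w, n, theta, mu, phi, sigma) for s in (0, 1) for i in (0, 1)]
labor_used = [n[0][i] + n[1][i] for i in (0, 1)]
share_sector1_region1 = n[0][0] / (n[0][0] + n[0][1])
```

At unit wages all four market potentials equal one and labor markets clear, so equal wages are consistent with the wage and labor market conditions in this symmetric example. Region 1's share of sector-1 firms exceeds its expenditure share `a`, in line with the specialization pattern discussed in the text.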
In Proposition 1, each region is the larger market for one of the two goods. Hence, each region specializes in the production of the good for which it has a relatively larger local demand. In other words, relative differences in market sizes lead to different specialization patterns but do not affect factor prices. In the case of Proposition 2, one region is the larger market for both goods. In that case, the wage in the larger region must be higher because it offers a locational advantage for both industries. Clearly, this is akin to absolute advantage in a Ricardian sense and it is, therefore, capitalized into factor prices.
Of course, the two cases in Propositions 1 and 2 are extreme ones, and intermediate cases where both absolute and comparative advantage play a role should be considered. Furthermore, it is of interest to relax the assumption of just two regions and of symmetric trade costs to investigate also the interactions with 'geography'. This is what we do using numerical simulations in the next section and Spanish data in Section 4.

Size and accessibility in random tree networks
It is virtually impossible to derive general analytical results in an arbitrary multi-region setting without fpe, because the equilibrium allocations of firms and wages are determined by a complex trade-off between a region's market size and its accessibility in the trading network. 10 To nevertheless gain insights into how size and accessibility -as well as the whole structure of the trading network -influence the equilibrium, we resort to systematic numerical simulation. To this end, we proceed as follows.
First, we generate a random tree network with a random number of nodes (see Appendix B for details). The nodes are the regions, and the links between nodes represent the connections for shipping goods. Networks are generated incrementally either by having equal attachment probabilities for new nodes, or by using the Barabási and Albert (1999; henceforth ba) preferential attachment algorithm that generates networks which exhibit a 'hub-and-spoke' structure. Second, we assign a random population share, $\theta_i$, to each node $i$ of the network. 11 In the case with two differentiated industries, we also randomly assign a region-specific expenditure share for each industry. Third, we solve the two models for their equilibria. We repeat this three-step process for a large number of randomly generated networks and then relate selected characteristics of the equilibria thus obtained to underlying network characteristics. Doing so will allow us to gain more systematic insights into how size and accessibility interact to determine the regional allocation of firms and wages, and how those allocations depend on the economic model we have chosen.
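The first two steps can be sketched as follows. This is a minimal illustration, not the procedure of Appendix B: it assumes that each new node attaches to exactly one existing node (which guarantees a tree), that preferential attachment weights existing nodes by their current degree, and that population shares are uniform draws rescaled to sum to one. The function names are hypothetical.

```python
import random


def random_tree(num_nodes, preferential=False, rng=None):
    """Grow a tree on num_nodes >= 2 nodes, one link per new node.

    preferential=False: every existing node is equally likely to be chosen.
    preferential=True:  existing nodes are chosen in proportion to their degree.
    """
    rng = rng or random.Random()
    edges = [(0, 1)]
    degree = [1, 1]
    for new in range(2, num_nodes):
        if preferential:
            target = rng.choices(range(new), weights=degree)[0]
        else:
            target = rng.randrange(new)
        edges.append((target, new))
        degree[target] += 1
        degree.append(1)
    return edges


def random_shares(num_nodes, rng=None):
    """Random population shares: uniform draws rescaled to sum to one (assumption)."""
    rng = rng or random.Random()
    draws = [rng.random() for _ in range(num_nodes)]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)                       # fixed seed for reproducibility
edges = random_tree(20, preferential=True, rng=rng)
theta = random_shares(20, rng=rng)
```

Because each new node brings exactly one link, a network with 20 nodes has 19 links and no cycles. Under preferential attachment, early and well-connected nodes tend to become hubs.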
We describe the numerical implementation in detail in Appendix C. In the following sections, we explore the results obtained for the two models.

Model 1: One differentiated sector and one homogeneous sector
We first compute simple correlations between the equilibrium masses of firms in the different regions ($n^*_i$), their population shares ($\theta_i$), and their centrality ($C_i$). The latter is measured either by the closeness centrality (henceforth 'closeness', for short) or by the node's degree. Following standard practice in the network literature (see, e.g., Freeman, 1979), closeness is defined as

$$C_i = \frac{M - 1}{\sum_{j \neq i} d_{ij}},$$

where $d_{ij}$ denotes the length of the link, i.e., the distance, between nodes $i$ and $j$. By definition, closeness varies between 0 and 1. 'Degree' is simply measured by the number of links of the node. Centrally located nodes have both a high value for closeness and for degree. This can be seen from the correlations in the top panel of Table 1. That panel shows that, as expected, size ($\theta_i$) and accessibility ($\text{closeness}_i$ and $\text{degree}_i$) are positively linked to a region's equilibrium industry share ($\lambda^*_i$ or, alternatively, $n^*_i$) and to a region's wage ($w^*_i$). The correlation is particularly strong for the degree measure of centrality. Observe also that size is more strongly linked to industry location, whereas accessibility is more strongly linked to wages. Put differently, size differences map into differences in industry structures, whereas accessibility differences translate into factor price differences. In general, however, the correlations with factor prices are weaker than the correlations with industry location.
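Both centrality measures are simple to compute on a tree. The sketch below assumes Freeman's normalized closeness, $(M-1)/\sum_{j \neq i} d_{ij}$, with every link of unit length, so that $d_{ij}$ is the number of links on the path between $i$ and $j$; weighted link lengths would require a different shortest-path routine. The function names are hypothetical.

```python
from collections import deque


def adjacency(num_nodes, edges):
    """Adjacency lists of an undirected network given as a list of links."""
    adj = [[] for _ in range(num_nodes)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def closeness(adj, i):
    """Normalized closeness of node i: (M - 1) / sum of hop distances (connected graph)."""
    dist = {i: 0}
    queue = deque([i])
    while queue:                      # breadth-first search from node i
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(adj) - 1) / sum(dist.values())


def degree(adj, i):
    """Number of links of node i."""
    return len(adj[i])


# A 'hub-and-spoke' star: node 0 is the hub, nodes 1 to 3 are spokes.
star = adjacency(4, [(0, 1), (0, 2), (0, 3)])
hub_closeness = closeness(star, 0)      # 3 / (1 + 1 + 1)
spoke_closeness = closeness(star, 1)    # 3 / (1 + 2 + 2)
```

In the star, the hub reaches every other node in one step and attains the maximum closeness of one, while each spoke has closeness 0.6 and degree one. This illustrates why the two measures are highly correlated in hub-and-spoke networks.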
Notes: CV denotes the coefficient of variation, whereas $\Lambda^*$, $n^*$, $w^*$, and $\theta$ denote the equilibrium vectors of industry shares, masses of firms, wages, and market size, respectively.

The bottom panel of Table 1 displays the same correlations as in the top panel, but now between aggregate network statistics and the vectors of equilibrium outcomes. More precisely, it displays the correlations between the coefficients of variation (cv) of industry shares and wages (computed for each network at the node level), and the coefficients of variation of size and accessibility. As can be seen, dispersion in market sizes, as captured by a larger cv, is positively associated with dispersion of industry shares and wages. The same holds true for dispersion in the accessibility measures, with again a much stronger effect of degree as compared to closeness. It is finally of interest to note that the correlations between the equilibrium industry shares, $\lambda^*_i$ (or the equilibrium masses of firms, $n^*_i$), and the equilibrium wages, though positive, are fairly small (0.080 and 0.085, respectively). This result suggests that the two variables operate largely independently to determine the equilibrium.
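The network-level statistics can be computed along the following lines. The sketch assumes that the cv is the population standard deviation divided by the mean (the text does not say whether a sample correction is applied) and uses a plain Pearson correlation; the numbers in the example are made up for illustration and are not taken from Table 1.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (unit free)."""
    return statistics.pstdev(values) / statistics.mean(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical node-level market sizes for three networks (one list per network).
networks_theta = [[0.2, 0.3, 0.5], [0.1, 0.1, 0.8], [0.3, 0.3, 0.4]]
# Hypothetical node-level industry shares for the same three networks.
networks_lambda = [[0.15, 0.30, 0.55], [0.05, 0.05, 0.90], [0.30, 0.28, 0.42]]

cv_theta = [coefficient_of_variation(t) for t in networks_theta]
cv_lambda = [coefficient_of_variation(l) for l in networks_lambda]
correlation = pearson(cv_theta, cv_lambda)
```

Each network contributes one observation, namely a pair of cv values, and the correlation is then taken across networks.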
To go beyond simple univariate correlations, we now run several ordinary least squares (ols) regressions to estimate the partial effect of increasing market size or centrality of nodes on the equilibrium shares of manufacturing activity and the equilibrium wages, controlling for accessibility and for size. In Model 1, there are two endogenous variables that can be analyzed in the regressions: the equilibrium allocation of firms, $\lambda^*_i$, and the equilibrium wages, $w^*_i$. 12 Mirroring the two panels of Table 1, we start with an analysis at the level of the individual nodes, and then turn to an analysis at the level of the whole network.

Results for individual nodes
We regress the equilibrium shares of firms, $\lambda^*_i$, or the equilibrium wages, $w^*_i$, on measures of: (i) the node's centrality, as given by either closeness or degree; and (ii) the node's market size. 13 We perform a pooled analysis with both types of networks (based on preferential attachment, ba, or equal probabilities), in which case we include a network dummy indicating the network type, and separate regressions for each type of network. Formally, we estimate (26) and (27) for all the nodes of the networks we have generated. Table 2 summarizes our estimation results of (26) and (27). As can be seen from that table, both centrality and market size positively influence a node's equilibrium share of firms and its equilibrium wage. It is worth pointing out that the so-called 'Home Market Effect' (hme), defined as a more than proportional increase in industry shares in response to an increase in local market size, always arises in both types of networks: $\partial \lambda^*_i / \partial \theta_i > 1$. This effect seems to generally hold in models without fpe (see, e.g., Takahashi, Takatsuka, and Zeng, 2013, for a discussion of the two-region case).

12 Due to the high correlation between $\lambda^*_i$ and $n^*_i$ (see Table 1), there is no reason to look at the latter separately.
13 We do not include both measures of centrality simultaneously, because of their high correlation (see Table 1).
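A node-level regression of this kind can be sketched without any external library. The example assumes a linear specification with a constant, one centrality measure, and market size as regressors; the exact specification estimated in the paper may differ. The data are synthetic and generated without noise from made-up coefficients, so the numbers bear no relation to Table 2.

```python
def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y, by Gauss-Jordan."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
           + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * z for x, z in zip(aug[r], aug[col])]
    return [aug[r][k] / aug[r][r] for r in range(k)]


# Synthetic nodes: centrality and market size are hypothetical values.
centrality = [0.2, 0.4, 0.6, 0.8, 0.5]
theta = [0.10, 0.30, 0.20, 0.25, 0.15]
# Made-up 'true' coefficients: constant, centrality effect, market size effect.
true_beta = [0.01, 0.05, 1.2]
shares = [true_beta[0] + true_beta[1] * c + true_beta[2] * t
          for c, t in zip(centrality, theta)]

design = [[1.0, c, t] for c, t in zip(centrality, theta)]
beta_hat = ols(design, shares)
home_market_effect = beta_hat[2] > 1.0   # more than proportional response to size
```

In this noise-free example the estimator recovers the coefficients used to generate the data. In the terminology of the text, a coefficient on $\theta_i$ above one is what signals a home market effect.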
Note further that both measures of centrality -closeness and degree -have a statistically strongly significant impact on the equilibrium allocation of firms across regions. 14 We also ran the regressions by quintiles in terms of the degree or the closeness distributions of the nodes. In both cases, the estimated coefficients for θ i increase monotonically with the quintiles. Thus, there is some complementarity between market size and accessibility: more accessible regions benefit more strongly from an increase in market size than more peripheral regions. In other words, increasing the size of the market in peripheral regions is unlikely to have strong impacts on the equilibrium allocation of industry.
The results pertaining to wages in the bottom panel of Table 2 deserve some comments.
First, as can be seen, the two measures of centrality are always positively linked to a region's equilibrium wage. In other words, more centrally located regions or regions with better market access command higher wages, which is in line with predictions of new economic geography models and with empirical evidence (see, e.g., Mion, 2004, for Italy; and Hanson, 2005, for the US). The average equilibrium wage for 'peripheral' regions, in the first quintile of the degree distribution, is $w_{Q1} = 0.9854$; whereas that for more 'central' regions, in the fifth quintile of the degree distribution, is $w_{Q5} = 1.0074$. Consequently, peripheral regions tend to specialize in the homogeneous good, paying lower wages and exporting to more central locations characterized by a high degree. The latter enjoy lower transportation costs over the network, and hence specialize in the differentiated good paying higher wages. Second, observe that the correlation between $\theta_i$ and $w^*_i$ is quite low, though still positive. The intuition underlying this surprising result is as follows. Consider two regions $i$ and $j$, where $\theta_i > \theta_j$. Assume that region $i$ is not fully specialized in the production of the differentiated good, i.e., there is still some local production of the homogeneous good. If region $i$ imports some of the homogeneous good from region $j$, by (11) the following condition must hold:

$$w_i \tau^H_{ii} = w_j \tau^H_{ji}.$$

Hence, the relative wage $w_i/w_j$ in the two regions just depends on the relative trade costs $\tau_{ji}/\tau_{ii}$, but it is independent of market sizes $\theta_i$ and $\theta_j$. In other words, it is just the structure of the trading network that matters, but not the distribution of market sizes. Of course, this result only holds true when a region is not fully specialized in the production of differentiated goods. Should no production of the homogeneous good take place in a region, its wage will increase with its market size, and so will the wages of the regions that export the homogeneous good to that region.
We can easily confirm this conjecture by computing the correlation between $\theta_i$ and $w^*_i$ for the regions that do not produce any of the homogeneous good. In that case the correlation is about 0.4, instead of 0.09 when considering all regions. In other words, costly trade in the homogeneous good imposes strong conditions on wages, and those conditions partly destroy the positive link between market size and wages.

Notes: Breakdown of individual nodes by specialization type. The sample is the same as that used for the regression analysis. $\theta_i$ and $w^*_i$ denote the average market size and the average equilibrium wages of the types of nodes.
Third, centrality, both in terms of closeness and in terms of degree, generally has a strong impact on industry location and, to a lesser extent, on wages as explained above. A larger local market is weakly associated with higher wages, except in hub-and-spoke type ba networks (see columns (iii) and (iv) of Table 2). This latter result is surprising and requires some further explanation. As can be seen from Table 3, the largest regions are not fully specialized in the production of either type of good: their large size prevents them from being fully specialized since they cannot source all the homogeneous good that they need. Consequently, these regions have lower wages than smaller regions specialized in the differentiated good. The reason is that, as stated before, their wage is linked to the wage of the regions that supply them with the homogeneous good since they are unspecialized. In ba type networks, the largest regions have relatively low wages compared to the equal random network case, as there are on average fewer links with other regions in those networks. This strong non-linear effect between equilibrium wages and market size drives the negative coefficients in the lower panel of Table 2 for ba networks. Note also that the constant term in the wage regressions is close to unity, which is the theoretical value of relative wages in the absence of any differences in size and accessibility.
Our findings suggest that any analysis focusing on two regions only or disregarding the spatial structure of the trading network is likely to miss an important part of the story. They also show that more careful theoretical analysis of multi-region trading systems is necessary, though it is well known that such an analysis is difficult to carry out in the general case when factor prices are not equalized. 15

Results for the whole network
We next run regressions at the level of the network. The underlying idea is to link a measure of inequality in either the equilibrium allocation of industry or wages to measures of inequality in the distribution of market sizes and centrality in the network. We use as our inequality measure the cv of the different variables. As in the case of individual nodes, we first compute the correlations -this time across networks -for our measures of inequality. The results are reported in the bottom half of Table 4.
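As a minimal sketch of the inequality measure, the snippet below computes the coefficient of variation of a vector of shares. The function name, the example shares, and the use of the population (rather than sample) standard deviation are assumptions made for illustration; the text does not state which convention is used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (unit free)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / m


# Hypothetical market-size shares for a four-region network (they sum to one).
theta = [0.4, 0.3, 0.2, 0.1]
cv_theta = coefficient_of_variation(theta)

# Rescaling every entry (shares vs. percentages) leaves the CV unchanged.
cv_theta_pct = coefficient_of_variation([100 * t for t in theta])
print(cv_theta, cv_theta_pct)
```

Because the measure is a ratio of two quantities expressed in the same units, it can be compared across variables such as industry shares, wages, and centrality.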
We then run ols regressions to estimate the effect of the dispersion in the population shares and in centrality on the inequality in the distribution of manufacturing shares and wages. Formally, we estimate: where the subscript l now denotes the network and not the individual nodes.
Notes: CV stands for 'coefficient of variation'. We set σ = 5, µ = 0.4, and ξ = 0.7. See Appendix D for a discussion of those choices of parameter values. Simple ols regressions. ba denotes networks generated using the Barabási and Albert (1999) algorithm. T-stats in parentheses. a , b , and c denote coefficients significant at 1%, 5%, and 10%, respectively.
As can be seen from Table 4, the dispersion in market sizes, θ i , has a significant impact on the dispersion in the equilibrium allocation of firms, whereas the geographical structure of the trading network seems to be of lesser importance. Inequality in market size is more important for explaining inequality in the allocation of firms than the network structure. Quite surprisingly, wage inequality is not strongly linked to either inequality in the distribution of market sizes or to inequality in accessibility in the trading network. Closeness has a positive impact on wage inequality, but only in networks that have a sufficiently strong topological structure (i.e., in ba-type networks). Observe that in totally random tree networks, neither dispersion in market sizes nor dispersion in accessibility correlates significantly with dispersion in wages. One might suspect that some non-linear relationship is at work, especially since many regions can become deindustrialized, i.e., have a zero industry share (see Table 3). When a large number of regions have zero industry shares, the cv may decrease since there is no more variation coming from the deindustrialized regions. We formally checked the impact of deindustrialized regions on equilibrium inequality. Controlling for the number of regions without industry (about 430 out of 2498, or 17.25%), we find that this variable is significant in all regressions, but that it does not change in any way the qualitative results. Thus, deindustrialized regions do not drive our key findings. Nor are these findings driven by units-of-measurement issues, since we use the cv, which is unit free. Last, observe that the model fit is generally much better for the dispersion of industry (top half of the table) than for the dispersion in wages (bottom half of the table). It thus seems much harder to link wage inequality to inequality in the model's fundamentals than it is for spatial inequality in industry shares.

Model 2: Two differentiated sectors
We now look at the multi-region case with two differentiated ces industries. To the best of our knowledge, this has not been done until now. With two differentiated sectors, we have to examine the spatial distribution of firms in both sectors, λ * 1i ≡ n * 1i /(∑ j n * 1j ) and λ * 2i ≡ n * 2i /(∑ j n * 2j ), as well as the equilibrium wages w * i . Simple correlations among the equilibrium values are presented in Table 5.
The shares λ * si are computed as λ * si = n * si /(∑ j n * sj ), for s = 1, 2. CV denotes the coefficient of variation, whereas Λ * 1 , Λ * 2 , w * , and θ denote the equilibrium vectors of industry shares in sectors 1 and 2, masses of firms, wages, and market size, respectively.
As can be seen from the top panel of Table 5, size and accessibility are strongly positively linked to the equilibrium industry shares and to the equilibrium wages, respectively. Although market size still positively influences wages, there is almost no correlation between our measures of centrality and the shares of firms in the two industries. As can further be seen, there is regional specialization, as shown by the negative correlation between the equilibrium shares in both industries, as well as the positive correlation with the own expenditure share, and the negative correlation with the other industry's expenditure share. In words, this specialization is strongly driven by differences in local spending patterns, as can be seen from the last two lines of Table 5. Our finding thus extends the result on 'comparative advantage' from Proposition 1 to a multi-region setting. Note, finally, that market size has roughly the same positive impact on industry location in both industries conditional on expenditure shares. This is the manifestation of market size as 'absolute advantage', as subsumed by Proposition 2, which states that more centrally located regions should have, ceteris paribus, higher wages.
As for Model 1, we run the same regressions (26), (27), (28) and (29). The first two regressions are now run separately for the equilibrium shares of firms in each of the two sectors, λ * 1i and λ * 2i . In all regressions, we control for the region-specific share of expenditure on the two differentiated sectors, µ 1i and µ 2i . Table 6 shows that market size, θ i , and the expenditure share for the two differentiated sectors, µ 1i and µ 2i , are the key variables that explain the spatial distribution λ * 1i and λ * 2i of firms in the two sectors. The positive sign for market size is expected as labor market clearing (23) requires that the number of firms in the two sectors must sum to the population share. Once we control for local market size and the spending patterns, the centrality of a region is no longer associated with its industry share. The reason is that centrality affects both industries in the same way, which suggests that accessibility is akin to an absolute Ricardian advantage and should, therefore, be capitalized into factor prices. 16 This effect can precisely be seen from the bottom panel of Table 6. Clearly, both market size, θ i , and centrality are positively linked to wages, w * i . Regions with better access to markets and/or more trading links tend to have higher wages. Last, note that the expenditure shares µ si are nowhere near statistical significance in our wage regressions. In words, different expenditure shares affect industries differentially and, therefore, have no strong effect on regional wages. This is in line with our previous results on comparative and absolute advantage. 17
Notes: We set σ = 5. Simple ols regressions. ba denotes networks generated using the Barabási and Albert (1999) algorithm. T-stats in parentheses. a , b , and c denote coefficients significant at 1%, 5%, and 10%, respectively.

Results for individual nodes
To summarize, regional size and expenditure patterns determine the structure of regional specialization in the two industries in Model 2, whereas accessibility has a strong impact on wages. Observe that a home market effect -defined as a more than proportional increase in industry shares in response to an increase in local market size, i.e., ∂λ * i /∂θ i > 1 -generally does not arise, as shown in Table 6. The reason is that when all sectors are operating under increasing returns and face trade costs, not all of them can -by definition -exhibit home market effects (see Hanson and Xiang, 2004). In that case, an alternative definition of the hme, involving both the size θ i and the expenditure share µ si , would be required. To the best of our knowledge, such a definition has not been used to date in the literature.

Results for the whole network
As shown in Table 7, inequality in the distribution of market sizes and in the distribution of expenditure shares in the two differentiated sectors are the key variables that drive the inequality in the spatial distribution of firms and wages. Inequality in the network characteristics is only weakly associated with inequality in the equilibrium distributions of firms and wages. It is worth emphasizing that, as can be seen from columns (iii) and (iv) in the bottom panel of Table 7, more dispersion in the expenditure shares is negatively associated with wage inequality in the case of ba-type networks. This result is similar to the one linking the dispersion of population shares θ i to wage inequality in the case of ba networks in Model 1 (see the bottom panel of Table 4). In the case of equal random networks, there is no significant link between the dispersion in expenditure shares and wage dispersion.

Summary of results
A number of findings emerge from the foregoing analyses of the two models. Let us briefly summarize the key insights.
Starting from Model 1 with a single ces sector, we have firstly seen that accessibility has a strong impact on industry location and, to a lesser extent, on wages. This suggests that any analysis involving trade in homogeneous goods and focusing on two regions only -or disregarding the spatial structure of the trading network entirely -is likely to miss an important part of the story. Secondly, we have shown that the correlations between w * i and either λ * i or θ i are quite low, i.e., there is no strong correlation between either market size or the equilibrium industry shares and the equilibrium wages. As we have explained, this unexpected result is due to the fact that incomplete specialization in the production of the homogeneous good imposes strong restrictions on the relative wages of the trading regions, which break the link between market size and wages over the range of incomplete specialization. In that case, relative wages across regions depend on relative trade costs only but are independent of the regions' market sizes. Last, the home market effect generally holds even when trading the homogeneous good is costly, provided that it is less costly than trading the differentiated good.
Turning next to Model 2 with two ces sectors, both absolute market size -as captured by θ i -and centrality -as measured by either closeness or the degree distribution -are capitalized into factor prices, thus showing that they constitute absolute advantage affecting all industries in the same way. Differences in spending patterns -as captured by the µ si -are however capitalized into industry structure, thus showing that they constitute comparative advantage affecting industries differently. Our findings, therefore, extend the theoretical results of Behrens and Ottaviano (2011), which have been derived with two regions only, to a multi-region setting.
Notes: CV stands for 'coefficient of variation'. We set σ = 5. Simple ols regressions. ba denotes networks generated using the Barabási and Albert (1999) algorithm. T-stats in parentheses. a , b , and c denote coefficients significant at 1%, 5%, and 10%, respectively.
Last, it is worth pointing out that the effects of accessibility and market size on wages are an order of magnitude larger in Model 2 than in Model 1 (compare Tables 6 and 2). As we have explained, the reason is that the equalization of prices in the traded homogeneous sector imposes strong restrictions on the determination of wages among trading partners when specialization is incomplete (a very frequent case). This in turn breaks the link between accessibility and market size, on the one hand, and wages, on the other. In a nutshell, market size and centrality matter all the more the more industries are subject to trade costs and increasing returns to scale.

Numerical application to Spanish regions
While the foregoing numerical simulations highlight regularities of our multi-region trade models without fpe, they provide no sense of how well those models perform when confronted with data. The aim of this section is hence to use calibrated versions of the models to check their fit with the data and to run a series of counterfactuals. To this end, we compute the equilibria of the two models using Spanish provincial data in two years: 1980 and 2007 (see Appendix D for a description of the data). This is an interesting period because it coincides with significant changes in demographic trends, and with important infrastructure improvements. Paluzie et al. (2007) discuss the migration trends in Spain from rural to urban areas that, starting in the sixties, still characterized demographic trends in the eighties. The fundamental tendency was the agglomeration of population in ever larger urban areas. Zofío et al. (2014) show that the decentralization of public administration -as Spain joined the European Community -accompanied by substantial funding from the European Regional Development Plan (erdp) helped to finance remarkable improvements in the road network. Along with price changes in the transportation sector, mainly driven by salaries and fuel, generalized transport costs fell by about 15% over the period we consider. In a nutshell, our study period was one of important changes in the population distribution and in transport costs, both of which should have a strong influence on the spatial equilibrium structure of the economy.
Our aim in the remainder of this section is twofold. First, we compare the equilibrium distribution of economic activity predicted by our models with the data. Doing so will allow us to assess to what extent the models can 'replicate' the observed distributions. Second, we use the model to run some simple counterfactuals with respect to changes in demographic trends and transportation costs. We disentangle the role of market size from the role of transportation costs by shutting down one of the two channels when running our counterfactuals. More precisely, we first look at the equilibria of the models in the absence of any changes in the labor force between 1980 and 2007, i.e., when changes are 'solely' driven by changes in transportation costs. Second, we repeat the exercise by assuming that there are no changes in transportation costs between 1980 and 2007, so that changes are 'solely' driven by changes in the spatial distribution and in the size of the labor force.

Equilibrium distributions vs. observed distributions
The equilibrium distributions of firms in 1980 and 2007, as well as the equilibrium wages, are summarized in Table 8. Our results show large disparities in the distribution of firms across provinces, and those disparities increased between 1980 and 2007. In each year, the distribution of firms varies from almost 0% to about 16%-25%. Not surprisingly, the provinces of Madrid and Barcelona have the highest shares of firms in both models. These provinces are the largest -in terms of population shares -which, as reported in the previous sections, is the main determinant of firm shares (followed, to a lesser extent, by centrality that benefits Madrid as the geographical center of the Spanish infrastructure network). On the contrary, very small provinces situated in the Iberian Peninsula plateau (plain or meseta) are almost devoid of production (e.g., the provinces surrounding Madrid such as Toledo, Cuenca, Guadalajara, Segovia, or Ávila have a really negligible share of firms).
To tentatively gauge the predictive power of the models, we check the statistical significance of the differences between the observed distributions of production in 'differentiated products' and those associated with the equilibria of the models: λ * i (Model 1) and λ * 1i and λ * 2i (Model 2). Besides Pearson's r and Spearman's ρ coefficients of correlation, we also test the equality of distributions by way of a Kolmogorov-Smirnov test. Table 9 reports large and significant correlations, both for linear (Pearson) and rank (Spearman) dependencies. The Pearson standard correlation ranges from 0.8638 in Model 1 for 2007 to a remarkable 0.9910 for Model 2 in the same year. The maximum values for the Spearman correlations correspond to the same models and year. Additionally, the hypothesis of equality of distributions generally cannot be rejected, except for the 2007 distribution in Model 1. Our results show that solving the models using real data yields model equilibrium distributions of economic activity that are in many cases statistically hard to distinguish from those observed in the real economy.
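The three statistics used in this comparison can be sketched with the Python standard library alone. The function names and the two share vectors below are hypothetical and are not taken from the Spanish data; the snippet computes the two-sample Kolmogorov-Smirnov statistic only and omits the p-values reported in Table 9.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(v):
    """Average ranks (1-based), so that tied values share the same rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    d = 0.0
    for v in list(x) + list(y):
        fx = sum(a <= v for a in x) / len(x)
        fy = sum(b <= v for b in y) / len(y)
        d = max(d, abs(fx - fy))
    return d


observed = [0.40, 0.25, 0.20, 0.10, 0.05]   # hypothetical observed shares
predicted = [0.45, 0.22, 0.18, 0.11, 0.04]  # hypothetical model shares
print(pearson(observed, predicted), spearman(observed, predicted),
      ks_statistic(observed, predicted))
```

In this toy example the two vectors rank the regions identically, so the rank correlation equals one even though the shares themselves differ.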
Turning to wages, however, we do not find large correlations between those proxied by gdp per employee (our empirical counterpart for 'wages' at the aggregate level) and the solutions to the two models. Hence, while the models perform well in terms of their spatial predictions of economic activity, they perform much worse in terms of their predictions for prices. There are two possible reasons for this. First, gdp per capita -though widely used in the literature (see Head and Mayer, 2004) -is only a crude proxy for wages. Second, as shown in the previous section, the multi-region simulated models do not deliver clear results as to the roles of market size and centrality on wages. It is thus not surprising that their empirical fit to wage data is also fairly weak.
Notes: 1,2 The null hypothesis is that both variables are independent; 3 The null hypothesis is that both variables come from the same continuous distribution. p-values for all tests in parentheses.

Population, transport costs, and trends in inequality
The equilibria computed in the foregoing section can be used to analyze to what extent the models capture the process of agglomeration that has taken place in Spain between 1980 and 2007, and which resulted in a more unequal distribution of manufacturing activity. Figure 1 depicts the changes in the equilibrium manufacturing shares for Model 1. The equilibrium distributions in 1980 and 2007 are displayed as solid and as dashed lines, respectively. As can be seen from Figure 1, although the distributions are fairly similar (because there is a lot of inertia in spatial structures), the one in 2007 exhibits a higher density for low values of the manufacturing shares in the area identified by A (just below the mean value of 0.021, depicted by the vertical line). Another difference pointing towards an increase in inequality is the agglomeration of manufacturing activity that can be seen in 2007, with the equilibrium shares of Madrid and Barcelona driving this process. Indeed, both provinces increased their values from 0.167 to 0.196 and from 0.157 to 0.254, respectively. This evolution is visible from the dilation of the right tail of the distribution in the area identified by C. For values in the range between 0.05 and 0.15, both distributions display similar densities (area B). 18 To provide a quantitative sense of the increase in inequality, we have computed the Gini indices G for the distribution of observed manufacturing shares and for the equilibria of the two models in both years. Both models capture the increase in inequality, even if both clearly overstate it. Observed inequality in the distribution of the manufacturing sector increases by 0.60% (from G 80 = 0.7700 to G 07 = 0.7816), while Model 1 yields an increase in inequality of 12.84% (from G 80 M 1 = 0.7805 to G 07 M 1 = 0.8808).
Turning to Model 2, the observed increase in inequality is 2.12% for the manufacturing sector (from G 80 1 = 0.7609 to G 07 1 = 0.7770), and 2.41% (from G 80 2 = 0.7762 to G 07 2 = 0.7950) for the service sector, respectively. 19 The model again overpredicts these changes, at 11.69% (from G 80 M 2,1 = 0.6841 to G 07 M 2,1 = 0.7641), and 11.90% (from G 80 M 2,2 = 0.6683 to G 07 M 2,2 = 0.7479), respectively. We may thus conclude that while the model predicts the spatial distribution of manufacturing in Spain reasonably well for a given year, it overpredicts the impacts of changes in population or changes in transportation costs on that spatial distribution.
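For readers who want to reproduce this kind of comparison, the following is a minimal sketch of a Gini index and of the relative change between two years. The rank-weighted formula is one standard definition and is assumed here, since the text does not state which formula was used; the function names are likewise hypothetical.

```python
def gini(shares):
    """Gini index of non-negative values (0 = equal, (n-1)/n = one unit holds all)."""
    x = sorted(shares)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        raise ValueError("need a non-empty list with a positive total")
    # Rank-weighted formula for data sorted in increasing order.
    weighted = sum((i + 1) * v for i, v in enumerate(x))
    return 2 * weighted / (n * total) - (n + 1) / n


def pct_change(before, after):
    """Relative change in percent."""
    return 100 * (after - before) / before


# Toy distributions: equal shares versus full concentration in one region.
print(gini([0.25, 0.25, 0.25, 0.25]), gini([0.0, 0.0, 0.0, 1.0]))

# Relative change implied by the Model 2 manufacturing indices quoted above.
print(round(pct_change(0.6841, 0.7641), 2))
```

Applied to the Model 2 manufacturing indices (0.6841 and 0.7641), `pct_change` gives about 11.69, in line with the figure reported in the text.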

Counterfactuals
Keeping in mind the caveat from the previous section, we finally run two counterfactuals, the aim of which is to simulate the spatial equilibrium that would prevail if only population changed to its 2007 values, with transportation costs kept fixed at their 1980 values, and vice versa. Put differently, in the first counterfactual we fix the transport costs to their 1980 values and use observed population changes; whereas in the second counterfactual, we fix population to its 1980 values and use observed changes in transportation costs. In so doing, we can compare the 'pure' effect of population changes conditional on transport costs, and the 'pure' effect of changes in transport costs conditional on population. We compare the equilibria of the model in 1980 and 2007 to those derived in the counterfactuals to determine how each change contributes to the overall shift in the manufacturing shares. 20 The top panel of Figure 2 depicts the distributions of the counterfactual industry shares (in red) for the case where only population changes; whereas the bottom panel of the figure depicts the same change for the case where only transport costs change. Table 10 summarizes the detailed results, with the first superscript referring to the reference year for the population share, θ i , and the second one referring to the bilateral transportation costs, φ ij .
As can be seen from the top panel of Figure 2, the increase in the density of the left tail of the distribution of manufacturing shares is mainly driven by the change in the geographical distribution of the labor force rather than the reduction in transportation costs. As can be seen from the bottom panel of Figure 2, the effect of the latter is rather small, despite the fact that over the 1980-2007 period the fall in the average value of the Generalized Transport Costs was large. 21 As this change was similar across provinces, their relative position in the network remained basically unchanged. Observe that the fall in transport costs has slowed down the process of agglomeration, as can be seen from the bottom panel of Figure 2.
Generally, when comparing the counterfactual distributions with the observed ones in 1980 and 2007, Figure 2 reveals that changes in the spatial distribution of regional market shares have a stronger predictive power of changes in industrial specialization than changes in transportation costs. Our results thus suggest that the reallocation of the labor force was the main driver of agglomeration and larger inequalities as reflected by the change in the Gini indices. Note that these results are compatible with those obtained in the simulations presented in Section 3, and particularly with those in Table 2 for Model 1. In that case, the equilibrium shares λ * i depend mainly on the θ i rather than on network features as captured by transportation costs (i.e., closeness or degree). As the relative position of the provinces in the trading network did not change much between 1980 and 2007, this explains the stability in the distributions of the spatial equilibria when considering changes in this variable only.
To conclude on a policy note, observe that after three decades of significant investments in the road network, the distribution of industry shares had not changed much in Spain. Thus, these investments do not seem to have contributed much to territorial cohesion -though the main goal of infrastructure investment in the eyes of policy makers is often to 'reduce regional inequality'. In fact, the opposite occurred: Madrid and Barcelona had larger shares of economic activity in 2007 than in 1980. These changes in industry shares were mostly driven by population reshuffling, and little by decreasing transportation costs. The financial efforts of transport improvements apparently did not translate into higher cohesion and lower inequality. 22
21 The fall in GTC ij amounted to 14.14%. This fall corresponds to an increase of 109.65% in the average φ ij , thus implying that the freeness of trade more than doubled.
22 One word of caution is in order. Our approach does not capture the fact that the population change between

Conclusions
We have investigated the geographical distribution of industries and wages in asymmetric multi-region models without factor price equalization. Using systematic numerical simulations for two different trade models -one with a homogeneous and a differentiated sector, and another with two differentiated sectors -we have studied whether and how size and accessibility are linked to the equilibrium industry shares and to wages.
Our key findings can be summarized as follows. First, absolute local market size and accessibility are crucial in explaining a region's wage. This is due to the fact that absolute market size -as measured by the population size of a region -and accessibility -as measured by network centrality or the degree distribution of a region -affect all industries in similar ways, i.e., constitute a region's absolute advantage. This effect is stronger and more systematic in models where all sectors are subject to transport costs and exhibit increasing returns to scale.
Second, the relative local market size of industries -as captured by their expenditure shares -is crucial in explaining a region's industrial composition. This is due to the fact that relative spending patterns do not affect all industries in the same way, i.e., constitute a region-specific comparative advantage. In a nutshell -and very much in line with Ricardian trade theory -absolute advantage translates into higher wages, whereas comparative advantage maps into specialization patterns.
Third, the correlation between equilibrium wages and equilibrium industry shares is rather low in both models, thus suggesting that the two adjustment channels work largely independently. Empirical tests and formal definitions of the home market effect should take into account both dimensions -industry location and wages -in order to be relevant. To the best of our knowledge, tests looking simultaneously at industry location and factor prices have not yet been devised.
Finally, when applying the two models to Spanish data -using Generalized Transport Costs between regions as a measure of trade frictions -we find that the models generally predict well the distribution of industries, yet predict less well the spatial patterns in wages. The latter may be due to the fact that gdp per capita -though often used in the literature -is a rather crude proxy for wages. It may, however, also be linked to the fact that regional differences in accessibility are generally less pronounced than regional differences in population shares. Thus, the effect of population shares may dwarf that of accessibility in the applications.
Behrens acknowledges financial support from the crc Program of the Social Sciences and Humanities Research Council (sshrc) of Canada for the funding of the Canada Research Chair in Regional Impacts of Globalization. Barbero and Zofío acknowledge financial support from the Spanish Ministry of Science and Innovation (eco2010-21643 and eco2013-46980-p). Part of the paper was written while Barbero was visiting uqàm, the hospitality of which is gratefully acknowledged. The views expressed in this paper, as well as all remaining errors, are ours.
(A-1)
When (A-1) holds for all regions, and when trade in the homogeneous good is free, we have w i = 1 for all i = 1, 2, . . . , M. Observe that condition (A-1) is extremely restrictive. Consider, e.g., a world with 30 regions. If market sizes θ i were identical across regions, we must have µ < 1/30. This is already very restrictive. But in our case, since we randomly assign the shares θ i to regions, we may have very small shares in some cases. In those cases, the foregoing restriction can never be met for 'reasonable values' of µ.
Although condition (A-1) is technically speaking only a sufficient condition -i.e., we may still have fpe even when it is violated -it seems still very unlikely to be met in general. Another potential problem in the fpe version of the model is that it displays a much larger share of 'corner equilibria', i.e., equilibria in which some regions are deindustrialized and do not host any of the differentiated sector. We have simulated the model with fpe and find that the number of nodes with a zero industry share is 920 out of 2498, i.e. 36.82%. This is a large number, so that regression methods dealing with zeros may be required to analyze the general properties of these equilibria.
In a nutshell, the fpe model does not make much sense in a world with many regions, neither theoretically nor empirically, and it is difficult to implement consistently for reasonable values of µ. We thus disregard it in the remainder of this paper.

B. Generating random tree networks
We use two different algorithms for generating random tree networks. The first one is based on Barabási and Albert (1999). This algorithm starts with a network having M 0 linked nodes. Then, it adds new nodes one by one, up to M T nodes in total, where M T is the number of nodes of the network (i.e., the number of regions in the model). Each time a new node is added to the network at iteration t, it is connected to m of the M t−1 pre-existing nodes. The probability of being linked to an existing node during iteration t depends on the degree of that node in the following way: p it = deg(i t−1 )/[∑ j deg(j t−1 )], where p it is the probability of being linked to node i at iteration t, and where deg(i t−1 ) is the degree of node i at iteration t − 1. The Barabási and Albert (1999) preferential attachment algorithm tends to create networks with some nodes that have a high degree, which are very well connected, and other nodes with a very low degree, which are badly connected. Put differently, the resulting network tends to have hub-and-spoke characteristics. By setting the initial number of nodes to M 0 = 2, and by setting the number of links for new nodes to m = 1, we ensure that the resulting network is a connected tree with M T − 1 links.
In the second algorithm we use, new nodes are added to preexisting nodes with equal attachment probability, which means that the probability of being linked to node i at iteration t does not depend on the degree of node i. Formally, we have p it = 1/M t−1 , where M t−1 is the number of nodes in the network when adding the new node at iteration t.
Observe that the average degree of the tree network is equal to 2(M T − 1)/M T , independently of the algorithm used to generate it. The reason is that in an undirected graph, the degree sum formula is ∑ j deg(j) = 2 |E|, where |E| is the number of links in the network. Since in the generated tree networks there are M T − 1 links, the degree sum formula becomes 2(M T − 1). Then, the average degree of the network, defined as the degree sum over the number of nodes in the network, is equal to 2(M T − 1)/M T .
Observe further that the standard deviation of the degree of the nodes in the network will usually be higher in networks using the Barabási and Albert (1999) algorithm than in totally random tree networks. The reason is that this algorithm tends to generate a few nodes with a high degree, and a lot of nodes with a very low degree.
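The two attachment rules and the degree properties discussed in this appendix can be illustrated with a short sketch. The function names and the seed are hypothetical, and the code is not the implementation used for the simulations; it only assumes M 0 = 2 and m = 1 as stated above.

```python
import random
from statistics import mean, pstdev


def random_tree(num_nodes, preferential, rng):
    """Grow a tree one node at a time and return its list of undirected links.

    preferential=True  -> attachment probability proportional to degree
    preferential=False -> every existing node is equally likely
    Assumes num_nodes >= 2.
    """
    edges = [(0, 1)]          # M_0 = 2 initially linked nodes
    degree = [1, 1]
    for new in range(2, num_nodes):
        existing = list(range(new))
        weights = degree if preferential else None
        target = rng.choices(existing, weights=weights, k=1)[0]
        edges.append((target, new))   # m = 1 link per new node
        degree[target] += 1
        degree.append(1)
    return edges


def degrees(edges, num_nodes):
    """Degree of every node, counted from the list of links."""
    deg = [0] * num_nodes
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


rng = random.Random(0)        # arbitrary seed, for reproducibility only
M_T = 25
for label, pref in (("preferential", True), ("equal", False)):
    deg = degrees(random_tree(M_T, pref, rng), M_T)
    # The average degree is 2(M_T - 1)/M_T under both rules; only the
    # dispersion of degrees differs from draw to draw.
    print(label, mean(deg), pstdev(deg))
```

Since every new node brings exactly one link to an existing node, the result is connected and has M T − 1 links by construction, whichever rule is used.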
Last, when generating random links in the networks, we assume that the freeness of trade, φ ij , between adjacent nodes i and j is given by 1/5. Hence, the freeness of trade between two nodes i and k, linked by a path P = {i, j 1 , j 2 , . . . , j n−1 , k} of length n, is given by
φ_ik = φ_{i j_1} · φ_{j_1 j_2} ⋯ φ_{j_{n−1} k} = (1/5)^n.
We use only shortest paths in the network, which are computed using the Floyd-Warshall algorithm. Because we work with trees, the shortest path is uniquely determined.
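A minimal sketch of this step is given below. It assumes that freeness of trade multiplies along a path, so that two nodes separated by n links have freeness (1/5)^n, and that a region's freeness with itself equals one; the function names and the four-node example tree are hypothetical.

```python
def hop_counts(num_nodes, edges):
    """Floyd-Warshall on an unweighted graph: links on each shortest path."""
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(num_nodes)]
            for i in range(num_nodes)]
    for a, b in edges:
        dist[a][b] = dist[b][a] = 1
    for k in range(num_nodes):
        for i in range(num_nodes):
            for j in range(num_nodes):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def freeness_matrix(num_nodes, edges, phi_link=1 / 5):
    """Bilateral freeness of trade, phi_link raised to the path length."""
    hops = hop_counts(num_nodes, edges)
    return [[phi_link ** hops[i][j] for j in range(num_nodes)]
            for i in range(num_nodes)]


# Hypothetical star-shaped tree: node 1 is linked to nodes 0, 2 and 3.
tree = [(0, 1), (1, 2), (1, 3)]
phi = freeness_matrix(4, tree)
print(phi[0][1], phi[0][2])   # one link apart, then two links apart
```

The triple loop runs in cubic time in the number of nodes, which is not a concern for networks of 20 to 30 nodes.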

C. Details on the numerical implementation
We first use the algorithms described in Appendix B to generate random networks. In all cases, we compute the equilibria of the two models for the same set of networks. Hence, the results are directly comparable across models. For computational reasons, we generate random networks with between 20 and 30 nodes, the number of nodes being itself random (and drawn from a uniform distribution). Larger networks take too long to solve in the case with a homogeneous good. To solve the model, we transform the spatial equilibrium conditions (16) into complementary slackness conditions as follows:
$$[\mathrm{RMP}_i(\mathbf{n}) - 1]\, n_i = 0, \qquad i = 1, 2, \ldots, M, \tag{C-1}$$
where we make explicit the dependence of the real market potential on the whole distribution of firms $\mathbf{n} = (n_1, n_2, \ldots, n_M)$.
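To fix ideas, the residuals of condition (C-1) and their sum of squares can be evaluated as in the sketch below. The real market potential of the model is not reproduced here: `toy_rmp` is a purely hypothetical stand-in chosen so that the example runs, and the function names are assumptions.

```python
def slackness_residuals(n, rmp):
    """Residuals [RMP_i(n) - 1] * n_i of the complementary slackness conditions."""
    values = rmp(n)
    return [(v - 1.0) * n_i for v, n_i in zip(values, n)]


def sum_squared_residuals(n, rmp):
    """Sum of squared residuals; it is zero when every condition holds."""
    return sum(r * r for r in slackness_residuals(n, rmp))


def toy_rmp(n):
    # Hypothetical placeholder, NOT the model's real market potential:
    # region i's potential falls with its own number of firms only.
    scale = [2.0, 0.5]
    return [a / (1.0 + n_i) for a, n_i in zip(scale, n)]


if __name__ == "__main__":
    # Region 1 hosts firms with RMP = 1; region 2 hosts none with RMP < 1.
    print(sum_squared_residuals([1.0, 0.0], toy_rmp))
    # Too many firms in region 1 pushes its RMP below 1: nonzero residual.
    print(sum_squared_residuals([3.0, 0.0], toy_rmp))
```

Each residual vanishes either because the region hosts no firms or because its real market potential equals one, which is what the product form of the condition encodes.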
Model 1: One differentiated sector and one homogeneous sector. We add as nonlinear inequality constraints the equilibrium conditions (13) in the homogeneous good market, the labor market clearing conditions (14), and the complementary slackness conditions (15) for exports of the homogeneous good. Furthermore, the following bounds for the variables are imposed: $w_i > 0$ for all $i$, and $X_{ji} \geq 0$ for all $i$ and $j$. We also impose the constraints $n_i \geq 0$ for all $i$. Note that the presence of the min function, which is not differentiable, makes the problem more difficult to solve. To overcome this difficulty, we replace all occurrences of the min function with a new variable, $z_i$. To make sure that this new variable $z_i$ equals the minimum, we subtract it from the objective function (i.e., it works as a penalty), so that the solver pushes $z_i$ as high as the constraints allow. We add the constraint that it should not exceed the delivered price of the homogeneous good: $z_i \leq w_j \xi \tau_{ji}$, $\forall i, j$. In doing so, we make sure that, in the final iteration, $z_i$ is equal to the minimum delivered price of the good.
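The substitution can be summarized compactly. In the sketch below, $F(x)$ is a generic placeholder for the remaining part of the objective (it is not the exact objective of the model), and all other constraints are suppressed for readability:

```latex
\min_{j}\; w_j \xi \tau_{ji} \;\;\longrightarrow\;\; z_i ,
\qquad\text{with}\qquad
\min_{x,\,z}\;\; F(x) - \sum_{i} z_i
\quad\text{s.t.}\quad
z_i \le w_j \xi \tau_{ji} \quad \forall\, i, j .
```

Since the objective is decreasing in each $z_i$, and provided that no other constraint binds $z_i$ from above, the solution satisfies $z_i = \min_j w_j \xi \tau_{ji}$: the variable rises until the tightest of its upper bounds holds with equality.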
We transform (C-1) into an equivalent problem that consists in minimizing the sum of squared residuals subject to the set of equilibrium constraints. The numerical implementation

D. Data and calibration
We work with Spanish provincial data at the NUTS-3 level, totaling 47 observations. Table 11 provides details on the variables needed to solve the different models. For Model 1, these include the labor force shares ($\theta_i$), the gross value added shares in the differentiated sector (the observed $n_i$ or $\lambda_i$), and the mean of the bilateral transportation costs ($\tau_{ij}$). Population and industrial gross value added (our proxy for the differentiated production in the economy) for 1980 are obtained from the 'Spanish Domestic Income and its Distribution by Provinces' (FBBVA) publication. The 2007 data come from the Spanish National Statistics Institute (Instituto Nacional de Estadística, INE). The FBBVA data on private gross value added at the provincial level are disaggregated into Agriculture, Energy, Industry, Construction, and Services.
Bilateral shipping costs are measured as the monetary value of the generalized transportation cost ($GTC_{ij}$) of delivering one ton of cargo between origin $i$ and destination $j$. Zofío et al. (2014) describe the underlying model, which assumes cost-minimizing behavior on the part of transportation firms, and determine the least-cost itineraries using geographical information systems that account for the actual road network in those years. In Table 11, we provide the mean value of all bilateral transportation costs for each province $j$, i.e., $\overline{GTC}^{\,t}_{j} = \frac{1}{47} \sum_{i=1}^{47} GTC^{t}_{ij}$. Following the definition of the freeness of trade, $\phi_{ij}$, transport costs are computed as follows: (D-1)
As for the structural parameters $\mu$ and $\sigma$ of the model, few studies have attempted to test the main propositions of new trade theory and new economic geography using Spanish data. Pons et al. (2007) estimate a migration equation based on an NEG model, and obtain a value for $\sigma$ between 2.8 and 4.2, conditional on the values of the other parameters.
Gómez-Antonio and Fingleton (2012) adopt a value of $\sigma = 6.25$ when analyzing the impact of the public capital stock on Spanish productivity. Their choice is justified on the grounds that it coincides with the key estimates in the literature (e.g., Table 5 in Head and Mayer, 2004). Broda and Weinstein (2006) estimate the elasticities of substitution for traded goods imported to the US using SITC Rev. 2 for 1972-1988 and SITC Rev. 3 for 1990-2001, at the 3-, 4-, and 5-digit levels. At the 3-digit level and across all goods, they find a mean elasticity of 6.8 for 1972-1988 and of 4.0 for 1990-2001. Looking only at differentiated goods (as defined using the Rauch (1999) classification) at the 4-digit level, they find a mean elasticity of 5.2 for 1972-1988 and of 4.7 for 1990-2001. Since the estimates obtained by these authors are probably the best currently available, and since they are roughly in line with the estimates obtained for Spain, we take the midpoint value of $\sigma = 5$ (as we also assumed in the numerical simulations performed in the previous sections). Turning to the expenditure share on the differentiated product, $\mu$, we use the expenditure share of manufacturing goods in total domestic demand from the household budget survey published by INE, which was 41.92% in 2007 (data for 1980 are unfortunately unavailable, but this share is remarkably stable both over time and across developed countries, fluctuating around this value over the economic cycle). For simplicity, we round the value to $\mu = 0.4$ (as we also assumed in the numerical simulations).
Since wages are endogenous, we require additional data to test whether the results of the calibrated model match the observed values. In particular, we need information on wages. The latter are obtained, as in many previous studies, by dividing aggregate GDP by the labor force (see the literature review in Head and Mayer, 2004). For Model 1, we associate the homogeneous sector with agriculture, while the differentiated sector corresponds to the manufacturing industry. As for the parameter $\xi$, which captures the level of trade costs of the homogeneous good relative to the differentiated good, we adopt a value of 0.7. Based on data from the 'Ongoing Survey on Freight Road Transportation', carried out by the Ministry of Transport (see Ministerio de Fomento, MFOM, 2007a), we can calculate a range of relative freight costs per ton-kilometer. The cost of shipping homogeneous products relative to differentiated products ranges from 0.7 to 1, with an average of around 0.8. To remain consistent with the values adopted in the previous section, we take the lower bound for $\xi$.
Finally, besides regional labor shares and bilateral trade costs, we need to identify two differentiated sectors for Model 2. We associate the first differentiated sector with manufacturing plus energy, whereas services are associated with the second sector. We leave out agriculture, which is more homogeneous, and construction, which is essentially non-tradable, from the analysis. We determine expenditure shares that match the production side using the household expenditure survey, with the first share corresponding to manufacturing and utilities (processed food, clothing, water, electricity, ...) and the second one to services (health, communication, leisure, education, accommodation, ...). These shares are, unfortunately, only available at the NUTS-2 regional level (States or Comunidades Autónomas), and they are an average over all NUTS-3 provinces included in each region. As a result, we apply the regional values to all provinces of a region. Although this reduces the regional variation, it is the only way we can use this required piece of information. Table 11 below summarizes the data that we use.