Mussolini meets Marshall in the city

ABSTRACT We estimate the link between population density and labour productivity at the city level. An exogenous change in the agglomeration of some cities that occurred in Italy during the fascist dictatorship allows us to instrument the agglomeration proxy. We find evidence of a causal impact of density on productivity for the subpopulation of cities affected by the instrument. The estimated elasticity in our preferred specification is 0.252.


INTRODUCTION
As originally stressed by Marshall (1890), economic activity tends to concentrate geographically due to so-called agglomeration economies. To measure these types of economies, many papers have looked at the effect of population density on productivity proxies (for a review of the literature, see Combes & Gobillon, 2015). While it is reasonable to argue that a change in agglomeration affects productivity, it is less reasonable to assume that such a change is exogenously determined. This is why the literature has relied on instruments for population density, including long-lagged values of the variable, as well as geographical or geological characteristics of the locations. Only a few papers have used historical events as a guide for instrumental variables (for a discussion, see Combes et al., 2011; and Proost & Thisse, 2019). Overall, there is still a lack of consensus about the magnitude of agglomeration effects around the world (Graham & Gibbons, 2019; Melo et al., 2009).
We add to the literature by proposing a new quasi-experiment based on historical facts. In particular, we estimate the causal effect of population density on labour productivity at the municipality level in Italy by exploiting an exogenous change in population density that occurred in some Italian cities during Benito Mussolini's fascist dictatorship. To the best of our knowledge, this is the first study of this type.

EMPIRICS
We use data collected at census dates in the period 1951-2001. Our dependent variable is labour productivity at municipality level. We compute it as the ratio between the value added in a municipality and its employment level, that is, as value added per worker. Value added at city level is the same as in Andini and Andini (2019, p. 624), and, to the best of our knowledge, it is the only existing measure of value added at city level in Italy. Our density variable is measured as city population over surface area.
During the 1920s, Mussolini reorganized the administrative boundaries of Italian municipalities by: (1) incorporating very small municipalities or fractions of them into already-existing bigger municipalities that can be called consolidating units; and (2) merging very small cities to constitute new-born municipalities. From 1921 to 1931, the number of Italian municipalities dropped significantly (Andini et al., 2017, p. 895). We exploit consolidations of type 1 as in Andini and Andini (2019). Those of type 2 have been exploited in a different exercise by Andini et al. (2017), who have also provided more details on the historical facts.
Our instrument takes the value of one for consolidating municipalities (zero otherwise). New-born cities are excluded to clearly identify the subpopulation of interest (already-existing bigger cities that experienced an exogenous change in their population density) and to estimate a local average treatment effect (Angrist & Pischke, 2009; Imbens & Angrist, 1994). In practice, we focus on municipalities with at least 5000 inhabitants in 1951, as consolidations of type 1 were only applied to bigger municipalities.
We estimate the following empirical model:

$$y_i = \alpha + \beta d_i + \gamma x_i + \varepsilon_i \qquad (1)$$

where $y_i$ is the log of average labour productivity of municipality $i$ over the period 1951-2001; $d_i$ is the log of average city population density over the same period; $x_i$ is a set of observable characteristics at municipality level; and $\varepsilon_i$ is the error term of equation (1). The parameter of interest is $\beta$, namely the agglomeration effect. To obtain a consistent estimate of $\beta$ by ordinary least squares (OLS), $d_i$ should be uncorrelated with $\varepsilon_i$, which is not the case. For instance, one could argue that more productive cities attract more people, thus becoming denser. To cope with this problem, $d_i$ is replaced by $\hat{d}_i = \hat{\delta} + \hat{\theta} z_i + \hat{\varphi} x_i$ as obtained from equation (2):

$$d_i = \delta + \theta z_i + \varphi x_i + v_i \qquad (2)$$

Note that $z_i$ is an instrumental variable (see further below), $x_i$ is the above-referred set of city observable characteristics, and $v_i$ is an error term. If $z_i$ is relevant and valid, then using $\hat{d}_i$ instead of $d_i$ in equation (1) yields $\hat{\beta}$, that is, a consistent estimate of the agglomeration effect. In short, our main results are based on a two-stage least squares (2SLS) estimator.
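The two-step logic of equations (1) and (2) can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration only: the data-generating process, the coefficient values, the sample size and the helper names `ols` and `two_stage_least_squares` are hypothetical and do not come from our dataset or estimation code. The sketch only shows why OLS is inconsistent when density is correlated with unobserved heterogeneity, and how replacing density by its first-stage fitted values recovers the parameter of interest.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients for y = X b + e (X must already contain a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(y, d, z, x):
    """Point estimate of beta with one endogenous regressor d, one instrument z
    and exogenous controls x. Standard errors are deliberately not computed:
    the naive second-stage ones would be incorrect."""
    n = len(y)
    const = np.ones(n)
    # First stage (equation 2): density on constant, instrument and controls.
    Z = np.column_stack([const, z, x])
    d_hat = Z @ ols(Z, d)
    # Second stage (equation 1): fitted density replaces observed density.
    X2 = np.column_stack([const, d_hat, x])
    return ols(X2, y)[1]


# Simulated (hypothetical) data: u is unobserved heterogeneity that raises
# both density and productivity, so OLS is biased upwards.
rng = np.random.default_rng(0)
n = 50_000
x = rng.normal(size=n)                     # one observed control
z = (rng.random(n) < 0.145).astype(float)  # binary instrument
u = rng.normal(size=n)
d = 0.5 * z + 0.3 * x + u + rng.normal(size=n)
true_beta = 0.25
y = true_beta * d + 0.2 * x + u + rng.normal(size=n)

beta_ols = ols(np.column_stack([np.ones(n), d, x]), y)[1]
beta_2sls = two_stage_least_squares(y, d, z, x)
print(f"OLS: {beta_ols:.3f}  2SLS: {beta_2sls:.3f}  true: {true_beta}")
```

In this simulated setting the OLS coefficient should lie well above the true value, while the 2SLS coefficient should be close to it, up to sampling noise.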
The information source that we use to construct the instrument is the Sistema Informativo Storico delle Amministrazioni Territoriali, which is provided by the Istituto Nazionale di Statistica (Istat). Istat and the Istituto Cattaneo are the sources for the remaining municipality-level data. Table 1 provides the summary statistics of our dataset.
An instrument must be both relevant and valid. Validity is sometimes also called exogeneity, and this paper uses the two words interchangeably. In our model, relevance means that the consolidation dummy must be correlated with the population density at the city level. Arguably, this correlation is mechanical because the consolidation that affected bigger cities in the 1920s caused a change in both their population level and surface area. Thus, the consolidation event that occurred in the 1920s changed their population density permanently, that is, from the day it happened to the present. As a consequence, from an empirical point of view, it is possible to test the existence of a correlation between the consolidation event of the 1920s and the population density in the 1951-2001 period (see further below).
As for validity, to understand the logic of our instrument, we may usefully compare it with the standard and well-known instrument exploiting a compulsory schooling reform. The latter attributes 1s to all individuals affected by a compulsory schooling reform, that is, those born after a specific year. Since the fact of being born in one year instead of another can be seen as uncorrelated with the unobserved heterogeneity of an individual, say his/her genetic ability, the exclusion restriction applies unless one can argue that being born in one year makes an individual genetically better or worse than an individual born in another year. The compulsory consolidation instrument that we use in this paper attributes 1s to all bigger cities in the 1920s affected by the fascist administrative reform, that is, those having at least one very small city nearby that had to be and was incorporated. Since the fact of having a very small city nearby or not can be seen as uncorrelated with the unobserved heterogeneity of a bigger city in the 1920s, the exclusion restriction applies unless one can argue that having a very small city nearby in the 1920s makes a bigger city 'genetically' better or worse than another bigger city in the 1920s not having a very small city nearby. Needless to say, any bigger city can have a number of smaller cities nearby. However, the fact of having a very small city nearby or not can plausibly be taken as good as randomly assigned. It is like taking 100 sunflowers from a field. All of them have regular size petals. Only a few of them have, in addition to regular size petals, one or two very small petals. These sunflowers with at least one very small petal are the 1s in our instrument, namely 14.5% of our units (Table 1).
In other words, for a bigger city in the 1920s, the fact of becoming a consolidating unit or not can be basically seen as the result of the intersection between the pure chance of having at least a very small city nearby at that time and a dictatorial obligation to incorporate it, neither of which can be linked to the 'preferences' (unobserved heterogeneity) of a bigger city in the 1920s.
Of course, even in the identification approach exploiting a compulsory schooling reform, instrument exogeneity is usually claimed in a model that controls for a set of observed covariates. The aim is to neutralize the (non-random) part of the assignment that can be related to observables in the real (i.e., non-experimental) world. In this way, the observed characteristics of the above-referred 100 sunflowers are forced to be as similar as possible, except for the presence or absence of at least one very small petal. In this paper, following the standard practice, we claim instrument exogeneity conditional on a set of observed municipality characteristics (see further below).
Table 2 provides first-stage results. All the first-stage regressions show a positive and significant link between population density and consolidation. Consolidating municipalities appear to have, on average, higher population density than other municipalities, conditional on their observed characteristics. In the simplest specification, the estimated coefficient is 0.527. In our preferred specification, the magnitude of the estimated coefficient is 0.462. Since the F-statistic for the weak-instrument test is well above the threshold value of 10 in all the specifications, our instrument can be seen as strongly relevant.
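With a single excluded instrument, the first-stage F-statistic used in the weak-instrument check equals the squared t-statistic of the instrument coefficient. The sketch below computes it on simulated data under homoskedastic standard errors. The data, the coefficient values and the function name `first_stage_f` are illustrative assumptions; the statistics in Table 2 may rely on a different variance estimator.

```python
import numpy as np


def first_stage_f(d, z, x):
    """Return (instrument coefficient, F-statistic) from regressing d on a
    constant, the single instrument z and controls x, assuming
    homoskedastic errors. With one instrument, F is the squared t-statistic."""
    n = len(d)
    Z = np.column_stack([np.ones(n), z, x])
    coef, *_ = np.linalg.lstsq(Z, d, rcond=None)
    resid = d - Z @ coef
    sigma2 = resid @ resid / (n - Z.shape[1])  # residual variance
    cov = sigma2 * np.linalg.inv(Z.T @ Z)      # classical OLS covariance
    t_stat = coef[1] / np.sqrt(cov[1, 1])
    return coef[1], t_stat ** 2


# Simulated (hypothetical) first stage: binary instrument, one control.
rng = np.random.default_rng(1)
n = 5_000
x = rng.normal(size=n)
z = (rng.random(n) < 0.145).astype(float)
d = 0.5 * z + 0.3 * x + rng.normal(size=n)

theta_hat, f_stat = first_stage_f(d, z, x)
print(f"first-stage coefficient: {theta_hat:.3f}, F-statistic: {f_stat:.1f}")
```

An F-statistic above the conventional threshold of 10 is read as evidence against a weak instrument.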
Table 3 provides reduced-form estimates. On average, the link between labour productivity and consolidation is positive and significant. A reduced-form coefficient provides the difference between the average productivity for the group of cities with the instrument taking the value of 1 and the average productivity for the group of cities with the instrument taking the value of 0, conditional on observables. As the instrument is a policy variable in our model, its coefficient represents the policy effect. In the simplest specification, the coefficient estimate is 0.212. In our preferred specification, the coefficient estimate is equal to 0.116.
Table 4 provides the main results of our study. On average, the causal link between labour productivity and population density is positive and significant. In the simplest specification, a 1% increase in average density in the 1951-2001 period is associated with a 0.402% increase in average labour productivity in the same period (note that both density and productivity are measured in logs). In our preferred specification, a 1% increase in density is associated with a 0.252% increase in productivity.
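In a just-identified model with one instrument and one endogenous regressor, the 2SLS coefficient equals the ratio of the reduced-form coefficient to the first-stage coefficient, provided both come from the same specification. The short check below applies this identity to the coefficients quoted in the text for Tables 2, 3 and 4. The dictionary names are ours, and small discrepancies are to be expected because the inputs are rounded to three decimals.

```python
# Coefficients as quoted in the text (rounded to three decimals).
first_stage = {"simplest": 0.527, "preferred": 0.462}    # Table 2
reduced_form = {"simplest": 0.212, "preferred": 0.116}   # Table 3
reported_2sls = {"simplest": 0.402, "preferred": 0.252}  # Table 4

implied = {}
for spec in first_stage:
    # Indirect least squares: reduced form divided by first stage.
    implied[spec] = reduced_form[spec] / first_stage[spec]
    gap = abs(implied[spec] - reported_2sls[spec])
    print(f"{spec}: implied {implied[spec]:.4f}, "
          f"reported {reported_2sls[spec]:.3f}, gap {gap:.4f}")
```

Both implied ratios agree with the reported elasticities up to rounding of the inputs.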
Corrado Andini and Monica Andini, Regional Studies, Regional Science

There are at least four specific points in our analysis that deserve further discussion. First, the only area covariate that we use is a dummy for Southern location. This is because, on the one hand, we want to take the so-called questione meridionale into account (the latter was first mentioned right after the unification of the country in 1861; Pescosolido, 2017); and, on the other, we find that neither region nor province dummies are jointly significant. The intuition for the latter result is that the assignment to the treatment in the 1920s was not driven by region or province characteristics. Regions did not exist as local governments in the 1920s and became operational in the 1970s. Provinces did exist in the 1920s. However, their limited administrative competencies were unclear in those years and were only clarified through a 1934 decree. In fact, their existence and usefulness as local units of government have always been questioned (Fabrizi, 2008). A few years ago, their limited powers were further weakened by the Italian parliament. With reference to the existing literature, our choice of using the dummy for Southern location as the only area covariate is also supported by the recent Sommervoll and Sommervoll (2019) claim that 'For spatial fixed effects, it is not a question of introducing many spatial dummies, but of finding a few good ones' (p. 247). In practice, the main implication of our area-covariate choice is that the number of treated and control units that are used to estimate the elasticity of interest is larger in the macro area (e.g., South) than in the region area (e.g., region of Campania) or in the province area (e.g., province of Salerno), thus resulting in a more precise estimate.
Second, we do not report clustered standard errors because Abadie et al. (2023) have recently argued that 'when the number of clusters in the sample is a nonnegligible fraction of the number of clusters in the population, conventional cluster standard errors can be severely inflated' (pp. 1-2). They add: 'If the sample represents a large fraction of the population and treatment effects are heterogeneous across units, robust standard errors are also conservative' (pp. 31-2). In our case, the number of clusters (provinces or regions) coincides with the number of clusters in the population, our data represent the whole population of municipalities with at least 5000 inhabitants in 1951, and we allow for a distribution of causal effects across municipalities, of which a local average is estimated.
Third, our evidence is robust to a validation check inspired by Conley et al. (2012).If we relax the assumption of instrument exogeneity by allowing for the instrument to have a correlation with the residuals (and thus with the outcome) in a range between 0.000 (the exogeneity assumption, that is, all the effect of the instrument on the outcome materializes through density) and 0.116 (the reduced form effect in our preferred specification, that is, all the effect of the instrument on the outcome materializes directly), then the elasticity of interest is estimated in a range between −0.072 and 0.332 (union of 95% confidence intervals), which is reassuring about our finding that the true estimate is significantly larger than zero.
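The mechanics of this check can be sketched with point estimates only. If the instrument is allowed a direct effect on the outcome, here called `gamma` (our label, not notation from the text), the implied elasticity in a just-identified model is the reduced-form coefficient net of `gamma`, divided by the first-stage coefficient. The sketch uses the rounded preferred-specification coefficients quoted above. It does not reproduce the union of 95% confidence intervals reported in the text, which requires standard errors that are not restated here.

```python
import numpy as np

FIRST_STAGE = 0.462   # preferred specification, Table 2
REDUCED_FORM = 0.116  # preferred specification, Table 3


def implied_elasticity(gamma):
    """Elasticity implied when the instrument has a direct effect gamma on
    the outcome: the reduced form net of gamma, scaled by the first stage."""
    return (REDUCED_FORM - gamma) / FIRST_STAGE


# Grid from full exogeneity (gamma = 0) to the case in which the whole
# reduced-form effect is direct (gamma equal to the reduced form).
gammas = np.linspace(0.0, REDUCED_FORM, 5)
betas = [implied_elasticity(g) for g in gammas]

for g, b in zip(gammas, betas):
    print(f"gamma = {g:.3f} -> implied elasticity = {b:.3f}")
```

The point estimate falls linearly from about 0.251 under full exogeneity to zero when the entire reduced-form effect is assumed to be direct; the interval reported in the text is wider because it also reflects sampling uncertainty.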
Fourth, our empirical model controls for a set of geographic/geological characteristics of municipalities (e.g., coastal location) that are strictly exogenous, and for one characteristic whose exogeneity can be questioned, namely a dummy indicating whether a city is a provincial capital or not. However, our results in Table 4 are robust to the exclusion/inclusion of this dummy. We keep it in our preferred specification to reduce the likelihood that the assignment to the treatment was partly driven by the special characteristics of these peculiar cities.

CONCLUSIONS
We estimate the agglomeration effect at city level in Italy by looking at the link between labour productivity and population density. The latter is instrumented by using an exogenous change in the population density of bigger cities that occurred during the fascist dictatorship. We find evidence of a positive and significant causal link between agglomeration and productivity.
To better position our study within the existing literature, we may usefully emphasize the following three points.
First, according to the meta-analyses of both Melo et al. (2009) and Graham and Gibbons (2019), the magnitude of the estimated agglomeration effect around the world spans from −0.800 to 0.658. Our estimated elasticity of 0.252 lies within this wide range.
Second, our exercise provides an estimate that is local in the sense of referring to the specific subpopulation of municipalities whose level of density was affected by a compulsory incorporation in the 1920s. Therefore, no number in the existing literature can be usefully compared with ours, unless it comes from an exercise that looks at exactly the same subpopulation. We are not aware of any study of this type. This means that the estimates proposed in other studies are perfectly compatible with our estimate even in the cases where they are different from ours. The reason is simple: the estimates from other studies refer to different subpopulations of units (assuming that identification holds in all studies). The only estimates in this paper that can actually be compared with previous studies for Italy are those based on OLS, reported in Table A1 in Appendix A in the supplemental data online. The reason is that they are independent of identification. Their magnitude is in line with the OLS estimates proposed in a recent study by Buzzacchi et al. (2021), supporting the reliability of our method to construct the outcome variable, namely value added per worker. Relatedly, it is also worth stressing that the use of value added per worker as dependent variable in a regression model estimated with city-level data is an element of novelty of this paper with respect to earlier research.
Third, earlier point estimates of elasticities for Italy are generally lower than our main estimate, typically much lower. Besides identification, differences in terms of units of analysis, outcome variable, main explanatory variable, control set and time span are likely to explain different coefficient estimates across studies. In particular, regarding the units of analysis, we are not aware of any study that uses Italian data at city level. Among those cited in the most recent meta-analysis we are aware of (Graham & Gibbons, 2019), the only study that uses city-level data is an article on agglomeration economies in Japan in 1979 (Nakamura, 1985). As for the outcome variable, we are not aware of any study that uses a direct measure of labour productivity (value added per worker). Some have used total factor productivity (Buzzacchi et al., 2021; Cingano & Schivardi, 2004); others have looked at wages (Di Addario & Patacchini, 2008; Mion & Naticchioni, 2009). As for the main explanatory variable, we are aware of only one recent study that uses population density (Buzzacchi et al., 2021), though the latter is used for a robustness check of baseline results based on employment density. In some cases, the main explanatory variable is not the population density, but rather the population level (Di Addario & Patacchini, 2008) or the employment level (Cingano & Schivardi, 2004). As for the time span, the elasticity of interest is typically related to a 1% change in the agglomeration proxy over a shorter timeframe. For instance, in a recent work by Buzzacchi et al. (2021), the 1% change in the agglomeration proxy refers to one year, namely 2001, to the best of our understanding. In our work, the 1% change in the population density refers to a five-decade period, as explained in the previous section. Last but not least, in some cases (Accetturo et al., 2018; Di Giacinto et al., 2014), the elasticity of interest and the model setup are completely different from ours, which makes a comparative analysis unfeasible.

DISCLOSURE STATEMENT
No potential conflict of interest was reported by the authors.