The Economic Impact of Universities: Evidence from Across the Globe

We develop a new dataset using UNESCO source materials on the location of nearly 15,000 universities in about 1,500 regions across 78 countries, some dating back to the 11th Century. We estimate fixed effects models at the sub-national level between 1950 and 2010 and find that increases in the number of universities are positively associated with future growth of GDP per capita (and this relationship is robust to controlling for a host of observables, as well as unobserved regional trends). Our estimates imply that doubling the number of universities per capita is associated with 4% higher future GDP per capita. Furthermore, there appear to be positive spillover effects from universities to geographically close neighboring regions. We show that the relationship between growth and universities is not simply driven by the direct expenditures of the university, its staff and students. Part of the effect of universities on growth is mediated through an increased supply of human capital and greater innovation (although the magnitudes are not large). We find that within countries, higher historical university presence is associated with stronger pro-democratic attitudes.


I. INTRODUCTION
A striking feature of the last hundred years has been the enormous expansion in university education. In 1900, only about one in a hundred young people in the world were enrolled at universities, but over the course of the Twentieth Century this rose to one in five (Schofer and Meyer, 2005). The term "university" was coined by the University of Bologna, founded in 1088, the first of the medieval universities. These were communities with administrative autonomy, courses of study, publicly recognised degrees and research objectives and were distinct from the religion-based institutions that came before (De Ridder-Symoens, 1992). Since then, universities spread worldwide in broadly the same form, and it has been argued that they were an important force in the Commercial Revolution through the development of legal institutions (Cantoni and Yuchtman, 2014) and the industrial revolution through their role in the building of knowledge and its dissemination (Mokyr, 2002).
While there is an extensive literature on human capital and growth, there is relatively little research on the economic impact of universities themselves. In this paper, we develop a new dataset on the location of universities in 1,500 regions across 78 countries, and explore how university formation has influenced economic growth since 1950, when consistent sub-national economic data are first available. This period is of particular interest because in the years following World War II, university expansion accelerated in most countries, a trend partially driven by the view that higher education is essential for economic and social progress. This was in contrast to the pre-War fears of "over-education" that were prevalent in many countries, should enrolments extend much beyond the national elites (Schofer and Meyer, 2005; Goldin and Katz, 2008).
There are a number of channels through which universities may affect growth including (i) greater skill supply; (ii) more innovation; (iii) support for democratic values; and (iv) demand. 4 universities' role as human capital producers, should graduates enter the workforce and innovate.
A number of papers have found that universities increase local innovative capacity. 3 A drawback of this literature is that it looks at an indirect cause of growth using proxies for innovation such as patents rather than at output directly. Moreover, the work is also focused on single countries, hence limiting its generalizability.
A third way universities may matter is by fostering pro-growth institutions. Universities could promote strong institutions directly by providing a platform for democratic dialogue and sharing of ideas, through events, publications, or reports to policy makers. A more obvious channel would be that universities strengthen institutions via their role as human capital producers. The relationship between human capital, institutions and growth is much debated in the literature.
There is controversy over whether institutions matter at all for growth. 4 Some papers have argued that human capital is the basic source of growth, and the driver of democracy and improved institutions (e.g. Glaeser et al., 2004). But the relationship between education and democracy/institutions is contested by Acemoglu et al. (2005b) who show that the effects found in the cross section of countries are not robust to including country fixed effects and exploiting within-country variation.
3 Jaffe (1989) uses US state level data to provide evidence of commercial spillovers from university research in patenting (in specific technical sectors) and R&D spending by firms, and Jaffe et al. (1993) provide more evidence for localization by comparing the distances between citations and cited patents. Belenzon and Schankerman (2013) look more closely at the distance between the location of university patents, and the firms that cite them in subsequent patents. They find that university knowledge spillovers are strongly localized. Hausman (2012) finds that universities stimulate nearby economic activity via the spread of innovation: long run employment and pay rise in sectors closely tied with a local university's innovative strength, and this impact increases with proximity to the university. Toivanen and Väänänen (2014) consider how universities affect innovation via their role as human capital producers: they use distance to a technical university as an instrument in estimating the effect of engineering education on patents in Finland (which they find to be positive and significant). They perform a counterfactual calculation which suggests that establishing three new technical universities resulted in a 20 per cent increase in USPTO patents in Finland. They suggest that the effects that universities have on their local economies may grow over time as the composition of local industries adjusts. While much of this effect is likely to be due to innovation spillovers, it may also capture other types of agglomeration externalities.
Finally, universities may affect growth through a more mechanical "demand" channel. Increased consumption from students and staff and the universities' purchase of local goods and services from the region could in principle have a material impact on GDP. This would occur when university costs are financed through national governments from tax revenues raised mainly outside the region where the university is located.
In this paper, we develop a new dataset using the World Higher Education Database (WHED) which is produced by the International Association of Universities in association with UNESCO. We map the location of nearly 15,000 universities into 1,500 regions across 78 countries to explore how changes in the number of universities within regions have affected subsequent growth.
We show that university growth has a strong association with later GDP per capita growth at the sub-national level. Even after including a host of controls (including country or region fixed effects to control for differential regional trends, and year dummies) we find that a doubling of universities in a region is associated with over 4% higher GDP per capita. We show that reverse causality does not appear to be driving this, nor do mechanical demand effects. We also find that universities in neighboring regions or other regions in a country also affect a region's growth, and there appears to be a spatial element to this, with larger effects for regions that are close together.
Finally, we show that the university effect works through increasing the supply of human capital and also through raising innovation, but both these channels are small in magnitude. In addition, we find that universities appear to affect views on democracy even when we control for human capital, consistent with a story that they may have some role in shaping institutions over longer time horizons.
The strength of our paper is the comprehensiveness of the dataset in terms of the coverage of sub-national regions and time periods. We do not have plausible instruments for university location, but the correlations are highly suggestive as we can control for a wide range of region-specific trends and observable confounders.
To date, few papers have explicitly considered the direct link between university presence and economic performance. Cantoni and Yuchtman (2014) provide evidence that medieval universities in 14th century Germany played a causal role in the commercial revolution. In a contemporary setting, Aghion et al. (2009) consider the impact of research university activity on US states. Using political instruments, they find that exogenous increases in investments in four year college education affect growth and patenting. Kantor and Whalley (2014) estimate local agglomeration spillovers from US research university activity, using university endowment values and stock market shocks as an instrument for university spending. They find evidence for local spillover effects to firms, which are larger for research intensive universities or firms that are "technologically closer" to universities. 5 Feng and Valero (2016) use international data to show that firms that are closer to universities have higher Bloom et al. (2016) management scores.
This paper is organized as follows. Section II describes the data and some of its key features including interesting trends and correlations which give us a macro level understanding of the global rise in universities over time. Section III sets out our econometric strategy, and Section IV our results. Section V explores the mechanisms through which universities appear to affect regional growth and finally, Section VI provides some concluding comments.

II. DATA
Our regression analysis is based upon information on universities in some 1,500 regions in 78 countries. This represents the set of regions for which our university data can be mapped to a regional time series of key economic variables obtained from Gennaioli et al. (2014), and covers over 90 per cent of global GDP. 6 We first describe the full World Higher Education Database (WHED) across all countries, with some key global trends and correlations. Then we focus on the 78 countries for which regional economic data are available, describing how we aggregate the WHED data into regions, and present some initial descriptive evidence.

II.1 World Higher Education Database
WHED is an online database published by the International Association of Universities in collaboration with UNESCO. 7 It contains information on higher education institutions that offer at least a three or four year professional diploma or a post-graduate degree. In 2010, there were 16,326 universities across 185 countries meeting this criterion. The database therefore excludes, for example, community colleges in the US and further education institutions in the UK and may be thought of as a sample of "higher quality" universities. The Data Appendix contains more discussion on countries and types of institution omitted. Key variables of interest include university location, founding date, subjects and qualifications offered and other institutional details such as how they are funded.
Our regional analysis is based on the sample of countries for which GDP and other data are available from 1955, which covers 78 countries, comprising 14,868 (or 91%) of the institutions from the full listing. Our baseline results simply use the year-specific count of universities by region as a measure of university presence, always controlling for region population. To calculate this, we first allocate each university to a region (for example, a US state), and then use the founding dates of universities in each region to determine the number of universities that were present at any particular date. 8 High rates of university exit would invalidate this type of approach, but we find that this does not appear to be an issue over the decades since the 1950s (see the Data Appendix for an extensive discussion). This is because there has been very little exit from the university sector over this period.
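The counting step described above can be illustrated with a minimal Python sketch. The record layout (a region label paired with a founding year), the function name and the sample values are assumptions made for illustration only; they are not the WHED schema. The sketch also assumes no university exit, in line with the discussion above.

```python
from bisect import bisect_right
from collections import defaultdict


def university_counts(universities, years):
    """Return {(region, year): number of universities founded on or before year}.

    `universities` is an iterable of (region, founding_year) pairs.
    Exit is ignored: once founded, a university is counted in every later year.
    """
    founded = defaultdict(list)
    for region, founding_year in universities:
        founded[region].append(founding_year)

    counts = {}
    for region, dates in founded.items():
        dates.sort()
        for year in years:
            # bisect_right gives the number of founding dates <= year
            counts[(region, year)] = bisect_right(dates, year)
    return counts


# Hypothetical records, not taken from the database
sample = [
    ("Region A", 1088),
    ("Region A", 1950),
    ("Region A", 1962),
    ("Region B", 1975),
]
counts = university_counts(sample, years=range(1950, 2011, 5))
```

Regions that never appear in the records have no entries here; in practice they would need to be added with a count of zero so that they remain in the panel.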
A disadvantage of the "university density" measure is that it does not correct for the size or quality of the university. Unfortunately, this type of data is not available on a consistent basis across all countries. We therefore present robustness results on sub-samples where we do have finer-grained measures of university size and quality, to make sure our baseline results are not misleading.

II.2 The Worldwide Diffusion of Universities
We begin by presenting some descriptive analysis of the university data at the macro level using the full university database. Figure 1 shows how the total number of universities has evolved over time, marking the years that the number doubled. The world's first university opened in 1088 (in Bologna) and growth took off in the 19th Century, growing most rapidly in the post-World War II period (see Panel A). In Panel B we normalize the number of worldwide universities by the global population to show that university density also rose sharply in the 1800s. It continued to rise in the 20th Century, albeit at a slower rate, and has accelerated again after the 1980s when emerging countries like Brazil and India saw rapid expansions in universities.
A number of additional descriptive charts are in the online Appendix. We find that in 2010, the distribution of universities across countries is skewed, with seven countries (US, Brazil, Philippines, Mexico, Japan, Russia and India, in descending order) accounting for over half of the universities in the world (Figure A1). The US is the country with the largest share, accounting for 13% of the world's universities. We also examine the "extensive margin", that is, the cumulative number of countries that have any university over time (Figure A2), and the cross sectional correlations of universities with key economic variables at the country level (Figure A3). Unsurprisingly, we find that higher university density is associated with higher GDP per capita levels. It is interesting that countries with more universities in 1960 generally had higher growth rates over 1960-2000. Furthermore, there are strong correlations between universities and average years of schooling, patent applications and democracy. 9 These correlations provide a basis for us to explore further whether universities matter for GDP growth within countries, and to what extent any effect operates via human capital, innovation or institutions.

II.3 Regional Economic Data
We obtain regional economic data from Gennaioli et al. (2014) who collated key economic variables for growth regressions at the sub-national level. The outcome variable we focus on is regional GDP per capita. Since for many countries, regional GDP data and other variables such as population or years of education are not available annually we follow Barro (2012) and compute average annual growth rates in GDP per capita over five year periods. 10 We also gather patents data at the regional level as a measure of innovation. For the US only, we obtain USPTO data at the state level over the period 1965-1999 from the NBER (Hall et al., 2001). For 38 countries, we obtain region-level European Patent Office (EPO) patents from the OECD REGPAT database covering 1975 to 2005.
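To make the growth-rate construction concrete, the following is a minimal Python sketch of an average annual growth rate over five year periods. It assumes growth is measured as the log difference divided by the number of years, which is a common convention but is an assumption here; the function name and the sample series are hypothetical.

```python
import math


def avg_annual_growth(series, span=5):
    """Average annual log growth over `span`-year periods.

    `series` maps year -> level (e.g. regional GDP per capita).
    Returns {year: (ln y_t - ln y_{t-span}) / span} for years where
    both endpoints are observed and strictly positive.
    """
    growth = {}
    for year, level in series.items():
        previous = series.get(year - span)
        if previous is not None and previous > 0 and level > 0:
            growth[year] = (math.log(level) - math.log(previous)) / span
    return growth


# Hypothetical GDP per capita series observed every five years
gdp_pc = {
    1950: 1000.0,
    1955: 1000.0 * math.exp(0.10),
    1960: 1000.0 * math.exp(0.25),
}
g = avg_annual_growth(gdp_pc)
```

Years without an observation exactly `span` years earlier are simply skipped, so gaps in the regional series shorten the panel rather than distort the growth rates.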
As an initial investigation, we examine the regional cross sectional correlations between universities and regional GDP per capita, based on the year 2000, for which data for 65 countries out of the full sample are available. Column (1) of Table 1 shows that there is a significant and positive correlation between GDP per capita and universities: controlling for population, a 1% increase in the number of universities is associated with 0.7% higher GDP per capita. Column (2) includes country fixed effects which reduces the university coefficient substantially from 0.680 to 0.214.
We include a host of further geographic controls in column (3): whether the region contains a capital city, latitude, inverse distance to ocean, malaria ecology and the log of cumulative oil and gas production. 11 This reduces the coefficient on universities still further to 0.160. In column (4) we add years of education. This reduces the coefficient on universities by around two-thirds. 12 In column (5) we restrict to the sample for which patents data are available, and add years of education in column (6). Again, this reduces the impact of universities, by about half. In column (7) we see that adding a measure of patent "stock" reduces our coefficient on universities by a further 28% to 0.0565, but it remains highly significant. This analysis suggests that universities may contribute to higher GDP per capita mainly as human capital producers, but also as producers of innovation.
The correlations in Table 1 are purely cross sectional and there could be a multitude of unobservables that lead regions to have both higher GDP per capita and more universities. Our focus therefore turns to growth rates for the bulk of our analysis, which allows us to sweep out the time invariant factors. Figure 2 shows that the raw correlations between growth rates of universities and GDP per capita that we saw at the country level are also present within countries. Panel A simply plots the average annual growth in regional GDP per capita on the average annual growth in universities for each region, over the time period for which data are available (which differs by region). Average GDP per capita growth rates are plotted within 20 evenly sized bins of university growth, and country fixed effects are absorbed so that variation is within country. Panel B plots GDP per capita growth rates on lagged university growth for the 8,128 region-years (on which we conduct the core of our analysis that will follow). In both graphs it is clear that there is a positive relationship. In addition, these graphs show that there are observations in the top bin with very high university growth. We explored which region-years were driving this and found that they are evenly spread across 60 countries and different years, so they do not appear to be data errors.
11 Specifically, we take the natural log of 1+ this value, so that we retain zeroes in our sample.
12 The coefficient on years of education is highly significant and similar in magnitude to the cross section results in Gennaioli et al. (2013). In regressions of regional income per capita on years of education, controlling for geographic characteristics, Gennaioli et al. (2013) estimate a coefficient of 0.2763, see their Table IV column (2).
Dropping the observations in the highest growth bin actually strengthens the correlation in this simple scatter plot. We keep all the data in the main regressions, but show that the results are robust to dropping these observations or winsorizing lagged university growth. Table 2 has some descriptive statistics of our sample of 8,128 region-years. The average region has GDP per capita of just over $13,000, average growth of 2% per annum and nearly ten universities (this is quite skewed with a median of 2, so in our robustness tests, we show that our results are not sensitive to dropping region-years with no universities). 13 As we set out in the next section, our core regressions will control for population levels and growth, and a number of geographic characteristics -including an indicator for whether a region contains a country's capital (this is the case for 5 per cent of the observations). Measures of regional human capital (college share and years of education) are available for sub-samples of region-years. In those samples, the average region-year has a college share of seven per cent and average years of education of just over seven.
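The binned scatter plots described for Figure 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code behind the figure: country fixed effects are absorbed by demeaning both variables within country, observations are sorted by residualized university growth and split into 20 equally sized bins, and the bin means are returned. The function name and the synthetic data (including the slope of 0.05) are hypothetical.

```python
import numpy as np


def binned_scatter(x, y, groups, n_bins=20):
    """Bin means of y against x after absorbing group fixed effects.

    Both x and y are demeaned within each group (e.g. country), then
    observations are ordered by residualized x and split into n_bins
    bins of (near) equal size.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)

    x_res = x.copy()
    y_res = y.copy()
    for g in np.unique(groups):
        mask = groups == g
        x_res[mask] -= x[mask].mean()
        y_res[mask] -= y[mask].mean()

    order = np.argsort(x_res, kind="stable")
    bins = np.array_split(order, n_bins)
    bin_x = np.array([x_res[b].mean() for b in bins])
    bin_y = np.array([y_res[b].mean() for b in bins])
    return bin_x, bin_y


# Synthetic data for illustration only
rng = np.random.default_rng(0)
n = 2000
country = rng.integers(0, 10, n)
country_effect = rng.normal(0.0, 0.02, 10)[country]
uni_growth = rng.normal(0.01, 0.02, n) + 0.5 * country_effect
gdp_growth = 0.05 * uni_growth + country_effect + rng.normal(0.0, 0.0005, n)

bin_x, bin_y = binned_scatter(uni_growth, gdp_growth, country)
slope = np.polyfit(bin_x, bin_y, 1)[0]
```

Because the country effect is correlated with university growth in the synthetic data, a raw scatter would overstate the slope; demeaning within country removes that component before binning.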

III. EMPIRICAL FRAMEWORK
The underlying model we are interested in is the long run relationship between universities and economic performance:

$$\ln(Y/L)_{ic,t} = \alpha_1 \ln(Uni_{ic,t-5}) + X'_{ic,t-5}\alpha_2 + \varepsilon_{ic,t} \qquad (1)$$

where $\ln(Y/L)_{ic,t}$ is the log level of GDP per capita for region i, in country c, and year t and $Uni_{ic,t-5}$ is the lagged number of universities in the region plus 1 (so that we include observations where there are no universities in our analysis). We lag this because the impact of universities is unlikely to be immediate, and since we estimate in five-year differences, using the fifth lag is natural (we also show longer distributed lags). In addition, using the lag means that we eliminate the effects of a contemporaneous demand shock that raises GDP per capita and also results in the opening of new universities. We also control for a number of observables $X_{ic,t-5}$ that may be related to GDP per capita growth and also the growth in universities; in particular the population of the region, $pop_{ic,t-5}$. $\varepsilon_{ic,t}$ is the error term.
13 A related fact is that the median growth rate of the number of universities is zero (5,736 observations). We also checked that the results are not driven by regions that increased their number of universities from zero to one or more.
The cross sectional relationship is likely to be confounded by unobservable region-specific effects. To tackle this we estimate the model in long (five-year) differences to sweep out the fixed effects. Our main estimating model is therefore:

$$\Delta\ln(Y/L)_{ic,t} = \alpha_1 \Delta\ln(Uni_{ic,t-5}) + X'_{ic,t-5}\alpha_2 + \alpha_3 \ln(Y/L)_{ic,t-5} + \alpha_4 \Delta\ln(pop_{ic,t-5}) + \eta_c + \tau_t + \varepsilon_{ic,t} \qquad (2)$$

We control for the lagged level of GDP per capita in the region, $\ln(Y/L)_{ic,t-5}$, to allow for catch up (we expect $\alpha_3$ to be negative). In the controls we include the lagged level of population, country level GDP per capita and the region specific time invariant controls. We control for the lagged growth in population because an increase in universities may simply reflect a greater demand due to population growth. We do not initially include any other measure of human capital in these specifications, so that we can capture the total effect that universities have on growth. However, we explore the effect of adding human capital when we try to pin down the mechanism through which universities impact on growth. Finally, we include country fixed effects ($\eta_c$) to allow for country-specific time trends, time dummies ($\tau_t$) and an error term $\varepsilon_{ic,t}$. Standard errors are clustered at the regional level. In robustness tests, we also estimate models where we include a full set of regional dummies and so allow for unobservable regional trends.
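A specification of this kind (growth regressed on lagged university growth, lagged income, lagged population growth, country and time dummies, with standard errors clustered by region) can be sketched with plain NumPy. This is an illustrative sketch on synthetic data, not the estimation code used for the tables: all variable names and the data-generating values (such as 0.045 and -0.02) are hypothetical, the lagged income level is drawn at random rather than generated dynamically, and the sandwich estimator omits any small-sample correction.

```python
import numpy as np


def ols_cluster(y, X, clusters):
    """OLS coefficients with cluster-robust (sandwich) standard errors.

    No finite-sample degrees-of-freedom correction is applied.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    clusters = np.asarray(clusters)

    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        idx = clusters == g
        score = X[idx].T @ resid[idx]  # sum of x_i * u_i within the cluster
        meat += np.outer(score, score)

    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))


def dummies(labels):
    """Indicator columns for each level of `labels`, dropping the first level."""
    labels = np.asarray(labels)
    levels = np.unique(labels)[1:]
    return (labels[:, None] == levels[None, :]).astype(float)


# Synthetic region-period panel for illustration only
rng = np.random.default_rng(0)
n_regions, n_periods, n_countries = 60, 8, 4
region = np.repeat(np.arange(n_regions), n_periods)
period = np.tile(np.arange(n_periods), n_regions)
country = region % n_countries

d_ln_uni_lag = rng.normal(0.0, 0.1, region.size)
ln_gdp_lag = rng.normal(9.0, 1.0, region.size)
d_ln_pop_lag = rng.normal(0.01, 0.01, region.size)
eta_c = rng.normal(0.0, 0.01, n_countries)[country]
tau_t = rng.normal(0.0, 0.01, n_periods)[period]
eps = rng.normal(0.0, 0.002, region.size)
d_ln_gdp = (0.045 * d_ln_uni_lag - 0.02 * ln_gdp_lag
            + 0.1 * d_ln_pop_lag + eta_c + tau_t + eps)

X = np.column_stack([
    np.ones(region.size),  # constant
    d_ln_uni_lag,          # lagged university growth
    ln_gdp_lag,            # lagged income level (convergence term)
    d_ln_pop_lag,          # lagged population growth
    dummies(country),      # country fixed effects
    dummies(period),       # time dummies
])
beta, se = ols_cluster(d_ln_gdp, X, clusters=region)
```

Here `beta[1]` plays the role of the coefficient on lagged university growth and `beta[2]` the convergence term. Replacing the country dummies with region dummies would give the more demanding regional fixed effects variant mentioned above.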
We also explore the extent to which GDP per capita growth in region i may be affected by growth of universities in other regions within the same country. We extend our estimating equation (2) to include the growth of universities in other regions, which may be the nearest region (j) or simply all other regions in the country (-i). Therefore, we include the growth in region i's own universities ($\Delta\ln(Uni_{ic,t-5})$) as well as a potential spillover effect from universities located in neighboring regions ($\Delta\ln(Uni_{jc,t-5})$), see equation (3). The lagged population level and population growth in region j are in the controls, $X'_{jc,t-5}$.

$$\Delta\ln(Y/L)_{ic,t} = \beta_1 \Delta\ln(Uni_{ic,t-5}) + \beta_2 \Delta\ln(Uni_{jc,t-5}) + X'_{ic,t-5}\beta_3 + X'_{jc,t-5}\beta_4 + \eta_c + \tau_t + \varepsilon_{ic,t} \qquad (3)$$

We allow for spatial variation by interacting university growth with the distance between region i and its nearest region, and control for this distance separately, see equation (4). The variable $dist_{ij}$ is the distance in kilometers between the centroids of regions i and j.
$$\Delta\ln(Y/L)_{ic,t} = \beta_1 \Delta\ln(Uni_{ic,t-5}) + \beta_2 \Delta\ln(Uni_{jc,t-5}) + \beta_3\, dist_{ij} \times \Delta\ln(Uni_{jc,t-5}) + \beta_4\, dist_{ij} + X'_{ic,t-5}\beta_5 + X'_{jc,t-5}\beta_6 + \eta_c + \tau_t + \varepsilon_{ic,t} \qquad (4)$$

In the limit, if the nearest region were very close to region i, so that $dist_{ij} = 0$, we would expect the effect of university growth in region j to be close to the effect of university growth in region i, so $\beta_2$ should be close to $\beta_1$. More generally, we would expect that $\beta_3$ would be negative so that the effect of region j gets smaller as distance increases.

IV.1 Basic Results
Table 3 presents our basic regressions. Column (1) is a simple correlation between the lagged growth rate of universities and regional GDP per capita, with no other controls. The estimated coefficient is 0.0469 and highly significant. To control for the fact that populous regions are more likely to require more universities, we add the lagged level of the population in column (2) which lowers the university coefficient slightly. Adding country and year fixed effects in column (3) has little effect, and actually raises the university coefficient slightly. In column (4) we add lagged regional GDP per capita, growth in population, and the regional covariates (latitude, inverse distance to the coast, malaria ecology, and the natural log of oil and gas production) and a dummy for regions that contain a capital city. In column (5) we control for lagged country-level GDP per capita which should capture time varying macro shocks, with little effect. Columns (6) and (7) replicate (4) and (5) but include regional fixed effects, a very demanding specification which allows for regional trends. These do not much affect the university coefficient and in fact it is higher at 0.0468 in the most general specification. Overall, these results suggest that on average, a doubling of universities in a region is associated with 4% to 5% higher GDP per person. 14
We include lagged regional GDP as this is standard in growth regressions to capture convergence, and because in this application, it is relevant to know that our university effect survives holding initial conditions of a region constant. There are of course issues of bias when controlling for a lagged dependent variable, particularly in fixed effects regressions with a short time dimension, 15 but this does not appear to be an issue here for our coefficient of interest. The university effect is not sensitive to its inclusion. 16 Nevertheless, we emphasise the OLS estimates from column (5) in this paper, but show that the fixed effects specification in column (7) is also robust to a number of checks in the next section for completeness.
14 Our analysis is carried out on a sample that drops 54 observations from China pre 1970, before and during the Cultural Revolution, when universities were shut down. Our effects survive if these observations are included, with the coefficient on university growth becoming 0.0308, still significant at the 1% level. We drop them because of the unique nature of this historical episode and the fact that this small number of observations (less than 1% of the full sample) seem to have a large effect on the coefficient.
15 See Hurwicz (1950) and Nickell (1981), and discussion in the context of growth regressions in Barro (2012).
The other variables in the regressions take the expected signs. The coefficient on the regional convergence term is nearly 2% in columns (4) and (5). 17 Country GDP per capita has a negative coefficient in these specifications. This becomes a positive relationship once regional fixed effects are included. Having a capital city in a region is associated with around one percentage point higher regional GDP per capita growth. The geographic controls generally have the expected signs (see Table A1 in the Appendix). We explore different distributed lag structures, but find that a single five year lag is a reasonable summary of the data (Table A2). 18 This is likely to be due to the fact that in longer time frames, there are more factors at play which are not captured in our estimation framework.

Specification and sample checks
In Table 4, we show that our regressions are robust to a series of checks. This is the case for both the OLS and fixed effects regressions from columns (5) and (7) of Table 3, which are replicated in row (1). In row (2) we conservatively cluster standard errors at the country level, to account for correlation between the errors of regions within the same country over time. While the standard errors rise a little, the association between lagged university growth and GDP per capita growth remains significant at the 1% level. 19 In row (3) we weight the regression by the region's population as a share of total country population, in case low density regions (which might be outliers) are affecting the results. Again, this weighting has little effect on the university coefficient.
16 The coefficients on lagged university growth if we run specifications (5) and (7) excluding lagged regional GDP per capita are 0.042 in both cases, still significant at the one per cent level.
17 In the fixed effects specifications (6) and (7) this is larger, reflecting the downward (Hurwicz) bias in the coefficient of the lagged dependent variable which is particularly an issue in short panels (see Barro (2012) and Gennaioli et al. (2014)).
18 There is some evidence that contemporaneous and 10 year lagged university growth has a positive significant effect in more advanced economies, but these results are not systematic. Interestingly, the contemporaneous (unlagged) effect of university growth is zero or negative (though not particularly significant), suggesting that it takes some time for benefits to be felt, while presumably some costs are incurred at the regional level.
In row (4) we include country-year dummies instead of lagged country-level GDP per capita, to control for time varying factors at the country level (including national income) that may affect both university growth and GDP per capita growth (for example a general increase in funding for higher education, or a change in national government). This does reduce the coefficient, but it is still highly significant. In row (5) we control for the current (as well as lagged) change in population to address the concern that the effect of the university is simply to pull in more people to the region, who spend or produce more and hence raise GDP per capita growth. Our university effect remains strong and therefore it does not appear to be driven by population growth. Row (6) uses growth in university density (universities per million people) instead of university count, with very similar results. We prefer to use the university count in our core analysis, with controls for population growth, as changes in university density can be driven either by the numerator (universities) or the denominator (population) and can be more difficult to interpret.
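The interpretive point about university density follows from a simple log-difference identity. The symbols $Uni$ and $pop$ are shorthand assumed here for the regional university count and population:

```latex
\Delta \ln\!\left(\frac{Uni_{ic,t}}{pop_{ic,t}}\right)
  = \Delta \ln Uni_{ic,t} - \Delta \ln pop_{ic,t}
```

Growth in density therefore rises whenever population falls, even if no university opens, which is why using the count together with a separate control for population growth is easier to interpret.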
We then perform a few checks to see whether regions with no universities, or regions getting their first university, are driving the results. Row (7) of Table 4 drops regions which never have a university in the sample period, and row (8) drops region-years with zero universities, and the coefficients remain unchanged. To make sure our results are not driven by extreme university growth observations we do two things. Row (9) drops region-year observations where a region opens its first university, and in row (10) we winsorize the top and bottom five per cent of university growth; both changes strengthen the results. 20 Row (11) uses similarly winsorized GDP per capita growth as the dependent variable, which reduces our coefficient slightly, but it is still significant at the 1% level. In row (12) we show that the results are not sensitive to dropping observations where we have interpolated GDP per capita. To address measurement problems in terms of missing founding dates, in row (13) we include a dummy for regions where more than five per cent of the universities have missing founding dates, reflecting the fact that university counts in those regions will be worse measured compared to elsewhere; this has little effect (and clearly is meaningless in the second column as the dummy is subsumed into regional fixed effects).
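Winsorizing at the top and bottom five per cent can be illustrated with a short Python sketch. It assumes the cut-offs are the 5th and 95th percentiles of the pooled sample, computed with NumPy's default linear interpolation; the function name and the sample values are hypothetical.

```python
import numpy as np


def winsorize(values, lower_pct=5.0, upper_pct=95.0):
    """Clip values below/above the given percentiles to those percentiles."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, lo, hi)


# Hypothetical university growth rates with one extreme value in each tail
growth = np.array([-0.50, 0.0, 0.0, 0.01, 0.02, 0.02, 0.03, 0.04, 0.05, 1.20])
w = winsorize(growth)
```

Unlike trimming, this keeps every region-year in the sample and only pulls the extreme values in towards the cut-offs, so the number of observations is unchanged.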
Finally, we explore whether the definition of university in WHED (i.e. only institutions that offer four year courses or postgraduate degrees) may be a problem, in the sense that there may be some countries that have a larger share of institutions outside this category which could be important for growth. For this purpose, we compare the most recent university numbers in our database to an external source, Webometrics. 21 Row (14) shows that our results are robust to dropping the 29 countries where there are more than double the number of institutions in Webometrics compared to the WHED listing.
We have shown that both specifications are similarly robust. Given potential issues with fixed effects estimation with a lagged dependent variable previously discussed, and in the interests of parsimony, we focus on the OLS specifications (Table 3, column (5)) in the remainder of the paper. Some further robustness tests are reported in the online Appendix. To investigate the potential concern that our results are driven by expectations of growth in the region we explore "Granger Causality" tests. We use the growth in universities as the dependent variable and regress this on the lagged growth in regional GDP per capita, and the other controls (Table A3). We see that even as all controls are added, the lagged growth of regional GDP per capita has no relationship with current growth in universities and does not appear to "Granger cause" the opening up of universities. We also show that the university effect exists across continents (Table A4). 22 Finally, we find that lagged university levels have positive significant effects on growth in "Barro-style" regressions (see Tables A5 and A6, and discussion in the online Appendix).

Heterogeneity between universities
A concern with our econometric strategy is that our use of university numbers is an imperfect measure of university presence. Universities are not homogeneous, but vary in size and quality. Clearly, both of these dimensions are likely to matter in terms of economic impact (although it is not obvious why this would necessarily generate any upwards bias in our estimates).
Using the crude count of universities may not be useful if the distribution of university size is uneven across regions. It is reasonable to expect that larger universities will have a larger economic impact, and thus size may be a better measure of university presence. For our university counts to be an adequate measure of university presence, we would want them to be positively [...] the world, and tend to be available for recent years (for example the Shanghai Rankings have been compiled since 2003 and cover the world's top 500 universities).
Our data do contain some key attributes of universities which may be indicative of quality, specifically whether a university is a research institution which is more likely to have effects on innovation (we take whether or not a university is PhD granting as an indicator of research activity), whether it is public or private, whether it offers science, technology, engineering or mathematics (STEM) subjects, and whether it offers professional service related courses (business, economics, law, accounting, finance). We add these variables to the analysis by considering the effect of the growth in the share of each type of university over and above the growth in the number of all universities. 25 Table 5 examines whether these various proxies for university quality have differential effects on GDP growth. Here, we drop region-year observations with zero universities (where the share would be meaningless). Panel A shows the result for the full sample of countries. Each column includes one of these measures in turn. The effects are not significantly different from zero, suggesting that on the entire sample there seems to be a general university effect which does not vary much by type of university as defined here. We also perform this quality analysis on the more advanced economies of Western Europe and the US. In Panel B we can see that now increases in the share of PhD granting institutions are significant though the other measures are not, suggesting that the research channel may be more important in countries at the technology "frontier" (Aghion et al., 2005). Growth in the shares of public, STEM subject and professional subject universities all have no additional impact on growth, though this may be due to measurement issues. 26 Equivalent analysis of the sample of all other countries (Panel C) shows no effect for any of the quality measures.

Summary on robustness
We have shown that our results are robust to different specifications and, to the extent that the data allow, consideration of the size and quality dimensions. However, this framework does not allow us to address potential endogeneity due to time-varying unobservables. It could be that our results are driven by factors that vary at the region-year level. For example, it could be that some regions have good local governments in certain time periods who implement a number of policies that are growth enhancing, one of which happens to be opening up universities. Although there is no direct way to address this without an external instrument, the fact that such policy decisions would take a while to feed through to the building of a university, and that the growth effects of universities are themselves felt only several years in the future, makes us doubt that such local political shocks are the reason for the relationships we observe in our data.

IV.3 Geographical Spillover effects of universities
If the effects we are finding are real we would expect to see that universities do not just affect the region in which they are located, but also neighboring regions. Therefore, in addition to including own region university growth we include university growth in other regions in the same country to see whether this affects GDP per capita in our home region. Table 6 presents the results of this spillover analysis. Column (1) replicates our baseline result. Column (2) includes lagged university growth in the nearest region. This shows that universities in the nearest region have a positive but insignificant effect on home region growth. However, on closer inspection it appears that some "nearest regions" are actually very distant, and are dampening this result. In column (3) we drop observations where the nearest region is over 200km away (which constitute a fifth of all regions in the sample). In this column the nearest regions have an effect of nearly the same magnitude as the home region, though with larger standard errors. Therefore, using the full sample again in column (4), we control for the growth in universities in the nearest region interacted with the distance to that region (based on distance between centroids), and a linear term in distance. Consistent with column (3), the interaction is negative and significant and the linear "neighbouring universities" term is positive and significant. In column (5) we add the relevant controls for the neighbouring region: the lagged population and population growth (which should also control for a demand shock in the neighbouring region in the previous period). These have little effect on our coefficients or their significance. The magnitude of the effects is sensible: a university in a neighbour at a distance of near zero has essentially the same effect on growth as a university in the region of interest (0.0430 vs. 0.0356).
Finally, we look at the effects of university growth in all other regions (including nearest region) in the country on our home region. Column (6) adds the lagged growth in universities in all regions of a country, excluding the home region. Column (7) also adds the relevant controls (lagged population and population growth for the other regions). These effects are now larger than our main effect and again highly significant. 27 The implication is that a doubling of universities in the rest of the country (which in most cases will represent a greater absolute increase than a doubling of home region universities) will on average increase home region's GDP per capita by nearly 6 per cent. Overall, this analysis suggests that universities not only affect the region in which they are built, but also their neighbours and that there does appear to be a spatial dimension to this, in the sense that the closer regions are geographically, the stronger the effects. This suggests that from a country perspective, universities generate local and macro growth. Therefore, the full effect of opening universities on a country must be greater than what we find in our core regressions.

IV.4 Magnitudes
Using the coefficients in Table 6 we can estimate a country-wide effect of a university expansion on the typical region in our dataset. The average region has nearly 10 universities (see Table 2), and the average country has 20 regions (and therefore 200 universities). Doubling the universities in one region (10 to 20) implies a 4% uplift to its GDP per capita according to our main result. For each other region, this represents a 5.3% increase in universities in the rest of the country (a rise from 190 universities to 200); multiplied by 6% (the coefficient on other regions in column (7)), this implies an uplift to all other regions' GDP per capita of 0.3%. Assuming the regions in this hypothetical country are identical, the uplift to country-wide GDP per capita is simply the average of these effects: 0.5%. 28 While this seems like a significant amount of benefit, we also need to consider the costs of university expansion. 29 Given that the costs of building and maintaining universities will vary widely by country, we choose to focus on a particular institutional setting for this calculation. In the UK in 2010, there were 171 universities across its 10 regions. As an experiment we add one university to each region, a total increase of 10 universities (6%) at the country level. Using similar steps as in our hypothetical country above (but taking into account the actual numbers of universities in each UK region in 2010), we calculate that the overall increase to GDP per capita [...] While this calculation is highly simplified, it shows that there is a large margin between the potential benefits of university expansion implied by our regression results and likely costs.
We note that the costs of setting up universities, and methods of university finance, vary by country so we cannot generalize this result to other countries, nor make statements about the optimal number of universities in particular regions. Similar calculations for other countries could be made by delving into particular institutional settings.

28 As a sense check for this result, we collapse our regional dataset to the country level and run macro regressions of GDP per capita growth on lagged university growth. The coefficient on university growth is 0.03 (but insignificant). According to these results, a 100% increase in universities at the country level would be associated with a 3% increase in GDP per capita growth. Therefore a 6% increase in universities would imply a 0.2% uplift. This is smaller than the 0.5% we calculate using the results from our better identified regional analysis, but of the same order of magnitude.
29 It is unlikely that these are controlled for in our regressions: a large portion of university financing tends to be at the national level, and costs are incurred on an ongoing basis (e.g. property rental or amortisation and staff salaries are incurred every year) and so would not be fully captured by the inclusion of lagged country GDP per capita as a covariate.
30 For each of the ten regions in the UK in turn, we calculate the percentage increase implied by adding one university to that region's universities, and multiply this by the 4% (effect of a 100% increase). We then calculate the increase in the count of universities in all other regions (for that region), and raise their GDP per capita by that percentage increase multiplied by 6% (the effect of 100% growth of universities in "other regions" on home region GDP per capita from Table 6). We abstract from the 5 year lag in this calculation. We then add up the total GDP across regions, and divide by total population (assumed unchanged).
31 Series ABMI, Gross Domestic Product: chained volume measures: Seasonally adjusted £m, Base period 2012.
32 Data on university finance, by institution, can be found at the UK Higher Education Statistical Authority (HESA) website (https://www.hesa.ac.uk/index.php?option=com_content&view=article&id=1900&Itemid=634). Total expenditure in the year 2009/10 was nearly £26 billion across 164 institutions listed in HESA, implying around £160 million per institution. University expenditure contains staff costs, other operating expenses, depreciation, interest and other finance costs. We checked if this figure has been relatively stable over time, finding that by 2013-14, average expenditure was £180 million. At this higher amount, the implied costs of our expansion rise to 0.11% of GDP. Note that the number of institutions present in 2010 was 173. The majority of institutions in WHED (156) correspond to those listed in HESA, but there are a small number of discrepancies due to differences in the classifications of some institutes or colleges between the two listings. This does not matter for our purposes, as we are simply using the HESA data to calculate the average expenditure of a typical university.
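The arithmetic of the hypothetical country in this section can be sketched in a few lines. This is an illustrative calculation only: it scales the 4% (own region) and 6% (other regions) effects linearly with the percentage increase in universities, assumes identical regions so that the country-wide uplift is a simple average, and requires every region to start with at least one university. The function name `regional_uplifts` is introduced here for illustration.

```python
def regional_uplifts(counts, added, own_effect=0.04, other_effect=0.06):
    """GDP per capita uplift for each region from a university expansion.

    counts: universities in each region before the expansion (all > 0)
    added:  universities added to each region
    own_effect / other_effect: uplift associated with a 100% increase in
    universities in the home region / in the rest of the country.
    """
    total_count = sum(counts)
    total_added = sum(added)
    uplifts = []
    for count, extra in zip(counts, added):
        own_growth = extra / count
        rest_growth = (total_added - extra) / (total_count - count)
        uplifts.append(own_effect * own_growth + other_effect * rest_growth)
    return uplifts


# Hypothetical country: 20 regions of 10 universities, one region doubles
counts = [10] * 20
added = [10] + [0] * 19
uplifts = regional_uplifts(counts, added)
country_uplift = sum(uplifts) / len(uplifts)  # simple average: identical regions
print(f"home: {uplifts[0]:.1%}  others: {uplifts[1]:.2%}  country: {country_uplift:.2%}")
```

Under these assumptions the sketch gives 4% for the home region, about 0.3% for each other region, and 0.5% for the country as a whole, matching the figures in the text. With unequal regions, the simple average would need to be replaced by a GDP- and population-weighted aggregate.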

V. Mechanisms
Having established a robust association between universities and GDP per capita, we now turn to trying to understand the mechanisms through which universities may affect growth.

V.1 Human Capital
We add measures of growth in human capital to our regressions to see how this influences the university coefficient. In Table 7 we consider the relationship between universities and college share. Column (1) replicates the core result, and column (2) runs the same specification on the reduced sample where college share is non-missing, for which the university effect is a bit stronger at 0.07. Column (3) adds the lagged growth in college share, which in itself is highly significant, and reduces the university effect from 0.0710 to 0.0672. Column (4) uses contemporaneous growth in college share and column (5) adds in the lagged college share. In column (6) we also include the level with both lags, with little change in the university coefficient. In column (7) we look at the raw correlation between contemporaneous growth in college share and the lagged growth in universities (with only country fixed effects as controls), and find it to be relatively small but highly significant. Adding all the other controls dampens this relationship further, and this small effect of university growth on college share is what explains the fact that adding in growth in human capital causes only a small reduction in the coefficient on universities. This analysis suggests that a 1% rise in the number of universities gives rise to around a 0.4 percentage point rise in the college share. 33 Appendix A2.2 gives some simple simulations showing that the magnitude of the effect of universities on human capital is unsurprising given the variation we are using in the data.

33 Table A7 in the Appendix uses another measure of human capital: years of education, which is available for a larger sample of countries and years. The qualitative results are similar.

V.2 Innovation
The best measure of innovation output available consistently at the regional level over time is patents, though unfortunately patents with locational information are not available for our entire sample of countries and years. We consider the effects of adding growth in cumulative patent "stocks" 34 to our regressions, first for the US (where we use US PTO registered patents over the period [...]), and then internationally (we use patents filed at the European Patent Office which are available for over 38 of our countries between 1975 and 2005). Table 8, Panel A shows the results for the US. Column (1) runs the core regression for the US only, and over the time period for which we have patents data. 35 Column (2) then includes the change in patents stock, which reduces the coefficient on university growth from 0.113 to 0.109 (a reduction of four per cent). Patents themselves have a positive association with GDP per capita growth, but this is not significant. Column (3) considers the raw correlation between lagged university growth and current patent stock growth (including only year dummies), and shows it is positive and significant at the one per cent level. Column (4) then adds the standard controls, and this correlation becomes substantially smaller and insignificant. This analysis suggests that at least for the US, innovation is part of the story of why universities have an economic impact, though not the entire story. This may be because the effect of newer universities on patents takes a while to accumulate.
On a wider sample of countries, we also consider EPO patents (Panel B). Column (1) restricts the regression to the sample for which we have the EPO patents data, giving us a university effect of 0.0220, significant at the 10 per cent level. Adding in patents in column (2) reduces the effect of universities by around 13 per cent, and we see that growth in patents has a strong association with regional GDP growth in this international sample. A doubling of the patent stock is associated with 5% higher per capita GDP. Columns (4) and (5) show the raw correlation between lagged university growth and patents growth, which is positive but insignificant.

V.3 Institutions and democracy
The use of country fixed effects throughout our analysis should rule out the possibility that the effects of universities simply reflect different (time invariant) institutions, since these things tend to differ mainly at the country level. We also show in the robustness section that the results survive the inclusion of country-year fixed effects, which would capture country specific changes in institutions or changes in government. To the extent that time invariant institutions vary within countries, say at the US state level, our regional fixed effects analysis should address this.
However, institutions do vary over time, and it is possible that universities contribute to this, albeit over longer time horizons than those analysed in our core regressions. We saw in Figure 5 that there is a positive significant correlation between country level democratic institutions (as proxied by Polity scores) and universities. This correlation also exists when we consider the 1960-2000 change in universities and polity scores (see the online Appendix for more discussion). A time series of data on regional institutions to fit into our growth framework is not available, but we do explore the relationships between perceptions of democracy obtained from the World Values Survey and lagged university presence in the cross section.

Our chosen survey measure is a categorical variable which gives the approval of a democratic system for governing one's own country, as this is more widely available across survey waves compared with other questions on democracy. 36 We note however, that the experience in one's own country (for example, if corruption prevents democracy operating effectively) may affect this judgement. Therefore, in the robustness section we test whether results hold for another more general survey question 37 (available for fewer survey waves). World Values Survey data begins in the 1980s and we pool data into a cross section due to insufficient observations in some region-year cells to generate reliable variation over time. Table 9 shows the results of these regressions. 38 We start with a simple correlation between our measure of university density lagged by 15 years from the survey year, controlling for country and year fixed effects and clustering standard errors at the region level (column (1)). 39 This shows a highly significant association between university presence in a region and approval of a democratic system. The relationship is robust to including a host of individual demographic characteristics such as gender, age, marital status, children, employment and relative incomes (column (2)). In column (3) we see that the result is also robust to including controls for the individual's own education: a dummy for whether or not they hold a university degree, and a dummy indicating whether or not they are a student. The result that one's own education is positively related to approval of democracy is consistent with Chong and Gradstein (2009) who use years of schooling. But the result that universities matter over and above their effect on an individual's education suggests that they may be a mechanism whereby democratic ideals spill over from those who have direct contact with universities, or that there is some form of direct diffusion from universities into their surrounding region. Further supporting this, we find that the result survives dropping students and graduates from the regression entirely (see Table A9 for more robustness tests and further discussion in the online Appendix). Column (4) adds our standard geographic controls and shows that these do not affect our result. This analysis shows that there is a robust relationship between other lag lengths of university presence in a region and approval of a democratic system, and that this operates over and above the human capital effect. While it is not possible to account for any potential impact of this type of mechanism on growth in our current framework, this analysis suggests that institutions could be part of the story, albeit on a longer term basis.

V.4 Demand effects
Could our results simply be driven by a mechanical impact of universities on regional GDP? Students and staff in a university consume more goods and services. Including changes in population in our regressions (lagged, and contemporaneous in the robustness tests) should have largely controlled for the possibility that universities simply contribute to growth through a mechanical demand channel associated with people coming into the region and consuming more.
Moreover, showing that our university coefficient remains robust to including changes in current and lagged human capital (see Table 7) should also address the concern that the effects are simply driven by higher earners entering the region (changes in college share will reflect inward migration of graduates, in addition to the impact of universities churning out graduates).
To the extent that university finance comes from inside the same region, there should be no mechanical demand effect as this should already be netted off. For example, in the US, states have historically provided more assistance to tertiary institutions and students: 65 per cent more on average over the period 1987 to 2012, though now the share is more equal. 40 But if university finance comes from outside the region (e.g. from the Federal government), this could also result in higher GDP per capita as the university purchases goods and services within the region (including paying salaries to staff and support services).
We think it unlikely that the regressions are merely capturing this type of effect. The initial shock to region GDP associated with the new university is likely to occur in the year it is founded (when transfers begin, and include capital and set up costs), and the level effect should be captured by lagged regional GDP which we control for in the regressions. Ongoing transfers may rise incrementally over the years as the university increases its size and scope, but we might expect the largest effect on growth would be in the initial years rather than in the subsequent five year period.
Notwithstanding this argument, we carry out a simple calculation to show that even under very generous assumptions, direct effects are unlikely to explain a large portion of our results. We use the hypothetical experiment of a new university of 8,500 students and 850 staff opening in the average region of our dataset (see Appendix A2.2). We estimate the effects of the transfer into the region assuming that all the costs of our new university are met from sources outside the region, and that these are spent within the region. We assume that the average cost per student is $10,000, and therefore the cost for a university of 8,500 students is $85 million. With a university of constant size, building up year-group enrolments over four years, there would be no effect in the following five year period. If we assume total enrolments grow by 5%, we can explain around 15% of the regression coefficient on universities.

V.5 Summary on mechanisms
In summary, it appears some of the effects of university growth on GDP growth work via human capital and innovation channels, though the effects of these are small in magnitude. In addition, universities may affect views on democracy but this appears to be on a longer term basis.
We have shown convincingly that the university effect is not merely driven by mechanical demand effects.

VI. CONCLUSIONS
This paper presents a new dataset on universities in over 1,500 regions in 78 countries since 1950. We have found robust evidence that increases in university presence are positively associated with faster subsequent economic growth. Doubling the number of universities is associated with over 4% higher GDP per capita in a region. This is even after controlling for regional fixed effects, regional trends and a host of other confounding influences. The benefit of universities is not confined to the region where they are built but "spills over" to neighboring regions, having the strongest effects on those that are geographically closest. Using these results, we estimate that the economic benefits of university expansion are likely to exceed their costs.
Our estimates use within country time series variation and imply smaller effects of universities on GDP than would be suggested from cross sectional relationships. But we believe our effects are likely to be lower bounds as the long-run effect of universities through building the stock of human and intellectual capital may be hard to fully tease out using the panel data available to us. Nevertheless, the evidence seems compelling here that there is some effect of universities on growth.
Understanding the mechanisms through which the university effect works is an important area to investigate further. We find a role for innovation and human capital supply (but not demand or transfers into a region), although these are small in magnitude. This might be due to statistical issues, but better data on the flow of business-university linkages, movements of personnel and other collaborations would help in unravelling the underlying mechanisms. In addition, focusing on the relationships between universities and local economic performance in individual countries where better causal designs and richer university data are available would be a valuable extension (e.g. Jäger, 2013).
We provide suggestive evidence that universities play a role in promoting democracy, and that this operates over and above their effect as human capital producers. Understanding the extent to which this may account for part of the growth effect is another area for future research.

[...] Column (1) shows the relationship between universities and GDP per capita, controlling for population. Column (2) includes country dummies. Column (3) includes regional controls (a dummy indicating whether the region contains a capital city, together with latitude, inverse distance to ocean, malaria ecology, log(oil and gas production) 1950-2010; these are not reported here). Column (4) includes years of education. Column (5) is identical to column (4) but restricts the sample to the regions for which OECD REGPAT patents are available. Column (6) includes years of education, and column (7) includes the natural log of the regional patent "stock".

[...] Column (1) is a simple correlation between regional GDP per capita growth and the lagged growth in university numbers. Column (2) controls for the lagged log of population. Column (3) includes country and year dummies. Column (4) controls for lagged regional GDP per capita, the lagged growth in population, the lagged log population density level, and lagged growth in average years of education, a dummy for whether the region contains a capital city, together with latitude, inverse distance to ocean, malaria ecology, log(oil and gas production) 1950-2010 (not reported here). Column (5) adds lagged country GDP per capita. Column (6) includes regional fixed effects, and the time varying controls of column (4). Column (7) adds lagged country GDP per capita. Standard errors are clustered at the regional level. Levels of GDP per capita and population are in natural logs.

[...] Table 3), but drops regions with zero universities. Columns (2) to (5) add in the lagged growth of the shares of universities of different types as labelled.
[...] Column (1) replicates our core regression (column (5) from Table 3). Column (2) adds in the lagged growth in universities in the nearest region. Column (3) replicates column (2) but conditions the sample to regions whose nearest region is less than 200km away. Column (4) returns to the full sample, but adds an interaction term of universities with distance to that region (in km), and distance to that region as a separate variable. Column (5) adds controls from the nearby region: namely the lagged population and population growth (not reported here). There were a small number of observations where the population in the nearest region was missing, relating to early years in the sample period. In this case, population was extrapolated back in time, using a log-linear trend, and a dummy variable included to indicate this. Column (6) includes the lagged growth in universities in all other regions of the country, and column (7) also adds the relevant controls from all other regions in the country: namely the lagged population and population growth (again with a dummy to indicate where the population in the rest of the country has been calculated with missing values for any regions that year).

Notes: Growth in college share is simply the percentage point difference: (college share (t) - college share (t-5))/5. Column (1) replicates Column (5) from Table 3. Column (2) restricts to the sample for which the change in college share is available. Column (3) adds the lagged growth in college share. Column (4) adds the contemporaneous change in college share. Column (5) includes both lagged and contemporaneous changes. Column (6) further adds the lagged level of college share (unlogged). Column (7) regresses the change in college share on the lagged growth in universities, with country dummies, but no other controls. Column (8) adds all the other controls.

[...] Panel B is a larger sample: the countries for which regionalized EPO patents are available in OECD REGPAT (1975-2005). Column (1) replicates our core regression (column (5) from Table 3), but restricts to the relevant sample for patents data. Column (2) adds in the contemporaneous growth in cumulative patent "stock" to the regression. Column (3) regresses the growth in patent stock on the growth in universities as a raw correlation, with no other controls. Column (4) then adds the standard time varying controls (reported) and geographic controls (not reported).

WHED contains information on higher education institutions that offer at least a post-graduate degree or a four year professional diploma. It therefore excludes, for example, further education institutions in the UK or community colleges in the US and may be thought of as a sample of "higher quality" universities.
We compare the country totals in WHED as at 2010 to data from "Webometrics" (http://www.webometrics.info/en/node/54), a source where higher education institutions (including ones that would not qualify for inclusion in WHED) are ranked by their "web presence". This source puts the total number of universities worldwide at 23,887 in 2015 (part of this difference will be due to growth over the 2010-2015 period). In the results section, we discuss a robustness check where we drop countries from our regressions with a very large divergence between the two sources.

A1.2 VALIDATING OUR APPROACH
Our approach for calculating university presence by region uses the founding dates of universities to determine the number of universities that were present at any particular date. We consider that a "university" is founded on this initial founding date, even if it was a smaller higher education institute or college at that date. This is often the case, but our approach is reasonable since only the better quality institutions are likely to subsequently become universities. Furthermore, there are many cases where a number of universities or higher education institutes were merged together into what is today recorded as one university in WHED. Our approach avoids the apparent reduction that would occur in such cases if we merely counted the number of institutions present at any given date.
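The counting rule described above can be illustrated with a short sketch. It is not the code used to build the dataset: the record layout (region, founding year), the function name and the example records are hypothetical, and the sketch simply skips institutions with a missing founding date, which is an assumption made here for illustration.

```python
from collections import Counter


def universities_present(founding_records, year):
    """Count, by region, the universities founded on or before `year`.

    founding_records: iterable of (region, founding_year) pairs, one per
    institution surviving in the current listing; founding_year may be None.
    """
    counts = Counter()
    for region, founded in founding_records:
        # Institutions with a missing founding date are skipped (assumption)
        if founded is not None and founded <= year:
            counts[region] += 1
    return counts


# Hypothetical records: (region, founding year)
records = [("A", 1088), ("A", 1965), ("B", 1950), ("B", None), ("C", 1995)]
present_1960 = universities_present(records, 1960)
```

Because only surviving institutions are listed, the count for any region can never fall over time under this rule, which is why the validation exercise that follows focuses on university closures.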
One key concern with this strategy is that it would not be suitable in a world where university exits are commonplace. Say a number of universities were present in the past and closed down before WHED 2010. A region could have actually seen a fall in universities, but our method would not capture this since it includes only surviving universities. Anecdotally we know that the period since the 1960s has been one of university growth across the globe, but we investigate this issue further in order to gain more comfort on the validity of our approach. We do this by obtaining historical records of the universities and higher education institutions present in the 1960s, and assess whether significant numbers of these are missing from WHED 2010.
The appropriate sources are the predecessors to WHED: "The International Handbook of Universities" (1959, published by the IAU annually); "American Universities and Colleges" (1960, published by the American Council on Education); and "The Yearbook of Universities of the Commonwealth" (1959, published by the Association of Universities of the British Commonwealth). As the name suggests, the "American Universities and Colleges" yearbook contains fully fledged universities, but also smaller colleges (including religious institutions), many of which would not be included in WHED today. The international handbook lists universities and other institutions not considered of "full university rank" separately. We include all of these institutions because the distinction is not consistent between countries. For example, in France these latter institutions contain all the grandes écoles, which are considered to be of very high quality but are outside the framework of the French university system; and in China only one institution is listed as a full university, while other institutions include a number of institutions with the name "university". The Commonwealth yearbook contains only fully fledged universities.
The main exercise we carry out is to name match between the 1960 yearbooks and WHED 2010. There are 2,694 institutions listed across 110 countries in the three yearbooks in 1960. This compares with 5,372 institutions (in 132 countries) which according to WHED 2010 were founded pre 1960. The WHED figure is higher because WHED counts universities from the date they are founded, even if they are not founded as a fully-fledged university (as discussed above). The country level correlation of the number of universities present in 1960 in the two sources is 0.95. The matching process involves a number of iterations: exact matching, "fuzzy" matching, and manual matching. The process is complex because name changes and mergers are commonplace, so internet searches on Wikipedia or university websites were necessary. Where an institution was found to have been merged into a university that is present in WHED 2010 we considered it a match. The results of this process are summarized in Table A10. We find that university closure is extremely rare, and we only find evidence of this in the US, where 33 small (mostly religious) colleges are present in the 1960 yearbook and were found to have closed down, mainly due to bankruptcy. 155 institutions worldwide were found to still be in existence but not listed in WHED. This was usually because they do not meet the WHED listing criteria (a university that offers at least a four year degree or postgraduate courses). Indeed, of the 155 institutions in this category, 115 were not considered fully fledged universities in 1960, and 33 of the remaining 40 were US colleges (mostly religious).
Based on these facts, we believe that it is reasonable to use the WHED founding dates as an (albeit imperfect) basis for a time series of university presence by region.
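The staged matching described in this section (exact, then "fuzzy", then manual) could be sketched as follows. This is an illustration only: the institution names are invented, the helpers `normalise` and `match_names` and the 0.85 similarity cutoff are assumptions, and the standard library's `difflib` stands in for whatever matching tool was actually used.

```python
import difflib

def normalise(name):
    """Lower-case and drop punctuation so trivial differences do not block a match."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())

def match_names(yearbook_names, whed_names, cutoff=0.85):
    """Map each yearbook name to (WHED name or None, matching stage)."""
    lookup = {normalise(n): n for n in whed_names}
    results = {}
    for name in yearbook_names:
        key = normalise(name)
        if key in lookup:  # stage 1: exact match after normalisation
            results[name] = (lookup[key], "exact")
            continue
        # Stage 2: closest candidate above the (assumed) similarity cutoff.
        close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
        if close:
            results[name] = (lookup[close[0]], "fuzzy")
        else:  # stage 3: left for manual checking (name changes, mergers)
            results[name] = (None, "manual")
    return results

# Invented names, for illustration only.
whed = ["Northfield University", "University of Eastmarch"]
yearbook = ["Northfield University.", "Univeristy of Eastmarch", "Riverside Bible College"]
matches = match_names(yearbook, whed)
for name, outcome in matches.items():
    print(name, "->", outcome)
```

Names that fall through to the manual stage are the ones that would need the kind of internet searches on mergers and name changes mentioned above.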

A1.3 DESCRIBING UNIVERSITY GROWTH IN SELECTED COUNTRIES
This section gives a historical overview of the diffusion of universities from the 1880s in four advanced economies: France, Germany, the UK and US, and two emerging economies: India and China. We compare the timing of historical university expansions to growth and industrialisation (see Figure A5 for a measure of industrialisation over time in the UK, US, France and Germany sourced from Bairoch (1982)). This analysis provides a visual "sense check" for the thesis developed by Mokyr (2002) that the building and dissemination of knowledge played a major role in the Industrial Revolution.
In the United Kingdom, universities have been established in waves: the "Ancient universities", starting with Oxford in the 1100s, were the first seven universities, all founded before 1800. Then a number of universities were chartered in the 19th Century, followed by the "Red Brick" universities before World War I. A large expansion occurred after World War II, following the influential Robbins Report into Higher Education (1963). Former polytechnics were converted to universities in 1992, but in our data these higher education institutions are counted from when they opened in their original form. These waves can be seen in the university density line as shown in Figure A6, Panel A, which also plots national GDP per capita data (from Maddison), suggesting that the first expansions coincided with industrialisation in the 1800s (Figure A5 shows that industrialisation picked up from the 1830s in the UK). The raw university count trend is shown in Panel B.
In the US, the first university was Harvard, founded in 1636. By the American Revolution there were nine colleges modelled on Oxford and Cambridge in England. However, these were very small, exclusive and focused on religion and liberal arts. At that time, there were no law or medical schools, so one had to study these subjects in London. It was Thomas Jefferson who had a vision for state education, separate from religion, but this only took hold after the Civil War with the land grant colleges. This sharp rise in university density can be seen in Figure A7. Industrialization in the US began to pick up in the 1860s (see Figure A5). University density reached much higher levels than in Britain: 13 universities per million people in 1900 versus just over 2 in the UK. The difference is that in the US, density came down again as population growth outpaced the opening of new universities, whose number continued to grow as shown in Panel B; though the downward trend did slow during the post war period (we can see the slight kick in university numbers from the 1950s in Panel B). However, the fall in university density must be considered in the context that over the same period, university size has also been increasing in the US (this can be seen in Figure A8 and in our analysis in Section IV on enrolments). Furthermore, there has been a sharp rise in "Community Colleges" in the US, which provide college access qualifications, and are not counted in our dataset.
In France, Figure A9 shows that university density really started picking up in the 1800s with the opening of the "Grandes Écoles", which were established to support industry, commerce and science and technology in the late 19th Century. Indeed industrialization in France was more gradual, and started picking up in the late 1880s, early 1900s. The next dramatic increase in university numbers and density occurred in the 1960s during de Gaulle's reforms of the French economy.
Cantoni and Yuchtman (2014) discuss the opening of the first universities in Germany following the Papal schism in the late 14th Century. However, during the 1800s, Figure A10 shows that university density actually fell, as population growth outpaced the gradual increases in university numbers which can be seen in Panel B. Historically, Germany had a low share of college graduates as higher shares of the population were educated via the apprenticeship system. A deliberate push to expand university education began in the 1960s, with new public universities founded across the country (Jäger, 2013). This was motivated by economic reasons; in particular the need to compete in technology and science against the backdrop of the Cold War; but also social reasons, namely the notion that education is a civil right to be extended beyond the elites, and is crucial for democracy.
China and India saw much later expansions as shown in Figure A11 and Figure A12. China started opening up to Western advances in science in the 1800s, and followed Soviet influence in the 1950s with centrally planned education. We can see a sharp rise in university density from the 1900s to 1960. The spike in the 1960s is due to the Cultural Revolution, when higher education institutions were shut down for 6 years, and all research terminated. When the universities were reopened, they taught in line with Maoist thought. It was from the 1980s that institutions began to gain more autonomy and when China began its rapid growth trajectory, though so far growth in universities has not outpaced population growth. In India, expansion occurred after independence in 1947. During the colonial era, the upper classes would be sent to England for education. The British Raj oversaw the opening of universities and colleges from the late 1800s, but university density only started rising more rapidly after 1947 and recently has picked up pace again. We note that in both countries, there are around 0.4 universities per million people, which is still a lot lower than in the UK or US.
Finally, we note that in general, expansions in university numbers have been accompanied by increases in university size. As we saw in Figure A8 (using UNESCO data that are only available from 1970), university students normalized by population have been growing overall in the US and the UK since the 1970s (with a dip in the late 1990s in the US) and more recently in China and India.

A2.1 REGRESSIONS OF GDP PER CAPITA GROWTH ON LAGGED LEVELS OF UNIVERSITY PRESENCE
In Table A5 we replicate as closely as possible [41] the results in Gennaioli et al. (2014) but add in the lagged university level into our analysis. Column (1) follows their Table 5, column (8), omitting years of education, and column (2) includes years of education. The coefficients are very similar: the convergence term is between 1.4% and 1.8%, and the coefficient on years of education is nearly identical at around 0.004. Adding in universities and population does not affect the other coefficients much. Column (3) suggests that doubling the level of universities leads to a 0.24% rise in the GDP per capita growth rate. Universities have a positive and significant effect over and above years of education. As we would expect if some of the effect of universities is via their production of human capital, the effect of universities is higher when years of education are omitted (column (3)). Table A6 presents a similar analysis, but in long difference format; so for each region there is one observation with the average annual growth rate over the 50 years to 2010, 40 years and 30 years respectively, regressed on starting period universities and other controls. Overall, this shows that even in this simplified specification on the reduced sample where the relevant data are available for the time periods, there is a positive and significant relationship between initial period universities and subsequent growth once country fixed effects are included; and the magnitude is comparable with the conventional Barro-style results.

A2.2 SIMULATION OF THE EFFECTS OF A NEW UNIVERSITY ON THE AVERAGE REGION'S HUMAN CAPITAL AND GDP
To assess the plausibility of the magnitudes identified in the main text we consider some quantitative calculations of university expansion.
To look at a representative case we take the average region in the dataset as summarised in Table 1: a population of just under 3 million, GDP per capita of $13,056 (and hence GDP of $39 billion), a college share of 7%, average years of education of 7.37, and just under 10 universities.
We assume that a new university with a capacity of 8,500 students is opened in the region. We believe that a university of 8,500 students is a generous size for a new university, based on average enrolments in our sample countries over the years where country level enrolments data are available. [42] The annual intake of students is 2,125, so the university is at capacity in four years. We assume it takes four years to graduate with a bachelors degree and a staff-student ratio of 10, [43] so that there are 850 staff present at the university from the outset. We assume that students enter the region to begin studying, and stay in the region post graduation, adding to its human capital stock. We keep the university size constant for the five year period following its opening. We assume that all staff enter the region in the first year of the university and remain there. We assume that the typical graduate has 18 years of education.
[41] Our sample is larger because for the purposes of our analysis we interpolate GDP per capita, and not just years of education and population as in their paper.
[42] We obtained total tertiary education enrolments from UNESCO, which are available since the 1970s, and divided by the number of universities in our data, to get the average number of students per university by country in each year where the data are available. The average over the period is just under 8,500. Obviously, this will represent existing as well as new universities. Moreover, this is likely to be an overstatement since, as we previously discussed, not all tertiary institutions are included in WHED. The average growth rate in students per university implied by this country level data over the period is 2.5% per annum.
[43] This is a generous assumption. In the UK, for example, staff-student ratios range between 9 and 25 (see http://www.thecompleteuniversityguide.co.uk/league-tables/rankings?o=Student-Staff%20Ratio).
This experiment involves adding one university to an existing stock of 10 universities, which represents a 10% increase over that period, or an average of 2% per year. To compare to our regression results, which represent the impacts of a 1% increase in universities, we need to double the regression coefficients. Our core coefficient on universities in column (5) of Table 3 is 0.0445. This implies that a 1% increase in the number of universities is associated with a 0.045% increase in GDP per capita in the subsequent 5 years. Therefore the implied increase in GDP per capita following a 2% change would be 0.089%.
The impact of a 1% increase in universities in the previous period on college share from Table 7, column (8) is an increase in college share of 0.0037, which represents 0.37 percentage points since college share is measured as a fraction. Therefore we double this to 0.007 to compare with the experiment. Similarly, the impact on years of education is a 0.02% increase, so we double this to 0.04%.
Using this simple example we generate impacts on college share and years of education growth in the next five year period and compare these to the predictions from our regressions.
Our calculation has the university producing 2,125 new graduates per year, which results in an average annual rise in college share of 0.0006 (or 0.060 percentage points). This is actually smaller than the 0.007 implied from our regressions, and could be due to more inward migration of skilled people following the opening of universities, which we do not capture controlling only for population changes. On the other hand, the implied average annual rise in years of education is 0.09%, which is more similar to the 0.04% implied by the regressions (which, as we noted, are based on a different sample from the college share regressions).
While there are differences here, our simulation shows that the effects on human capital, even with generous assumptions about the size of a new university, will be relatively small. This is in line with what we find in the regression analysis.
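The arithmetic behind this thought experiment can be laid out in a short sketch. The inputs are the figures stated in this section, except that the population of "just under 3 million" is rounded to 3,000,000 as an assumption. The final ratio is only a crude order-of-magnitude check: the text does not spell out the exact college share calculation, so it is not expected to reproduce the reported 0.0006.

```python
# Inputs from the text; population is an assumed rounding of "just under 3 million".
population = 3_000_000
existing_universities = 10
capacity = 8_500            # students at the new university
years_to_graduate = 4
students_per_staff = 10
period_years = 5
core_coefficient = 0.0445   # Table 3, column (5): effect of a 1% rise in universities

annual_intake = capacity / years_to_graduate      # students entering each year
staff = capacity / students_per_staff             # staff present from the outset

pct_increase = 100 * 1 / existing_universities    # one new university on a stock of 10
pct_per_year = pct_increase / period_years        # average annual increase
implied_gdp_effect = core_coefficient * pct_per_year  # in per cent of GDP per capita

# Crude check (an assumption, not the authors' method): one year's cohort
# of graduates relative to the regional population.
rough_share_rise = annual_intake / population

print(f"annual intake: {annual_intake:.0f}, staff: {staff:.0f}")
print(f"increase in universities: {pct_increase:.0f}% in total, {pct_per_year:.0f}% per year")
print(f"implied GDP per capita effect: {implied_gdp_effect:.3f}%")
print(f"rough annual rise in college share: {rough_share_rise:.4f}")
```

Under these assumptions the crude ratio comes out at about 0.0007, the same order of magnitude as the simulated figure and well below the 0.007 implied by the regressions.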

Demand effects of universities
Using this same example of the representative region, we can simulate the demand effects of university expansion. If the university is funded from outside the area then GDP may increase mechanically as demand from a university (e.g. rent, supplies, building and maintenance) and its staff and students boost the local economy.
We assume that the cost per student in our new university is $10,000 per year, which is likely to be an overestimate for the average university in our sample. [44] For a university of 8,500 students this implies total costs of $85 million. Since this represents annual costs, we assume that the transfer continues in each subsequent year. Therefore the uplift to GDP will be felt only in the initial years. Assuming that total enrolments stay fixed at 8,500 over the five years following university entry (which is the key period we use for our regressions), there would be no uplift to GDP per capita in that period. Alternatively, if we assume enrolments are growing by 5% per year, [45] this would only account for around 15% of the coefficient on universities implied by our baseline specification (0.089).
Figure A13 shows that there is a positive and significant correlation between the change in university density and change in polity scores over 1960-2000. Table A8 reports a number of robustness tests around the regressions of approval of democracy on lagged university presence reported in Table 9. Column (1) repeats Table 9 column (4). Column (2) shows that this effect appears to be driven by OECD countries, as an interaction term between an OECD dummy and the lagged university presence is positive and significant. Column (3) shows that the main result is much smaller in magnitude and insignificant for 5 year lagged university presence, and column (4) shows that it is actually negative for a 30 year lag. We note however, that the results are robust across lags on the OECD subsample (available on request). Column (5) shows that our main result can be closely replicated using a different survey measure for approval of democracy, "democracy is best", which asks respondents whether they agree with the statement that democracy is better than any other form of government. Column (6) does not include country fixed effects.
This shows that the positive relationship we find between universities and approval of democracy is valid within countries. Across countries, factors not controlled for in these regressions (such as levels of corruption) appear to influence the result. We investigated which countries appear to be driving this negative relationship and found, for example, that the Philippines (a country with high levels of corruption) has high university density but low approval of democracy. Column (7) clusters at the country level and significance holds. Column (8) weights by population, to account for the fact that some regions with low population may have less representative responses. Column (9) drops students and graduates and the main result gets stronger. Finally, column (10) shows that the results are robust to estimation using an ordered-probit model.
(5) from Table 3. The subsequent columns add contemporaneous and further lagged growth in universities, and corresponding population growth. The level of population at the furthest lag is also controlled for.
Notes: All columns include country and year fixed effects, and standard errors clustered by region. Column (1) is a simple correlation between regional growth in universities and the lagged growth in regional GDP per capita. Columns (2) to (5) include the variables shown. In addition, column (6) includes geographic controls which are not reported: latitude, inverse distance to ocean, malaria ecology, ln(oil and gas production) 1950-2010. Levels of GDP per capita and population are in natural logs.
(3) replicates column (1), but adds the five year lagged level of universities in a region, and lagged population. Column (4) then adds years of education to the specification in column (3). Standard errors are clustered at the regional level. Levels of GDP per capita, population and population density are in natural logs. Years of schooling are not logged.
(1) is a simple correlation of the average annual growth in regional GDP per capita over 1960-2000 on the natural log of 1+ the number of universities in 1960. Column (2) adds country fixed effects, the 1960 natural log of the level of regional GDP per capita, the 1960 natural log of the level of population, and the 1960-2000 change in population. Columns (3) and (4)
Notes: Growth in years of education is the log difference. Column (1) replicates column (5) from Table 3. Column (2) restricts to the sample for which the change in years of education is available. Column (3) drops the lagged growth in years of education. Column (4) adds the contemporaneous change in years of education. Column (5) includes both lagged and contemporaneous changes. Column (6) further adds the lagged level of years of education (unlogged). Column (7) regresses the change in years of education on the lagged growth in universities, with country dummies, but no other controls. Column (8) adds all the other controls.
Notes: Column (1) replicates column (4) from Table 9. Column (2) includes an OECD dummy (not reported) and interaction between this and lagged university density. Column (3) is identical to column (1), but uses the five year lagged university density. Column (4) uses the thirty year lagged university density. Column (5) has a different dependent variable: the view that democracy is "best". Column (6) omits country and year dummies. Column (7) clusters standard errors at the country level. Column (8) uses weighted OLS, weighting each region by its population as a share of the country's total population. Column (9) drops graduates and students from the sample. Column (10) is estimated using an Ordered Probit model.