New Road Infrastructure: The Effects on Firms

This paper estimates the impact of new road infrastructure on employment and productivity using plant level longitudinal data for Britain. Exposure to transport improvements is measured through changes in accessibility, which is calculated at a detailed geographical scale from changes in minimum journey times along the road network. These changes are induced by the construction of new road link schemes. We deal with the potential endogeneity of scheme location by identifying the effects of changes in accessibility from variation across wards close to the scheme. We find substantial positive effects on employment and numbers of plants for small-scale geographical areas (electoral wards). In contrast, for firms already in the area we find negative effects on employment coupled with increases in output per worker and wages. A plausible interpretation is that new transport infrastructure attracts transport intensive firms to an area, but with some cost to employment in existing businesses.

Road networks dominate transport infrastructure in most countries. In the UK in 2008, 91% of passenger transport and around 67% of goods transport was by road. For transport within the European Union in 2009, the corresponding figures were 92% and 47%, and for the US in 2007, 88% and 40-48%. Clearly road transport delivers economic benefits, and transport improvements are frequently proposed as a strategy for economic growth, integration and local economic development (e.g. European Commission, 2006; World Bank, 2008). Transport improvements decrease transportation costs and improve access to markets and labour, which may foster economic integration, stimulate competition, and generate agglomeration economies and a number of other 'wider' economic benefits.
A number of recent studies have looked at the impact of transport infrastructure networks on the spatial economy, usually in an economic-historical context (Donaldson and Hornbeck, 2015; Duranton and Turner, 2012; Michaels, 2008; Baum-Snow, 2007) or developing country context (Ghani et al., 2015; Faber, 2014; Baum-Snow et al., 2012). But for economies with well-developed transport networks, little is known about the effects at the micro level that result from additions to the network. Direct evidence of the causal effects of such improvements using ex-post evaluation of road network improvements is rare. Holl (2012) provides a recent example, for a relatively small sample of manufacturing firms in Spain.
This paper provides such evidence by investigating the causal impact of recent road improvements on employment and productivity related firm outcomes, using administrative data on all businesses in Great Britain from 1997-2008. We measure exposure to road improvements using changes in a continuous index of accessibility, calculated at a small geographical scale, the electoral ward (there are 10,300 wards in Britain with an average land area of 24 km² and population of 6,000).
These accessibility changes are based on optimal travel times, calculated from analysis of potential routes along the major road network. For a given location, this index measures the accessibility of potential destinations along the major road network. To construct this index, we use a bespoke dataset of road construction schemes carried out in Great Britain between 1998 and 2007, combined with road network data.
The principal challenge to estimation of the causal impact of road network changes on firm outcomes is that roads may be purposefully built to meet demand in places where productivity is growing, or to try to stimulate growth in places where productivity is falling. To address this problem of endogenous scheme location, we exploit the geographical detail in our data and identify the effects using only over-time variation in accessibility for wards that are very close to new schemes (within 10-30 km). This variation is incidental to the policy aims of the transport schemes, which are additions to the major road system aimed at improving network performance rather than local economic development. Our study is unique in using this variation in accessibility changes close to specific road transport schemes to identify the effects of transport improvements. Restricting our attention to areas close to schemes results in little loss of generality relative to comparison across the whole country, because most of the variation in accessibility generated by relatively small scale road transport improvements occurs in areas close to new schemes. Our aim then is to infer the more general effects of changes in accessibility from these transport-induced changes occurring at a small geographical level.
Our continuous, network-based accessibility index is a crucial component in this research design, because it varies in complex ways that do not depend solely on a firm's distance to the transport improvement scheme. Instead, it varies across space, and changes over time, in ways that depend on a firm's position in relation to both the new and existing parts of the transport network, and on whether new road links affect the optimal travel paths between the firm and destinations on the network. For example, two firms A and B sited at equal distance to a new road scheme will experience very different accessibility changes if the new road becomes a link in many of the least-cost network paths from A to other destinations, whereas the new road is rarely if ever a link in the least-cost network path from B to other destinations. Therefore changes in the accessibility index are plausibly unrelated to location-specific characteristics that jointly influence new road placement and firm productivity and employment. To support this argument, we present tests that show there is no systematic relationship between accessibility changes and pre-trends in area characteristics. Other studies have used changes in similar, network-based accessibility indices (an earlier version of our work, Gibbons et al., 2010; Donaldson and Hornbeck, 2015; Rothenberg, 2013; Holl, 2012) although, to our knowledge, no paper has exploited localised variation in accessibility across areas that are close to road projects,[4] or worked with such spatially refined data.
Our key finding is that road improvements increase ward-level employment and the number of businesses. A 1% increase in accessibility leads to a 0.3-0.4% increase in plants and employment.
However, we find small negative employment effects at plant level, implying that the local employment changes come about through firm entry and exit. Conversely, we find evidence of positive impacts on labour productivity (specifically on gross output and wage bill per worker).
[4] Indeed, Donaldson and Hornbeck (2013) control for local changes in accessibility.
The rest of the paper is structured as follows. Section 2 reviews the related theoretical and empirical literature. Section 3 presents the empirical methodology and explains the construction of the accessibility, productivity and employment measures. Section 4 describes the data used, Section 5 discusses the empirical results and Section 6 concludes.

Theoretical background and existing evidence
Theoretically, reduced transport costs and improved connectivity offer various direct benefits to firms.[5] Changes to logistics, business travel and internal organisation may improve productivity. If transportation services are a factor of production, reductions in transport costs will affect input choices. Input mix may also change if relative prices of other inputs are affected by falling transport costs. For example, wages could rise if productivity effects are capitalised into wages, or could fall if they are set along the supply curve as a function of commuting costs (Gibbons and Machin, 2006). Land prices and commercial rents could also change in response to changed location-specific benefits. There may be additional scale effects if cost reductions feed through into lower output prices and higher demand (for example by increasing market area, as suggested by Lahr et al., 2005). These effects combine to determine changes in employment and observed labour productivity.
[5] Our theoretical discussion draws mainly on Gibbons and Overman (2009), who provide further analysis.
In addition to these direct effects the literature considers a number of 'wider economic benefits' that involve total factor productivity effects arising from agglomeration economies (Graham, 2007). These agglomeration externalities have origins in sharing of resources, matching of workers to firms, and learning by information exchange (Duranton and Puga, 2004). Although usually associated with spatial concentration (e.g. cities or industrial clusters), these effects can just as well be attributed to lower travel times between locations (sometimes referred to as 'effective density').
Agglomeration benefits are traditionally assumed to act like a production function shifter increasing the amount produced with given inputs (Gibbons and Overman, 2009).
Transport improvements can also influence the spatial distribution of firms through market access, selection and sorting effects (Baldwin and Okubo, 2006). Better transport may encourage start-ups and survival by lowering costs, increasing returns to scale or agglomeration economies.
Conversely, improvements can force the exit of low-productivity firms previously protected from competition (Melitz, 2003).
Due to the multiplicity of potential effects, theory provides little definitive guidance on whether to measure the effects on firms through prices, output, or inputs, or what response to expect on any of these dimensions. The theoretical predictions on the net effects of transportation improvements on area level outcomes are similarly varied. Traditional ex-ante appraisal of improvements has set many of these issues aside, by assuming a world of perfect competition in which all the economic benefits of transport improvements are captured by travel time savings and induced demand (Small, 2007). However, more recent studies (Eddington, 2006; Gibbons and Machin, 2006; Venables, 2007; Gibbons and Overman, 2009) argue that this may not be a complete picture. In short, given the unclear theoretical predictions, the size and direction of the effects of transport policy on economic outcomes is mainly an empirical question.[6]
[6] A number of papers adopt a more structural approach to restrict the possible outcomes. See, for example, Rothenberg (2013), Donaldson and Hornbeck (2015) and Donaldson (2014). In contrast to these papers we use a reduced form approach, paying particular attention to the issues of identification. Turner and Redding (2014) discuss both approaches in their recent survey.
A number of studies have tried to estimate the effects of transport improvements on the economy, but relatively few have looked at the impacts on firms at a micro or spatially disaggregated scale. Most of the empirical evidence considers the macro or regional level (for a review see Straub, 2011). These studies generally estimate aggregated production functions where infrastructure expenditure or roads are treated as a factor of production (García-Mila et al., 1996). Results, for a variety of outcomes, are mixed. Unfortunately, this literature struggles to address endogeneity concerns, that is, the fundamental problem that transport policy improvements are not randomly allocated, but are spatially targeted to meet specific economic and travel-related demands.
Several recent papers have tackled this problem of endogenous transport improvements using various identification strategies. These include: a) using historical transport plans as instruments, under the assumption that the original plans are unrelated to current economic conditions; b) using physical geography as an instrument, under the assumption that physical geography is unrelated to current economic conditions; or c) assuming that some places are incidental beneficiaries of transport links, e.g. those located between big cities. Some papers use combinations of these ideas.
A number of papers in the US have used instruments derived from historical transport plans (or older routes) to look at various outcomes such as: urban growth (Duranton and Turner, 2012), road traffic (Duranton and Turner, 2011), trade patterns (Duranton et al., 2014), sub-urbanisation (García-Lopez et al., 2015; Baum-Snow, 2007), commuting patterns (Baum-Snow, 2010) and demand for skills (Michaels, 2008). These papers usually capture the effect of transport using connectivity to the network or some measure of the spatial density of the network. Baum-Snow et al. (2012) use a similar idea for China. A number of studies use the second strategy we outline above. For example, Faber (2014) uses optimal transport routes derived from physical geography, while Banerjee et al. (2012), Michaels (2008) and Jedwab and Moradi (2015) use straight line paths between cities as instruments. Finally, several papers use the third strategy and claim that treatment of locations between cities or other network nodes is incidental to the aims of the transport policy, and therefore exogenous (e.g. Chandra and Thompson, 2000; Holl, 2004a; Melo et al., 2010; Ahlfeldt and Feddersen, 2010; Ghani et al., 2015).
Studies of firm-related outcomes are relatively rare. Ghani et al. (2015) study a major national highway improvement programme in India and find that districts within 10 km of the non-nodal sections of highways saw increases in entry rates and productivity in the manufacturing sector, compared with districts further away. Their identifying assumption is that the location of highway links between cities is exogenous, and they test this assumption by comparing with planned highways that were not constructed. Holl (2012) links a small panel of firms to road network-based market potential indices for 573 municipalities in Spain, and finds negative impacts of market potential on value-added in firm fixed effects specifications. She recovers positive effects by applying System GMM with instruments based on lags of the control variables, historical instruments and geology. A concern with this approach is that the instruments are arguably drivers of the long run historical evolution of the municipalities, so are likely to affect firms through non-transport related channels. Other papers have looked at firm relocation and entry (Coughlin and Segev, 2000; Holl, 2004a and 2004c; Rothenberg, 2013) or birth (Holl, 2004b; Melo et al., 2010). Hornbeck (2015) and Li (2013) use the construction of the Chinese highway system to suggest that improvements reduce inventories.
The identification strategy in our paper differs substantively from these previous approaches. We avoid instrumental variables approaches based on historical networks or plans (or any other historical variables) because: a) they would not be very relevant for the relatively small additions to the road network that we study; and b) we prefer not to rely on the assumption that historical transport-related variables influence current economic performance only through the current road network. Instead, we address the concern that transport schemes are endogenously targeted by estimating from treated places only; we do not use non-targeted places as a comparison group.
Estimation is based on variation in the intensity of treatment within targeted locations, where treatment intensity is the magnitude of the change in accessibility induced by the new road scheme. Our paper is, to our knowledge, the first to exploit localised changes in road network accessibility in this way to identify the causal effects of transport improvements on production-related variables, using firm level micro data. The research design is discussed in detail in the next section.

Empirical methods
We measure the intensity of a firm's exposure to road improvements using an index of changes in 'accessibility', derived from changes in minimum travel times along the road network to potential destinations. These changes occur when new road links are constructed or existing links improved.
We carry out our analysis at firm level and for small spatial units -electoral wards. The ward-level analysis allows us to capture effects working both through changes within firms, and through entry, exit and relocation of firms. The firm-level analysis captures within-firm changes only. Both approaches use the same general estimation strategy, applied to a panel of units (wards or firms) observed for up to 11 years, during the period 1998-2008. Data sources are described in Section 4.

General empirical set up
Our aim is to estimate the expected change in employment or productivity-related outcomes caused by a change in minimum travel times along the road network, induced by infrastructure improvements. We start from the basic regression equation:
$y_{jt} = \beta A_{jt} + \mu_j + \tau_t + \varepsilon_{jt}$ (1)
Here $y_{jt}$ is the outcome variable for unit j in year t, and $A_{jt}$ is a measure of travel time-based accessibility along the road network from origin unit j at time t. Unobservable factors include unit-specific, time-invariant components ($\mu_j$), year-specific, unit-invariant components ($\tau_t$) and year-by-unit varying components ($\varepsilon_{jt}$).
The accessibility index at j is a proximity-weighted sum of activity at k, where the proximity of k to j is a decreasing function of minimum journey times along the network, $a(\mathit{time}_{jkt})$:
$A_{jt} = \sum_{k \neq j} a(\mathit{time}_{jkt}) W_{k0}$ (2)
The weights $W_{k0}$ depend on the level of economic activity at destinations k in some base period. For the main part of our analysis the destination weights are 1997 employment, which precedes the first period in the estimation sample, and the function $a(\cdot)$ is defined by a simple inverse distance decay. Minimum journey times along the major road network are imputed from GIS network analysis. In existing literature, this index has been variously called an index of accessibility (e.g. El-Geneidy and Levinson 2006, Vickerman et al. 1999), population or market potential (e.g. Harris 1954), effective density (Graham 2007), or market access (e.g. Donaldson and Hornbeck 2015). We use the generic word 'accessibility' throughout since we make no judgement on whether the effects work through access to markets or access to something else. We show results using alternative weights (e.g. employment, population etc.) and distance decay relationships, but the different indices are extremely highly correlated. Therefore it does not make sense in our context to place a theoretical interpretation on the accessibility index based on the choice of destination weights or functional form.
[9] Travel times are a proxy for travel costs. A generalised measure of transport costs would require additional information on other characteristics of infrastructure (e.g. reliability), vehicle and energy use, as well as labour, insurance, tax and other charges (such as tolls). However, as demonstrated by Combes and Lafourcade (2005), using detailed French data, most of the spatial variation in transport costs is driven by time savings through infrastructure improvements.
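As an illustration of how an index of this kind responds to a single travel time change, the following is a minimal sketch rather than the computation used in the paper. It assumes the decay function is the reciprocal of travel time in minutes (one reading of "simple inverse distance decay"), and the ward names, travel times and employment figures are hypothetical.

```python
def accessibility(origin, travel_time, base_weight, decay=lambda t: 1.0 / t):
    """Proximity-weighted sum of base-period activity over destinations k != origin.

    travel_time maps (origin, k) -> minimum journey time in minutes (assumed unit).
    base_weight maps k -> base-period activity, e.g. employment.
    decay is an assumed inverse decay function.
    """
    return sum(
        decay(travel_time[(origin, k)]) * weight
        for k, weight in base_weight.items()
        if k != origin
    )


# Hypothetical data: three wards, weights fixed at their base-period values.
base_weight = {"a": 500.0, "b": 1000.0, "c": 2000.0}
time_before = {("a", "b"): 20.0, ("a", "c"): 40.0}
# A new link cuts the a-to-c journey from 40 to 25 minutes.
time_after = {("a", "b"): 20.0, ("a", "c"): 25.0}

before = accessibility("a", time_before, base_weight)
after = accessibility("a", time_after, base_weight)
```

Because the weights are held at base-period values, the index moves from 100 to 130 here purely because of the shorter journey time, which is the only source of over-time variation the design relies on.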
The parameter of interest is $\beta$ in (1), interpreted as the causal effect of accessibility on economic outcomes. OLS estimates of $\beta$ are very likely to be biased, because accessibility is non-random across space and time, and so is correlated with the unobserved components. OLS regression compares units with high accessibility and units with low accessibility that are non-comparable on many unobserved dimensions. Part of this correlation occurs through fixed-over-time components of accessibility and of the unobservables. In particular: a) faster transport connections may have been built to link more productive places; b) dense places may be more productive, and origins and destinations j and k are by definition closer together and network travel distances shorter in denser places, implying greater accessibility; c) the weights $W_{k0}$, if based on measures of economic activity, will be endogenous if the outcomes in j and in connected destinations k are affected simultaneously by unobserved common productivity advantages.
A standard first step to eliminating the endogeneity induced by these fixed-over-time components is to control for unit j fixed effects in (1), using standard within-groups regression (along with dummies to estimate general time effects $\tilde{\tau}_t$):
$\tilde{y}_{jt} = \beta \tilde{A}_{jt} + \tilde{\tau}_t + \tilde{\varepsilon}_{jt}$ (3)
Here the notation indicates $\tilde{x}_{jt} = x_{jt} - \bar{x}_j$, where $\bar{x}_j$ is the mean of $x_{jt}$ over time for unit j, and estimation of $\beta$ comes from the within-unit changes in $A_{jt}$. Given the structure of $A_{jt}$ in Equation (2) and the way we construct it (as described below in Section 4.3), these changes occur only through changes in minimum travel times between j and destinations k along the road network, caused by new or improved road infrastructure (where these changes are weighted by destination employment in the base period).
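The within-groups idea can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's estimation code: it assumes a balanced panel, in which removing unit means and year means (and adding back the grand mean) gives the same slope as unit fixed effects plus year dummies, and it uses a single regressor. The function name and the synthetic data are hypothetical.

```python
from collections import defaultdict


def within_beta(rows):
    """Within-groups slope for a balanced panel.

    rows: list of (unit, year, y, A). Unit and year means are removed from
    both y and A before computing the least-squares slope.
    """
    def two_way_demean(idx):
        grand = sum(r[idx] for r in rows) / len(rows)
        unit_sum, unit_n = defaultdict(float), defaultdict(int)
        year_sum, year_n = defaultdict(float), defaultdict(int)
        for r in rows:
            unit_sum[r[0]] += r[idx]
            unit_n[r[0]] += 1
            year_sum[r[1]] += r[idx]
            year_n[r[1]] += 1
        return [
            r[idx] - unit_sum[r[0]] / unit_n[r[0]]
            - year_sum[r[1]] / year_n[r[1]] + grand
            for r in rows
        ]

    y_tilde = two_way_demean(2)
    a_tilde = two_way_demean(3)
    return sum(y * a for y, a in zip(y_tilde, a_tilde)) / sum(a * a for a in a_tilde)


# Synthetic panel with a known slope of 0.35 and no residual noise.
unit_effect = {"u1": 5.0, "u2": -2.0, "u3": 1.0}
year_effect = {1998: 0.0, 1999: 0.4, 2000: 0.9}
access = {"u1": [1.0, 1.0, 1.0], "u2": [2.0, 2.5, 2.5], "u3": [3.0, 3.0, 3.8]}
rows = [
    (u, yr, 0.35 * access[u][i] + unit_effect[u] + year_effect[yr], access[u][i])
    for u in access
    for i, yr in enumerate(sorted(year_effect))
]
beta_hat = within_beta(rows)
```

The unit and year effects drop out of the transformed data, so the slope is identified only from accessibility changes within a unit over time, net of common year shocks.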
Now the concern is that changes in infrastructure incorporated in $\tilde{A}_{jt}$ are correlated with changes in the time varying unobservables for unit j ($\tilde{\varepsilon}_{jt}$) if, for example, new road infrastructure is targeted at places that are experiencing better or worse than average productivity trends. To deal with this problem, we exploit the fact that in our data changes to minimum travel times are the result of a number of discrete road transport improvement schemes put in place in Britain over our study period (31 schemes over the period 1998-2008). It is primarily the location of these schemes within Britain that is potentially endogenous, due to policy targeting, not the accessibility changes occurring for units that happen to be close to these schemes. We can therefore control for endogenous scheme placement by controlling for geographical fixed and time varying effects related to scheme location. We do this in two ways. Firstly, we restrict our sample to units j within a given distance buffer b of the nearest transport scheme (20 km in our main results). In this case, identification comes from comparison of units experiencing larger accessibility changes with units experiencing smaller accessibility changes, amongst the sub-sample of units that are all in close proximity to the road schemes that open over our study period. Secondly, we control for differential trends for each scheme within this sub-sample, by interacting nearest-scheme dummies with linear time trends.
Our identifying assumption is therefore that the variation in transport-induced, within-unit changes in accessibility is as good as random, when we compare units within a given radius of a particular road transport scheme. There are good arguments to support this assumption. The road schemes in our analysis are generally bypasses and motorway extensions that were intended to improve traffic flows between origins and destinations that are remote to the sites of the schemes (Department of Transport, 1997; Department of Transport, 2009). The variation in the changes in accessibility close to a scheme, while large relative to the changes elsewhere, is therefore an incidental by-product of the scheme rather than its intended outcome. Related arguments have been made in other papers, e.g. Michaels (2008) argues that counties in intermediate rural locations between cities may be incidental beneficiaries of highways built between them. However, in our case we are not using this argument to claim that the scheme location is exogenous, but that the incidental treatment means that the variation in accessibility changes amongst units in close proximity to each scheme location is exogenous. Our study is unique in using this variation in accessibility changes within distance buffers of specific road transport schemes to identify the effects of transport improvements. There are remaining concerns if the schemes are sited in such a way that their precise position and routing, and the accessibility changes they introduce, are correlated with very localised differential productivity trends. For instance, the route for a bypass may be chosen on the basis of low land prices, which in turn could indicate low potential productivity growth. Or, production close to schemes may be temporarily or permanently disrupted by construction works. We take a number of steps to mitigate these problems.
Firstly, we drop units which are crossed by, or very close to, improvements (within 1 km of any scheme). This also gets rid of rather mechanical sources of impact, such as service stations and fast food outlets built to serve road traffic. Secondly, we augment our regressions with further controls for differential trends over time and space within each nearest-scheme group. In the results, we show specifications with linear time trends interacted with: the straight line distance to nearest scheme; a dummy indicating observations in the periods after opening of the nearest scheme; and the level of accessibility at the beginning of our study period in 1997. We also experimented with trends interacted with salient electoral ward characteristics taken from 2001 Census data (unemployment rate, average age of population, proportion of population aged 16-74 with higher education and proportion of population living in social housing) and discuss results for these in the text. Finally, there are also often long delays (10 years or more) between commissioning and opening of schemes, which further weakens any link between local productivity trends and the decisions over exactly where to site these projects.
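The sample restriction described above (keep units within the buffer of their nearest scheme, drop those within 1 km) can be sketched as follows. This is a simplified illustration: schemes are treated as single points with planar coordinates in kilometres, whereas real schemes are road segments, and all identifiers and coordinates are hypothetical.

```python
import math


def nearest_scheme(unit_xy, schemes):
    """Return (scheme_id, straight-line distance in km) for the closest scheme."""
    return min(
        ((sid, math.dist(unit_xy, xy)) for sid, xy in schemes.items()),
        key=lambda pair: pair[1],
    )


def estimation_sample(units, schemes, inner_km=1.0, outer_km=20.0):
    """Keep units more than inner_km and at most outer_km from their nearest scheme."""
    sample = {}
    for uid, xy in units.items():
        sid, dist_km = nearest_scheme(xy, schemes)
        if inner_km < dist_km <= outer_km:
            sample[uid] = (sid, dist_km)
    return sample


# Hypothetical schemes and ward centroids, coordinates in km.
schemes = {"s1": (0.0, 0.0), "s2": (100.0, 0.0)}
units = {
    "w1": (0.5, 0.0),    # 0.5 km from s1: too close, dropped
    "w2": (0.0, 12.0),   # 12 km from s1: kept
    "w3": (50.0, 0.0),   # 50 km from either scheme: outside the buffer
    "w4": (100.0, 5.0),  # 5 km from s2: kept
}
sample = estimation_sample(units, schemes)
```

The nearest-scheme identifier returned for each retained unit is what a scheme-specific linear trend would be interacted with.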
We estimate (1), (3) and their extensions that include scheme fixed effects and other control variables, using ward and plant level data. There are some specific points to consider when estimating the fixed effects regression Equation (3) for plant level data. Firstly, note that the plant identifiers are location specific (changing if a plant moves to a different location). Thus, in the within-plant analysis, changes in accessibility are not caused by relocation of plant j, but only by changes in the transport network for a fixed plant location. Therefore estimation requires plants to appear in the data both before and after the opening of the transport schemes that are used as the source of identifying variation. This means these plant level regressions do not capture changes in employment or productivity associated with the opening of new plants. In addition there are potential sample selection issues, if the firms that stay in response to transport improvements differ on unobserved dimensions from those that relocate (with a similar comment applying to the frequency with which firms appear in the same location multiple times in the data). These caveats aside, estimation of $\beta$ from within-plant changes gives the micro-level impacts of improvements on firms, which are one component of the area level effects, as well as being interesting in their own right.
Note that the estimation strategy at both ward and plant level ignores whether or not specific firms or their employees and customers in fact use the network improvements. The effects are thus analogous to 'intention to treat' estimates in the programme evaluation literature, and are the expected changes for firms or areas exposed to the 'treatment' (change in road transport accessibility).

Justification for using accessibility to measure exposure to transport improvements
The accessibility index in Equation (2) [...] the amount of public expenditure in an area on road infrastructure (Fernald, 1999). In our context, the major road network is already very developed and dense (49,816 km long in 1998, in a land area of about 230,000 km²) and does not expand much during our study period (increasing by 0.87% to 50,250 km by the end of 2008). We are also using small geographical areas and plant level data. This means that measures based only on whether a unit is crossed by a road, or changes in the number of road kilometres, or other simple indicators are unlikely to exhibit much variation (or are meaningless, in the case of plant level data). Proximity to roads is one viable alternative indicator. However, when new road scheme location is endogenous, distance to a new road is a poor indicator of network exposure, because it is infeasible to separate the influence of new transport infrastructure from the influence of the place in which the new transport infrastructure is located. In fact, as discussed in Section 3.1, we view distance to the road schemes as a potentially important control variable, not an index of treatment.
Using an accessibility index (Equation (2)) has the key advantage that it varies continuously over space in ways that are partly unrelated to distance to improvements.[14] This helps identify the effects of transport improvements separately from the specific advantages or disadvantages of sites chosen for improvements. It also means we can potentially observe the degree of treatment for all firms, irrespective of whether they are close to the site of the road improvement (though clearly firms closest to the improvements are more likely to use these new links, and hence tend to be the most exposed). Other studies have employed similar indicators for this purpose, e.g. Graham (2007) uses cross sectional variation (but not changes) in accessibility. Holl (2012) and Donaldson and Hornbeck (2015), like us, use changes in accessibility (or 'market potential'/'market access') as a treatment variable, but do not exploit the variation in accessibility within localised areas for identification, which is our key contribution.
[14] Donaldson and Hornbeck (2015) also claim that a continuous market access index of this type avoids the problems of spillovers in the effect of treatment that are inherent in designs with discrete, neighbouring, treatment and control areas. With an accessibility/market access index, all areas are treated to a greater or lesser degree. This is arguable, since spillovers from neighbouring areas with big accessibility gains to neighbouring areas with smaller accessibility gains will also lead to biased estimates of the accessibility impact.

Geographical units and area controls
Our analysis is based on plant level micro data. We have detailed information on the location of plants (postcodes, equivalent to around 17 houses or a medium sized plant) and can link this data geographically at various levels using the Office for National Statistics (ONS) National Statistics Postcode Directory. For most of our analysis we work with aggregates for approximately 10,300 electoral wards. Wards are defined to have roughly equal electorates and are geographically small in dense areas. We use wards as defined in 1998.
To construct ward level control variables we use the GB Census 2001 to calculate the share of population aged 15-64 with higher education, mean age of population, share of population living in social housing and the rate of unemployment. We also calculate straight line distances from each ward to the nearest scheme (undertaken at any point during our study period) using GIS and the dataset of transport schemes described in Section 4.3.

Firm data
Data for the analysis of employment and plant counts (number of establishments) at ward level, and for analysis of plant level employment, is from the Office for National Statistics (ONS) Business Structure Database (BSD), accessed through the UK's Secure Data Service. We use data from 1997 to 2008 to construct both dependent variables and the accessibility index. The BSD contains a yearly updated register of the universe of businesses in the UK covering about 98% of business activity (by turnover). For consistency with our productivity data, described below, we do not use data for years past 2008.
The smallest unit of observation is the establishment or plant ('local unit', LU), but there is also information on the firm to which the plant belongs ('reporting unit', RU). The dataset provides detailed information on location (postcode), sector of production (up to 5 digit SIC) and employment in plants. We can calculate employment and number of establishments at any geographical level aggregating up from postcodes.
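Aggregating plant records up from postcodes amounts to a lookup followed by a grouped sum and count. The sketch below is illustrative only; the record layout, postcode labels and ward names are hypothetical and do not reflect the actual BSD schema.

```python
from collections import defaultdict


def ward_totals(plants, postcode_to_ward):
    """Sum employment and count plants per ward for one year of plant records."""
    totals = defaultdict(lambda: {"employment": 0, "plants": 0})
    for plant in plants:
        ward = postcode_to_ward[plant["postcode"]]
        totals[ward]["employment"] += plant["employment"]
        totals[ward]["plants"] += 1
    return dict(totals)


# Hypothetical plant records and postcode-to-ward lookup.
plants = [
    {"postcode": "PC1", "employment": 12},
    {"postcode": "PC2", "employment": 30},
    {"postcode": "PC3", "employment": 5},
]
postcode_to_ward = {"PC1": "ward_x", "PC2": "ward_x", "PC3": "ward_y"}
totals = ward_totals(plants, postcode_to_ward)
```

Swapping the lookup table for one keyed to a different geography would give the same aggregates at any other level.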
For the productivity regressions, we use the ONS Annual Respondents Database (ARD). 16 The ARD holds responses to the Annual Business Inquiry (ABI) completed by a stratified random sample of units, extracted from the BSD (see Criscuolo et al, 2003

The road schemes and road network data
Information on completed road schemes for the British major roads network comes from the Department for Transport (DfT) and other sources including The Highways Agency, the Motorway Archive, Transport Scotland, Wikipedia and other web based sources. We consider improvements carried out on trunk roads, principal roads (class A) and motorways. These roads represent only 13% of total road network length, but correspond to 65% of driven kilometres (Transport Statistics Great Britain, 2010). We focus on major roads for two reasons. The first is data availability: detailed data on road projects is only available for major schemes. The second is that these are the schemes we expect to have a substantial effect on travel times between wards.
Figure A1 in the Appendix shows the major road network in 2008, while Figure 1 shows the location of the schemes we consider. Projects are scattered all over Britain.
This information on new road links is combined with a snapshot, GIS road network for 2008 provided by the DfT. This DfT road network is generalised and covers only major roads.
Combining the road scheme and network data sources allows us to reconstruct the major road network, complete with location, length and travel speeds of the road links, for each year between 1998 and 2008. We start with the 2008 network and locate all the road links belonging to each of the 31 schemes described above and listed in Table A1. By working backwards in time and deleting the new links opened in every year, we reconstruct the network as it was at the beginning of each year back to 1998.
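The backward reconstruction can be illustrated with a minimal sketch. It is not the procedure actually run on the DfT network: the link identifiers, the `opening_year` mapping and the function name are hypothetical, and we assume that the 2008 snapshot contains every scheme link and that a link opened during year y is absent from the network at the beginning of year y.

```python
from typing import Dict, Set, Tuple

Link = Tuple[str, str]  # hypothetical link identifier: (from_node, to_node)


def reconstruct_networks(
    links_2008: Set[Link],
    opening_year: Dict[Link, int],
    first_year: int = 1998,
    last_year: int = 2008,
) -> Dict[int, Set[Link]]:
    """Return the set of links present at the beginning of each year.

    Links that do not appear in `opening_year` are treated as pre-existing
    and are kept in every year (an assumption of this sketch).
    """
    networks: Dict[int, Set[Link]] = {}
    current = set(links_2008)
    # Work backwards from the snapshot, deleting the links opened in each year.
    for year in range(last_year, first_year - 1, -1):
        current = {link for link in current if opening_year.get(link) != year}
        networks[year] = set(current)
    return networks
```

For example, a link opened in 2003 is present in the reconstructed networks for 2004 onwards and absent from those for 2003 and earlier.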

Origin-destination travel times and accessibility index construction
The essential first ingredient in our accessibility index $A_{jt} = \sum_{k \neq j} a(time_{jkt}) W_{k0}$ from Equation (2) is the set of bilateral, unit-to-unit road travel times ($time_{jkt}$). Changes in these travel times from year to year are due to the construction of new road links. We calculate $time_{jkt}$ in (2) using minimum journey times routed along the transport network in each year. Transport improvements change the structure of the network and this changes journey times between some origin units j and destinations k. This in turn changes the accessibility index.
There are three potential channels for these changes in journey times. Firstly, a transport improvement that involves a journey time reduction on a road link p-q will have a first order effect on the time between j and k if the quickest route between j and k passes along the link p-q both pre- and post-improvement. Secondly, the quickest route between j and k may not use link p-q pre-improvement, but switches to use the link p-q post-improvement because of the reduction in journey time. Lastly, second order effects arise when the quickest route between j and k does not use link p-q pre- or post-improvement, but other traffic switches to use link p-q, which reduces congestion on the quickest route between j and k. In our empirical work we exploit only the first two of these channels. We ignore second order effects because our network data does not allow us to observe travel time changes induced by changes in congestion resulting from improvements.
With the time-varying ward-to-ward O-D minimum journey time matrix in hand, we need to specify the proximity function $a(time_{jkt})$. This is a decreasing function of the minimum ward-to-ward travel time and, in line with common practice, we use a simple inverse-time weighting scheme for our main specifications ($a(time_{jkt}) = time_{jkt}^{-1}$). We show alternatives in robustness checks.
Finally, when constructing the accessibility index for our main regressions, we use workplace-based employment in destination wards (from the BSD) as weights ($W_{k0}$), measured in 1997 before the start of our estimation sample. We also show results for alternative residential population weights, and constant weights. Note that when we aggregate the components up to form the accessibility index, we exclude location j from its own accessibility index, to mitigate potential endogeneity problems.
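The construction of the index can be sketched as follows: minimum journey times from an origin are found by shortest-path routing, and the index sums base-period destination weights divided by travel time, excluding the origin itself. This is a toy illustration and not the GIS routine used for the paper. The graph representation, the function names and the optional `max_minutes` cut-off are assumptions made for the example, and travel times are assumed to be strictly positive.

```python
import heapq
from typing import Dict, Optional

Graph = Dict[str, Dict[str, float]]  # node -> {neighbour: travel minutes}


def min_journey_times(graph: Graph, origin: str) -> Dict[str, float]:
    """Dijkstra's algorithm: quickest time from `origin` to each reachable node."""
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, minutes in graph.get(node, {}).items():
            candidate = elapsed + minutes
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return best


def accessibility(
    graph: Graph,
    weights: Dict[str, float],
    origin: str,
    max_minutes: Optional[float] = None,
) -> float:
    """Sum of destination weights times inverse travel time, excluding the origin."""
    times = min_journey_times(graph, origin)
    total = 0.0
    for destination, weight in weights.items():
        if destination == origin or destination not in times:
            continue  # own location excluded; unreachable wards contribute nothing
        minutes = times[destination]
        if max_minutes is not None and minutes > max_minutes:
            continue
        total += weight / minutes  # inverse-time proximity function
    return total
```

Adding a new, faster link to `graph` and recomputing `accessibility` for the same origin with unchanged weights gives the change in the index attributable to that link, which mirrors the first two channels described above.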
However, accessibility changes are bigger for wards closer to new road schemes, because it is in these wards that the new road links make the most difference to the minimum journey times to all potential destinations. It is the variation within these narrower distance bands from which we identify the effects in our regression analysis. Within 10 km of a scheme the mean change is 1.2% and the 90th percentile is 3.2%. Within 20 km, which we use in our base specification, mean accessibility change is 1.2% and 90th percentile is 2.0%. Within 30 km these figures are 0.95% and 1.7%, respectively.

Descriptive statistics
As discussed in Section 1, alternative destination weights such as employment and population and different distance decay functions can be used in constructing the index. It is tempting to try to use these different indices to measure access to specific economic inputs or markets. However, the similarity in the spatial distribution of population, employment and other economic variables means that, in practice, indices computed using different weights are very highly correlated.
Appendix Table A2 shows the summary statistics for the 1998-2008 changes in various alternative indices, and their correlation with the 1998-2008 changes in our preferred index using 1997 destination ward employment weights (for the 1-20km band). The means and standard deviations for indices with different destination weights are nearly identical, and differences due to alternative weighting schemes arise simply through a matter of scale (higher weights imply aggregation of employment over a shorter range). Evidently, all the correlations are above 0.9, and most are above 0.98, so it will be infeasible to identify their separate contributions in a regression analysis. Figure 1 and Figure 2 illustrate the spatial relationship between road schemes and the accessibility increases they cause. 17 The left panel of Figure 1 shows new roads and major improvements and the right panel shows accessibility improvements. Figure 2 zooms in on wards in the Manchester-Leeds area in order to illustrate the identification strategy. New links are indicated by bold lines.
Clearly the effect of improvements on accessibility varies considerably across wards in the vicinity of the same improvement. As detailed in Section 3.1, our identifying assumption is that the variation in accessibility changes across wards within narrow distance buffers around these links is incidental to the policy aims of the road scheme, and can be treated as exogenous (especially conditional on additional scheme-specific fixed effects, and other local time trends). Our preferred distance buffer in the regression analysis is 20 km which, as can be seen from the map scale, is a small distance compared to typical road links in our data.
One concern from Figure 1 and Figure 2 might be that changes in accessibility induced by transport improvements are correlated with pre-treatment trends in area characteristics. Table 2 provides a direct test of this, showing results from the regression of 1981-1991 changes in area characteristics on the 1998-2008 log accessibility changes, conditional on our baseline controls: 1998 accessibility, distance to scheme and scheme specific dummies. We do not have information on the characteristics of firms prior to our sample period, hence rely on residential population demographic characteristics from the 1981 and 1991 population Census. These regressions provide 'placebo' tests, in the sense that we do not expect to see impacts from transport improvements on changes prior to the improvements occurring. Indeed, looking over the 24 coefficients in this table, none of them is significant at conventional levels, suggesting that accessibility and pre-transport-improvement trends in local characteristics are uncorrelated. Overall these findings suggest that, within our preferred 1-20km distance band, changes in accessibility are uncorrelated with pre-treatment trends.
For reference, Appendix Table A3 provides further descriptive statistics for employment and numbers of plants in the wards in our estimation samples.

Ward-level employment and plant count regressions
Results, presented in Table 3 given that employment will be subject to greater survey measurement error. In column 2 we add nearest-scheme specific trends to the ward fixed effects regression (i.e. it includes interactions between nearest-scheme dummies and a linear time trend). Column 3 introduces a time trend interacted with distance to the scheme, with a dummy for years after scheme opening, and with accessibility in 1997, all to allow for differences in trends across space close to the schemes, and general post-operation changes. Introducing these additional control variables improves the statistical significance (all at the 5% level or better) and shifts the point estimates around slightly, but not by much relative to standard errors. The effects on employment are generally slightly larger than the effects on number of plants, although again the differences are not large relative to standard errors. Results do not change substantively, for specifications (not reported) where we included an interacted time trend with a set of census variables for each ward, to allow for time patterns related to the underlying demographics.
Expanding the sample cut-off distance around the schemes in column 4 leaves the results largely unchanged. Reducing the distance to within 10 km of schemes in column 5 leads to smaller coefficients, and for employment, results are insignificant (although again less than one standard error below the largest point estimates in previous columns). The weaker results in the 10km band might seem surprising given that the variation in accessibility is biggest closest to the schemes (Table 1). However, close to schemes, a greater proportion of this variation is due to measurement error in the travel time calculations, since we do not have detail on minor road journeys from wards to the new major roads (which is an additional reason for excluding wards within 1 km, see Section 3.1). Sample sizes are also smaller.
The headline story from these results is that accessibility changes induced by road improvements drive up local employment and the number of local plants, with an elasticity of around 0.3-0.4.
These estimates appear quite large, but remember that the changes in accessibility are small (see Table 1). On average, within 20 km the mean change in accessibility was only 0.83%, implying an increase in employment of 0.25% and an increase in plants of 0.33%.
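As a rough check on these magnitudes, the implied effect is the elasticity multiplied by the percentage change in accessibility. The symbols $\varepsilon$, $Y$ and $A$ are introduced here for illustration only, and the pairing of 0.30 and 0.40 with the two outcomes is an assumption: these are simply the ends of the 0.3-0.4 range that reproduce the quoted figures.

```latex
\begin{align*}
\%\Delta Y &\approx \varepsilon \times \%\Delta A \\
0.30 \times 0.83\% &\approx 0.25\% \\
0.40 \times 0.83\% &\approx 0.33\%
\end{align*}
```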
The panels in Table 4 show results by broad industrial sector for specification 3 in Table 3. 18 The results suggest that most of the action on employment and plants (in terms of the size of the effects) comes from producer services, land transport and 'other' sectors (a residual category that includes the primary and public sectors). The number of plants in the manufacturing sector also responds strongly, although this does not show up in the employment figures (presumably implying that new plants in the manufacturing sector are small). In additional results not reported, we find, as before, that expanding to a 30 km distance band leaves results unchanged, while reducing the area to within 10 km leads to smaller, less precise estimates. Both the land transport and producer services effects are consistent with a story in which road improvements lower transport costs for intermediates and business travel and stimulate employment in the logistics sector. Additional results which break down the transport sector effects (not reported) confirm that the strongest positive effects come for employment and plants within the land transport sector, and specifically within the road freight and cargo handling sectors.
counts of post office residential delivery addresses (taken from the ONS National Statistics Postcode Directory) in column 1; ward population weights from the 1991 GB Census in column 2; ward plant counts from the BSD in column 3; ward level (residence-based) employment from the 1991 census in column 4; and no weights in column 5, so the accessibility index is simply the sum of the inverse minimum travel time to all potential destinations (within 75 mins). It is evident from the similarity in all these results that changing the definition of accessibility is immaterial, which is not surprising given the high correlations shown in Appendix Table A2.
We have also estimated specifications using destination ward BSD employment in the current year as weights, instead of 1997 employment weights. When instrumenting this accessibility index with the version based on 1997 employment weights, results are essentially unchanged from the reduced form versions reported so far.

Robustness checks: alternative accessibility indices, distance bands, spatial autocorrelation
Columns 6-8 change the proximity function a(timejkt) in Equation (2)  However, if we standardise the effects (multiply by the standard deviation of the accessibility variables reported in Appendix Table A2) we find a more stable pattern of effect sizes. For example, for the accessibility index using inverse time weighting (with a parameter of -1) the standardised effect on employment for a one standard deviation increase in accessibility is 0.365 x 1.97 = 0.72% (from Table 1 and column 3, Table 3). For the exponential time decay function in column 8 of Table 5, the standardised effect size is 2.13 x 0.46 = 0.98%.
Other robustness checks included estimating the regressions using samples within distance bands (e.g. 11-20 km, 21-30 km), so we are comparing firms that are at similar distances to the road schemes.
We find strong effects within these bands, indicating that our results are not driven simply by firm relocation to sites close to schemes. We also ran regressions with differential time trends according to initial level of employment or number of plants (by interacting initial levels of the dependent variable with time trends), to allow for potential mean reversion, but find the results substantively unchanged.
In sum, there is no evidence that the results are substantively sensitive to changes in the definition of the accessibility index, or to other changes in specification.
One concern when using data on closely spaced wards and firms is that the unobservables in our regression models may be spatially autocorrelated, leading to biased standard errors and incorrect inference. Given we include ward fixed effects, scheme specific trends, and distance to scheme trends as control variables, this problem is probably not as important as it might at first seem. We need only be concerned about spatial autocorrelation in the deviations around these fixed effects and trends, not the simple cross sectional patterns. Nevertheless, we test for this directly: Moran's I statistics computed on the residuals from our regressions take tiny values (less than 0.01), showing no evidence of strong spatial autocorrelation. In addition, re-running our main regressions with standard errors clustered at a larger geographical level, the Census district, makes very little difference.
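For readers unfamiliar with the statistic, Moran's I for residuals with a spatial weights matrix can be computed as in the sketch below. The weights matrix used for our tests is not specified here, so the dense list-of-lists `weights` argument and the function name are assumptions made for illustration.

```python
from typing import Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z = x - mean(x)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    variance_term = sum(zi * zi for zi in z)
    return (n / s0) * cross / variance_term
```

Values near zero indicate little spatial autocorrelation, positive values indicate that neighbouring units have similar residuals, and negative values indicate that neighbours tend to differ.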

Plant-level employment regressions
The main results in the preceding tables suggest that increased accessibility leads to increased ward employment and number of plants, at least for some sectors. These findings could be driven by existing firms increasing employment, or by new firms entering. The plant count results show that firm entry appears to contribute to employment changes, but we can explore the issue further by looking at within-plant changes. To do this, we estimate the effect of log accessibility on log plant employment using plant level data from BSD for 1998-2008. Table 6 presents the key results using a similar structure to Table 3. Additional control variables in these plant level regressions are industry-year dummies (using the 6 broad sectors used for the sector-specific results above). As before, standard errors and F-statistics are clustered at ward (i.e. treatment) level.
The baseline plant fixed effects estimate in column 1 indicates plant size reductions in response to changes in road transport accessibility, although the coefficient is not statistically significant.
Adding in further controls for local time trends in columns 2 and 3, and changing the sample to plants within a narrower area (10km distance) or wider area (30km distance) increases the coefficient slightly (in absolute terms) and improves the statistical significance, but the general picture remains unchanged. We also introduced interactions of linear time trends with GB Census characteristics as described at the end of Section 3.1, but this made little difference. A 1% increase in accessibility is associated with a 0.06-0.09% reduction in employees. Sector specific results (not reported) do not offer any strong insights, although interestingly there are (insignificant) positive employment effects for firms in the transport sector.
The evidence overall suggests, surprisingly, that incumbent plants exposed to accessibility changes as a result of transport improvements are, on average, reducing employment. Read in conjunction with Table 3, the clear implication is that transport improvements boost local employment through a strong net gain in the number of plants and associated employment, while existing plants cut back (marginally) on employment. These employment cuts within incumbent plants could be due to increases in the price of labour relative to other inputs, causing substitution from labour to those inputs, or reductions in scale due to local factor price increases induced by demand from new plants entering the locality. The productivity results in section 5.5 shed more light on this question.

Productivity and other production related outcomes
Although we find a negative response for existing firms on the employment margin, these firms may experience productivity gains if lower transport costs allow reorganisation resulting in increased output per worker. We explore this directly by using various output and input-related balance sheet variables in the ARD data as dependent variables. Specifications are similar to those in Table 6, supplemented to include a dummy and a trend specific to single plant firms or 'singletons' (available for the ARD, but not BSD). The regressions are weighted using ward-by-year specific employment weights derived from the BSD, in order to make the ARD sample more representative of the spatial employment structure in the BSD population. The key results for all sectors pooled together are in Table 7. We restrict attention to the 1-20 km distance band.
The headline story from the coefficients in the top panel is that there is very little impact on plant total output, total labour costs or the wage bill. There is an increase in inputs of goods, services, materials and road transport services, which in turn is associated with a reduction in value-added (which is gross output minus purchases of materials, goods and services). However, none of the coefficients is precisely estimated and we cannot rule out zero impact on any of these outputs and inputs. Once we switch to per-worker values in the second panel, as indicators of labour productivity, we find some stronger positive effects. Total labour costs and wage bill per worker (i.e. mean wages) increase, output per worker increases, as does the amount of non-labour inputs used per worker. The increases in the economic variables measured in per-worker terms are clearly due to reductions in employment at plant level, and the corresponding employment effects estimated from the ARD data are shown in the third panel of Table 7. The signs are consistent with the reductions in employment shown in the BSD data in Table 6, although the point estimates are larger in absolute magnitude. The overall picture in Table 7 is one in which output remains constant, but worker productivity is increased, with corresponding increases in wages and substitution from labour to non-labour inputs. Note however, that we can detect no labour productivity increases measured in terms of value-added per worker, because increases in output per worker are accompanied by increases in goods, services and materials purchases. Again, it is important to remember that the mean accessibility change within 20 km is 0.83%, so these coefficients imply induced output per worker and wage effects of around 0.25% as a result of these schemes.
Sector specific results (not reported) suggest that the strongest effects are in the manufacturing and consumer services sectors, which were some of the least responsive sectors in terms of aggregate employment in Table 4, although the picture is generally quite mixed and the effects imprecisely measured. The effects are also concentrated in larger firms with more than 10 employees.
Additional analysis for these productivity-related outcomes aggregated to ward level, in the first panel of Table 8, suggests that output and input cost effects are also large at the ward level, consistent with the increases in the number of plants documented in Table 3. However, these effects are often too imprecisely measured to be informative, with the exception of expenditures on input goods, services, materials and transport services, which are highly responsive. A 1% increase in ward accessibility is associated with a 2-2.75% increase in total expenditure on non-labour inputs and transport services. There are also increases in the per-worker mean outputs and input costs, which are consistent with the output per-worker increases in incumbent firms (Table 7).
Taken together with the sectoral results for employment, a picture emerges in which transport improvements induce entry of firms in most sectors apart from consumer services and construction/energy. At the same time, employment reductions occur in existing firms in all sectors apart from transport and manufacturing. Existing consumer services and manufacturing firms experience the most significant output per-worker and wage increases.

Conclusions
This paper estimates the impacts of recent improvements to the road network in Britain on a range of firm productivity-related and employment-related outcomes, using micro data at a very detailed geographic scale. Our results contribute to the evidence on the effects of transport on area and firm level economic outcomes, and provide unique evidence on the effects of relatively incremental changes in the network that are relevant to policy in developed economies.
We measure road transport access with a continuous index of accessibility based on minimum journey times, imputed from GIS network analysis. Our data-intensive research design uses policy evaluation methods applied to rich panel data. 'Treatment' as a result of road improvements is captured by changes in this accessibility index over time, in response to 31 new road link schemes over the 1998-2008 period. We identify the causal effects of changes in accessibility, from variation in this treatment amongst firms close to new road schemes, which mitigates biases arising from the location of schemes being potentially endogenous due to policy targeting. Places closest to the schemes are also those that experience the biggest changes in accessibility, because it is routes from these places that are most likely to make use of the new road links. Focussing on places close to schemes therefore makes the best use of the variation in accessibility generated by road network changes.
From our ward-level regressions we find strong evidence that road infrastructure improvement schemes increase the number of firms and employment in places that gain through better access to and along the road network. A 1% improvement in accessibility leads to about a 0.3-0.4% increase in the number of businesses and employment. The estimates range between zero and 1% according to sector and specification. Evidence from our plant level estimates suggests that, at the same time, incumbent firms shed workers, so employment gains must come about through firm entry. We detect output per worker and wage increases for incumbent plants: these plants maintain output whilst cutting workers, as they substitute to goods and services inputs. One theoretical story that is consistent with our findings (there may be others) is that accessibility improvements attract firms that benefit the most from transport accessibility, bidding up local wages relative to other input prices and transport costs. In response, incumbent firms (those that do not exit the area) substitute in-house labour with purchases of goods and services inputs. The sectoral picture is less clear, but reveals aggregate employment effects dominating in the producer services, transport and administrative sectors. Incumbent firms in most sectors cut employment, and the consumer services and manufacturing sectors experience the strongest labour productivity increases.
Our evidence does not shed light on whether these effects arise because new roads improve access to output markets, intermediate inputs or workers, or just reduce travel times in general, and we treat our accessibility index simply as an indicator of policy treatment. This accessibility index is identical in structure to those used previously in transport project appraisal and in the trade and spatial economic literature, where claims are sometimes made about the ability to distinguish market access effects from other changes. However, we show that accessibility indices constructed to measure access to destination employment, residential population, or simply the number of destinations are all highly correlated and yield similar results.
In common with all empirical work that estimates causal effects from statistical comparisons across time and place, it is impossible to know for sure whether these employment increases are additional to the economy as a whole. Our design ensures that the effects we observe are not estimated from simple displacement to areas near new road schemes from areas elsewhere in Britain, because we only estimate from variation within areas close to new road schemes.
However, it is fundamentally impossible to rule out more subtle effects where firms relocate precisely in response to the accessibility changes within the vicinity of the road schemes. If we were to assume that the local employment gains are additional, they appear substantial when roughly
We geo-locate all the road links belonging to each of the 31 schemes listed in Table A1
19 The National Transport Model provides "a means of comparing the national consequences of alternative national transport policies or widely-applied local transport policies, against a range of background scenarios which take into account the major factors affecting future patterns of travel". It is used to produce forecasts on traffic flows in order to design transport policies. The Road Capacity and Costs Model is one of the three sub-models included in the NTM and it corresponds to the highway supply module. The Road Capacity and Costs Model (FORGE) is used to show the impact of road schemes and other road-based policies. As explained in the DfT documentation: "The inputs to the Road Capacity and Costs Model are car traffic growth (based on growth in car driver trips) and growth in vehicle-miles from other vehicle types. This traffic growth is applied to a database of base year traffic levels to give future "demand" traffic flows. These are compared to the capacity on each link, and resulting traffic speeds are calculated from speed/flow relationships (which links traffic volumes, road capacity and speed) for each of 19 time periods through a typical week". One of the outputs of FORGE is therefore vehicle speeds by road type, and this is what we use in the calculation of travel times between wards.
local authority dummies. The regression predicts speeds from FORGE reasonably well (R-squared = 0.76). We then use the regression results and link characteristics in the 2008 network to predict travel times for links opened after 2003 for which no speed data is available. For some of the links, the predicted speed exceeded the speed limit. We replaced predicted speed with the speed limit for these links. It should be noted that the network is highly generalised. Journeys via the minor road network are not modelled, nor are forbidden turns and one way systems. All link intersections are treated as junctions. Moreover, journey times for the links may be imprecise. Changes in accessibility must therefore be regarded as approximate. This measurement error means our estimates of the effect of accessibility could be attenuated.
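The replacement of predicted speeds by the speed limit, and the resulting link travel time, can be sketched as below. The function name, argument names and units are hypothetical; the sketch only assumes that a link's travel time is its length divided by the lower of its predicted speed and its speed limit.

```python
def link_travel_minutes(length_km: float, predicted_kmh: float, speed_limit_kmh: float) -> float:
    """Travel time on a link, capping the predicted speed at the speed limit."""
    if length_km < 0 or predicted_kmh <= 0 or speed_limit_kmh <= 0:
        raise ValueError("length must be non-negative and speeds must be positive")
    # A prediction above the legal limit is replaced by the limit itself.
    speed_kmh = min(predicted_kmh, speed_limit_kmh)
    return 60.0 * length_km / speed_kmh
```

Capping in this way only ever lengthens the imputed travel time, so it cannot overstate the journey-time saving attributed to a new link relative to using the uncapped prediction.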
To partially address concerns about measurement error in the accessibility index, we cross-checked a sample of times and accessibility measures against estimates derived from Google Maps, using the Stata 'traveltime' module (Ozimek and Miles 2011). The cross sectional correlations in the journey times are high (in the order of 0.6-0.8), and the correlations in the accessibility indices (using address counts rather than employment) are even higher (0.8-0.95). However, the correlation for travel times is weaker for shorter journeys, presumably because shorter trips that do not use our generalised network are poorly approximated by our O-D calculation. For this reason, and because locations immediately proximate to new schemes may be adversely affected by the scheme (e.g. loss of premises, and environmental impacts), we drop wards and plants within 1 km of the road schemes in our analysis. As discussed above, this also helps to further mitigate concerns about the targeting of specific wards as a result of endogenous routing of schemes.

Appendix B: Back of the envelope representation of potential GDP gains
The average effect of all the major new road schemes in Britain between 1998 and 2007 was to raise mean accessibility at ward level by 0.34% (Table 1). This implies a 0.013% increase in total employment from a year's investment in major road transport network improvements (using the elasticity of 0.37 from
Ward level regressions of 1981-1991 changes in Census residential population shares with characteristics described in column headings on 1998-2008 change in log accessibility. Sample restricted to wards within 1-20km of a road scheme. Table reports regression coefficients and robust standard errors (clustered at ward level). *, **, *** indicate significant at the 10%, 5% and 1% levels, respectively. All regressions include scheme dummies, log accessibility 1998, distance to scheme. Obs. 3469.
Table reports coefficients from ward-level regression of log employment or log plant counts on accessibility. Each coefficient is from a separate regression. Standard errors in brackets (clustered at the ward level). *, **, *** indicate significant at the 10%, 5% and 1% levels, respectively. 'Scheme trends' are closest-scheme dummy variables interacted with a linear time trend. 'Controls' are a linear trend interacted with: distance to closest scheme, a dummy for years in which the scheme is open and the initial level of (log) accessibility in 1997.
Table reports coefficients from plant-level regression of log plant employment on accessibility. Each column is from a separate regression. Standard errors in brackets (clustered at the ward level). *, **, *** indicate significant at the 10%, 5% and 1% levels, respectively. 'Scheme trends' are closest-scheme dummy variables interacted with a linear time trend. 'Controls' are a linear trend interacted with: distance to closest scheme, a dummy for years in which the scheme is open and the initial level of (log) accessibility in 1997.
Notes: Own calculation using BSD and optimal travel times calculated as described in the text.