Estimating road transport costs between and within European Union regions



Introduction
Transport costs are crucial elements of spatial analyses. They directly affect trade flows, which also serve as the main transmission channel for spillover effects between regions. The assumptions on transport costs therefore directly affect the results of any model analysis. Unfortunately, good transport cost estimates at the regional level for the European Union (EU) are not readily available.
In this paper we address this issue by estimating a unique and comprehensive dataset of road freight transport costs for a representative 40 t heavy duty vehicle (HDV) for the EU regions at the NUTS 2 (Nomenclature of Territorial Units for Statistics) level. We focus on transport costs by road, first, because this transport mode represents 76.4% of total freight transport in the EU (Eurostat, 2016); and, second, because of its dominance in hinterland areas in comparison to other transport modes (Frémont and Franc, 2010). The resulting dataset is available for download from the online appendix of this article on the Transport Policy website.
Following the existing theoretical (Hanssen et al., 2012) and empirical (Combes and Lafourcade, 2005; Zofío et al., 2014; Ford et al., 2015; Laurino et al., 2019) literature on the estimation of generalized transport costs (GTC), we estimate transport costs as the average cost of road freight transport between pairs of centroids within the regions.¹ These centroids are taken from a 1 km × 1 km population grid, which allows us to sample hundreds of centroids for each European region based on the spatial population distribution. Thanks to the large number of centroids considered in each region, 1) we account for the spatial distribution of economic activity within each region, and 2) we can calculate precise transport costs within and between every region.² Specifically, we calculate a composite cost over each road segment, which allows us to calculate the optimal route between two centroids. This optimal route is defined as the route with the minimum cost for a representative 40 t HDV. To the best of our knowledge, we are the first to estimate GTCs using a consistent methodology across multiple countries at the scale of subnational regions of the EU.
Using a geographical information system (PostGIS), an open source database of digitalized road networks, OpenStreetMap (OSM), and a number of additional datasets, we build a database with more than 4 million road segments (arcs) comprising highways, primary and secondary roads (including bridges and tunnels), and ferries in Europe, with a total length of over 1,500,000 km. We also obtain from OSM additional information on the characteristics of the roads, such as the presence of traffic lights and roundabouts, the curvature, and the surface material. We then associate these arcs with a series of attributes related to the costs of the transport activity. Among these costs, we consider those related to the distance and the time dimensions of any single route. More concretely, for the distance-related costs, we combine the length of the arc with information on fuel prices and fuel consumption, tolls, taxes, and maintenance costs. For the time-related costs we focus on the travel time over the arc (influenced by the maximum speed, the length, and road characteristics), the salaries in the transport sector, and European transport regulations on resting times. Additionally, actual geography is controlled for by the use of the European Digital Elevation Model, modifying the fuel consumption, the speed, and the travel times according to the gradient of each road segment. After building the road network, we calculate the minimum-cost route among the set of all possible itineraries between samples of centroids using the Dijkstra (1959) algorithm. The averages of the costs associated with these optimal routes over all centroid combinations within a region pair are reported in a baseline origin-destination cost matrix expressed in euros.
This methodology, based on large random samples taken from satellite imagery, improves on the basic distance or transport cost measures that are typically used in regional models such as spatial CGE models (Bröcker et al., 2010; Lecca et al., 2018); applied economic geography models (Fingleton, 2007); regional trade models (Barbero and Rodriguez-Crespo, 2018; Alamá-Sabater et al., 2015; Díaz-Lanchas et al., 2019; Wessel, 2019); migration models (Sardadvar and Rocha-Akis, 2016); and a vast number of other contributions. Our approach is closely related to Antweiler (2007) and Hinz (2017), who use satellite imagery to calculate distances but focus on the country level, where we believe that the additional precision brought by the use of satellite imagery has relatively fewer benefits than in analysis at the regional level. Moreover, these authors do not consider transport costs.
The complete matrix with transport costs is provided for download in the supplemental Online Appendix, including its major components and other measures such as the travel distance and time by road as well as the average geodesic distance between the sampled points. A consistent and widely used set of distance measures had already been developed by Mayer and Zignago (2011), who also provide both harmonic and arithmetic averages of city-based distances between countries. Our dataset differs in that it focusses on the regional level, uses more detailed underlying data on the distribution of population which is independent of definitions of city boundaries, and additionally provides road freight transport costs, travel time and travel distance by road. This dataset and the proposed methodology can be used by regional modellers who need an accurate measure of transport costs between EU regions.
The transport cost matrices can be incorporated in spatial economic models. To this aim, we transform the transport costs into the restrictive "iceberg" form where transport costs are expressed as an ad-valorem tariff. We provide estimates of these iceberg transport costs by resorting to a novel database on interregional trade flows for the EU regions in 2013 (Thissen et al., 2019). This new iceberg-type transport cost matrix allows us to appropriately implement and include transport cost shocks and road-transport infrastructure investments in new economic geography (NEG) and spatial computable general equilibrium (CGE) models for the whole of Europe or any of its countries, such as those in the works by Fingleton (2007), Bröcker et al. (2010), Barbero et al. (2018), or Lecca et al. (2018).³ Our results clearly point to core-periphery structures among the EU regions. That is, the centrality of the regions within the road network is the main driver of the distance-related costs, which are smaller for geographically central regions, whereas the salaries in the transport sector directly affect the time-related costs of the GTC, which are lower in regions with low wages in the transport sector and vice versa.
Given the detailed cost components within the GTC, we can assess the effect of changes in its attributes. As a result, we obtain a new counterfactual transport-cost matrix that can be used to evaluate transport policies. We perform a series of policy experiments by modifying the attributes of the GTC as showcases of our methodology.
We create a transport policy tool to assess the impact of road-transport infrastructure investment in a region, where the investment is assumed to be used for upgrading existing roads to highways. The roads to be upgraded are selected according to where the direct economic benefits in terms of saved expenses on transport would be the largest relative to the amount of resources invested, taking into consideration the cost of building highways in each country. Among the road attributes, we modify those related to the maximum speed, and the ones related to penalties for curvature, slope, traffic lights and surface. After selecting and modifying all the upgraded roads, we recalculate the set of cheapest routes between regions to get counterfactual transport cost matrices which can be compared to the baseline case. We take the European Cohesion Policy programme 2014-2020 as a case study for our transport policy tool and show that Eastern European countries clearly experience the biggest reductions in transport costs, although there are some positive spillover effects on central EU regions.
Our method for evaluating transport investment differs from the existing literature by being very general. We do not consider known planned road investments, as in, for example, Bröcker et al. (2010) and Ibáñez and Rotoli (2017), but rather estimate the potential benefits from improving any road segment in every region, and decide which roads to upgrade for a given investment on the basis of a cost-benefit analysis. This allows considering infrastructure investments in any region, and of any size, at the price of likely mistakes as to which roads would actually be upgraded. We believe this approach is quite novel and useful for evaluating large-scale investments such as those made in the context of the cohesion policy in the EU.
The remainder of the paper is structured as follows. Section 2 presents the theory and methodology for the GTC and the iceberg-type transport costs. Section 3 describes the data and the calculation process. Section 4 illustrates and discusses the results by way of descriptive analysis, and considers the policy applications. Finally, section 5 concludes.

Generalized transport costs
Several attempts have been made to estimate transport costs going beyond the traditional physical distance and travel time proxies. Teixeira (2006) computes a transport cost matrix using a digital road network that allows him to calculate the lowest cost (the fastest and shortest) itineraries between Portuguese districts to assess the dispersion/agglomeration of industries as a result of changing transport costs. Martínez-Zarzoso and Nowak-Lehmann (2007) analyse the determinants of maritime and road transport costs, resorting to alternative factors affecting them such as unit values, service structures and service qualities, but also transport conditions. They apply their analysis to Spanish exports to Poland and Turkey to study the impact of transport costs on trade flows. Jacobs-Crisioni et al. (2016) calculate a set of travel-time accessibility measures to model population changes at a very fine spatial level due to varying transport costs in the cases of Poland, Germany, Austria and the Czech Republic. At the European level, a set of transport models aims to assess the impact on transport costs and the accessibility gains due to changes in the transport infrastructure network, covering different transport modes and passenger databases (Rotoli et al., 2014), or specific study cases such as the Trans-European Transport Network (Ibáñez and Rotoli, 2017).
Recent studies have estimated economic transport costs depending on distance and time according to the so-called GTC concept proposed by Nichols (1975). For example, Combes and Lafourcade (2005) accurately estimate transport costs for the French employment areas over the period 1978-1998. Hanssen et al. (2012) consider intermodal transport solutions when estimating the GTC of transporting fresh fish between Norway and Continental Europe. Zofío et al. (2014) resort to index numbers to disentangle the effect of economic and infrastructure determinants on the reduction of generalized transport costs for the case of Spain in the years 1980-2007. Focusing on urban areas, Ford et al. (2015) apply the GTC analysis to the transport network of London to assess different infrastructure scenarios. Finally, Laurino et al. (2019) estimate the GTC for different transport modes in the case of the Italian regions to measure the accessibility of the most remote ones.
We contribute to this literature by building a database with estimates of the GTC between all the possible pairs of 268 EU regions. In comparison with previous work, 1) we base our analysis on trips between a very large number of centroids in each NUTS 2 region, which allows calculating not only between- but also within-region transport costs while taking into account the often very unequal spatial distribution of the population within regions; 2) we make use of the digitalised network from the open source database OSM, which contains an up-to-date network of roads and ferries reflecting the actual state of the European roads; and 3) we greatly simplify the analysis and the associated computations by focussing on a single mode of transport.
We start from Zofío et al. (2014) and estimate the bilateral GTC between any pair of locations $i$ and $j$ within the EU. The $GTC_{ij}$ is defined as the cost of the cheapest itinerary $I_{ij}$ in the set $\mathcal{I}_{ij}$ of all possible trips between the two locations. An itinerary is divided into segments of roads $a$ (arcs), which possess several characteristics affecting the cost of traversing them. We associate all costs required to traverse the arcs with their length in km ($d_a$) and the required travel time in minutes ($t_a$). Thus, we define the GTC as:

$$GTC_{ij} = \min_{I_{ij} \in \mathcal{I}_{ij}} \left( DistC_{ij} + TimeC_{ij} \right), \qquad (1)$$

where $DistC_{ij}$ stands for distance-related costs and $TimeC_{ij}$ stands for time-related costs. The former is defined as follows:

$$DistC_{ij} = \sum_{a \in I_{ij}} e^{d}_{a} \, d_{a},$$

where $e^{d}_{a}$ (in EUR per km) comprises fuel costs, toll costs, and tire and maintenance costs. Fuel costs ($fuel_a$) are computed as the fuel price (in EUR per litre) multiplied by the fuel consumption in litres per kilometre of the representative truck. The fuel cost per km over an arc differs across EU countries because of differences in fuel prices, and the fuel consumption is affected by road properties such as the slope. Toll costs ($toll_a$) are also country-specific because of differences in nation-wide tolling (which operates either through vignettes or through a country-wide electronic toll per km) or tolling per road segment (for countries that have tolling on a limited set of road segments). Costs related to maintenance and tires represent a relatively small share of the total transport costs. Zofío et al. (2014) find that the cost shares of tires and maintenance are 4.92% and 4.24% of the total, respectively, whereas fuel consumption costs account for 29.04% of the total transport cost. We assume that tire and maintenance costs for all trips are related to fuel costs in the same proportion. Therefore, for every euro spent on fuel during a trip, tireCS = 4.92/29.04 = 0.17 additional euros are spent on tires and maintCS = 4.24/29.04 = 0.146 euros are spent on maintenance.
Time-related costs are defined as follows:

$$TimeC_{ij} = \sum_{a \in I_{ij}} t_{a} \, lab_{ij} \left( 1 + amortFinCS + insCS + indCS \right),$$

where the main component is the labour cost of the driver ($t_a \, lab_{ij}$). The hourly wage cost $lab_{ij}$ from Eurostat is multiplied by the time (in hours) it takes to cross the arc. Notice that the wage cost changes depending on the origin and destination: we take the average of the origin and destination hourly wage in transport. The remaining costs related to amortization and financing of the vehicle (amortFinCS), insurance (insCS) and indirect costs (indCS) are assumed to be proportional to labour costs, with the relative cost shares again matching those in Zofío et al. (2014).⁴ Taxes ($Tax_i$) are added to the distance and time cost components to compute the generalized transport costs. To account for them we make two assumptions: 1) taxes are given and affect all the roads departing from any origin in a single country, so they are not taken into account when computing the optimal route between a pair of locations; 2) the same holds for the cost of vignettes, that is, they represent a fixed cost between any pair of origin and destination. We calculate this vignette cost as the price of a yearly vignette divided by an estimate of the number of trips that can be made within one year, summed over all vignette countries that a pre-calculated optimal route passes through.
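The cost structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the functional form (tire and maintenance costs scaling with fuel, tolls added per km, and capital, insurance and indirect costs scaling with labour) follows the verbal description and the cost shares quoted from Zofío et al. (2014), while the function names, the dictionary keys and all numbers in the example itinerary are hypothetical.

```python
# Cost shares from Zofio et al. (2014) as quoted in the text.
TIRE_CS = 4.92 / 29.04        # euros on tires per euro of fuel
MAINT_CS = 4.24 / 29.04       # euros on maintenance per euro of fuel
AMORT_FIN_CS = 13.16 / 32.96  # amortization and financing per euro of labour
INS_CS = 5.24 / 32.96         # insurance per euro of labour
IND_CS = 8.31 / 32.96         # indirect costs per euro of labour


def distance_cost(length_km, fuel_price_eur_per_l, litres_per_km, toll_eur_per_km):
    """Distance-related cost of one arc in EUR (assumed functional form)."""
    fuel_per_km = fuel_price_eur_per_l * litres_per_km
    cost_per_km = fuel_per_km * (1 + TIRE_CS + MAINT_CS) + toll_eur_per_km
    return cost_per_km * length_km


def time_cost(travel_time_h, wage_origin, wage_destination):
    """Time-related cost of one arc in EUR, using the average hourly wage."""
    lab = (wage_origin + wage_destination) / 2
    return travel_time_h * lab * (1 + AMORT_FIN_CS + INS_CS + IND_CS)


def itinerary_cost(arcs, wage_origin, wage_destination):
    """Sum distance and time costs over the arcs of one itinerary."""
    total = 0.0
    for arc in arcs:
        total += distance_cost(arc["km"], arc["fuel_price"], arc["l_per_km"], arc["toll"])
        total += time_cost(arc["hours"], wage_origin, wage_destination)
    return total


# Hypothetical two-arc itinerary (all numbers are illustrative only).
example_arcs = [
    {"km": 100.0, "fuel_price": 1.2, "l_per_km": 0.345, "toll": 0.15, "hours": 1.3},
    {"km": 40.0, "fuel_price": 1.1, "l_per_km": 0.345, "toll": 0.0, "hours": 0.7},
]
example_cost = itinerary_cost(example_arcs, wage_origin=18.0, wage_destination=24.0)
```

The GTC of a pair of locations would then be the minimum of `itinerary_cost` over all candidate itineraries, with taxes and vignettes added afterwards as fixed costs.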
We aim to include a comprehensive set of costs, at a detailed level. Still, some costs are outside the scope of our study. We do not consider, for example, some costs that are quite closely related to the trip, such as congestion or costs related to traffic accidents. We ignore logistics costs related to trade, such as warehousing, and also broader societal costs such as pollution or climate change. The dataset accompanying this paper contains several basic variables which can be useful in calculating alternative GTCs. One example may be to consider a different vehicle type with an alternative fuel consumption, based on the distance driven, which is reported in the dataset. Another example could be to consider additional costs such as pollution, based on the reported fuel consumption.

⁴ … accommodation together with allowances. We assume that accommodation costs are relatively small for the case of internal Spanish transport costs considered by Zofío et al. and ignore them. We then compare the cost shares of capital expenditures (amortization and financing), insurance costs and indirect costs relative to the sum of wages and allowances, so amortFinCS = 13.16/32.96 = 0.40, insCS = 5.24/32.96 = 0.16, and indCS = 8.31/32.96 = 0.25.

D. Persyn et al.

From inter-centroid GTC's to inter and intra region GTC's
The GTC as described in section 2.1 is calculated at the level of pairs of centroids, which complicates its use in economic models operating at higher levels of aggregation (regional, national). To overcome this problem, in a first step, we define and calculate the GTC between two regions $o$ and $d$ as the arithmetic average of the GTC between the $m$ centroids belonging to region $o$, indexed by $x = 1, \ldots, m$, and the $n$ centroids belonging to region $d$, indexed by $y = 1, \ldots, n$. The inter-regional GTC then equals:

$$GTC_{od} = \frac{1}{mn} \sum_{x=1}^{m} \sum_{y=1}^{n} GTC_{xy}.$$

This simple arithmetic average gives a GTC that is representative for a random draw of a pair of centroids from the population distribution. However, as emphasised by Head and Mayer (2010), given that trade is more likely to occur between centroids that are at shorter distances, the average GTC between two regions that is relevant when modelling international trade is rather the harmonic average $GTC^{h}_{od}$, which gives more weight to centroid pairs at shorter distances and takes the form:

$$GTC^{h}_{od} = \left( \frac{1}{mn} \sum_{x=1}^{m} \sum_{y=1}^{n} GTC_{xy}^{-1} \right)^{-1}.$$

Notice that there is no need to weight by population when calculating the averages, since the random sampling of centroids already ensures that more centroids are sampled in dense areas. We report harmonic averages alongside the arithmetic averages in the datasets accompanying this paper.
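The difference between the two averages can be seen in a small numerical sketch. The matrix of centroid-level costs below is hypothetical; rows stand for origin centroids and columns for destination centroids.

```python
def arithmetic_mean_gtc(gtc):
    """Arithmetic average over all origin-destination centroid pairs."""
    values = [cost for row in gtc for cost in row]
    return sum(values) / len(values)


def harmonic_mean_gtc(gtc):
    """Harmonic average: inverse of the mean of the inverse costs."""
    values = [cost for row in gtc for cost in row]
    return len(values) / sum(1.0 / cost for cost in values)


# Hypothetical costs in EUR between m = 2 origin and n = 2 destination centroids.
gtc_xy = [[100.0, 400.0],
          [200.0, 400.0]]

arithmetic = arithmetic_mean_gtc(gtc_xy)  # (100 + 400 + 200 + 400) / 4
harmonic = harmonic_mean_gtc(gtc_xy)      # 4 / (1/100 + 1/400 + 1/200 + 1/400)
```

In this example the harmonic average is pulled towards the cheap 100 EUR pair and is therefore lower than the arithmetic average, which is the property exploited when nearby centroid pairs are expected to trade more.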

The iceberg transport cost matrix
The GTC as calculated above is easy to interpret and is standard in the transport literature. Many economic models, however, consider a specific transformation of transport costs which is known as the "iceberg" representation. The name stems from the fact that it represents the transport costs as a "wasteful ad valorem tax". Transport costs are assumed to be proportional to the value of the good, and the receipts of the tax are lost for society. This is equivalent to assuming that some extra proportion of the good "melts" during transport.⁵ Real transport costs obviously are not proportional to the value of the good being transported, but rather depend on the weight, volume, or special measures such as cooling which must be taken during transport. However, assuming transport costs to be proportional to the value has clear advantages for the algebra involved in typical economic models.⁶ Given that many economic models use this representation, we also report the iceberg transport cost equivalent of the transport cost for every pair of regions as follows:

$$\tau_{ij} = \frac{F_{ij} \cdot \frac{1}{L} \cdot GTC_{ij}}{V_{ij}},$$

where $F_{ij}$ is the flow of goods between regions $i$ and $j$ in tonnes; $GTC_{ij}$ is the average GTC between both regions; $L$ is the EU-wide average loading of trucks;⁷ and $V_{ij}$ is the value of the same trade flow. The numerator expresses the total transport cost of the observed trade flow between both regions, multiplying the trade flow (manufacturing and agricultural goods) in tonnes by the number of trucks required to ship one tonne and by the cost of the trip for one truck.
Expressing the total transport cost relative to the value of the trade flow gives the trade costs expressed in ad valorem terms (Hummels, 1999), which is the form required in many spatial economic models.
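As a hedged numerical illustration of this transformation, the sketch below divides the total transport cost of a trade flow by its value. The function name and all input numbers, including the average truck load, are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def ad_valorem_transport_cost(flow_tonnes, flow_value_eur, gtc_eur_per_trip, avg_load_tonnes):
    """Total transport cost of a trade flow relative to its value."""
    trucks_needed = flow_tonnes / avg_load_tonnes          # 1/L trucks per tonne
    total_transport_cost = trucks_needed * gtc_eur_per_trip
    return total_transport_cost / flow_value_eur


# Hypothetical flow: 1,000 tonnes worth 2,000,000 EUR, shipped in 10-tonne
# loads at a GTC of 1,500 EUR per truck trip.
tau = ad_valorem_transport_cost(
    flow_tonnes=1000.0,
    flow_value_eur=2_000_000.0,
    gtc_eur_per_trip=1500.0,
    avg_load_tonnes=10.0,
)
```

Here 100 truck trips cost 150,000 EUR in total, so the transport cost amounts to 7.5% of the value of the flow.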

OpenStreetMap
The road network over which transport costs are calculated is a subset of the publicly available OSM data. We extract over 4,000,000 road segments of motorways, trunk roads, primary and secondary roads, and ferry lines from the original data. The total length of the network adds up to over 1,569,000 km. An image of this network is given in Figure A1 in online Appendix A.1. The covered area includes the EU countries under consideration, with the addition of some selected areas through which an optimal route may lead, such as Norway, Switzerland, and Western Turkey. We add three "virtual" ferry routes to the network: two connecting Madeira and the Azores to Lisbon, and one across the English Channel to mimic the Channel Tunnel. All ferry lines have an assumed speed of 35 km/h, a waiting time of 1 h, and a per-km cost, entered in place of the fuel cost, that varies across distance thresholds and is set to reflect ticket prices as reported in Martino and Brambilla (2016), as explained below.
The size of the road network and the large number of routes to be calculated require a suitable and scalable method. We use the freeware osm2po tool⁸ to convert the OSM data into a PostgreSQL database residing on a dedicated server with 40 cores and 240 GB of RAM. This database was accessed using the software R to start 40 parallel queries, each of them solving a many-to-many routing problem corresponding to an adequately sized portion of the origin-destination matrix of centroids. The optimal routes themselves are calculated using the Dijkstra algorithm from the pgRouting project.
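The routing itself is carried out inside the database by pgRouting. Purely to illustrate the underlying idea, the following self-contained sketch runs Dijkstra's algorithm on a tiny hypothetical network whose arc weights play the role of the composite arc costs in euros; it is not the pgRouting implementation.

```python
import heapq


def dijkstra(graph, source):
    """Minimum cost from source to every reachable node.

    graph maps a node to a list of (neighbour, arc_cost) pairs with
    non-negative arc costs.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbour, arc_cost in graph.get(node, []):
            new_cost = cost + arc_cost
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(heap, (new_cost, neighbour))
    return best


# Hypothetical directed network with arc costs in EUR.
network = {
    "A": [("B", 10.0), ("C", 20.0)],
    "B": [("C", 5.0)],
    "C": [("D", 1.0)],
}
costs_from_a = dijkstra(network, "A")
```

In this toy network the direct arc from A to C costs 20 EUR, but the route via B costs only 15 EUR, so the cheapest cost to D is 16 EUR.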

Centroids
The centroids which are used to calculate driving time and transport costs originate from a population density grid at a one square kilometre resolution, which was obtained from the European Environment Agency. Every square kilometre from the original raster is populated with a randomly placed centroid for every 100 individuals estimated to inhabit that area.
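A minimal sketch of this step is given below. It assumes that the number of centroids per cell is the population divided by 100 and rounded down, and that positions are drawn uniformly within the cell; both the rounding rule and the function name are assumptions made for illustration.

```python
import random


def centroids_for_cell(x_min_km, y_min_km, population, rng, per_centroid=100, cell_km=1.0):
    """Place one random centroid in the grid cell per `per_centroid` inhabitants."""
    n_centroids = int(population // per_centroid)  # assumed rounding rule
    return [
        (x_min_km + rng.random() * cell_km, y_min_km + rng.random() * cell_km)
        for _ in range(n_centroids)
    ]


rng = random.Random(42)  # fixed seed so the example is reproducible
dense_cell = centroids_for_cell(10.0, 20.0, population=350, rng=rng)
sparse_cell = centroids_for_cell(11.0, 20.0, population=60, rng=rng)
```

A cell with 350 inhabitants receives three centroids and a cell with 60 inhabitants receives none, so that densely populated cells are more likely to appear in any random sample drawn afterwards.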
The full set of centroids generated from the population grid represents the location of population quite precisely, even in excess of what is needed for our analysis.We therefore do not consider all these centroids in the calculations of the GTC's, but rather take random samples, with larger samples for larger regions, and region-pairs at shorter distances.Table 1 shows the sample sizes of centroids for each distance-threshold.
For intraregional distances, we include a large number of centroids because the NUTS 2 regions are quite unequal in terms of area, but also with respect to the internal distribution of economic activity within them, for example in coastal regions. Since most trade typically occurs within the region and over relatively short distances between regions, using many centroids for short distances improves the representativeness of the calculated trade costs, especially for those pairs of origins and destinations where most trade takes place.
Fig. 1 below shows a random sample of 160 centroids (green dots) which have been selected for the estimation of the transport costs from the Spanish region of Andalusia to some neighbouring regions, superimposed on a night-time satellite image from NASA's Earth Observatory. The sample of dots appears to be sufficiently large to capture the geographical dispersion of economic activity and population revealed by the satellite image.⁹ It is also worth noting that the population grid may underestimate the spatial concentration of the origins and destinations of freight flows if they are dominated by a limited number of transport hubs, industrial areas and seaports. This effect may arise from the implicit assumption in our sampling method that freight flows are widely spread in space.

Data sources for the GTC
The GTC is composed of distance and time costs as per equation (1).
To calculate each component, we assume all trips are made by a representative EURO VI truck (HDV) consuming 34.5 L/100 km (Dünnebeil et al., 2015), before adjustment for slope.

Distance-related costs
The fuel cost depends on fuel prices per country¹⁰ and the fuel consumption. The fuel consumption is assumed to change with the slope of the road, which we derive from the digital elevation model for Europe developed by the European Environment Agency.¹¹ An increase of one percentage point in the absolute slope of a road segment increases fuel consumption by 5.5%, which is half the value of 11% found by Chen et al. (2017), because a positive slope will be present only for either the trip or the return trip. This implies a fuel consumption penalty of over 10% for about 15% of the roads with slopes in excess of 2%, and a penalty of over 44% for about 1% of the roads which have a slope higher than 8%.
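The slope adjustment can be written as a one-line rule. The sketch below assumes that the penalty is linear in the absolute slope, which is consistent with the penalties of 11% at a 2% slope and 44% at an 8% slope implied by the text; the function name is hypothetical.

```python
BASE_LITRES_PER_100_KM = 34.5    # representative EURO VI truck, before slope adjustment
PENALTY_PER_SLOPE_POINT = 0.055  # +5.5% fuel per percentage point of absolute slope


def fuel_consumption_l_per_km(slope_pct):
    """Fuel consumption in litres per km, assuming a linear slope penalty."""
    base = BASE_LITRES_PER_100_KM / 100.0
    return base * (1.0 + PENALTY_PER_SLOPE_POINT * abs(slope_pct))


flat = fuel_consumption_l_per_km(0.0)
moderate = fuel_consumption_l_per_km(2.0)   # 11% above the flat value
steep = fuel_consumption_l_per_km(-8.0)     # 44% above, sign of the slope ignored
```

Multiplying the result by the country-specific fuel price gives the fuel cost per km of the arc.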
Tolls are proportional to the distance travelled on toll roads.We performed extensive research on the tolls per km in all EU member states (and Switzerland and Norway).Table A1 in the Online Appendix A.1 provides a summary.
For the tire (tire) and maintenance (mant) costs per km, we assume these costs to be constant across all arcs in all countries. Specifically, they are set so as to correspond to a joint cost share of 9%, as found for Spain by Zofío et al. (2014).¹²

Time-related costs
The base travel time over an arc is calculated using the length of the arc and the maximum allowed speed over the arc. This maximum speed is the value provided for the segment in the OSM database. Where no value is provided in OSM, we take the legal maximum speed for HDVs in each country and road type according to DG MOVE.¹³ Nevertheless, further assumptions were made to better reflect the real-world properties of the roads:
• Road type: the maximum speed on all primary roads was limited to the value from OSM or 70 km/h, whichever was smaller. Likewise, the maximum speed on secondary roads was set to 60 km/h.
• Traffic lights: the presence of a traffic light adds 2 min to the travel time to cross the arc.
• Curvature: the tortuosity of an arc is calculated as the ratio of the effective length of the road segment to the great-circle distance between its start and end points. We reduce the maximum speed by 1/2 where the tortuosity of the road exceeds 1.5, resulting in a speed of about 35 km/h on those segments of a primary road.
• Surface: we divide the speed by 2 on surfaces like sand, cobblestone, etc. to give a typical speed of 10 km/h for a primary road with this surface type.
• Roundabouts: we divide the maximum speed by 5 on roundabouts and highway ramps, to give a typical speed of 14 km/h on a roundabout on a primary road.
• Slope: we apply a 10% speed penalty for slopes exceeding 8% and a 20% penalty for slopes exceeding 12%.
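The rules above can be combined into a small travel-time function. The sketch below is one possible reading and not the authors' code: it assumes that the secondary-road limit acts as a cap like the primary-road one, that the penalties multiply when several apply to the same arc, and that the slope penalty depends on the absolute slope. The surface divisor is left as a parameter, with the stated value of 2 as the default. All names are hypothetical.

```python
SPEED_CAPS_KMH = {"primary": 70.0, "secondary": 60.0}


def arc_speed_kmh(road_type, osm_maxspeed_kmh, tortuosity=1.0, poor_surface=False,
                  roundabout_or_ramp=False, slope_pct=0.0, surface_divisor=2.0):
    """Assumed effective HDV speed on an arc after applying the penalties."""
    speed = min(osm_maxspeed_kmh, SPEED_CAPS_KMH.get(road_type, osm_maxspeed_kmh))
    if tortuosity > 1.5:        # very curvy segment
        speed /= 2.0
    if poor_surface:            # sand, cobblestone, etc.
        speed /= surface_divisor
    if roundabout_or_ramp:
        speed /= 5.0
    slope = abs(slope_pct)
    if slope > 12.0:
        speed *= 0.8
    elif slope > 8.0:
        speed *= 0.9
    return speed


def arc_travel_time_min(length_km, speed_kmh, traffic_lights=0):
    """Travel time in minutes, adding 2 minutes per traffic light."""
    return length_km / speed_kmh * 60.0 + 2.0 * traffic_lights


curvy_primary = arc_speed_kmh("primary", 90.0, tortuosity=1.6)
roundabout_primary = arc_speed_kmh("primary", 70.0, roundabout_or_ramp=True)
minutes = arc_travel_time_min(7.0, arc_speed_kmh("primary", 80.0), traffic_lights=1)
```

With these assumptions a curvy primary road yields 35 km/h and a roundabout on a primary road 14 km/h, matching the typical speeds quoted in the list, and a 7 km primary arc with one traffic light takes 8 minutes.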
All these changes affect the travel times which, jointly with salaries, are the main determinants of the time costs. In our dataset, salaries follow the definition of wages and direct remuneration for the transport sector in the European Labour Cost Survey (2012) from Eurostat. Information is available at the NUTS 1 level for some countries, while for most countries there are no regional data and national-level information is used. We ignore the impact of the number of lanes or the width of the road on travel speed since these data are not well reported in OSM.

⁹ A bootstrapping analysis using 10 samples with an average of 60 centroids (around 127,000,000 routes per sample) revealed the sampling error for calculated transport costs between regions for a single sample to be over 5% for less than 0.12% of the region-pairs (82 out of 71825 routes). The largest standard error, 7.5%, was found for the internal transport costs of the Danish capital region Hovedstaden (DK01). Averaging the estimated transport costs over these 10 runs, however, reduces the standard error of the mean estimate to $\sigma_{ij}/\sqrt{10}$, where $\sigma_{ij}$ is the original sampling error. Our reported results average over 10 simulations, reducing the standard error below 1% for 99% of the estimates (and to a much smaller value for the majority of them). All estimated individual standard errors are below 2.5% after averaging.
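The effect of averaging over repeated samples on the sampling error can be checked with a short calculation. The helper name below is hypothetical; the inputs are the figures quoted in the bootstrapping footnote.

```python
import math


def standard_error_of_mean(single_run_error_pct, runs=10):
    """Standard error of an average over `runs` independent samples."""
    return single_run_error_pct / math.sqrt(runs)


# Largest single-sample error reported (7.5%, region DK01) after averaging 10 runs.
worst_case_after_averaging = standard_error_of_mean(7.5)
```

The worst case falls from 7.5% to roughly 2.4%, which is in line with the statement that all standard errors are below 2.5% once the 10 runs are averaged.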
The hourly wage cost is calculated starting from the annual wage cost in the transport sector (including employer social contributions, benefits, allowances, etc.), taken from Eurostat, and assuming that 90 h can be driven in 2 weeks, in line with the regulation on resting times (Regulation (EC) No 561/2006). We assume two weeks of rest per year in addition to these compulsory resting times, for a total of 2250 h driven per year. By dividing the annual wage cost by this estimate of hours driven per year, we get an estimate of the wage cost per hour driven, including all resting times.
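The figure of 2250 hours can be reproduced as follows, assuming a year of 52 weeks. The annual wage cost used in the example is hypothetical.

```python
def hours_driven_per_year(hours_per_two_weeks=90.0, weeks_per_year=52, rest_weeks=2):
    """Driving hours per year under the two-week driving limit."""
    working_weeks = weeks_per_year - rest_weeks
    return working_weeks / 2.0 * hours_per_two_weeks


def wage_cost_per_hour_driven(annual_wage_cost_eur):
    """Annual wage cost spread over the hours actually driven."""
    return annual_wage_cost_eur / hours_driven_per_year()


yearly_hours = hours_driven_per_year()               # 50 weeks / 2 * 90 h
hourly_wage = wage_cost_per_hour_driven(45_000.0)    # hypothetical annual wage cost
```

With 50 working weeks the driver completes 25 two-week periods of 90 hours each, and a hypothetical annual wage cost of 45,000 EUR corresponds to 20 EUR per hour driven.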

Taxes and ferries costs of the GTC
Ownership taxes on HDVs are paid yearly, independently of the trade route the truck is serving. However, it is reasonable to assume that the truck owner will pass the tax on to the client of the transport services. Ownership taxes come from Van Essen et al. (2012) and are reported in Table A.2 in the Online Appendix A.1. Given the time needed to go from each origin to each destination, we compute the number of trips that a truck can perform between each two regions within a year by dividing the working hours in a year, including resting times, by the time needed to travel between them via the cheapest route. Then, the ownership tax added to the GTC of each trade link is calculated as the yearly ownership tax divided by the number of trips that the truck can perform in a year.
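The allocation of the yearly tax to a single trip can be sketched as below. The use of 2250 working hours per year as the default is an assumption borrowed from the wage calculation, and the tax amount and trip duration in the example are hypothetical.

```python
def ownership_tax_per_trip(yearly_tax_eur, trip_hours, working_hours_per_year=2250.0):
    """Share of the yearly ownership tax attributed to one trip."""
    trips_per_year = working_hours_per_year / trip_hours
    return yearly_tax_eur / trips_per_year


# Hypothetical yearly tax of 900 EUR and a trip taking 22.5 hours.
tax_share = ownership_tax_per_trip(900.0, 22.5)
```

A truck that needs 22.5 hours per trip can make 100 trips per year, so 9 EUR of the hypothetical 900 EUR tax is added to the GTC of that link; longer trips carry a proportionally larger share.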
Ferries are considered equivalent to a regular road route, with no gradients and no penalties except for a waiting time of 1 h. We follow Martino and Brambilla (2016) for the average price (cost) of a ferry ticket charged to passengers in the EU.¹⁴ This price varies according to three distance thresholds: short (less than 100 km), medium (100-300 km) and long (more than 300 km). We enter this cost of the ferry ticket per km as the fuel cost for ferries, depending on the length (in km) of the ferry line. Table A.3 in the Online Appendix A.1 presents the different ticket prices (in €/km) imputed for each ferry arc.
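The treatment of ferry arcs can be illustrated with the assumed speed of 35 km/h, the waiting time of 1 h and the three distance classes. The ticket prices per km themselves are given in Table A.3 and are not reproduced here; the function names are hypothetical.

```python
def ferry_distance_class(length_km):
    """Distance class that determines which ticket price per km applies."""
    if length_km < 100.0:
        return "short"
    if length_km <= 300.0:
        return "medium"
    return "long"


def ferry_time_hours(length_km, speed_kmh=35.0, waiting_hours=1.0):
    """Crossing time plus the fixed waiting time."""
    return waiting_hours + length_km / speed_kmh


crossing_class = ferry_distance_class(70.0)
crossing_hours = ferry_time_hours(70.0)  # 1 h waiting + 2 h sailing
```

A 70 km crossing falls in the short class and takes three hours in total, which then enters the time-related cost in the same way as the travel time over a road arc.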

Data source for trade flows
The iceberg trade cost matrix relies on trade flows among the EU regions. We resort to Thissen et al. (2019) to get trade flows in monetary units and quantities (tonnes). These authors estimate the inter-regional trade flows at the NUTS 2 level for the EU, expressed in free on board (FOB) terms. They provide data for different sectors covering goods and services in 2013. Given our GTC approach based on transport costs by truck, we take flow data in euros and tonnes for manufacturing goods, agricultural and forestry products, and raw materials and energy sources such as mining, quarrying, electricity and gas.¹⁵

Descriptive analysis of the GTC
The full table with the estimated GTC between all pairs of 268 regions is available from the authors on request. This section gives a descriptive analysis of these estimates.
As described in section 2.1, the GTC is composed of many cost elements. These elements are shown in Table 2, where fuel and wage costs are the predominant components, together representing around 63% of the total transport cost. Ownership taxes represent a negligible share of costs, whereas vignettes and tolls are approximately 6% of total costs. The remaining costs, for distance (maintenance and accommodation) and for time (insurance, financing and amortization costs, plus indirect costs), account for the remaining 30% of the costs.
As can be observed in Table 3, the region-level GTC calculated as the arithmetic average of the centroid-level GTCs has an average value of 2,039 €. The internal GTC ranges from 2.42 € (for the city-region of Melilla, measuring just 12 km²) up to 759 € (for the Greek Southern Aegean region, where the population is spread over 50 islands). The external GTC shows more variability and higher average values.
Fig. 2 plots the simple average GTC across all destinies for each region i.As can be seen, geographically central regions have the lowest transport costs due to their location within the road network, whereas remote regions suffer from higher transport costs.
Distance and time costs are not only the most important components of the GTC, they also show remarkable heterogeneity among regions.To reveal these regional differences, Fig. 3 separates out the arithmetic average distance (a) and time (b) related costs of each region.Distance costs are mainly driven by fuel consumption and fuel prices, so regions that require shorter distances when travelling to other regions reduce significantly their fuel consumption.
Salaries in the transport sector are the leading determinant of the time-related costs in Fig. 3(b).Regions with lower salaries present lower GTC, while the opposite holds for regions with high salaries.Peripheral regions require more time and resting times to perform shipments by truck, leading to higher GTCs even though salaries are typically lower in these regions.This seems to be the cases of Greek, Portuguese, Spanish, and southern Italian regions.Moreover, central regions in the road network also benefit from low GTCs even in the eventual case of having high salaries in the transport costs due to the lower required travel times.Western and central regions in Germany are examples of this.
In summary, the GTC patterns reflect sources of comparative advantages across regions caused either by their geographical location.A core-periphery structure within the EU market emerges as a result of regional differences in travel time and geographical distances.
Alternatively, we consider the harmonic weighted average GTC_i of each region i. Here, we weigh by the relative regional GDP. Compared to the arithmetic mean, the harmonic mean gives more weight to smaller GTCs, and results in an average that is more reflective of how much aggregate interaction can be expected between a region and its neighbours, if the bilateral interactions are governed by a standard gravity equation. This harmonic mean resembles the inverse of a standard accessibility measure, but is quite different. Because the sum of the weights over the different destinations equals one for every origin, the measure focusses more on the bilateral GTCs, and uses weights only to give more weight to the GTCs of (potentially) more important trading partners, considering each origin separately. Fig. 4 plots this indicator for every region. Truly integrated and central regions, such as those in the core of Europe (regions in Belgium, the Netherlands, or even the UK), are characterised by low transport costs. The higher average GTCs for peripheral regions (especially eastern and southern ones) reflect the fact that these regions are on average more remote, but also farther from the economic core of the EU. The harmonic mean gives more weight to the nearby regions, which in the periphery tend to have small economic mass. It is noticeable that, compared to the arithmetic mean, the harmonic mean shows an increase in the connectivity of capital regions (such as Madrid, Lisbon, Bucharest, Prague, Bratislava, Vienna, Berlin, and Ile-de-France) relative to their surroundings. This is due to the higher emphasis on centroid pairs at short distances in the harmonic mean, and the fact that there are relatively more such pairs in these dense areas.
When graphically illustrating the effect of policies on transport costs in section 4.2, we will consider percentage changes in this weighted version of each region's harmonic GTC as the most appropriate summary statistic of how the policy is affecting them. The effects of transport policies we show therefore illustrate the effect of policies on an index of transport cost, weighting by the share of the trading partner (region) in a region's total trade. As in a Laspeyres index, we ignore possible changes in the weights.
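The difference between the two averages can be illustrated with a minimal sketch. It assumes that the weights are the destinations' GDP shares normalised to sum to one for each origin, as described above; the function names and the numbers are hypothetical and are not taken from the dataset.

```python
def arithmetic_gtc(gtc_row):
    """Simple arithmetic mean of one origin's bilateral GTCs."""
    return sum(gtc_row) / len(gtc_row)


def weighted_harmonic_gtc(gtc_row, gdp):
    """Weighted harmonic mean of one origin's bilateral GTCs.

    gtc_row: bilateral GTCs from the origin to each destination (EUR).
    gdp:     GDP of each destination (assumed weighting variable);
             weights are rescaled so that they sum to one.
    """
    total_gdp = sum(gdp)
    weights = [g / total_gdp for g in gdp]
    # Harmonic mean: the inverse of the weighted average of inverse costs.
    return 1.0 / sum(w / cost for w, cost in zip(weights, gtc_row))


# Hypothetical origin with one cheap and one expensive destination
# of equal economic size.
row = [100.0, 400.0]
gdp = [1.0, 1.0]
print(arithmetic_gtc(row))              # 250.0
print(weighted_harmonic_gtc(row, gdp))  # approximately 160
```

With equal weights, the harmonic mean (about 160) lies much closer to the cheaper destination than the arithmetic mean (250), which is the sense in which it gives more weight to small GTCs.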

Policy simulation: carbon tax
Up to now we have focused on the spatial structure of the baseline GTC estimates. However, an interesting application lies in performing policy experiments. First, by modifying the components of the GTC, we obtain an alternative transport cost matrix for the EU regions. We compare this counterfactual matrix with the baseline matrix to obtain (bilateral) changes in transport costs. As an example, in Fig. 5 we plot the percentage change in the weighted GTC (following equation (7)) after a 20% increase in fuel prices for all the regions as an approximation of a carbon tax. The darker regions are more affected by the increase in fuel prices as fuel represents a relatively large share of their costs. These are regions where wages (and therefore the share of labour costs) are lower and where the endowment of transport infrastructure is poorer (for instance in Eastern European regions), leading to higher fuel consumption. Also important is the higher fuel consumption on primary and secondary roads (compared to highways). In regions with low population density and few highways (such as Scotland and Ireland) fuel consumption is relatively high and the increase in fuel prices has a larger effect on the overall costs.
The GTC estimates in section 4.1 show an uneven spatial pattern. Fuel consumption, fuel prices and, to a lesser extent, vignette and toll costs represent a higher share of the average transport costs for remote regions. Policy changes reducing these components of the GTC will benefit these regions proportionally more than more central regions. In the reverse case of a carbon fuel tax, more central high-wage regions are less affected.
Time costs are affected by travel time, resting time and speed regulations. Regulations modifying resting time should consider the resulting changes in travel time and GTCs, and the asymmetry of these effects between regions.
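The role of the fuel share can be illustrated with a minimal sketch. It assumes, as a simplification, that the optimal route is unchanged by the price increase and that a bilateral GTC splits into a fuel component and all other costs; the function name and the cost figures are hypothetical.

```python
def pct_change_after_fuel_shock(fuel_cost, other_cost, shock=0.20):
    """Percentage change in a bilateral GTC after a fuel price increase.

    fuel_cost:  fuel component of the baseline GTC (EUR).
    other_cost: all remaining components of the baseline GTC (EUR).
    shock:      relative increase in fuel prices (0.20 = 20%).
    Assumes the route, and hence fuel consumption, stays fixed.
    """
    baseline = fuel_cost + other_cost
    counterfactual = fuel_cost * (1.0 + shock) + other_cost
    return 100.0 * (counterfactual - baseline) / baseline


# Two hypothetical routes with the same baseline GTC of 1000 EUR:
# a high-wage route (fuel share 30%) and a low-wage route (fuel share 45%).
print(pct_change_after_fuel_shock(300.0, 700.0))  # about 6 (%)
print(pct_change_after_fuel_shock(450.0, 550.0))  # about 9 (%)
```

Under these assumptions the percentage change equals the fuel share times the size of the shock, so routes where labour is cheap relative to fuel react more strongly to the same price increase.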

Policy simulation: Transport infrastructure investment
In order to estimate the effect of transport infrastructure investment on transport costs, we identify those roads that would create the largest net economic benefit when upgraded to a highway. For the cost of upgrading, we take the estimated cost of building one km of highway in the EU from the European Court of Auditors (2013). We adjust the cost per country by applying the Eurostat price level index for civil engineering construction projects.16 We identify roads for upgrading by comparing the economic cost of upgrading to the economic benefit. To calculate the benefit, we look for network segments which are the most important bottlenecks for traffic.17 We exclude roads that are less than 7.5 km from an existing highway from upgrading, to avoid improving roads that run parallel to existing (toll) highways. The freight flow between two centroids i and j belonging to regions o and d is assumed to follow the path with the minimum GTC. The number of vehicles depends on the flow F of goods in tonnes between the regions o and d; the estimated number of trucks required to transport 1 tonne of goods, L; the great-circle distance dist between the centroids i and j; and finally, the population shares of the centroids of origin and destination in their respective regional populations. We assume that the number of trucks travelling between centroids i and j is determined by these four factors. By using geographic distance, we consider potential traffic rather than existing traffic.
The effect of transport infrastructure investment will depend on the number of centroids: when using more centroids, traffic will be spread over more roads, and improvements in a road segment will affect a proportionally smaller share of transport. Given that centroids may be sampled at arbitrarily close distances, these would be assigned overly large shares of the trade flows. We therefore impose a 15 km minimum distance between centroids in equation (7).
Roads are ranked by the difference between the aggregate GTC, calculated by multiplying the GTC of the arc by the number of trucks travelling over it, and the aggregate cost assuming the arc is upgraded. We then proceed by changing the characteristics of the primary and secondary roads earmarked for upgrading to match those of a motorway: removing any speed or distance penalty due to the presence of traffic lights, roundabouts, surface material, curvature or slope, and increasing the effective speed to 90 km/h. The road segments are upgraded starting from the arc with the largest estimated net economic gain, up to the point where a segment improvement can no longer be financed for more than 50% (this ensures that on average all the investment money is spent, although individual regions will have some small amount of over- or underspending). We exclude toll costs from being affected by policies.
16 In online Appendix A.1, Tables A.4 and A.5 respectively show the baseline cost for building highways and the adjustments per country. This cost is further adjusted depending on several road properties: it is increased by 10% for every 1% increase in the slope of the road; increased by 100% for every 1000 inhabitants in a 1 km radius around the road and lowered by up to 30% in areas with a population density below 200 inhabitants per km²; and increased by 70% if the road is a bridge or tunnel.
17 This does not mean to suggest that the improved roads are always a choice policy makers will or should make, and oftentimes it is not realistic to assume that a specific road segment would be improved. Nevertheless, we believe the approach can provide an estimate of how the transport costs would change in case of real world infrastructure projects.
Fig. 5. Change in the weighted harmonic average GTC due to a 20% increase in fuel prices.
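The ranking and stopping rule for road upgrades can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the simulations: the `Arc` fields, the single shared budget, the way benefit and cost are expressed in the same units, and all numbers are hypothetical. The stopping rule follows the description in the text, namely that an upgrade goes ahead only while more than 50% of its cost can still be financed.

```python
from dataclasses import dataclass


@dataclass
class Arc:
    name: str
    trucks: float        # trucks routed over the arc
    gtc_now: float       # GTC per truck on the current road (EUR)
    gtc_upgraded: float  # GTC per truck at motorway standard (EUR)
    upgrade_cost: float  # cost of upgrading the arc (EUR)

    def net_gain(self) -> float:
        # Aggregate GTC saving over all trucks, minus the upgrade cost.
        saving = self.trucks * (self.gtc_now - self.gtc_upgraded)
        return saving - self.upgrade_cost


def select_upgrades(arcs, budget):
    """Upgrade arcs in order of net gain until the budget rule binds.

    Stops at the first arc for which the remaining budget covers
    50% or less of the upgrade cost. The remaining budget may end up
    negative (overspending) or positive (underspending).
    """
    chosen = []
    remaining = budget
    for arc in sorted(arcs, key=Arc.net_gain, reverse=True):
        if remaining <= 0.5 * arc.upgrade_cost:
            break
        chosen.append(arc.name)
        remaining -= arc.upgrade_cost
    return chosen, remaining


arcs = [
    Arc("A", trucks=1000, gtc_now=10.0, gtc_upgraded=6.0, upgrade_cost=1000.0),
    Arc("B", trucks=500, gtc_now=8.0, gtc_upgraded=5.0, upgrade_cost=800.0),
    Arc("C", trucks=200, gtc_now=9.0, gtc_upgraded=4.0, upgrade_cost=600.0),
]
print(select_upgrades(arcs, budget=1500.0))  # (['A', 'B'], -300.0)
```

In this example arc B is still upgraded although only 500 of its 800 cost remains in the budget, which produces the small overspending that the 50% rule allows for.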
We now turn to policy simulations of infrastructure investments, using the EU's 2014-2020 Cohesion Policy programme for transport infrastructure investment. Fig. 6(a) shows the amounts invested for all types of transport infrastructure over the entire programming period. The programme clearly targets regions in Poland, Romania, Slovakia, and the Baltic countries, with smaller investments in Southern Italy and Spain.
Fig. 6(b) shows the simulated change in transport costs due to the transport infrastructure investment. Given that the EU Cohesion Policy is a targeted policy for less endowed regions, Eastern European regions are, overall, the ones with the highest reduction in transport costs. However, there are significant spillovers to non-targeted regions. Some examples of this are the eastern regions of Austria, which are surrounded by regions receiving significant funding in neighbouring countries. The east of Germany benefits from its proximity to Polish regions targeted by the policy. Finland also benefits indirectly, but significantly, from the road infrastructure investment in the Baltic countries, through which the majority of its trade with the EU passes. There is much less impact on some targeted regions in the south of Spain. Countries such as Belgium, the UK and Sweden benefit relatively little.
We then separately consider changes in transport costs within the region (Panel c) and between regions (Panel d). As for the former, the regions with a less developed road network, such as those in Eastern Europe, show the largest reduction in internal transport costs. Complementary to this, transport costs between regions (Panel d) follow the same pattern as the overall transport costs. This panel shows some clear evidence of spillover effects in Eastern Germany and Northern Italy, which benefit from a reduction in their external transport cost even without receiving many funds directly. Some Finnish regions also benefit significantly from the large investments in road infrastructure in the Baltic countries and Central and Eastern Europe.

Conclusion
Transport costs are not usually well captured in spatial analyses and economic models. In an attempt to overcome this limitation, we create a unique dataset of interregional transport costs for the EU regions (NUTS 2) by making use of the open digitalised road network OSM. Combining this database with other information allows us to calculate the optimal route of transport by truck and the associated average transport cost between regions. A first contribution of our paper is to provide a comprehensive set of transport cost estimates, as well as their corresponding iceberg-type costs, between EU regions at the NUTS 2 level, along with a set of underlying variables such as driving times and distances.
The results indicate that transport costs follow a core-periphery structure within the EU, where geographically central regions benefit from shorter trips and reduced fuel consumption, and more peripheral regions tend to benefit from lower salaries within the transport sector.
Moreover, the method allows transport policy analysis to be performed. We provide an example studying the effects of a generalized increase in fuel prices. We also describe a tool for policy analysis of transport infrastructure investment, and apply it to estimate the impact of the EU's 2014-2020 Cohesion Policy programme. The results show that Eastern European regions benefit the most, in terms of transport cost reductions, from road infrastructure investments, with positive spillover effects across the whole EU, but especially to neighbouring regions.
The definition of transport costs in this paper focusses solely on the cost related to driving the truck. In future research, we aim to expand the definition of transport costs to include costs related to loading/unloading or warehousing, and to consider other modes such as rail or inland waterways and logistic platform choice.

Fig. 1. A sample of 160 centroids obtained for the Spanish region of Andalusia. Source: Own elaboration based on NASA Earth Observatory data and the EEA population grid.

Fig. 6. Change in the weighted harmonic average GTC due to Cohesion Policy.

Table 1
Centroid sample size used to calculate distances between regions.

Table 2
EU average GTC cost components.

Table 3
EU region-level arithmetic GTC, descriptive statistics.