Dataset for the life cycle assessment of the high speed rail network in Spain

A life cycle assessment (LCA) of the Spanish high speed rail (HSR) network in service in 2016 (2583 km) was conducted. Life cycle inventory (LCI) data related to the construction and maintenance phases of the infrastructure were collected using the Google Earth tool, and complemented with data obtained from the LCA carried out by Tuchschmid et al. [1]. LCI data associated with the operation phase of the infrastructure were built on available fragmentary data on passenger movements for the year 2016 [2–4], processed with a Python algorithm to estimate the transport service provided by the infrastructure. Environmental impacts for transport modes were obtained from the Ecoinvent v3.7 database [5,6] and processed with the openLCA software [7]. Life cycle impact assessment (LCIA) results gathered in the dataset include Global Warming (GWP100a), Cumulative Energy Demand and total emissions for PM10, SO2, NOX and NMVOC. This dataset presents a detailed description of the Spanish HSR network, including the length of each item (bridges, tunnels, earthworks, railway tracks), and a robust estimation of passenger transport over the infrastructure for the year 2016. The LCI data presented in this paper support the original research done on whether the construction of the Spanish HSR network infrastructure is justified in terms of reducing environmental impacts and energy consumption [8], and may be used as a baseline for future studies on transport economics.

Specifications Table
Subject: Transportation and environment
Specific subject area: Life Cycle Assessment and rail transport
Type of data: Text, tables, figures, Excel spreadsheet
How data were acquired: Raw data were acquired from primary and secondary data sources, including: (i) the Google Earth tool, providing visual information related to the elements in the HSR network; (ii) public reports on passenger movements on the network in 2016; (iii) LCA studies on other HSR infrastructures; and (iv) the Ecoinvent v3.7 commercial database.
Data format: Raw and analysed data
Parameters for data collection: The network considered in the LCI was the Spanish HSR in operation in 2016. Primary data for the calculation of passenger transport were passenger arrivals and departures for every HSR station in 2016, and passenger movements for some specific connections in 2016. High-speed services considered were AVE, ALVIA, AVANT and AV-City. Alternative scenarios were analysed considering different combinations of occupancy rates and propulsion technologies for private cars, electricity mixes and future HSR transport demand.
Description of data collection: The description of the infrastructure elements of the HSR network was performed using Google Earth and its aerial perspective. Complementary data for the LCI of the construction and maintenance phases were obtained from Tuchschmid et al. [1]. Passenger transport in the network was calculated for the year 2016 with a Python algorithm applied to primary data on passenger arrivals and departures for every HSR station in 2016, provided by the newspapers El País [3] and El Independiente [2], and to passenger movements on some specific connections, provided by Fundación de los Ferrocarriles Españoles [9]. Environmental impacts and material flows for modes of transport were collated from the Ecoinvent v3.7 database [5,6] and processed with openLCA v1.10 [7]. LCIA results were obtained for Global Warming (GWP100a) and Cumulative Energy Demand, along with total emissions for PM10, SO2, NOX and NMVOC.
Data source location: Layout of the four high-speed network corridors in Spain and all the network stations with high-speed services. Percentages of shifted transport from other modes from Betancor and Llobet [4]. LCI data for modes of transport from the Ecoinvent v3.7 database [5,6].
Data accessibility: Data are available within the article and detailed supplementary information is also provided through a repository.

Value of the Data
• This dataset gathers a precise geographical description of the Spanish HSR network infrastructure, including the measurement of each item, which can be crucial to different studies of the Spanish HSR.
• The dataset provides the detailed life cycle inventories needed to perform the life cycle assessment of the Spanish HSR network, and can also be used as a baseline by researchers performing an LCA of other HSR networks.
• This dataset may be used in future academic studies on transport economics, e.g. to calculate environmental footprints or economic profiles, and by any stakeholder interested in the environmental sustainability of HSR lines.
• This dataset provides passenger transport estimates over the Spanish corridors in 2016, measured in passenger-kilometres.

Data Description
This dataset provides all primary and secondary data used to build the LCA of the entire Spanish HSR network in operation in 2016. The LCI data presented support the original research done on whether the construction of that infrastructure is justified in terms of reducing environmental impacts and energy consumption, under different scenarios [8] . Table 1 provides background data related to the reference scenario and the alternative scenarios. This background data includes not only the characterisation of the electricity mix that powers the high-speed rail infrastructure, but also the transport characteristics of the modes from which the infrastructure shifts passengers, in a situation where the HSR infrastructure is not in operation.
Data is structured in three parts: the first part (1.1) describes the LCI of the infrastructure's construction and maintenance phase; the second part (1.2) gathers the LCI of the operation phase and the third one (1.3) presents the LCIA data.

LCI on the construction and maintenance phase
The Spanish HSR network consists of four main corridors that connect different regions with the state capital, Madrid: Madrid-Catalonia, Madrid-León (Northern corridor), Madrid-Levante and Madrid-Andalusia. In total, 2583 km were in service in the year 2016.
A detailed description of each of the four corridors of the HSR network, including the length of each item (bridges, tunnels, earthworks, railway tracks, passageways, forks, stations), is gathered in the "Layout" tab of each of the four corridors' attached Excel files ("XX CORRIDOR Ecoinvent 3.7.xlsx", where XX is ANDALUSIA, CATALONIA, LEVANTE or NORTHERN) included in SM01 (SM means Supplementary Material).
Table 1. Background data related to the reference scenario (Baseline) and the alternative scenarios (S1-5).
The LCI for the construction and maintenance phase of each of the HSR corridors is gathered in SM02. Tables 1-8 in the "SM02.docx" file gather the LCI of bridges (Table 1), earthworks (Table 2), tunnels (Table 3), sleepers and ballasts (Table 4), rails (Table 5), mast, catenary and overhead wiring (Table 6), signals and communication systems (Table 7) and construction and maintenance of railway buildings (Table 8) in the four corridors of the Spanish HSR. Table 2 summarises the LCI for the whole network. LCI data provided by Tuchschmid et al. [1] for Cumulative Energy Demand and for the CO2, PM10, SO2, NOX and NMVOC emission flows related to the construction and maintenance of items are also shown, per km and year (or per unit and year, where applicable).

LCI of the operation phase
The LCI of the operation phase requires an accurate estimation of passenger transport over the network. This, in turn, requires knowledge of passenger movements between stations. Since the LCA is orientated towards the evaluation of the implementation of the HSR infrastructure, it is also necessary to compute the passenger transport displaced from other modes to the HSR network, in comparison with an alternative scenario in which the infrastructure is not operative.
In the SM03 Excel file ("SM03.xlsx"), the "HSR station Traffic 2016" tab presents the passenger arrivals and departures at each station of the Spanish HSR in 2016. These data were provided by Galán et al. [3] and by García [2]. The "Specific connections" tab presents data on passenger movements on some specific connections, provided by Fundación de los Ferrocarriles Españoles [9]. The "Shifted transport" tab of the same Excel file presents the percentages of transport shifted from other transport modes to the HSR, for each of the corridors, obtained from Betancor and Llobet [4].
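As an illustration of how the shares in the "Shifted transport" tab might be applied, the following sketch splits a corridor's annual HSR transport into the passenger-kilometres drawn from each alternative mode. The function name, the share values and the pkm figure are hypothetical placeholders and are not taken from SM03. Applying the shares directly to pkm and reporting any remainder as induced demand are simplifying assumptions of this sketch.

```python
def shifted_pkm(hsr_pkm, shares):
    """Split annual HSR transport (pkm) by the mode it was shifted from.

    shares maps a mode name to the fraction of HSR transport drawn from it.
    Whatever is left over is reported as 'induced' (assumption of this sketch).
    """
    total = sum(shares.values())
    if not 0.0 <= total <= 1.0 + 1e-9:
        raise ValueError("shares must sum to a value between 0 and 1")
    result = {mode: hsr_pkm * share for mode, share in shares.items()}
    result["induced"] = hsr_pkm * max(0.0, 1.0 - total)
    return result


# Hypothetical inputs, for illustration only.
example_shares = {
    "aeroplane": 0.30,
    "conventional train": 0.25,
    "coach": 0.10,
    "private car": 0.25,
}
example = shifted_pkm(1_000_000_000, example_shares)
for mode, pkm in example.items():
    print(f"{mode}: {pkm:.3e} pkm")
```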
The SM04 file ("SM04.docx") presents four tables, one for each corridor, with the matrix of distances between each pair of stations (Table 1 for the Andalusia corridor, Table 2 for the Northern corridor, Table 3 for the Catalonia corridor and Table 4 for the Levante corridor). Each table also provides the number of annual passenger arrivals and departures in each HSR station.
The SM06 file ("SM06.docx") provides the estimation of passenger transport in the HSR that is coherent with the data gathered in SM03 and SM04. In particular, this file provides a table for each corridor with a matrix of the average number of passengers boarding and alighting at each HSR station (Table 1 for the Northern corridor, Table 2 for the Andalusia corridor, Table 3 for the Levante corridor and Table 4 for the Catalonia corridor). Statistical results for the passenger transport estimation in each corridor are also provided graphically in Figs. 1-4 of the same file (Fig. 1 for the Northern corridor, Fig. 2 for the Andalusia corridor, Fig. 3 for the Levante corridor and Fig. 4 for the Catalonia corridor).
Based on data from SM03, SM04, SM06 and SM07, Table 3 presents the LCI of the operation phase in each corridor of the Spanish HSR network, in the Baseline Scenario. The LCI for the operation phase in the alternative scenarios (S1-5) is gathered in the SM08 file ("SM08.docx"). Table 1 from this document gathers the LCI for the operation phase in the alternative Scenario 1, Table 2 for alternative Scenario 2, Table 3 for alternative Scenario 3, Table 4 for alternative Scenario 4 and Table 5 for alternative Scenario 5.

LCIA and total emissions data
LCIA results gathered in the dataset include Global Warming (GWP100a) and Cumulative Energy Demand. Total emissions to air of the elementary flows PM10, SO2, NOX and NMVOC have also been considered in the analysis. Table 4 provides information on the LCIA methods considered and the elementary flows for which final net balances have been analysed. Tables 5-8 present data related to the processing of LCIA results and total emissions of selected elementary flows for the four corridors of the Spanish HSR, in three separate blocks. The first block presents the environmental impacts linked to the construction and maintenance phase, the operation phase and the transport shifted from other modes for the Baseline Scenario, and the net environmental impacts for all the scenarios explored. These results are provided on a yearly basis, considering that the lifetime of the infrastructure is 60 years. The second block presents the number of annual passengers (in millions) needed in each corridor, in each scenario, to reach a null net environmental impact. For scenario S3, the value shown in the table is the percentage of induced transport that provides a null environmental impact (in S3, transport demand exactly doubles that estimated for 2016). The third block presents the years of HSR operation needed to compensate for the environmental impacts associated with the construction and maintenance of the infrastructure. Complete and detailed results are available in the Excel files included in SM01. Table 9 presents GHG equivalent emissions linked to infrastructure construction and maintenance for each corridor on a yearly basis, and for each passenger-kilometre (pkm) travelled in each corridor, in the Baseline Scenario (assuming that transport demand in 2016 is maintained throughout the lifetime of the infrastructure).
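The third block and Table 9 rest on simple ratios. The sketch below shows one plausible reading of them, assuming constant annual demand over the 60-year lifetime. The function names and the numbers in the usage lines are hypothetical and are not taken from Tables 5-9; the exact formulas used in the SM01 spreadsheets may differ.

```python
LIFETIME_YEARS = 60  # infrastructure lifetime stated in the text


def payback_years(annual_cm, annual_operation, annual_shifted):
    """Years of operation needed to offset construction and maintenance.

    annual_cm is the construction and maintenance impact spread evenly over
    the lifetime; the other two arguments are annual impacts. Returns None
    when HSR operation saves nothing relative to the shifted modes.
    """
    annual_saving = annual_shifted - annual_operation
    if annual_saving <= 0:
        return None
    return annual_cm * LIFETIME_YEARS / annual_saving


def cm_intensity_per_pkm(annual_cm, annual_pkm):
    """Construction and maintenance impact per passenger-kilometre."""
    return annual_cm / annual_pkm


# Hypothetical values in arbitrary impact units, for illustration only.
print(payback_years(100.0, 50.0, 250.0))      # 30.0
print(cm_intensity_per_pkm(100.0, 4000.0))    # 0.025
```

A return value of None signals that the corridor never compensates its infrastructure burden under the given assumptions.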

Experimental Design, Materials and Methods
The purpose of the work that gathered this dataset was to carry out a comparative LCA of the Spanish HSR network, in order to verify whether its construction is justified in terms of reducing environmental impacts and energy consumption. The goal of the LCA was to obtain and present the inventory data linked to the life cycle of the infrastructure, also considering the environmental contribution of transport shifted from other modes to the HSR. The scope of the study was the Spanish HSR transport system constructed and in operation in 2016. The functional unit was the passenger transport service provided by the entire HSR network in one year of operation, which includes the AVE, ALVIA, AVANT and AV-City high-speed services in Spain. The primary data presented in the previous section were processed with the model described in the remainder of this section.
The net environmental impact (NetEI) linked to the functional unit, in comparison with a transport system in which the HSR is not operative, is calculated following Eq. (1):
\[ \mathrm{NetEI} = EI_{\mathrm{HSR\ Construction\ \&\ Maintenance}} + EI_{\mathrm{HSR\ Operation}} - \sum_{i} EI_{i \rightarrow \mathrm{HSR}} \tag{1} \]
where $EI_{\mathrm{HSR\ Construction\ \&\ Maintenance}}$ denotes the construction and maintenance impacts of the infrastructure, considering that they are evenly distributed along its lifetime (60 years) [15]; $EI_{\mathrm{HSR\ Operation}}$ denotes the impacts related to the annual operation of the HSR infrastructure; and $EI_{i \rightarrow \mathrm{HSR}}$ denotes the impacts linked to the annual operation of the alternative transport modes (referenced with the subscript i, i.e. aeroplane, conventional train, coach and private car) from which transport is shifted to the HSR [8].
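A minimal sketch of the balance in Eq. (1) follows, assuming the sign convention implied by the definitions: the annual HSR burdens (construction and maintenance plus operation) minus the annual burdens of the transport displaced from other modes. The function name and all numeric values are hypothetical.

```python
def net_environmental_impact(ei_construction_maintenance, ei_operation,
                             ei_shifted_by_mode):
    """Annual net impact of the HSR relative to a system without it.

    ei_construction_maintenance: lifetime impact already spread per year.
    ei_operation: annual impact of HSR operation.
    ei_shifted_by_mode: annual impact of each mode's displaced transport.
    A negative result means the HSR reduces the impact overall.
    """
    avoided = sum(ei_shifted_by_mode.values())
    return ei_construction_maintenance + ei_operation - avoided


# Hypothetical annual impacts in arbitrary units, for illustration only.
avoided_by_mode = {
    "aeroplane": 90.0,
    "conventional train": 30.0,
    "coach": 10.0,
    "private car": 110.0,
}
net = net_environmental_impact(120.0, 80.0, avoided_by_mode)
print(net)  # -40.0
```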
Detailed information was needed on every kilometre of the Spanish HSR constructed and in operation in 2016 in order to compile the inventory of the construction and maintenance phase. Since the construction plans of every line in the network could not be obtained, the data were obtained using the Google Earth tool. A complete aerial review of each of the lines that make up the network was carried out, and all the measurements of the different items were recorded. The environmental impacts associated with the construction and maintenance of the HSR network were calculated by applying the corresponding impact coefficient from Tuchschmid et al. [1] to each item of the infrastructure. The calculation for each corridor is gathered in the "CONSTRUCTION IMPACTS MODELLING" tab of the corresponding Excel file included in SM01.
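The coefficient-based calculation can be sketched as a weighted sum: each measured item length is multiplied by a per-km-and-year impact coefficient. The item names, lengths and coefficients below are hypothetical placeholders, not values from SM01, SM02 or Tuchschmid et al. [1], and items accounted per unit rather than per km are left out of the sketch.

```python
def annual_construction_impact(inventory_km, coefficient_per_km_year):
    """Annual construction and maintenance impact of one corridor.

    inventory_km: item name -> measured length in km.
    coefficient_per_km_year: item name -> impact per km and year.
    A missing coefficient raises KeyError instead of being silently skipped.
    """
    return sum(length * coefficient_per_km_year[item]
               for item, length in inventory_km.items())


# Hypothetical corridor inventory and coefficients, for illustration only.
inventory = {"bridges": 10.0, "tunnels": 5.0, "earthworks": 85.0}
coefficients = {"bridges": 4.0, "tunnels": 8.0, "earthworks": 1.0}
impact = annual_construction_impact(inventory, coefficients)
print(impact)  # 165.0
```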
The traffic data gathered for the year 2016 (available in the attached files SM03 and SM04) were processed with an algorithm programmed in Python in order to obtain an estimation of the annual density of transport in the HSR, measured in terms of annual passengers over the complete infrastructure (total transport over the corridor divided by the length of the corridor). The algorithm and its results are gathered in SM05. The Python script generates four "XX.csv" files (where XX is the name of the corridor), one for each corridor. Each .csv file has to be converted into .xlsx format, and the data inserted into the "XX_corridor_2016.xlsx" file, at cell "D6". The .xlsx file calculates minimum, maximum, mean and standard deviation values of possible passenger movements between each pair of stations. The .xlsx file also calculates a histogram for the density of transport.
The parameters for the calculation of transport in each corridor are the boundary conditions provided by Galán et al. [3] and García [2] for the number of passengers entering or leaving each station (in thousands), and the data provided by Fundación de los Ferrocarriles Españoles [9] for the number of passengers on some specific connections (also in thousands). For each HSR corridor with n stations, characterised by its distance matrix ('distances', an n × n matrix with the distances between any pair of stations) and the annual travellers arriving at or departing from each station (A, B, C, D…), the algorithm allows the random generation of an n × n matrix ('R') with the movements between any pair of stations, complying with the boundary conditions. For each calculation, a recursive algorithm is executed until the total error (e) falls below a maximum value (2% for the Northern corridor, 4.1% for the Levante corridor, 1% for the Andalusia corridor and 5% for the Catalonia corridor).
Multiple executions (10^5) of the algorithm provide a distribution of passenger transport values on the corridor under consideration, which can then be treated statistically. Results are gathered in the SM06 file ("SM06.docx").
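The script in SM05 is not reproduced here. As a rough illustration of the idea, the sketch below draws a random symmetric movement matrix R and rescales it until its row sums match the station totals within a tolerance, then derives the transport density. It swaps in a symmetric iterative scaling loop (in the spirit of iterative proportional fitting) for the authors' recursive procedure, defines the total error as the summed absolute deviation divided by the summed totals, and ignores the "specific connections" constraints. All of these choices, as well as the function names and the example figures, are assumptions.

```python
import random


def random_trip_matrix(totals, max_rel_error=0.01, max_iter=10_000, rng=None):
    """Generate a random symmetric matrix R whose row sums approximate totals.

    totals[i] is the annual number of passengers arriving at or departing
    from station i. R[i][j] is the number of movements between stations i
    and j (zero on the diagonal). Returns (R, error).
    """
    rng = rng or random.Random()
    n = len(totals)
    # Random symmetric seed with an empty diagonal.
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            R[i][j] = R[j][i] = 0.1 + rng.random()
    for _ in range(max_iter):
        row_sums = [sum(row) for row in R]
        # Total relative error against the station boundary conditions.
        error = sum(abs(r - t) for r, t in zip(row_sums, totals)) / sum(totals)
        if error < max_rel_error:
            return R, error
        # Symmetric scaling step: keeps R[i][j] equal to R[j][i].
        factors = [(t / r) ** 0.5 for r, t in zip(row_sums, totals)]
        for i in range(n):
            for j in range(n):
                R[i][j] *= factors[i] * factors[j]
    raise RuntimeError("boundary conditions not met within max_iter")


def transport_density(R, distances, corridor_length_km):
    """Total transport over the corridor divided by the corridor length."""
    n = len(R)
    total = sum(R[i][j] * distances[i][j]
                for i in range(n) for j in range(i + 1, n))
    return total / corridor_length_km


# Hypothetical four-station corridor (passengers in thousands, km).
station_totals = [100.0, 80.0, 60.0, 40.0]
station_distances = [[0, 150, 300, 450],
                     [150, 0, 150, 300],
                     [300, 150, 0, 150],
                     [450, 300, 150, 0]]
matrix, err = random_trip_matrix(station_totals, rng=random.Random(0))
print(err, transport_density(matrix, station_distances, 450.0))
```

Repeating the call with different seeds yields different admissible matrices, which is what makes a statistical treatment of the resulting densities possible.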
Eq. (1) is applied to the primary and secondary data reported in Section 1 for each of the corridors and for each of the scenarios considered. The calculation is carried out in the corresponding spreadsheet tab of each of the Excel files included in SM01; the tabs are labelled with the name of the scenario ("BASELINE SCENARIO", "S1", "S2", "S3", "S4", "S5"), and the description of the scenarios is gathered in Table 1.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.