Geospatial dataset for analyzing socio-economic regional divergence of European regions

This data article presents macroeconomic data that can be used for comparative territorial studies. The data cover a sample of 413 regions of 48 European countries, including Cyprus, Turkey, the European part of Russia, and two partially recognized states, the Republic of Kosovo and the Pridnestrovian Moldavian Republic. Regions are national administrative-territorial units corresponding to the second level of the European Commission's common classification of territorial units for statistics (NUTS 2 regions of the European Union) and comparable administrative-territorial units outside the EU. The statistical database covers the five-year period 2010–2014. The dataset was created to enhance our understanding of contemporary coastalization dynamics in Europe. Although the coastal regions of European countries are highly developed and remain attractive for human settlement, industry location, and investment flows, their contribution to the socio-economic development of Europe is unclear. The reported data comprise series of key indicators traditionally used in the comparative analysis of regional development: average annual population, gross regional product (GRP) in purchasing power parity (PPP), labor productivity, population density, and GRP (PPP) per sq. km. Accounting for differences in the geo-economic position of European regions makes it possible to distinguish four subtypes of regions with a particular emphasis on the coastal area: coastal border, coastal other, coastal hinterland, and inland other. An additional focus is on differentiating the performance indicators of regions by their border geo-economic position: border regions, which have a state border over a land, lake, or river surface, and midland regions, which are all other non-border regions. These data are intended as a comparative benchmark for the coastal border subgroup of regions against the totality of border and midland regions.

Value of the data
Studies on coastalization generally confirm the asymmetrical development of territories with gravitation towards marine and ocean coasts [1][2][3][4][5][6][7]. To some extent, these results are predetermined by a research scope that features islands and marine-focused economies, or by a limited classification of territories: coastal and non-coastal, etc. The data presented respond to these research limitations by providing macroeconomic data across Europe for a broad classification of regions. The dataset makes it possible to differentiate the development patterns of coastal and adjacent regions (coastal hinterland), border regions, coastal borderland, and midland (inland) territories (the regional typology is taken from [8]).
The dataset covers the entire territory of Europe, including countries that are not part of the European Union (EU). Difficulties in collecting and harmonizing the data of national statistical offices limit most available research to the EU or the national level. The data provided make it possible to conduct comparative studies on regional socio-economic development across Europe, including the European part of Russia. Research on regional divergence at the macro-regional level (e.g. the Baltic region, the Mediterranean region, the Baltic-Black Sea region, etc.) would be of particular value.
This dataset may have important policy implications. The identifiable socio-economic development trajectories of regions over the five-year period may reveal distinct patterns in the development of regions of different types (e.g. the interrelation between coastal regions and the adjacent territories of the coastal hinterland). Correlations may be found between the policy instruments implemented and the change in macroeconomic indicators. The data may be useful in developing socio-economic typologies of regions and assessing the differences in the territorial development of individual European countries.

Data
The data cover a sample of 413 regions (NUTS 2 level and comparable administrative-territorial units) of 48 European countries. The dataset spans the period 2010-2014. The data are grouped according to the types of regions identified on the basis of their geo-economic position (Table 1). In the first group, the determining factor for assigning a region to a subgroup is its location relative to marine and ocean coasts (coastal geo-economic position): 1. coastal (1.1 coastal border; 1.2 coastal other); 2. inland (2.1 coastal hinterland; 2.2 inland other). The second group focuses on the borderland geo-economic position and comprises two subgroups of regions: 1. border; 2. midland. The two groups are designed to complement each other: the second group provides a comparative benchmark for the coastal border subgroup of regions against the totality of border and midland regions.
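The two groupings can be held side by side as labels on each region. The sketch below is illustrative only: the Region record, its three boolean attributes, and the decision rules (a coastal border region is taken to be a coastal region that also has a state border; a coastal hinterland region is taken to be an inland region adjoining a coastal one) are assumptions made for the example, not definitions reproduced from Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical region record; the three flags are assumed inputs."""
    name: str
    has_sea_coast: bool           # touches a marine or ocean coast
    has_state_border: bool        # state border over land, lake or river
    adjoins_coastal_region: bool  # inland neighbour of a coastal region


def coastal_type(r: Region) -> str:
    """First grouping: position relative to marine and ocean coasts (assumed rules)."""
    if r.has_sea_coast:
        return "1.1 coastal border" if r.has_state_border else "1.2 coastal other"
    return "2.1 coastal hinterland" if r.adjoins_coastal_region else "2.2 inland other"


def border_type(r: Region) -> str:
    """Second grouping: borderland position (assumed rule)."""
    return "1. border" if r.has_state_border else "2. midland"


if __name__ == "__main__":
    demo = Region("Example region", True, True, False)
    print(coastal_type(demo), "|", border_type(demo))
```

Keeping both labels on the same record allows the coastal border subgroup to be compared directly with the border and midland totals.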
Average annual population and GRP (PPP) were chosen as the basic indicators for two reasons. Firstly, they are universal: they are recorded in the statistical databases of all European countries or can be derived from the available statistical data. Secondly, they reflect the level of regional socio-economic development and, in relative terms (per sq. km or per person), show the concentration of resources in a given region, which is necessary to assess coastalization or regional divergence in general.
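The relative indicators mentioned above are simple ratios of the basic indicators to territory or population. A minimal sketch follows; the function name and the input values are invented for illustration and are not taken from the dataset.

```python
def relative_indicators(population: float, grp_ppp: float, area_sq_km: float) -> dict:
    """Return density and concentration ratios for one region.

    population  -- average annual population, persons
    grp_ppp     -- gross regional product in PPP, euro
    area_sq_km  -- size of territory, sq. km
    """
    if population <= 0 or area_sq_km <= 0:
        raise ValueError("population and area must be positive")
    return {
        "population_density": population / area_sq_km,  # persons per sq. km
        "grp_per_sq_km": grp_ppp / area_sq_km,           # euro per sq. km
        "grp_per_capita": grp_ppp / population,          # euro per person
    }


# Illustrative values only.
example = relative_indicators(population=2_000_000,
                              grp_ppp=50_000_000_000,
                              area_sq_km=10_000)
print(example)
```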
Figs. 1 and 2 map the average annual population and GRP (PPP) in all regions of Europe as average values for the period 2010-2014. Raw data for the series of maps are available in Excel spreadsheets, with a separate table for each map.
The Supplementary data provide the developed typology of European regions with an indication of the nomenclature of each region and country, as well as information on the size of territory (Supplement 1a-d and 2a,b). Individual Excel tables present a series of aggregated macroeconomic data on the types of European regions studied.

Experimental design, materials and methods
The data cover a sample of 413 regions of 48 European countries, including Cyprus, Turkey, the European part of Russia, and two partially recognized states, the Republic of Kosovo and the Pridnestrovian Moldavian Republic. Kosovo and Pridnestrovie are accounted for individually in order to obtain accurate statistical data (not estimates) and to consider their region types as de facto independent territorial socio-economic systems. The statistical data for the European territory of the Russian Federation include regions of the Northwestern, Central, Volga, Southern, North Caucasus, and, from 2014, Crimean Federal Districts. Regions are defined as national administrative-territorial units corresponding to the second level of the European Commission's common classification of territorial units for statistics (2015), i.e. NUTS 2 regions of the European Union, and comparable administrative-territorial units outside the EU. For 17 European states this level corresponds to the total territory of the country (Andorra, Cyprus, Estonia, Iceland, Kosovo, Latvia, Liechtenstein, Lithuania, Luxembourg, Macedonia, Malta, Moldavia, Monaco, Montenegro, Pridnestrovie, San Marino, and Vatican). Regions at the NUTS 2 level represent holistic socio-economic systems with a high degree of economic and institutional self-sufficiency. The classification corresponds to the official statistics available for the overwhelming majority of the countries under study, which ensures the comparability of results and makes it possible to build a complete series of statistical data by region.

[Table: Variable definition | Data frequency | Source of data; first row group: Basic indicators of regional socio-economic development level]
The statistical database covers the five-year period 2010-2014. The macroeconomic data were collected from several sources, including Eurostat, national statistical offices, the World Bank, and the IMF. When the database was created, the comparability of the indicators' measurement units was ensured (e.g. the GRP (PPP) of all countries is quoted in euro). For some regions, individual indicator values were either unavailable or inaccessible, so extrapolation and interpolation techniques were applied to build complete data series. Three rules were used: (i) if the value of the indicator for a region was known for only one year of the period, it was taken as a constant and extrapolated to the entire period 2010-2014; (ii) if the value for a subsequent year was missing, it was replaced by the value for the previous year; (iii) if the value for the first year analyzed was not available, it was replaced by the value for the following year. Table 2 provides the variable definitions for the macroeconomic series.
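The gap-filling rules described above can be sketched as a back-fill of the leading gap followed by a forward-fill of later gaps. The function name and the sample values are hypothetical, and the handling of several consecutive missing years at the start of the period is an assumption, since the text describes only a single missing first year.

```python
from typing import List, Optional, Sequence

YEARS = (2010, 2011, 2012, 2013, 2014)


def fill_series(values: Sequence[Optional[float]]) -> List[float]:
    """Fill missing yearly values (None) in a chronologically ordered series."""
    if all(v is None for v in values):
        raise ValueError("at least one observed value is required")

    filled = list(values)
    # Leading gap: use the first observed following year.
    # (Assumption: the same value is reused if several leading years are missing.)
    first_known = next(i for i, v in enumerate(filled) if v is not None)
    for i in range(first_known):
        filled[i] = filled[first_known]
    # Later gaps: carry the previous year's value forward.
    for i in range(first_known + 1, len(filled)):
        if filled[i] is None:
            filled[i] = filled[i - 1]
    return filled


if __name__ == "__main__":
    # Illustrative series with gaps in 2010 and 2013.
    observed = [None, 1.20, 1.25, None, 1.31]
    print(dict(zip(YEARS, fill_series(observed))))
```

A series with a single observed year becomes constant over the whole period, because the back-fill and forward-fill steps together reproduce the first rule.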
The initial data were aggregated according to the proposed classification of regions to obtain the values of the average annual population and GRP (PPP) indicators by types and subtypes of regions. At the next stage, the data were converted into calculated indicators, growth rates, and coefficients, and the regions were ranked.
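The aggregation and ranking steps can be sketched as follows. The tuple layout, the function names, and the descending order of the ranking are assumptions for illustration; the text does not specify the ranking criterion.

```python
from collections import defaultdict


def aggregate_by_type(rows):
    """Sum population and GRP (PPP) over regions sharing a type label.

    `rows` is an iterable of (type_label, population, grp_ppp) tuples.
    """
    totals = defaultdict(lambda: {"population": 0.0, "grp_ppp": 0.0})
    for type_label, population, grp_ppp in rows:
        totals[type_label]["population"] += population
        totals[type_label]["grp_ppp"] += grp_ppp
    return dict(totals)


def rank_regions(values):
    """Map region name -> rank, where 1 is the largest value (assumed order).

    Ties are broken alphabetically by region name.
    """
    ordered = sorted(values.items(), key=lambda item: (-item[1], item[0]))
    return {name: position for position, (name, _) in enumerate(ordered, start=1)}


if __name__ == "__main__":
    # Invented rows: (subtype, average annual population, GRP (PPP) in euro).
    sample = [
        ("coastal border", 100.0, 10.0),
        ("coastal border", 50.0, 5.0),
        ("inland other", 30.0, 2.0),
    ]
    print(aggregate_by_type(sample))
    print(rank_regions({"Region A": 3.0, "Region B": 7.0}))
```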
Firstly, there are basic indicators of the level of regional socio-economic development and of regional economic efficiency. These indicators are used to assess the manifestation of coastalization in Europe and to differentiate European regions with different geo-economic positions in terms of socio-economic development.
Secondly, there are indicators of regional socio-economic development dynamics. They are used to assess the direction of regional development, namely the growth or decline of average annual population, GRP, and labor productivity.
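As an illustration of the dynamics indicators, growth or decline can be expressed as a percentage change between years. The exact formulas used for the dataset are not given in the text, so the two definitions below (year-on-year change and change over the whole period) are assumptions, as are the function names.

```python
def growth_rates(series):
    """Year-on-year change in percent for a chronologically ordered series."""
    rates = []
    for previous, current in zip(series, series[1:]):
        if previous == 0:
            raise ValueError("growth rate is undefined for a zero base value")
        rates.append((current - previous) / previous * 100.0)
    return rates


def period_growth(series):
    """Change in percent between the first and the last year of the series."""
    if series[0] == 0:
        raise ValueError("growth rate is undefined for a zero base value")
    return (series[-1] - series[0]) / series[0] * 100.0


if __name__ == "__main__":
    # Invented population series for 2010-2014; a negative result means decline.
    population = [1000.0, 1010.0, 1005.0, 990.0, 985.0]
    print(growth_rates(population))
    print(period_growth(population))
```

The same functions apply to GRP or labor productivity series, provided the values are expressed in comparable units across years.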
Thirdly, there are indicators of the typological distribution of regions. They are used to assess the distribution of population and GRP (PPP) among the identified types of regions.
Fourthly, there are indicators of resource concentration in the regions. They are used to assess urban agglomeration effects and the efficiency of territory use in different types of regions.
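The typological distribution mentioned above can be expressed as the share of each region type in the overall total of an indicator. This is a minimal sketch under that assumption; the function name and the input totals are invented for illustration.

```python
def type_shares(totals):
    """Convert totals by region type into percentage shares of the overall total.

    `totals` maps a type label to the summed value of one indicator,
    e.g. population or GRP (PPP).
    """
    overall = sum(totals.values())
    if overall <= 0:
        raise ValueError("the overall total must be positive")
    return {label: value / overall * 100.0 for label, value in totals.items()}


if __name__ == "__main__":
    # Invented population totals by subtype, in millions of persons.
    population_by_type = {
        "coastal border": 60.0,
        "coastal other": 180.0,
        "coastal hinterland": 150.0,
        "inland other": 210.0,
    }
    for label, share in type_shares(population_by_type).items():
        print(f"{label}: {share:.1f}%")
```

Comparing the population share of a type with its GRP (PPP) share gives a quick indication of whether that type concentrates economic output disproportionately.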