A spatial database of CO2 emissions, urban form fragmentation and city-scale effect related impact factors for the low carbon urban system in Jinjiang city, China

This paper presents a spatial database, collected in 2013, to support the mitigation of urban carbon emissions in Jinjiang city, China. The database includes high-resolution gridded maps of CO2 emissions, urban form fragmentation evaluation maps, and distribution maps of city-scale effect related impact factors at 30 m and 500 m resolutions. We collected multi-source data, including statistical, vector, and raster data, from open-access websites and local governments. We used a general hybrid approach based on global downscaled and bottom-up elements to produce the gridded CO2 emissions maps. Urban fragmentation was measured with landscape fragmentation metrics computed at the feature scale, following accurate identification of the urban functional districts. The percentage of urban area and the points of interest (POI) density, which represent the city-scale effect related impact factors, were calculated for each grid from the land use and POI data. The database can be used to validate urban CO2 emissions estimates at the city scale. The landscape metrics and the maps of city-scale effect related impact factors can also be used to evaluate socio-economic status in support of other urban spatial planning problems.


© 2020 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Data
Specifications Table
Subject | Environmental Sciences
Specific subject area | Landscape ecology, Climate change, Carbon emissions
Type of data | Text and Geo-tiff raster
How data were acquired | The raw data used to produce the CO2 emissions gridded maps were from the statistical yearbooks (energy consumption by sectors), public remote sensing products (NPP-VIIRS, DEM) and the Jinjiang municipal government department (population, land use and road network). In addition, the other raw data used to identify the urban landscape functional districts were obtained from the Jinjiang municipal government department (city master plan) and the Baidu map.com (points of interest data). The analyzed data were produced by the authors using data fusion and calculation based on GIS

The spatial database of the low carbon urban system contains spatial distribution maps of CO2 emissions, urban form metrics (urban landscape fragmentation), proportion of urban area (PUA) and points of interest density (POID) at two resolutions (30 m (R_30m) and 500 m (R_500m)) for Jinjiang city, China, in 2013. The data were produced with ArcGIS 10.2, Apack 2.23, Fragstats 4.2 and R 3.5.3. To produce the spatial database, 10 types of raw data (Table 1) were preprocessed into a uniform geographical coordinate system, and those used to calculate the urban form metrics were unified into 16 urban functional subtypes according to specific standards (Tables 2 and 3). The high-resolution CO2 emissions gridded maps contain the emissions from the residential, industrial and transport sectors at two resolutions (Fig. 1). The landscape mixing degree of the urban functional districts was classified through the functional district types at two resolutions (Fig. 2). The fragmentation levels at two resolutions were identified by the landscape metrics (Fig. 5), which were calculated at the feature scales (Figs. 3 and 4, Table 4). The spatial distribution of PUA and POID in the spatial database represents the city-scale effect impact factors (Fig. 6). A description of all the data can be found in Datadescription.txt.

Raw data collection and preprocessing
Our methodology produced the spatial database of CO2 emissions, urban form fragmentation, and city-scale related impact factors (PUA and POID) at two resolutions (R_30m and R_500m). All the raw data were collected to produce the gridded maps of the spatial database. The raw data included the vector master planning spatial data for Jinjiang 2010-2030, the vector land use data for Jinjiang 2013, point of interest (POI) data, road network data, the population per town in 2013, the GDP, energy consumption and related socio-economic factors in 2013, a digital elevation model (DEM) at 30 m, and nightlight imagery at 500 m in 2013. Detailed information on the raw data sources is shown in Table 1.
The spatial raw data from different sources had to be preprocessed into specific, standardized input data before the spatial database could be developed. For example, the master planning spatial data of Jinjiang 2010-2030, the land use map of Jinjiang, and the Baidu POI data have 49 functional subtypes, 31 land use subtypes, and 10 POI subtypes, respectively. We aggregated the master planning spatial data and the land use map into 16 urban functional districts according to the "Current land use classification standard" (GB/T 21010-2017). The detailed aggregation is given in Table 2. We reclassified the Baidu POI data into the 16 specific urban functional districts by the names of the POIs. The detailed reclassification is given in Table 3. We then applied a simple correction to the NPP-VIIRS nightlight imagery: negative values were set to 0 and the imagery was resampled to 500 m resolution. Next, the Kriging spatial interpolation method was used to downscale the 500 m nightlight imagery to 30 m [1]. In addition, the 30 m DEM was resampled to 500 m. At this point, all the raster data listed in Table 1 were available at both resolutions. Finally, all the vector and raster data were unified into the WGS Albers 1984 projection coordinate system.
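The negative-value correction applied to the nightlight imagery can be sketched in a few lines. This is a minimal illustration in plain Python, assuming the raster has already been read into a nested list; the function name `clamp_negative` and the sample radiance values are hypothetical and are not part of the original workflow, which was carried out in GIS software.

```python
def clamp_negative(grid):
    """Return a copy of a 2-D list with negative radiance values set to 0."""
    return [[max(0.0, value) for value in row] for row in grid]


# Hypothetical radiance values for a 2 x 2 tile of nightlight imagery.
nightlight = [[-0.3, 1.2],
              [4.5, -0.1]]
cleaned = clamp_negative(nightlight)
```

Resampling and Kriging interpolation are not shown here, because they depend on the GIS toolchain used.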

High-resolution CO2 gridded maps
Our study follows the 2006 IPCC Guidelines for National Greenhouse Gas Inventories [2]. We produced gridded maps of CO2 emissions at cell sizes of 30 × 30 m and 500 × 500 m, resolutions that are typical in urban studies, based on multi-source geospatial CO2 emissions data.
In this study, gridded maps of CO2 emissions were constructed using a general hybrid approach based on global downscaled and bottom-up elements (e.g., industrial area). The total CO2 emissions in each grid were calculated as follows:

$$\mathrm{Grid}_i = \sum_{l} C_l \times \mathrm{Weight}_{i,l} = \sum_{l} AL_l \times EF_l \times \mathrm{Weight}_{i,l} \quad (1)$$

where Grid_i is the total CO2 emissions (unit: t) at the ith grid (i = 1, 2, 3, …, n), C_l (unit: t) is the total amount of CO2 emissions from emission source l, AL_l (unit: t) is the total energy consumption of emission source l, EF_l (unit: t/t CO2) is the emission factor of emission source l based on the IPCC method [2] (l = p, I, T, which represent residential, industrial, and transport emissions, respectively), and Weight_{i,l} is the weight of emission type l on grid i. Weight_{i,l} is the mathematical form of the spatial proxies.
As formula (1) shows, we first calculated the total CO2 emissions from the energy consumption values within the urban geographic boundary of Jinjiang City in 2013. The total CO2 emissions were divided into three sectors: residential, industrial, and transport emissions.
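The allocation of sector totals to grid cells can be illustrated with a short sketch. It assumes that each weight is the cell's share of a sector-specific proxy (so that the weights of one sector sum to 1); the function name `allocate`, the sector totals, and the proxy values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def allocate(sector_totals, proxies):
    """Distribute sector totals (t CO2) over grid cells in proportion to a proxy.

    sector_totals: {sector: total emissions in t}
    proxies:       {sector: [proxy value for each grid cell]}
    """
    n_cells = len(next(iter(proxies.values())))
    grid = [0.0] * n_cells
    for sector, total in sector_totals.items():
        values = proxies[sector]
        proxy_sum = sum(values)
        if proxy_sum == 0:
            raise ValueError(f"proxy for {sector!r} sums to zero")
        for i, value in enumerate(values):
            grid[i] += total * value / proxy_sum  # C_l * Weight_{i,l}
    return grid


# Hypothetical totals and proxies for three grid cells.
totals = {"residential": 100.0, "industrial": 300.0, "transport": 50.0}
proxies = {
    "residential": [1.0, 1.0, 2.0],  # e.g. gridded population
    "industrial": [0.0, 3.0, 0.0],   # e.g. nightlight on industrial land
    "transport": [1.0, 0.0, 1.0],    # e.g. road area
}
grid_emissions = allocate(totals, proxies)
```

Because the weights of each sector sum to 1, the gridded values add up to the sum of the sector totals, which is the property checked later in the uncertainty analysis.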
In the equations for residential emissions, C_1 was the electricity CO2 emission per capita; E_1 was the household electricity consumption; EF_1i was the carbon emission factor of the power grid, which was 0.8095 tCO2·MWh⁻¹ (Fujian Province belongs to the East China regional power grid); P was the population; C_2 was the gas CO2 emissions per capita; F_2, F_3 and F_4 were the quantities of household liquefied gas, gas and natural gas consumption; NVI_1 (50.179 MJ·kg⁻¹), NVI_2 (38.931 MJ·m⁻³) and NVI_3 (38.7 MJ·kg⁻¹) were the heating values of liquefied gas, gas and natural gas, respectively; EF_2 (0.06307 kgCO2·MJ⁻¹), EF_3 (0.0561 kgCO2·MJ⁻¹) and EF_4 (0.0444 kgCO2·MJ⁻¹) were the carbon emission factors of liquefied gas, gas and natural gas, respectively; M_1 (0.45 kg·m⁻³) and M_2 (0.717 kg·m⁻³) were the densities of gas and natural gas; and C_p was the mean CO2 emissions per capita.
The equation for industrial emissions was as follows:

$$C_I = K_I \times I_{growth} \times k_1$$

where C_I was the industrial emission; K_I was the energy consumption of industrial enterprises above a designated size (t standard coal per million yuan); I_growth was the value added of the industrial enterprises (million yuan); and k_1 was the standard coal CO2 emission factor in the city, which was 2.773 t CO2 per t standard coal. The standard coal was calculated from the energy consumed during production processes that generate greenhouse gases (e.g., cement and lime production). The energy consumption per unit of industrial value added was equal to the industrial energy consumption divided by the value added of industrial GDP; details can be found at http://www.stats.gov.cn/tjsj/tjgb/qttjgb/qgqttjgb/201007/t20100715_30644.html.
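The following sketch works through two of the sector calculations using the factors quoted in the text. The functional forms are assumptions inferred from the variable definitions and their units (electricity emissions per capita as consumption times grid factor divided by population; industrial emissions as energy intensity times value added times the standard coal factor). The input quantities are hypothetical, not Jinjiang statistics.

```python
EF_GRID = 0.8095  # t CO2 per MWh, East China regional power grid (from the text)
K1 = 2.773        # t CO2 per t standard coal (from the text)


def electricity_per_capita(household_electricity_mwh, population):
    """Assumed form of C_1: household electricity emissions per person (t CO2)."""
    return household_electricity_mwh * EF_GRID / population


def industrial_emissions(energy_intensity, value_added):
    """Assumed form of C_I (t CO2).

    energy_intensity: t standard coal per million yuan of value added
    value_added:      million yuan
    """
    return energy_intensity * value_added * K1


# Hypothetical inputs, for illustration only.
c1 = electricity_per_capita(1_000_000, 2_000_000)
ci = industrial_emissions(0.5, 1000)
```

The gas-related terms of the residential calculation are left out, since the way the heating values and densities combine is not recoverable from the definitions alone.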
In the equation for transport emissions, C_T was the transport CO2 emissions; Q_1, Q_2 and Q_3 were the numbers of buses, taxis and private cars, respectively; L_1, L_2 and L_3 were the annual total mileages of buses, taxis and private cars, respectively; l_1 (32 L·km⁻¹), l_2 (10 L·km⁻¹) and l_3 (10 L·km⁻¹) were the oil consumption factors per hundred kilometers, respectively; T_1 and T_2 were the passenger and freight turnover, respectively; K_T1 (11.6 kg standard coal per thousand person kilometers) and K_T2 (1.9 kg standard coal per hundred tons kilometers) were the unit energy consumptions of passenger and freight transport; k_1 (2.773 t CO2 per t standard coal) was the CO2 emission coefficient of standard coal; and k_2 (2.314 kg·L⁻¹) was the gasoline CO2 emission coefficient. The mileages of buses (70 080 km·year⁻¹), taxis (12 000 km·year⁻¹) and private cars (20 000 km·year⁻¹) were taken from references [3-6].
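A possible way to combine the transport terms is sketched below. The structure is an assumption based on the variable definitions: fuel-based emissions for buses, taxis and private cars (vehicle count times annual mileage per vehicle times fuel use per 100 km times the gasoline coefficient) plus turnover-based emissions converted through standard coal. Fuel consumption is read here as litres per 100 km, and the fleet size and turnover figures are hypothetical.

```python
K1 = 2.773   # t CO2 per t standard coal (from the text)
K2 = 2.314   # kg CO2 per litre of gasoline (from the text)
K_T1 = 11.6  # kg standard coal per 1000 person-km (from the text)
K_T2 = 1.9   # kg standard coal per 100 tonne-km (from the text)


def vehicle_emissions_t(count, km_per_year, litres_per_100km):
    """Fuel-based emissions (t CO2) for one vehicle class."""
    litres = count * km_per_year * litres_per_100km / 100.0
    return litres * K2 / 1000.0  # kg -> t


def turnover_emissions_t(passenger_km, tonne_km):
    """Turnover-based emissions (t CO2) via standard coal equivalents."""
    coal_kg = passenger_km / 1000.0 * K_T1 + tonne_km / 100.0 * K_T2
    return coal_kg / 1000.0 * K1  # kg coal -> t coal -> t CO2


def transport_emissions_t(fleet, passenger_km, tonne_km):
    """Assumed form of C_T: sum of fuel-based and turnover-based terms."""
    fuel_part = sum(vehicle_emissions_t(*vehicle) for vehicle in fleet)
    return fuel_part + turnover_emissions_t(passenger_km, tonne_km)


# Hypothetical fleet of 100 buses using the mileage and fuel factor from the text.
buses_only = transport_emissions_t([(100, 70080, 32)], 0, 0)
```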
To produce gridded maps of CO2 emissions, we constructed spatial proxies that reflected the distribution of CO2 emissions and allocated the total emissions of each region to the grids according to different weights. The spatial proxies were the high-resolution gridded population map, the industrial land map, maps of nighttime light intensity, and the areas of the various road types, all of which were generated from multi-source data including digital elevation models, nighttime light imagery from the NPP-VIIRS, the land use map, POIs and the road network map. To produce the maps at two spatial resolutions, the spatial proxies were generated from geospatial data at both resolutions. For instance, the high-accuracy population map at 500 m relied on the digital number (DN) value of the nighttime light imagery from NPP-VIIRS. At 30 m, however, additional proxy variables, such as the elevation and the area of different land-use subtypes, had to be added to the OLS regression models. The product of the nighttime light intensity and the industrial land map was used to generate the industrial proxies at two resolutions. Road areas were calculated based on the national road construction standard, which specifies different road widths for different classes of road [7]. Road areas representing the transport emissions were calculated by multiplying road widths and road lengths in each grid cell using the ArcGIS 10.2 software. Finally, we generated the CO2 emission gridded maps by overlaying these proxies at two resolutions (Fig. 1).
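The road-area proxy lends itself to a small worked example. The sketch below multiplies road width by road length within each grid cell and normalizes the result into weights. The road classes and widths in `ROAD_WIDTH_M` are hypothetical placeholders, not the values of the national standard cited in the text.

```python
# Hypothetical widths per road class in metres; the standard's values are not reproduced here.
ROAD_WIDTH_M = {"expressway": 30.0, "arterial": 20.0, "local": 8.0}


def road_area_m2(segments):
    """Road area in one grid cell from (road_class, length_m) pairs."""
    return sum(ROAD_WIDTH_M[road_class] * length_m for road_class, length_m in segments)


def normalize(values):
    """Turn proxy values into weights that sum to 1 (all zeros if the proxy is empty)."""
    total = sum(values)
    return [v / total for v in values] if total else [0.0] * len(values)


# Three hypothetical grid cells, the last one without any road.
cells = [
    [("arterial", 100.0)],
    [("local", 250.0), ("expressway", 100.0)],
    [],
]
areas = [road_area_m2(cell) for cell in cells]
weights = normalize(areas)
```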
High-resolution CO2 emissions gridded maps were obtained by combining global downscaled and bottom-up approaches with spatial analysis models; however, the input variables carried some uncertainties. These uncertainties were propagated into the high-resolution CO2 emission maps and therefore need to be explained and analyzed [8].
Table 4
Fitting and inflection point of the log-log curve of size-Lacunarity.

Landscape type | Fitting curve | R² | Inflection point
Functional district patches (30 m) | y = −0.422x + 0.094x² − 0.010x³ + 1.406 | 0.999 | 3.080
Functional district patches (500 m) | y = −0.143x + 0.061x² − 0.049x³ + 0.008x⁴ + 0.684 | 0.999 | 0.693

The first step was to analyze the input variables, the output variables and the whole distribution model. The input variables included the total CO2 estimate and the spatial proxies, as shown in formula (1). As to the CO2 estimate (also known as "magnitude" uncertainty), the activity level and the emission factor had great uncertainties; however, we calculated the amounts of the different emissions using the IPCC guidance, which had high confidence. Besides the magnitude uncertainty, the spatial weight distribution created and propagated uncertainty. The uncertainty of Weight_p was the error in the generation of the population map. The uncertainty of Weight_I depended on the accuracy of identifying industry locations and on the "blooming" effect of night light images. The area of the road network could not fully represent the transport emission intensity [9]. Following Wang et al. [10], the accuracy was assessed by comparing the total estimated emissions with the sum over all grids on the map. The absolute errors of total emissions at R_30m and R_500m were 886.70 t and 0.28 t, respectively. The relative errors of total emissions at R_30m and R_500m were 0.005% and less than 0.001%, respectively, especially in residential emissions. The RMSEs at R_30m and R_500m were 185.73 t and 4508.19 t, respectively.
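The consistency check between the estimated totals and the gridded maps can be sketched as follows. The absolute and relative errors compare the sum of all grid values with the estimated total. The text does not state which reference values the RMSE was computed against, so the `rmse` helper simply takes two equally long sequences. All names and numbers are hypothetical.

```python
import math


def total_errors(estimated_total, grid_values):
    """Absolute error (t) and relative error (%) of the mapped total."""
    mapped_total = sum(grid_values)
    absolute = abs(mapped_total - estimated_total)
    relative_pct = absolute / estimated_total * 100.0
    return absolute, relative_pct


def rmse(predicted, reference):
    """Root mean square error between two equally long sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical example: an estimated total of 1000 t mapped onto three cells.
abs_err, rel_err = total_errors(1000.0, [250.0, 250.0, 499.0])
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```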

Identification of urban functional districts
With reference to the research of Chi and Long [11], we generated urban functional districts using the land use map, master planning data and POI data, from which the grid frequency density (FD) and the grid category ratio (CR) were derived. FD and CR are defined as follows:

$$F_i = \frac{n_i}{N_i}$$

$$C_i = \frac{F_i}{\sum_{j} F_j} \times 100\%$$

where F_i represented the FD of the corresponding classification and C_i represented the ratio of the FD of the corresponding classification to the FD of all classifications in the grid; i represented the functional category of POI; n_i was the number of POIs of the ith category in the grid; and N_i represented the total number of POIs of the ith category. If C_i exceeded 50%, we defined the grid as a single functional district patch, with the category being the same as the corresponding classification. If C_i was less than 50%, we defined the grid as a mixed functional district patch.
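The FD and CR rule can be made concrete with a short sketch. It assumes that FD is the count of POIs of a category in the grid divided by the total count of that category, and that CR is a category's share of the summed FD values in the grid. A grid exactly at the 50% threshold is treated as mixed here, which the text leaves unspecified. Function names, category names and counts are hypothetical.

```python
def category_ratios(counts_in_grid, totals_by_category):
    """Category ratio (CR, as a fraction) of each POI category in one grid."""
    fd = {
        category: counts_in_grid.get(category, 0) / total
        for category, total in totals_by_category.items()
        if total > 0
    }
    fd_sum = sum(fd.values())
    if fd_sum == 0:
        return {}  # no POI in this grid
    return {category: value / fd_sum for category, value in fd.items()}


def classify_grid(ratios, threshold=0.5):
    """Label a grid as a single category, 'mixed', or 'indefinite' (no POI)."""
    if not ratios:
        return "indefinite"
    dominant = max(ratios, key=ratios.get)
    return dominant if ratios[dominant] > threshold else "mixed"


# Hypothetical city-wide totals and per-grid counts.
totals = {"commercial": 100, "residential": 100, "industrial": 100}
single = classify_grid(category_ratios({"commercial": 8, "residential": 2}, totals))
mixed = classify_grid(category_ratios({"commercial": 4, "residential": 3, "industrial": 3}, totals))
empty = classify_grid(category_ratios({}, totals))
```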
Since POIs were point data, there were some data deficiencies in areas where human activity was rare. Moreover, there were only 10 categories of urban functional areas in the POI data; six functional area categories (water, agriculture and forestry land, other non-construction land, village construction land, special land, and mining land) were missing compared to the master plan map data. We spatially superimposed the two kinds of data, labeled the areas with missing POI data as "indefinite", and replaced them using the master plan map [12]. The indefinite patches were then corrected using the land use/cover data. The rules for correcting an indefinite classification of an urban functional district patch were as follows. First, we identified the parcels whose classification was inconsistent among the POI, land use map and master planning data. No correction was needed for parcels whose classifications were all consistent, or for which at least the land use/cover and POI classifications were the same. The inconsistent parcels were defined as uncertain parcels and were corrected using the land use/cover map. Because POI data were deficient in some areas, the single functional district patches were validated against the land use/cover map as well.
To confirm that the urban functional districts were accurately defined, 100 verification points were randomly generated and visually identified on the 2013 Google Earth image. Of these, 84 verification points were consistent and 16 were inconsistent. We revised the urban functional districts of the parcels corresponding to the inconsistent points to their true urban functional subtypes. The urban mixed and single functional districts/zones were defined according to the mixture degree of each grid pixel.

Landscape mixed index of urban functional districts
The equation used to calculate the urban functional landscape mixed index was as follows:

$$\mathrm{Landscapemix}_l = -\frac{\sum_{k=1}^{K} p_{k,l} \ln p_{k,l}}{\ln K} \quad (9)$$

where K was the number of urban functional district types in the lth pixel and p_{k,l} represented the area percentage of the k-type urban functional district in the lth pixel. The higher the urban functional district mixed index was, the higher the fragmentation degree was. A mixed index of 0 represented a single urban functional district; areas with other values were mixed urban functional districts. The urban functional district landscape mixed indexes at two resolutions are shown in Fig. 2.
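The mixed index is a normalized Shannon entropy, which the following sketch computes for one pixel. It assumes that K counts only the functional district types actually present in the pixel and that a pixel with a single type gets an index of 0 by definition, since ln 1 = 0 would otherwise lead to a division by zero. The function name and the sample shares are hypothetical.

```python
import math


def landscape_mix(area_shares):
    """Normalized Shannon entropy of the area shares of district types in one pixel."""
    shares = [p for p in area_shares if p > 0]
    k = len(shares)
    if k <= 1:
        return 0.0  # single functional district
    entropy = -sum(p * math.log(p) for p in shares)
    return entropy / math.log(k)


single_district = landscape_mix([1.0])
even_mix = landscape_mix([0.25, 0.25, 0.25, 0.25])
uneven_mix = landscape_mix([0.9, 0.1])
```

With this normalization the index lies between 0 (one type) and 1 (all present types cover equal areas), so `uneven_mix` falls strictly between the two.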

Lacunarity index and feature scale determination
We used the Lacunarity index to quantify landscape heterogeneity and select feature scales [13]. The feature scales were the sizes of the moving windows used to calculate the landscape metrics. Landscape metrics are simple quantitative indicators that condense information on landscape pattern. The metrics reflect the structural composition of the landscape and provide information regarding some aspects of its spatial configuration. Different moving window sizes were set for R_30m and R_500m. For R_30m, the moving window size ranged from 30 m × 30 m to 9000 m × 9000 m. For R_500m, it ranged from 500 m × 500 m to 10000 m × 10000 m. The Lacunarity index was calculated based on the following equation:

$$L(r) = \frac{S_s^2(r)}{S^2(r)} + 1$$

where L(r) was the Lacunarity index; S(r) was the mean mixing degree contained in each sample of the moving window; S_s^2(r) was the variance; and r was the length of the moving window. The inflection point of the log-log curve of the size-Lacunarity index was the feature scale, i.e., the maximum value of the differential of the fitted hyperbolic function. The logarithm of the Lacunarity index decreased as the logarithm of the size increased (Fig. 3). The logarithmic curve dropped greatly in the middle and right sections, indicating that the hierarchical structure of the whole urban mixed landscape and the fractal dimension distribution were in the middle and left sections. Polynomials were used to fit the log-log Lacunarity curves at R_30m and R_500m. The fitted curves reached R² = 0.999, indicating an excellent fit; all the fitted equations are shown in Table 4. The feature scales of the functional area fragmentation analysis were 652.698 m (R_30m) and 1000 m (R_500m) (Fig. 4).
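A gliding-box version of the Lacunarity calculation is sketched below for a small grid held as a nested list. It uses the common definition, variance of the box mass divided by the squared mean box mass, plus 1. The helper `feature_scale_m` rests on a further assumption: that the inflection points in Table 4 are natural logarithms of the window size in cells. Under that reading, 30 m × e^3.080 and 500 m × e^0.693 come out close to the reported 652.698 m and 1000 m. All names and the sample grid are hypothetical.

```python
import math


def lacunarity(grid, r):
    """Gliding-box Lacunarity of a 2-D list for a square window of r x r cells."""
    rows, cols = len(grid), len(grid[0])
    masses = []
    for i in range(rows - r + 1):
        for j in range(cols - r + 1):
            masses.append(sum(grid[a][b]
                              for a in range(i, i + r)
                              for b in range(j, j + r)))
    mean = sum(masses) / len(masses)
    if mean == 0:
        return float("nan")  # undefined for an empty landscape
    variance = sum((m - mean) ** 2 for m in masses) / len(masses)
    return variance / mean ** 2 + 1.0


def feature_scale_m(cell_size_m, log_inflection):
    """Window length in metres, assuming the inflection point is ln(window size in cells)."""
    return cell_size_m * math.exp(log_inflection)


uniform = lacunarity([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 2)
patchy = lacunarity([[1, 0], [0, 1]], 1)
scale_30m = feature_scale_m(30.0, 3.080)
scale_500m = feature_scale_m(500.0, 0.693)
```

A perfectly uniform grid gives a Lacunarity of 1 at every window size, while gappy patterns give larger values that decline as the window grows.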

Landscape metrics
To characterize the fragmentation degree of urban form, Fragstats 4.2 software was used to calculate the number of patches (NP), patch density (PD), division (DIVISION) and effective mesh size (MESH) metrics. Fragstats 4.2 is a spatial analysis program for categorical maps that can calculate a wide range of landscape indexes at different levels (https://www.umass.edu/landeco/research/fragstats/downloads/fragstats_downloads.html).
Formulas of the above-mentioned indicators were as follows:

$$\mathrm{MESH} = \frac{\sum_{j=1}^{n} a_{ij}^2}{A} \times \frac{1}{10000} \quad (14)$$

where N_i was the number of patches in the ith landscape type (e.g., agriculture); M was the total number of landscape types; A was the total area of the landscape (m²); n represented the number of single patches (mixing degree of 0) in the landscape; and a_ij was the area of patch ij. High NP, PD and DIVISION values represented high fragmentation, whereas a high MESH value represented low fragmentation. We divided Jinjiang city into low-, mid- and high-fragmentation level areas according to classification maps obtained by the K-Means method (Fig. 5). K-Means is a quick clustering method with low algorithmic complexity and high efficiency in handling big data [14]. The ranges of the different landscape metric values are shown in Table 5.
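For readers who want to check the direction of each metric, the sketch below computes the four indicators from a list of patch areas. It follows the standard Fragstats definitions (PD in patches per 100 ha, DIVISION as one minus the sum of squared area shares, MESH in hectares), which may differ in detail from the exact forms used to build the database. The function name and the patch areas are hypothetical.

```python
def fragmentation_metrics(patch_areas_m2, landscape_area_m2):
    """NP, PD, DIVISION and MESH for one landscape, from patch areas in m2."""
    number_of_patches = len(patch_areas_m2)
    # Patch density in patches per 100 ha (10 000 m2 per ha, times 100).
    patch_density = number_of_patches / landscape_area_m2 * 10_000 * 100
    division = 1.0 - sum((a / landscape_area_m2) ** 2 for a in patch_areas_m2)
    # Effective mesh size in hectares.
    mesh_ha = sum(a ** 2 for a in patch_areas_m2) / landscape_area_m2 / 10_000
    return {"NP": number_of_patches, "PD": patch_density,
            "DIVISION": division, "MESH": mesh_ha}


# Hypothetical 100 ha landscape, first as three patches, then as one patch.
fragmented = fragmentation_metrics([500_000.0, 250_000.0, 250_000.0], 1_000_000.0)
intact = fragmentation_metrics([1_000_000.0], 1_000_000.0)
```

Comparing the two results shows the pattern described in the text: the fragmented landscape has higher NP, PD and DIVISION and a lower MESH than the intact one.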
City-scale effect related impact factors of the CO2 mitigation: PUA and POID
The urban functional district (UFD) was the relatively detailed function classification of built-up areas. Non-built-up areas can be treated as a single function classification, so the UFD fragmentation of non-built-up areas was not significant. In contrast, urban functional landscape fragmentation was more pronounced in urban built-up areas. In quantifying the influence of UFD fragmentation on urban CO2 emissions, we used PUA and POID to control for the effects of the urban population, economic scale and aggregation on CO2 emissions, which are the three most recognized major impact factors of urban CO2 emissions.
PUA was expressed as the proportion of urban area within each grid unit. Detailed information on the metrics of urban sprawl is provided in Ref. [15]. POID was expressed as the number of POIs within each grid unit. PUA was classified into three categories (low, middle and high) using K-Means. POID was classified into three categories (POID = 0, POID = 1 and POID > 1).
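The two classifications can be sketched in plain Python. `poid_class` applies the three POID categories directly. `kmeans_1d` is a simple one-dimensional K-Means (Lloyd's algorithm) with a deterministic start at evenly spaced order statistics, which is an illustrative choice and not necessarily the initialization used to build the database. The PUA values are hypothetical.

```python
def poid_class(poi_count):
    """Map the number of POIs in a grid to the three POID categories."""
    if poi_count == 0:
        return "POID = 0"
    return "POID = 1" if poi_count == 1 else "POID > 1"


def kmeans_1d(values, k=3, max_iterations=100):
    """Cluster scalar values into k groups; labels run from 0 (lowest) to k - 1."""
    ordered = sorted(values)
    # Deterministic start: evenly spaced order statistics (an illustrative choice).
    centers = [ordered[(len(ordered) - 1) * j // (k - 1)] for j in range(k)]
    labels = []
    for _ in range(max_iterations):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break  # assignments are stable
        centers = updated
    return labels, centers


# Hypothetical PUA values (share of urban area per grid) and POI counts.
pua = [0.02, 0.05, 0.45, 0.50, 0.90, 0.95]
pua_labels, pua_centers = kmeans_1d(pua)
poid_labels = [poid_class(n) for n in [0, 1, 7]]
```

Here the labels 0, 1 and 2 correspond to the low, middle and high PUA categories.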