Mapping and evaluating global urban entities (2000–2020): A novel perspective to delineate urban entities based on consistent nighttime light data

ABSTRACT The differences in the definition of urban areas lead to our contrasting or inconsistent understanding of global urban development and their corresponding socioeconomic and environmental impacts. The existing urban areas were widely identified by the boundaries of built-environment or social-connections, rather than urban entities that are essentially the spatial extents of human activity agglomerations. Thus, this study has attempted to map and evaluate global urban entities (2000–2020) from a perspective of an updated urban concept of urban entities based on the consistent remotely sensed nighttime light data. First, a K-means algorithm was developed to cluster urban and non-urban pixels automatically in consideration of global region division. Then, a post-processing was conducted to enhance the temporal and logical consistency of urban entities during the study period. Rationality assessment indicates that urban entities derived from remotely sensed nighttime light data more effectively reflect the spatial agglomeration extents of human activities than those of physical urban areas. Global urban entities increased from 157,733 km2 in 2000 to 470,632 km2 in 2020 accompanied by a differentiated urban expansion at global, continental, and national levels. Our study provides long-time series and fine-resolution datasets (500 m) and new research avenues for spatiotemporal analysis of global urban entity expansion with the improvement of the understanding of urbanization and the emergence of effective urban mapping theories and approaches.


Introduction
A potential premise for understanding and predicting urban growth paths and solving future sustainable development problems is to identify urban areas on the basis of specific urban characteristics. Areas with physical surfaces occupied by roads, buildings, and other infrastructure are often regarded as urban areas (Zhou et al. 2014; Zhang et al. 2020). Although this definition characterizes urban physical characteristics from the perspective of landscape components or structures (Gong et al. 2020), it cannot effectively evaluate inherent human activity aggregation characteristics (Ellison, Glaeser, and Kerr 2010). In addition, invisible human perception and socioeconomic urban functions should be interpreted as different expressions of urban structures (Keuschnigg 2019), but these cannot be accurately mapped to specific spatial regions over long time series and at large scales. From an economic perspective, urban areas are essentially aggregated entities of population (or human activity) for promoting socially, economically, and environmentally sustainable development. The socioeconomic reason for the existence of urban areas is mainly attributed to the benefits brought by agglomeration effects (e.g. economies of scale, positive externalities, and labor force accumulation).
An economics-based definition of urban areas would facilitate practical socioeconomic application analysis. For example, it is more appropriate to analyze the vacancy rate of urban housing from the perspective of urban human activities rather than the physical surface of a city (Chen et al. 2015). Thus, urban areas should be defined as urban entities with both socioeconomic and natural characteristics. Generally, urban entities ought to have the following three characteristics. (1) Urban entities should be the spatial expressions of highly concentrated human social and economic activities. (2) The spatial form of urban entities should be characterized by concentration and continuity. (3) Urban entities have the same physical properties from a large-scale perspective.
Traditionally, population density or size is usually selected by governments as the main criterion for defining urban entities (Habitat 2006). For instance, taking a certain population density as the threshold, the LandScan and WorldPop datasets were usually employed to define urbanized entities (Henderson, Nigmatulina, and Kriticos 2019). Although "population urban entities" meet the identification of urban "population" in economics, the differences in population size or density under various national conditions worldwide make them impossible to use for forming unified urban entities across diverse countries (Cohen 2006). Satellite remotely sensed data provide great potential for urban entity mapping on a large scale. Currently, global urban entity mappings are usually derived from Moderate Resolution Imaging Spectroradiometer (MODIS) data (Schneider, Friedl, and Potere 2010), Landsat observations (Kuang et al. 2021), and Sentinel data (Taubenböck et al. 2012), which are mostly designed to reveal the temporal and spatial distributions of artificial impervious extents or urban built-up areas. However, existing urban theories indicate that urban development is a complex mosaic or process, and urban areas should not be determined solely by the visual physical structure (Grove, Cadenasso, and Pickett 2015).
Satellite-based nighttime light (NTL) observations can effectively monitor persistent human-caused light sources from cities, towns, and other places, as well as traffic flow, which presents a great potential to measure the spatiotemporal changes of socioeconomic activities and the urbanization process (Noam et al. 2020; Small, Pozzi, and Elvidge 2005). Unlike traditional remotely sensed data focused on reflecting the morphological and texture information of landscapes, NTL data can provide coupled information of lights and spatiotemporal locations (Zhao et al. 2019). Specifically, data from the Defense Meteorological Satellite Program's Operational Linescan System (DMSP-OLS) and from the Visible Infrared Imaging Radiometer Suite instrument onboard the Suomi National Polar-orbiting Partnership satellite (SNPP-VIIRS), as the two most popular NTL data sources, are widely employed to extract urban entities (Zheng, Weng, and Wang 2021), evaluate urbanization (Ma et al. 2014), estimate socioeconomic parameters (e.g. population, gross domestic product (GDP), carbon emissions, and building stock) (Shi et al. 2014, 2015, 2019, 2021), and analyze other events related to human activities (Shi et al. 2020, 2021). Theoretically, the use of NTL data to delineate and map urban entities has two advantages: (1) They can effectively identify urban entities from a human activity agglomeration perspective (Henderson, Nigmatulina, and Kriticos 2019). (2) They can expediently map urban entity expansion at a nearly global coverage and long temporal frequencies within their minimal spatial resolution (Elvidge et al. 2009).
Many studies have attempted to extract urban entities by using NTL data from different angles, but the following issues still need further in-depth analysis. First, most studies rarely discussed what they are actually mapping or reflecting (Hu et al. 2020). Some studies still regarded the extracted urban entities as urban built-up areas or impervious surfaces (Chen et al. 2019). Although a few studies realized the necessity of defining urban entities and recognized that NTLs reflect human activity attributes, most of them assumed that urban areas are equivalent to urban physical extents (Zhao et al. 2019; Zhou et al. 2014). On the basis of these assumptions, impervious surfaces or urban built-up areas were often employed as validation references (Hu et al. 2020). As mentioned above, urban entities are complex mosaics that mainly reflect the spatial agglomerations of population or human activities (United Nations, Department of Economic, and Population Division Social Affairs 2018; Hu et al. 2020). Therefore, the credibility and universality of NTL data-derived urban entities, which reflect the spatial agglomeration extents of human activities, should be further explored and verified. Second, although DMSP-OLS (1992-2013) and SNPP-VIIRS (2012-present) have valuable time series, longer-term analyses of urban expansion on a global scale are still lacking due to NTL data source inconsistencies. In particular, global urban mappings for the 21st century are lacking, even though the world has experienced a dramatic process of urbanization. Currently, urbanization monitoring with NTL is limited to medium- and short-term studies, based on either DMSP-OLS data (1992-2013) or SNPP-VIIRS data (2012-present). Only a few studies used consistent NTL data to analyze global or regional urban expansion (Zhao et al. 2020, 2022; He et al. 2019). With the advent of a new generation of long-time series, consistent NTL data (hereinafter referred to as SNPP-VIIRS-like), global urban expansion mapping aimed at identifying urban entities has a unique advantage in revealing the process of global urbanization.
In consideration of the limitations of current urban mappings based on NTL data, urban entity mapping with consistent NTL data needs to be updated alongside the renewed understanding of urbanization and the emergence of effective urban mapping theories and methods. Thus, this study maps and evaluates global urban entities during the period 2000-2020 from the perspective of an updated concept of urban entities based on the consistent "SNPP-VIIRS-like" data generated previously. Rather than adopting widely used threshold-based methods, we proposed a K-means algorithm and a post-processing procedure, which can directly identify urban entities on a pixel scale without a need for additional data, in accordance with the light intensity characteristics of SNPP-VIIRS-like data. The value of this study is to update the understanding of urban entities derived from NTL data, and an equally important goal is to eliminate the need for auxiliary datasets so that urban expansion can be mapped flexibly in different regions and time periods to support urban sustainable development efficiently.
The population data of LandScan were employed to assess the rationality of urban entities derived from NTL data. The data were obtained from Oak Ridge National Laboratory. LandScan data were developed from an ambient population measure with a 1 km spatial resolution by using nocturnal and diurnal population estimates from multiple data sources, such as census and remotely sensed data. LandScan data make it possible not only to assess the ambient population with potential day and night populations but also to identify effectively the potential distribution of population in different built covers from high-resolution satellite data (Rose and Bright 2014). LandScan data have been widely employed to define and validate urban areas in different regions or countries (Henderson, Nigmatulina, and Kriticos 2019).
Two types of physical urban areas, namely, artificial impervious extents (hereinafter referred to as MODIS) collected from the MODIS Land Cover Type Version 6 data product and urban built-up areas (hereinafter referred to as HE) proposed by He et al. (2019), were used to evaluate spatial extents in comparison with urban entities derived from SNPP-VIIRS-like data. As one of the most widely used and analyzed urban products, the MODIS data (500 m) were developed from supervised classifications with a post-processing that incorporated ancillary information and prior knowledge (Sulla-Menashe and Friedl 2018). The HE data (1 km) were produced through a fully convolutional network by using multisource remotely sensed data, with average values of 90.9% and 0.47 for overall accuracy (OA) and kappa, respectively (He et al. 2019), and represent actual urban built-up extents globally over long time series fairly effectively and accurately.
Other ancillary data, such as Terra land surface temperature (LST, 1 km), Terra vegetation index (VI, 500 m), Terra land water (250 m), and Landsat 8 OLI (30 m), were obtained from the Google Earth Engine platform. The 2010 developed land (30 m) reflecting urban areas was acquired from GlobeLand30. Road networks were extracted from OpenStreetMap. Statistical data, including urban population and GDP, were obtained from the United Nations and World Bank, respectively. Administrative boundaries were collected from Natural Earth.

Methods
Due to heterogeneous light brightness in different urban development types, existing threshold methods cannot effectively extract optimal urban boundaries. In addition, the traditional mutation detection method not only has certain requirements for urban forms but also extracts urban boundaries manually and subjectively (Yang et al. 2017). As a classical unsupervised clustering algorithm, the K-means algorithm provides a fast and efficient way for image classifications without requirements for urban forms on a pixel scale (Delmelle 2015). Compared with other methods, the K-means algorithm has less uncertainty in extraction results and effectively solves the non-objectivity problem in previous studies (Feng, Peng, and Wu 2020). Thus, the K-means algorithm was used to map global urban entities on the basis of the time series of the SNPP-VIIRS-like data. First, the K-means algorithm was developed to identify urban and non-urban pixels automatically with an appropriate zoning. Then, a post-processing was conducted to improve the temporal and logical consistency of urban entities during the study period. The details are given in the following sections.

Urban entity extraction based on the K-means algorithm
From urban centers to rural hinterlands, NTL intensity differs markedly, although the intensity in the urban-rural transition zone weakens gradually. Thus, urban entities can be effectively extracted by distinguishing the difference between urban and rural NTL intensities (Xie, Weng, and Fu 2019; Zhou et al. 2018; Noam et al. 2020). We developed a K-means algorithm to identify urban entities automatically by distinguishing the difference in NTL intensity. This algorithm divides a dataset with n objects into k clusters in accordance with distance, which serves as the similarity measurement and evaluation standard. After the attributes of the n objects are comprehensively compared, the output maximizes the similarity of objects within the same cluster. Given its simple clustering process and structure in spherically distributed datasets, the K-means algorithm has been widely used in many fields of Earth science (Delmelle 2015; Feng, Peng, and Wu 2020).
In this study, the NTL images could be segmented into a series of pixel matrices, and the feature types represented by each pixel in an image were obtained through cluster analysis.When evident differences in the observed values of various ground types in a pixel matrix exist, the clustering method can reasonably detect diverse objects.The unsupervised classification algorithm can directly identify feature types on a pixel scale without special requirements for urban forms.Hence, the K-means algorithm was used to segment the SNPP-VIIRS-like data into two categories (k = 2): urban areas and non-urban areas.After the K-means clustering iteration, the difference between urban and non-urban areas was judged on the basis of gray value.
However, extracting urban entities via the K-means algorithm may be limited by "distance." When global NTL data are employed as the input dataset, the light radiation value captured by satellites may be affected by the degree of socioeconomic development in different regions, resulting in an unstable clustering effect without the optimal value. To avoid the "distance" effect, the global NTL data should be divided into several subregions so that urban entities can be extracted carefully and accurately. We tried dividing the world into blocks of different sizes (in degrees) and chose one division as the optimal standard. Taking 1°, 5°, 10°, and 20° as examples, we selected some cities in China for experiments, judged urban entity changes after clustering in each block, and evaluated the effect of image zoning on data clustering accuracy. As shown in Figure S1, we found that when the zoning standards are 5°-10°, total urban pixel changes present the most stable trend. This means that brightness-based clustering at these scales provides stable results. For convenience, we divided the world into blocks of 10° in longitude and latitude and input the partitioned NTL dataset into the classifier.
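The 10° zoning can be sketched as a simple tile lookup. This is an illustrative sketch only, not the authors' implementation: the function names `tile_index` and `group_by_tile`, the origin at 180°W/90°S, and the `(lon, lat, radiance)` pixel tuples are all assumptions introduced here.

```python
import math

def tile_index(lon, lat, size_deg=10):
    """Return (column, row) of the size_deg x size_deg block containing a point.

    Columns count eastward from 180°W and rows count northward from 90°S
    (an assumed convention). Points on the eastern or northern edge of the
    globe are placed in the last block.
    """
    n_cols = int(360 / size_deg)
    n_rows = int(180 / size_deg)
    col = min(int(math.floor((lon + 180.0) / size_deg)), n_cols - 1)
    row = min(int(math.floor((lat + 90.0) / size_deg)), n_rows - 1)
    return col, row

def group_by_tile(pixels, size_deg=10):
    """Group (lon, lat, radiance) pixels by block so each block is clustered separately."""
    tiles = {}
    for lon, lat, value in pixels:
        tiles.setdefault(tile_index(lon, lat, size_deg), []).append(value)
    return tiles
```

Each list in the returned dictionary would then be passed to the classifier on its own, so that bright and dim regions of the world do not share one pair of cluster centers.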
The specific process was as follows. First, we randomly selected two pixels with different attribute values as the centers of the initial categories. Second, the sample pixels were assigned to the most appropriate clusters by judging the similarity between the attribute value of each pixel $x_t$ ($t = 1, \ldots, n$) and the centers of the two categories. Third, in the $j$-th iteration, we successively calculated the Euclidean distances (ED) from all points $x_t$ in the dataset to the centers $\mu_i^{(j)}$. In accordance with the minimum ED, $x_t$ was included in the cluster whose center $\mu_i^{(j)}$ had the smallest distance. The mean of each newly obtained cluster was recalculated and regarded as the subsequent cluster center. Then, the process was repeated until the cluster centers were no longer updated; that is, the criterion function $E$ is convergent.
$$E = \sum_{i=1}^{k} \sum_{x \in C_i} \left\| x - \mu_i^{(j+1)} \right\|^2 \qquad (1)$$
where $E$ is the minimum square error of the clusters $C = \{C_1, C_2, C_3, \ldots, C_k\}$ for samples $x$ from the SNPP-VIIRS-like data. The smaller the $E$ value is, the higher the similarity of the samples in the cluster is. $\mu_i^{(j+1)}$ is the center of cluster $C_i$ after $j + 1$ iterations.
In this study, the sum of squared errors was used as the clustering criterion function.
$$J_C = \sum_{i=1}^{k} \sum_{P \in C_i} \left\| P - M_i \right\|^2 \qquad (2)$$
where $P$ ranges over the pixels in cluster $C_i$, and $M_i$ is the arithmetic mean of all pixels in cluster $C_i$. $J_C$ is a mapping between data objects and cluster centers.
Considering that $J_C$ reflects the error of the clustering results, we need to find the clustering results that minimize $J_C$ as much as possible.
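The iteration described above can be illustrated with a minimal two-class K-means on one-dimensional radiance values. This is a sketch under stated assumptions rather than the authors' code: the names `kmeans_two_class` and `sum_squared_error` are hypothetical, and the initial centers are set to the minimum and maximum values as a deterministic stand-in for the randomly selected initial pixels.

```python
def kmeans_two_class(values, max_iter=100):
    """Cluster 1-D radiance values into two groups; return (labels, centers).

    Label 1 marks the brighter cluster (treated here as urban) and 0 the
    dimmer one. Initial centers are the minimum and maximum values, an
    assumption that replaces the random initialization in the text.
    """
    centers = [float(min(values)), float(max(values))]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins the nearest center.
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous center.
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # centers no longer update, so the criterion has converged
        centers = new_centers
    return labels, centers

def sum_squared_error(values, labels, centers):
    """Sum of squared distances from each value to its assigned cluster center."""
    return sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))
```

For example, the values `[0.5, 1.0, 1.5, 40.0, 50.0, 60.0]` separate into a dim group centered at 1.0 and a bright group centered at 50.0, with a squared-error sum of 200.5.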

Post-processing
To maintain the logical consistency of spatial and temporal changes in urban entities, we post-processed the initial annual urban entities automatically extracted using the K-means algorithm from long-time series SNPP-VIIRS-like data. The post-processing scheme consisted of three main steps.
(1) An iterative temporal filtering procedure was performed on the initial binary urban entities. It follows logically that the urban entities that emerged in the previous year will basically be retained in the following year (He et al. 2014). The temporal inconsistency of annual initial urban entities was corrected through a temporal moving window (Li et al. 2019). By changing labels with low temporal consistency probability, isolated pixels in urban entities of contiguous years could be effectively reduced.
(2) Although misclassification noise could be corrected with temporal filtering, the resulting sequence may be illogical, including alternating urban and non-urban segments. A logical reasoning improvement was employed to check for irreversible conversion between non-urban and urban areas (Li, Gong, and Liang 2015), because in theory, a human activity agglomeration extent cannot disappear suddenly unless irreversible events occur (Henderson, Nigmatulina, and Kriticos 2019; He et al. 2014).
(3) We deleted independent objects smaller than 2 km², which are considered too small to function as urban entities (Hu et al. 2020; Henderson, Nigmatulina, and Kriticos 2019). This size threshold can be refined locally and may be improved in other areas with the addition of spatiotemporal big data (Hu et al. 2020). Through this post-processing, we obtained more reliable global urban dynamics from 2000 to 2020 with less spatial and temporal inconsistency.
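The three post-processing steps can be sketched as follows. This is a simplified illustration, not the procedure used in the study: the majority vote in a three-year window, the rule that a pixel stays urban once it has become urban, the 4-connectivity, and all function names are assumptions made for the example.

```python
def temporal_filter(series, window=3):
    """Majority vote within a centered moving window over a yearly 0/1 series."""
    half = window // 2
    out = []
    for t in range(len(series)):
        chunk = series[max(0, t - half): t + half + 1]
        out.append(1 if sum(chunk) * 2 > len(chunk) else 0)
    return out

def enforce_irreversible(series):
    """Once a pixel becomes urban, keep it urban in all later years."""
    out, seen = [], 0
    for v in series:
        seen = max(seen, v)
        out.append(seen)
    return out

def remove_small_objects(grid, min_pixels):
    """Drop 4-connected urban patches with fewer than min_pixels cells."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    visited = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or visited[r][c]:
                continue
            # Flood fill to collect one connected patch.
            stack, patch = [(r, c)], []
            visited[r][c] = True
            while stack:
                y, x = stack.pop()
                patch.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not visited[ny][nx]):
                        visited[ny][nx] = True
                        stack.append((ny, nx))
            if len(patch) < min_pixels:
                for y, x in patch:
                    out[y][x] = 0
    return out
```

At a nominal 500 m pixel size (0.25 km² per pixel), the 2 km² threshold corresponds to roughly 8 pixels, ignoring the change of pixel area with latitude.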

Rationality assessment of urban entity extracted from SNPP-VIIRS-like data
Urban areas varied greatly in previous studies due to different definitions or objectives (Zhou et al. 2018;Hu et al. 2020).Considering that urban entities extracted from SNPP-VIIRS-like data mainly reflect the spatial agglomeration extents of human activities (hereinafter referred to as urban entities), visual and quantitative evaluations were employed to validate urban entities as follows.First, urban entities were visually compared with LandScan population products and road networks.
The purpose is to emphasize that urban entities are closely related to human activities with both socioeconomic and natural characteristics.The approach also fills in the gap of not being able to directly extract urban entities based on population or other social perception data.Second, comparisons were conducted between urban entities and physical urban areas (e.g.artificial impervious extents or urban built-up areas) to identify their similarities and differences.The purpose is to enable readers to deeply understand how different urban definitions depict global urbanization.Third, considering that urbanization is closely related to socioeconomic development, comparisons were performed between urban entities and socioeconomic parameters (e.g.urban population and GDP).It is intended to highlight the human activity aggregation characteristics that are reflected in urban entities.These different comparisons would guarantee the urban entity reliability.

Comparisons with the LandScan population product and road networks
Before comparisons, we classified the population density within the LandScan population product into four categories: 1-500, 500-1000, 1000-1500, and >1500 person/ km 2 .The reason was that 500, 1000, and 1500 person/ km 2 are often referred to as the optimal thresholds when using population density to define urban entities within different levels of urban development (Henderson, Nigmatulina, and Kriticos 2019).This classification avoided the confusion of urban area identification due to the lack of unified population density threshold globally.That is, extents with a population density greater than 500 person/km 2 can even be considered urban areas in some sparsely populated regions or countries (Henderson, Nigmatulina, and Kriticos 2019).As shown in Figure 1, urban entities can spatially capture the changes in a population density of at least more than 500 person/km 2 .Specifically, through overlaying the extracted urban areas on the LandScan population product, we found that urban entities can basically reveal the spatial distribution of a population density of at least more than 500 person/km 2 within different types of cities (e.g.New York, São Paulo, London, and Shenzhen) in distinct continents.These cities are selected because they are evenly distributed on each continent and are among the largest in these regions.
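The four density classes can be written as a small lookup. This is a sketch only: the function name `density_class` is hypothetical, and assigning boundary values (500, 1000, 1500) to the upper class is an assumption, since the listed ranges share their end points.

```python
def density_class(density):
    """Map population density (person/km^2) to the four comparison classes.

    Returns 0 for cells below 1 person/km^2, otherwise 1 to 4.
    Boundary values go to the upper class (an assumption; the text
    lists ranges whose end points overlap).
    """
    if density < 1:
        return 0
    if density < 500:
        return 1
    if density < 1000:
        return 2
    if density < 1500:
        return 3
    return 4
```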
Owing to the selection of developed metropolises, the population density in urban areas was basically greater than 1500 person/km 2 , while the population density was greater than 500 person/km 2 only in some urban fringe areas.This finding confirmed that urban entities effectively reflect the spatial agglomeration extents of human activities, which is considerably in line with the definition of urban population density.However, we also found that some sporadic areas with a population density of more than 500 person/km 2 or even 1500 person/km 2 around urban agglomeration patches were not defined as urban entities (Figures.1( accordance with the threshold of 1500 person/km 2 and the total population threshold of 50,000 people (Taubenböck et al. 2012).Moreover, some areas with a population density greater than 1500 person/km 2 were identified as urban entities, but they were not human construction land from the Landsat images (Figures.1(e)-(l)).This was due to the fact that urban entities are defined from the perspective of human activity agglomerations.Even if there are not many human-occupied buildings in a certain region, it can be identified as urban entities as long as it meets certain population density and size.
Urban commuting zones, which reflect the human activity scope and employment extent, are also widely defined as urban entities from the perspective of population agglomeration in economics. Thus, road networks were selected as comparison references because roads are the transmission medium of human commuting activities. For visual comparisons, urban entities were overlaid on road networks, taking different cities in China as examples (Figure 2). We found that urban entities highly overlapped with high-density road networks. The denser the roads were, the brighter the lights were (Figures 2(a)-(c)). In addition, different levels of roads could indirectly represent various urban attributes. For example, motorways and primary roads were regarded as "the main arteries" connecting different cities and regions; living streets and other roads were "the capillaries," which were the main mediums of human activities within urban areas. We found that the boundaries of urban entities had a good spatial consistency with living streets and other roads within different cities (Figures 2(d)-(f)). The reason was that the SNPP-VIIRS-like data can indirectly reflect human commuting zones, which is considerably in line with the perspective of human activity "flow," because the data can monitor the lights of cars on the roads concentrated in urban commuter areas (Henderson, Nigmatulina, and Kriticos 2019). These results further indicated that the SNPP-VIIRS-like data can effectively identify entities that are highly in line with urban definitions based on the spatial agglomeration extent of population.

Comparisons with global urban products
Given that artificial impervious extents or urban built-up areas derived from satellite remotely sensed data are widely used as urban extent references, two global urban products, MODIS and HE, were selected for assessment and intercomparison with urban entities. As shown in Figure 3, the total areas of urban entities were basically consistent with the total areas of HE and MODIS worldwide, with R² values no less than 0.73 and relatively low root-mean-square errors of no more than 4389 at the national level. However, we found that most national points were located at the bottom right of the 1:1 line, indicating that the total areas of urban entities were generally lower than those of HE and MODIS globally.
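The two agreement measures reported here can be computed as in the sketch below. It assumes that R² is the squared Pearson correlation between paired national totals and that the root-mean-square error is taken over the paired differences; the text does not state the exact formulas, and the function names are illustrative.

```python
import math

def r_squared(x, y):
    """Squared Pearson correlation, equal to R^2 of a simple linear fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def rmse(x, y):
    """Root-mean-square error between paired values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))
```

Here `x` would hold the national total areas from a reference product and `y` the corresponding totals of urban entities.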
To verify this analysis further, we recalculated the total areas of urban entities, HE, and MODIS at the global and continental levels (Figure 4). The total areas of MODIS were the highest, followed by those of HE and urban entities. The area differences also maintained a relatively stable trend within different levels and regions. This inconsistency can be explained as follows. First, artificial impervious extents (or urban built-up areas) with large size and dim lights (e.g. water, grass, and forest) within large patches were usually excluded from urban entities. Second, artificial impervious areas (or urban built-up areas) with dim lights around urban fringes (e.g. towns or large villages) were possibly excluded from urban entities. Third, some scattered artificial impervious extents (or built-up areas) around urban fringes with bright lights were also excluded from urban entities. Essentially, urban entities were identified from the perspective of the spatial agglomeration extent of human activities, while artificial impervious extents (or urban built-up areas) were delimited from the perspective of urban landscape or physical structure. For example, some water conservancy facilities or motorway intersections around urban fringes could be identified as artificial impervious extents or urban built-up areas, but they would not be recognized as a part of urban entities due to their low human activity agglomeration. This understanding is basically consistent with the study of Zhao et al. (2020), which indicated that NTL-derived urban entities correspond to a range of 20%-45% of artificial impervious extents.
Through overlaying urban entities, HE, and MODIS on the LandScan population product, we found that urban entities showed relative agreement with the LandScan population product at a population density of more than 500 person/km² in the experimental objects of Buenos Aires, Gauteng, and Silesia (Figure 5). However, there were still spatial differences. For HE, some low-density and even non-populated extents within or around patches were identified as urban areas (Figures 5(a), (d), (g)). For MODIS, some high-population-density areas were omitted within patches (Figure 5(b)), and many low-population-density areas were defined as urban areas or patches (Figures 5(e), (h)). By contrast, the urban entities in this study captured areas with a high population density of more than 500 person/km². Spatial comparisons showed that physical urban areas were larger than urban entities, which was attributed to the differences in the attributes of the urban objects they recognized. The results are consistent with Figures 3-4.
Following the approaches suggested by Sutton, Cova, and Elvidge (2006)  continents, and reference urban entities were extracted from the LandScan population product with a referenced population density threshold (≥1500 person/km²) (Harrison 2008; Henderson, Nigmatulina, and Kriticos 2019), because many authorities (e.g. Union Commission and the Organization for Economic Co-operation and Development) usually define urban entities as a group of continuous grid squares in which the population density is greater than or equal to 1500 person/km² in some developed cities. Comparisons showed that the spatial consistency of urban entities was higher than that of HE and MODIS (Table S1). Specifically, the average OA and kappa of urban entities were 74.29%-99.34% and 0.22-0.75, respectively. The average OA and kappa of HE were 76.05%-97.88% and 0.22-0.59, respectively. The average OA and kappa of MODIS were 78.69%-97.85% and 0.21-0.65, respectively. Except for London, the average OA and kappa of urban entities were higher than those of HE and MODIS. Thus, compared with the urban references of the LandScan population product, urban entities can be adopted for delineating urban extents when considering the spatial agglomeration extents of socioeconomic attributes.
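Overall accuracy (OA) and kappa for a binary urban map against a binary reference can be computed as sketched below. The function name and the flat 0/1 label lists are assumptions for illustration; kappa is taken to be Cohen's kappa, which the text does not state explicitly.

```python
def accuracy_and_kappa(predicted, reference):
    """Overall accuracy and Cohen's kappa for two equally long 0/1 label lists."""
    n = len(predicted)
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # urban in both
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # non-urban in both
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # commission
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # omission
    oa = (tp + tn) / n
    # Agreement expected by chance from the marginal totals.
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 0.0
    return oa, kappa
```

Because non-urban pixels usually dominate a scene, OA can be high even when kappa is modest, which is one reason both values are reported side by side.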

Comparisons with socioeconomic statistics
To examine whether urban entities are consistent with urban definitions from the perspective of socioeconomic agglomerations, we further evaluated the relationships between urban entities and statistical data (e.g. urban population and GDP) provided by the United Nations and the World Bank at the national level during the period of 2000-2015. As shown in Figures 6 and S2-S4, all regressions passed the significance test (p < 0.01), and the coefficient of determination (R²) values of urban entities with statistical data were the highest among the regression results. Specifically, the R² values of total areas of urban entities (TA) and urban population density were 0.56-0.71 (Figures 6(a of TH growth-GDP growth (0.38-0.52) and TM growth-GDP growth (0.18-0.45) (Figures S4(e)-(l)). The results showed that urban entities could better reflect socioeconomic development than HE and MODIS.

Evaluations of global urban entity expansion
As shown in Figure 7, we found that the world has presented a rapid and differentiated urban entity expansion since 2000. Globally, urban entities increased from 157,733 km² in 2000 to 470,632 km² in 2020. Urban distributions showed notable differences across longitudinal and latitudinal zones. From the longitudinal perspective, regions 2, 3, and 4 presented peaks with large-scale urban entity distributions in North America, Western Europe, and East Asia (Figure 7(f)). From the latitudinal perspective, region 1 showed a peak with large-scale urban entity distributions in the mid-latitudes of the Northern Hemisphere (Figure 7(g)). These peaks also showed the temporal differences of urban entity expansion. For example, compared with regions 2 and 3, region 4 experienced a larger urban entity expansion in the later period, which corresponded to the results shown in Figures 7(b)-(e). This phenomenon may be attributed to the unprecedented urbanization process in East Asia, especially in China.
Global urban entity expansion also showed significant intercontinental differences.As depicted in Figure 8, Asia, North America, and Europe experienced faster urban entity expansion than other continents over the past two decades.Specifically, urban entities in Asia increased from 42,198 km 2 in 2000 to 181,990 km 2 in 2020.In North America, urban entities increased from 70,105 km 2 in 2000 to 138,669 km 2 in 2020.In Europe, urban entities increased from 23,766 km 2 in 2000 to 68,561 km 2 in 2020.In addition, urban entities increased by 25,678, 30049, and 4022 km 2 in Africa, South America, and Oceania, respectively.Compared with urban entities, per capita urban entities presented a completely different dynamic pattern (Figure 8(b)).Among the continents, Europe experienced the largest per capita urban entities expansion, followed by North America, South America, Asia, Africa, and Oceania.
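The continental figures can be cross-checked against the global total with simple arithmetic. The sketch below uses only the areas quoted in the text; the variable names are illustrative.

```python
# Urban entity area in km^2 (2000, 2020), as quoted in the text.
global_area = (157_733, 470_632)
continents = {
    "Asia": (42_198, 181_990),
    "North America": (70_105, 138_669),
    "Europe": (23_766, 68_561),
}
# For these continents the text reports only the 2000-2020 increase.
reported_increase = {"Africa": 25_678, "South America": 30_049, "Oceania": 4_022}

increase = {name: end - start for name, (start, end) in continents.items()}
increase.update(reported_increase)

# Growth factor (2020 area divided by 2000 area) where both years are given.
growth_factor = {name: end / start for name, (start, end) in continents.items()}

continental_total = sum(increase.values())
global_increase = global_area[1] - global_area[0]
```

The six continental increases sum to 312,900 km², against a global increase of 312,899 km²; the 1 km² difference is consistent with rounding in the reported figures.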
Moreover, urban entity expansion varied among different countries over the past decades.From the spatial pattern perspective, we found that urban entities were mainly distributed in the United States, China, Brazil, India, and Russia, all of which have a vast territory and a large population (Figure 9(a)).Meanwhile, developed countries (e.g. the United States, Italy, Canada, and the United Kingdom) were the top countries with the highest per capita urban entities in 2020 (Figure 9(b)).From the temporal perspective, a notable example was found in China, where urban entities increased from 9438 km 2 in 2000 to 78,545 km 2 in 2020 (Figure 9(c)).Developing countries, such as India, Indonesia, Iran, Turkey, Nigeria, and Mexico, also experienced rapid urban expansion.Furthermore, we found that developed countries, like Italy, the United States, Russia, the United Kingdom, and France, showed high per capita urban entities with a stable growth rate (Figure 9(d)).On the contrary, developing countries, such as Nigeria, Bangladesh, the Philippines, India, Pakistan, Indonesia, and China, presented relatively low per capita urban entities, despite with a relatively rapid growth rate (Figures.9(c)-(d)).Hence, although these developed countries maintained a relatively stable urban expansion, no corresponding urban population growth occurred, which likely led to some urban issues, such as urban sprawl (Gounaridis, Newell, and Goodspeed 2020).By contrast, the rapid urban entity expansion in such developing countries was often accompanied by large-scale population migration from rural areas to urban areas without corresponding rapid growth in per capita urban area, but this would produce a series of urban issues, such as traffic congestion and environmental pollution.

Advantages of SNPP-VIIRS-like data in urban entity mapping
Most people's direct mental images for distinguishing urban and non-urban areas are the differences among high buildings, busy streets, farmland, and flowing rivers. This inherited impression may lead researchers and policymakers to adopt visual physical differences as standards to differentiate urban and non-urban areas (Hu et al. 2020). However, an urban area is a complex system composed of many types of spatial mosaics (Brenner and Schmid 2014). In reality, no clear urban boundary exists. When physical structures are used to identify urban boundaries, we cannot, to some extent, perceive the agglomeration and positive external effects of urban socioeconomic development.
In fact, NTL and urban physical surfaces tell different stories. To understand these differences, NTL, developed land, and Landsat images were compared to examine their gray surface discrepancies, using New York, London, the Pearl River Delta, and São Paulo as examples (Figure 10). Concrete structures and impermeable surfaces showed the same color (e.g. red or purple) in urban and nonrural areas from developed land and Landsat images (Figures 10(a)-(h)), but differences were present in NTL images (Figures 10(i)-(l)). That is, even the same type of land cover surface presented evidently different NTL intensities. These differences were attributed to our diverse ways of viewing urban areas.
Many studies have recognized that other human perceptions or functions can also be considered to understand urban entities (Chao et al. 2021), in addition to the visual physical structures of human-occupied buildings and roads (Ma et al. 2015; Zhou et al. 2018). Physical structure data and socially sensed data (e.g. social media data or location-tagged data) should be integrated to develop a more effective approach to identifying urban entities. Nevertheless, data availability limits the identification of urban entities at large scales and over long time series (Shi et al. 2020). NTL can not only identify the spatial scope of human activities (Shi et al. 2014; Xie, Weng, and Fu 2019) but can also reflect the intensity differences of socioeconomic development (Shi et al. 2014; Chen and Nordhaus 2011), which directly reveal the characteristics of urban sizes, shapes, and functions. However, we should recognize that the traditional DMSP-OLS data cannot effectively and truly show the spatial distribution of human activities due to their spillover effect. Some studies have attempted to extract urban entities based on the DMSP-OLS data and other auxiliary data (Chen et al. 2019; Ma et al. 2014), but their purpose is to identify urban built-up areas without considering socioeconomic information. In addition, although a few studies have recognized that NTL reflects human activities (Zhao et al. 2019; Zhou et al. 2018), the results of surface coverage are still used as evaluation criteria. The main difference between our study and previous NTL-based studies is the perspective of urban area definition. In previous urban mapping studies, regardless of the auxiliary data used, NTL data were actually converted into other types of data with reference to physical urban areas (e.g. artificial impervious extents or urban built-up areas) (He et al. 2019; Xu et al. 2021). Urban entities extracted using NTL data were directly regarded as a kind of physical surface. Nonetheless, the advantage of SNPP-VIIRS-like data is that they can directly reflect the real distribution of human-caused lights, which allows the extents of human activities to be identified effectively with the spillover effect eliminated. The verification results further revealed that urban entities are not only more in line with the spatial extent of population agglomeration but also reflect socioeconomic development more effectively. Urban entities are helpful for analyzing many issues in urban sustainable development, such as urban sprawl, urban shrinking, and urban living environment suitability.

Performance of the K-means algorithm in extracting urban entities
Many classification methods have recently been employed to facilitate the extraction of urban entities by using NTL data. As the most popular method, the threshold-based method has been widely adopted to extract urban entities on different scales (Zhou et al. 2014; Liu et al. 2016). However, due to the influence of socioeconomic and natural factors, optimal thresholds present different trends within various regions (Chen et al. 2021). Moreover, the abrupt change point of the urban boundary is often identified visually, which inevitably requires manual adjustment (Feng, Peng, and Wu 2020). In this study, the K-means algorithm did not require preset thresholds or parameters, and the clustering process was driven automatically by the datasets themselves. Comparisons between urban entities extracted using the reference-based threshold method and our urban entities are shown in Figure 11(a). Although the results of the two datasets exhibited a large area of overlap, which preliminarily confirmed that both approaches have certain reliability, we found that the threshold method omitted many high-population-density areas within large patches (Figures 11(a), (f)), as in the example of the Pearl River Delta, China. Mutation detection has also become a common way of urban entity identification without any additional data.
The K-means clustering process used in this study provides a new way to identify urban entities of human activity agglomeration. Many studies have proven that urban area change can be detected through the dimensional characteristics of multiple data sources, such as VI and LST. Thus, we also attempted to map urban entities by integrating NTL, VI, and LST with the K-means algorithm, but we found significant misidentification when integrating multiple data sources. The identified areas were significantly larger than urban entities and included many low-density or even non-populated areas (Figures 11(b)-(f)). The reason may be that these factors increased the uncertainty in the clustering process. In our study, the advantage of the K-means algorithm was that it extracted urban entities rapidly and effectively by directly using the characteristics of light intensity without any additional data. Given that the basic definitions of urban entity detection and the grouping algorithm are concise and explicit, this method can effectively identify and draw urban areas with different definitions. Because it considers only pixel characteristics, the method can be used for comparative analysis in any spatial range or long time series.
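The clustering idea can be sketched as follows. This is a minimal illustration and not the implementation used in the study: it assumes two clusters, a single feature (NTL brightness per pixel), and initial centers at the minimum and maximum brightness, all of which are simplifying assumptions. The function name and sample values are hypothetical.

```python
def kmeans_two_clusters(values, max_iter=100):
    """Split 1-D brightness values into a dim cluster (0) and a bright cluster (1).

    Plain Lloyd iterations: assign each value to the nearest center, then
    move each center to the mean of its members, until labels stop changing.
    """
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("values must not all be identical")
    centers = [low, high]  # assumed initialization, chosen to be deterministic
    labels = []
    for _ in range(max_iter):
        new_labels = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1 for v in values
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels, centers


# Hypothetical pixel brightness values for one region.
brightness = [0.5, 1.0, 2.0, 1.5, 30.0, 45.0, 38.0, 0.8]
labels, centers = kmeans_two_clusters(brightness)
urban_pixels = [v for v, lab in zip(brightness, labels) if lab == 1]
```

No brightness threshold is supplied anywhere; the split between the two groups emerges from the data, which is the property the text contrasts with threshold-based methods.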

Application: Characteristic analyses of brightness threshold in urban entities
Our results have shown that the K-means algorithm can effectively delineate global urban entity expansion from 2000 to 2020 with different levels of urban development (Figure 7). We also found that the brightness thresholds of urban entities varied over the years globally, which implied that using traditional threshold methods for extracting urban entities is unrealistic. As shown in Figure 12, the global brightness thresholds of urban entities presented significant fluctuations with a peak range between 10 and 40 nW cm⁻² sr⁻¹. The brightness threshold ranges of urban entities also showed different patterns in various continents. Specifically, the peak ranges of brightness thresholds in North America (approximately 20-35 nW cm⁻² sr⁻¹) and South America (approximately 20-40 nW cm⁻² sr⁻¹) were higher than those in Asia, Europe, and Oceania. Owing to the low level of socioeconomic development, no significant peak ranges were observed in Africa. Ten typical countries were considered as examples, and the brightness threshold ranges were further compared between developing and developed countries (Figures 12(b)-(c)). Developing countries, such as Argentina and Brazil, showed relatively high peak ranges, followed by China, India, and Thailand. The reason may be that the urban population in Argentina and Brazil was relatively concentrated in some mega cities, whereas the population in Thailand, India, and China was distributed across multiple cities. For developed countries, a remarkable phenomenon was that the peak ranges of brightness thresholds in the United States were distributed in 20-38 nW cm⁻² sr⁻¹, which may be attributed to the development of continuous metropolitan areas with high population density. The peak ranges of brightness thresholds in France and South Korea were approximately 20-30 nW cm⁻² sr⁻¹, followed by those in the United Kingdom and Japan.
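One way to read a brightness threshold off a clustering result is sketched below. It assumes, purely for illustration, that the implied threshold of a region in a given year is the lowest brightness among pixels labeled as urban; the study may define the threshold differently. The function name and the values are hypothetical.

```python
def implied_threshold(brightness, labels):
    """Lowest brightness among pixels labeled urban (label 1).

    Assumption for illustration only: the implied threshold is taken as the
    dimmest urban pixel in the region for that year.
    """
    urban = [v for v, lab in zip(brightness, labels) if lab == 1]
    if not urban:
        raise ValueError("no urban pixels in this region")
    return min(urban)


# Hypothetical brightness and urban labels for the same region in two years.
by_year = {
    2000: ([1.0, 2.0, 12.0, 25.0], [0, 0, 1, 1]),
    2020: ([1.5, 9.0, 22.0, 40.0], [0, 0, 1, 1]),
}
thresholds = {
    year: implied_threshold(values, labels)
    for year, (values, labels) in by_year.items()
}
```

In this toy case the implied threshold shifts from 12.0 to 22.0 between the two years, which illustrates why a single fixed threshold applied across years or regions would misclassify pixels.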
The different brightness thresholds of urban entities can also indicate that various functions or As illustrated in Figure 13, we can speculate that stronger interconnections existed within New York than those within the Pearl River Delta.

Conclusions
This study has proposed a K-means algorithm and a post-processing step to extract urban entities and evaluate global urban entity expansion (2000-2020) by using SNPP-VIIRS-like data. This approach meets the needs of up-to-date urban mapping and of an updated theoretical understanding of urban areas.
The concepts and methods could effectively and efficiently outline urban entities worldwide in accordance with light intensity without any additional data. On this basis, we first developed the K-means algorithm to identify urban and non-urban pixels automatically in consideration of global region division. Then, we performed a post-processing step to improve the temporal and logical consistency of urban entities during the study period. The accuracy assessment showed that urban entities spatially agreed well with the urban entities derived from the LandScan population product and road networks. This finding confirmed that urban entities effectively reflect the spatial agglomeration extents of human activities, which are considerably in line with the definition of urban population. We also performed comparisons between urban entities and traditional physical urban areas to identify their similarities and differences. The comparisons confirmed that although urban entities are generally consistent with traditional physical urban areas, they delineate urban extents more credibly when the spatial agglomeration extents of human activities are considered.
Correlation comparisons further showed that urban entities could better reflect socioeconomic development than traditional physical urban areas.
The evaluation results show that, in the past 20 years, global urban entities increased from 157,733 km² in 2000 to 470,632 km² in 2020. Asia, North America, and Europe experienced faster urban expansion than other continents. Urban entities were mainly distributed in the United States, China, Brazil, India, and Russia, all of which have a vast territory and a large population. A notable example was found in China, where urban entities increased from 9438 km² in 2000 to 78,545 km² in 2020.
Given that urban entities are complex spatial mosaics, the urban boundary should be drawn in accordance with the specific characteristics of research problems or management applications. The brightness threshold results indicated that the global brightness thresholds of urban entities presented a significant fluctuation trend with a peak range between 10 and 40 nW cm⁻² sr⁻¹. The brightness threshold analysis not only implied that using traditional threshold methods for extracting urban areas is unrealistic but also effectively represented various structures or functions existing in urban interiors and fringes. Our urban mapping is based on human activity as reflected by NTL, which is highly in line with the spatial extent of population agglomeration.
Owing to its spatial and temporal consistency, our study provides a finer-resolution dataset (500 m) for spatiotemporal analysis of urban entity expansion on various scales than the traditional urban areas extracted from DMSP-OLS data (Zhou et al. 2018). However, our study also has some limitations that need to be addressed in the future. For example, the definition of urban entity is still contested and needs clarification and evaluation. Further explanation is needed of how NTL data can be used to extract the physical and socioeconomic mechanisms behind urban entities. We should also consider the effects of satellite transit time, observation geometry, and characteristic production lights, such as those from pitaya planting in Vietnam, on the extraction of urban entities (Li et al. 2019; Dobler et al. 2015), although our results have eliminated these effects as much as possible.

Figure 1. Comparisons of urban entities with the LandScan population product and Landsat images in 2015.

Figure 2. Comparisons of urban entities with road networks in different cities of China in 2015.

Figure 3. Regression comparisons of total areas from urban entities with those from HE and MODIS at the national level from 2000 to 2020. Note: HE represents urban built-up areas proposed by He et al. (2019); MODIS represents artificial impervious extents collected from the data product of MODIS Land Cover Type Version 6. The same abbreviations are used below.

Figure 4. Comparisons of total areas from urban entities with those from MODIS and HE at the global and continental levels.

Figure 5. Spatial comparisons of urban entities, MODIS, and HE with the LandScan population product in 2015.

Figure 6. Correlations of total areas of urban entities, MODIS, and HE with urban population density at the national level.

Figure 7. Global urban entity expansion from 2000 to 2020. Note: (b)-(e) urban entity expansion patterns of sample cities located in different continents; (f)-(g) urban entity expansion by longitude and latitude.

Figure 8. Urban entity expansion at the continental level. Note: (a) urban entities; (b) per capita urban entities.

Figure 10. Spatial visual differences of NTL, developed land, and Landsat images in 2010.

Figure 11. Comparisons among urban entities identified using different approaches. Note: Urban entities identified by (a) the reference-based threshold method, (b) integrating NTL and VI, (c) integrating NTL and LST, and (d) integrating NTL, VI, and LST; (e) and (f) Landsat and LandScan images, respectively. NTL, VI, and LST represent nighttime light data, Terra vegetation index, and Terra land surface temperature, respectively.
Urban entities can be downloaded at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/79CRQJ. The newly generated SNPP-VIIRS-like data are openly available in Harvard Dataverse at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YGIVCD#. Population was extracted from the LandScan datasets. Artificial impervious extents were collected from the data product of MODIS Land Cover Type Version 6, and urban built-up areas were those proposed by He et al. (2019). Terra land surface temperature, Terra vegetation index, Terra land water, and Landsat 8 OLI were obtained from the Google Earth Engine platform. The 2010 developed land was acquired from GlobeLand30. Road networks were extracted from OpenStreetMap. Statistical data, including urban population and GDP, were obtained from the United Nations and World Bank, respectively. Administrative boundaries were collected from Natural Earth.