Exploring Urban Population Forecasting and Spatial Distribution Modeling with Artificial Intelligence Technology

Abstract: High-precision population forecasting and spatial distribution modeling are important for the theory and application of population sociology, city planning and geo-informatics. Two problems must be solved to provide high-precision population information: how to improve the forecasting precision for small areas (e.g., street scale), and how to improve the spatial resolution of urban population distribution models. This contribution proposes a method for each. (1) To improve the precision of small-area population forecasting, a new method is developed based on the fading factor and the sliding window. (2) To improve the spatial resolution of the urban population distribution model, a new method is proposed based on land classification, public facility information and artificial intelligence technology. To validate the proposed methods, real population data of 15 streets in Xicheng district, Beijing, China from 2010 to 2016, remote sensing images and public facility data are collected and used in a number of experiments. The results show that the spatial resolution of the proposed model reaches 30 m × 30 m and that the forecasting error is below 5% when the proposed method is used to forecast the population of the 15 streets in Xicheng district over the next four years.


Introduction
Urban population forecasting and spatial distribution modeling provide important information to local governments, businesses and academics. Inaccurate urban population information can lead to failures in city planning, economic investment and public resource allocation, whereas high-precision population information supports sustainable urban development and more efficient use of public resources. Many scholars have therefore investigated methods for urban population forecasting and spatial distribution modeling [Clark (1951); Wu and Murray (2005); Wilson (2015); Zou, Zhang and Wang (2018)].
In general, there are two kinds of population forecasting models. The first is the demographic model, known as the "golden model", which includes the bi-regional model, the multiregional model, the cohort-component model and the Hamilton-Perry model [Isserman (1993); Smith and Tayman (2003); Renski and Strate (2013)]. Demographic models give high-precision forecasts for large areas (such as a country, a province or a state), where the Mean Absolute Percentage Error (MAPE) can fall below 6% [Wilson (2016)]. However, they are not suitable for small-area forecasting because small areas lack the necessary population statistics, such as birth, death and migration rates. The second is the pure mathematical model, such as the linear model, exponential model, mixed model, grey model and autoregressive model [Armstrong (2001); Baker, Ruan, Alcantara et al. (2008); Deng (2010)]. These models are often used to forecast the population of small areas, such as a district, a block or a street [Chi and Voss (2011)]. However, their forecasting precision is poor, with a MAPE of about 10% [Zou, Zhang and Wang (2018)]. Tab. 1 shows the merits and demerits of the demographic model and the pure mathematical model. Tab. 1 also shows that neither kind of model provides spatial distribution information on the urban population. Yet high-precision spatial distribution information is essential for governments, businesses and individuals making practical policy, planning and investment decisions. Urban population spatial distribution modeling is therefore attracting growing research interest [Vidyattama and Tanton (2010)].
Currently, there are three kinds of urban population spatial models: a) the population density model [Clark (1951); Tanner (1961); Smeed (1961); Anderson (1985)]; b) the spatial interpolation model [Tober (1979); Lam (1983)]; c) the geographical factor model [Harvey (2002); Tian, Chen, Yue et al. (2004); Xu, Mei and Han (1994); Zhuo, Chen, Shi et al. (2005)]. Tab. 2 summarizes the characteristics and applicability of these models. The population density model is easy to use, but its spatial resolution is low. The spatial interpolation model can reach a high spatial resolution if the mesh is dense enough for the numerical approximation to be accurate, but the additional computational burden may not be tolerable. The geographical factor model improves data processing efficiency, but it is very difficult to establish an accurate functional relation between the geographical factors and population density. It is therefore necessary to develop an urban population distribution model that has a high spatial resolution and is easy to use.
In this study, methods for high-precision small-area population forecasting and high spatial resolution urban population distribution modeling are investigated. To improve the precision of small-area population forecasting, a new method is developed based on the fading factor and the sliding window. To improve the spatial resolution of the urban population distribution model, a new method is proposed based on land classification, city public facility information and artificial intelligence technology. To validate the proposed methods, real population data of 15 streets in Xicheng district, Beijing, China from 2010 to 2016, remote sensing images and public facility data are collected and used. The results show that the spatial resolution of the proposed model reaches 30 m × 30 m and that the forecasting error for each street population is below 5%.
In the following, Section 2 introduces the study area, data and methods in this study; Section 3 presents the experimental results and analysis; Section 4 summarizes the main points and contributions of this paper.

Study area
The study area is Xicheng district, Beijing, China. Beijing is the capital of the People's Republic of China and has 16 districts. Xicheng district lies at the center of Beijing and hosts the State Council of China and other important national administrative organizations. Its population density is the highest among the 16 districts, reaching 28,793 people per km² in 2016. This figure counts only the registered population; the unregistered population, which is very large in Xicheng district, is not included, so the real population density is higher than the above value. Note that the administrative area of Xicheng district was adjusted in 2010, when Xuanwu district was merged into it; the study area of this paper is the merged Xicheng district. Fig. 1 shows the population spatial distribution of the 16 districts of Beijing and of the 15 streets of Xicheng district in 2016. Fig. 1(a) shows that the population distribution of Beijing resembles a group of concentric rings, where the population densities of the central areas (I, II) are the largest and those of the outer suburbs (XII, XIII, XIV, XV, XVI) are the smallest. This kind of population spatial distribution was described by Clark [Clark (1951)]. However, that description is only suitable for large areas (e.g., Beijing city). The population spatial distribution of a small area (e.g., Xicheng district), shown in Fig. 1(b), is different and is affected by various factors, such as land type, public facilities and house prices. A high spatial resolution model is therefore needed to describe the real distribution of the urban population.

Data
Three kinds of data are collected and used in this study: a) the population data of 15 streets in Xicheng district, Beijing from 2010 to 2016; b) Landsat remote sensing images of Xicheng district, Beijing in 2016; c) the spatial distribution of public facilities in Xicheng district, Beijing from 2010 to 2016. The specific information on the three kinds of data is listed in Tab. 3.
Six population forecasting models are used, namely the linear model (LIN), the improved exponential model (MEX), the grey model (GM), CSP, the constant model of the population growth rate difference (CGD) [Davis (1995)] and the sharing model of population growth variable (VSG) [Wilson (2015)]. The first three are pure mathematical models, and the latter three are forecasting models with total population constraint information. In addition, a new method based on the fading factor and sliding window is adopted to improve the precision of these models for small-area population forecasting [Zou, Zhang and Wang (2018)]. The models are defined as follows.
① Linear model (LIN), Eq. (1) and (2), where P_i(t) and P_i(t+1) are the populations of the i-th street in the t-th and (t+1)-th year, respectively, and r_i is the average annual population growth rate of the i-th street.
② Improved exponential model (MEX), Eq. (3): when r_i ≥ 0, K_i is five times the population of the i-th street in the t-th year; when r_i < 0, K_i is 1/5 of the population of the i-th street in the t-th year.
③ Grey model (GM), Eq. (4)-(9), where a and b are the calculation coefficients of the model.
④ CSP, Eq. (10) and (11), where S_i is the average share of the population of the i-th street in the total population of the whole district over the past t years.
⑤ Constant model of the population growth rate difference (CGD), Eq. (12) and (13), where r_T(t, t+1) is the annual growth rate of the population of the whole district in the (t+1)-th year, and grd_i is the average difference between the population growth rate of the i-th street and that of the whole district.
⑥ Sharing model of population growth variable (VSG), Eq. (14) and (15), where the preliminary forecast P_i(t+1) is the population of the i-th street in the (t+1)-th year predicted using the linear and exponential mixed model; PF_i(t+1) and NF_i(t+1) are the adjustment coefficients for the population increase of the i-th street in the (t+1)-th year when the population increase of the whole district in the (t+1)-th year is positive and negative, respectively; and m is the number of streets in the district.
⑦ Method based on the fading factor and sliding window: to weaken the influence of historical information and strengthen the role of new information, a new method based on the fading factor and sliding window technology is proposed [Zou, Zhang and Wang (2018)]. It is implemented as follows.
The formulas of LIN and MEX remain Eq. (1) and (3), but the fading factor and sliding time window are introduced when calculating the average annual population growth rate, so Eq. (2) is adapted into Eq. (16) and (17), where w is the number of window movements, f(j) is the fading factor and α is the weight coefficient. Eq. (16) and (17) use the parameter w to keep r_i dynamically updated. Because f(j) and α are introduced, the weight of the historical data is adjusted constantly, which further improves the timeliness of the parameter r_i.
The formulas of GM are basically the same as Eq. (4)-(9), but P_i^(1)(t) is constantly updated using the moving window, and matrix B and matrix L are updated accordingly. A weight matrix W is introduced into Eq. (7), and the fading factor f(j) enters through it as W = diag(f(2), …), where f(j) is calculated as in Eq. (17) with t equal to the dimension of matrix W plus 1. After the fading factor and sliding time window are introduced, GM essentially becomes a weighted, equal-dimension progressive model.
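For orientation, the grey model referred to above can be sketched in its textbook, unweighted GM(1,1) form. This is an illustrative sketch only: it assumes that Eq. (4)-(9) follow the standard GM(1,1) construction (accumulated series, background values, least-squares estimate of a and b), it omits the weight matrix W and the sliding window, and the function name `gm11_forecast` is hypothetical.

```python
import math
from itertools import accumulate


def gm11_forecast(series, steps):
    """Forecast `steps` future values with the textbook GM(1,1) grey model."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")

    x1 = list(accumulate(series))                       # accumulated series
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]  # background values
    y = series[1:]

    # Ordinary least squares for y = slope * z + b, with slope = -a.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    a = -slope

    if abs(a) < 1e-12:
        # Degenerate case: no growth, the model reduces to a constant.
        return [b] * steps

    def x1_hat(k):
        # Time response of the accumulated series (k = 0 is the first year).
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    # Restore forecasts of the original series by differencing.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]
```

For a series growing by about 10% per year, such as `[100.0, 110.0, 121.0, 133.1]` (synthetic values, not data from this study), the first forecast is close to the geometric continuation of the series.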
CSP and CGD are calculated as in Eq. (10)-(13), with the fading factor and sliding time window also introduced, so Eq. (11) and (13) are adapted accordingly, where f(j) is calculated as in Eq. (17). VSG is calculated as in Eq. (14) and (15), except that Eq. (16) and (17) are used to calculate the average annual population growth rate. In summary, the small-area population forecasting method based on the fading factor and sliding time window adds each newly predicted value through the moving window, keeps updating the model parameters, and weights the modeling data using the fading factor. This improves the timeliness of the model parameters and increases the flexibility of the prediction model, so that it adapts better to the rapid and dynamic changes of unstable time series data.
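The interplay between the sliding window and the fading factor can be sketched as follows. The sketch is an illustration under stated assumptions, not the formulation of Eq. (16) and (17): it assumes a geometric fading factor f(j) = α^(n-1-j) over the n growth rates inside the window, a normalized weighted mean for r_i, and the update P(t+1) = P(t)(1 + r_i). The function names are hypothetical.

```python
def faded_growth_rate(pops, alpha=0.5):
    """Weighted mean annual growth rate; recent years receive larger weights.

    Assumption: geometric fading weights alpha**(n-1-j), newest weight = 1.
    """
    rates = [(pops[j + 1] - pops[j]) / pops[j] for j in range(len(pops) - 1)]
    n = len(rates)
    weights = [alpha ** (n - 1 - j) for j in range(n)]
    return sum(w * r for w, r in zip(weights, rates)) / sum(weights)


def sliding_forecast(history, horizon, window=3, alpha=0.5):
    """Forecast `horizon` years, sliding the window over each new prediction."""
    series = list(history)
    forecasts = []
    for _ in range(horizon):
        recent = series[-window:]            # sliding time window
        r = faded_growth_rate(recent, alpha)
        nxt = series[-1] * (1.0 + r)         # assumed update rule
        forecasts.append(nxt)
        series.append(nxt)                   # prediction enters the window
    return forecasts
```

With α = 0.5 and a window of 3 years, the most recent annual growth rate carries twice the weight of the one before it, and each predicted value replaces the oldest observation in the window before the next step.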

Population distribution based on land-use type
The total population of each street can be obtained by the above forecasting method. However, the population is not distributed evenly over the whole street. For example, people cannot live on traffic, green or water land; they live only on construction land [Tayman (1996); Ji, Wang, Zhuang et al. (2014)]. The land-use type of each street must therefore be obtained accurately. To this end, the 30 m × 30 m Landsat remote sensing image of Xicheng district in 2016 is used, and ENVI software is used for image preprocessing and land-use classification. The remote sensing image is processed as follows. Firstly, the image is preprocessed, including radiometric correction, atmospheric correction, geometric correction and contrast stretching. Secondly, the image is clipped to the administrative boundary of Xicheng district. Thirdly, the land of Xicheng district is classified into construction land, green land, water land and traffic land by supervised classification. Fourthly, the construction land image is further checked by visual interpretation to ensure the precision of the construction land classification. Lastly, the construction land is vectorized for the subsequent spatial analysis.
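Once the land-use classes are known, restricting a street's population to its construction land amounts to a simple masked allocation. The following minimal sketch assumes the classified image is available as a grid of class labels per 30 m cell; the label "C" for construction land and the function name are hypothetical, and the even split over construction cells is only the starting point before facility effects are considered.

```python
CONSTRUCTION = "C"  # hypothetical label for construction-land cells


def allocate_to_construction(land_grid, street_population):
    """Spread a street's population evenly over its construction-land cells.

    Cells of any other class (green, water, traffic, ...) receive zero.
    """
    n_cells = sum(row.count(CONSTRUCTION) for row in land_grid)
    if n_cells == 0:
        raise ValueError("street has no construction land")
    share = street_population / n_cells
    return [
        [share if label == CONSTRUCTION else 0.0 for label in row]
        for row in land_grid
    ]
```

Because the same population is spread over fewer cells, the density on construction cells is necessarily higher than the street-wide average density.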

Population distribution based on public facility
Based on the result of land-use classification, the population of each street is distributed on the construction land. However, the population is not equal on every piece of construction land: the population density is larger where public facility conditions are good than where they are poor. Therefore, the spatial locations of public facilities (subway stations, schools, hospitals) in Xicheng district are obtained from the OpenStreetMap digital map and placed on the remote sensing image. Note that the coordinate systems of the remote sensing image and the digital map must be kept consistent. The key problem is then how to simulate the spatial distribution of population from the spatial distribution of public facilities. To solve this problem, a new method of population spatial distribution modeling is proposed based on cellular automata (CA) and the multi-agent system (MAS). The character value of each CA is computed from its distance to the nearest public facility of each type, where V_i(k) is the k-th character value of the i-th CA and k is the type of character value (1 = traffic, 2 = school, 3 = hospital). d_ij(k) is the Euclidean distance between the i-th CA and the j-th public facility, where the j-th facility is the nearest to the i-th CA among the m facilities of the k-th type. n is the number of CA and m is the number of public facilities of the k-th type. Thirdly, the integrated score of each CA is computed by Eq. (24).
In Eq. (24), T_i is the integrated score of the i-th CA and P(k) is the power (weight) of the k-th type of public facility. P(k) can be obtained by the adjustment method based on at least three years of historical population data for each street. Fourthly, the population of each street is divided equally among its CA, with one agent representing one person, and the average score per agent (ASPA) of each CA is calculated by Eq. (25).
In Eq. (25), P_i is the population of the i-th CA. A high ASPA means that public facilities are rich and the population is small on this CA; a low ASPA means that public facilities are poor and the population is large. Fifthly, agents living on CA with a low ASPA move to CA with a high ASPA. The ASPA of each CA is then recalculated, and the process stops when the differences between the new and old ASPA of all CA are less than a threshold, which means that the balance between public resources and population has been reached on all CA. Fig. 2 shows the flow chart of population spatial distribution modeling based on the CA and MAS technology.
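The five steps can be illustrated with a small simulation. The sketch below rests on several assumptions that are not taken from Eq. (23)-(25): the character value is assumed to be V_i(k) = 1 / (1 + d), with d the distance to the nearest facility of type k; the integrated score is assumed to be the P(k)-weighted sum of character values; ASPA is the score divided by the number of agents on the cell; and the loop stops when no agent can raise its ASPA by moving, which is a simplification of the threshold test described above. All names are hypothetical.

```python
import math


def integrated_scores(cells, facilities_by_type, power):
    """Score each cell from its nearest facility of every type (assumed form)."""
    scores = []
    for cell in cells:
        total = 0.0
        for kind, locations in facilities_by_type.items():
            d = min(math.dist(cell, loc) for loc in locations)
            total += power[kind] / (1.0 + d)   # assumed V_i(k) = 1 / (1 + d)
        scores.append(total)
    return scores


def redistribute(scores, population, max_moves=1_000_000):
    """Move agents one at a time from the lowest-ASPA cell to a better cell."""
    n = len(scores)
    pops = [population // n + (1 if i < population % n else 0) for i in range(n)]
    if n < 2:
        return pops
    for _ in range(max_moves):
        occupied = [i for i in range(n) if pops[i] > 0]
        if not occupied:
            break
        # Cell whose agents currently have the lowest average score.
        lo = min(occupied, key=lambda i: scores[i] / pops[i])
        # Best destination, judged by the ASPA after the agent has arrived.
        best = max((j for j in range(n) if j != lo),
                   key=lambda j: scores[j] / (pops[j] + 1))
        if scores[best] / (pops[best] + 1) <= scores[lo] / pops[lo]:
            break  # the worst-off agent cannot improve: balance reached
        pops[lo] -= 1
        pops[best] += 1
    return pops
```

In this sketch a cell with three times the score of its neighbour ends up holding three times as many agents, so that the score per agent is equal across cells, which mirrors the balance between public resources and population that the iteration aims for.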

Population forecasting experiment
To validate the proposed method, the data introduced in Section 2.2 and the methods introduced in Section 2.3 are used. Two experimental schemes are designed and performed in the population forecasting experiment. In scheme 1, the population data of 15 streets in Xicheng district from 2010 to 2012 and the LIN, MEX, GM, CSP, CGD and VSG models are used to forecast the population of each street in the next 4 years. In scheme 2, the basic data and models are the same as in scheme 1, but the fading factor and sliding time window are introduced into the forecasting models. In our experiment, the weight coefficient α of the fading factor is set to 0.5 and the length of the sliding time window is set to 3 years (the basic data length). In the comparison between schemes, AVE_a and AVE_b represent the average forecasting precision of the models in the a-th and b-th experimental schemes. From Fig. 3, it can be seen that (a) the forecasting precision of the latter three population forecasting models (CSP, CGD and VSG), which use total population constraint information, is higher than that of the first three pure mathematical models (LIN, MEX and GM), and VSG has the highest forecasting precision among them (6.32%); (b) the forecasting precision of all six models increases significantly after the fading factor and sliding time window are introduced. Compared with scheme 1, the forecasting precision of scheme 2 is improved by 46.88%. Among these models, the forecasting precision of the optimal model, VSG, reaches 3.51%.
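The two quantities used in this comparison can be computed as in the following sketch. MAPE follows its standard definition; the relative improvement between schemes is assumed here to be (AVE_a - AVE_b) / AVE_a × 100%, which is a plausible reading of the comparison and not a formula taken from the text. The numbers in the usage note are synthetic and are not results of this study.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def improvement(ave_a, ave_b):
    """Relative reduction of the average error from scheme a to scheme b.

    Assumed form: (AVE_a - AVE_b) / AVE_a * 100.
    """
    return 100.0 * (ave_a - ave_b) / ave_a
```

For example, `mape([100, 200], [110, 190])` averages a 10% and a 5% error, and `improvement(10.0, 5.0)` reports that halving the average error corresponds to a 50% improvement.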

Population spatial distribution experiment based on land-use type
To improve the spatial resolution of urban population distribution modeling, the land use of Xicheng district is classified with ENVI software, since people live only on construction land. The 2016 population forecasts from the VSG model of scheme 2 are taken as the basic data. Fig. 4(a) shows the land classification results for Xicheng district and Fig. 4(b) shows the population spatial distribution based on land-use type. Fig. 4(a) shows that construction land has the largest area, because Xicheng district is in the center of Beijing, and that other unclassified land has the smallest area, which means there is little undeveloped land in Xicheng district. Fig. 4(b) provides a higher spatial resolution of the Xicheng district population distribution than Fig. 1(b). The population density in Fig. 4(b) is also larger than in Fig. 1(b), because the population is allocated only to construction land rather than to all types of land. Dashanlan Street (K) has the largest population density in Fig. 1(b), whereas Yuetan Street (E) has the largest in Fig. 4(b). The reason is that Yuetan Street has a larger area of non-construction land than Dashanlan Street (see Fig. 5). This shows that land-use classification is very important for modeling the population spatial distribution accurately.

Population spatial distribution experiment based on public facility
Although land classification improves the spatial resolution of the population distribution, the spatial distribution of urban population is also strongly affected by the distribution of public facilities [Voss (2006)]. Therefore, a new method is developed to simulate the effect of public facilities on the spatial distribution of urban population, as described in Section 2.3.3. Fig. 6(a) shows the spatial distribution of three kinds of public facilities (subway stations, schools and hospitals) in Xicheng district in 2016. To simplify the data processing, only subway stations, key schools and hospitals are considered. Fig. 6(b) shows the population spatial distribution of Xicheng district based on these public facilities, obtained with the CA and MAS technologies. Three conclusions can be drawn from Fig. 6: (1) the spatial resolution of the population distribution improves significantly when the effect of public facilities is considered; within the same construction land, the population is not distributed evenly but is strongly affected by the spatial distribution of public facilities; (2) the population aggregation of Baizhifang Street (M) is very obvious although its population is not large, because its scarce public facilities lead to a highly concentrated population; (3) among the three kinds of public facilities, subway stations show the strongest attraction for the population, which indicates that traffic conditions are a very important factor in residents' decisions about where to live. This suggests that the government can guide the urban population toward a more even distribution through reasonable planning and construction of public facilities.

Conclusions
In this study, two key problems of urban population forecasting and modeling are investigated: population forecasting for small areas (street scale) and high spatial resolution modeling of the urban population spatial distribution.
To improve the precision of small-area population forecasting, a method is proposed based on the fading factor and the sliding window. To improve the resolution of the population spatial distribution model, a method is developed based on artificial intelligence technology. To validate the proposed methods, the population data, remote sensing images and public facility distribution data of Xicheng district, Beijing, China are used and a number of experiments are performed. The main conclusions are as follows. Compared with the six traditional models (LIN, MEX, GM, CSP, CGD and VSG), the proposed method improves the average forecasting precision by 46.88% when forecasting the population of the 15 streets of Xicheng district over the next four years. The VSG model performs best, with a forecasting precision (MAPE) of 3.51%. The spatial resolution of the population distribution can be improved significantly using land classification and public facility distribution information. Subway stations have a stronger effect on the spatial distribution of urban residents than hospitals and schools. However, more factors influencing the urban population spatial distribution should be investigated, and longer time series of population data and public facility distribution data should be used to determine the power P(k) of each type of public facility. In addition, population data for specific residential areas should be collected in future work to validate the precision of the proposed population spatial distribution model.