An attractiveness-based model for human mobility in all spatial ranges

In the past decade, various aspects of human mobility, from individual to population levels in both spatial and time scales, have been studied. However, existing human mobility models still fail to describe activities in all spatial ranges, while in particular, local activities like self-loop fluxes are generally ignored. Moreover, the regional attractiveness, a basic but useful concept in mobility model design, remains difficult to be quantified due to its intricacies. To resolve these two fundamental issues, we introduce a trip competition mechanism to empower a mobility model to estimate population fluxes in all spatial ranges. The model includes attraction scores of regions in concern, obtained via optimizing the working population distributions. Its capability of predicting a variety of mobility patterns is verified by empirical data from three different countries, and its accuracy outperforms those of existing models. The quantified attractiveness is also found to be highly correlated with common socioeconomic indicators, and is able to act as a distinct metric to characterize a region.


Introduction
The insights into the motivation of human movement and the availability of massive demographic data have facilitated the evolution of human mobility modeling. Based on existing models, the statistical human mobility patterns ranged from individual level [1,2] to population level, in diverse spatial [2][3][4] and time scales [5][6][7], have gradually been unraveled. Furthermore, the characteristics of different classes of individuals, for example, the returners and the explorers [8], have also been studied. These findings not only deepen our understanding in human behaviours but also provide carriers to other momentous issues, such as epidemic spreading [9][10][11], traffic engineering [12,13], urban planning [14], just to name a few.
Human mobility is an aggregated outcome of trip selections in a population. As reflected in people's intuition and assumed in most mobility models, distance is one of the critical factors in this decision process. In the seminal gravity model [15], population flux is assumed to be inversely proportional to a function of distance. In opportunities-based models [16][17][18][19], the radius of the circular competition region is specified by the distance between the origin and the destination, while a distance ranking approach is adopted in the rank-based model [20].
Due to their formulation, the above-mentioned models focus on coming-in/going-out activities of an area, while local activities, i.e. self-loop fluxes within an area, are excluded. However, self-loop fluxes actually account for a large proportion of the total population fluxes, and in many scenarios, these local activities should be considered. A typical example is that, for workers' population, those who commute in/out for work, and those live and work in the same area are of equal importance. Indeed, in most census data, the percentages of populations who work in their resident regions are huge.
Another key factor in trip selections is the attractiveness of the destination. The simplest and widely adopted approach for representing attractiveness of an area is based on its population size. This method has been adopted Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. in many mobility models, including the gravity model [15,21], the intervening-opportunities (IO) models [16,17], the radiation model [18], the population-weighted opportunities model [19], etc. In addition to population size, some other factors have been suggested. For example, in random utility models [22,23], a function combining traveling time, traveling distance, and job opportunities of a destination is assigned as the total attractiveness. The models proposed in [2] and [3] assume that the attractiveness is proportional to both population size and the memory effect [24,25], while the number of nearby settlements is taken into account in a rank-based model [20]. Recently, machine learning has also been applied to explore the influence of various urban indicators to commuting flows, such as distance, GDP and unemployment rate [26].
Unfortunately, due to the diversity of motivations and complex relation in between, a clear definition of attractiveness is still in vain. Therefore, we consider that using a single factor, such as the population size, or an arithmetic combination of several factors to represent such an abstract attribute could be inadequate. In contrast, we here take attractiveness as a collective score, according to personal preference and other factors, reflected by the outcome of trip selections. It should be emphasized that an inappropriate approximation of attractiveness would affect the validation of human mobility laws, resulting in a large error in mobility pattern prediction.
In this paper, a new mobility model, called attraction model, is proposed. Comparing with existing models, our attraction model can predict the flux between any pair of subareas, achieving an excellent agreement with the census data. Moreover, the obtained attraction score well characterizes a subarea and also highly correlates with common socioeconomic metrics, providing a comprehensive indicator to show how competitive an area is. Last but not least, our model is superior in data efficiency. As compared to other parameter-based models, such as [2,15,22,23], which usually require a large amount of data to tune their parameters, our attraction model only requires population distribution of the subareas, which is quantitatively same as popular parameterless models, such as the radiation model. Also, thanks to the population optimization process, our model is able to provide a better estimation in all scenarios.

Attraction model of human mobility
Similar to other studies in human mobility, we focus on commuting that governs most of the regular mobility patterns, for which the occurrence time, the quantity of fluxes, the origins and destinations are relatively fixed.
A targeted area is divided into N subareas according to its natural partitioning (for example, a state in the United States is divided into counties). Every subarea i has three attributes: • resident working population pop i r : it refers to the group of persons who live in subarea i and have their jobs in any subarea; • working population pop i w : it refers to the group of persons who work in subarea i but can live in any subarea; • attraction score α i : it reflects the attractiveness of subarea i.
Denoting the population flux from subarea i to j as f i j , , the following assumptions are given: The basic idea of the proposed attraction model is depicted in figure 1. The selection process of a trip is conducted by a competition mechanism, evaluated by a benefit score. Generally, human subjective scores can be regarded as weighted averages of many independent small effects, and thus, according to the Central Limit Theorem, normal distribution is frequently adopted. Therefore, the benefit score is assumed to follow a normal distribution ( ) m s  , with mean μ and variance σ 2 , so as to capsulate the diversity and variations in the decisionmaking process of individuals. It is remaked that, similar ways of modeling have also been used in other human perception related studies, such as quantification of human response [27], emotion modeling [28], neural coding [29], and so on. Also, some further study about using normal distribution is presented in section 4 of the supplementary information.
Concretely, let the random variable S i,j be the benefit score of working at subarea j evaluated by the resident working population living in subarea i, one has where α i is the attraction score of subarea i, d i,j represents the distance between i and j, and β is a distance penalty constant. Here, μ i,j is assumed to be linearly proportional to the distance d i,j and this assumption will be further discussed in section 3.1.
The trip competition mechanism is operated as follows. For each individual m who lives in subarea i, the benefit score s i j m , of working at subarea j is a random extraction from  i j , . He/she would then choose to work at the subarea with the highest benefit score. Therefore, the transition probability of commuting from subarea i to j is: where (·) P is the probability density function. In terms of weak law of large number, the expectation of the flux f i,j can be estimated by:

Parameter identification via population optimization
Most of the parameter-based mobility models estimate their parameters by fitting the model with empirical commuting fluxes data. This method can be regarded as a flux optimization method because its goal is to minimize the errors between the fluxes obtained from the models and the empirical data. On the one hand, a high precision may be achieved via the optimization. On the other hand, it takes much time and efforts to collect such a large amount of data (for N subareas, the number of data is N(N−1)), and it is difficult to ensure their accuracy. Also, it is challenging to tackle with the high-dimensional optimization.
To overcome these drawbacks, we introduce another method that only requires pop i r and pop i w of each subarea (for N subareas, the number of data is 2N). We refer to it as the population optimization method. By matching the predicted and actual population distributions (or census data in practical applications), attraction score of every subarea and a weighted distance term, i.e. α i and β, can be obtained.
We consider the population estimation error, which can be defined as are the attraction scores and β is the distance penalty factor as defined in equation (1). Figure 1. Description of the attraction model. The attraction model is based on a trip competition mechanism. A trip from subarea i to j is evaluated by a benefit score S i,j which is a random extraction from a normal distribution  . The mean of  is determined by the destination's attraction score α j and the distance in between d i,j . When making a decision, an individual evaluates all the possible trips including the self-loop trip and picks the subarea k with By minimizing J pop , a and β are given as The minimization can be easily accomplished by designing a simple greedy algorithm, while the details of the algorithm can be found in supplementary information, algorithm S1.
Similarly, the flux estimation errors can be defined as: However, it should be remarked that the flux information is not required in our model, and J flux is solely for the sake of comparison and discussion.

Overview of the dataset
In this study, the commuting data of three countries, including the United States (US), Canada, and the United Kingdom (UK), are adopted. The commuting data of the United States is the county-to-county working flow from the US Census (2009-2013) [30]. This dataset contains all the states except for the State of Hawaii and Alaska. Each state is defined as a closed area in our study, while the counties of a state are the subareas. The Canada data [31] is from Statistics Canada (2016). It includes all the Provinces and each of them is defined as a targeted area, while the census subdivision regions in the province are regarded as subareas. Data of the Greater London area is from the UK Census (2011) [32,33]. We regard the whole Greater London as an area and extract each borough as a subarea.
For the distance between two subareas, we adopted a practical way and used the 'traveling time by car' as calculated by Google Map.

Model validation
As discussed in section 1, distance is a critical factor in the decision of trip selection. Therefore, a linear relation between the mean of benefit scores μ i,j and the distance d i,j is assumed in (1). This relationship is further investigated by obtaining μ i,j directly from the data (i.e. without using equation of μ i,j in (1)) via minimizing flux estimation error (6) and d i,j from Google Map as described in section 2.3. Firstly, the Pearson correlation coefficients (ρ(μ, d)) for each subarea j are calculated. Then, the coefficients of determination (R 2 ) and mean square errors (MSE) are measured by applying linear regression to μ i,j and d i,j for each j. Figure 2 shows the boxplot of the results based on the data of New Jersey (with 21 counties). The large absolute value of ρ(μ, d) (close to 1), large R 2 and small MSE indicate a high linear correlation between μ and d. As a result, the use of linearity assumption in (1)  and β (see equation (1)) are found by minimizing (5). After obtaining a and β, the predicted population fluxes E( f i,j ) can be easily derived by equation (3). The obtained results are compared with the census data, as shown in figure 3 and referred as Case 1.
To further evaluate its effectiveness, the attraction model is compared with the gravity model [15] and the radiation model [18]. Since the compared models fail to estimate self-loop fluxes E( f i,i ), for fair comparison, all models are targeted to obtain the out-going fluxes f i,j with ¹ i j. The results are given in figure 3, referred as Case 2.
For Case 2, the establishment of attraction model is similar to that for Case 1, except that self-loop fluxes f i,i are excluded from pop i r and pop i w (denoted as  Table 1 compares the commuting/population data  required by each model, while table 2 and table 3 list the overall population estimation error J pop and flux estimation error J flux of each model, respectively. As shown by Case2 in figure 3, there are significant deviations in the predicted fluxes from the gravity model, especially for both high and low fluxes. The radiation model generally achieves better prediction than the gravity model. Except for the Greater London, the radiation model performs well with some slight underestimation of high fluxes. The predicted median values align with the census data, but most of the mean values are overestimated. The deficiency of the radiation model for studying the Greater London is similar to the phenomenon reported in [19], which is due to the small scale prediction.  In contrast, the attraction model achieves more accurate prediction (also see tables 2 and 3). It not only obtains the predicted working populations with negligible deviations from the empirical data, but also achieves a small flux estimation error. As observed, both the predicted mean and median fluxes tightly align with the census data. This thus confirms the effectiveness of the optimization mechanism and also supports the validity of the attraction model. Moreover, as shown in Case 1, it is able to achieve high precision when estimating both the out-going and self-loop fluxes, even though only working population distributions pop i r and pop i w are available. To further compare the prediction accuracy of different models, the Pearson correlation coefficients ρ flux between the predicted and census commuting fluxes for different models are listed in table 4 while the coefficients of determination R 2 flux based on linear regression are tabulated in table 5. The larger coefficients in using attraction model for both Case 1 and Case 2 clearly confirm its superiority.
Another important metric for mobility is the distance probability p dist (d) which stands for the probability of a trip between two subareas with distance d. Figure 4 shows these probabilities of each model, while figure 5 focuses on cases with d=0, i.e. the self-loop fluxes, as a supplementary. Except for some slight underestimation of long-distance fluxes, the p dist (d) of the attraction model matches the census data very well, demonstrating its high robustness on diverse spatial scales. On the contrary, both the gravity model and radiation model have difficulty in predicting short and long-distance trips. In particular for the Greater London area, both compared models fail to follow the trend of the census data, resulting in large prediction errors.
The performance of a population-level model also depends on the discretization resolution, which specifies the size of subareas. Usually, when a population-level model is concerned, the resolution is low and thus the size

2N
Note. N is the number of subareas. Also, distance information d i,j are required for all three models.   Table 4. Comparison of Pearson correlation coefficient (ρ flux ) between the census commuting fluxes and those predicted by different models.    population's collective behavior to individuals' when resolution increases. (Note: the size of subareas becomes smaller.) Moreover, the more subareas there are, the harder the optimization is. For the attraction model, due to the decrease in the proportion of self-loop fluxes (as subarea is smaller), the advantages of the attraction model would become less significant. However, as observed by the results in table S2, even though the prediction accuracy of the attraction model reduces, it is still much better than the other existing models for both Case 1 and 2. More detailed analysis can be referred to section 5 of the supplementary information.

Implications of attraction score
The attraction score α i is a collective estimation of the attractiveness of a subarea. Take the State of New York as an example, figure 6 depicts the choropleth maps of attraction scores and some socioeconomic metrics. To be precise, we focus our discussion on three regions as numbered in the figure.
Region 1 (figure 6(a)) includes the New York City and Long Island. New York City is the most populous city in the US with its global impact in various areas. Long Island owns the largest industrial park in the east coast which provides more than 71 000 job positions. Moreover, it plays a prominent role in scientific research and engineering. As expected, counties in Region 1 gain high attraction scores while the Manhattan is particularly high as it serves as the economic and administrative center of New York City.
Region 2 is mainly covered by the Mohawk Valley which is a suburban and rural area. It is low in per capita GDP, in median household income, and in population size. Not to our surprise, these subareas have poor attraction scores.
Region 3 is the Western New York. The loss of traditional jobs in manufacturing has led to a decline in economic. Thus, even though it is still a populous area, it suffers from a drop of population. The median household income is relatively low around the year 2010. According to these observations, low attraction scores might be predicted. However, it is the opposite as observed in figure 6(a). The fact is that, its role has been transferred from traditional manufacturing to light manufacturing, high technology, and services. Together with the help of the government's policies (e.g. the casino business in the late 1990s and the investment of a billion dollars under 'Buffalo Billion' project, etc), the economy in this region has been improving significantly since early 2010s and there are many job opportunities. Therefore, the counties in this region gain high attraction scores during the year 2009 and 2013. (Note: data used for the attraction score quantification is from this 5 year survey.) Based on the very similar patterns of the four choropleth maps in figure 6, a good correlation between the attraction scores and socioeconomic metrics is expected. To quantify the observation, their Pearson correlations are given in figure 7. The population size of an area does highly correlate with its attraction score but not high enough to be regarded as a linear correlation. In other words, a mobility model may not be precise enough if we simply take the population size to represent the attractiveness. The per capita gross domestic product (GDP) also contributes a lot to the attraction score. It even has a higher correlation than that of the population size and similar observation has also been reported in [26]. On the other hand, the median household income shows a relatively low correlation with the attraction score.

Discussion and conclusion
Attractiveness of destinations and distance are two key elements in the formulation of human mobility models. Nowadays, distance information can be easily (and accurately) obtained, e.g. by the use of electronic maps, but it is still hard to quantify attractiveness, in spite of many attempts. In addition, existing mobility models did not address issues of self-loop flux prediction, which is of great importance. From both theoretical and practical points of view, a model reflecting the true physical mechanism behind human mobility should predict fluxes in any spatial range without exceptions.
The above two challenges are resolved by two key mechanisms in the attraction model, namely the trip competition mechanism and the population-based attractiveness quantification method.
In the trip competition mechanism, all possible trips within the studied area would participate in the competition. The competitiveness of a trip is a normal distribution, with its mean determined by a linear combination of the attractiveness of the destination and the distance of the trip. As a result, self-loop flux can be compared in a fair manner. The parameters of the attraction model, i.e. the attraction scores and the weighting factors of distance, are obtained through optimization, minimizing the estimation error of the working populations. Therefore, the inputs of the attraction model are only the distance and the working population distributions of the subareas, in contrast to other parameter-based models, for which costly empirical commuting data is required. Our model has been verified with census data from three different countries through a comparison with the classical gravity model and the radiation model.
Validation of the attraction scores is twofold. Firstly, the commuting fluxes are predicted based on our model after optimization. As shown in figure 3, the predicted fluxes well match with the empirical fluxes. Secondly, the relation between the attraction score and common socioeconomic metrics has been studied. The obtained attraction scores are found to be positively correlated with several socioeconomic measurements including population, per capita GDP and the median household income [34]. These metrics are chosen as they intuitively reflect how attractive an area is, while the results conform with our intuition. Furthermore, the correlation with the population size reflects the limitation of using population size as an approximation of attractiveness in other models, which is also confirmed by the fair pattern estimation results of these models given in section 3.
Similar to other specific attractiveness such as the tourism attractiveness [35][36][37][38] and the investment attractiveness [39,40], the usage of our attraction score is not limited to mobility prediction. It also serves as a monitoring indicator which properly reflects the healthiness of a regional labor market. As demonstrated in the case of Western New York, it is able to reveal some potential crisis or opportunities of an area, which would be useful for decision making and socioeconomic analyses. For individuals and enterprises, the quantified attraction score suggests a simple and unified way to assess an area. As reported in [41], creative workers would rather be attracted by the overall attractiveness of a city rather than work-related motivations. For governments, the correlations between attraction score and different socioeconomic metrics help to identify areas for improvements, and are also useful for verifying the effectiveness of the governments' policies.

Data accessibility
Data and code used in this paper can be downloaded from https://hippie-caltsby.github.io/attractiondownload.html.