Escaping from pollution: the effect of air quality on inter-city population mobility in China

China faces severe air pollution issues due to the rapid growth of the economy, causing concerns for human physical and mental health as well as behavioral changes. Such adverse impacts can be mediated by individual avoidance behaviors such as traveling from polluted cities to cleaner ones. This study uses smartphone-based location data and instrumental variable regression to examine how air quality affects population mobility. Our results confirm that air quality does affect the population outflows of cities. An increase of 100 points in the air quality index will cause a 49.60% increase in population outflow, and a rise of 1 μg m−3 in PM2.5 may cause a 0.47% rise in population outflow. Air pollution incidents can drive people to leave their cities 3 days or a week later by railway or road. The effect is heterogeneous among workdays, weekends and holidays. Our results imply that air quality management can be critical for urban tourism and environmental competitiveness.


Introduction
Air pollution in China has resulted in severe health and economic losses. In 2015, particulate matter became the fifth largest contributor to disability-adjusted life-years in China (Forouzanfar et al 2016), and the observed air pollution contributes to nearly 1.6 million (∼17%) deaths each year in China (Rohde and Muller 2015). If there is no new technology to control air pollutants, China may experience a 2.00% loss of GDP and health expenditure of USD 25.2 billion from PM2.5 pollution by 2030 (Xie et al 2016). Although the hospitalization cost is a large part of the economic loss from air pollution, the decline in productivity caused by air pollution also decreases manufacturing output (Fu et al 2017) and the GDP per capita (Hao et al 2018). Air pollution also influences people's emotions and mental health. People tend to express less happiness on social media (Zheng et al 2019) and their dining-out satisfaction tends to decrease (Zheng et al 2016) when air quality is poor.
While national and local governments are making great efforts to mitigate pollution emissions, individuals have been adopting avoidance behaviors to lower the adverse health impacts of air pollution. Compared with pollution abatement policies, which can take a long time to improve air quality, avoidance behavior has the advantages of low cost and effectiveness in relieving the adverse health impacts. Studies have found that air pollution can significantly shorten the time people spend outdoors (Bresnahan et al 1997, Laumbach et al 2015). Cycling behavior reduces by 14%-35% in air-polluted weather (Saberian et al 2017), and people tend to shift their travel time during the day because of health concerns (Welch et al 2005). With air quality alerts, attendance at places of entertainment, such as zoos or observatories, decreases significantly (Zivin and Neidell 2009), and dining-out frequency also decreases to avoid exposure (Zheng et al 2016). Air pollution also affects the choice of destination when conducting such activities. The mechanism is mainly attributed to health concerns, i.e. people are apt to follow expert advice to stay indoors when the air is polluted (Laumbach et al 2015). As a result, air quality has become a factor influencing travel behavior, and studies have found that the attraction of traveling to places is reduced when the air is more polluted (Mihalič 2000), as social and recreational purposes tend to be major reasons for traveling (Zavattero et al 1998).
In this sense, air quality can affect population migration. Current research focuses on the long-term effect. Studies have found that clean air has an attraction for city residents and visitors, while cities with poor air quality are less competitive and stimulate people's intention to emigrate (Qin and Zhu 2018). Bayer et al (2009) used a hedonic method to estimate the effect of clean air as an amenity, and found that air quality could influence long-term migration. At the sub-national level, degradation of air quality from 1996 to 2010 has significantly decreased migration inflows in Chinese counties (Chen et al 2017). However, few studies have discussed the impact of air quality on short-term population mobility, especially in developing countries such as China where air pollution tends to be more severe.
Knowledge on this topic would benefit the environmental and transportation management of cities, specifically by providing an accurate estimation of the health loss attenuated by avoidance behavior as well as the economic loss to travel and tourism due to air pollution. It would also have implications for policy making in attracting talent and immigration through desirable environmental amenities. This paper examines the impact of air quality on short-term mobility using a nationwide smartphone-based city-level outflow mobility network and daily air quality data for cities. Both datasets are daily and city-specific, covering a period of over 3 months and over 300 cities in China. We use instrumental variable (IV) regressions to analyze the impact of air quality on population mobility, with a discussion of heterogeneous effects. We find that people do escape from cities with high air pollution and that the effects of air quality on mobility differ over time.

Methodology
We first adopt a fixed-effects regression model to investigate the effect of air quality on population mobility across cities. Using the generalized least squares (GLS) strategy, the model is specified as
$$Mob_{it} = \beta_1 AQ_{it} + \gamma X_{it} + \alpha_i + \lambda_t + \varepsilon_{it} \qquad (1)$$
where $Mob_{it}$ is the indicator of population mobility of city $i$ on day $t$. Due to data availability, our data only cover the major outflows. Therefore, in this study we only focus on outflows, i.e. the population that flows out of a city. $AQ_{it}$ denotes the air quality variables (introduced in detail in section 3). $X_{it}$ is a vector of control variables, with coefficient vector $\gamma$, that consists of climate factors including temperature, precipitation, wind speed and relative humidity. The square of temperature is also controlled for, following studies on the effects of temperature on human activities (Zheng et al 2019). $\alpha_i$ and $\lambda_t$ are city and date fixed effects, respectively. $\varepsilon_{it}$ is the error term. We are interested in $\beta_1$. Since the air quality index (AQI) describes how bad the air is, this coefficient is expected to be positive and significant for population outflow, showing that residents tend to avoid the more polluted areas.
This naïve GLS is likely to suffer from endogeneity issues due to omitted variables. Air quality and the population flows out of cities are influenced simultaneously by macro as well as localized socioeconomic conditions. For instance, a booming economy may stimulate production in polluting industries and lead to severe pollution, while attracting population flows to specific cities for business activities. As it is difficult to capture the effect of these unobservables with available data, the independent variables in equation (1) are likely to be correlated with $\varepsilon_{it}$, which causes attenuation bias that can underestimate the effect of air pollution. To deal with this issue, we introduced thermal inversion for an instrumental estimation. Normally, air temperature falls with height in the troposphere. A thermal inversion is a deviation from the normal change of air temperature with altitude, i.e. warmer air is held above cooler air, which traps air pollution close to the ground (Katsoulis 1988). While correlated with air pollution, a thermal inversion is unlikely to affect socio-economic activities and thus the error term in equation (2); it can therefore be a rational instrumental variable. Its validity has been verified in studies estimating the effects of air pollution on human health (Arceo et al 2016) and company productivity (Fu et al 2017). We thus conduct the first stage of the IV estimation using
$$AQ_{it} = \theta_1 TI_{it} + \delta X_{it} + \alpha_i + \lambda_t + \mu_{it} \qquad (2)$$
Here $TI_{it}$ is an indicator of thermal inversion in city $i$ on day $t$, with coefficient $\theta_1$; $\delta$ is the coefficient vector of the controls; $\mu_{it}$ is the error term for the first stage; and other variables are the same as in equation (1).
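The logic of the instrumental estimation can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the data-generating process, the true coefficient of 0.5, and the helper names `two_way_demean`, `slope` and `simulate` are hypothetical and are not the authors' code or data. An unobserved factor raises pollution and lowers outflow, so the naive slope is biased downwards, while the two-stage estimate based on the inversion count recovers the true value.

```python
import random
from statistics import fmean


def two_way_demean(panel):
    """Remove city and date fixed effects from a balanced panel.

    panel[i][t] is the value for city i on day t; the within transformation
    is x_it - city_mean_i - day_mean_t + grand_mean. Returns a flat list.
    """
    n_city, n_day = len(panel), len(panel[0])
    city_mean = [fmean(row) for row in panel]
    day_mean = [fmean(panel[i][t] for i in range(n_city)) for t in range(n_day)]
    grand = fmean(city_mean)
    return [panel[i][t] - city_mean[i] - day_mean[t] + grand
            for i in range(n_city) for t in range(n_day)]


def slope(x, y):
    """Least-squares slope of y on x for series that are already demeaned."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def simulate(n_city=100, n_day=100, beta1=0.5, seed=0):
    """Synthetic panel (illustrative only): inversion count, AQI, outflow."""
    rng = random.Random(seed)
    day_fe = [rng.gauss(0, 1) for _ in range(n_day)]
    ti, aq, mob = [], [], []
    for _ in range(n_city):
        city_fe = rng.gauss(0, 1)
        ti_row, aq_row, mob_row = [], [], []
        for t in range(n_day):
            z = rng.randint(0, 4)   # inversions among four 6-hourly records
            u = rng.gauss(0, 1)     # unobserved economic activity
            a = 10 * z + 5 * u + city_fe + day_fe[t] + rng.gauss(0, 1)
            m = beta1 * a - 4 * u + city_fe + day_fe[t] + rng.gauss(0, 1)
            ti_row.append(z)
            aq_row.append(a)
            mob_row.append(m)
        ti.append(ti_row)
        aq.append(aq_row)
        mob.append(mob_row)
    return ti, aq, mob


ti, aq, mob = (two_way_demean(p) for p in simulate())

naive = slope(aq, mob)                   # biased by the unobserved factor
theta1 = slope(ti, aq)                   # first stage: AQ on inversion count
aq_hat = [theta1 * z for z in ti]        # fitted air quality
iv = slope(aq_hat, mob)                  # second stage: outflow on fitted AQ

print(f"naive estimate: {naive:.3f}, IV estimate: {iv:.3f}")
```

The sketch returns point estimates only; standard errors from a manually run second stage are not valid, so a dedicated IV routine would be needed for inference.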
To determine whether the air quality on one day has an impact on mobility in future days, we test the relation between the outflow and the AQI of previous days:
$$Mob_{it} = \beta_1 AQ_{it} + \beta_2 AQ_{i,t-1} + \cdots + \beta_{n+1} AQ_{i,t-n} + \gamma X_{it} + \alpha_i + \lambda_t + \varepsilon_{it} \qquad (3)$$
$AQ_{i,t-n}$ is the air quality of city $i$ $n$ days before the mobility observation. Other variables are the same as in equation (1). By testing the coefficients $\beta_1, \ldots, \beta_{n+1}$ we see whether the air quality of former days influences mobility later.
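Building the lagged regressors for one city can be sketched as follows. This is a minimal illustration; the function name `lagged_rows` and the AQI values are hypothetical, not taken from the study's data.

```python
def lagged_rows(aqi, max_lag):
    """For each day t >= max_lag, return [AQ_t, AQ_{t-1}, ..., AQ_{t-max_lag}].

    `aqi` is the daily AQI series of a single city in chronological order.
    The first `max_lag` days are dropped because their lags are unavailable.
    """
    return [[aqi[t - k] for k in range(max_lag + 1)]
            for t in range(max_lag, len(aqi))]


# Illustrative (made-up) daily AQI values for one city
series = [50, 80, 120, 60, 40]
rows = lagged_rows(series, max_lag=2)
for row in rows:
    print(row)
```

Each row then supplies the regressors for one city-day observation; with a 7-day window, the first 7 days of each city's series cannot be used.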
The equations above test how air quality in absolute form may affect city mobility outflows. Nevertheless, the relative difference between two cities may also play a role. Residents can be attracted to places with better air quality than their place of residence even if the air at their destination is still polluted. To test whether this mechanism plays a role, we test how the population flow between each pair of origins and destinations is affected by the air quality of both places using a gravity model, as follows:
$$F_{ijt} = \beta_1 AQ_{it} + \beta_2 AQ_{jt} + \gamma_1 X_{it} + \gamma_2 X_{jt} + \alpha_i + \lambda_t + \varepsilon_{ijt} \qquad (4)$$
where $F_{ijt}$ is the population flow of the city pair (city $i$ to city $j$) on day $t$. $AQ_{it}$ denotes the air quality of the source city $i$ of the city pair $i, j$ on day $t$, and $AQ_{jt}$ denotes the air quality of the target city $j$ of the city pair $i, j$ on day $t$. Similar to equation (1), $X_{it}$ and $X_{jt}$ are vectors of control variables of city $i$ and city $j$, with coefficient vectors $\gamma_1$ and $\gamma_2$, respectively. Other variables are the same as in equation (1). Here we are interested in $\beta_1$ and $\beta_2$, which are expected to be positive and negative, respectively, since the relatively better air quality of the target city ($j$) tends to attract more population flow.

Population mobility
The population mobility data come from the Tencent Location Big Data Platform (https://heat.qq.com/), which covers the daily migration flows across cities sourced from the location service of mobile applications from Tencent Enterprise (the most popular application, WeChat, has over 1 billion users) on individual smartphones. For each city, the short-term mobility, i.e. the top 10 outbound population flows (measured by a standardized indicator based on mobility times, transportation and distances; provided directly by the platform) into other cities by car, plane and train, is recorded instantly (since cross-city commuting is rare in China, commuters are barely covered). As a result, some cities with small inflows are not included in any city's top 10 target cities. In other words, the current data do not give a complete picture of inflows. Therefore, we sum up all the population flows out of each city as a measurement of the major outflows and do not discuss the impact of air quality on the inflows. The outflows may miss flows in less popular directions, and the age composition of the data is not representative due to its smartphone-based source (because of lower usage by children and older residents). For example, for the most popular application, WeChat, people older than 55 years accounted for 5.82% of users in 2017 (https://support.weixin.qq.com/cgi-bin/mmsupport-bin/getopendays), while those over 65 accounted for 11.7% of the population (National Bureau of Statistics of China 2018). Nevertheless, considering the lower mobility of the young and the old, this study can still provide conservative quantifications of how inter-city mobility flows are affected. Our sample contains records of 318 cities on 102 days (from 1 March 2018 to 10 June 2018).
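The aggregation of the top-10 outbound flows into a single daily outflow per city can be sketched as below. The record layout, the function name `total_outflows` and the flow values are assumptions for illustration; the platform's actual data format is not described in the text.

```python
from collections import defaultdict


def total_outflows(records):
    """Sum the recorded outbound flows over destinations.

    `records` is an iterable of (origin, destination, day, flow) tuples,
    one per recorded top-10 outbound flow. Returns {(origin, day): total}.
    """
    totals = defaultdict(float)
    for origin, _destination, day, flow in records:
        totals[(origin, day)] += flow
    return dict(totals)


# Made-up flow indicators, for illustration only
records = [
    ("Beijing", "Tianjin", "2018-03-01", 12.5),
    ("Beijing", "Shanghai", "2018-03-01", 7.5),
    ("Shanghai", "Hangzhou", "2018-03-01", 9.0),
    ("Beijing", "Tianjin", "2018-03-02", 11.0),
]
outflow = total_outflows(records)
print(outflow[("Beijing", "2018-03-01")])
```

Because only each origin's top 10 destinations are recorded, the same records cannot be summed by destination to obtain complete inflows, which is why the analysis is restricted to outflows.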

Air quality
The AQI is the most commonly accessible index of air quality for city residents in China, and the reference on which they base their decisions. Previous research shows that the forecast AQI is of high accuracy (Song et al 2019), and we regard the actual AQI as the one that people receive in forecast alerts. The air quality data we used were retrieved from the website of the Ministry of Ecology and Environment of the People's Republic of China (http://mee.gov.cn/). The data include hourly concentrations of six major air pollutants, SO2, NO2, CO, O3, PM10 and PM2.5, which are monitored by local stations. Based on these concentrations, the AQI is calculated as a comprehensive indicator denoting the level of air pollution and consequential health risks. The daily AQI data for each city are calculated as
$$IAQI_P = \frac{IAQI_{Hi} - IAQI_{Lo}}{BP_{Hi} - BP_{Lo}} \left(C_P - BP_{Lo}\right) + IAQI_{Lo}$$
where $IAQI_P$ is the individual air quality index (IAQI) of pollutant $P$, which is calculated by an interpolation-like process based on the concentration of $P$ ($C_P$), the upper and lower concentration bounds of the pollutant $P$ ($BP_{Hi}$ and $BP_{Lo}$, respectively) and the lower and upper limits of IAQI ($IAQI_{Lo}$ and $IAQI_{Hi}$, respectively) corresponding to the range $BP_{Lo}$ to $BP_{Hi}$ (China Ministry of Environment Protection 2016). The $C_P$ values include daily average concentrations of SO2, NO2, CO, PM10 and PM2.5, the maximum hourly average concentration of O3 and the maximum 8-h moving average concentration of O3. Then, the AQI equals the maximum of the IAQIs of $n$ (here, $n=7$) types of pollutant:
$$AQI = \max\{IAQI_1, IAQI_2, \ldots, IAQI_n\}$$
The AQI score ranges from 0 to 500, with a higher value indicating a more serious pollution level. The AQI is divided into six classes to indicate the significance of health effects according to a series of cutoff points (China Ministry of Environment Protection 2016).
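The interpolation and maximum steps can be sketched in a few lines. This is a minimal illustration covering only PM2.5 and PM10; the breakpoint values are assumed here from the 24-h columns of the Chinese AQI technical regulation and should be checked against the official table, and capping the index at 500 above the top breakpoint is likewise an assumption of this sketch.

```python
# IAQI levels shared by all pollutants
IAQI_LEVELS = [0, 50, 100, 150, 200, 300, 400, 500]

# 24-h average concentration breakpoints in ug/m3 (assumed; verify against
# the official regulation before any real use)
BREAKPOINTS = {
    "PM2.5": [0, 35, 75, 115, 150, 250, 350, 500],
    "PM10": [0, 50, 150, 250, 350, 420, 500, 600],
}


def iaqi(pollutant, conc):
    """Linear interpolation of the individual index between breakpoints."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    bps = BREAKPOINTS[pollutant]
    if conc >= bps[-1]:
        return float(IAQI_LEVELS[-1])  # assumption: cap at the top level
    for k in range(len(bps) - 1):
        bp_lo, bp_hi = bps[k], bps[k + 1]
        if bp_lo <= conc <= bp_hi:
            i_lo, i_hi = IAQI_LEVELS[k], IAQI_LEVELS[k + 1]
            return (i_hi - i_lo) * (conc - bp_lo) / (bp_hi - bp_lo) + i_lo


def aqi(concentrations):
    """AQI is the maximum individual index across the supplied pollutants."""
    return max(iaqi(p, c) for p, c in concentrations.items())


# Illustrative daily averages: PM2.5 = 55 ug/m3, PM10 = 80 ug/m3
example = aqi({"PM2.5": 55, "PM10": 80})
print(example)  # PM2.5 gives 75.0, PM10 gives 65.0, so the AQI is 75.0
```

In this example PM2.5 is the dominant pollutant, because its individual index is the larger of the two.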

Thermal inversion
We used air temperature data from the National Centers for Environmental Prediction (NCEP) Final (FNL) Operational Model Global Tropospheric Analyses (2000) to identify the thermal inversions. This dataset records the temperature at different heights in the atmosphere at specific pressure levels. The data are recorded every 6 h on a 1° by 1° grid. We identified thermal inversions by differencing the temperature at 1000 hPa and 975 hPa (i.e. at altitudes of 110 m and 230 m; Fu et al 2017), the layers in the troposphere that are closest to the ground and most influential on the dispersal of air pollution. The grid data were then matched with city locations. We counted the number of the four daily records of a city that show a thermal inversion and used this count as our instrumental variable.
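The daily inversion count can be sketched as follows. The function name `count_inversions` and the temperatures are hypothetical, and treating any strictly positive temperature difference between the upper and lower layer as an inversion is an assumption of this sketch; the text does not state a threshold.

```python
def count_inversions(temp_1000hpa, temp_975hpa):
    """Count the 6-hourly records in which the 975 hPa layer (higher up)
    is warmer than the 1000 hPa layer (closer to the ground)."""
    if len(temp_1000hpa) != len(temp_975hpa):
        raise ValueError("both layers need the same number of records")
    return sum(1 for lower, upper in zip(temp_1000hpa, temp_975hpa)
               if upper > lower)


# Made-up temperatures in kelvin for the four daily records of one city
lower_layer = [285.0, 284.2, 288.1, 286.0]
upper_layer = [285.6, 284.9, 287.0, 285.1]
daily_count = count_inversions(lower_layer, upper_layer)
print(daily_count)  # the first two records are inversions, so this prints 2
```

With four records per day the instrument takes integer values from 0 to 4.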

Weather conditions
Temperature, precipitation, wind speed and relative humidity data were obtained from the China Meteorological Station Data Sharing Service System (http://cdc.cma.gov.cn/home.do). This dataset provides daily meteorological records from ground climate stations covering the country. Every city has at least one station within its geographical area, and the average of the station data was used in this work.
The panel data used in the regression models passed the stationarity test required for a robust analysis. Descriptive statistics of the data used are shown in table 1.

Results and discussion
4.1. The effect of AQI on the outflows
City residents in China tend to avoid being exposed to pollution by leaving cities with low air quality. The AQI shows a positive impact on outflow in the naïve GLS regressions (table 2, model (1)). The endogeneity issue between air pollution and population mobility is due to two effects. On the one hand, people crowding into a city may cause pollution because of increasing activity such as transportation, and people leaving a city may reduce pollution. On the other hand, highly polluted air may lead people to flee the city. Therefore, population mobility and air pollution have reciprocal causation. Thus, we use the count of thermal inversions as the instrumental variable to solve this problem. As shown in model (3) in table 2, the count of thermal inversions is a strong IV for addressing the endogeneity between air pollution and population outflow, and IV estimation resolves the problems that the GLS model cannot. The IV estimations (with the count of thermal inversions as an effective IV) show that a higher AQI, which means poorer air quality, prompts more residents to leave (table 2, model (4)). The results show that an increase of AQI by 100 points causes a 49.60% rise in the outflow. Beyond the air quality, temperature also matters in determining traveling behavior, and presents an inverted-U relation with an optimum temperature for population outflow (around 26°C). Days that are either very hot or very cold affect residents' behaviors. Other weather conditions show an insignificant impact on residents' outflow behavior, indicating that air pollution is even more powerful in driving people out than bad weather.
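The optimum temperature follows from the linear and squared temperature terms among the controls. As a short worked step, with $\gamma_T$ and $\gamma_{T^2}$ as assumed names for the coefficients on temperature $T_{it}$ and its square (the text does not name them), setting the marginal effect of temperature to zero gives the turning point:

```latex
\frac{\partial Mob_{it}}{\partial T_{it}}
  = \gamma_T + 2\,\gamma_{T^2}\,T_{it} = 0
\quad\Longrightarrow\quad
T^{*} = -\frac{\gamma_T}{2\,\gamma_{T^2}},
\qquad \gamma_{T^2} < 0 .
```

An inverted-U shape requires $\gamma_{T^2} < 0$, and a positive optimum additionally requires $\gamma_T > 0$; the optimum of around 26°C corresponds to $T^{*}$.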

The effect of PM2.5
As the air pollutant that attracts most attention, PM2.5 and its effect have been widely studied (Xie et al 2016, Hao et al 2018, Maji et al 2018, Zheng et al 2019). Here, we introduce thermal inversion as a good instrument (table 2, model (5)) for working out the impact of PM2.5, since the GLS model for PM2.5 shows a negative effect on outflows (table 2, model (2)). In the IV models, a rise in PM2.5 of 1 μg m−3 may cause a 0.47% increase in the population outflow (table 2, model (6)). PM2.5 shows a similar impact on mobility to that of the AQI, because in many cities in China PM2.5 is the dominant pollutant, and PM2.5 is highly correlated with the AQI (correlation coefficient 0.5901 with p<0.001). Temperature and wind speed also have an impact on population mobility, and a gentle breeze seems to keep residents where they are (table 2, model (6)).

The effect of air quality on previous days
Since decisions about traveling are made some time before the trip, air quality on previous days may influence people's traveling decisions. However, the influence may vary with the distance and the means of transportation people choose. Air pollution may drive people away from their cities in the short term, and the flexibility of road and railway travel in China allows people to choose, change or cancel their journey, while air trips are less flexible and may not be as sensitive to air pollution. We tested the impact of the AQI 0-7 days before the mobility event on outflow by air, railway and road (table 3), and found that the AQI of previous days has no significant impact on air trips (model (1)) but does influence mobility by railway (model (2)) and road (model (3)). The AQI 3 and 7 days previously positively affects outflow by railway and road, i.e. when the air quality is poor, people tend to leave the city 3 days or a week later by railway or by road. Because flight tickets are usually booked more than a week in advance, the short-term reactions of air travel to air pollution are not as significant. Long-term mechanisms of the impact of air quality on mobility by air relate to anthropogenic factors, and are beyond the scope of this work.

Working days, weekends and holidays
People are more likely to travel for recreational purposes on weekends and holidays than on working days. As recreational trips usually involve more outdoor activities, the population flows on holidays may be more affected by air quality. To test such a heterogeneous effect, we added to the regression two interaction terms between the AQI and dummies indicating weekends and holidays (including the Tomb Sweeping Day and May Day holidays), respectively. If the weekend and holiday effect is not considered, the effect of the AQI on all dates is as shown in section 4.1, i.e. an increase in the AQI causes a larger outflow. However, when dividing the time range, the weekend/holiday effects counteract the general effect, which means that on holidays the equivalent rise in the AQI may lead to fewer residents leaving the city. Generally, a 100 point increase in the AQI leads to a 49.60% increase in outflow (as shown before in table 2), but on weekends and holidays that increase in the AQI has an uncertain impact on population outflow. Theoretically, on weekends and holidays, city residents have more time to travel freely. In this regard, their avoidance of air pollution should be more frequent. But 2-day weekends are only long enough for a short-distance round trip, possibly to nearby cities. Legal holidays are no shorter than 3 days. Although compulsory trips still happen on weekends and holidays, spontaneous mobility accounts for the largest part. Most people would choose close travel destinations due to budget, convenience and time. The mean distance of working-day mobility is larger than that of weekend mobility, which is in turn larger than that of the holidays. (The mean mobility distances on working days, weekends and holidays that are covered in our data are 619.4 km, 599.9 km and 470.3 km, respectively.) These results support the idea that city residents are inclined to choose somewhere close for leisure.
Models (2) and (3) in table 4 show that the air quality on weekends and holidays does not have a significant impact on mobility. Similarly, weather conditions (all included as control variables) other than temperature show an insignificant effect, indicating their small impact on residents' travel decisions. A possible reason is that city residents may refuse to travel when air quality is poor. Since the destination is near, the air quality there is similar to that of their place of residence.
Based on intuition and the air quality alerts of their home cities, people may expect that the air quality of the destination will not be significantly better and decide to stay at home on weekends and holidays (Zivin and Neidell 2009); they follow experts' suggestions (Laumbach et al 2015) to forgo all kinds of outdoor activities, including traveling to a nearby city and going out in their own city. On working days, however, the mobility of city residents is mainly for commercial purposes, meetings or other unavoidable reasons. Thus, the travel plan is unlikely to be canceled even if the target place is heavily polluted, regardless of the distance (as shown by a generally negative impact in table 4, models (2) and (3)).
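The interaction specification discussed in this section can be written out as a sketch. The coefficient names and the dummies $Weekend_t$ and $Holiday_t$ are notation assumed here, not the authors' stated equation; the main effects of the two dummies are not listed because date fixed effects would absorb them.

```latex
Mob_{it} = \beta_1 AQ_{it}
         + \beta_2 \left(AQ_{it} \times Weekend_t\right)
         + \beta_3 \left(AQ_{it} \times Holiday_t\right)
         + \gamma X_{it} + \alpha_i + \lambda_t + \varepsilon_{it}
```

Under this form the marginal effect of the AQI is $\beta_1$ on working days, $\beta_1 + \beta_2$ on weekends and $\beta_1 + \beta_3$ on holidays, so negative interaction coefficients would offset the general positive effect, as described above.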

Gravity model of pairwise population flows
The gravity models of pairwise population flows among cities provide evidence for the relation between air quality and population mobility from a relative viewpoint. The population flows between city pairs of the source (noted as S) and the target (noted as T) cities present an impact of the AQI that is largely consistent with the results above: as the source cities' AQI rises, more residents leave for other cities. Model (3) in table 5 shows the expected tendency that a higher AQI in the source city and a relatively lower AQI in the target city are conducive to more population flow, which means that cleaner air draws people away from cities with dirtier air.
To investigate the weekend and holiday effect, we added the interactions of the weekend and holiday dummies with the AQI. Models (4) and (5) in table 5 show the inhibitory effect of local air quality on outflows, as discussed above. However, the effects on weekends and holidays are mixed. Model (5) in table 5 shows that the air quality of the destination influences holiday travel: the attractiveness of target cities' clean air is effective and becomes even larger than on 'non-holiday' days (table 5, model (3)), because people consider the environment more when taking a precious holiday trip. However, from the view of source cities, similar to the discussion in section 4.4, counteractive effects between leaving and staying indoors exist, and the total avoidance of exposure to pollutants prevails. The source cities with poor air quality would tend to lose their residents. However, the impact of air quality on the flow between two cities varied among different time periods due to residents' different responses to air pollution. Shorter mobility distance on holidays tends to be accompanied by a greater tendency to stay indoors.

Conclusion
China faces severe air pollution issues following the rapid growth of the economy, and recent studies have noted that the health, productivity and activity of Chinese urban residents can be influenced by air quality. People tend to adopt avoidance behaviors when the air quality is bad, including reducing outdoor activities, traveling or migrating. Revealing the relation between air quality and population mobility is helpful for a more accurate estimation of the health risks of exposure to pollution, a better calculation of economic or productivity gain and loss, and more comprehensive guidance for city environmental management. However, few studies have focused on this question. This study addresses the question of how air quality affects population mobility by using IV regressions with daily city data (including smartphone-based data). Results show that air quality does affect the population outflows of cities. When the AQI of a city rises by 100 points, the population outflow increases by 49.60%. Taking PM2.5 as the air quality variable, we obtain similar results to the AQI, and a rise of 1 μg m−3 in PM2.5 may cause a 0.47% increase in the population outflow. Air pollution on one day may drive people to leave the city 3 days or a week later, going to other cities by railway or by road. There are heterogeneous effects among time ranges and cities. On weekends and holidays, city residents tend to take avoidance actions against air pollution, but it seems that the effect of staying indoors counteracts the leaving effect.
Table 5. Impact of AQI on source-target mobility between city pairs.
Based on the results above, this study can provide clues for making more accurate estimations of health risks that take population mobility into account. The outflow of a city should be considered in the exposure estimation. Also, the evaluation of the economic loss due to air pollution, for example loss of tourism and productivity, can be improved by taking the pollution-caused population outflow into account. However, the most valuable aspect of the results is that, in addition to reasons of health and civic responsibility, cities now have another key reason to protect the atmospheric environment. As more cities become developed, residents are better able to select a pleasant city in which to live. A city with poor air quality not only loses visitors but also sees talented people leave to find another workplace. The time range matters in residents' responses to air quality: weekends and holidays give residents the choice of staying indoors to avoid exposure to air pollution. If more residents choose to stay indoors in polluted cities on weekends and holidays, not only tourism but also other local economic activities would decline due to fewer residents going outdoors. The outflows due to air pollution may also influence the city transportation system, which could either be under pressure or insufficiently utilized. Furthermore, judging by our short-term results, commercial mobility may also be influenced by air quality, which is important for non-tourist cities because they need to attract talent or investment. Our findings are consistent with the long-term migration response to air quality (Chen et al 2017), which indicates a trend of people leaving due to bad air quality. Civic governments need to rethink the importance of air quality, and more effective environmental management policies should be implemented for better retention of residents and visitors. This study has limitations.
The impact of the seasonality of air quality and thermal inversions has been checked, and would not significantly influence the effectiveness of the IV methods (for details, see the supplementary materials available online at stacks.iop.org/ERL/14/124025/mmedia). However, there remains a seasonality issue between population mobility and thermal inversions due to the short range of the mobility data. We therefore used monthly passenger capacity data and found that in the period from March to June mobility showed no consistent seasonality with thermal inversions. This result cannot eliminate the seasonality issue between mobility and thermal inversions but suggests a weak influence that may not affect the robustness of the results. In addition, the relatively short time range may not be sufficient to reveal the complete impact of air pollution on holidays, but our interpretation of the holiday effect can still be considered. Because the data only covered outflows, we did not discuss the sensitivity of inflows to air quality changes. Future work could use data that cover a longer period and more cities to confirm the causality and explore further issues with multiple models.