To measure the effects of air pollution on human activities, this study applies statistical/econometric modeling to hourly data on 9 million mobile phone users from six cities in China’s Zhejiang Province from December 18 to 21, 2013. Under a change in air quality from “Good” (Air Quality Index, or AQI, between 51 and 100) to “Heavily Polluted” (AQI between 201 and 300), the following effects are demonstrated. (i) Consistent with the literature, for every one million people, 1,482 fewer individuals are observed at parks, 95% confidence interval or CI (−2,229, −735), which represents a 15% decrease. (ii) The number of individuals at shopping malls shows no statistically significant change. (iii) Home is the most important location under worsening air quality: for every one million people, 63,088 more individuals are observed at home, 95% CI (47,815, 78,361), which represents a 19% increase. (iv) Individuals are on average 633 meters closer to their home, 95% CI (529, 737); as a benchmark, the median distance from home ranges from 300 to 1900 meters across the cities in our sample. These effects are not due to weather or government regulations. We also provide provisional evidence that individuals engage in inter-temporal activity substitutions within a day, which mitigates (but does not nullify) the effects of air pollution on daily activities.
Copyright: © 2021 Chen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data cannot be shared publicly because of privacy concerns. Data are available from Zhejiang University (contact via the State Key Lab of CAD&CG at email@example.com) for researchers who meet the criteria for access to confidential data.
Funding: W.C. acknowledges financial support from the National Natural Science Foundation of China (61772456). The website of the National Natural Science Foundation of China is http://www.nsfc.gov.cn/english/site_1/index.html. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Air pollution in China during wintertime consistently draws media attention  because of its well-documented adverse health consequences [2–4] and negative impacts on cognitive performance . Those findings are corroborated by evidence from elsewhere. To name a few, [6–9] document a negative effect of air pollution on health/mortality in the U.S.;  finds a robust negative relationship between air quality and infant mortality in Africa;  shows the impact of exposure to air pollution on student test scores; and  demonstrates that outdoor air pollution reduces the productivity of indoor workers.
Individuals may therefore naturally adopt countermeasures against air pollution, such as wearing particulate-filtering facemasks  or using air purifiers at home . Our study focuses on how individuals change their activities, including outdoor activities.
To test how outdoor activities change in response to air pollution, the common approach in the literature is to study the attendance at major outdoor facilities, for example, the Griffith Park Observatory and the Los Angeles Zoo & Botanical Gardens in Los Angeles [15, 16], the Bristol Zoo Gardens in Bristol, U.K. , and national parks in the U.S. . While these studies convincingly document that individuals avoid these outdoor facilities when air pollution is elevated, they do not inform us what individuals choose to do as a substitute.
Substitutions between different activities can be crucial to measuring the costs of air pollution and thus to developing appropriate policies. For instance, when air pollution is severe, if more individuals choose to go to indoor facilities, such as shopping malls, the negative effects of pollution can be mitigated provided that those facilities supply purified air; however, if individuals choose to be at home, only those who have a well-functioning air purifier can successfully avoid pollution. Different substitution patterns lead to different policy implications.
Our study aims to advance this literature by measuring the substitution patterns in a dataset of nine million mobile phone users in six Chinese cities. In particular, by taking advantage of detailed information on each user’s hourly location, we overcame the main challenge in such a study: observing high-frequency behaviors of a large number of individuals.
We make two contributions to the literature. First and foremost, in terms of the substitution pattern, we demonstrate that individuals, in general, choose to be at home when air pollution is elevated, which has potentially important policy implications. For example, the official recommendations (as detailed later in Table 1) state that when the air quality is “Heavily Polluted,” children, senior citizens, and individuals with respiratory or cardiac issues should stay indoors. However, such a recommendation may not be sufficient because indoor air quality can also be poor. Not every home is well insulated against outdoor pollutants , and indoor air quality can be made worse by cigarette smoking and cooking at home [20–22] or chemicals from furniture and decorations . Several studies, as reviewed by , show that the inflow of outdoor pollutants, combined with internal sources (e.g., indoor combustion, particle resuspension), can make indoor air quality worse than outdoor air quality. Moreover, high-income households are more likely to purchase air purifiers for their homes ; thus, air pollution may further exacerbate inequalities in health status. In contrast, shopping malls, museums, schools, and other public indoor facilities in China have started to take measures to clean their indoor air [25, 26]. Helping individuals monitor indoor air quality, especially at home, and encouraging individuals to stay at an indoor facility with clean air may more effectively reduce the adverse effects of air pollution on health.
Second, our modeling of human behavior in a big dataset might be useful to others conducting related research. Our massive dataset of mobile phone logs contains rich information on phone users’ locations; however, it is also contaminated with significant noise. Modeling individual location choice and performing appropriate aggregation enable the flexible control of noise and measurement errors while reducing the computational burden. As the availability of big datasets increases rapidly, our strategy provides some insights when a tractable way of dealing with such datasets is required.
More specifically, our data cover six cities in Zhejiang Province of China, namely, Huzhou, Jiaxing, Ningbo, Taizhou, Wenzhou, and Zhoushan, from December 18 to 21, 2013. To investigate the effects of air pollution, we focused on each phone user’s location in each hour and his/her distance from home. We geolocated each mobile phone in a given hour by using the locations of the mobile phone towers, or Base Transceiver Stations, to which it connected, which is similar to . However, this method differs from that in studies that rely on mobile-phone locating-request data from certain mobile apps, such as those from Tencent , Baidu , and Facebook , or location-based service data (e.g., geo-tagged messages) from social media, such as Twitter  or Weibo . During the COVID-19 pandemic, mobility data from commercial vendors have been used to investigate the impacts of the pandemic, e.g., ; such data are also different from ours because a vendor may use a mobile phone’s GPS, the name of its connected WiFi network, and even the MAC address of its connected router to geolocate a phone. More details about one of the vendors, Unacast, are available at https://www.unacast.com/privacy. An advantage of our dataset is that it covers a less selected sample of individuals in the six cities. In our sample period, the mobile phone service penetration rates in these cities range from 125% to 212% , and our dataset contains all the users in these cities from the biggest service provider, which had a national market share of 62.42% in 2013, calculated based on the total number of subscribers announced in the provider’s 2013 annual financial report and the national total number of subscribers in . Nevertheless, our dataset is far from being a representative sample of all adults in those cities because these phone users may be different from non-users and users with other providers. We discuss the potential issues that this non-representativeness can bring in the sensitivity analyses.
However, our dataset does not suffer from certain common selection issues. For example, a mobile app generates a location data point only if a phone user has installed the app, allows it to use location services, and launches the app; therefore, only a self-selected subsample of mobile phone users can be observed in datasets based on app location data. A disadvantage of our dataset is the potential for errors in geolocating a phone. A location is often serviced by multiple towers, and a phone at that location can connect to any of them. In the geolocating procedure, we assume that within an hour, a phone connects to a tower for a duration inversely proportional to the distance between them (relative to the phone’s distances to other towers). This assumption can be violated for many reasons. For example, a tower may reach its capacity and refuse to allow more connections. Our modeling approach guarantees that our estimation is consistent as long as the measurement errors are not correlated with air pollution. The sensitivity analyses discuss possible biases when the errors and air pollution are correlated.
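The duration assumption in the geolocating procedure can be illustrated with a small sketch. It rests on one reading of that assumption, not on the documented procedure: a tower's share of the hour falls as its distance from the phone grows, implemented here with normalized inverse-distance weights. The function name, tower labels, distances, and the `min_distance_m` guard are all hypothetical.

```python
def tower_time_shares(distances_m, min_distance_m=1.0):
    """Split one hour across candidate towers using inverse-distance weights.

    distances_m: dict mapping a tower label to its distance from the phone (meters).
    min_distance_m: hypothetical floor that avoids division by zero.
    Returns a dict of shares that sum to 1.
    """
    # Weight of each tower is 1 / distance (assumed functional form).
    weights = {
        tower: 1.0 / max(distance, min_distance_m)
        for tower, distance in distances_m.items()
    }
    total = sum(weights.values())
    # Normalize so the weights are relative to the other candidate towers.
    return {tower: weight / total for tower, weight in weights.items()}


# Hypothetical example: three towers that service the same location.
shares = tower_time_shares({"tower_a": 200.0, "tower_b": 400.0, "tower_c": 800.0})
minutes_connected = {tower: 60.0 * share for tower, share in shares.items()}
```

Under this sketch the nearest tower receives the largest share of the hour. A tower that refuses connections at capacity would break the pattern, which is the kind of violation described above.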
We aim to measure how individuals voluntarily respond to air pollution rather than to factors such as weather and government regulations. As clarified in the “Materials and methods” section, during our sample period, major weather events were not observed and the six cities did not have regulations on human mobility related to air pollution. An additional important factor is individuals’ knowledge about real-time air pollution levels. In regular weather reports on TV and radio, the official air quality level for the city and occasionally an exact air quality reading for the past hour were reported. More detailed information was available online. Because the city-level air quality measure was arguably the most salient to individuals, we used it as the main explanatory variable. Similar to any study that examines the effects of area-based attributes on individual behaviors, our study faces the uncertain geographic context problem (UGCoP) , which helps identify two sources of contextual uncertainty in our setting: uncertainty in the spatial configuration of the appropriate units for assessing the effects of air pollution and uncertainty about the timing and duration of exposure to the unit’s air pollution. The spatial unit in our study is a city in the administrative sense. As city-level air pollution measured by the air quality index is the most commonly publicized, individuals are more likely to respond to this information. Further, our study assumes that individuals react to air pollution on an hourly basis. These assumptions rule out the possibility that individuals only care about air pollution in their neighborhood or adjust their reactions to air pollution more or less frequently than hourly.
In summary, we conducted a range of sensitivity analyses and provided some caveats in interpreting our results; in particular, we addressed the following concerns: (i) a phone’s location and distance from home are measured with error; (ii) a single mobile phone user may hold multiple phones; (iii) our data may not be representative; and (iv) including different explanatory variables, e.g., alternative measures of air quality, may affect our results. Our results are shown to be reasonably robust.
We categorized the locations into four mutually exclusive groups, namely, home, park, shopping mall, and others, and modeled individual location choice as a logit discrete choice problem. The model was applied to our data, and we found that increased air pollution is associated with more individuals being at home. Under a scenario in which air quality worsens from “Good” (the second-best level according to the official standard) to “Heavily Polluted” (the second-worst level), we demonstrated the following effects of air pollution: (i) for every one million people, 1,482 fewer are observed at parks, 95% CI (−2,229, −735), which represents a 15% decrease; (ii) the number of individuals at shopping malls does not show a significant response to such a change in air pollution; (iii) home is the most important location when there is such a worsening in air pollution, and for every one million people, 63,088 more individuals are observed at home, 95% CI (47,815, 78,361), which amounts to a 19% increase; and (iv) in the time window from 7:00 to 22:00, the above effects are less pronounced from 13:00 to 17:00. We also estimated the effects of air pollution on individuals’ distance from home with a Tobit model . The results demonstrate that when the air quality worsens from “Good” to “Heavily Polluted,” individuals are 633 meters closer to their home on average, 95% CI (529, 737), which represents a sizable reduction because the median distance from home ranges from 400 to 1900 meters across different cities from 7:00 to 22:00 (or 300 to 1900 meters during all hours). Additionally, we analyzed daily data and found provisional evidence consistent with the following: (i) individuals engage in intra-day substitutions of activities and (ii) the intra-day substitution mitigates but does not nullify the effects of air pollution on human activities.
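As a minimal illustration of how a logit discrete choice model turns a change in air quality into "individuals per million" at each location, the sketch below computes multinomial logit choice probabilities for the four location groups under two air quality levels. It is a generic textbook form, not the specification estimated in this study. Every intercept and pollution coefficient is a hypothetical placeholder rather than an estimate from the data, and the function names are illustrative.

```python
import math

LOCATIONS = ["home", "park", "mall", "other"]

# Hypothetical utility intercepts and pollution shifts (NOT estimates from the paper).
BASE_UTILITY = {"home": 0.0, "park": -3.5, "mall": -3.0, "other": 0.5}
POLLUTION_SHIFT = {"home": 0.3, "park": -0.15, "mall": 0.0, "other": 0.0}


def location_utilities(heavily_polluted):
    """Utility of each location; heavily_polluted is 0 ("Good") or 1."""
    return {
        loc: BASE_UTILITY[loc] + POLLUTION_SHIFT[loc] * heavily_polluted
        for loc in LOCATIONS
    }


def choice_probabilities(utilities):
    """Multinomial logit: P(j) = exp(u_j) / sum_k exp(u_k)."""
    max_u = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {loc: math.exp(u - max_u) for loc, u in utilities.items()}
    total = sum(exp_u.values())
    return {loc: value / total for loc, value in exp_u.items()}


def change_per_million(p_before, p_after):
    """Change in the expected number of individuals per one million people."""
    return {
        loc: round((p_after[loc] - p_before[loc]) * 1_000_000)
        for loc in LOCATIONS
    }


p_good = choice_probabilities(location_utilities(0))
p_heavy = choice_probabilities(location_utilities(1))
delta = change_per_million(p_good, p_heavy)
```

Because the probabilities sum to one, any increase in the share at home must be offset by decreases elsewhere, which is why the model is a natural way to describe substitution between locations. As a rough consistency check on the reported figures, and assuming the percentages are measured relative to the count under "Good" air quality, 63,088 additional individuals at home being a 19% increase would imply a baseline of roughly 63,088 / 0.19 ≈ 332,000 per million at home.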
Therefore, if air pollution only peaks for a couple of hours within a day, the results indicate that the foregone benefits of some behaviors, such as leaving home, can be partially recouped because people can choose to leave home in the hours when air pollution is low.