When do Indians feel hot? Internet searches indicate seasonality suppresses adaptation to heat

In a warming world an increasing number of people are being exposed to heat, making a comfortable thermal environment an important need. This study explores the potential of using Regional Internet Search Frequencies (RISF) for air conditioning devices as an indicator for thermal discomfort (i.e. dissatisfaction with the thermal environment) with the aim to quantify the adaptation potential of individuals living across different climate zones and at the high end of the temperature range, in India, where access to health data is limited. We related RISF for the years 2011–2015 to daily daytime outdoor temperature in 17 states and determined at which temperature RISF for air conditioning starts to peak, i.e. crosses a ‘heat threshold’, in each state. Using the spatial variation in heat thresholds, we explored whether people continuously exposed to higher temperatures show a lower response to heat extremes through adaptation (e.g. physiological, behavioural or psychological). State-level heat thresholds ranged from 25.9 °C in Madhya Pradesh to 31.0 °C in Orissa. Local adaptation was found to occur at state level: the higher the average temperature in a state, the higher the heat threshold; and the higher the intra-annual temperature range (warmest minus coldest month) the lower the heat threshold. These results indicate there is potential within India to adapt to warmer temperatures, but that a large intra-annual temperature variability attenuates this potential to adapt to extreme heat. This winter ‘reset’ mechanism should be taken into account when assessing the impact of global warming, with changes in minimum temperatures being an important factor in addition to the change in maximum temperatures itself. Our findings contribute to a better understanding of local heat thresholds and people’s adaptive capacity, which can support the design of local thermal comfort standards and early heat warning systems.


Introduction
Gradual global warming in combination with one of the strongest El Niño events to date led to record high temperatures across the world in 2015 and 2016 [1,2]. The devastating impact of a single heat wave in India in May 2015, with over 2200 fatalities, demonstrated that extreme heat is a serious issue even in countries regularly exposed to high temperatures [3,4]. The Intergovernmental Panel on Climate Change (IPCC) has declared heat stress as one of the key health risks in Asia [5]. Heatwaves are expected to continue to increase not only in intensity, but also in duration and frequency [5][6][7].
Various studies have provided evidence for heat-related health impacts in a range of geographical and contrasting income settings [8][9][10]. Such studies usually rely on heat thresholds to specify weather conditions above which increased negative health effects are observed in a population. While the majority of these studies have identified thresholds for the most extreme impact of heat stress, namely increased mortality, it remains unclear when people start to experience thermal (dis)comfort. Thermal discomfort (i.e. dissatisfaction with the thermal environment) is not only a potential health hazard, it also impairs people's ability to function effectively [11][12][13][14] and their satisfaction, e.g. at home, at work or elsewhere [15].
To understand how humans react to global warming it is important to understand their ability to adapt to extremes of heat. Human adaptation to heat (and cold) involves a complex set of physiological (body acclimatisation to local prevalent climate) [16][17][18], behavioural (personal, technological and cultural) [16][17][18] and psychological (habituation, expectation and preferences) [16][17][18] factors. While there is evidence for adaptation, based on spatial variations in health outcomes related to heat indicators and on a few studies that analysed variation in temperature-related mortality over time in one location [19], such quantitative approaches are less applicable to low- and mid-income countries, where recent and good quality health data are usually not available.
With this study we aim to understand thermal discomfort through the analysis of internet search behaviour, using India as a case study. Air conditioning is acknowledged as being an effective protective measure against heat stress [20] and as such we assume that the weather conditions at which people start searching the internet for air conditioning devices can be indicative of thermal discomfort, an assumption we will further test. Additionally, we examine whether there is spatial variation in thermal discomfort across different states in India and, if so, whether it can be related to differences in the long-term average climate of these states. Such a relationship would signal adaptation potential of people to heat.
Regional Internet Search Frequencies (RISF), provided by major search engines like Google, have been used for many purposes, including health surveillance [21][22][23]. However, only a limited number of studies have so far used RISF or other online social media services, like Twitter, to study heat exposure and thermal discomfort [24]. New in our study is that we relate RISF to actual weather conditions via mechanistic concepts, which allow us to quantify a specific degree of heat discomfort, namely the weather condition at which people feel the need for air conditioning.
India, where high quality health data is largely lacking [25], makes an interesting case study to investigate thermal discomfort with the majority of its population already exposed to prolonged high temperatures during summer. At the same time, heat exposure varies between states due to the country's distinct climate zones [26], and different regional behavioural practices. RISF are available online on Google Trends [27] and represent a single or a number of specific keyword searches relative to all searches conducted in Google, normalised from 0-100 (hence frequencies). One hundred represents the maximum fraction of internet searches within the selected time period for any of the keywords if multiple keywords are requested, or any of the locations if multiple locations are requested. RISF data are currently made openly available with a weekly time resolution. For India, search frequencies are available at state level.
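The 0-100 scaling described above can be illustrated with a small sketch. The function name and the example fractions are hypothetical, and the sketch assumes a plain rescaling to the maximum; any sampling or rounding applied by Google Trends is not reproduced here.

```python
def normalise_search_fractions(fractions):
    """Scale raw search fractions so that the largest value maps to 100.

    `fractions` maps a label (e.g. a week or a state) to the share of all
    searches that matched the query. Hypothetical helper for illustration.
    """
    peak = max(fractions.values())
    if peak <= 0:
        return {key: 0.0 for key in fractions}
    return {key: 100.0 * value / peak for key, value in fractions.items()}


# Made-up weekly search fractions for one state
weekly = {"week 1": 0.0002, "week 2": 0.0008, "week 3": 0.0004}
scaled = normalise_search_fractions(weekly)
```

Because the scaling is relative to the maximum within the request, values from separate requests are not directly comparable unless they share a common reference.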

RISF data
We scanned RISFs for a range of search terms, each reflecting some sort of electrical heat relief solution (i.e. fan, evaporative cooler and air conditioning). RISF works best when there is a large population of Google Search users in the area of interest and for unambiguous search queries that reach a high level of requests. Rather than stacking different search queries for different electrical cooling devices into one single predictor, which can lead to prediction errors [28], we narrowed our approach down to the search term returning the largest volume of searches, which was 'air conditioning' (AC). AC is also a rather unambiguous word (unlike 'fan') and is used throughout the country (unlike evaporative cooler, which is traditionally less popular in states with a sub-tropical climate [29]). We expect RISF for AC to be a proxy for assessing thermal discomfort; when temperatures rise, people are likely to be interested in buying an AC, to switch on their AC and find out it needs repair, to search for spare parts, or to go to a place with AC (e.g. hotel, restaurant or cinema), and to search for this online.
With Google Correlate, a program that gives for each search term a list of queries with similar time series patterns from 2003 to the present, we checked for unwanted search associations with AC and excluded these from our Google Trends search. An example was the strong link between 'AC' and 'Milan', the Italian soccer club. Finally, we added several synonyms for AC, including names of two widely sold brands, to our search to account for potential regional differences. Table 1 gives the search terms related to AC which were fed into Google Trends, or excluded from our search query. The selected search terms were stacked to form a single query for each state (see supplement S1 available at stacks.iop.org/ERL/13/054009/mmedia).
Although internet usage in India is expanding rapidly, with 26% of the population having access in 2015 [30], search frequencies in several mountainous states in the north and north-east of India and in smaller union territories were low, leading to substantial noise in the data. Thus, these states were excluded from our analysis. Time series data of sufficient quality were derived for the following 17 Indian states: Andhra Pradesh, Bihar, Chhattisgarh, Delhi, Gujarat, Haryana, Jharkhand, Karnataka, Kerala, Madhya Pradesh, Maharashtra, Odisha, Punjab, Rajasthan, Tamil Nadu, Uttar Pradesh and West Bengal. Together, these states cover the main part of peninsular India, representing a range of different climates and socioeconomic conditions.

Climate and population data
We obtained six-hourly temperature and humidity data through the ECMWF ReAnalysis-Interim (ERA-Interim) [31] database, a gridded climatological dataset at 0.5° spatial resolution. Mean temperature over peninsular India and the Indo-Gangetic plain and its inter-annual variability is well represented by ERA-Interim, showing the best performance among reanalysis products [32]. State-wise long-term (1979-2012) average climate indicators (minimum, mean and maximum monthly temperature) were derived from the Watch Forcing Data Era Interim (WFDEI) [33], a dataset that goes further back in time than ERA-Interim but with data only available until 2012.
Heat stress is generally considered to be not only a result of high temperatures, but of a combination of weather conditions [34,35]. We considered for our study minimum, mean, and maximum temperature and the Heat Index (HI) [35][36][37][38], a thermal comfort indicator combining air temperature and relative humidity. Other empirically derived thermal comfort indices are more complex and require additional input parameters, often not available. Several studies have also raised doubts as to whether more elaborate heat indices applied on a population level would lead to different or more conclusive results [39][40][41][42].
Within Indian states large climate gradients exist. To achieve a better match between state-wide weather conditions and the state-wise RISF data we derived population density-weighted weather variables for each state, rather than taking an area-based average. We assume that RISFs are dominated by locations within a state with high population density. For this, the Gridded Population of the World Version 3 (GPWv3) map at 2.5 arc-minute resolution was obtained from the Center for International Earth Science Information Network (CIESIN) [43].
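For readers unfamiliar with the Heat Index, the sketch below shows one widely used formulation, the Rothfusz regression published by the US National Weather Service. The text above does not state which HI formulation was applied, so treating this one as representative is an assumption; the function name is hypothetical and the low- and high-humidity adjustments of the full NWS procedure are omitted.

```python
def heat_index_celsius(temp_c, rel_humidity):
    """Heat Index in °C from air temperature (°C) and relative humidity (%).

    Uses the Rothfusz regression, which is formulated in °F and intended
    for warm, humid conditions (roughly HI of 80 °F and above).
    """
    t = temp_c * 9.0 / 5.0 + 32.0  # regression expects °F
    rh = rel_humidity
    hi_f = (-42.379 + 2.04901523 * t + 10.14333127 * rh
            - 0.22475541 * t * rh - 0.00683783 * t * t
            - 0.05481717 * rh * rh + 0.00122874 * t * t * rh
            + 0.00085282 * t * rh * rh - 0.00000199 * t * t * rh * rh)
    return (hi_f - 32.0) * 5.0 / 9.0  # back to °C


# 90 °F (about 32.2 °C) at 70% relative humidity
example_hi = heat_index_celsius((90.0 - 32.0) * 5.0 / 9.0, 70.0)
```

At high humidity the index lies well above the air temperature, which is why it can separate humid from dry heat when temperature alone varies little.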
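A population density-weighted state average can be sketched as follows. The grid-cell values are made up and the function name is hypothetical; the sketch assumes that temperature and population density have already been brought onto the same grid.

```python
def population_weighted_mean(values, population_density):
    """Weighted mean of grid-cell values, using population density as weight."""
    total_weight = sum(population_density)
    if total_weight <= 0:
        raise ValueError("population density must sum to a positive value")
    weighted_sum = sum(v * w for v, w in zip(values, population_density))
    return weighted_sum / total_weight


# Three made-up grid cells within one state
cell_temps = [30.0, 34.0, 38.0]        # °C
cell_density = [100.0, 300.0, 600.0]   # persons per km²
state_temp = population_weighted_mean(cell_temps, cell_density)
area_mean = sum(cell_temps) / len(cell_temps)
```

In this example the weighted value (36 °C) sits closer to the densely populated hot cell than the plain area mean (34 °C) does.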

Heat threshold
To quantify region-specific heat thresholds for thermal discomfort, we formulated a simple mechanistic heat threshold model, which describes weekly RISF of 'AC' as a function of a heat indicator (for example temperature, HI, wet-bulb temperature or Universal Thermal Climate Index) and a heat threshold (equation 1). In its most basic form, simulated RISF increase when the heat indicator rises above the threshold and decrease again when the heat indicator drops below this threshold. Additionally, we introduced an empirical saturation function, based on the observation in the data that RISF peak when the threshold value is reached, but search volumes go down when it is hot for a consecutive period of time: in the meantime people might have purchased what they needed, searched for a cooling device out of an impulse triggered by the start of the hot weather, or may have become acclimatised or used to the heat [18,44] and thus stopped searching. Our heat threshold model then reads as: […], and a [−] and b are scaling parameters that account for any scaling applied by the RISF providing platform. T_H is the heat threshold and S(t) the saturation function. The parameter f_S controls how much the saturation function increases with each degree above the heat threshold, while c is a scaling parameter that controls the rate of decrease of the saturation function (i.e. desaturation) if the observed heat is below the threshold. To correct for any longer-term trends in RISF over the five years, e.g. as a result of an increase in internet users changing the overall pattern of searches [45], we de-trended the RISF by subtracting any significant (p < 0.05) linear trend over the period January 2011 until December 2015.
The AC heat threshold model, implemented in the programme R (Version 3.2.3), was calibrated to each state and for the climate variables minimum, mean, and maximum temperature and the HI separately. Model fit was expressed by the R² between modelled and observed internet searches. To account for parameter uncertainty of the heat threshold, the model calibrations were run 1000 times allowing a deviation of 2% in the explained variance, R².
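The behaviour described in this paragraph (a response only above the threshold, damped by a saturation term that builds up during heat and relaxes afterwards) can be sketched in a few lines. The functional form below is an assumption chosen to match the verbal description and is not the paper's equation 1; the parameter names mirror the text (a, b, f_S, c, T_H) but the values are made up.

```python
def simulate_risf(heat, threshold, a, b, f_s, c):
    """Simulate a search-frequency series from a heat indicator series.

    Assumed form (illustrative only):
      excess(t) = max(0, heat(t) - threshold)
      S(t) grows by f_s * excess(t) above the threshold and shrinks by a
      fraction c per time step below it, clipped to the range [0, 1]
      RISF(t)  = a * excess(t) * (1 - S(t)) + b
    """
    saturation = 0.0
    simulated = []
    for h in heat:
        excess = max(0.0, h - threshold)
        if excess > 0.0:
            saturation = min(1.0, saturation + f_s * excess)
        else:
            saturation = max(0.0, saturation * (1.0 - c))
        simulated.append(a * excess * (1.0 - saturation) + b)
    return simulated


# Made-up weekly heat indicator values (°C)
weekly_heat = [24.0, 26.0, 29.0, 31.0, 31.0, 31.0, 26.0]
risf = simulate_risf(weekly_heat, threshold=27.0, a=10.0, b=5.0, f_s=0.1, c=0.5)
```

With these values the simulated searches rise once the threshold is crossed and fall back to the base level b while the heat persists, which is the qualitative pattern the saturation function is meant to capture.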
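One possible reading of this uncertainty procedure is sketched below: compute R² for each calibration run, keep the runs whose R² lies within 0.02 of the best run, and summarise the spread of their thresholds. Interpreting the 2% deviation as an absolute difference in R² is an assumption, as are the function names and the example runs.

```python
import statistics


def r_squared(observed, modelled):
    """Squared Pearson correlation between observed and modelled series."""
    mean_o = statistics.fmean(observed)
    mean_m = statistics.fmean(modelled)
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(observed, modelled))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_m = sum((m - mean_m) ** 2 for m in modelled)
    if var_o == 0 or var_m == 0:
        return 0.0
    return cov * cov / (var_o * var_m)


def acceptable_thresholds(calibrations, tolerance=0.02):
    """Keep thresholds from (threshold, r2) runs within `tolerance` of the best R²."""
    best = max(r2 for _, r2 in calibrations)
    return [thr for thr, r2 in calibrations if r2 >= best - tolerance]


# Made-up calibration runs: (heat threshold in °C, R²)
runs = [(26.0, 0.80), (27.0, 0.86), (27.5, 0.85), (29.0, 0.70)]
kept = acceptable_thresholds(runs)
threshold_mean = statistics.fmean(kept)
threshold_sd = statistics.stdev(kept)
```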

Testing assumptions
To verify the robustness of our results and to confirm that searches for AC represent actual discomfort, rather than being driven by seasonally recurring behaviour or external triggers, we performed two tests.
Recurring behaviour linked to tradition, holidays or religious festivals (giving, e.g. time to shop or be online) could potentially also drive search behaviour and thereby undesirably influence our derived heat threshold if such behaviour coincides with the rise in temperatures (other random searches would either be captured as a base level of searches, or be part of the variation we cannot explain, which in itself is not a problem). To test for seasonality, we compared our 'actual daily weather' input model with a model using long-term average daily temperatures (1979-2012) as input, representing the standard seasonal cycle over the year (from here onwards referred to as the 'seasonal model'). A better performing actual daily weather-based model, expressed by a higher explained variance in search behaviour, gives confidence that our thresholds are linked to temperature conditions experienced at that moment.
In addition, we looked at the first day of threshold exceedance for each state, defined as the moment the heat threshold is exceeded for the first time in the year for 10 consecutive days. We reason that if the timing of search peaks and the exceedance of heat thresholds differ per state, the external influence of national campaigns or nation-wide heat stress warnings is less likely.
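The input to a seasonal model of this kind is a long-term mean per calendar day. A minimal sketch with made-up records and a hypothetical function name:

```python
from collections import defaultdict


def daily_climatology(records):
    """Average temperature per day of year across many years.

    `records` is an iterable of (day_of_year, temperature) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, temp in records:
        totals[day] += temp
        counts[day] += 1
    return {day: totals[day] / counts[day] for day in totals}


# Two made-up years of data for the first two days of the year
records = [(1, 20.0), (1, 22.0), (2, 25.0), (2, 27.0)]
seasonal_input = daily_climatology(records)
```

A model driven by this series repeats the same cycle every year, so it cannot respond to year-specific temperature spikes; if the actual-weather model explains more variance, the extra skill must come from those deviations.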
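The first day of threshold exceedance can be located with a simple run-length scan. The sketch assumes that the reported date is the first day of the 10-day run (the text does not say whether the start or the end of the run is used) and that exceedance means strictly above the threshold; the function name and data are hypothetical.

```python
def first_sustained_exceedance(daily_heat, threshold, run_length=10):
    """Index of the first day that starts `run_length` consecutive days
    above `threshold`, or None if no such run exists."""
    run = 0
    for day, value in enumerate(daily_heat):
        if value > threshold:
            run += 1
            if run == run_length:
                return day - run_length + 1
        else:
            run = 0  # run broken, start counting again
    return None


# Made-up daily series: a 3-day warm spell, a cool break, then sustained heat
daily = [26.0] * 5 + [28.0] * 3 + [26.0] * 2 + [28.0] * 12
first_day = first_sustained_exceedance(daily, threshold=27.0)
```

The short warm spell at index 5 is ignored because it lasts only three days; the sustained run begins at index 10.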

Heat threshold controls
Using the spatial variation in heat thresholds, i.e. the differences between states, we analysed whether people continuously exposed to higher temperatures show a lower response to heat extremes through adaptation. We correlated state-specific heat thresholds with temperature-based climate indicators (30 year yearly averages of monthly mean, minimum, maximum and the range between minimum and maximum temperature) per state. A relationship would suggest an adaptation of the heat threshold to local climate. We carried out a bootstrapped Pearson correlation to find the best independent climate indicator, followed by an ordinary least squares regression analysis.
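A bootstrapped Pearson correlation can be sketched as below. This version resamples states with replacement and reports a simple percentile interval, which is a swapped-in simplification and not necessarily the interval type used in the analysis; the data are made up for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation, or None if either series has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / (var_x * var_y) ** 0.5


def bootstrap_pearson(x, y, n_resamples=2000, seed=0):
    """Point estimate and 95% percentile bootstrap interval for Pearson r."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson_r([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:  # skip degenerate resamples
            estimates.append(r)
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return pearson_r(x, y), (lower, upper)


# Made-up state values: intra-annual temperature range vs heat threshold (°C)
temp_range = [5.0, 8.0, 10.0, 12.0, 15.0, 18.0]
thresholds = [31.0, 30.2, 29.5, 28.8, 27.4, 26.1]
r, (ci_low, ci_high) = bootstrap_pearson(temp_range, thresholds)
```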
Additionally, we explored the idea that economic status, with wealthier people being more accustomed to air conditioners [44,46,47], would lead to lower acclimatisation and thus to internet searches at lower temperature thresholds. We checked whether state-level average Gross Domestic Product (GDP) per capita from 2011-2015 had a moderation effect on the relation between local climate and the heat thresholds. Data were obtained from the Indian Ministry of Statistics and Programme Implementation [48]. All statistical tests were performed with IBM SPSS Statistics 23.
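A moderation effect is commonly tested by adding an interaction term to an ordinary least squares regression. The sketch below fits outcome ~ climate + moderator + climate × moderator with mean-centred predictors by solving the normal equations; the centring, the function names and the data are assumptions for illustration, and no significance test is included.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_moderation(climate, moderator, outcome):
    """Return [intercept, b_climate, b_moderator, b_interaction]."""
    n = len(outcome)
    mean_c = sum(climate) / n
    mean_m = sum(moderator) / n
    design = [[1.0, c - mean_c, m - mean_m, (c - mean_c) * (m - mean_m)]
              for c, m in zip(climate, moderator)]
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(4)]
           for i in range(4)]
    xty = [sum(row[i] * y for row, y in zip(design, outcome)) for i in range(4)]
    return solve_linear_system(xtx, xty)


# Made-up data in which the outcome depends on climate only
climate = [5.0, 8.0, 10.0, 12.0, 15.0, 18.0]     # temperature range (°C)
gdp = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]             # arbitrary units
threshold = [32.0 - 0.3 * c for c in climate]    # no GDP effect by construction
coefs = fit_moderation(climate, gdp, threshold)
```

An interaction coefficient near zero, as in this constructed example, means the slope of the climate relation does not change with the moderator.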

State-wise heat thresholds
Both actual RISF and temperature show individual spikes superimposed on a distinct seasonal fluctuation (figure 1(a) and (b), for Delhi), with low RISF in winter and a rise to high RISF in summer, which starts in India around the beginning of April. While the rise in RISF coincides with a rise in temperature, RISF subside earlier than temperature. Simulated RISF with a heat threshold based on daytime temperature (at noon) follow the actual RISF pattern well for Delhi (R² = .86). The heat threshold above which RISF for AC starts rising is 27.1 °C (SD = 1.5, 95% CI (24.2, 29.7), figure 1(b)). Whenever this heat threshold is exceeded the saturation function starts building up (figure 1(c)), reducing modelled search volumes especially around the second half of the year, when temperatures remain high but internet searches decrease. Of all the tested weather variables, our heat stress threshold model gives the highest and almost equally good model fits with daytime temperature (temperature at noon) and daytime HI. We will mainly show proof of concept and results for daytime temperature (see supplement table S2.1), because thresholds in terms of temperature are generally easier to comprehend [42].
A comparison of the actual daytime temperature model with the seasonal model shows that the former is better able to predict RISF, in terms of higher explained variance, except for the southern states Kerala and Tamil Nadu (supplement table S2.1, column 7 versus column 9). Seasonal fluctuations in temperature are modest in these (sub)tropical states and fairly regular, without extreme temperature spikes. Lastly, we used the day of first exceedance of the heat thresholds in each year, i.e. the vertical lines in figure 1(b) and supplement figure S2.1, to identify any homogeneous, nation-wide patterns in search behaviour triggered by external factors. The first day of exceedance differs per state for each year and per year for each state and shows no strict order amongst states between years, except that, in general, states coming out of colder winters like Punjab tend to exceed their threshold later than warmer states like Kerala (figure 3). The yearly spread in the date of first exceedance, i.e. the difference between the earliest and latest exceedance date between states, is on average 48 days. The spread is on average 15 days for each state between different years, with Maharashtra showing the largest spread (31 days).

Heat threshold controls
All local climate indicators were significantly correlated with the heat thresholds; however, mean minimum monthly temperature (i.e. temperature of the coldest month of the year) and mean intra-annual temperature range (i.e. mean temperature of the warmest minus the coldest month, figure 2(c)) had the highest correlation coefficients (table 2). As these two indicators were also almost perfectly negatively correlated with each other, we further explored intra-annual temperature range, and mean monthly temperature instead (figure 2(b)), in an ordinary least squares regression analysis. Intra-annual temperature range significantly predicted heat threshold temperature in °C, = […] monthly temperature the heat threshold increased by 1.1 °C. GDP per capita did not correlate significantly with heat threshold, r = −.050, 95% BCa CI [−.67, .69], p = .849. To test its impact as a potential moderator of the relation between intra-annual temperature difference and heat threshold, a moderator analysis was performed; however, no significant interaction effect between GDP per capita and intra-annual temperature difference was found, interaction coefficient = 0.000, t(13) = 1.036, 95% CI [0.000, 0.000], p = .319.

Discussion
In this study Regional Internet Search Frequencies (RISF) function as 'human sensors' for defining heat discomfort for Indian states based on outdoor daytime temperature. We showed the applicability and robustness of a heat threshold model to describe thermal discomfort corresponding to a desire for air conditioning (AC) for 17 Indian states, a first to our knowledge. Our method presents an alternative to statistical methods that, for example, correlate searches directly to health surveillance data [49] or social media posts to temperature [24]. Such methods do find a strong correlation, but generally pass over the fact that searches could show threshold behaviour, in our case by only responding to temperature above a certain threshold.
Our heat threshold model performed well for the majority of states. The strong variability between years in the day of first exceedance of the heat thresholds suggests that the derived thresholds reflect distinctive state-specific search behaviour rather than a homogeneous nationwide reaction to an external trigger. We cannot fully rule out state-specific triggers such as a response to a local weather forecast, which would represent the anticipation of discomfort, rather than the discomfort itself. This would, however, fit our model as an individual would still have to determine whether the forecasted temperature would lead to discomfort.
The superiority of the actual weather model over the seasonal model is most obvious in states with high seasonal variation and distinctive temperature spikes, mainly in the north of the country. In the tropical southern states, temperatures are high year-round, seasonal variation in temperature is low and temperature spikes are few and modest. In the absence of clear seasonality or spikes in temperature, other triggers like holiday or festival-induced shopping sprees might surface more strongly. Here, heat thresholds expressed as Heat Index (HI) also yielded better results than those expressed as daytime temperature. This might be because these southern states, next to more stable (and high) temperatures throughout the year, also have high, but seasonally varying, relative humidity levels. For these states, an index combining temperature and humidity better reflects when people start feeling hot.
The influence of intra-annual temperature differences on thermal discomfort is striking. States that have a larger temperature difference between winter and summer show a lower heat threshold, indicating a lack of short-term acclimatisation during the transition phase between winter and summer. This might make people in those states more sensitive to heat as compared to people who are exposed to high temperature throughout the year. Intra-annual temperature differences alone explained 63% of the variance in heat threshold. The opposite, a positive correlation with the heat threshold, was found for mean temperature as a predictor variable. These results suggest that people exposed to relatively higher temperature throughout the year tend to have higher heat thresholds, indicating a gradual lessening of response to heat the warmer a state gets, i.e. the presence of some form of local adaptation. Whether the main driving force behind adaptation is predominantly physiological, behavioural or even psychological in nature remains an open question. Previous research showed that comfortable indoor office temperatures increase from cooler towards warmer climates [50,51], confirming the pattern we see in this study. However, our results imply that expected increases in maximum temperature due to climate change [5] do not directly lead to adaptation in the form of a higher heat threshold, as long as high intra-annual temperature differences remain.
Contrary to our expectation, we did not find a moderation effect of GDP per capita on the relation between intra-annual temperature differences and heat thresholds. One reason for this could be that the GDP data are too aggregated at state level and therefore not representative of the socio-economic status of the individuals searching for AC.
Several caveats provide opportunities for future research. First, derived heat thresholds are only representative for the population in India having internet access, which tends to be dominated by males in urban areas [45]. People who do not have internet access or do not search for AC online might differ from our study population in socio-economic, demographic and geographical characteristics, and may therefore have a different heat threshold. More gender-diverse usage and better internet coverage over the whole country, combined with longer time series of RISF and climate data, will allow for improving our threshold estimates. Second, looking at heat thresholds derived from RISF for a diversity of electrical cooling devices, such as evaporative coolers (a very common cooling device in Northern India) or fans, could be an interesting way forward to represent socio-economic differences, as these cooling devices are not limited to the upper middle and higher economic classes only. Third, thermal comfort perceptions can differ, depending on whether a person is inside or outside [52]. We could not go in-depth into the exact motivation of our study population to search online for AC during hot periods. Combining RISF with the in-depth knowledge that can be derived from targeted surveys on heat perception and AC usage can be a way forward to investigate the motivation of AC searches and to verify the thresholds derived from this study.
Defining thermal (dis)comfort through conventional methods such as field-based questionnaires or chamber experiments is a challenging and resource intensive task. Recent simulation approaches on the other hand never directly involve human subjects in their assessment [53]. Our internet-based measure of 'feeling hot' lies in between these different approaches, with the benefit of being less prone to biases generally associated with the method of assessing perceptions through surveys [22] and less resource intensive, but still having direct feedback from individuals through their search behaviour.
Our findings can inform policy and practice. First, despite apparent regional climate differences across India, it is surprising that air conditioners in most Indian office buildings are still operated at temperature levels around 22.5 °C ± 1 °C all year round [54], so that standards originating from the western world can be met. With a 5%-6% reduction in the Energy Performance Index (i.e. annual energy consumption per square meter of office floor area) possible per degree [55], the energy saving potential of a more flexible and location-specific cooling standard, reflecting the predominant local climate, could be enormous. Second, our method can further facilitate the development of local early heat warning systems. Warnings should incorporate the course of temperature during the winter and spring season rather than focus on high temperature extremes during peak summer alone. Finally, this study could be used as a first careful step towards quantifying the minimum adaptation potential within India, and beyond, in light of future global warming. Studies developing future projections on the impacts of extreme heat in a warmer world (e.g. on health [56], productivity [57], energy demand [58,59] and health costs [60]) should incorporate some form of adaptation in their long-term projections.

Conclusion
This study shows the use of Regional Internet Search Frequencies (RISF) for air conditioning devices as an indicator for thermal discomfort. The spatial variation in derived heat thresholds across states in India was used to explore adaptation. Our results indicate there is potential within India to adapt to warmer temperatures, but that a large intra-annual temperature variability strongly reduces this potential to adapt, due to a 'reset' triggered during the winter. Especially in contexts where high quality health surveillance data are lacking, these findings can contribute to a better understanding of local heat thresholds and people's adaptive capacity. Such better understanding can support the design of local thermal comfort standards, early heat warning systems and future adaptation projections in light of climate change. As our results indicate, these projections should not only take into account changes in maximum temperatures, but also factor in minimum temperatures during the winter and spring months.