A seven-fold rise in the probability of exceeding the observed hottest summer in India in a 2 °C warmer world

Heatwaves and extreme temperatures during summer (April–May) in India have profound implications on public health, mortality, water availability, and productivity of labourers. However, how the frequency of the hottest summers in observed record (1951–2015) will change under the warming climate in India is not well explored. Using observations from the India Meteorological Department, we show that mean maximum summer temperature has increased significantly in three (arid, monsoon, and savannah) out of five major climatic regions of India during 1951–2015. We identify the hottest summer in the observed record in the five climatic regions in India. The arid, cold, and temperate regions experienced the hottest summer in 2010 while monsoon and Savannah regions witnessed the hottest summer in 1979 and 1973, respectively. Based on simulations from the Climate of 20th Century Plus (C20C+) Detection and Attribution project, we show that the regional hottest summer of 2010 can be attributed to the anthropogenic warming. We then use simulations of a large (2000 year) ensemble of the EC-Earth model to estimate the exceedance probability of the observed hottest summer in the present climate, 2 °C and 3 °C warming worlds in India. The exceedance probability of the observed hottest summers shows a rise of more than seven and twenty-fold in the 2 °C and 3 °C warming world, respectively, compared to the present climate. The projected increases in the frequency of the hot summers and associated heatwave days will pose great societal challenges in the future in India.


Introduction
The global consensus for climate change mitigation is aimed at limiting global mean surface temperature rise to 1.5°C from the pre-industrial level  by the end of the 21st century (UNFCCC 2015). However, the rise in the global mean surface temperature has almost breached the 1°C mark in 2018 and is likely to reach the 1.5°threshold by 2040 without strong mitigation efforts as highlighted in the Intergovernmental Panel on Climate Change's Special Report on Extreme Events (Allen et al 2018). Previous studies show that limiting the global mean temperature rise within 2°C is itself an enormous challenge, which requires sustained mitigation efforts and reliance on negative emissions in the long run (Peters et al 2012). A business as usual scenario (RCP 8.5) will continue to set new records of mean surface temperature each year in the majority of the world, which may disproportionally affect developing countries (Power and Delage 2019). Compared to the rest of the world, the frequency of exceptionally hot days in the tropics is projected to rise manifold in a 2°C warmer world in comparison to 1°C warmer world (Mahlstein et al 2011). In addition to the increased number of hot days and nights, prolonged and frequent heatwaves will continue to affect the tropics under high emission scenarios (Hoegh-Guldberg et al 2018).
The annual mean maximum temperature shows a significant increasing trend of 0.1°C per decade during 1901-2018, which is higher than the minimum temperature trend of 0.02°C/decade (MoES 2019). The largest warming in India has occurred in the winter and post-monsoon seasons Kumar 1997, Arora et al 2005). The marked asymmetry in the warming trends of diurnal temperature in India is quite distinct from the global patterns (Srivastava et al 1992), which makes an increase in mean maximum temperature a major contributor of warming (Kumar et al 1994, Arora et al 2005. The increased frequency of heatwaves in India in the observed and projected future climate has been reported in previous studies (Azhar et al 2014, Murari et al 2015, Mazdiyasni et al 2017. However, the implications of climate warming on temperature during summer (April-May) in India are not well identified. As high morbidity and mortality are linked with high temperature during the summer in India, even a slight increase in mean temperature can have serious implications on the population exposure to extreme heat (Azhar et al 2014, Murari et al 2015. Here, we focus on the summer (April-May) warming in India under the projected future climate scenarios and quantify its implications for the frequency of the hot summers and associated cumulative heatwave days (CHWD). We identify the observed hottest summer in five climatic regions in India and estimate the probability of exceedance of these events. We then use a large ensemble (2000 years: 400 ensemble members with 5 year data) of simulations from the EC-Earth climate model of the present climate, 2°C, and 3°C warming worlds to provide a robust assessment on the projected changes in the frequency and magnitude of the hot summers and heatwave days.

Data
We used gridded temperature data from India Meteorology Department (IMD), which is available for 1951-2015 at 0.5°spatial resolution. The gridded temperature product was developed using 395 observation stations across India (Srivastava et al 2009). Here, we consider the mean daily maximum temperature for the summer (referred to as mean maximum summer temperature) months of April and May. These months are classified as the summer season by the IMD (also called the hot-weather season) with an annual precipitation contribution of less than 8% (Parthasarathy et al 1994) and the country experiences maximum temperature during these two months (Kothawale et al 2010a).
In addition to observed data from IMD, we used simulations from CAM 5.1 from Climate of 20th Century Plus Detection and Attribution (C20C+D&A) project (Stone et al 2018). CAM5.1 is the atmosphere/ land component of the CESM earth system model for which land surface properties were derived from CLM 4.0. In CAM5.1 simulation, sea surface temperature (SST) was used from NOAA OI v2 (Hurrell et al 2008, Reynolds et al 2002. More details on the experimental set-up of CAM 5.1 simulations can be obtained from Stone et al (2018). We use two 48member ensembles of CAM5.1 simulation at 1°spatial resolution spanning the 1961-2014 period, run under a factual and counterfactual (alternate) historical climate scenarios. The factual (ALL: represents all forcing) simulations represent the real world, which includes both natural and anthropogenic forcing, while in the counterfactual (NAT) scenario, anthropogenic forcings are fixed to the pre-industrial level. CAM5.1 is widely used in the literature for attribution studies (Angelil et al 2014, Min et al 2019, Shiogama et al 2016. Bias correction was not performed on CAM5.1 simulations, which were used only for the attribution study. To estimate the exceedance probability of the hot summers, we use EC-Earth 2000 year simulations (400 ensemble members of five years each) for three scenarios (Present climate, 2°C, and 3°C warmer worlds). Present climate represents the period in which the global mean surface temperature corresponds to the observed 2011-2015 mean (based on HadCRUT4 data; Morice et al 2012). The 2°C and 3°C ensembles correspond to a rise in global mean temperature of 2°C and 3°C from the pre-industrial level. EC-Earth is a fully coupled global climate model developed by the international EC-Earth consortium, which incorporates an atmospheric, land surface, ocean, and sea ice model. The model based on the seasonal forecast system of the European Centre for Medium-Range Weather Forecasts is used here at 1.1°spatial resolution. We re-gridded the data to 1°resolution using bilinear interpolation to make it consistent with the observational data. Bias correction was not performed on EC-Earth simulations, as we estimated standardized departure in the observed data and used that as a threshold to estimate the probability of exceedance of the observed hottest summer in the future. Further details on the EC-Earth model can be found in Hazeleger et al (2012) while on the experimental setup can be obtained from Van der Wiel et al (2019). We compared maximum summer temperature from CAM5.1 and EC-Earth against the IMD for a common period of 2011-2015 (figure S1 is available online at stacks. iop.org/ERL/15/044028/mmedia). Both CAM5.1 and EC-Earth dataset capture the spatial variability of IMD summer maximum temperature, however, with some bias, which is expected (figure S1).

Methods
We identify the major climatic zones in India based on the global classification system proposed in Peel et al (2007). Peel et al (2007) classified the Indian region into 19 climatic zones based on temperature, precipitation, and vegetation. Out of the 19, 14 climatic zones confine mostly to the Himalayan region. To perform the regional analysis, we reclassified the 19 zones into 5 major climatic zones after merging relatively similar zones. The five climatic zones (figure 1(f)) are: arid regions, which includes both arid and semi-arid regions in the northwestern parts of India and rain shadow regions in the Indian peninsula; cold regions, which encompass all the Himalayan states of the country; tropical monsoon forest regions that includes the western coast of India and the southern parts of the northeastern states where the southwest monsoon hits the earliest in the country; tropical Savannah regions which comprise the tropical regions in India, and temperate dry summer regions, which encompass the Gangetic and Brahmaputra plains and the narrow belt in Punjab-Haryana plains that receives moderate rainfall during the summer monsoon season.
We use daily maximum temperature of April and May from IMD to identify the hottest summer in each climatic zone during the observed record . We estimate summer temperature The temperature anomaly is calculated with respect to 1961-90 climatological mean for each climatic zone. The red line shows the linear trend fitted using Mann Kendall test; the corresponding Sen's slope and p-value is provided in the respective figures. The brown asterisk indicates the year of hottest summer in each climatic zone, which is also tabulated in the figure. (f) shows the spatial plot of the temperature anomaly during the hottest summer in each climatic zone (1973, 1979 and 2010 summers) with respect to the 1961-90 climatological mean. We use the mean of daily maximum temperature in the summer months of April and May to calculate mean maximum summer temperature. The five climate regions are outlined in the map. anomalies for 1951-2015 against the reference period of . The return period of the identified hottest summer for each climatic zone was estimated by fitting non-stationary generalized extreme value (GEV) distribution to 1951-2015 mean maximum summer temperature. We consider a linear change in the location parameter with the assumption that global warming only affects the mean (m) of the distribution and not the variance (s) and shape parameter (e) (Kew et al 2019): The covariate, β 1 is the linear trend in the temperature series, t indicates time steps while β 0 is a constant. The non-stationary GEV distribution is widely used in estimating return levels/periods of weather and climate extremes that show a significant trend (e.g. Cheng and AghaKouchak 2014). The mathematical details of non-stationary GEV distribution are discussed in Coles (2001). We use bootstrap standard deviation method for estimating the 95% confidence interval using 10,000 bootstrap samples (Paciorek et al 2018). To estimate if the non-stationary model fits the observed mean maximum summer temperature, we conducted a deviance test using the log-likelihood of the parameter estimates under the assumption of stationarity and non-stationarity as in Ali and . Deviance is defined as D = ln{ML S } − ln{ML NS }; where ML S and ML NS are the maximum likelihood estimate under the stationary and non-stationary assumption.
We use daily maximum temperature from simulations of CAM5.1 (Neale et al 2010, Stone et al 2018 to identify the role of anthropogenic warming in the occurrence of the hottest summer. Before conducting the attribution study, we used 1961-90 climatological mean of IMD and CAM5-ALL simulations to compare the spatial distribution of mean maximum summer temperature (figure S2). We estimate mean maximum summer temperature anomalies of the identified hottest summer in each climatic zone for both ALL and NAT simulations with respect to 1961-1990 climatological mean of NAT simulation for each model run. We then fit a non-parametric Gaussian kernel distribution to the mean maximum summer temperature anomalies to estimate the probability of extreme events (Min et al 2019). We estimate the probability of exceedance of the observed hottest summer (IMD temperature anomaly of the identified hottest summer) in each climatic zone for both the factual P ALL and counterfactual P NAT scenarios and then estimate the Risk Ratio (RR; Allen 2003, Stone andAllen 2005) due to anthropogenic warming for each climatic zone, where: . 2 ALL NAT A RR value above 1 indicates that the probability of exceedance of the event has increased due to anthropogenic greenhouse emissions (P ALL >P NAT ) and a RR value above 2 indicates that the probability of exceedance of the event has more than doubled due to anthropogenic warming.
We define CHWD as the total number of days in all the heatwave spells occurring in the summer. We use the approach from Russo et al (2015), which does not use any rigid threshold for identifying heatwaves. Most of the indices used to estimate heat waves in the previous studies are region, sector, and season-specific (Russo and Sterl 2011, Otto et al 2012, Perkins and Alexander 2013. Russo et al (2014) developed a heatwave magnitude index (HWMI) that can be applied across different climatic zones, seasons, and sectors. We use the modified HWM daily Index (HWMDI) (Russo et al 2015), which is based on the long-term temperature in a region and is independent of a rigid threshold as used, for example, by the IMD. Therefore, HWMDI can detect increases in extreme temperatures even in cold climate regions (Mishra et al 2017, Mukherjee andMishra 2018). Russo et al (2015) method provides the heatwave with the largest magnitude in a given time period (season or year). Since we were interested in detecting all the heatwave days (CHWD) during summer, we used all the heatwaves occurred in the summer that satisfy the following conditions: (1) the maximum daily temperature of the day considered should be greater than the 90th percentile of the daily maximum temperature of the 31 day window starting 15 days before the day under consideration during the reference period, (2) the maximum daily temperature is greater than the 25th percentile of the annual maximum daily temperature of the reference period, and (3), the conditions (1) and (2) are satisfied for a minimum of three consecutive days.
We use daily maximum temperature from EC-Earth to estimate changes in the probability of the hottest summer in the observational record under the warming climate. We conduct two analyses using the EC-Earth experiments. In the first analysis, the probability of the observed hottest summer was estimated in the three (present climate, 2°C, and 3°C warming worlds) different global warming scenarios. We use the mean daily maximum summer temperature of the 2000 year ensemble in the present climate simulations as the reference and estimated the standardized departure of temperature for each summer from the present climate mean for all the three warming scenarios. Since the present climate simulation of the EC-Earth is designed corresponding to 2011-15 global mean temperature (based on HadCRUT4 data, Morice et al 2012) to exclude the influence of model bias against the observed climate, we used the IMD climatological mean of the closest 30 years (1986-2015) as a reference period.
Standardized departure of the hottest summer for each climate region was estimated using mean maximum temperature from IMD observations as: where SD IMD is standardized departure for the observed hottest summer, T HS is the temperature of the observed hottest summer, T mean and T std are climatological  mean and standard deviation for the observed climate. The hot summers in the EC-Earth simulations were identified as the years when standardized departure of EC-Earth (present climate) exceeded the standardized departure of IMD (SD IMD ).
where SD ECE is the standardized departure of a given summer in EC-Earth simulations, T is the mean maximum temperature of summer under present climate, 2°C and 3°C warmer worlds, and T mean and T std are mean and standard deviation estimated from EC-Earth's present climate simulations. If SD ECE was equal to or exceeded SD IMD for a given summer, the summer was considered as hot summer. Since, we used standardized departure for both IMD and EC-Earth datasets, the bias correction was not performed. Next, we estimate the probability of exceedance as the ratio of the number of summers exceeding the threshold (SD IMD ) to total number of summers for the present climate, 2°C, and 3°C warming worlds using EC-Earth simulations. In addition, we estimate the mean CHWD during these identified hot summers. To do so, the duration of all the heat wave spells that lasted more than three days in each summer was identified. We evaluate the probability of mean maximum summer temperature to exceed the observed hottest summer temperature in the 2000 year ensemble for each climatic zone. Additionally, the same analysis was performed for each 1°grid to estimate the probability of hot summers (exceeding the standardized departure of the observed hottest summer in each 1°grid) and the mean CHWD associated with the identified hot summers.
In the second analysis, we estimate the mean maximum summer temperature in the present climate, 2°C and 3°C warmer worlds corresponding to 1-1000 year return periods. We fit a stationary-GEV distribution to the 2000 year ensemble of mean maximum summer temperature departures in each climatic zone for all the three scenarios. We did not use the non-stationary GEV distribution due to shorter time (5 year) of the EC-Earth ensemble members. To understand how the large ensemble facilitates in reducing the uncertainty in the return period, we also fit the GEV distribution to a randomly selected 100 year sample from the 2000 year ensemble (Van der Wiel et al 2019). Since the record length of the observed data (or projections) often limited to about 100 years, the 100 year sample highlights uncertainty in the estimation of return period of an extreme climate event (e.g. hottest summer) under the observed or projected climate.
Additionally, we also estimate the return periods of 50, 100 and 500 year event in the present climate and in the two warming scenarios (2°C and 3°C) using the extreme value analysis. Please see supplemental figure S3 for the detailed methodology.  First, we identify the observed hottest summer in each climatic zone using IMD daily maximum temperature data for 1951-2015 (figure 1). We find 2010 as the hottest summer in the arid, cold, and temperate regions. On the other hand, the summers of 1979 and 1973 were the hottest in the tropical monsoon and Savannah regions (figure 1). The maximum temperature anomaly (with respect to 1961-90 climatological mean) during the hottest summer in the observational record exceeded 2.3°C in the arid and cold regions while 1.4°C in the temperate and Savannah regions (figure 1). The maximum summer temperature anomaly (1.1°C) was relatively less warm in the monsoon region than the other climatic zones. However, the standardized departure (with respect to the reference period of 1986-2015) of mean maximum summer temperature almost exceeded 2 in all the climatic zones (table S1). Three of the five climate regions experienced the hottest year during the last decade. The spatial distribution of the hottest summer temperature shows an extreme warm anomaly in the colder northern, arid northwestern, upper and middle Gangetic plains, and in central India (figure 1(f)). A relatively moderate warm anomaly in summer temperature is confined only in the western and eastern coastal regions and in the Brahmaputra plains (figure 1(f)). A temperature anomaly greater than 1°C (standard deviation=1.2°C) in the hottest summer can be found in the majority of the country except for a few regions.

Observational analysis
We estimated the trend in mean maximum summer temperature in each climatic zone using a nonparametric Mann Kendall (Mann 1945) trend test and Sen's (Sen 1968) slope. We find a significant (p-value<0.05) rise in mean maximum summer temperature in the arid, tropical monsoon, and Savannah regions (figure 1). Significant warming in mean maximum summer temperature of more than 0.1°C/decade occurred in these three climatic zones (figures 1(a), (c)-(d)), which is consistent with the rise in global mean annual temperature (Allen et al 2018). However, the cold and temperate regions do not exhibit a statistically significant rise in mean maximum summer temperature during 1951-2015 (figures 1(b), (e)). Overall, we find a rise of above 1°C in mean maximum summer temperature in more than 60% of the country during 1951-2015.
Next, we conducted the extreme value analysis using the non-stationary GEV distribution on mean maximum summer temperature for 1951-2015 in each climatic zone. We obtained deviance values higher than 3.841, which is the critical value of the Chi-square test at 5% significance level, for arid, monsoon, and Savannah regions. We performed non-stationary analysis as three out of the five climatic zones satisfied the deviance test. Further details of the test to identify the non-stationary climate can be found in Ali and   figure S4). The hottest summer (temperature anomaly of 2.5°C) in the cold region had a three-times higher return period than the hottest summer in the arid region (temperature anomaly of 2.3°C). However, we note a high uncertainty in the estimated frequency of the hottest summer at 95% confidence level, which is due to the relatively short (65 years) observational record (figure S4). Our results show that longterm data are required to reduce the uncertainty in the return periods of hot summers in the observed and projected future climates.

Attribution of the hottest summers
Next, we conduct an attribution analysis of the hottest summers using the C20C+D&A simulations for the ALL and NAT scenarios (figure 2). Before performing the attribution analysis, we compare the spatial distribution of 1961-90 climatological mean maximum summer temperature of IMD and CAM5.1 ALL simulations (figure S2). Our results show that CAM 5.1 simulations capture the observed spatial variability; however, there is a high bias in the cold climate zone. This bias is at least in part due to a sparse gauge network, which results in a high bias in temperature in the Himalayan region (Shah and Mishra 2016).
We use 48 simulations for each of the ALL and NAT scenarios for the identified hot summers in 1973, 1979, and 2010. Using two-sided Wilcoxon rank-sum and Kolmogorov-Smirnov tests, a significant (p-value <=0.05) difference in mean and distribution of temperature anomalies was found in the ALL and NAT scenarios for all the climatic regions except monsoon and Savannah ( figure 2, table S2). Our results show that anthropogenic emissions have a significant contribution to the hottest summers in India. We estimated the Risk Ratio (RR) and its 95% confidence interval based on the bootstrap standard error estimate using 10,000 bootstrap samples (Paciorek et al 2018). We find a RR value of 204 (95% CI: [129,327]) in the arid region, which highlights the influence of anthropogenic emissions on the hottest summer. Similarly, RR higher than 2 for the cold, (2.4 [2, 2.9]) and temperate, (3.0 [2.6, 3.4]) regions indicated a prominent role of anthropogenic forcing on the hottest summer ( figure 2 and table S3). In the Savannah region, RR of 1.5 [1.4, 1.7] was found, which shows the anthropogenic contribution to the summer temperature. However, the temperature anomalies during 1973 in the tropical monsoon were not significantly influenced by the anthropogenic warming at 95% CI and 2(f) shows the Risk (RR) and its 95% confidence interval for each climatic zone. The horizontal solid line represents RR=1 and the dotted line represents RR=2. We used 48 runs from the CAM5.1 model simulated historical dataset from 1961 to 2014 for the analysis. To fit the distribution function to the data, we used a non-parametric Gaussian kernel distribution. (monsoon, 1.4 [0.8, 2.4]). Overall, the observed hottest summer in 2010 can be attributed to anthropogenic warming with high confidence while the events that occurred in the 1970s cannot be directly associated with anthropogenic climate warming.
Consistent with our findings, Jaswal et al ( 3.3. The probability of exceedance of the hottest summer and associated CHWD Our results so far showed that the 2010 summer was the hottest during 1951-2015 in three of the five climatic regions in India and it was directly associated with anthropogenic warming. We, therefore, estimate the exceedance probability of the summers like 2010 (or 1973 and 1979) in 2°C and 3°C warmer worlds using EC-Earth simulations. We use standardized departures of observed mean maximum summer temperature (table S1, using 1986-2015 IMD climatology as reference) for the hottest year (e.g. 2010, 1973, and 1979) and estimate the number of years in the EC-Earth's present climate (out of 2000 year simulations) in which the standardized anomalies are exceeded in each climate-zone. For all the climate zones, the probability of exceedance of the observed hottest summer of 2010 (or 1973 and 1979) is less than 3.5% in the present-day climate ( figure 3 and table S4). Our results are based on 2000 year simulations; therefore, these estimates of the likelihood of hot summers can be considered robust against sampling limitations.
Next, we estimate the exceedance probability of the observed hottest summers in a 2°C and 3°C warmer worlds. The probability of exceedance of the hot summers in a 2°C and 3°C warmer worlds is projected to rise manifold in all the climatic zones. For instance, for the cold, monsoon, Savannah and temperate climatic regions, the probability of exceedance of the hot summers is projected to increase by more than seven-fold and twenty-fold in a 2°C and 3°C warmer world (with respect to the present-day climate), respectively. For the arid region, the exceedance probability is projected to rise above twentyfold in a 2°C warmer world and more than hundredfold in a 3°C warmer world from the present-day climate (figures 3(a), (b)). A substantial rise is observed in the exceedance probability of hot summers in the monsoon regions, which is above 40% in 2°C and 90% in 3°C warming worlds. Similarly, the cold climatic region also exhibited 25% and 70% increase in the exceedance probability of hot summers in 2°C and 3°C warming worlds. The present climate simulations of the EC-Earth data underestimate mean maximum summer temperature relative to the IMD (2011-15) data especially in the cold, temperate, and monsoon climatic regions (figure S1), however, we assume that this bias does not influence the estimation of exceedance probability as we use standardized departures in our analysis. We find a high spatial variability in the projected changes in exceedance probability, which can be associated to large scale teleconnections such as ENSO (Pai et al 2013, Murari et al 2016 in EC-Earth simulations. Overall, the frequency of the current hot summer in India is projected to increase manifold if the global mean temperature increases by 2°C or 3°C (figure 3(a) and table S4).
To understand the effect of the increased probability of exceedance of hot summers on the frequency of hot days in India, we estimate the mean CHWD in all the identified hot summers in the present climate, 2°C and 3°C warmer worlds ( figure 3(b)). Similar to the increase in the frequency of hot summers under the warming climate, CHWD is projected to rise significantly. We estimated the mean CHWD for all the summers in the present-day climate that exceeded the standardized departure of observed  mean maximum summer temperature for the hottest summer for each climate zone (table S1). Mean CHWD varies between 2 and 7 days in the present climate for all the climate zones except for the monsoon region. In a 2°C warmer world, CHWD are projected to increase by about two-fold in all the climatic regions in India ( figure 3(b)). The mean CHWD is projected to rise by about three or more times in all the climatic zones in India in a 3°C warmer world. The exceedance probability of hot summers and associated mean CHWD is projected to rise significantly under the projected future climate. The same analysis performed in each 1°grid shows that the exceedance probability of the hottest summer in each grid and the associated CHWD increases manifold in the warming world (figures 3(c)-(h)). Mean CHWD is projected to increase across Eastern India, Gangetic plains and central India (figures 3(c)-(h)). The projected increase in the exceedance probability of the hot summers will lead to increased frequency of hot days under a warming climate.
Finally, we estimate the magnitude of standardized departure of mean maximum summer temperature corresponding to return periods from 1 to 1000 in all three warmer worlds considered ( figure 4 and table S5). Here we note that summer temperature anomalies at lower return periods (5-20 years) may not be the hottest year during the record. The 2000 years simulations of the present climate, 2°C and 3°C warming worlds provide us a basis to reduce the uncertainty in the estimate of the return period of the hot summers. We estimate the magnitude of standardized temperature departures using stationary GEV distribution fitted on the 2000 year ensemble from EC-Earth. We randomly select a 100 year sample to determine the uncertainty in the estimation of the return value in shorter records. Our results show that a shorter length of record results in large uncertainty in the estimates of the return periods, which is to be expected (figures 4(a)-(e)).
To provide a more robust estimate of recurrence of summer temperature anomalies, we use the 2000 year ensembles from EC-Earth (figures 4(f)-(j)). The extreme value analysis based on the 2000 year ensembles shows a statistically significant (p-value<0.05) increase in the frequency and magnitude of summer temperature anomaly in the 2°C and 3°C warming worlds (table S5). For instance, the return period of a 1-in-50 year event in the present climate decreases to less than 1-in-10 and 1-in-5 years in a 2°C and 3°C warmer worlds (figures 4, S5) in the majority of climatic regions. The return period of a 1-in-50 year event in the present climate decreases to less than 1-in-5 years in both 2°C and 3°C warming in the tropical monsoon regions. Similarly, 1-in-100, 1-in-500, and 1-in-1000 year events in the present climate are projected to become frequent in a 2°C and 3°C warmer world. The 1-in-500 year present climate event in arid regions becomes a 1-in-3 year event in 3°C and 1-in-9 year event in 2°C scenarios (figure S5). The return period of a 500 year present climate event in the cold region increases to less than 1-in-50 year in 2°C and to 1-in-10 year in 3°C warmer worlds. Overall, we find that in all the climatic regions the rare events of hot summers are likely to become more frequent in a 2°C and 3°C warmer worlds (figure 4, table S5 and figure S5).

Discussion and conclusions
Observed cooling over the Indo-Gangetic Plain in IMD data can be attributed to the presence of irrigation and atmospheric aerosols. Intensive irrigation over the Indo-Gangetic plain results in cooling in surface as well as air temperature as reported in the previous studies ( Mathur and AchutaRao (2019) reported that irrigation decreases maximum temperature by about 3°C over the most heavily irrigated parts of India. Apart from irrigation, atmospheric aerosols also affect precipitation and temperature over the Indo-Gangetic Plain (Oldenborgh et al 2018, Kumari et al 2019. For instance, Bollasina et al (2011) reported a decline in the summer monsoon precipitation over the Gangetic Plain due to atmospheric aerosols. Similarly, Kumar et al (2017) showed that the presence of aerosols results in cooling over the Gagetic Plain. They , however, reported that the cooling due to aerosols is much lesser than due to irrigation. Therefore, the representation of irrigation and atmospheric aerosols in climate models can play a vital role for the understanding of the observed and projected changes under the warming climate.
Since, we aimed to estimate the frequency of the observed hottest summer under the warming climate, long-term simulations representing the present climate and 2°C, and 3°C warmer worlds were required. As CAM5.1 simulations are available only for a limited period, we used EC-Earth simulations of 2000 years (400 runs of five year each) for each of the three scenarios (Present climate, 2°C, and 3°C warmer worlds). However, the EC-Earth simulations are not available for the NAT and ALL scenarios. Ideally, it would have been better to use the simulations from the same climate model for the attribution analysis as well as for the future projections. Apart from the data related limitations, previous studies have highlighted the challenges associated with attributing a single climate extreme to anthropogenic climate change (Hulme 2014, Cattiaux andRibes 2018). Cattiaux and Ribes (2018) suggested a detailed approach for selecting the extreme event for attribution analysis. We, however, followed the simple approach based on aggregated mean maximum temperature for each climatic region in India to identify the hottest summer. Notwithstanding, these limitations associated with the selection of events, simulations from the climate models, and representation of irrigation and atmospheric aerosols in the climate models, our findings show a robust increase in the probability of the observed hottest summer over India under the warming climate.
We find a profound rise in the frequency and magnitude of the hot summers in a 2°C and 3°C warmer world in all the climatic regions in India. Our attribution analysis showed a direct link between the occurrence of the hottest summers and anthropogenic warming in three out of the five climatic zones. Limiting global mean temperature rise below 2°C, as negotiated in the Paris agreement, would reduce future increases in the frequency of the hot summers and associated CHWD in India substantially, reducing negative impacts on the society. Failing to limit the global mean temperature rise to 2°C may have a farreaching consequence in various sectors in India. For instance, projected increases in hot summers and extremely hot days are projected to increase morbidity and mortality (Field et al 2012, Azhar et al 2014, Murari et al 2015. The section of population without access to passive/active cooling systems and those who are exposed to outdoor weather like farmers, labourers, and vagrants are the most vulnerable to heatwaves and extreme temperature events (IPCC 2012, Akhtar 2007. Extreme hot days also negatively impact the agricultural and water resources sectors (Nath andBehera 2011, Misra 2014). Therefore, we emphasize the importance of strong global mitigation action and the significance of developing local adaptive strategies to reduce the exposure of the vulnerable population to extreme temperature events in India. https://portal.nersc.gov/c20c/. The data that support the findings of this study are available from the corresponding author upon reasonable request.