Human influence on historical heaviest precipitation events in the Yangtze River Valley

With the recurrence of high-impact extreme events and growing public demand to understand their causes, event attribution has emerged as a frontier of climate change research. Typically, an event attribution study focuses on one individual extreme event that has just occurred. Studies rarely examine human influence on multiple extreme events at different times in the past. Here we conduct a comprehensive attribution analysis of the four heaviest precipitation events in the Yangtze River Valley during the past 100 years. We start by defining extreme precipitation events as the heaviest precipitation over a fixed-size area that is of direct relevance to flood preparedness and management. When examining the events over the historical period, we allow the precise location of the area to change from year to year. With this definition, four extremely strong events are identified, which occurred in the summers of 1931, 1954, 1998 and 2020. We find that the impacts of greenhouse gases (GHGs) and anthropogenic aerosols (AAs) on these events differ clearly between time periods. The impacts were negligible in the early period and have become increasingly discernible since the late 20th century. The GHGs have gradually increased the occurrence probability of extreme precipitation, while the AAs have decreased it. These competing effects have led to a human influence on extreme precipitation that was initially slight and has gradually increased over time. GHGs have exerted a larger influence on short-duration precipitation events while AAs have had a larger influence on monthly mean precipitation. The more extreme the precipitation event, the clearer the anthropogenic influence.


Introduction
As the globe warms, many types of extreme weather and climate events are increasing and have led to serious economic losses (Lee et al 2018, IPCC 2021). The middle and lower reaches of the Yangtze River Valley (MLYRV) are located in the key East Asian monsoon region and have the highest population density and level of economic development in China, making the region very susceptible to increases in precipitation extremes. Observations have shown that the frequency and intensity of extreme precipitation at most stations over the MLYRV have increased (Ma et al 2015, Hu et al 2016), generally consistent with the global-scale changes in precipitation extremes since the 1960s (IPCC 2021).
The increasing number of extreme events has led to an urgent need to understand their causes and thus to the growth of event attribution studies. Many authors have investigated human influence on extreme precipitation in southeast China, but their findings are not always consistent. Several studies have shown that anthropogenic forcing has increased the probability of intense, short-duration rainfall events (Burke et al 2016) and of extreme monthly precipitation (Sun et al 2019). Li et al (2018) investigated the third wettest May, that of 2016, in the Yangtze River Valley (YRV) and did not find significant anthropogenic influence. Some studies found that anthropogenic forcing had reduced the likelihood of extreme precipitation amounts (Li et al 2021a) and also the probability of the extreme Meiyu event in 2020, due to a weakened summer monsoon circulation (Tang et al 2022) resulting from the influence of aerosols (Zhou et al 2021). There is thus still no consensus about human influence on extreme precipitation in eastern China, and the relative roles of greenhouse gases (GHGs) and aerosols in changing precipitation extremes still need to be examined.
Currently, event attribution studies mostly focus on a single event that occurred very recently. While King et al (2016) examined historical hot extremes, finding a significant human contribution to the probability of record-breaking global temperature events as early as the 1930s, comparable examination of historical extreme precipitation has been lacking. Here, we conduct an attribution study of the four most destructive extreme precipitation events in the MLYRV since the early 20th century. The four events, which occurred in 1931, 1954, 1998 and 2020, were characterized by sequences of rainstorms that produced long-duration (over 50 d) and excessive precipitation during the Meiyu season (Liu and Ding 2020), resulting in destructive floods (Li et al 2021b) and severe impacts on society and the economy. At least 145 000 people died in the 1931 event owing to the lack of flood control measures, and the numbers of deaths in the 1954, 1998 and 2020 events were over 30 000, 1000 and 100 respectively, with economic losses of up to billions of dollars (Zong and Chen 2000, CMA 2021). Our aim is twofold: to understand the impacts of increasing emissions of GHGs and aerosols on heavy precipitation in the MLYRV; and to compare whether the effects of human emissions on extreme precipitation events differ across the historical period. We also propose an approach to pre-calculate risk ratios (RRs) for a set of extreme precipitation events with predefined return periods, which could be considered for fast-track attribution in the future.
The remainder of the paper is organized as follows. We first describe the data and methodology used in this study. In section 3, we report the general features of the top four extreme precipitation events in the MLYRV, evaluate the performance of the CMIP6 models in simulating extreme precipitation in the region, and present the attribution results for these four events. Temporal evolutions of the influences of GHGs and aerosols on extreme precipitation events of predefined return periods are also presented in this section, which is followed by a conclusion section.

Indices used in this study
In the YRV, the main rainy season is the Meiyu season, which generally occurs during June and July (JJ). We thus use three indicators, the JJ mean precipitation and the maximum 1 day and consecutive 5 day precipitation in the two months, to characterize the extreme monthly and short-duration precipitation events respectively in the MLYRV (26-34°N, 110-120°E, black box in figure 1). These indicators are computed from both station observations and climate model simulations, to be described below, for subsequent analyses. The retrieved precipitation indices are all expressed as percentage anomalies relative to their 1961-1990 long-term means. The percentage anomalies are calculated at 2.0° × 2.0° resolution grid boxes and then averaged across the region. Our attribution analysis is based on the averages of these indices for one particular region. For simplicity, these regional averages will be referred to as Pmean for regional mean June-July mean precipitation, Rx1day for regional mean June-July maximum 1 day precipitation, and Rx5day for regional mean June-July maximum 5 day precipitation.
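As an illustration only, the index definitions above could be computed along the following lines. This is a minimal sketch, not the processing code used in the study: the function names, the array layout (years, days, latitude, longitude) and the unweighted regional average are assumptions made for the example.

```python
import numpy as np

def jj_indices(daily):
    """daily: (n_years, n_days, n_lat, n_lon) June-July daily precipitation.

    Returns per-grid-box JJ mean, maximum 1 day and maximum 5 day totals.
    """
    pmean = daily.mean(axis=1)
    rx1day = daily.max(axis=1)
    # Running 5 day totals from cumulative sums along the day axis.
    csum = np.cumsum(daily, axis=1)
    csum = np.concatenate([np.zeros_like(csum[:, :1]), csum], axis=1)
    five_day = csum[:, 5:] - csum[:, :-5]
    rx5day = five_day.max(axis=1)
    return pmean, rx1day, rx5day

def regional_percent_anomaly(index, years, base=(1961, 1990)):
    """index: (n_years, n_lat, n_lon). Percentage anomaly at each grid box
    relative to the base-period mean, then averaged over the region.
    The regional mean here is unweighted (an assumption of this sketch)."""
    in_base = (years >= base[0]) & (years <= base[1])
    clim = index[in_base].mean(axis=0)
    anomaly = 100.0 * (index - clim) / clim
    return anomaly.mean(axis=(1, 2))
```

Computing the anomaly at each grid box before averaging, as in the text, keeps wet grid boxes from dominating the regional value.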

Observational and model data
Homogenized daily precipitation observations of 2419 stations across China for 1951-2020 are obtained from the National Meteorological Information Center of the China Meteorological Administration (CMA; Cao et al 2016). To reduce the potential influence of missing values on the assessed changes in the precipitation indices, stations with more than 25% missing daily values in the 70 summers during 1951-2020 are excluded from the analysis, leaving 405 stations in the MLYRV region. Daily station observations are converted to 2.0° × 2.0° resolution by first averaging all available station observations within each 0.5° × 0.5° grid box, then averaging the available 0.5° × 0.5° grid box values within each 2.0° × 2.0° grid box, and finally computing Rx1day and Rx5day at each 2.0° grid box. This processing method has been found to be more appropriate for model-data comparisons of precipitation and temperature extremes (Ou et al 2013).
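The two-step averaging described above can be sketched as follows for a single day of station values. This is a hedged illustration rather than the study's code: the function names are hypothetical, and the default grid origin and size simply correspond to the MLYRV box (26-34°N, 110-120°E) on a 2.0° grid.

```python
import numpy as np

def box_average(lat, lon, values, res, lat0, lon0, n_lat, n_lon):
    """Average point values into res-degree boxes; boxes without data are NaN."""
    i = np.floor((lat - lat0) / res).astype(int)
    j = np.floor((lon - lon0) / res).astype(int)
    ok = (i >= 0) & (i < n_lat) & (j >= 0) & (j < n_lon) & ~np.isnan(values)
    total = np.zeros((n_lat, n_lon))
    count = np.zeros((n_lat, n_lon))
    np.add.at(total, (i[ok], j[ok]), values[ok])
    np.add.at(count, (i[ok], j[ok]), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)

def stations_to_coarse(lat, lon, values, lat0=26.0, lon0=110.0, n_lat=4, n_lon=5):
    """Stations -> 0.5 degree boxes -> 2.0 degree boxes (one day of data)."""
    fine = box_average(lat, lon, values, 0.5, lat0, lon0, n_lat * 4, n_lon * 4)
    # Treat each 0.5 degree box centre as a point for the second averaging step.
    fine_lat = lat0 + 0.5 * (np.arange(n_lat * 4) + 0.5)
    fine_lon = lon0 + 0.5 * (np.arange(n_lon * 4) + 0.5)
    glat, glon = np.meshgrid(fine_lat, fine_lon, indexing="ij")
    return box_average(glat.ravel(), glon.ravel(), fine.ravel(),
                       2.0, lat0, lon0, n_lat, n_lon)
```

The intermediate 0.5° step limits the weight of densely clustered stations: two stations sharing one 0.5° box count once, not twice, in the 2.0° average.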
Due to the lack of station observations prior to the 1950s, the CRU TS Version 4 monthly precipitation for the period 1901-2020 (CRU; Harris et al 2020) and the European Centre for Medium-Range Weather Forecasts Atmospheric Reanalysis for the 20th Century (1900-2010; ERA20C) daily precipitation reanalysis (Poli et al 2016) are used to evaluate the precipitation mean (Pmean) and daily extreme (Rx1day and Rx5day) indices before 1950, respectively. Their consistency with the station observations is also evaluated over their common periods.
For attribution purposes, the considered indices are also calculated from the Coupled Model Intercomparison Project Phase 6 (CMIP6; Eyring et al 2016, O'Neill et al 2016) simulations under the combined effects of all anthropogenic and natural forcings (ALL), natural (NAT) external forcing, anthropogenic aerosol (AA) forcing and greenhouse gas (GHG) forcing, as well as from the preindustrial control (CTL) simulations, in which all external forcings are held constant at preindustrial levels. The NAT, GHG, and AA simulations for 2015-2020 are driven by the corresponding forcing defined under the shared socioeconomic pathway (SSP) 2-4.5 emissions scenario (Gillett et al 2016). Given this, and the fact that the differences in external forcing among the SSP scenarios during this period are very small, the ALL simulations that end in 2014 are extended to 2020 using projections under the SSP2-4.5 scenario. Table 1 provides detailed information about the simulations used in this study.
It is noted that all the model simulations are bilinearly interpolated to a 2.0° × 2.0° grid before computing the precipitation indices. The calculated precipitation indices from different simulation experiments (i.e. ALL, NAT, GHG, AA, CTL) of each climate model are converted to percentage anomalies with respect to the ensemble averages of their 1961-1990 climatological mean values in the available ALL simulations from that climate model.
Table 1. Available models and numbers of runs used.

Calculation of event probability
As the three indices for mean and extreme precipitation are transformed to regional means, it is difficult to make proper assumptions about the underlying probability distributions of these processed values. We therefore use an empirical method (Bonsal et al 2001), which does not assume a form of probability distribution, to estimate the occurrence probability of precipitation events. Given a sample of indices during a period of N years that are sorted in ascending order, the empirical method estimates the occurrence (or exceedance) probability of a value that ranks m in the sorted sample as P = 1 − (m − 0.31) / (N + 0.38). We analyze three RRs to assess the influences of different external forcings on the occurrence of extreme precipitation events in the MLYRV: RR ALL = P ALL / P NAT, RR GHG = P GHG / P CTL and RR AA = P AA / P CTL, where P ALL , P GHG , P AA , P NAT , and P CTL represent the occurrence probabilities of an extreme event in the ALL, GHG, AA, NAT, and CTL simulations, respectively. These RRs quantify how many times as likely the occurrence of an extreme event is in one climate condition relative to another (Allen 2003, NAS 2016). For example, RR ALL measures the influence of ALL forcing as of the year of the event relative to a naturally forced (NAT) climate condition. To estimate the probabilities involved, we use simulations in an 11 year time window centered on the year of the event, except for the CTL probabilities, for which all the available CTL simulations are used. When the event occurs in the first or last 5 years of the 1900-2020 study period, as for the 2020 extreme Meiyu event, simulations for the first or last 11 years are used instead. We use a bootstrapping method to estimate the 5%-95% uncertainty ranges of the RRs: we repeatedly draw samples with replacement from the given data to estimate a set of event occurrence probabilities and thus RRs, from which the 5%-95% uncertainty ranges are calculated.
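The probability and RR calculation can be sketched as below. This is a minimal illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the exceedance probability of an arbitrary threshold is obtained by linear interpolation between the plotting positions of the sorted sample (the text defines the probability only for sample values), and each sample is resampled independently in the bootstrap.

```python
import numpy as np

def exceedance_probability(sample, threshold):
    """Empirical exceedance probability from P = 1 - (m - 0.31) / (N + 0.38).

    Thresholds between sample values are linearly interpolated (assumption);
    thresholds outside the sample range take the nearest end value."""
    x = np.sort(np.asarray(sample, dtype=float).ravel())
    n = x.size
    p = 1.0 - (np.arange(1, n + 1) - 0.31) / (n + 0.38)
    return float(np.interp(threshold, x, p))

def risk_ratio(forced, reference, threshold):
    """RR = P_forced / P_reference, e.g. ALL vs NAT, or GHG vs CTL."""
    return (exceedance_probability(forced, threshold)
            / exceedance_probability(reference, threshold))

def bootstrap_rr(forced, reference, threshold, n_boot=1000, seed=0):
    """5%-95% range of RR from resampling both samples with replacement."""
    rng = np.random.default_rng(seed)
    f = np.asarray(forced, dtype=float).ravel()
    r = np.asarray(reference, dtype=float).ravel()
    rr = np.empty(n_boot)
    for k in range(n_boot):
        fb = rng.choice(f, size=f.size, replace=True)
        rb = rng.choice(r, size=r.size, replace=True)
        rr[k] = risk_ratio(fb, rb, threshold)
    return np.percentile(rr, [5, 95])
```

Because the smallest probability the formula can return is 0.69 / (N + 0.38), the ratio never divides by zero, but RRs for events beyond the largest simulated value are bounded by the sample size and should be read with care. The corresponding return period is simply 1 / P.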
It is noted that event attribution in existing studies is almost exclusively framed to answer whether human activities have influenced a certain type of extreme event that occurred in a predefined and fixed region. From the perspective of emergency preparedness and flood management, floods in the MLYRV can be caused by extreme precipitation in the middle and lower reaches but also in, for example, the upper reach. We therefore propose a method to consider changes in the most extreme precipitation events, as measured by the defined indices, in all possible regions of similar size to the MLYRV within a larger domain in eastern China (22-40°N, 96-120°E, blue box in figure 1), which roughly covers the mid-upper to the lower valley of the Yangtze River. The most extreme precipitation is searched for in a 'large rainfall area' by sliding a rectangle of the same size as the MLYRV (i.e. the black box in figure 1) over the larger domain (blue box). In this way, time series of the most extreme Pmean, Rx1day, and Rx5day over the large rainfall area are obtained for the observations, reanalyses and climate model simulations, and are used for the subsequent event attribution analysis.

Observed heavy precipitation in the MLYRV
Figure 1 shows time series of Pmean, Rx1day and Rx5day in the MLYRV from 1900 to 2020. The most striking feature is the large interannual to interdecadal variation in all the indices, with no clear long-term trend until the 1980s. Based on Pmean, the top four heaviest precipitation events occurred in 1954, 2020, 1998 and 1931 respectively, and these events are also reflected in Rx1day and Rx5day. This is consistent with the events recorded in the historical literature (Duan and Takara 2020). Before 1950, CRU and ERA20C show distinct differences for Pmean, with the latter exhibiting substantially higher variability, which is also seen for Rx1day and Rx5day. This reflects the large uncertainty of the data in the early period due to the lack of reliable precipitation observations.
After 1950, both CRU and ERA20C show good consistency with the CMA observations. Hereinafter, we use the CMA observations for analyzing the events during 1951-2020. For events before 1950, the CRU data are used for Pmean because this dataset is based on observations, and ERA20C is used for Rx1day and Rx5day because it provides daily values. We note that prior to 1950, station observations of precipitation were generally too sparse to constrain the CRU data well, and that the sea level pressure observations used to constrain ERA20C were also limited. For this reason, precipitation data prior to 1950 are not as reliable as those in the later period, and caution is needed when interpreting precipitation changes prior to 1950 over this region.
The spatial distributions (figure 1) of rainfall accumulations during these four events show that the 1931 rainband was tilted in a northeast-southwest direction with its center located in the south of the MLYRV, whereas basin-wide, zonally oriented rainfall belts appeared along the YRV in 1954, 1998 and 2020. The center of the 1998 event lay farther south than those of the 1954 and 2020 events, so the 1998 rainfall belt extended to southwest China. The varying spatial distributions of the rainfall centers of the four events provide a justification for our choice of nonfixed extreme precipitation regions to frame the extreme precipitation attribution analysis.
Figure 2 shows the temporal evolution of observed and model-simulated mean and extreme precipitation indices in the 'large rainfall area'. The magnitude and ordering of the top four events change slightly, but they can still be recognized as the heaviest precipitation events in the historical record. The spread of the CMIP6 multi-model simulations covers almost all of the observed variations in the three considered precipitation indices, indicating a good ability of the models to reproduce the variability of observed precipitation in a large area. In contrast to the changes in the MLYRV, there are large variations in Rx1day and Rx5day before the 1950s, which reflects the large uncertainty of reanalysis data such as ERA20C in the early period. To verify the model performance in simulating the heaviest events, we compare the CMA-observed and model-simulated return periods of the top ten observed heavy precipitation events in the 'large rainfall area' during 1951-2020 (figure 3). For all the indices, the model-simulated return periods closely match the observations in the 'large rainfall area', reasonably reproducing most of the extreme precipitation events. Only for the extraordinarily extreme events, such as the top two events with return periods longer than 100 years, do the models show some bias, especially for Pmean.
The relatively fast decay of the empirical distribution function at the upper tail may be responsible for the biases. Overall, these results indicate that the models describe such events reliably and that the empirical fitting is appropriate for the calculation of event occurrence probability.
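The 'large rainfall area' series evaluated in this section come from the sliding-rectangle search described in the methodology. A minimal sketch of that search is given below, assuming each year's index is already a percentage-anomaly field on the 2.0° grid over the larger domain (9 × 12 grid boxes for 22-40°N, 96-120°E) and that the rectangle spans 4 × 5 grid boxes like the MLYRV. The function names and the brute-force loop are illustrative choices, not the study's code.

```python
import numpy as np

def largest_window_mean(field, win_lat=4, win_lon=5):
    """field: (n_lat, n_lon) index values for one year.

    Slides a win_lat x win_lon rectangle over the field and returns the
    largest window mean and the (row, column) of that window's corner."""
    n_lat, n_lon = field.shape
    best, where = -np.inf, None
    for i in range(n_lat - win_lat + 1):
        for j in range(n_lon - win_lon + 1):
            mean = np.nanmean(field[i:i + win_lat, j:j + win_lon])
            if mean > best:
                best, where = mean, (i, j)
    return best, where

def large_area_series(fields):
    """fields: (n_years, n_lat, n_lon). Returns the yearly maximum window mean."""
    return np.array([largest_window_mean(f)[0] for f in fields])
```

Applying the same search to the observations, reanalyses and every model run keeps the event definition identical across datasets, even though the selected rectangle may sit in a different place each year.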

Attribution of the heaviest precipitation events

The 1931 event
The observed Pmean (CRU data) for the 1931 event is 40% larger than the 1961-1990 mean, with a return period of ∼45 years as estimated by the empirical method (table 2). For the models, there is almost no separation between the probability density functions (PDFs) under different forcing experiments (figure 4). The probability calculation shows that the probability for Pmean in 1931 is 0.18 (90% CI: 0.16-0.19) in the ALL simulations while it is 0.23 (0.21-0.25) in the NAT simulations, which corresponds to RR ALL = 0.77 (0.67-0.89). Likewise, we estimate the influences of GHGs and AAs as RR GHG = 1.20 (0.98-1.40) and RR AA = 0.65 (0.54-0.77).
For Rx1day, the three RR estimates are RR ALL = 0.95 (0.69-1.38), RR GHG = 1.25 (1.02-1.58) and RR AA = 0.66 (0.50-0.82), while for Rx5day, the estimates are RR ALL = 2.13 (0.45-144), RR GHG = 1.85 (0.61-4.19) and RR AA = 1.22 (0.02-3.03). The RR estimates for the three indices basically indicate a negligible influence of external forcing on the 1931 event, owing to low anthropogenic emissions in the early 20th century. However, we also note a few RRs for Rx5day close to 2, which is possibly due to the quite small model sample and to the relatively fast decay of the empirical distribution function at the upper tail, both of which impede a reliable estimation of human influence.

The 1954 event
The 1954 event is the heaviest precipitation event in the MLYRV during the instrumental period, with Pmean reaching +83% (CMA observation). For all three indices, the observed values fall at the far upper tails of the probability distributions regardless of external forcing (figure S1). The occurrence probability of Pmean for a 1954-like event is 0.6% (0.3%-0.9%) in the ALL simulations, while it is 0.9% (0.4%-1.5%) in the NAT simulations. This gives RR ALL = 0.66 (0.26-1.65). We can also calculate that RR GHG = 1.64 (0.78-2.71) and RR AA = 0.32 (0.08-0.66). For Rx1day, RR ALL = 0.65 (0.13-2.63), RR GHG = 1.07 (0.41-1.92) and RR AA = 0.44 (0.01-1.08). For Rx5day, RR ALL = 1.00 (0.48-2.03), RR GHG = 1.81 (1.08-2.81) and RR AA = 0.22 (0.02-0.46). These results suggest that with slight increases in GHG emissions, the influence of GHG forcing emerged, roughly doubling the probability of event occurrence, whereas AA forcing made the opposite contribution, reducing the probability to about half of that in the preindustrial climate or less. Because the AA effect was stronger than the GHG effect, the net effect of ALL forcing was to slightly reduce the likelihood of a 1954-like event, although the uncertainty is large.

The 2020 event
The second heaviest rainfall event occurred in 2020. The modeled PDFs from different forcing experiments show clearer separation than those in the early period (figure 4). For Pmean, RR ALL = 1.72 (1.21-2.50), RR GHG = 2.64 (1.81-3.85) and RR AA = 0.07 (∼0.00-0.14). For Rx1day, RR ALL = 1.33 (1.12-1.54), RR GHG = 1.80 (1.61-2.05) and RR AA = 0.54 (0.40-0.66). For Rx5day, RR ALL = 1.25 (1.04-1.51), RR GHG = 1.84 (1.58-2.15) and RR AA = 0.44 (0.34-0.56). These RR estimates show that since the early 21st century, GHGs have started to play a leading role, approximately doubling the occurrence probability of heavy precipitation, so that the net anthropogenic influence contributes positively to both mean and daily extreme precipitation. We note the different conclusion about Pmean in Lu et al (2021), which may be related to their selected threshold and to differences in methods and framing.
To summarize, figure 5 shows the RRs under different external forcings for these four events. In the early 20th century, the influence of GHGs was almost negligible. Even though the uncertainty ranges of the RR for aerosol forcing for the 1931 and 1954 events are often below one, there is no clear consistency among the three indices, suggesting that the effect of aerosol forcing could not be robustly detected in the early 20th century. As anthropogenic emissions increased, the effect of GHGs in enhancing heavy precipitation and the effect of AAs in reducing heavy precipitation started to emerge and gradually strengthened. For both monthly mean and short-duration extreme precipitation, GHG forcing increased the occurrence probability of extreme events, while AA forcing reduced it. The wetting effects of steadily increasing GHGs were offset to different degrees by the drying effects of AAs, resulting in different net impacts of human activities on extreme precipitation in different periods.

Attribution of precipitation events with different return periods
To investigate the temporal evolution of GHG and AA impacts more clearly, we conduct attribution analyses of events with different return periods. We use return levels for the 1-in-5 year, 10 year and 50 year events from the ALL model simulations over the MLYRV (table 3) to infer the RRs for events having the same return periods. Figure 6 shows the temporal evolution of the RRs since 1850. For all three indices at the early stage, RR ALL remains approximately at unity, indicating little effect of external forcings on extreme precipitation. Later, as emissions increase, RR GHG rises and RR AA decreases, indicating the gradually strengthening effects of GHGs in increasing extreme precipitation and of AAs in suppressing it. This reflects the opposing effects of globally near-uniform greenhouse forcing, which enhances precipitation, and of the increasing aerosol emissions in this region due to industrial development, which suppress precipitation. This appears consistent with the finding that anthropogenic influence on climate, including increases in GHGs, has made a detectable contribution to the observed shift toward heavy precipitation in eastern China, with the effect of GHG forcing being offset by that of AAs (Ma et al 2017). For all the indices, the more extreme the precipitation event, the more pronounced the effect of anthropogenic forcings.
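The evolution of RR ALL for a fixed return level can be sketched with a moving 11 year window, as below. This is an illustrative reading of the procedure, not the study's code: the function names are hypothetical, the threshold interpolation is an assumption, and only the ALL versus NAT ratio is shown (for RR GHG and RR AA the text states that all available CTL simulations are pooled instead of being windowed).

```python
import numpy as np

def exceedance_probability(sample, threshold):
    """Empirical exceedance probability, P = 1 - (m - 0.31) / (N + 0.38),
    linearly interpolated between sorted sample values (assumption)."""
    x = np.sort(np.asarray(sample, dtype=float).ravel())
    n = x.size
    p = 1.0 - (np.arange(1, n + 1) - 0.31) / (n + 0.38)
    return float(np.interp(threshold, x, p))

def running_rr(forced, reference, level, width=11):
    """forced, reference: (n_runs, n_years) index values, n_years >= width.

    For each year, pools all runs in a window of `width` years centered on
    that year; near the ends the first or last `width` years are used."""
    forced = np.asarray(forced, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n_years = forced.shape[1]
    half = width // 2
    rr = np.empty(n_years)
    for t in range(n_years):
        start = min(max(t - half, 0), n_years - width)
        window = slice(start, start + width)
        rr[t] = (exceedance_probability(forced[:, window], level)
                 / exceedance_probability(reference[:, window], level))
    return rr
```

Pooling runs across the window enlarges the sample behind each probability, which matters most for the 1-in-50 year level, where single-year samples would contain very few exceedances.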
For Pmean, the impact of external forcing on precipitation has gradually emerged since the late 20th century. The effect of GHGs was relatively small compared with that of AAs, so that it rarely increased the odds of extreme precipitation events, with RR remaining approximately at unity. Up to the early 21st century, RR ALL gradually approached unity with increasing GHG emissions. The anthropogenic influence on Pmean then turns into a slight wetting effect, which is clearly seen from the attribution of the 1-in-50 year event and of the 2020 event in section 3.3.4. For more extreme Pmean events, AAs show more influence. For example, AAs decreased the occurrence probability of a 1-in-50 year event by a factor of more than five around the 1990s, after which their impact began to weaken.
For Rx1day and Rx5day, the anthropogenic influence is similar to that for Pmean, but the influence of GHGs is larger, especially for 1-in-50 year events, indicating a greater influence of the GHG-induced moisture increase. In the 1950s, ALL forcing initially reduced the odds of these short-duration events, and since the late 1990s, ALL forcing has started to increase them. The GHG wetting effect on the daily extremes is considerably stronger than that on Pmean, leading RR ALL to exceed unity earlier. Aerosols clearly decreased the occurrence probability of extreme precipitation, but with a larger uncertainty range. The competing effects of GHGs and AAs lead to different total anthropogenic influences on extreme precipitation events in different periods.
As pointed out in previous studies, GHG-induced warming of the atmosphere increases atmospheric total column moisture, in line with the Clausius-Clapeyron relation (e.g. Li et al 2015). Both the increase in local water vapor and the enhanced water vapor transport by monsoon meridional advection favor precipitation over the MLYRV. In contrast, the cooling induced directly by aerosols reduces East Asian summer monsoon mean rainfall (Lau and Kim 2006, Zhou et al 2020). The larger cooling anomaly over land than over ocean reduces the land-ocean thermal contrast, which weakens the East Asian summer monsoon circulation. Thermodynamically, cooling directly reduces the saturation vapor pressure; because relative humidity changes little, atmospheric moisture decreases, leading to weakened moisture advection. All of these are possible mechanisms for the influence of external forcing on precipitation in the MLYRV. However, we also note that GHG and AA changes play different roles for precipitation on different time scales and for different metrics.

Conclusion
Observational and CMIP6 model data are employed to attribute the heaviest historical precipitation events in the MLYRV during the past 100 years. A new attribution method that uses an empirical probability estimate in a nonfixed reference area is adopted to calculate the probability change in a large rainfall area. With this method, the CMIP6 models are generally able to reproduce the return periods of the heaviest precipitation events as indicated by June-July mean precipitation, Rx1day and Rx5day, though some biases are seen for the most extraordinary events. With these model results, we find that the impacts of GHGs and AAs on the four extreme events in eastern China differ clearly between historical periods, with negligible impacts in the early period and discernible human influence in the late period, in particular since the beginning of the 21st century. The PDFs of extreme indices under different external forcings show distinct separation once human emissions become large, after the beginning of the 21st century.
The temporal evolution of the aforementioned human (GHG and AA) influences shows gradually strengthening impacts over time for 1-in-5 year, 10 year and 50 year events. Their competing effects led to a human influence on extreme precipitation that was initially slight and then gradually increased after the beginning of the 21st century. GHGs have had a larger influence on Rx1day and Rx5day while AAs have had a larger influence on Pmean.
This study illustrates the different impacts of GHGs and AAs on extreme precipitation in different periods with different climate states. It partially explains the conflicting conclusions about human influence on precipitation events in the YRV region. Figure 6 illustrates a non-smooth evolution of human influence on precipitation events, which might be caused by different climate states, model sampling and the events themselves. Different metrics (such as mean or extreme precipitation), different event intensities (such as 5 year or 50 year events) and different historical periods could lead to different attribution results. The relatively small sample from the limited model runs also impedes an accurate estimate of event occurrence probability. However, the general conclusion holds: the human influence on regional extreme precipitation events is emerging and becomes more important with increasing GHG emissions.

Data availability statement
The data generated and/or analyzed during the current study are not publicly available for legal/ethical reasons but are available from the corresponding author on reasonable request.