Diverse estimates of annual maxima daily precipitation in 22 state-of-the-art quasi-global land observation datasets

Observational evidence of precipitation extremes is vital to better understand how these events might change in a future warmer climate. Over the terrestrial regions of a quasi-global domain, we assess the representation of annual maxima of daily precipitation (Rx1day) in 22 observational products gridded at 1° × 1° resolution and clustered into four categories: station-based in situ, satellite observations with or without a correction to rain gauges, and reanalyses (5, 8, 4 and 5 datasets, respectively). We also evaluate the interproduct spread across the ensemble and within the four clusters, as a measure of observational uncertainty. We find that reanalyses present a heterogeneous representation of Rx1day, in particular over the tropics, and that their interproduct spread is the highest of any cluster. Extreme precipitation in satellite data broadly compares well with in situ-based data. We find generally better agreement with in situ-based observations, and less interproduct spread, for the satellite products with a correction to rain gauges than for the uncorrected products. Given the level of uncertainty associated with the estimation of Rx1day in the observations, none of the datasets can be thought of as the best estimate. Our recommendation is to avoid using reanalyses as observational evidence and to consider in situ and satellite data (preferably the corrected version) in an ensemble of products for a better estimation of precipitation extremes and their observational uncertainties. On this basis we choose a subsample of 10 datasets that reduces the interproduct spread in both the representation of Rx1day and its timing throughout the year, compared to all 22 datasets. We emphasize that the recommendations and selection of datasets given here may not be relevant for different precipitation indices, other grid resolutions or other time scales.


Introduction
Precipitation is heterogeneous in space and time, and its measurement is further complicated by the heterogeneity of the ground-based measurement network. While most mid-latitude terrestrial regions are well monitored, in the tropics the station network is sparse and of lower quality, and data availability is limited (Alexander 2016). Overall, the observation of extreme precipitation from ground-based instruments is challenging. Better spatial coverage is achieved by interpolating in situ-based data onto a grid. However, gridded datasets represent an area-averaged measure of precipitation and are therefore intrinsically different from a station-based measure (Knutson 2008, Gervais et al 2014a). Uncertainties arise from the different gridding methods applied (Dunn et al 2014, Avila et al 2015), but the advantage of gridded datasets is that they allow comparison with other observational products (such as satellite datasets) and are useful for model validation.
Observational datasets created from satellite retrievals generally provide better spatial coverage than station-based gridded products, and this allows the assessment of precipitation in data-sparse areas of the globe. However, as the satellite era is relatively new, there is little guidance on the reliability of these observations for the study of long-term changes in precipitation extremes. Reanalysis data provide an alternative to ground-based and satellite-only observations. Using a model constrained by assimilated observations, they allow a complete spatio-temporal characterization of precipitation and other atmospheric fields to investigate the mechanisms at play. However, precipitation is generally not assimilated: it is a result of the model physics, although guided by the other assimilated variables. There is generally less agreement between different reanalysis datasets than between in situ datasets (Bosilovich et al 2009, Sun et al 2018). While we recognise that reanalyses are not strictly observations, they are widely used in the literature as a proxy for precipitation observations, especially in trend analyses and model evaluation studies (e.g. Kharin et al 2013, IPCC 2013, Sillmann et al 2013, Sun et al 2018). Furthermore, studies focusing on mechanisms often opt for reanalyses because they provide a consistent multi-variate framework that is harder to obtain from observations alone. Therefore, reanalyses are incorporated in this study for intercomparison purposes.
In order to estimate how diverse the representation of extreme precipitation is across observational products, an ensemble of datasets can be considered. Most studies intercomparing extreme precipitation in observational products are regional in scale and often in regions of high data density (e.g. Gervais et al 2014b, Yin et al 2015, Timmermans et al 2019), while less effort has been made to characterize regions of various spatial coverage at the global scale (e.g. Donat et al 2014, Herold et al 2017). At present, a large number of observational datasets exist for precipitation, from a variety of observational sources. This is mainly explained by an increasing number of satellite and reanalysis datasets, as well as new product versions being released regularly. This study aims to comprehensively examine the representation of a measure of precipitation extremes in a large number of observational datasets commonly used in the climate community. Extreme precipitation is defined here as the annual maxima of daily precipitation, and we focus on quasi-global land areas (50°S:50°N, 130°W:180°E). The observational products used in this study are available through the Frequent Rainfall Observations on Grids (FROGS) database (Roca et al 2019a), which provides a variety of gridded observational precipitation datasets (at 1° × 1° resolution). We consider 22 datasets from different sources: station-based in situ data, satellite retrievals with or without a correction to rain gauges, and reanalyses. We define observational uncertainty as the spread across the large ensemble of observational products, and the respective uncertainty for each data source. We further aim to give guidance on the use of these different observational products, specifically as it relates to annual maxima of daily precipitation at the (quasi) global scale and over land. The observational datasets used in this study are presented in section 2, together with the definition of extreme precipitation. Section 3 describes the results. The findings are discussed in section 4 and conclusions are given in section 5.
2. Data and indices for observed extreme precipitation
2.1. Observational datasets of daily precipitation
We consider 22 datasets with a quasi-global coverage (50°S:50°N; 130°W:180°E) and we focus on land only. All datasets were gathered and reformatted onto 1° × 1° daily grids for the FROGS database (Roca et al 2019a) with a common land-sea mask (from REGEN_ALL_v2019) applied. Roca et al (2019a) describe the estimation of daily precipitation accumulation in all these datasets (see table 1). In addition to comparing each dataset individually with every other dataset, we also form product 'clusters': in situ-based (5 datasets), satellite with (8 datasets) or without (4 datasets) a correction to rain gauges, and reanalyses (5 datasets).

Precipitation extremes
We define extreme precipitation as the annual maximum 1 day precipitation amount (in mm), or Rx1day (Zhang et al 2011). This gives information on the magnitude rather than the frequency of extreme precipitation events, but we also compare the timing of these annual maxima throughout the year between the datasets in figure 3 (and related text). Annual extreme precipitation is compared to annual total wet-day precipitation (i.e. the total from days with precipitation ≥ 1 mm), or the prcptot index. Both precipitation indices were characterized and are recommended by the WMO/WCRP/JCOMM Expert Team on Climate Change Detection and Indices (ETCCDI, Zhang et al 2011). They are calculated from daily precipitation fields using the ClimPACT software (Alexander and Herold 2015, https://climpact-sci.org/). This ensures consistency in the calculation of the indices across all datasets, in particular in how missing values are treated.
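As an illustration of the two index definitions above, the following minimal sketch computes Rx1day and prcptot for a single year at a single grid cell. It is not the ClimPACT implementation: the function names are hypothetical, the wet-day test is written as ≥ 1 mm following the usual ETCCDI convention, and missing days are simply skipped, which is an assumption and not necessarily how ClimPACT treats them.

```python
import numpy as np

WET_DAY_MM = 1.0  # wet-day threshold in mm (assumed ETCCDI convention)

def rx1day(daily_mm):
    """Annual maximum 1 day precipitation (mm) for one year at one grid cell."""
    return float(np.nanmax(np.asarray(daily_mm, dtype=float)))

def prcptot(daily_mm, wet_day_mm=WET_DAY_MM):
    """Annual total precipitation (mm) summed over wet days only."""
    daily_mm = np.asarray(daily_mm, dtype=float)
    wet = daily_mm >= wet_day_mm  # NaN compares as False, so missing days are skipped
    return float(daily_mm[wet].sum())

# Synthetic year: mostly dry, two wet days, one drizzle day, one missing day.
year = np.zeros(365)
year[[10, 100, 200]] = [5.0, 42.5, 0.4]
year[300] = np.nan

print(rx1day(year))   # magnitude of the wettest day
print(prcptot(year))  # total over days at or above the wet-day threshold
```

Rx1day depends on a single day per year, whereas prcptot accumulates over all wet days, which is why the two indices can respond very differently to errors in the largest daily values.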
By definition, Rx1day yields one value per year (the wettest day of the year), and a common criticism of this index is that it can miss other 'extremes'. Indices based on the exceedance of a threshold such as the 95th or the 99th percentile (i.e. R95p and R99p) might be preferable, but the drawback of these percentile-based indices is that they require a base period for the percentile calculation. Base periods are recommended to be at least several decades long (for example, WMO uses a 30 year standard). However, the longest overlapping period for the 22 datasets used here is 13 years (2001-2013), which is probably too short to use percentile-based indices. As an illustration, we compare the mean global R99p values in two datasets (REGEN_ALL_v2019 and JRA-55) using three base periods of different lengths (15, 30 and 50 years) and find large differences, in particular after 1990 (reaching more than 12 and 17 mm, respectively; see supplementary figure 1, available online at stacks.iop.org/ERL/15/035005/mmedia). This shows that the sensitivity to the choice of the base period is amplified in the presence of a trend: when the base period precedes the start of a trend, the trend estimate will be higher than if the base period spans years in which the trend is present.
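The base-period sensitivity described above can be sketched on synthetic data. The example below is only an illustration under stated assumptions, not a reproduction of supplementary figure 1: the daily series is drawn from a gamma distribution with an imposed upward trend, the years and base periods are arbitrary, and the helper names `wet_day_percentile` and `r99p` are hypothetical. R99p is taken here as the annual total from days above the 99th percentile of wet-day precipitation in the base period.

```python
import numpy as np

def wet_day_percentile(daily, years, base_start, base_end, q=99.0, wet_mm=1.0):
    """q-th percentile of wet-day precipitation over the base period (inclusive years)."""
    in_base = (years >= base_start) & (years <= base_end)
    sample = daily[in_base].ravel()
    return float(np.percentile(sample[sample >= wet_mm], q))

def r99p(daily, threshold):
    """Annual total (mm) from days above the threshold; one value per year."""
    return np.where(daily > threshold, daily, 0.0).sum(axis=1)

rng = np.random.default_rng(0)
years = np.arange(1951, 2011)  # 60 synthetic years
# Gamma-distributed daily totals (year, day) whose scale grows with time (imposed trend).
scale = np.linspace(3.0, 7.0, years.size)[:, None]
daily = rng.gamma(0.5, scale, size=(years.size, 365))

# Same data, two base periods: one early in the record, one late in the record.
thr_early = wet_day_percentile(daily, years, 1951, 1980)
thr_late = wet_day_percentile(daily, years, 1981, 2010)

r99p_early_base = r99p(daily, thr_early)
r99p_late_base = r99p(daily, thr_late)
print(f"thresholds (mm): early base {thr_early:.1f}, late base {thr_late:.1f}")
print(f"mean R99p (mm): early base {r99p_early_base.mean():.1f}, "
      f"late base {r99p_late_base.mean():.1f}")
```

With an upward trend in the data, the early base period gives a lower threshold, so more days count as exceedances and the resulting R99p values are larger in every year than with the late base period.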

Results
The global distribution of Rx1day values is first compared to the global distribution of annual total wet-day precipitation (prcptot) for all years during 2001-2013 (figure 1). We start with a note on the use of the GPCP_CDR_v1.3 dataset. This satellite product is provided with a 'valid range' attribute included in the file, and we compare here the Rx1day and prcptot distributions with (panel f) and without (panel g) this valid range applied to the data. This shows that applying the valid range masks all values above 100 mm. It is worth noting that this is not perceivable from the distribution of prcptot values, as they are the sum of daily precipitation amounts in a year. It is also not perceivable if the climatological Rx1day values are considered instead of the values of all years over the 2001-2013 period (supplementary figure 2 versus figure 1). In addition, comparing the central 50% of the global distribution of Rx1day (between the 25th and 75th percentiles; horizontal dashed lines in figure 1) across the datasets shows that GPCP_CDR_v1.3 has one of the narrowest distributions (MERRA1 has the narrowest). However, all other datasets have values above 100 mm, and it thus seems reasonable to conclude that GPCP_CDR_v1.3 data should be used without applying the valid range, as it hinders the study of extreme precipitation by excluding the most extreme values over the globe. In the rest of this study, we therefore only consider the raw data of GPCP_CDR_v1.3.
Figure 1 (caption, beginning not recovered). The datasets are clustered into 4 groups: in situ-based data (blue label), satellite data with or without a correction to rain gauges (orange or red label), and reanalyses (green label). Horizontal and vertical solid (dashed) lines indicate the median value (the 25th and 75th percentiles) of the distribution of Rx1day and prcptot, respectively, and their values are written on each panel. Note that a common range is applied to the X- and Y-axes for easier intercomparison, and that it can mask the highest values of some datasets.
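The effect of a valid-range attribute on the two indices can be sketched with a masked array. This is a hypothetical illustration on synthetic data: the 100 mm upper bound comes from the text, while the lower bound of 0 mm, the function name and the daily series are assumptions. Some NetCDF readers apply such masking automatically when the attribute is present, so it can occur without any explicit step in the analysis.

```python
import numpy as np

VALID_MAX_MM = 100.0  # upper bound of the valid range, as described in the text

def apply_valid_range(daily_mm, vmin=0.0, vmax=VALID_MAX_MM):
    """Mask values outside [vmin, vmax]; vmin = 0 is an assumption."""
    return np.ma.masked_outside(daily_mm, vmin, vmax)

# Synthetic year: 2 mm every day plus one 140 mm extreme.
year = np.full(365, 2.0)
year[180] = 140.0

masked = apply_valid_range(year)

raw_rx1day, masked_rx1day = year.max(), masked.max()
raw_total, masked_total = year.sum(), masked.sum()

print(f"Rx1day: raw {raw_rx1day:.0f} mm, with valid range {masked_rx1day:.0f} mm")
print(f"annual total: raw {raw_total:.0f} mm, with valid range {masked_total:.0f} mm")
```

In this toy case the annual maximum collapses once the extreme day is masked, whereas the annual total changes by a much smaller fraction, consistent with the masking being hard to detect from prcptot alone.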
The scatterplots form different 'data clouds' for each dataset. Some indicate that the largest values of Rx1day are found for the largest values of prcptot (e.g. most of the in situ-based datasets), while others indicate that the largest values of Rx1day occur around the 75th percentile of the prcptot distribution (e.g. n, o, q, r, t, u; rightmost vertical dashed line). The median of the Rx1day distributions (horizontal solid lines in figure 1) indicates that the driest datasets are (in decreasing order) CHIRP_V2, MERRA1, ERAi, PERSIANN_v1_r1, CHIRPS_v2.0 and the wettest are (in decreasing order) MERRA2, 3B42_IR_v7.0, GPCC_FDD_v2018, CFSR, 3B42RT_v7.0. The median values range from 26.7 to 59.1 mm, showing large interproduct spread, which is even larger for the 25th (from 12.5 to 32.6 mm) and 75th (from 42.7 to 102.7 mm) percentile values. In contrast, prcptot values compare much better across the 22 datasets: the median and the 25th and 75th percentiles of the global distributions of prcptot (vertical lines in figure 1) agree more closely, in line with the findings of Alexander et al (2020) and Roca (2019b).
We cannot conclude which of these 'data clouds' is the most realistic, but we observe less difference between in situ-based datasets than within any other cluster, and in particular than between the reanalyses, which exhibit the largest differences. In situ-based observations (blue labels in figure 1) show a similar representation of the Rx1day and prcptot global distributions, with the exception of GPCC_FDD_v2018, which shows higher values for Rx1day that are among the highest of the ensemble. Compared to the in situ-based cluster, we find more spread (e.g. among the median values) within the satellite clusters, either with or without correction to rain gauges (orange and red labels in figure 1, respectively).
Cluster-averaged 2001-2013 climatological values of Rx1day show differences across the four clusters, i.e. the four types of observations (figures 2(a), (d), (g), (j)). Extreme precipitation intensity over the driest regions varies across the four clusters (e.g. central Asia, Arabian Peninsula, western US), as previously highlighted by Donat et al (2014). The intercluster differences are also large over the wettest regions and in the tropics in general (figures 2(a), (d), (g), (j) and (n), (o)). Extreme precipitation intensity on the leeward side of extratropical North and South America differs between the clusters, and in particular between the corrected and uncorrected satellite products, with a tendency towards lower estimates in the corrected datasets (i.e. closer to in situ-based data). Note that for easier intercomparison, values are plotted up to 160 mm, but cluster-averaged in situ-based data indicate higher values (up to 260 mm) compared to reanalyses (up to 180 mm), satellite with correction (up to 160 mm) and satellite without correction (up to 150 mm). This cannot be explained by scaling issues, as all data were first interpolated onto a common grid, but could be explained by structural differences in the measurement of precipitation and/or by interproduct spread within each cluster, which is investigated next.
We then examine how much interproduct spread is associated with each source of observation using the multiproduct standard deviation and coefficient of variation (i.e. the standard deviation normalized by the multiproduct mean of climatological Rx1day; figures 2(b), (e), (h), (k) and (c), (f), (i), (l)). Reanalyses show the largest spread while in situ-based data show the smallest (see also supplementary figures 3 and 4), in agreement with Donat et al (2014). Satellite data lie in between, with generally more spread within the uncorrected than the corrected product cluster (figures 2(e), (f), (h), (i)). Within the reanalyses cluster, the largest interproduct differences are located in the tropics, with multiproduct standard deviation values generally over 50 mm (coefficient of variation above 50%; figures 2(k), (l)), and higher still in regions such as central Africa. In the extra-tropics, uncertainties in reanalyses are lower, yet generally higher than for any other observational source. In contrast, in situ-based data show little interproduct spread, with the highest uncertainties in the tropics and in particular in the Sahara desert, where there are few rain gauges (figures 2(b), (c)). The spread within the in situ-based cluster is mainly explained by large differences between GPCC_FDD_v2018 and the other datasets (see also supplementary figures 3 and 4). Satellite data show interproduct spread levels closer to those of in situ-based data than of reanalyses, except over a dry region extending from northern Africa to central Asia where interproduct differences are large (figures 2(f), (i)). Over such semi-arid regions, satellite products tend to show large uncertainty due to low detection skill (Maggioni et al 2016). These results agree with those of Herold et al (2017), and they are generally in line with those for annual total wet-day precipitation (prcptot; supplementary figures 5-7), for which the spread is nevertheless weaker than for annual extreme precipitation.
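The two spread measures used above can be written compactly. The sketch below is a minimal illustration with three hypothetical products on a two-cell grid; the function name is an assumption, and so is the use of the population standard deviation (ddof = 0), since the text does not state which estimator is used.

```python
import numpy as np

def interproduct_spread(clim):
    """Spread across products for an array (product, lat, lon) of climatological Rx1day (mm).

    Returns the multiproduct mean (mm), standard deviation (mm) and
    coefficient of variation (%), each with shape (lat, lon).
    """
    mean = np.nanmean(clim, axis=0)
    std = np.nanstd(clim, axis=0)  # population std (ddof=0); an assumption
    with np.errstate(divide="ignore", invalid="ignore"):
        cv = 100.0 * std / mean    # standard deviation normalized by the mean
    return mean, std, cv

# Three hypothetical products on a 1 x 2 grid: a wet cell and a dry cell.
clim = np.array([[[90.0, 4.0]],
                 [[100.0, 5.0]],
                 [[110.0, 6.0]]])
mean, std, cv = interproduct_spread(clim)
print("mean (mm):", mean)
print("std (mm):", std.round(2))
print("cv (%):", cv.round(1))
```

In this toy case the wet cell has the larger standard deviation while the dry cell has the larger coefficient of variation, which is why the two measures emphasize the wettest and the driest regions, respectively.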
Finally, the cluster-averaged representations of climatological Rx1day in the in situ-based and corrected satellite clusters are the most similar and show reduced interproduct spread compared to the uncorrected satellite and reanalyses clusters. We further show that selecting only these two groups of data instead of considering all four reduces the uncertainties (bottom panels of figure 2). Some areas still present a relatively high spread, but these are either dry or data-sparse regions, or both. Furthermore, figure 1 highlighted PERSIANN_v1_r1 as one of the driest datasets (see above), and indeed it presents a pronounced widespread dryness compared to other products (supplementary figures 3 and 4(m)), as well as large regions of missing values (especially at the beginning of the 2001-2013 period). Therefore, in the context of this study, we do not recommend using PERSIANN_v1_r1. Overall, these results point to a subset of 5 in situ-based and 7 satellite datasets that present reduced interproduct differences in the estimation of annual daily precipitation maxima over global land compared to the initial 22 products.
We further investigate how the timing of extremes in a year compares across the selected 12 datasets (figure 3). We first intercompare their climatological annual cycle and find that interproduct spread is generally higher in the tropical band (up to 1.1 mm d⁻¹) than in the extra-tropics (up to 0.8 and 0.9 mm d⁻¹ in the Southern and Northern Hemisphere, respectively; first column of figure 3; note that a 21 d running average is applied). The satellite product 3B42RT_v7.0 indicates the highest values of mean daily precipitation during the wet season in the Northern Hemisphere extra-tropics, leading to larger interproduct spread. Apart from this, results show a relatively similar annual cycle between the 12 selected datasets, in particular in the extra-tropics, with little difference between the in situ-based and satellite clusters. The picture is very different when all 22 datasets are considered (supplementary figure 8): the interproduct spread is much higher (up to 1.0 mm d⁻¹ in both extra-tropical bands and 2.8 mm d⁻¹ in the tropics) and the reanalyses present higher values of mean daily precipitation, in particular in the tropics.
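The 21 d running average applied to the climatological annual cycle could be implemented as follows. This is a sketch under assumptions: the text does not say how the ends of the year are handled, so the version below wraps around the year boundary (treating the climatological cycle as periodic), and the function name is hypothetical.

```python
import numpy as np

def running_mean_circular(cycle, window=21):
    """Centred running mean of a climatological annual cycle.

    The series is treated as periodic, so days at the start of the year are
    averaged with days at the end of the year (an assumed choice).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd for a centred average")
    cycle = np.asarray(cycle, dtype=float)
    half = window // 2
    # Pad each end with values from the opposite end of the year.
    padded = np.concatenate([cycle[-half:], cycle, cycle[:half]])
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")  # same length as the input

# Synthetic 365 day cycle of mean daily precipitation (mm per day) with noise.
rng = np.random.default_rng(1)
days = np.arange(365)
cycle = 2.0 + np.sin(2 * np.pi * days / 365) + rng.normal(0.0, 0.3, size=365)
smooth = running_mean_circular(cycle)
print(smooth.shape, round(float(cycle.std()), 2), round(float(smooth.std()), 2))
```

Because every day receives the same total weight in a periodic filter, the annual mean is preserved while day-to-day noise is damped.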
Next, we compare the timing of extreme precipitation through the distribution of the month of occurrence of Rx1day (i.e. the distribution of the months when annual maxima of daily precipitation are recorded between 2001 and 2013 for each grid cell; right column of figure 3). These distributions are estimated in latitudinal bands, and we find that they are relatively comparable across the selected datasets. In the extra-tropics, the selected datasets agree on a higher occurrence of extremes during the wet season (figures 3(b), (f)), and in the tropics they indicate a flatter distribution peaking in August (the Indian monsoon imprint; figure 3(d)). Interestingly, uncertainties are higher in both extra-tropical bands than in the tropical band, where little interproduct spread exists and only CHIRPS_v2.0 presents a slightly higher peak in August. Previously, we highlighted higher interproduct spread in extreme precipitation intensity in the tropical band. We show here that this is not explained by different timings of extremes across the datasets, i.e. we are comparing annual maxima of precipitation that generally occur in similar months but have intensity values that differ between products. Over extra-tropical land in the Southern Hemisphere, the occurrence distributions are relatively similar across the datasets, although with higher interproduct spread for the satellite than for the in situ-based cluster, and again with CHIRPS_v2.0 showing slightly higher occurrences during the wet season. In the extra-tropical Northern Hemisphere, the satellite products generally indicate a flatter distribution than the in situ-based datasets, and their cluster presents higher interproduct spread than the in situ-based cluster.
Figure 2 (caption, beginning not recovered). ... (mm) and its corresponding standard deviation (mm; (n), (q)) and coefficient of variation (%; (o), (r)). Note that by definition, spread measured by standard deviation gives higher emphasis to the wettest regions and spread measured by coefficient of variation gives higher emphasis to the driest regions. A common range is applied and this can mask values higher than 160 mm.
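The month-of-occurrence diagnostic described above can be sketched as follows. The helper names are hypothetical, the example uses a non-leap calendar, and ties are resolved by taking the first wettest day, which is an assumption. In practice the months would be pooled over all grid cells of a latitudinal band and all years before forming the distribution.

```python
import numpy as np

DAYS_PER_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # non-leap year
MONTH_OF_DAY = np.repeat(np.arange(1, 13), DAYS_PER_MONTH)         # length 365

def month_of_rx1day(daily_mm, months=MONTH_OF_DAY):
    """Calendar month (1-12) of the wettest day of one year at one grid cell."""
    return int(months[np.nanargmax(daily_mm)])  # first occurrence if there are ties

def occurrence_distribution(rx_months):
    """Fraction of Rx1day events in each calendar month (array of length 12)."""
    counts = np.bincount(np.asarray(rx_months).ravel(), minlength=13)[1:]
    return counts / counts.sum()

# One synthetic year whose wettest day falls in August.
year = np.ones(365)
year[220] = 80.0
print(month_of_rx1day(year))

# Months pooled over several hypothetical grid cells and years.
dist = occurrence_distribution([8, 8, 7, 1])
print(dist)
```

A flat distribution indicates that annual maxima occur throughout the year, while a peaked one indicates that they are concentrated in a particular season.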
In the extra-tropical Northern Hemisphere (and the Southern Hemisphere to a lesser extent), GSMAP-gauges-NRT-v6.0 shows the least variation over the months in comparison to the other datasets and to GSMAP-gauges-RNL-v6.0, whose distribution peaks in summer and early autumn (months 6-10), in agreement with in situ-based observations (figure 3(b)). Therefore, for the purposes of our study investigating Rx1day at the global scale, we suggest using the RNL rather than the NRT version of the GSMAP-gauges products in order to ensure that the extremes occur in a reasonably similar period of the year (compared to other datasets). In the Northern Hemisphere (and the Southern Hemisphere to a lesser extent), the timing of extremes is relatively similar between both 3B42 versions, whereas 3B42RT_v7.0 shows higher mean precipitation values than 3B42_v7.0 throughout the year and in particular during the wet season (figures 3(a), (b)). While it is difficult to conclude whether or not 3B42RT_v7.0 overestimates daily mean precipitation, we suggest using 3B42_v7.0 rather than 3B42RT_v7.0 for the 3B42 products, as the former benefits from much more rain gauge information than the latter. Finally, we cannot conclude that there is a best version among the REGEN and GPCC in situ-based products, but we have previously highlighted more intense precipitation extremes in GPCC_FDD_v2018 compared to the other in situ-based products (and to most of the datasets used here). Based on this intercomparison and within the context of this study, we suggest a final selection of 10 observational datasets that enable reduced interproduct spread in the estimation of Rx1day over quasi-global land.

Discussion
We find generally better agreement within the in situ-based cluster than within any other cluster. This is partly explained by some interproduct dependencies. Out of 5 in situ-based datasets, there are two pairs from a common center, and indeed the two REGEN datasets are largely similar (see also supplementary figure 4). They are also generally similar to GPCC_FDD_v1.0, as they share most of their rain gauges, but present lower estimates of Rx1day compared to GPCC_FDD_v2018. Differences are likely to be due to additional station data and improved quality control in the later version of the GPCC dataset (U. Schneider, pers. comm.) but might also be related to changes in the gridding method employed. The fifth in situ-based dataset (CPC_v1.0) shares less with the four others, but complete independence can never really be achieved with station-based gridded products that need to include as many gauges as possible. Independence for observations in general is very difficult to obtain, as most of the satellite products correct their estimates to rain gauges and share data from the same instruments (radiosondes, satellite observations, etc).
An important limitation to our knowledge and our ability to estimate observed extreme precipitation over the globe is the lack of rain gauges in many regions such as Africa, South America and South-East Asia. This is an evident limitation for products gridded from in situ data, but more generally for any type of observational dataset, as most of them rely on station-based estimates or use them for validation. The number of stations is insufficient and they are heterogeneously distributed over the globe, and coverage is also limited in time (Kidd et al 2017). The two versions of the REGEN dataset (Contractor et al 2019) allow the evaluation of the impacts of temporal limitations. Indeed, REGEN_LONG_v2019 only considers stations with at least 40 years of data to generate the final gridded product, while REGEN_ALL_v2019 considers all available stations. We find higher amounts of annual precipitation extremes in REGEN_LONG_v2019 compared to REGEN_ALL_v2019 (figures 1(a), (b)), with the largest differences over north-western South America and South-East Asia (supplementary figures 5(a), (b)), but no such impacts for total annual precipitation (supplementary figures 6 and 7(a), (b)). Furthermore, the longest period of overlap among all datasets used in this study is 13 years, which hinders the intercomparison of trends in observed extremes.
Based on this quasi-global intercomparison of observations for Rx1day, we suggest (upon availability) using the version of a satellite product corrected to rain gauges rather than the uncorrected version. However, this suggestion might not hold for all land regions of the globe or for other grid resolutions. Over South-East Asia, tropical Africa and South America for instance, we do not have sufficient stations and the quality of the available data is doubtful. Hence, it remains to be verified whether, over these regions of poor coverage, it is best to consider the corrected or the uncorrected version of a satellite product. Interestingly, comparing such pairs of products over South-East Asia, for instance, shows significant differences between the two versions (e.g. CMORPH_v1.0_CRT versus CMORPH_v1.0_RAW, CHIRPS_v2.0 versus CHIRP_V2 and 3B42RT_v7.0 versus 3B42RT_UNCAL_v7.0; supplementary figure 5(g) versus (o), (h) versus (p), (j) versus (r)), even though only a few rain gauges exist over this area. Similarly, we also find important differences for annual total precipitation between the corrected and uncorrected versions of a satellite product over tropical regions poorly sampled on the ground (supplementary figures 7(j), (r)). The accuracy of such a correction to rain gauges and its value should therefore be further investigated region by region and individually for each dataset.

Conclusion
This study focuses on the estimation of the annual maximum 1 day precipitation amount (Rx1day index) in a variety of observations and the assessment of the observational uncertainty (defined as interproduct spread) over land at the (quasi) global scale. We have conducted an intercomparison of 22 gridded products (at 1° × 1° daily resolution) that we have clustered into four groups: in situ-based (5), satellite with (8) or without (4) a correction to rain gauges, and reanalyses (5). We have compared the climatology of annual maximum 1 day precipitation over the 2001-2013 period (common to all datasets) and have evaluated the interproduct spread across the ensemble and within each cluster. Compared to annual total wet-day precipitation, annual extreme precipitation shows higher interproduct spread. This result is not specific to Rx1day, as Alexander et al (2020) find that other extreme precipitation indices (e.g. R99p) show similar levels of interproduct spread (see also Herold et al 2017 and Masunaga et al 2019).
Reanalyses present a heterogeneous representation of extreme precipitation, in particular over the tropics, with either a widespread wet (MERRA2 and CFSR) or dry (MERRA1 and JRA-55) state, or a spatially contrasted state for ERAi, compared to in situ-based data. The interproduct spread is the highest within the reanalyses cluster compared to any other cluster. Our main recommendation is therefore to avoid using reanalyses as observational evidence when investigating extreme precipitation at the global scale. Furthermore, we recommend the use of GPCP_CDR_v1.3 without applying the valid range provided in the file, as it disregards all values above 100 mm. Finally, we do not recommend PERSIANN_v1_r1 because of a widespread dryness compared to in situ-based data, in addition to large areas of missing values for some years. We emphasize that these recommendations are relevant to the context of this study, i.e. the estimation of annual maxima of daily precipitation over quasi-global land.
Extreme precipitation intensity in satellite data broadly compares well with in situ-based data. At the quasi-global scale and at a 1° × 1° grid resolution, our results thus indicate that satellite data can be used with in situ-based observations when assessing Rx1day. Some satellite products are provided in different versions, with and without a correction to rain gauges. Our work has shown generally better agreement with in situ-based observations and less interproduct spread for the corrected datasets, which are therefore preferred to the uncorrected versions for broad-scale studies assessing annual maxima of daily precipitation.
Based on the level of observational uncertainty associated with Rx1day over global land, we cannot conclude that any product emerges as the best observational evidence (in agreement with Herold et al 2017). We strongly encourage using an ensemble of observations from different sources and centers to estimate precipitation extremes and better assess their associated uncertainties. The interproduct spread in the observations is probably underestimated in many studies focusing on observations or on model evaluation. Herold et al (2016) show that this spread is similar to the uncertainties (intermodel and internal variability) of the Coupled Model Intercomparison Project Phase 5 (CMIP5, Taylor et al 2012) with regard to a precipitation index representing the mean daily precipitation amount when it rains (i.e. the SDII index). Furthermore, Herold et al (2017) show that Rx1day is sensitive to resolution. The interproduct spread in the observations is significantly higher at a resolution of 1° × 1° than at 2° × 2°, which are, respectively, the resolution that the next generation of global climate models (i.e. CMIP6) is likely to have and the resolution of the last generation of models (i.e. CMIP5). We therefore encourage model evaluation studies to consider product sensitivity, as higher model resolution will certainly continue to be sought after.
For studying annual maximum 1 day precipitation over land at the (quasi) global scale, we suggest using an ensemble of observations from the FROGS database. Based on our results, we recommend a selection of in situ-based and satellite products: REGEN_ALL_v2019, REGEN_LONG_v2019, GPCC_FDD_v1.0, GPCC_FDD_v2018 and CPC_v1.0 (in situ-based), and GPCP_CDR_v1.3 (no valid range applied), CMORPH_v1.0_CRT, CHIRPS_v2.0, 3B42_v7.0 and GSMAP-gauges_RNL_v6.0 (corrected satellite). We find greater similarity in extreme precipitation intensity and in the timing of these extremes (i.e. the distribution of the month of occurrence of the wettest day in a year) within these selected datasets than across all 22 products, giving some confidence in the use of this selection of datasets. It is important to acknowledge that this selection is relevant for the purposes of estimating annual precipitation maxima over quasi-global land but might not be relevant to other precipitation indices, grid resolutions or time scales. This reinforces the need for more wide-ranging extreme precipitation intercomparisons.