Water Masers as an Early Tracer of Star Formation

We present a study of the correlation between 22 GHz water maser emission and far-infrared/submillimeter (IR/sub-mm) sources. The generalized linear model (GLM) is used to predict H2O maser detection in a particular source with defined physical parameters. We checked the GLM predictions by observing a sample of selected sources with the Effelsberg 100 m telescope. In total, 359 sources were observed. H2O masers were detected in 124 sources, with 56 new detections. We found 22 sources with a significant flux variability. Using the GLM analysis, we estimate that 2392 ± 339 star formation regions (SFRs) in the Galaxy may harbor H2O masers detectable by single-dish observations at the noise level of ∼0.05 Jy. Analyzing the luminosity-to-mass ratio (L/M) of the ATLASGAL and Hi-GAL clumps associated with different maser species, we find that 22 GHz water masers have significantly lower values of L/M in comparison to 6.7 GHz class II methanol and 1665 MHz OH masers. This implies that 22 GHz water masers may appear prior to 6.7 GHz methanol and OH masers in the evolutionary sequence of SFRs. From the analysis of physical offsets between host clumps and maser interferometric positions, we found no significant difference between the H2O and class II methanol maser offsets against the host clump position. We conclude that the tight association between water masers and IR/sub-mm sources may provide insight into the pumping conditions of these masers and the evolutionary stages of their onset.


Introduction
Water maser emission has proved to be an efficient probe of high-mass and low-mass star formation in the Galaxy (Furuya et al. 2003; Szymczak et al. 2005). Studies of the interstellar 22 GHz and other H2O maser lines carried out over the years have shown that a collisional pumping mechanism is at work (Neufeld & Melnick 1991), and the association of water masers with shocks is well established (Elitzur et al. 1989). The conditions suitable for shocks to excite water masers can occur in star-forming regions through different excitation schemes, including the impact of protostellar jets (Kaufman & Neufeld 1996; Hollenbach et al. 2013), large-scale shocks (Walker 1984; Mac Low et al. 1994), and disks (Fiebig 1997; Gallimore et al. 2003).
Class I methanol (cIM) masers, another shock-tracing maser type driven by collisional pumping (Leurini et al. 2016), are usually found at some distance from a radiation source in the presence of shock waves that produce suitable conditions to excite the masers (Slysh et al. 1994; Cyganowski et al. 2009). cIM masers trace the edges of outflows in star-forming clumps (Plambeck & Menten 1990; Voronkov et al. 2006).
In contrast to cIM masers, water masers show significant variability on a timescale of ∼10 days (Felli et al. 2007), which may be caused by shock-wave propagation (Liljeström & Gwinn 2000), turbulence (Strelnitski et al. 2002; Sobolev et al. 2018), or geometrical effects (Burns et al. 2020). The variability of cIM masers is a much rarer phenomenon. Leurini et al. (2016) suggest a timescale of ∼15 yr for the variability of cIM masers. Kurtz et al. (2004) reported that a half of their sample (37 sources) showed no change in the 44 GHz line flux, with the exception of two sources, W3(OH) and G11.94-0.62. The survey of 144 sources by Yang et al. (2020) displayed no clear evidence of variability of the 95 GHz cIM masers on a timescale of 6 yr.
The observational database for water masers has improved significantly in the last 10 yr. For example, the unbiased H2O southern Galactic Plane Survey (HOPS; Walsh et al. 2011, 2014) uncovered 540 water maser sources at 22 GHz in the following region of the Galaxy: −70° < l < 30°, |b| < 0°.5. According to the categories of maser sources from Ladeyschikov et al. (2022) based on the SIMBAD database (Wenger et al. 2000), 70% of these sources are associated with star formation regions (SFRs), 19% with evolved stars, and 11% are not associated with any known source or have been categorized as "unknown." According to the full data archive of the water maser database (Ladeyschikov et al. 2022), in the region covered by the HOPS, there are more than 1031 known water maser sources, and 89% are associated with SFRs. The following papers make the most significant contribution to the increase in the number of known water maser sources: Breen et al. (2010a), , Caswell et al. (2011), Urquhart et al. (2011), Cyganowski et al. (2013), Svoboda et al. (2016), and Titmarsh et al. (2014, 2016).
In this paper, we present a comparative study between 22 GHz water maser emission and infrared and submillimeter sources from the Herschel infrared Galactic Plane Survey (Hi-GAL) and APEX Telescope Large Area Survey of the Galaxy (ATLASGAL). Both catalogs were used for analyzing sources associated with water masers. Interestingly, among the 814 water maser sources in SFRs, 80% are associated with submillimeter sources from the ATLASGAL complete compact source catalog (Contreras et al. 2013; Urquhart et al. 2014, 2018, 2022; hereafter ATLASGAL CSC) and 85% are associated with infrared sources from the Hi-GAL compact source catalog II (Elia et al. 2021; hereafter Hi-GAL CSC).
Based on the ATLASGAL physical parameters catalog, we perform a search of water maser sources toward ATLASGAL clumps to explore the detection rates toward sources with a high probability of maser detection based on the generalized linear model (hereafter GLM) of maser presence.
The GLM was used previously for 22 GHz water maser prediction in RCW 106 (Breen et al. 2007), 12 GHz methanol masers toward 1.2 mm dust clumps (Breen et al. 2010b), and 22 GHz water masers toward 1.2 mm dust clumps ). In the paper by Manning et al. (2016), the authors investigate three different techniques for maser prediction: linear discriminant analysis (Feigelson & Babu 2009), GLMs (McCullagh & Nelder 1989), and random forests (Carliles et al. 2010). They conclude that GLMs and random forests were the most accurate methods. Although the nonparametric random forest method can be more accurate than GLM when applied to large data sets, it cannot be used to test hypotheses directly (Cutler et al. 2007). Another method that can be used for maser identification is binary classification from neural networks (e.g., Kim & Brunner 2016). In this work, we concentrate on using the generalized linear model as the most straightforward method for maser prediction. Other methods will be investigated in future studies.

The Maser Sample, Infrared, and Submillimeter Clump Catalogs
The maser database (Ladeyschikov et al. 2019, 2022) was used to study the statistical characteristics of water masers in the Galaxy and to compare them with those of other maser species, including CH3OH (class I and II) and OH masers. Currently, the maser database contains complete information about the H2O and CH3OH maser sources known from the literature.
The ATLASGAL 870 μm survey (Schuller et al. 2009), performed using the APEX telescope, is a continuum survey covering the whole inner Galactic plane (280°< l < 60°, |b| < 1°.5). The ATLASGAL CSC consists of 10,163 sources, 517 of which are located in the Central Molecular Zone (CMZ; 359°.3 < l < 1°.7, |b| < 0°.2). The catalog was obtained using the source extraction algorithm SExtractor (Bertin & Arnouts 1996) and is 99% complete at ∼6σ, which corresponds to the 870 μm peak flux density of 0.3-0.4 Jy beam −1 and positional accuracy of ∼4″ (Contreras et al. 2013;Urquhart et al. 2014). The physical properties (distance, dust temperature, luminosity, mass, and gas column density) of approximately 8000 dense clumps have been determined (Urquhart et al. 2018). The latest release of ATLASGAL CSC is available in Urquhart et al. (2022). Comparison of the water maser samples with the ATLASGAL CSC dense clumps provides a straightforward and convenient way of investigating the physical conditions in which these masers arise.
Another catalog that we used for studying the water maser sources is the Hi-GAL CSC. The catalog lists the physical parameters of ∼94,000 compact sources from the Hi-GAL (Herschel InfraRed Galactic Plane Survey; Molinari et al. 2010). The advantage of this catalog is 100% coverage in Galactic longitude, providing the unique opportunity to study the full sample of Galactic far-infrared clumps and their association with known water masers.

Matching between Water Maser and Infrared/Submillimeter Continuum Counterparts
We matched the positions of all known water masers with far-infrared (Herschel Hi-GAL) and submillimeter (ATLASGAL) counterpart sources.
To study the ATLASGAL/Hi-GAL sources associated with water masers, we do not consider maser single-dish or VLBI flux densities, but only take into account the maser positions and maser detection/nondetection status. The matching between water masers and ATLASGAL/Hi-GAL sources was based on comparing the offsets between maser position and nearest ATLASGAL/Hi-GAL sources. We consider masers and dust clumps to be physically associated if the angular offset between them is smaller than the assumed matching radius. The pairs with larger offsets are discarded. If more than one ATLASGAL/Hi-GAL source is matched to a particular maser, then the clump with the smaller angular offset to the water maser center is considered the most likely association.
We define the matching radius as the maximum angular distance between maser and infrared/submillimeter sources for their association. The matching radius depends mainly on the beam size of maser observations, and this can vary significantly due to the large range of beam sizes used to produce the maser database. For 95% of single-dish detections, the beam size ranges from 30″ (100 m Green Bank Telescope) to 120″ (Korean VLBI Network telescopes operating in single-dish mode, 22 m Mopra Radio Telescope, Medicina 32 m radio telescope). An analysis of beam sizes of the combined maser sample suggests that a matching radius of 60″, corresponding to a beam FWHM of 120″, covers 95% of the H2O maser observations. Thus, we used the matching radius of 60″ for ATLASGAL CSC. As a cautionary note, we mention that the exact number of maser sources in this and subsequent sections may differ from the online database numbers by up to 3%-4% due to database improvements, e.g., the definition of the maser groups. These differences do not qualitatively affect the article's results.
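The nearest-counterpart rule with a fixed matching radius can be illustrated with a minimal sketch. The function names, the brute-force search, and the toy coordinates below are assumptions made for illustration only; they are not the procedure or data used for the maser database.

```python
import math

def angular_offset_arcsec(l1, b1, l2, b2):
    """Great-circle separation (arcsec) between two Galactic positions in degrees."""
    l1, b1, l2, b2 = map(math.radians, (l1, b1, l2, b2))
    # Haversine formula, numerically stable for small separations
    h = (math.sin((b2 - b1) / 2) ** 2
         + math.cos(b1) * math.cos(b2) * math.sin((l2 - l1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, h)))) * 3600.0

def match_masers(masers, clumps, radius_arcsec=60.0):
    """Map each maser index to the index of its nearest clump within the radius."""
    matches = {}
    for i, (ml, mb) in enumerate(masers):
        best = None  # (clump index, separation) of the closest clump so far
        for j, (cl, cb) in enumerate(clumps):
            sep = angular_offset_arcsec(ml, mb, cl, cb)
            if sep < radius_arcsec and (best is None or sep < best[1]):
                best = (j, sep)
        if best is not None:
            matches[i] = best[0]  # pairs with larger offsets are discarded
    return matches

# Hypothetical (l, b) positions in degrees
masers = [(30.000, 0.000), (45.000, 0.500)]
clumps = [(30.010, 0.000), (30.005, 0.002), (45.100, 0.500)]
print(match_masers(masers, clumps))  # {0: 1}
```

In this toy case the first maser has two clumps inside 60″ and is assigned to the closer one, while the second maser has no clump within the radius and stays unmatched.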
The water maser database reports 1584 water maser sources in the region covered by the ATLASGAL survey of 280° < l < 60°, |b| < 1°.5, excluding the Galactic center region (359°.3 < l < 1°.7, |b| < 0°.2). According to source categories from the maser database (Ladeyschikov et al. 2022), 1269 (80%) masers are associated with SFRs. The crossmatching between ATLASGAL sources and water masers in the region 280° < l < 60°, |b| < 1°.5, using a match radius of 60″, showed that 1069 maser sources (89% of SFRs) are associated with ATLASGAL CSC. The details of the matching statistics are presented in the upper panel of Figure 2. Given that there are ∼9600 ATLASGAL clumps in the region (excluding the CMZ), the percentage of the ATLASGAL sources with known maser counterparts is ∼11%.
The Hi-GAL catalog is not limited in Galactic longitude. However, the coverage in Galactic latitude is not regular, so we approximated the covered region with a combination of sine and constant functions. The general view of the full Hi-GAL CSC and approximation region is presented in Figure 1. According to the maser database, there are 1762 maser sources in the region. Out of these, 1399 sources (79%) belong to the category of SFRs. Using a matching radius of 60″, we find 1418 (80%) masers associated with sources from the Hi-GAL catalog. The details of the matching statistics are presented in the lower panel of Figure 2. Decreasing the matching radius to 30″ leads to 1318 matches between water masers and Hi-GAL sources, which is 10% lower.
Interferometric positions are available for 40% of the total maser population of 1621 sources associated with SFRs. There are no issues with the ATLASGAL and Hi-GAL source association for sources with interferometric positions except for crowded regions. However, the maser sources with single-dish data only have uncertain positions: the exact maser location is unknown within the beam. In total, there are ∼9600 sources in the ATLASGAL catalog, excluding the Galactic center region (359°.3 < l < 1°.7, |b| < 0°.2). The number of regions observed at 22 GHz is 3543. Thus, for the ATLASGAL catalog, the maser association is straightforward in most cases, as the spatial density of the ATLASGAL sources (∼150 sources deg−2) is of the same order as that of maser observations (∼50 sources deg−2). The only problem is the crowded regions, where many submillimeter sources are found in close proximity to each other. In such cases, we associate the maser with the nearest far-infrared or submillimeter source; however, these cases do not dominate the overall statistics. In the case of the Hi-GAL catalog, the source density is much higher (∼400 sources deg−2). Hence, there is a higher chance of false-positive associations between masers and infrared sources.
To estimate the false-positive association of the Hi-GAL and ATLASGAL catalogs with the sample of masers, we shifted the data by 1° in Galactic longitude. For the ATLASGAL catalog with a matching radius of 60″, we found only 93 false-positive matches, representing 8.7% of the total matches between masers and ATLASGAL sources (1069). For the Hi-GAL catalog, a similar analysis shows 424 false-positive matches, which is 30% of the total matches between masers and Hi-GAL sources (1418). Reducing the Hi-GAL match radius to 30″ results in 108 false-positive matches (8.1% of total matches). Further reducing the match radius to 20″ results in 40 false-positive matches (3.3% of total matches). We note that a significant reduction of the match radius results in a reduction of the maser sample. Therefore, we decided to keep the percentage of false-positive associations at a level similar to that in ATLASGAL (∼8%), which is achieved by using the match radius of 30″ for the Hi-GAL catalog. Thus, in the subsequent analysis, we used the matching radius of 60″ for ATLASGAL and 30″ for the Hi-GAL catalog.
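The shift test can be sketched in a few lines: the maser positions are offset in longitude so that any remaining match is a chance coincidence, and the shifted count is compared with the real one. The helper names, the small-angle separation, and the toy positions are illustrative assumptions; only the counts 93/1069 and 424/1418 come from the text.

```python
import math

def sep_arcsec(l1, b1, l2, b2):
    """Small-angle separation (arcsec) between two Galactic positions in degrees."""
    dl = (l1 - l2 + 180.0) % 360.0 - 180.0  # wrap longitude difference to [-180, 180)
    dl *= math.cos(math.radians(0.5 * (b1 + b2)))
    return math.hypot(dl, b1 - b2) * 3600.0

def count_matched(masers, clumps, radius_arcsec, shift_deg=0.0):
    """Count masers with at least one clump within the radius after a longitude shift."""
    return sum(
        1 for ml, mb in masers
        if any(sep_arcsec(ml + shift_deg, mb, cl, cb) < radius_arcsec for cl, cb in clumps)
    )

def false_positive_percent(n_shifted, n_real):
    """Chance-coincidence matches as a percentage of the real matches."""
    return 100.0 * n_shifted / n_real

# Hypothetical toy catalogs
masers = [(30.0, 0.0), (40.0, 0.2)]
clumps = [(30.005, 0.0), (40.003, 0.2), (31.004, 0.0)]
n_real = count_matched(masers, clumps, 60.0)
n_shifted = count_matched(masers, clumps, 60.0, shift_deg=1.0)
print(n_real, n_shifted)  # 2 1

# Counts quoted in the text
print(round(false_positive_percent(93, 1069), 1))   # 8.7
print(round(false_positive_percent(424, 1418), 1))  # 29.9
```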

Associations of H2O Masers with Different Catalogs: From Infrared to Radio
We analyzed the association of H2O maser sources with different surveys to find the one with a maximum association rate. The criterion of association is the presence of a counterpart source from a particular catalog with the maximum offset of 60″. The following surveys were investigated: IRAS (12-100 μm; Neugebauer et al. 1984), Bolocam, ATLASGAL, Hi-GAL, THOR (1-2 GHz; Wang et al. 2018), and CORNISH (5 GHz; Hoare et al. 2012). The offset was set at 30″ for the Hi-GAL data due to the significant number of false associations when using the 60″ offset (see details in Section 2.2). The area of comparison is limited to the following region: 10° < l < 60°, |b| < 1°, which is covered by all considered surveys. The total number of water maser sources in this region is 831. We limited the sample of masers to SFRs only, which resulted in 720 sources (see Ladeyschikov et al. 2022 for a more detailed description).
The analysis of the data showed that the number of associations of water masers in star-forming regions with sources from the IRAS, Bolocam, ATLASGAL, and Hi-GAL catalogs is 382 (53%), 621 (86%), 634 (88%), and 632 (88%), respectively. Since the numbers of ATLASGAL and Hi-GAL associations are nearly the same, we investigated how many masers are matched in both catalogs. We found 681 masers associated with sources from Hi-GAL or ATLASGAL CSC. Of these, 585 have both ATLASGAL and Hi-GAL counterparts; thus 86% of the matched masers have counterparts in both catalogs.
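The overlap figures follow from inclusion-exclusion on the two sets of matched masers. The short check below uses only the counts quoted in the text; the variable names are illustrative.

```python
n_sfr_masers = 720   # SFR water masers in the comparison region
n_atlasgal = 634     # masers with an ATLASGAL counterpart
n_higal = 632        # masers with a Hi-GAL counterpart
n_both = 585         # masers with counterparts in both catalogs

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
n_either = n_atlasgal + n_higal - n_both
print(n_either)                               # 681
print(round(100 * n_both / n_either))         # 86
print(round(100 * n_atlasgal / n_sfr_masers)) # 88
print(round(100 * n_higal / n_sfr_masers))    # 88
```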
The number of associations with sources from radio surveys of the Galaxy is much smaller: 236 (33%) for THOR (1-2 GHz) and 150 (21%) for CORNISH (5 GHz). Thus, more than two-thirds of sources with star formation activity revealed by H2O maser emission have no associated radio continuum emission. As a cautionary note, we mention that the number of 22 GHz maser observations of Galactic radio sources is not large. The CORNISH catalog contains 2637 sources, but according to the maser database, only 333 of these sources have observations of 22 GHz water masers. Among the 333 observed radio sources, 22 GHz masers were detected in 229 sources, i.e., in ∼2/3 of the observed sample. However, the number of observed radio sources does not allow us to draw clear conclusions about the maser association rate. The observed sample contains mostly bright radio sources (mean S_int ∼ 300 mJy), while fainter sources remain unobserved. The mean value of S_int for the whole CORNISH catalog is ∼55 mJy. A systematic 22 GHz water maser survey of radio sources is required to investigate the H2O maser association rate.
The analysis suggests that it is most efficient to search for masers in the water vapor line in the direction of the sources from the ATLASGAL, Bolocam, and Hi-GAL surveys, as these surveys have the greatest number of associations with maser sources among the surveys considered here. In the following analysis, we did not consider the Bolocam catalog and used only the ATLASGAL and Hi-GAL catalogs to study the statistics of SFRs with water masers. Although there is significant overlap between these catalogs, and ATLASGAL may be considered a biased subset of the Hi-GAL catalog, we consider both of them for internal checking and obtaining more reliable results.

Generalized Linear Model of Maser Presence
As shown in the previous section, the water maser sources have a maximum association rate with ATLASGAL and Hi-GAL CSC. However, the number of sources from these catalogs is much higher than what can be observed practically with available facilities. Instead, we study the physical parameters of those clumps that already have H 2 O maser observations to find the dependence between these parameters and maser detection statistics. We used the GLM (McCullagh & Nelder 1989) to construct a model for the prediction of water maser emission in a particular infrared or submillimeter source. We used the function glm, which is part of the base R package (R Team 2010). The physical parameters of the clumps were used as input catalogs for GLM. Both Hi-GAL CSC and ATLASGAL CSC were investigated to find the model with the best predicting power. To create a training set, we crossmatched the Hi-GAL and ATLASGAL CSC with the water maser observations stored in the maser database. The details of the crossmatching are presented in Section 2.2.
The GLM returns the maser detection probability in the form of
$$p = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_i \beta_i x_i\right)\right]}, \qquad (1)$$
where the β values are regression coefficients and x_i are predictor variables. The choice of an appropriate cutoff threshold (e.g., 0.5) determines whether an object is classified as having an associated maser or not. Four outcomes of the classification are possible: correct prediction (true positive and true negative) and false prediction (false positive and false negative). A quantitative measure of the predicting power of the model is its accuracy, defined as the fraction of correctly classified sources in the total sample:
$$\mathrm{Accuracy} = \frac{\mathrm{True\ Positives} + \mathrm{True\ Negatives}}{\mathrm{Total\ sources\ in\ sample}}. \qquad (2)$$
The accuracy defined above cannot be considered the absolute accuracy of the model because it does not account for the accuracy of the training set itself. If some bias exists in the training set, for example, weak masers are missed, then the GLM will be biased toward brighter masers. That may be the case in the GLM for cIM masers described in Section 2.4.3. The combination of measurement and systematic uncertainties in the underlying training set ultimately limits the accuracy that can be obtained with any classification technique (Manning et al. 2016).
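A minimal sketch of how a detection probability and the accuracy measure can be evaluated is given below. It assumes the logistic (logit-link) form commonly used for binary outcomes; the coefficient values, function names, and toy labels are hypothetical and are not the fitted values of this paper.

```python
import math

def detection_probability(intercept, coeffs, predictors):
    """Logistic probability from a linear predictor (assumed logit link)."""
    eta = intercept + sum(b * x for b, x in zip(coeffs, predictors))
    return 1.0 / (1.0 + math.exp(-eta))

def accuracy(probabilities, detected, threshold=0.5):
    """(true positives + true negatives) / total sources in the sample."""
    correct = sum((p >= threshold) == bool(d) for p, d in zip(probabilities, detected))
    return correct / len(detected)

# Hypothetical coefficients for two log-scaled predictors
p = detection_probability(-1.0, [2.0, 0.5], [0.4, 0.4])
print(round(p, 3))  # 0.5, because the linear predictor is zero here

# Toy sample: two sources classified correctly, two incorrectly
print(accuracy([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 1]))  # 0.5
```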
Before performing the actual maser prediction, we run a GLM test using synthetic columns. These columns are a function of the maser detection with different degrees of noise. The analysis shows that combinations of different parameters give higher predicting accuracy in comparison to the singleparameter model. The tests also confirm that the GLM stepwise refinement procedure correctly excludes insignificant parameters for predicting the presence of a maser. The details are given in Appendix A.
The GLM has the output of maser detection probability (p). To convert it to detection or nondetection, we have to choose a threshold. Usually in GLM analysis, the assumed threshold is p = 0.5 (Manning et al. 2016). We tested different values of p. In Figure 3 we present the dependence of the assumed maser detection threshold on the different GLM prediction accuracies. The best accuracy of the prediction is achieved at p ∼ 0.5 in all considered models and data sets. However, different values of p may be assumed for other tasks. For example, the best accuracy of the GLM model is achieved at p = 0.7 when the sample of observed sources has the noise level of σ = 0.2 Jy, while p = 0.5 is a good estimation for observations with σ = 0.05 Jy, which will be discussed in Section 2.5.
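The threshold scan behind a curve such as the one in Figure 3 can be sketched as follows: the accuracy is recomputed on a grid of thresholds and the best value is kept. The probabilities and detection flags are invented toy values, and the function names are assumptions.

```python
def accuracy_at(probabilities, detected, threshold):
    """Fraction of sources whose thresholded prediction matches the observation."""
    correct = sum((p >= threshold) == bool(d) for p, d in zip(probabilities, detected))
    return correct / len(detected)

def best_threshold(probabilities, detected, thresholds):
    """Return (threshold, accuracy) with the highest accuracy; ties keep the lowest threshold."""
    best = None
    for t in thresholds:
        acc = accuracy_at(probabilities, detected, t)
        if best is None or acc > best[1]:
            best = (t, acc)
    return best

# Hypothetical model probabilities and observed detection flags
probs = [0.35, 0.45, 0.55, 0.62, 0.72, 0.75, 0.80, 0.90]
flags = [0, 0, 0, 0, 1, 1, 1, 1]
grid = [t / 10 for t in range(3, 10)]  # 0.3, 0.4, ..., 0.9

print(accuracy_at(probs, flags, 0.5))     # 0.75
print(best_threshold(probs, flags, grid)) # (0.7, 1.0)
```

In this toy sample the sources with intermediate probabilities are nondetections, so a threshold above 0.5 classifies them correctly.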

GLM of Water Masers Presence Using Hi-GAL CSC Data
About 10% of matches between Hi-GAL CSC and water masers are excluded from the training set due to the absence of physical parameters or distance estimation for them. In the analysis, we used a matching radius of 30″ (see details in Section 2.2). The resulting training-set size is 2832 sources-1194 (41%) sources have a H 2 O maser detection while the remaining sources are nondetections at 22 GHz. In total, 51 papers with 22 GHz maser observations were used for this training set, but the majority of the data (76%) comes from the following papers-Svoboda et al. When several observations are available for a particular source with a different status (detection and nondetection), we prioritize the maser detection over the nondetection, taking account of the variability of the masers. The observations with a noise level greater than 0.5 Jy are not included in the training set. The mean noise level of the observations included in the training set is 0.21 Jy.
The following physical parameters from Hi-GAL CSC were tested as predictors of water maser presence: source distance (D), source linear diameter (d), clump total mass (M), dust temperature of the clump derived from the modified blackbody fit (T), bolometric luminosity (L bol ), ratio of the bolometric luminosity to the submillimeter (λ 350 μm) luminosity (L ratio ), bolometric temperature (T bol ), luminosity-to-mass ratio (L/M), and surface density (Σ). For all values except the source distance and diameter, a logarithmic function is applied before the GLM approximation.
We ran a stepwise refinement in order to find the models with the best predicting power that contain only significant variables (p-value ≤ 0.0013). In the initial run, we used all available physical parameters. Then, those parameters that had a p-value greater than 0.001 were removed from the list of parameters used, and the stepwise refinement was repeated to further test the significance of the remaining parameters. This procedure was repeated until all parameters used became highly significant (p-value ≤ 0.0013). From this analysis, we obtained the regression coefficients for the probability of the 22 GHz maser detection given in Equation (3), where values in brackets are the standard errors. The regression contains only two independent variables: dust temperature (T_dust) and surface density (Σ). We repeated the GLM stepwise refinement using the distance-limited sample (2 < D < 5 kpc) to exclude the far and near sources. The linear regression in this case does not change qualitatively, and the coefficients change only slightly, within the errors.
The threshold of p = 0.5 gives 1030 ± 77 predicted maser detections, while the observed number of maser detections is 1194. The accuracy of Equation (3) at p = 0.5 on the full sample of observed Hi-GAL sources is estimated to be 73.3% ± 0.1%.
We also ran the accuracy test for the models that contain only one variable, either surface density or dust temperature. The coefficients for such models were found to be similar to those of the model with two parameters. The linear regression for the surface-density model is given by Equation (4). The accuracy of the model with only the surface density as the maser predictor is found to be 69.3% ± 0.8%. For the dust-temperature-only model, the accuracy is 65.6% ± 0.6%. Thus, the combination of surface density and dust temperature results in an increase of 4% in the prediction accuracy compared to the surface-density-only model. The uncertainty in accuracy due to standard errors of p does not exceed 0.6%, thus we treat this increase as a significant one.
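A single-predictor logistic fit with a significance check can be sketched as below. This is not the R glm call used in the paper: it is a hand-written Newton-Raphson fit with a Wald test, assumed here as a stand-in for the coefficient p-values that such a fit reports, and the data are synthetic.

```python
import math

def fit_logistic_1d(x, y, n_iter=50, tol=1e-10):
    """Fit p = 1/(1+exp(-(b0 + b1*x))) by Newton-Raphson; return (b0, b1, se_b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # Fisher information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    se_b1 = math.sqrt(h00 / det)    # slope standard error from the inverse information
    return b0, b1, se_b1

def wald_p_value(estimate, std_err):
    """Two-sided p-value of a Wald z-test for a single coefficient."""
    return math.erfc(abs(estimate / std_err) / math.sqrt(2.0))

# Synthetic sample: detection fraction rises with the (log-scaled) predictor
x, y = [], []
for level, n_pos in zip(range(5), (1, 3, 5, 7, 9)):
    x += [float(level)] * 10
    y += [1] * n_pos + [0] * (10 - n_pos)

b0, b1, se_b1 = fit_logistic_1d(x, y)
significant = wald_p_value(b1, se_b1) < 0.0013
print(round(b1, 2), significant)
```

A stepwise refinement would repeat such a fit with several predictors and drop those whose p-value exceeds the chosen cut, which requires a general matrix solve that is omitted here for brevity.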

GLM of Water Masers Presence Using ATLASGAL CSC Data
We repeat the above calculations for the ATLASGAL CSC. The training set was constructed similarly to that for the Hi-GAL CSC but using the matching radius of 60″. After removing the sources with undefined physical parameters, the training set contains 1799 sources with 956 detections. The mean noise level of observations included in the training set is ∼0.1 Jy. The following physical parameters were tested as the maser predictors: source distance (D), dust temperature (T_dust), FWHM radius (r), bolometric luminosity (L_bol), clump mass at FWHM level (M_FWHM), luminosity-to-mass ratio (L_bol/M_FWHM), mass surface density (Σ), peak H2 column density (N_H2), H2 volume density (n), and freefall time of the clump (τ_ff). After removing the insignificant parameters (p-value > 0.0013) and repeating the stepwise refinement until all parameters became significant, we found that the model given by Equation (5) has the best prediction power. The accuracy of Equation (5) while using the threshold of p = 0.5 is estimated to be 70.7% ± 0.2%. The predicted number of maser detections is 958 ± 90, while 956 sources were detected in the training set. The model with only H2 column density as the maser predictor gives an accuracy of 63.6% ± 0.3%.
Figure 3. The accuracy of the GLM maser prediction vs. the maser detection probability threshold. The error bars for accuracy were calculated from the standard error of maser probability. The green and gray lines are the GLM accuracy for 95 GHz cIM masers using Hi-GAL and ATLASGAL CSC data, respectively. The blue and red lines show the accuracy of the GLM for the 22 GHz water masers using Hi-GAL and ATLASGAL CSC data, respectively.

GLM for 95 GHz Class I Methanol Masers
We also used GLM for the prediction of 95 GHz cIM maser emission. We used the Hi-GAL catalogs as input parameters for maser prediction. The training set for the Hi-GAL physical parameters catalog consists of 1060 maser observations at 95 GHz toward the Hi-GAL clumps. The training set was obtained using a search of 95 GHz maser observations toward the Hi-GAL sources and excluding the sources with undefined physical parameters. Sixty-eight percent of this training set was obtained from the work of Yang et al. (2017) and 32% was obtained from other works. We note that the mean noise level of the 95 GHz maser observations (1σ ∼ 1.1 Jy) is much higher than the 22 GHz maser observations (1σ ∼ 0.1 Jy), thus the training set used is not sensitive to weak cIM masers.
From the stepwise refinement of all Hi-GAL physical parameters, we found that the model with two parameters (surface density and dust temperature) has the best prediction power (Equation (6)). The coefficients of the two logarithmic terms in Equation (6) are 4.00 (0.54) and 3.44 (0.25), with standard errors in brackets. Equation (6) has the overall accuracy of 78.8% ± 0.3% while using p = 0.5. The predicted number of detections in the training set is 322 ± 50, while 424 were detected from observations. The analysis of the ATLASGAL CSC results in a model for 95 GHz masers with an accuracy of 77.4% ± 0.9%.
When comparing the prediction accuracy of the GLM for the 22 GHz water and 95 GHz methanol masers, we conclude that the GLM for the 95 GHz methanol masers has higher accuracy than a similar model for water masers. That may be caused by the lower temporal variability of the 95 GHz methanol masers in comparison to 22 GHz water masers or the higher noise level of the 95 GHz maser sample. The latter may result in poor sensitivity to weaker masers. The impact of maser variability on the GLM detection rates will be further discussed in Section 4.2.

Applying the GLM to Individual Samples of the 22 GHz Maser Observations
In Section 2.4.1, it was shown that the GLM constructed from the Hi-GAL catalog using two parameters (surface density and dust temperature; see Equation (3)) has the best prediction accuracy for 22 GHz masers, 73.3% ± 0.1%. However, the training set for this model was constructed from multiepoch observations. In practice, a sample often consists of single-epoch observations with sensitivity limitations, and thus part of the masers can be undetected due to variability and sensitivity limits. We applied the GLM (Equation (3)) to different samples of the 22 GHz maser observations to test its accuracy. The following 22 GHz maser observations were investigated: sample from this work, Svoboda et al. (2016), ), Titmarsh et al. (2014, 2016), and Urquhart et al. (2009, 2011). The results are presented in Figure 4. We found that the maximum accuracy of the GLM (Equation (3)) is in the range of 61%-79%. The lowest accuracy (61% at p = 0.6) is found for the sources observed by Titmarsh et al. (2014, 2016) and the highest accuracy (79% at p = 0.5) is found for the sources observed by Svoboda et al. (2016). For the sample of sources observed in this work, the accuracy is found to be 70.9% ± 0.3% (at p = 0.68). We note that for all observed source samples, with the exception of Svoboda et al. (2016), the threshold of ∼0.7 gives the maximum prediction accuracy. But in the sample observed by Svoboda et al. (2016) and the full combined set of data, the threshold of p = 0.5 gives the best prediction accuracy. We associate this difference with the noise level of Svoboda et al. (2016): the mean value of 1σ is 0.043 Jy, while for the other samples the mean level of 1σ is ∼0.1-0.2 Jy. If we limit the sample of the sources observed by Svoboda et al. (2016) to those with σ > 0.05 Jy, the best accuracy is also achieved at p ∼ 0.7, similarly to other papers. The exceptional case is the observations by , where σ is ∼0.04 Jy, but the best accuracy is achieved at p = 0.7.
We conclude that the best GLM prediction accuracy (∼79%) may be achieved using the threshold of p ∼ 0.5 when the sources were observed with the 1σ level of ∼0.05 Jy. The threshold of p = 0.7 gives the best accuracy of ∼70% when the 1σ level is ∼0.1-0.2 Jy, i.e., fewer masers are detected with lower sensitivity.

On the Total Number of Star Formation Regions with Water Masers
The generalized linear model allows us to estimate the total number of star-forming clumps with a high probability of 22 GHz water maser emission by applying Equation (3) to the whole Hi-GAL CSC. We used the Hi-GAL catalog for this analysis as the ATLASGAL catalog has limitations in the Galactic longitude. Thus, the exact number of the submillimeter clumps in the Galactic plane is unknown. Using the threshold of p = 0.5 for maser detection, we find a total of 2392 ± 339 Hi-GAL sources that may harbor water masers that are detectable with 1σ level of ∼0.05 Jy. That is 3.1% of the total Hi-GAL sources (∼94,600). When the threshold of p = 0.7 corresponding to the 1σ level of water maser observations ∼0.2 Jy is used, then the total number of maser sources is 918 ± 193. However, the observed detection rates may be lower due to maser variability. As shown in Section 3.5, only two-thirds of previously known masers were detected. From the maser database (Ladeyschikov et al. 2022) we found that 1128 Hi-GAL sources with p > 0.5 were observed at 22 GHz, and masers were detected in 801 sources (detection rate 71%). A total of 547 Hi-GAL sources with p > 0.7 were observed and masers were detected in 458 sources (detection rate 84%). Thus, we conclude that currently only one-half (1128 out of ∼2400) of the total population of Galactic H 2 O masers at the sensitivity of σ = 0.05 Jy is detected in the literature and included in the maser database. The same is found for masers at a sensitivity of 0.2 Jy: 457 sources out of ∼900 are observed. That makes a strong case for a targeted or unbiased H 2 O maser survey, which is partially done in this work.
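The counting step and the quoted detection rates can be reproduced with a short sketch. The function names and the toy probability list are illustrative assumptions; the observed and detected counts are those quoted in the text.

```python
def count_predicted(probabilities, threshold):
    """Number of catalog sources whose modeled probability exceeds the threshold."""
    return sum(1 for p in probabilities if p > threshold)

def detection_rate_percent(n_detected, n_observed):
    """Detected sources as a percentage of the observed sources."""
    return 100.0 * n_detected / n_observed

# Hypothetical probabilities for a handful of catalog sources
toy_probabilities = [0.12, 0.55, 0.71, 0.93, 0.40, 0.68]
print(count_predicted(toy_probabilities, 0.5))  # 4
print(count_predicted(toy_probabilities, 0.7))  # 2

# Observed Hi-GAL sources with p > 0.5 and p > 0.7, as quoted in the text
print(round(detection_rate_percent(801, 1128)))  # 71
print(round(detection_rate_percent(458, 547)))   # 84
```

Raising the threshold lowers the number of candidate sources but raises the expected detection rate among them, which is the trade-off relevant when planning observations.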
We compare the predicted number of H 2 O maser detections with the unbiased HOPS survey. According to the Hi-GAL CSC, there are 37,279 sources in the region covered by HOPS: 290° < l < 30°, |b| < 0°.5. When using the threshold of 0.7 for the GLM (Equation (3)), the predicted number of maser detections in the region covered by HOPS is 597 ± 130. The actual number of maser detections in HOPS is 540, but only 353 are associated with sources from the Hi-GAL CSC, while the other masers are mostly associated with evolved stars. Thus, there is a difference between the number of masers predicted by the GLM and the number detected in HOPS. We attribute the difference to the lower sensitivity of HOPS (1σ ∼ 2 Jy), as a threshold of p = 0.7 is applicable when the noise level is σ ∼ 0.1-0.2 Jy. Increasing the threshold to 0.78 leads to 352 ± 73 sources predicted by the GLM, which is close to the observed number (353). On the other hand, if we use the threshold of 0.5, then the number of H 2 O masers predicted by the GLM in the region covered by HOPS is 1581 ± 234. That number could be reached if an unbiased H 2 O maser survey were carried out with a sensitivity of ∼0.05 Jy.
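The dependence of the predicted count on the probability threshold can be illustrated with a short Python sketch. It is only an illustration of the idea: the per-source probabilities below are invented, the helper names are hypothetical, and the procedure actually used to arrive at the threshold of 0.78 is not described in the text.

```python
from typing import Sequence


def count_above(probabilities: Sequence[float], threshold: float) -> int:
    """Number of sources whose detection probability exceeds the threshold."""
    return sum(1 for p in probabilities if p > threshold)


def threshold_for_count(probabilities: Sequence[float], target: int,
                        steps: int = 100) -> float:
    """Scan thresholds in [0, 1] and return the first one whose predicted
    count is closest to the target number of detections."""
    best_threshold, best_diff = 0.0, None
    for i in range(steps + 1):
        threshold = i / steps
        diff = abs(count_above(probabilities, threshold) - target)
        if best_diff is None or diff < best_diff:
            best_threshold, best_diff = threshold, diff
    return best_threshold


# Hypothetical probabilities for seven sources (not catalog values).
toy_probabilities = [0.10, 0.30, 0.55, 0.60, 0.72, 0.80, 0.90]
print(count_above(toy_probabilities, 0.5))  # looser threshold, more sources
print(count_above(toy_probabilities, 0.7))  # stricter threshold, fewer sources
print(threshold_for_count(toy_probabilities, target=2))
```

A less sensitive survey corresponds to a stricter threshold, which is the sense in which the text raises p from 0.7 to 0.78 to match the HOPS count.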
Another unbiased H 2 O survey is that of Caswell et al. (2010), which was conducted with the ATCA. In the first session, the covered region is 305° < l < 306°.26, |b| < 0°.15, while for the second session the covered region is 311° < l < 312°.18, |b| < 0°.15. In both regions, there are 330 Hi-GAL sources. The GLM predicts 26 ± 4 maser detections when using the threshold of p = 0.5, corresponding to a sensitivity of ∼0.05 Jy. The observed number of H 2 O masers for both regions is 30, of which 5 are known evolved stars. Thus, the observed number of masers associated with Hi-GAL sources (25) is close to the number predicted by the GLM (26 ± 4).
We conclude that with GLM, it is possible to estimate the detection rates of H 2 O masers in different samples of SFRs with associated Hi-GAL sources that are not yet observed. It can be useful while planning observations. However, the accuracy of maser prediction in individual sources is ultimately limited due to maser variability and limitations on the sensitivity. We were not able to achieve an accuracy of more than ∼70% on different observed samples except for Svoboda et al. (2016) (see Figure 4).
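How a prediction accuracy of this kind can be evaluated is sketched below in Python. The sketch assumes that accuracy means the fraction of sources whose predicted status (p above the threshold) matches the observed detection status; the probabilities and detection flags are invented for illustration and the function names are hypothetical.

```python
from typing import List, Sequence


def predict(probabilities: Sequence[float], threshold: float) -> List[bool]:
    """Predicted detection flag for each source."""
    return [p > threshold for p in probabilities]


def accuracy(predicted: Sequence[bool], observed: Sequence[bool]) -> float:
    """Fraction of sources whose predicted status matches the observed one."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical probabilities and observed detection flags for five sources.
toy_probabilities = [0.9, 0.8, 0.6, 0.4, 0.2]
toy_detected = [True, False, True, False, False]

for threshold in (0.5, 0.7):
    score = accuracy(predict(toy_probabilities, threshold), toy_detected)
    print(f"threshold {threshold}: accuracy {score:.2f}")
```

Because a variable maser can be below the noise level at a given epoch, a single-epoch nondetection counts against the model in this metric even when the source does host a maser.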

Target Selection Criteria for Water Maser Observations
The number of infrared (∼94,000 sources from Hi-GAL CSC) and submillimeter (∼10,000 sources from ATLASGAL CSC) clumps significantly exceeds the practical limit of targeted H 2 O maser observations. We propose using the H 2 O maser detection probability (p) from the GLM (see details in Section 2.4) to limit the sample size by selecting the sources with the highest maser detection probability. As shown in Section 2.6, only one-half of the Hi-GAL sources with a high probability of maser detection have been observed, reported in the literature, and included in the maser database. We created three representative target samples:
(1) Unknown (hereafter U): 91 sources not observed previously in Svoboda et al. (2016) (hereafter S16) or in other water maser surveys (according to the MaserDB database), but having a high maser detection probability (p > 0.25). This target sample provides the likely criteria for the detection of previously unknown water masers.
(2) Detected (hereafter D): 100 sources with a water maser detection in S16 and a high maser detection probability (p > 0.7), excluding well-known sources, i.e., those that have more than four observations. This sample aims to study the variability of previously detected, but not well-studied, water masers, as well as to search for maser flares and cases of strong variability.
(3) Nondetected (hereafter N): 120 sources that were previously observed by S16, but were not detected at the 5σ level of 0.1 Jy, while they show a maser detection probability p > 0.3. This sample provides observational evidence of maser presence or absence in sources having single-epoch nondetections. According to the archival observations from the MaserDB database, the sources that were not detected at one epoch have a chance of being detected at another epoch. For example, 39% of sources with nondetections in the survey by Kim et al. (2018) were detected by other surveys.
Additionally, we reobserved the sources that were initially targeted by our Pushchino RT-22 telescope survey of 100 ATLASGAL sources (Ladeyschikov et al. 2022), excluding those already observed under criteria (1)-(3). This sample consists of 50 sources, including 4 sources detected previously in the RT-22 observations (hereafter P) and 46 nondetected sources (hereafter PN). These sources were initially selected using the following criteria: the presence of an ATLASGAL source with F_peak,870 μm > 1.0 Jy and the absence of H 2 O maser observations in the literature. This sample complements the U sample and provides a comparison between the detection rates obtained with the RT-22 and Effelsberg telescopes.

The Effelsberg 100 m Observations
We conducted simultaneous observations of the 22 GHz (6_16−5_23) H 2 O maser line and the NH 3 (1,1), (2,2), and (3,3) lines for the targets described in Section 3.1 using the Effelsberg 100 m radio telescope 9 in the period between October and December 2021 (project code: 80-21). The observations were made in position-switching mode with the dual-polarization, double-beam K-band receiver at the secondary focus (S14mm Double Beam RX). Only data obtained from the beam pointed at the targets are used in this work.
To be able to compare our results with the observations of S16, which have a velocity resolution of 0.32 km s −1 , we chose high-spectral-resolution fast Fourier transform spectrometers (FFTSs) as our backends. Each FFTS provides a bandwidth of 300 MHz and 65,536 channels, which results in a channel spacing of 4.6 kHz (i.e., 0.06 km s −1 at 22 GHz), sufficient for comparison with the results of S16. The adopted rest frequency of the water maser observation was 22,235.08 MHz. For ammonia, the rest frequencies were 23,694.495, 23,722.633, and 23,870.129 MHz for transitions (1,1), (2,2), and (3,3), respectively. The half-power beamwidth of the Effelsberg telescope was ∼40″ at 22 GHz and ∼37″ at 23 GHz. The system temperature was typically around 110 K. The pointing and focus were checked roughly every 2-3 hr. The typical pointing uncertainties of the Effelsberg telescope were 5″-10″. The on-source time for each position was 1-1.5 minutes, which allowed us to achieve a typical noise level of about 0.16 Jy at 0.06 km s −1 at the H 2 O frequency. For the NH 3 data, we binned three channels to increase the signal-to-noise ratio. The typical noise level is 0.14 K at a resolution of 0.48 km s −1 . The histogram distribution of the noise level for the 22 GHz observations is shown in Figure 5.
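The channel spacing and velocity resolution quoted above can be checked with a few lines of Python. The sketch below also rescales the noise to wider channels under the assumption that the noise falls as the inverse square root of the channel width (uncorrelated channels); that assumption and the function names are ours, not taken from the text.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s


def channel_spacing_khz(bandwidth_mhz: float, n_channels: int) -> float:
    """Channel spacing in kHz for a given bandwidth and channel count."""
    return bandwidth_mhz * 1.0e3 / n_channels


def velocity_width_km_s(channel_khz: float, rest_freq_mhz: float) -> float:
    """Velocity width of one channel via the radio Doppler relation."""
    return C_KM_S * (channel_khz * 1.0e-3) / rest_freq_mhz


def rebinned_noise(noise: float, width_old: float, width_new: float) -> float:
    """Noise after rebinning, assuming noise scales as 1/sqrt(channel width)."""
    return noise / math.sqrt(width_new / width_old)


spacing = channel_spacing_khz(300.0, 65536)     # about 4.58 kHz
dv = velocity_width_km_s(spacing, 22235.08)     # about 0.062 km/s
noise_03 = rebinned_noise(0.16, 0.06, 0.3)      # about 0.07 Jy at 0.3 km/s

print(f"{spacing:.2f} kHz, {dv:.3f} km/s, {noise_03:.3f} Jy")
```

Under that scaling, the 0.16 Jy noise at 0.06 km s −1 corresponds to roughly 0.07 Jy at 0.3 km s −1 , which is the value the text uses when comparing with S16.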
The spectra are recorded in units of the calibration diode signal and are converted to the antenna temperature, T_A, by multiplying by the noise diode temperature, T_cal. Following the method introduced by Winkel et al. (2012), we used the source NGC 7027 as the flux calibrator to derive T_cal. The flux density of NGC 7027 was 5.61 Jy at 22.2 GHz and 5.58 Jy at 23.7 GHz based on the regular monitoring by the Effelsberg 100 m observatory. We convert T_A to flux density in units of Jansky for the H 2 O maser observations using the relation F(Jy) = T_A/η, where the gain η = 1.04 K Jy −1 . For the NH 3 data, we convert the antenna temperature to the main-beam temperature, T_mb = T_A/η_mb, where the main-beam efficiency η_mb = 0.589. The spectra were recorded using two circular polarizations. Because the calibration difference between the two polarizations is less than 3%, we used the same T_cal value for both polarizations. For the entire observational program, we used T_cal = 11 K for the 22 GHz H 2 O maser observations and 10 K for the NH 3 observations. In this work, velocities are given with respect to the local standard of rest (LSR).
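The calibration chain described above can be written as a short Python sketch. The constants are the values given in the text; the function names and the example line peak of 0.5 calibration units are hypothetical.

```python
GAIN_K_PER_JY = 1.04   # gain eta, from the text
MAIN_BEAM_EFF = 0.589  # main-beam efficiency eta_mb, from the text


def antenna_temperature_k(cal_units: float, t_cal_k: float) -> float:
    """Convert a value in calibration-diode units to antenna temperature."""
    return cal_units * t_cal_k


def flux_density_jy(t_a_k: float) -> float:
    """Flux density for the H2O data: F = T_A / eta."""
    return t_a_k / GAIN_K_PER_JY


def main_beam_temperature_k(t_a_k: float) -> float:
    """Main-beam temperature for the NH3 data: T_mb = T_A / eta_mb."""
    return t_a_k / MAIN_BEAM_EFF


# Hypothetical line peak of 0.5 calibration units in the 22 GHz band.
t_a = antenna_temperature_k(0.5, t_cal_k=11.0)
print(f"T_A = {t_a:.2f} K -> F = {flux_density_jy(t_a):.2f} Jy")
```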
The CLASS program from the GILDAS software package (Pety 2005) was used for data processing, including baseline subtraction, binning (for NH 3 data), Gaussian fitting, and plotting of the spectra.
In this work, we present the data on H 2 O maser emission. The data on ammonia emission will be presented in a future work.

Results of the Effelsberg Maser Search
Using the GLM described in the previous section, we selected the sources with maximum maser detection probabilities from the ATLASGAL CSC to search for 22 GHz water maser emission. The majority of the sources initially selected from the ATLASGAL catalog also have infrared Hi-GAL counterparts. Due to the higher accuracy of GLM using Hi-GAL CSC (see Section 2.4.1), we use predictions from the Hi-GAL catalog.
A summary of the maser detection rates is presented in Table 1, as well as the mean maser detection probability for each source sample. In Table 2 we present the 22 GHz water maser detections, where p AGAL and p HiGAL are the GLM maser detection probability calculated from the ATLASGAL and Hi-GAL CSC (see Section 2.4). In Figure 6 we present the spectra of the detected masers and in  (5), the predicted number of maser detections is 54 ± 7 when using the threshold of p = 0.5. Thus, the detection rate from the GLM is 39% ± 5%. This implies that the actual detection rate for the 22 GHz H 2 O masers is ∼1.3 times lower than the detection rate expected from the GLM.

Note.
a "PN" sources are those observed previously using the Pushchino RT-22 telescope (Ladeyschikov et al. 2022) that do not show water maser emission brighter than σ = 3 Jy. In the context of the Effelsberg observations, these sources have the same meaning as unobserved. "P" sources are those detected previously using the Pushchino RT-22 telescope. (This table is available in its entirety in machine-readable form.)

p = 0.5, resulting in a detection rate of 91% ± 3% and 48 ± 10 with a threshold of p = 0.7.
Furthermore, we observed 118 sources that were previously targeted by Svoboda et al. (2016) and showed no detection of 22 GHz maser emission at σ ∼ 0.04 Jy for a channel width of 0.32 km s −1 . The sensitivity of the Effelsberg survey from this work corresponds to ∼0.07 Jy at a channel width of 0.3 km s −1 ; thus our survey and the survey of Svoboda et al. (2016) are comparable in terms of sensitivity. Water maser emission was detected in 15 sources, corresponding to a detection rate of 13%. Thus, at least ∼13% of the sources from the observed subsample of the previously nondetected masers were in a minimum of activity (i.e., were not detectable in the archival observations), but appear in the Effelsberg observations. Moreover, from the water maser database (Ladeyschikov et al. 2022) we found an additional 14 sources that have an H 2 O maser detection ("N" sources with the following numbers: 3, 5, 6, 23, 30, 32, 34, 42, 48, 49, 50, 60, 76, and 85) but were not detected in the Effelsberg observations. Thus, in total, we found 29 sources with a maser detection in the literature or in the Effelsberg observations, resulting in a detection rate of 24%. The expected number of maser detections from the GLM is 66 ± 6 when using the threshold of p = 0.5, resulting in a detection rate of 56% ± 5%. Thus, the actual detection rate of the previously nondetected sources is ∼2.3 times lower than predicted.
In the last two columns of Table 1, we show the number of maser detections predicted by the GLM (Equation (3)) and the corresponding model accuracy. From the full sample of 359 observed sources, we were able to predict the detection of 97 sources among 124 actual detections (78%). Nevertheless, the number of false-positive detections is ∼3 times larger than the number of true-positive detections. Thus, for a particular observation session, it is almost impossible to predict the observed detection and nondetection status of highly variable water masers. Even for those sources for which a water maser detection has been reported before and that have a predicted GLM detection rate of 91%, we find a detection rate of 65%; approximately one-third of the sources have become undetectable in the observations. That means that an observed nondetection should not always be considered as an absence of water masers but rather as a minimum of activity. Only multiepoch observations help to establish the presence of a maser in a particular source. In this regard, the archival observations and the GLM may provide important information regarding the status of water maser emission. If water maser emission is not detected toward a source, but this source has a high GLM maser detection probability, then it is possible that it will be detected after some period of time.
To summarize the detection rates for H 2 O maser emission, we found that the observed detection rates are always lower than those predicted by the GLM. For sources that were previously detected or never observed, the GLM detection rate is ∼1.3 times higher than the actual observed detection rate. For the sources that were previously nondetected, however, the GLM detection rate is ∼2.3 times higher than the actual one. The reasons for this difference will be discussed in Section 4.2.

Velocity Range and Difference between Maser and Systemic Gas Velocity
We analyzed the H 2 O maser velocity range and the difference between the systemic velocity and the H 2 O maser peak velocity for the sample of observed sources. The systemic velocities were taken from the ATLASGAL database. 10 Large velocity differences could indicate energetic driving sources. For example, in W49N, the H 2 O masers can reach about ±300 km s −1 away from the systemic velocity (Morris 1976; McGrath et al. 2004; Kramer et al. 2018). The H 2 O maser peak velocity, systemic gas velocity, and H 2 O maser velocity range are presented in Table 2.
For the majority of the detected maser sources, the profiles contain the brightest peak component and several fainter components that have a velocity range <10-15 km s −1 from the peak component. In 103 sources (83% of the total sample), the velocity range does not exceed 15 km s −1 .
Nevertheless, we identify 11 sources that have a large (>30 km s −1 ) maser velocity range. These sources are presented in Table 3.
We also analyzed the difference between the 22 GHz maser velocity range and the systemic gas velocity, obtained from the molecular line observations stored in the ATLASGAL database. For the majority (77%) of the detected sources, the difference between the maser and gas velocity does not exceed 15 km s −1 . We identify 13 sources with a difference larger than 30 km s −1 . These sources are presented in Table 4. The prominent sources with a large gas velocity offset are U98 (>100 km s −1 ), P1 (>80 km s −1 ), and U46 (>70 km s −1 ). A significant maser emission velocity range was also detected in the sources P4 (>130 km s −1 ), D12 (>80 km s −1 ), and D5 (>60 km s −1 ). However, the source U98 is quite close (5′) to the W49 region, and its maser profile resembles that of the prominent W49 source; we therefore assume that this source is affected by the nearby bright maser and that the maser line is not associated with the submillimeter source. The P1 source was first discovered in a previous paper (Ladeyschikov et al. 2022), but with only one component at ∼−70 km s −1 . The current Effelsberg observations reveal a new narrow component at ∼40 km s −1 , while the component at ∼−70 km s −1 remains. In the U46 source, it is notable that the systemic gas velocity (7.1 km s −1 ) is significantly different from the maser velocity range (−72.6 to −36.1 km s −1 ). We checked the ammonia data toward this source and confirmed that the systemic gas velocity is ∼7 km s −1 .
... are the 22 GHz H 2 O maser peak velocity and velocity range. V span is the difference between the maximum and minimum velocity of the 22 GHz maser emission and V sys is the systemic gas velocity.
Note. Other columns are the same as in Table 3.
10 https://atlasgal.mpifr-bonn.mpg.de
The sources presented in Tables 3 and 4 may be considered for follow-up studies, as they indicate energetic driving sources that accelerate the maser components to velocities of more than 30 km s −1 relative to the systemic gas velocity.
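The two quantities used in this section, the maser velocity span and the offset from the gas velocity, can be computed as in the following Python sketch. It assumes that the offset is measured from the edge of the maser velocity range farthest from the gas velocity, which may differ from the definition used for the tables; the function names are hypothetical, and the U46 values are those quoted in the text.

```python
def velocity_span(v_min: float, v_max: float) -> float:
    """Difference between maximum and minimum maser velocity (km/s)."""
    return v_max - v_min


def max_offset_from_gas(v_min: float, v_max: float, v_gas: float) -> float:
    """Largest offset of the maser velocity range from the gas velocity."""
    return max(abs(v_min - v_gas), abs(v_max - v_gas))


def is_high_velocity(value_km_s: float, limit_km_s: float = 30.0) -> bool:
    """Flag values above the 30 km/s limit used in the text."""
    return value_km_s > limit_km_s


# U46: maser range -72.6 to -36.1 km/s, gas velocity 7.1 km/s.
u46_span = velocity_span(-72.6, -36.1)
u46_offset = max_offset_from_gas(-72.6, -36.1, 7.1)
print(f"span {u46_span:.1f} km/s, offset {u46_offset:.1f} km/s, "
      f"high velocity: {is_high_velocity(u46_offset)}")
```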

Observations of Maser Variability
Using the Effelsberg 100 m telescope, we observed 100 sources that have a detection of H 2 O maser in the archival data starting from 2010. We did not include earlier works in this analysis due to high rms noise levels and the significant time interval between the current observations and archival data. Only sources that have a maximum of four observations were included in this sample-we exclude well-known sources and reobserve only those sources that have a small number of observations in the available literature.
For each source, we calculate the variability index following the definition of Palagi et al. (1993):
V_i = F_max / F_min,
where F_max and F_min are the maximum and minimum flux density over all observed epochs. When a maser was not detected, F_min is set to the 3σ level of the observation.
In Table 5 we present sources with a variability index (V_i) higher than 5 and a maximum flux density (F_max) of more than 5 Jy to highlight the sources with significant variability. The last criterion (F_max > 5 Jy) was included to exclude the faint masers that appear after a nondetection and that, due to the low 3σ level of the nondetection, have an enormously large variability index.
Note. Only sources with a variability index (V_i) larger than 5 and maximum flux density (F_max) larger than 5 Jy are shown. The ] and Z symbols correspond to a decrease and increase of the source flux density in the Effelsberg observations with respect to the archival data.
In total, 22 sources satisfy these criteria. The maximum variability index is 106.4 for the source D96 (AGAL 033.811-00.187). In the paper by Urquhart et al. (2011), this source has a peak 22 GHz flux density of 44.67 Jy, but we detect no emission at the 3σ level of 0.42 Jy. Svoboda et al. (2016) observed this source and detected an H 2 O maser with a peak flux of 2.365 Jy. The time span for the variability index, defined by the observations of Urquhart et al. (2011) and this work, is ∼13 yr.
Another interesting source is D16. This source has the second-largest variability index (48.6) with an increase in the peak flux density in the current observations. The Australia Telescope Compact Array (ATCA) observations of this source by Titmarsh et al. (2014) resulted in a nondetection at a sensitivity of σ = 0.15 Jy, while the observations of Svoboda et al. (2016), Xi et al. (2015), and Cyganowski et al. (2013) reveal detections with a peak flux density of ∼1-3 Jy. We detect an H 2 O maser with a peak flux density of 21.8 Jy.
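The variability index and the selection criteria can be illustrated with a short Python sketch. It assumes that the index is the ratio of the maximum to the minimum flux density, with nondetections entering as 3σ upper limits; this assumption reproduces the value of 106.4 quoted for D96. The function names are hypothetical.

```python
from typing import Sequence


def variability_index(detected_fluxes_jy: Sequence[float],
                      upper_limits_3sigma_jy: Sequence[float] = ()) -> float:
    """Ratio F_max / F_min; nondetections contribute their 3-sigma level
    as candidate values for F_min only."""
    if len(detected_fluxes_jy) == 0:
        raise ValueError("at least one detection is required")
    f_max = max(detected_fluxes_jy)
    f_min = min(list(detected_fluxes_jy) + list(upper_limits_3sigma_jy))
    return f_max / f_min


def is_strongly_variable(detected_fluxes_jy: Sequence[float],
                         upper_limits_3sigma_jy: Sequence[float] = (),
                         min_index: float = 5.0,
                         min_flux_jy: float = 5.0) -> bool:
    """Selection used in the text: V_i > 5 and F_max > 5 Jy."""
    index = variability_index(detected_fluxes_jy, upper_limits_3sigma_jy)
    return index > min_index and max(detected_fluxes_jy) > min_flux_jy


# D96: detections of 44.67 Jy and 2.365 Jy, nondetection at a 3-sigma level of 0.42 Jy.
d96_index = variability_index([44.67, 2.365], [0.42])
print(f"D96 variability index: {d96_index:.1f}")
```

The flux criterion matters for faint sources: a 1 Jy maser that appears after a deep nondetection can have a large index without being a strong flare candidate, so `is_strongly_variable([1.0], [0.1])` is false.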
To summarize the variability of the selected water maser sources, we found that among the 22 sources, ten revealed a significant (more than five times) increase in the H 2 O maser flux, while 11 revealed a significant decrease. For the source D44, the flux did not change significantly in our observations (F peak = 2.6 Jy), but the variability index was high due to the large difference between the values measured by Xi et al. (2015) (20.8 Jy) and Svoboda et al. (2016) (0.66 Jy). However, the change in the maser flux reported here is not directly associated with maser flares of limited temporal duration, as the difference between the observed epochs may be up to 10 yr. Nevertheless, our study provides possible candidates for flare sources exhibiting significant variability: only 22% of the 100 previously detected sources observed in this work have a variability index higher than 5. These sources should be considered for follow-up monitoring programs to study their variability further, as they offer the best chance of catching transient events.
We also checked the sources from the N sample (previously nondetected in the archive) and the PN sample (previously nondetected in the PRAO RT-22 observations) for variability, but detected no significant (V_i > 5, F_max > 5 Jy) change in the maser flux. All sources detected in the N sample have a flux density below 5 Jy.

Correlation between Maser Species
For the study of the physical parameters of the maser-associated ATLASGAL and Hi-GAL clumps, we used a sample of different maser species from the maser database (Ladeyschikov et al. 2019, 2022). This has the advantage of using many different surveys and including all maser sources found in the literature. However, the resulting sensitivity of the maser sample is not homogeneous because the data were compiled from different surveys and telescopes. Nevertheless, for OH and class II CH 3 OH masers, the contribution of the unbiased surveys dominates over that of the targeted surveys in the maser database. For example, 279 OH masers (86%) from a total of 324 OH masers associated with SFRs and included in the database were detected in The HI/OH/Recombination line survey (THOR; 14°.5 < l < 66°.8, Beuther et al. 2019). For the 6.7 GHz methanol masers, 850 11 (78%) of the known 6.7 GHz methanol masers (1082) in the Galactic longitude range 186° < l < 60° were detected in the MMB survey (e.g., Caswell et al. 2010; Breen et al. 2015). We found that 232 of the 6.7 GHz methanol masers (22% of the total known population) located in that part of the Galaxy were not detected in the MMB survey.
In general, the THOR and MMB unbiased surveys may be considered sensitive enough to avoid missing a significant part of the maser populations in the corresponding Galactic longitude ranges. Variability of maser sources and sensitivity limitations may be responsible for the ∼20% of sources not detected by the MMB and THOR surveys. This estimate is consistent with the estimate of the total methanol maser population. Green et al. (2017) report a total population of 1032 masers, which, factoring in the completeness estimate (Green et al. 2009), would imply a true total population of ∼1290 masers. Thus, 20% of the total population is missed in the MMB survey. However, for water masers, the situation is quite different. The H 2 O Southern Galactic Plane Survey (HOPS; Walsh et al. 2011) has a lower sensitivity (∼2 Jy) in comparison to MMB (∼0.17 Jy; Caswell et al. 2010). In the HOPS Galactic longitude range (290° < l < 30°), we found that this survey catches only 38% (540) of all 22 GHz H 2 O masers found in the literature in this longitude range (1377). When limiting the maser sample to SFRs, HOPS covers only 41% (325) of all masers included in the database (798). The significant number of masers missed by a blind survey such as HOPS may be associated not only with the significant temporal variability of H 2 O masers but also with the limited sensitivity of the survey. However, even if we hypothetically conducted an unbiased H 2 O survey with a sensitivity similar to that of MMB, the maser sources that are in an activity minimum would be missed anyway. As was previously shown, in the Effelsberg observations (see Section 3.3), only ∼50% of the previously known H 2 O masers were detected, although the previously reported flux densities of these masers are sufficient for detection at the sensitivity level achieved.
Thus, an unbiased survey of water masers has significant limitations in terms of finding most of the maser population even if conducted with high sensitivity.
In this regard, combining all the previous observations from the literature may be the way to partially overcome the variability issue of H 2 O masers. As the maser database contains observations from different epochs, even if a maser was not detected due to its minimum activity phase, it can be detected at other epochs. A higher sensitivity of targeted observations will significantly increase the available H 2 O maser sample in the maser database compared to the unbiased HOPS. For H 2 O masers in SFRs, the increase is ∼145% (from 325 to 798) when considering the Galactic longitude range of HOPS and ∼321% (from 325 up to 1371) when considering the whole Galactic plane.

On the Detection Rates of the 22 GHz Emission
From the results of the 22 GHz maser observations, we found that the actual detection rate may be ∼1.3 times lower than predicted by the GLM. In the case of the previously nondetected sources, the detection rate may be ∼2.3 times lower than predicted by the GLM. Because the GLM detection rate is based only on the submillimeter clump physical parameters and does not consider the maser characteristics and observational limitations, it is expected that the observed 22 GHz maser detection rate will be lower. The possible factors that may influence the observed detection rate of the H 2 O masers are the following: maser beaming, maser variability, and sensitivity limitations.
To quantitatively measure the influence of temporal variability on the difference between the GLM and actual detection rates, we used the data for the 95 GHz cIM masers from Yang et al. (2017). As cIM masers are expected to show little time variation (Menten et al. 1988; Kurtz et al. 2004; Leurini et al. 2016; Yang et al. 2020), we expect that the relation between the predicted and observed detection rates for this type of maser will differ from that for the highly variable 22 GHz H 2 O masers. We used Equation (6) for the prediction of 95 GHz maser emission and calculated the detection rates toward the sources observed by Yang et al. (2017). The GLM predicts that 242 ± 35 sources from the total sample of 928 sources observed by Yang et al. (2017) will be detected when using a threshold of p = 0.5, but the actual number of detected sources is 298. Thus, the GLM underestimates the actual detection rate by a factor of 1.2. This is opposite to the 22 GHz H 2 O masers, where the GLM tends to overestimate the detection rates.
Another relevant survey, by Kim et al. (2018), presents simultaneous observations of 22 GHz H 2 O and 44 and 95 GHz CH 3 OH masers toward the red MSX sources. A total of 187 sources have an associated Hi-GAL source. Among them, water masers were detected in 65% and cIM masers at 95 GHz were detected in 51%. The estimated detection rate from the GLM when using the threshold of p = 0.5 is 65% ± 7% for 22 GHz H 2 O and 19% ± 2% for 95 GHz CH 3 OH. Thus, the comparison between the GLM and observed detection rates shows the same behavior as that found for Yang et al. (2017): the GLM underestimates the class I CH 3 OH detection rates.
The strong temporal variability of water masers may be the reason for the detection rates being lower than those predicted by the GLM. This brings some uncertainty into the interpretation of maser observations: only maser detections can be considered a reliable indicator of maser activity. The opposite is not true; the absence of a maser detection does not guarantee the absence of a maser. If we consider only the maser detections of the Effelsberg observations, then the number of detections predicted by the GLM using p = 0.5 is 100 out of 131 total detections (76%), i.e., a significant fraction of the maser detections were predicted by the GLM.
As a cautionary note, we mention that due to the systematically higher noise level of the 95 GHz maser observations (∼1.1 Jy) used as the GLM training set in comparison to the 22 GHz water maser observations (∼0.2 Jy), the 95 GHz maser GLM is biased toward brighter masers. Thus, the number of faint 95 GHz masers is underestimated. A more sensitive survey of cIM masers in a large sample of sources is necessary to construct a reliable GLM that allows predicting the detection rates with the inclusion of faint masers.

The Evolution Trends of Different Maser Species
A "straw-man" model was initially suggested by Ellingsen et al. (2007), who showed an evolutionary sequence for masers in SFRs. In this model, the methanol masers (both class I and II) are associated with the earliest evolutionary stage, followed by water masers. OH masers appear only in the evolved sources with H II regions. In the paper of Billington et al. (2020), this "straw-man" model was examined using the luminosity-to-mass ratio (L/M) of the ATLASGAL clumps. The L/M and dust temperature are good indicators of the evolutionary state of star-forming clumps (Molinari et al. 2008; Urquhart et al. 2018; Billington et al. 2019). These physical parameters increase as clumps evolve: clumps become more luminous and hotter. Billington et al. (2020) constructed a box plot (see Figure 14 therein) of the luminosity-to-mass ratio of the ATLASGAL clumps associated with different maser species: H 2 O (HOPS; Walsh et al. 2011), OH (THOR and SPLASH; Beuther et al. 2016; Qiao et al. 2020), and CH 3 OH (MMB; e.g., Breen et al. 2015). We repeated this analysis but used an increased water maser sample, included cIM masers at 95 GHz, and used both ATLASGAL and Hi-GAL data. The sample of OH and CH 3 OH masers at 6.7 and 12 GHz is close to the sample used by Billington et al. (2020), complemented by additional detections stored in the maser database (Ladeyschikov et al. 2019) from 16 papers (e.g., Caswell et al. 1995; Caswell 1996; Beuther et al. 2002; Bartkiewicz et al. 2009; Caswell 2009; Cyganowski et al. 2009). In this section, we used the latest version of the ATLASGAL CSC (Urquhart et al. 2022).
We apply the following constraints for the maser associations: the beam size of the maser observations should be smaller than 70″, and the maximum distance between a maser and an ATLASGAL or Hi-GAL source is 30″. These criteria were applied to exclude false-positive associations that arise when a maser is detected with a large beam, which leads to less reliable associations. The beam-size criterion was not used for the 12 GHz CH 3 OH maser observations because no such observations exist; the minimum beam size for single-dish 12 GHz maser observations is 114″. We also excluded ∼2% of sources that do not have a defined L/M. The maser sample was obtained by crossmatching the maser positions (both interferometric and single dish) with the positions of infrared/submillimeter sources. Only the nearest infrared/submillimeter source was assigned to each maser in the case of multiple sources or crowded regions.
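The association criteria can be sketched in Python as a nearest-neighbor crossmatch. This is a minimal illustration using only the standard library, not the procedure actually used with the maser database: the dictionary layout, the function names, and the toy coordinates are all assumptions.

```python
import math
from typing import Dict, List, Optional, Tuple


def angular_sep_arcsec(lon1: float, lat1: float,
                       lon2: float, lat2: float) -> float:
    """Great-circle separation (haversine formula); inputs in degrees."""
    l1, b1, l2, b2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((b2 - b1) / 2.0) ** 2
         + math.cos(b1) * math.cos(b2) * math.sin((l2 - l1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def associate(maser: Dict, clumps: List[Dict],
              max_sep_arcsec: float = 30.0,
              max_beam_arcsec: float = 70.0) -> Optional[Tuple[str, float]]:
    """Return (name, separation) of the nearest clump within the limit,
    or None if the beam is too large or no clump is close enough."""
    if maser["beam_arcsec"] >= max_beam_arcsec:
        return None
    best = None
    for clump in clumps:
        sep = angular_sep_arcsec(maser["lon"], maser["lat"],
                                 clump["lon"], clump["lat"])
        if sep <= max_sep_arcsec and (best is None or sep < best[1]):
            best = (clump["name"], sep)
    return best


# Toy positions in degrees (hypothetical, for illustration only).
toy_clumps = [
    {"name": "A", "lon": 10.0, "lat": 0.005},   # 18 arcsec away
    {"name": "B", "lon": 10.0, "lat": 0.002},   # 7.2 arcsec away
    {"name": "C", "lon": 10.02, "lat": 0.0},    # 72 arcsec away
]
toy_maser = {"lon": 10.0, "lat": 0.0, "beam_arcsec": 40.0}
print(associate(toy_maser, toy_clumps))
```

Assigning only the nearest clump mirrors the rule stated in the text for crowded regions; a maser observed with a 114″ beam would be rejected by the beam criterion before any positional match is attempted.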
The sample sizes for the different maser species used in the analysis in this section are presented in Table 6. We used two samples: full and distance limited (2 < D < 5 kpc). More distant objects tend to be more massive and more likely to be associated with more evolved star formation. In other words, they are more likely to be associated with high-mass stars, which evolve much more quickly. Thus, there is the possibility of an evolutionary bias, and a distance-limited sample should be considered. We checked for the presence of bias by comparing the L/M for the distance-limited sample and the full sample of the ATLASGAL/Hi-GAL CSC sources. The Kolmogorov-Smirnov (K-S) test gives p = 0.002 for the ATLASGAL CSC and p = 8 × 10 −10 for the Hi-GAL CSC. Thus, only the ATLASGAL catalog may be considered distance independent (p > 0.0013) for the L/M.
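The comparison of two L/M distributions with a K-S test can be sketched in pure Python as follows. The sketch uses the standard asymptotic approximation for the two-sample p-value and treats 0.0013 as the significance level, as in the text; it is an illustration with hypothetical function names, not the implementation used for the analysis.

```python
import math
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Maximum distance between the two empirical distribution functions."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def ks_pvalue(d: float, n: int, m: int, terms: int = 100) -> float:
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:  # the series converges poorly here; p is essentially 1
        return 1.0
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def differs_significantly(p_value: float, level: float = 0.0013) -> bool:
    """True when the null hypothesis of a common distribution is rejected."""
    return p_value < level


# p-values quoted in the text for the full versus distance-limited samples.
print(differs_significantly(0.002))   # ATLASGAL: not rejected
print(differs_significantly(8e-10))   # Hi-GAL: rejected
```

With these values, only the Hi-GAL comparison rejects the hypothesis of a common L/M distribution, which matches the conclusion that the ATLASGAL sample shows no significant distance dependence.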
In Figure 7, we present the box plot of L/M for different maser species associated with Hi-GAL and ATLASGAL CSC.
Note. The values separated by a slash are the sizes of the full sample and the distance-limited (2 < D < 5 kpc) sample. In the case of a single value, only the full sample size is shown.
In Figure 8, the cumulative distribution function of the luminosity-to-mass ratio is displayed for different maser species, both from the ATLASGAL and Hi-GAL CSC. Although the different maser species overlap, we found that 22 GHz H 2 O masers appear earlier in the evolutionary sequence than class II CH 3 OH masers at 6.7 and 12 GHz and 1665 MHz OH masers, i.e., at lower values of the luminosity-to-mass ratio. The results of the K-S test for both the full and distance-limited samples are presented in  Figure 7 is the position of 95 GHz class I CH 3 OH masers before 22 GHz H 2 O masers in Hi-GAL data. This effect is also pronounced in different crossover points for the Hi-GAL and ATLASGAL catalogs between 22 GHz and 95 GHz masers in Figure 8. As was shown by the K-S test, there is no statistical difference in the L/M between 22 GHz H 2 O and 95 GHz CH 3 OH masers in both the ATLASGAL and Hi-GAL CSC. Thus, we cannot treat any difference between them as significant. The origin of these effects may be associated with the higher sensitivity of the Hi-GAL catalog, resulting in the detection of extended emission at 70-500 μm. cIM masers are usually found at some distance from the host protostar; thus they are more likely to be associated with Hi-GAL sources from extended emission, which have lower values of the L/M.
The difference with Billington et al. (2020) may be associated with the larger sample size and the difference in source selection: the data set used here covers SFRs in less evolved regions, where water masers are not yet bright and were not detected by HOPS. As a cautionary note, we mention that the results of the analysis in this section may still be affected by the sample size and selection. The detection of a significant number of masers in the future may change the appearance of Figure 8. This is especially relevant for 95 GHz cIM masers, as the currently used data set for 95 GHz methanol masers is biased toward bright masers because of sensitivity limitations (σ ∼ 1.1 Jy). Moreover, interferometric positions are not available for all known H2O masers, and thus some of the sources may be false associations, especially in crowded regions (see the discussion in Section 2.2). However, we do not expect a significant change in the relative position of H2O and 6.7 GHz CH3OH masers in the L/M diagram, as the currently used data set for 22 GHz H2O and 6.7 GHz CH3OH masers includes sensitive large-scale surveys (e.g., Breen et al. 2015; Svoboda et al. 2016), which cover a significant fraction of the masers in the Galaxy (see details in Section 4.1). The above analysis leads to the following conclusion: water and cIM masers are the earliest tracers of star formation activity among the maser species considered. Because of the significant difference in the L/M between the samples of sources associated with 22 GHz H2O and 6.7 GHz CH3OH masers, we conclude that 22 GHz water masers arise before 6.7 GHz methanol masers in the evolutionary sequence.
These conclusions agree with those presented in  but differ from those presented in Ellingsen et al. (2007), Breen et al. (2010a), and Jones et al. (2020), where water masers appear after the onset of class II methanol (cIIM) masers at 6.7 GHz in the evolution timeline. From the perspective of the data presented here, the position of 22 GHz water masers in the evolution timeline is close to that of cIM masers at 95 GHz. In contrast, the cIIM masers at 6.7 GHz appear in a later phase of the evolution. This is also in agreement with the recent study of Urquhart et al. (2022), which shows that outflow activity is the earliest indication that star formation has begun. Water and cIM masers may reside in shock waves of the outflows from the protostars (Kaufman & Neufeld 1996; Voronkov et al. 2006). Thus, we consider that collisionally pumped water and cIM masers should be the earliest evidence of ongoing outflow activity in a particular star-forming region, at a stage when radiatively pumped cIIM masers may not yet exist. However, cIM and water masers differ significantly in the timescale of their variability, which is discussed in the next section.

Maser Variability in the Large Source Sample
In Section 3.5, we studied the variability of a sample of 100 maser sources. We found that the variability index in this sample ranges from 1 to ∼100 based on the archival data. In 22% of the sources, a significant H2O flux change (by more than a factor of 5) was detected.
We further study the variability index for a larger sample of maser sources and maser transitions using the maser database (Ladeyschikov et al. 2019, 2022) as the archival data source. The following maser transitions with large source sample sizes were investigated: 22 GHz H2O, 95 GHz CH3OH (class I), 6.7 GHz CH3OH (class II), 12 GHz CH3OH (class II), and 1665 MHz OH. From the list of sources, we select only those having two or more observations in a specific maser transition and a maser detection in at least one epoch. For H2O and OH masers, only sources from the SFR category were included in the analysis. Because of the large amount of H2O maser data with a high noise level (σ > 0.5 Jy), we investigate two samples independently: H2O maser observations with σ < 0.5 Jy and the full observation sample. The sample sizes for the different maser species are presented in Table 8.
The maser database stores information about the maser observations of sources from different epochs and facilities. We only used single-dish observations to study the maser variability. The exception is the ATCA telescope, which was used intensively for water maser surveys (e.g., Titmarsh et al. 2014, 2016). We do not consider the association with the ATLASGAL or Hi-GAL source in the maser variability study. The only information used is the maser position and flux density. Only sources associated with SFRs are investigated in this study. For peak flux density estimation, only the brightest maser component is considered. We ensure that different single-dish positions do not differ by more than 10″ to prevent false variability caused by position shifts. We calculated the variability index for each maser group by dividing the maximum flux density by the minimum flux density over all observed epochs. The number of epochs is different for each source. Among 1577 water maser sources associated with SFRs, 988 (62%) have only one observed epoch, 589 (37%) have at least two epochs, 326 (20%) have at least three epochs, and 186 (12%) have more than three epochs. In total, we identify 250 sources with V_i > 5 and a maximum flux density larger than 10 Jy. Among them, 74 sources show strong variability (V_i > 50), and 44 sources have V_i > 100.
The calculation of the variability index is similar to that described in Section 3.5. A variability index calculated from data obtained at different facilities may be affected by differences in calibration, beam size, pointing, and velocity resolution. This leads to an overestimation of the variability index, particularly for sources with a complex spatial structure, because a larger beam may cover more maser emission. The flux density may also change slightly because of different telescope pointings; we used an angular distance of 10″ as the maximum separation between the nearest observations. All of these effects influence the maser variability index. For these reasons, we do not consider variability index values below 1.5 to be significant.
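The definition of the variability index and the thresholds used in this section can be illustrated with a minimal sketch. It assumes that only epochs with a positive (detected) peak flux density enter the ratio, which the text does not state explicitly; the function names and the "moderate" label for the intermediate range are hypothetical.

```python
def variability_index(peak_fluxes_jy):
    """Ratio of the maximum to the minimum peak flux density over all epochs.

    Assumption: epochs without a detection (flux <= 0) are ignored.
    Returns None when fewer than two usable epochs remain.
    """
    detected = [f for f in peak_fluxes_jy if f > 0]
    if len(detected) < 2:
        return None
    return max(detected) / min(detected)

def classify(v_i):
    """Apply the thresholds quoted in the text (1.5 and 5)."""
    if v_i is None:
        return "undefined"
    if v_i < 1.5:
        return "insignificant"
    if v_i > 5:
        return "substantial"
    return "moderate"  # hypothetical label for 1.5 <= V_i <= 5

# Toy peak flux densities in Jy for two sources (illustrative values only)
for fluxes in ([12.0, 3.0, 30.0], [4.0, 5.0]):
    v_i = variability_index(fluxes)
    print(fluxes, v_i, classify(v_i))
```

Because the index is a ratio of extremes, it can only grow as more epochs are added, which is one reason why sources with many epochs or inhomogeneous data need the conservative lower threshold.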
In Figure 9, we present the cumulative distribution function of the maser variability index for the maser transitions considered. From the inspection of this figure, we conclude that H2O and OH masers show the largest variability index among the maser transitions considered, and the 95 GHz CH3OH maser has the lowest overall variability index. As defined above, the variability index is insignificant (i.e., less than 1.5) for 69% of the 95 GHz CH3OH masers, 57% of the 12 GHz CH3OH masers, 40% of the 6.7 GHz CH3OH masers, 19% of the 22 GHz H2O masers (observed with σ < 0.5 Jy), and 20% of the 1665 MHz OH masers. On the other hand, a substantial variability index (i.e., larger than 5) was detected in 44% of the 22 GHz H2O masers, 20% of the 6.7 GHz CH3OH masers, 11% of the 12 GHz CH3OH masers, and 6% of the 95 GHz CH3OH masers.
We conclude that all considered maser transitions are variable, while different maser transitions display a different degree of variation. The 95 GHz CH3OH masers are the most stable of the maser transitions considered. This is consistent with the multiepoch cIM maser observations by Yang et al. (2020), which show no evidence of variability in the cIM masers over a 7 yr period. Water and cIM masers reside in the shock waves of a protostar's outflows, where the density and temperature are high enough. Why do maser species tracing shock-wave propagation show such large differences in variability? The physical conditions required for inverting the 22 GHz water maser transition include high densities (∼10^8-10^10 cm^−3) and kinetic temperatures (T_K ∼ 200-2000 K) (Yates et al. 1997). For cIM masers, however, the conditions are less severe: densities of ∼10^5-10^8 cm^−3 and kinetic temperatures of T_K ∼ 40-400 K. Shock waves may propagate more steadily in regions of moderate density and temperature. Thus, cIM masers appear more stable over time. In contrast, the 22 GHz water masers are pumped only in very dense and hot regions. Such conditions may arise only in limited regions of shock-wave propagation. Thus, we observe significant temporal variability of water masers.
The saturation of maser emission is another important issue that may influence the variability timescale. As described in Leurini et al. (2016), saturated masers are expected to show little or no variation, because saturated masers undergo linear amplification, while unsaturated masers amplify exponentially. Leurini et al. (2016) suggested a reasonable guess of ∼15 yr for the timescale of variation of cIM masers, assuming they are saturated. The statistics presented here confirm that cIM masers show little variation in a large sample of sources. Thus, saturation may be assumed for 95 GHz methanol masers. This is the opposite of 22 GHz water masers, where strong variability suggests unsaturated maser emission.
Another possible reason for the significant 22 GHz water maser variability is the higher line intensities in comparison to other maser species. As the variability index has a positive correlation with the peak intensity (r = 0.66 for H2O maser data), more intense masers have a higher variability index. To check this, we limit the considered maser samples to those with a maser peak flux density of less than 10 Jy. This filter was applied to all considered maser species. We found that the results are qualitatively the same as for the total sample. Moreover, Billington et al. (2020) have shown that water and 6.7 GHz methanol masers have similar luminosities in a distance-limited sample (see Figure 9 therein). Thus, we conclude that the difference in variability index between water and 6.7 GHz methanol masers cannot be explained solely by the higher relative intensities of water masers compared to those of 6.7 GHz methanol masers.

Correlation between the Variability Index and Physical Parameters
We further investigate the possible correlation between the variability index of the H2O masers and the physical parameters of the associated ATLASGAL and Hi-GAL sources. We analyze the Spearman's correlation coefficient between the maser variability index and all available physical parameters of the ATLASGAL and Hi-GAL sources, including the probability of maser detection described in Section 2.4. For the ATLASGAL parameters, the maximum correlation is found between the GLM maser detection probability and the maser variability index: r = 0.33 [0.22, 0.45], p = 1.979 × 10^−7. However, this correlation may be explained by the higher variability index of masers with a larger peak flux density. After limiting the maser sample to the sources that have a moderate peak flux density at 22 GHz (<10 Jy), the correlation between the variability index and the GLM maser detection probability becomes r = 0.16 [−0.01, 0.31], p = 0.06, i.e., not significant at the 3σ level. The same decrease in the correlation is found for the other ATLASGAL physical parameters. The situation does not change significantly when we use the Herschel Hi-GAL data. The maximum correlation between the maser variability index and the Hi-GAL source parameters is found for the GLM maser detection probability: r = 0.33 [0.22, 0.45], p = 1.98 × 10^−7. All other parameters have values of r < 0.2. However, after limiting the sample to nonbright masers (F < 10 Jy) only, the correlation coefficient decreases and is no longer statistically significant at the 3σ level: r = 0.24 [0.07, 0.41], p = 0.008. The same holds for the other Hi-GAL physical parameters. Thus, we conclude that there is no clear correlation between the maser variability index and the source physical parameters from the ATLASGAL and Hi-GAL catalogs.
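For readers unfamiliar with the statistic, the rank correlation used above can be sketched as follows. This is a generic illustration rather than the analysis code of this work: the Fisher z interval is only one common approximation for a bracketed confidence interval and is an assumption here, since the text does not state how the intervals were derived, and all data values are toy numbers.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def fisher_interval(r, n, z_crit=1.96):
    """Approximate 95% interval via the Fisher z transform (an assumption)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Toy data: variability index vs. detection probability (illustrative only)
v_index = [1.0, 2.0, 3.0, 4.0, 5.0]
glm_prob = [0.2, 0.1, 0.4, 0.3, 0.5]
print(round(spearman(v_index, glm_prob), 3))
```

Restricting the input lists to sources below a flux cut before calling `spearman` reproduces the kind of check described in the text, where the correlation weakens once bright masers are excluded.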

Separation between Water and Methanol Masers and the ATLASGAL Clumps
In a previous paper (Ladeyschikov et al. 2020), it was shown that there is a physical separation between the ATLASGAL clumps and cIM and cIIM masers, with cIM masers being located at larger distances from the ATLASGAL clumps than the cIIM masers. This is in accordance with the current understanding of the methanol maser origin. While the cIIM masers are radiatively pumped close to the high-mass young stellar objects (YSOs), the cIM masers are collisionally pumped by shocks and can be located farther from an embedded protostar. We complement this analysis by studying the offsets between the ATLASGAL clumps and the H2O maser positions from interferometric observations.
For the analysis, we used the updated ATLASGAL CSC (Urquhart et al. 2022) and a distance-limited sample (2 < D < 6 kpc). We used a matching radius of 30″ to minimize the false-association rate. The samples of H2O, cIM, and cIIM masers contain 194, 142, and 339 sources, respectively. The results are presented in Figure 10.
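The conversion from an angular offset to a projected physical offset follows from the small-angle relation: 1″ at a distance of 1 pc corresponds to 1 au. The sketch below shows how the 30″ matching radius maps onto physical scales across the adopted distance range; the function names are hypothetical.

```python
AU_PER_PC = 206264.806  # astronomical units in one parsec

def projected_offset_au(separation_arcsec, distance_kpc):
    """Projected separation in au (small-angle approximation)."""
    return separation_arcsec * distance_kpc * 1000.0

def projected_offset_pc(separation_arcsec, distance_kpc):
    """Projected separation in pc."""
    return projected_offset_au(separation_arcsec, distance_kpc) / AU_PER_PC

# The 30 arcsec matching radius at the limits of the 2-6 kpc sample
for d_kpc in (2.0, 6.0):
    print(d_kpc, projected_offset_au(30.0, d_kpc),
          round(projected_offset_pc(30.0, d_kpc), 3))
```

The same angular radius therefore spans a physical scale three times larger at 6 kpc than at 2 kpc, which is why the comparison is made in physical rather than angular offsets within a distance-limited sample.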
From the inspection of Figure 10, we conclude that there is no significant difference between the offsets of the H2O and cIIM masers: both are located closer to the host clumps than the cIM masers. The K-S test between the samples of H2O and cIIM masers results in a p-value of 0.29. On the other hand, the K-S test reveals significantly different distributions between cIM and H2O masers (p-value = 0.0013), as well as between cIM and cIIM masers (p-value = 0.0013).
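A two-sample K-S comparison of offset distributions can be sketched with the standard library alone. This is a generic illustration using the asymptotic Kolmogorov series for the p-value, which is only approximate for small samples; it is not the code used for Figure 10, and the offsets are toy values.

```python
import math
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Maximum distance between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

def ks_pvalue(d, n, m, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov series."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        return 1.0  # the series is numerically indistinguishable from 1 here
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

# Toy projected offsets in pc (illustrative values only, not survey data)
offsets_h2o = [0.05, 0.08, 0.10, 0.12, 0.20, 0.25]
offsets_cim = [0.15, 0.30, 0.35, 0.40, 0.55, 0.70]

d = ks_statistic(offsets_h2o, offsets_cim)
p = ks_pvalue(d, len(offsets_h2o), len(offsets_cim))
print(f"D = {d:.2f}, p = {p:.3f}")
```

A small p-value indicates that the two offset samples are unlikely to share a parent distribution, while a large one (such as the 0.29 quoted for H2O versus cIIM masers) gives no grounds to distinguish them.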
cIIM masers at 6.7 GHz are pumped by infrared radiation and reside in the circumstellar disks and inner parts of the outflows of high-mass YSOs, which have sizes of less than 1000 au (Sanna et al. 2015, 2017). The same scale is found for water masers (Moscadelli et al. 2019, 2020). According to Moscadelli et al. (2020), 84% of the 22 GHz water masers within the sample of 36 sources are found within 1000 au of the radio continuum peak, which is a good proxy for the YSO position (Moscadelli et al. 2016). On the other hand, the characteristic scale of the cIM maser distribution is about 50 times larger (e.g., Voronkov et al. 2014). As both cIM and H2O masers are collisionally pumped, we conclude that H2O masers trace the inner YSO outflow scales, while the cIM masers arise in the outer outflow regions.

Conclusions
This paper presents a statistical study of the infrared Hi-GAL and submillimeter ATLASGAL sources associated with water maser emission. From the generalized linear model analysis, we estimate that in the Galaxy there are 2392 ± 339 sources with a high GLM probability of maser detection (p > 0.5), but only half have been observed at 22 GHz so far. We have studied the observational properties of water masers by searching for 22 GHz emission toward a sample of 359 submillimeter sources. We detected 22 GHz H2O maser emission toward 124 sources (∼34%). Masers were detected in 13% of the previously nondetected sources. One-third of the previously known water masers were not detected in the survey, suggesting temporal variability. We have found 10 sources with a significant increase (by more than a factor of 5) in the H2O maser flux, while 11 sources reveal a significant decrease in H2O maser flux in comparison to the archival data.
1. Analysis of physical parameters of maser-associated host clumps reveals that 22 GHz water masers appear in the ATLASGAL clumps with lower luminosity-to-mass ratios (L/M) in comparison to 6.7 GHz methanol masers. This is confirmed by a K-S test on the L/M. That implies that water masers may appear earlier than cIIM masers in the evolutionary sequence of star formation. However, we found no significant difference between the L/M of 22 GHz water and 95 GHz cIM masers. Thus, these masers may appear in the same phase of the star formation process.
2. Water masers reveal the highest degree of variability among the masers in SFRs. The strong temporal variability may be associated with unsaturated H2O maser emission, implying exponential amplification. That is opposite to 95 GHz masers, which are much more stable and suggest saturated emission.
3. We found no significant difference between the physical offsets between the ATLASGAL clumps and H 2 O and cIIM masers. Both maser types are located close to a YSO, while cIM masers are found at larger physical offsets.
The work of D.A.L. on the generalized model analysis in Section 2, testing the GLM predictions from observations (Sections 3.1-3.3, 3.5), and the data analysis in Sections 4.2, 4.3, and 4.5, was supported by Russian Science Foundation grant 20-72-00137. The work of D.A.L. and A.M.S. on the maser velocity range analysis (Section 3.4) and the source variability analysis from the archival data (Section 4.4) was supported by Russian Science Foundation grant 18-12-00193. The work of D.A.L. on the correlation between maser species (Section 4.1) was supported by the Ministry of Education and Science of Russia (the basic part of the State assignment, RK no. FEUZ-2020-0030). This work is based on observations with the 100 m radio telescope of the MPIfR (Max-Planck-Institut für Radioastronomie) at Effelsberg. We acknowledge the Effelsberg 100 m staff for their assistance with our observations. The ATLASGAL project is a collaboration between the Max-Planck-Gesellschaft, the European Southern Observatory

where M equals 1 in the case of a maser detection and 0 in the case of a nondetection, and R[0,1] is a random number between 0 and 1. The column with the larger value of N has a higher impact on the resulting value depending on the maser detection, while the column with N = 0 is only a random number. We ran the GLM stepwise refinement on the parameter set containing only the test columns. For the parameters with N = 0, 0.02, 0.03, and 0.05, the resulting p-value is larger than 0.1, i.e., not significant. The parameters with N = 0.1, 0.2, and 0.3 have p-values of less than 0.001; thus, they can be considered significant. The resulting model consists of three parameters, T_0.1, T_0.2, and T_0.3, and has an accuracy of 74.9%. For comparison, the model with only one parameter, T_0.3, has an accuracy of 65.8%, and the model with only one parameter, T_0.2, has an accuracy of 63.46%.
Thus, we conclude that the combination of several parameters that have a significant influence on maser presence/absence can increase the model accuracy compared to the model with only the most significant parameter.

Appendix B Sources Nondetected at 22 GHz
In Table 9, we present the list of sources observed by our targeted program for which no emission at 22 GHz was detected. We used a 3σ detection threshold.
Note. Coordinates of the sources are taken from the corresponding ATLASGAL source. p_AGAL and p_HiGAL are the same as in Table 2. Only the top 10 rows are shown; other information is available in the machine-readable table.
(This table is available in its entirety in machine-readable form.)