Introduction

Evaporation is one of the most important water balance components in natural resources studies across spatial and temporal scales (Chen et al. 2019; Hooshmand et al. 2013). A large proportion of rainfall is lost to evaporation (Althoff et al. 2019). Evaporation is usually lower in cold regions than in other regions (Wu et al. 2016), but it remains a significant factor in analyzing water requirements there, because lakes, wetlands, and other water bodies are often abundant in cold regions (Woo et al. 2008). Due to meteorological changes, evaporation from water bodies in cold regions has increased significantly, so studies on determining evaporation in cold regions are necessary (Woolway et al. 2020). The evaporation pan is the best-known method for determining evaporation from a water surface (Roderick et al. 2004), and evaporation pans are standard equipment at stations that measure meteorological factors (Lu et al. 2018). However, because of the considerable costs of constructing, maintaining, and repairing meteorological stations, pan evaporation data are unavailable in many areas (Wu et al. 2020). Experimental mathematical equations that describe the evaporation process have been developed to overcome this problem (Qasem et al. 2019). Although researchers have in recent years considered newer models such as Support Vector Machine (Chen et al. 2019), Neural Networks (Majhi and Naidu 2021), and KNN (Al-Mukhtar 2021), the lack of an explicit mathematical formula for these models has limited their use in engineering practice. Features such as the possibility of revision for specific meteorological conditions, an explicit mathematical form, and reliance on readily available factors make the experimental equations practical.
Most studies on evaporation determination have concentrated on different aspects (McMahon et al. 2013), such as actual evaporation from non-saturated surfaces (Anayah and Kaluarachchi 2014), potential evaporation (Kohler and Parmele 1967), lake and storage evaporation (McJannet et al. 2008), reference evapotranspiration (Hargreaves and Allen 2003), and evaporation from shallow lakes and ponds (Valiantzas 2006).

The equations presented for the study of surface evaporation are varied. However, the data they require are generally unavailable and must be measured for each study, which severely limits their application. Consequently, the efficiency of these equations cannot be studied across different geographical and meteorological conditions. The evaporation pan, in contrast, is used at many meteorological stations under different meteorological and geographical conditions, so its data are less limited than those for other aspects of evaporation. Nevertheless, a review of the literature shows that studies on determining evaporation from the pan are much less diverse than studies on the other aspects. Therefore, more studies in this field are needed.

In addition, most equations developed to determine evaporation from the pan require radiation as one of the main input factors. Measured radiation data are more limited than data for other factors such as temperature, wind velocity, and relative humidity, so equations that rely on radiation have practical limitations in many regions. Equations that use radiation as a significant factor to determine evaporation from the pan include PenPan (Rotstayn et al. 2006), Griffiths (Griffiths 1966), Mehta (Christiansen 1966), Stephen-Stewart (Stephen and Stewart 1963), Christiansen (Christiansen 1960), Prescott (Prescott 1940), and Rohwer (Rohwer 1931). Some equations have been proposed to determine evaporation from the pan using a limited set of meteorological factors. However, there has been less research in this area than in similar cases (Wang et al. 2023).

Temperatures measured in cold regions are much lower than in other regions (Salarijazi et al. 2023). Other meteorological factors, such as relative humidity and wind velocity, also have their own characteristics in these regions. Because of these differences, it is essential to study the efficiency of the equations for determining evaporation from the pan in cold regions. Moreover, because evaporation is influential in many fields, selecting the more reliable equations for determining evaporation in cold regions is necessary. A review of studies that apply cold region data shows that several conditions needed to increase the reliability of pan evaporation equations have not been addressed simultaneously. These conditions include (1) using data only from cold regions, (2) developing straightforward equations, and (3) using a limited number of meteorological factors that vary over a wide range. If these conditions are not met simultaneously, the extracted equations have low reliability and are also unsuitable for practical use.

This study aims to investigate and revise pan evaporation equations to improve their reliability in cold regions. Experimental equations matter because researchers and engineers use them in a wide variety of studies, so they must be reliable as well as practical. In this research, the following conditions were considered in selecting the study region and the experimental equations: (A) the datasets belong to cold regions, and their number is the maximum available with acceptable quality; (B) the meteorological factors in the study areas vary over a wide range; (C) default equations are available for determining evaporation from the pan; (D) the experimental equations studied are diverse; and (E) the mathematical formulas of the equations are clear and straightforward and can therefore be revised easily by conventional methods. To the authors' knowledge, no previous study in cold regions has considered all of these conditions together.

Materials and methods

Emberger classification

Emberger (1930) classified climate based on evaporation, temperature, and precipitation as the critical meteorological factors. Precipitation (P, in millimeters) is considered on an annual time scale (Derouiche et al. 2022). For the temperature factor, M denotes the mean of the maximum temperatures of the hottest month and m denotes the mean of the minimum temperatures of the coldest month of the year (Vessella et al. 2022). Note that M and m are the strict thermal limits for vegetation growth. The temperature factor in the Emberger classification is taken as (M/2 + m/2) (Canturk and Kulaç 2021), and the temperature range (M − m) is the term that represents the evaporation factor. The following equation is used for the classification (Caloiero et al. 2016).

$$ Q = \frac{1000\,P}{\left( \frac{M + m}{2} \right)\left( M - m \right)} = \frac{2000\,P}{M^{2} - m^{2}} $$
(1)
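
As a quick numerical check of Eq. (1), the following sketch evaluates both forms of the quotient. It assumes that M and m are supplied in kelvin, which is a common convention for this quotient but is not stated in the text. The function name `emberger_q` and the sample station values are illustrative only.

```python
def emberger_q(p_mm: float, m_max_k: float, m_min_k: float) -> float:
    """Emberger quotient Q from Eq. (1).

    p_mm    : annual precipitation P (mm)
    m_max_k : M, mean of maximum temperatures of the hottest month
              (assumed to be in kelvin; the text does not state the unit)
    m_min_k : m, mean of minimum temperatures of the coldest month
              (assumed to be in kelvin)
    """
    if m_max_k <= m_min_k:
        raise ValueError("M must be greater than m")
    mean_temp = (m_max_k + m_min_k) / 2.0   # temperature factor (M + m) / 2
    temp_range = m_max_k - m_min_k          # temperature range (M - m)
    return 1000.0 * p_mm / (mean_temp * temp_range)


# Hypothetical station: P = 300 mm, M = 305 K, m = 270 K
q = emberger_q(300.0, 305.0, 270.0)
q_alt = 2000.0 * 300.0 / (305.0 ** 2 - 270.0 ** 2)  # second form of Eq. (1)
print(round(q, 2), round(q_alt, 2))
```

Both forms return the same value because (M + m)(M − m) = M² − m².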

The Emberger climatogram, presented in Fig. 1, uses Q and m as the y- and x-axes, respectively (Dereure et al. 2009).

Fig. 1 The Emberger climatogram (Dereure et al. 2009)

Study region

Iran is located in the Middle East, and a range of different meteorological conditions can be identified within it (Roshan et al. 2017). In this study, the Emberger classification was used to identify cold region datasets. Initially, 40 datasets were examined using the Emberger classification system, and 23 of them were finally selected as cold region datasets. These datasets cover a large part of Iran, and their locations are shown in Fig. 2. They lie within longitudes 45° 3′ 19″ to 60° 54′ 11″ and latitudes 40° 33′ 29″ to 38° 18′ 17″ (Modabber-Azizi et al. 2023).

Fig. 2 Location of cold region datasets

Studied datasets

The datasets used in this study were measured at meteorological stations belonging to the Meteorological Organization of Iran. Their information and locations are presented in Fig. 2 and Table 1. Adequate quantity and quality of data were the most critical criteria in selecting these datasets. The data pre-processing steps, illustrated in detail in Fig. 3, comprise the preparation and arrangement of data, the examination of missing data, the identification of marginal data, the Emberger classification, and finally the selection of proper datasets. The evaporation data in this study were recorded using class A pans, which are used at meteorological stations worldwide, including in Iran. This evaporation pan has a diameter of 121 cm and a depth of 25 cm and is installed slightly above the ground (Shammout et al. 2018).

Table 1 Information about studied datasets

Fig. 3 Pre-processing steps for cold regions datasets

Box plots of the meteorological factors in the studied cold regions, which summarize the statistical features of the datasets, are presented in Fig. 4. The factors vary over a wide range, which increases the reliability of the analysis because the results will be valid across that range. Fig. 4 also shows that meteorological factors such as temperature, relative humidity, and evaporation follow a regular within-the-year pattern.

Fig. 4 Box plot of meteorological factors in cold regions

Experimental equation to determine evaporation from the pan

Evaporation from the pan is measured only at well-equipped stations, whereas meteorological factors such as temperature, relative humidity, and wind velocity are measured in many regions. As a result, pan evaporation data are often missing or limited (Gavili et al. 2018). In addition, the measured evaporation data in many datasets do not cover a long period and, in many cases, contain a significant amount of missing data (Abtew et al. 2011). Some research has been done to develop experimental equations for determining evaporation from the pan (Terzi and Keskin 2010). These experimental equations attempt to determine the amount of evaporation with a mathematical formula based on a few meteorological factors (Irmak and Haman 2003). One set of equations determines evaporation from the pan using the radiation factor (Xu and Singh 2000). However, radiation data are much less available than wind velocity, temperature, relative humidity, and evaporation data (Fulton et al. 2005). Given this practical limitation, this research does not study experimental equations that use radiation as an input factor. Because wind velocity, temperature, and relative humidity are more widely available than other meteorological factors, this study investigates experimental equations that rely on these three factors (Salarijazi et al. 2023).

Trabert (1896) experimental equation

The experimental equation presented by Trabert (1896) is one of the best-known and most widely used equations for determining evaporation from the pan (Lu et al. 2018). Its meteorological inputs are temperature, relative humidity, and wind velocity (Guan et al. 2020). This equation assumes that evaporation is proportional to the square root of wind velocity, which can be a limiting assumption (Mohammadi et al. 2021). Like the Papadakis (1961) equation, the Trabert (1896) equation uses only one parameter to determine evaporation.

$$ e_{p} = \gamma_{1} \sqrt{v} \left( e_{s} - e_{d} \right) $$
(2)

In the Trabert (1896) equation, \(e_{p}\) is the determined evaporation from the pan (mm/day), \(v\) is the wind velocity at a height of 2 m above the ground (m/s), \((e_{s} - e_{d})\) is the vapor pressure deficit (kPa), and \(\gamma_{1} = 0.3075\) is the default parameter.
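
A minimal sketch of Eq. (2) is given below, using the units listed above. The function name and the input values are hypothetical and serve only to show how the equation is evaluated.

```python
import math

GAMMA_1_DEFAULT = 0.3075  # default parameter of Eq. (2), as given in the text


def trabert_pan_evaporation(wind_ms: float, vpd_kpa: float,
                            gamma_1: float = GAMMA_1_DEFAULT) -> float:
    """Eq. (2): e_p = gamma_1 * sqrt(v) * (e_s - e_d), in mm/day.

    wind_ms : wind velocity v at 2 m above the ground (m/s)
    vpd_kpa : vapor pressure deficit e_s - e_d (kPa)
    """
    if wind_ms < 0 or vpd_kpa < 0:
        raise ValueError("wind velocity and vapor pressure deficit must be non-negative")
    return gamma_1 * math.sqrt(wind_ms) * vpd_kpa


# Hypothetical inputs: v = 4 m/s, vapor pressure deficit = 2 kPa
print(trabert_pan_evaporation(4.0, 2.0))
# With calm air (v = 0) the equation returns zero evaporation
print(trabert_pan_evaporation(0.0, 2.0))
```

The second call illustrates one consequence of the square-root wind term: the equation yields no evaporation when the wind velocity is zero, whatever the vapor pressure deficit.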

Kohler (1954) experimental equation

The Lake Hefner studies are among the best-known early research efforts on determining water evaporation. Using the dataset measured in those studies, Kohler (1954) presented an experimental equation for determining evaporation from the pan. The fundamental difference between the Kohler (1954) equation and the Antal (1973) and Papadakis (1961) equations is that Kohler (1954) also uses wind velocity to determine evaporation (Song et al. 2020).

$$ e_{P} = \left( e_{s} - e_{d} \right)\left( \gamma_{2} + \gamma_{3} v_{P} \right) $$
(3)

In the Kohler (1954) experimental equation, \(e_{P}\) is the determined evaporation from the pan (in/day), \((e_{s} - e_{d})\) is the vapor pressure deficit (in Hg), \(v_{P}\) is the wind velocity at the standard installation height of the evaporation pan above the ground (miles/day), and \(\gamma_{2} = 0.42\) and \(\gamma_{3} = 0.004\) are the default parameters of the equation.
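
Because Eq. (3) is stated in imperial units, a unit-conversion wrapper is convenient when the inputs are recorded in SI units. The sketch below is one possible arrangement, not the procedure used in this study. The function names are hypothetical, and the wrapper assumes that the supplied wind velocity already refers to the pan installation height.

```python
MM_PER_INCH = 25.4
KPA_PER_INHG = 3.386389                     # 1 in Hg expressed in kPa
MILES_PER_DAY_PER_MS = 86400.0 / 1609.344   # 1 m/s expressed in miles/day


def kohler_1954(vpd_inhg: float, wind_miles_per_day: float,
                gamma_2: float = 0.42, gamma_3: float = 0.004) -> float:
    """Eq. (3): pan evaporation in in/day from the vapor pressure deficit
    (in Hg) and the wind velocity at pan height (miles/day)."""
    return vpd_inhg * (gamma_2 + gamma_3 * wind_miles_per_day)


def kohler_1954_si(vpd_kpa: float, wind_ms: float,
                   gamma_2: float = 0.42, gamma_3: float = 0.004) -> float:
    """Same equation with SI inputs (kPa, m/s) and output in mm/day.
    No wind height adjustment is applied (assumption of this sketch)."""
    evaporation_in = kohler_1954(vpd_kpa / KPA_PER_INHG,
                                 wind_ms * MILES_PER_DAY_PER_MS,
                                 gamma_2, gamma_3)
    return evaporation_in * MM_PER_INCH


# Hypothetical inputs: deficit of 0.5 in Hg and a wind run of 50 miles/day
print(kohler_1954(0.5, 50.0))
# The same conditions expressed in SI units give the result in mm/day
print(kohler_1954_si(0.5 * KPA_PER_INHG, 50.0 / MILES_PER_DAY_PER_MS))
```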

Kohler–Nordenson–Fox (1955) experimental equation

Further studies by Kohler et al. (1955) revealed that a power relationship between evaporation from the pan and the vapor pressure deficit is more efficient than the linear relationship (Izady et al. 2020). Based on this, they presented an equation whose mathematical formula is a revision of the Kohler (1954) equation. The new equation, Kohler–Nordenson–Fox (1955), has been considered by researchers in various studies (Althoff et al. 2020).

$$ e_{P} = \left( e_{s} - e_{d} \right)^{\gamma_{4}} \left( \gamma_{5} + \gamma_{6} v_{P} \right) $$
(4)

In the Kohler–Nordenson–Fox (1955) equation, \(e_{P}\) is the determined evaporation from the pan (in/day), \((e_{s} - e_{d})\) is the vapor pressure deficit (in Hg), and \(v_{P}\) is the wind velocity at the standard installation height of the evaporation pan above the ground (miles/day). In addition, \(\gamma_{4} = 0.88\), \(\gamma_{5} = 0.37\), and \(\gamma_{6} = 0.0041\) are the default parameters of the equation.
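
The effect of the power term in Eq. (4) can be seen with a short sketch. The function name and the input values are hypothetical. The example doubles the vapor pressure deficit at a fixed wind velocity and compares the two results.

```python
def kohler_nordenson_fox_1955(vpd_inhg: float, wind_miles_per_day: float,
                              gamma_4: float = 0.88, gamma_5: float = 0.37,
                              gamma_6: float = 0.0041) -> float:
    """Eq. (4): pan evaporation in in/day from the vapor pressure deficit
    (in Hg) and the wind velocity at pan height (miles/day)."""
    if vpd_inhg < 0:
        raise ValueError("vapor pressure deficit must be non-negative")
    return vpd_inhg ** gamma_4 * (gamma_5 + gamma_6 * wind_miles_per_day)


# Hypothetical inputs: wind run of 100 miles/day, deficit of 1 and 2 in Hg
base = kohler_nordenson_fox_1955(1.0, 100.0)
doubled = kohler_nordenson_fox_1955(2.0, 100.0)
print(base, doubled / base)
```

With the default exponent of 0.88, doubling the vapor pressure deficit multiplies the determined evaporation by 2^0.88, which is less than 2. In the linear Kohler (1954) form the same change would double the result.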

Papadakis (1961) experimental equation

This equation was proposed by Papadakis (1961). Like the Antal (1973) equation, it uses temperature and relative humidity to determine evaporation. Unlike that equation, it treats the relationship between these factors as linear (Basnyat 1987). The mathematical formula of the Papadakis (1961) equation is much simpler than those of other similar equations (Bozorgi et al. 2020).

$$ e_{p} = \gamma_{7} \left( e_{s} - e_{d} \right) $$
(5)

In the Papadakis (1961) equation, \(e_{p}\) is the determined evaporation from the pan (mm/day), \((e_{s} - e_{d})\) is the vapor pressure deficit (mbar), and \(\gamma_{7} = 0.5626\) is the default parameter.

Antal (1973) experimental equation

The equation proposed by Antal (1973) uses temperature and relative humidity as meteorological factors to determine evaporation from the pan. In this equation, vapor pressure deficit is a secondary factor nonlinearly related to evaporation (Antal 1973; Basnyat 1987). The relationship between temperature and evaporation is also considered nonlinear. However, this nonlinear relationship is indirectly included in the vapor pressure deficit factor (Agarwal et al. 2020).

$$ e_{{\text{P}}} = \gamma _{8} \left( {e_{s} - e_{d} } \right)^{{\gamma _{9} }} \left( {\gamma _{{10}} + \frac{t}{{\gamma _{{11}} }}} \right)^{{\gamma _{{12}} }} $$
(6)

In the above equation, \(e_{P}\) is the determined evaporation from the pan (mm/day), \((e_{s} - e_{d})\) is the vapor pressure deficit (mbar Hg), and \(t\) is the average daily temperature (°C). Also, \(\gamma_{8} = 1.1\), \(\gamma_{9} = 0.7\), \(\gamma_{10} = 1\), \(\gamma_{11} = 273\), and \(\gamma_{12} = 2.4\) are the default parameters of the Antal (1973) equation.
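
The structure of Eq. (6) can be explored with a small sketch. The function name and the sample inputs are hypothetical. The last call sets \(\gamma_{9} = 1\) and \(\gamma_{12} = 0\) to show that the linear form of Eq. (5) is recovered as a special case.

```python
def antal_1973(vpd: float, t_c: float,
               gamma_8: float = 1.1, gamma_9: float = 0.7,
               gamma_10: float = 1.0, gamma_11: float = 273.0,
               gamma_12: float = 2.4) -> float:
    """Eq. (6): pan evaporation in mm/day from the vapor pressure deficit
    (in the pressure unit given in the text) and the average daily
    temperature t (degrees Celsius)."""
    if vpd < 0:
        raise ValueError("vapor pressure deficit must be non-negative")
    return gamma_8 * vpd ** gamma_9 * (gamma_10 + t_c / gamma_11) ** gamma_12


# Hypothetical inputs: deficit of 10 units at 0 and 20 degrees Celsius
print(antal_1973(10.0, 0.0), antal_1973(10.0, 20.0))

# Special case: gamma_9 = 1 and gamma_12 = 0 reduce Eq. (6) to the linear
# single-parameter form of Eq. (5); 0.5626 is the Papadakis default
print(antal_1973(10.0, 20.0, gamma_8=0.5626, gamma_9=1.0, gamma_12=0.0))
```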

Linacre (1977) experimental equation

The equation presented by Linacre (1977) differs significantly from similar equations (Althoff et al. 2020). The first difference is that, among the meteorological factors, this equation uses only temperature. The second difference is that, unlike other equations for determining evaporation from the pan, it also uses the elevation and latitude of the location (Linacre 1977).

$$ e_{P} = \frac{\gamma_{13} \left( \frac{t + \gamma_{14} H}{\left( \gamma_{15} - L \right) + \gamma_{16} \left( t - t_{d} \right)} \right)}{\gamma_{17} - t} $$
(7)

In the Linacre (1977) equation, \(e_{P}\) is the determined evaporation from the pan (mm/day), \(t\) is the average daily air temperature (°C), \(H\) is the elevation of the station above sea level (m), \(L\) is the latitude of the location (degrees), and \(t_{d}\) is the dew point temperature (°C). Also, \(\gamma_{13} = 700\), \(\gamma_{14} = 0.006\), \(\gamma_{15} = 100\), \(\gamma_{16} = 15\), and \(\gamma_{17} = 80\) are the default values of the Linacre (1977) equation parameters.

Linacre (1994) experimental equation

Another equation was proposed by Linacre (1994) to determine evaporation from the pan. This equation can be considered a modification of the Linacre (1977) equation (Stephens et al. 2018). Unlike the Linacre (1977) equation, it uses only meteorological factors, so the elevation and latitude of the location are not required to determine the evaporation from the pan. In addition to temperature, this equation uses wind velocity as a meteorological factor.

$$ e_{P} = \frac{\gamma_{18} t - \gamma_{19} + \gamma_{20} u \left( t - t_{d} \right)}{\gamma_{21} + \frac{\gamma_{22}}{S}} $$
(8)

In the above equation, \(e_{P}\) is the determined evaporation from the pan (mm/day), \(u\) is the wind velocity at a height of 2 m above the ground (m/s), \(t\) is the average daily temperature (°C), \(t_{d}\) is the dew point temperature (°C), and \(\gamma_{18} = 21\), \(\gamma_{19} = 166\), \(\gamma_{20} = 6\), \(\gamma_{21} = 28\), and \(\gamma_{22} = 46\) are the default values of the Linacre (1994) equation parameters. In the Linacre (1994) equation, \(S\) is the slope of the psychrometric curve (mmHg), determined from the following equation (Murry 1967; Basnyat 1987).

$$ S = e_{s} \left( \frac{4098.03}{\left( t + 237.3 \right)^{2}} \right) $$
(9)

In the above equation, \(t\) is the average daily temperature (°C) and \(e_{s}\) is the saturated vapor pressure at the water surface, which can be obtained from the following equation (Xu et al. 2002):

$$ e_{s} = 6.1078 \exp \left[ \frac{17.2694\, t}{t + 237.3} \right] $$
(10)
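
Equations (9) and (10) can be evaluated with the short sketch below. The function names are hypothetical. The constant 6.1078 implies that \(e_{s}\) is returned in mbar, and any conversion to other pressure units is left to the caller. The helper `vapor_pressure_deficit` is not taken from the text: it assumes the common relation \(e_{d} = e_{s} \times RH / 100\), with relative humidity RH in percent.

```python
import math


def saturation_vapor_pressure(t_c: float) -> float:
    """Eq. (10): saturated vapor pressure e_s (mbar) at temperature t (deg C)."""
    return 6.1078 * math.exp(17.2694 * t_c / (t_c + 237.3))


def slope_of_curve(t_c: float) -> float:
    """Eq. (9): slope S, in the pressure unit of e_s per degree Celsius."""
    return saturation_vapor_pressure(t_c) * 4098.03 / (t_c + 237.3) ** 2


def vapor_pressure_deficit(t_c: float, rh_percent: float) -> float:
    """Deficit e_s - e_d, assuming e_d = e_s * RH / 100 (assumption of
    this sketch, not stated in the text)."""
    e_s = saturation_vapor_pressure(t_c)
    return e_s * (1.0 - rh_percent / 100.0)


# Hypothetical conditions: 20 deg C and 40 % relative humidity
print(saturation_vapor_pressure(20.0), slope_of_curve(20.0))
print(vapor_pressure_deficit(20.0, 40.0))

# S agrees with a finite-difference derivative of Eq. (10)
h = 1e-3
numeric_slope = (saturation_vapor_pressure(20.0 + h)
                 - saturation_vapor_pressure(20.0 - h)) / (2 * h)
print(numeric_slope)
```

The finite-difference check reflects that the factor 4098.03 equals 17.2694 × 237.3 to the precision given, so Eq. (9) is the temperature derivative of Eq. (10).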

Revision process

The parameters of these equations must be revised to increase the efficiency of the experimental equations in determining evaporation from the pan in cold regions. The pan's evaporation values are expected to be determined more efficiently by revising the experimental equations' parameters. In other words, these equations will be adapted for cold regions. It is necessary to use an optimization method to adjust the parameters of the equations. The Nelder and Mead Simplex (NMS) optimization method is well-known in water and environmental engineering studies. This research used the MATLAB environment to apply the optimization method (Ahmadianfar et al. 2016).

A simplex consists of n + 1 vertices in an n-dimensional space, each of which represents a potential solution to the optimization problem. In each iteration, the worst solution is replaced by a better one obtained by performing operations on the vertices (Kshirsagar et al. 2020). Figure 5 depicts the implementation procedure of the NMS method (Barati 2011). The Nelder-Mead algorithm frequently achieves remarkable improvements in the initial iterations and quickly gives satisfactory outputs. On the other hand, it can also run for many iterations with no significant improvement in the objective function.

Fig. 5 Flowchart for NMS method (Barati 2011)
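
To make the simplex operations concrete, the following sketch implements a simplified Nelder-Mead search in plain Python and uses it to revise the single parameter of Eq. (5). It is an illustration under stated assumptions, not the MATLAB procedure used in this study. The variant uses only reflection, expansion, inside contraction, and shrink steps. The data are synthetic, generated with an assumed parameter value of 0.5, and the sum of squared errors is an assumed objective function.

```python
def nelder_mead(func, x0, step=0.1, f_tol=1e-12, x_tol=1e-8, max_iter=500):
    """Simplified Nelder-Mead search (reflection, expansion, inside
    contraction, shrink) that minimizes func over a list of parameters."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced along each axis
    simplex = [list(x0)]
    for j in range(n):
        vertex = list(x0)
        vertex[j] += step
        simplex.append(vertex)
    values = [func(v) for v in simplex]

    for _ in range(max_iter):
        # Sort vertices from best (lowest objective) to worst
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        best, worst = simplex[0], simplex[-1]

        # Stop when the simplex is both small and flat
        size = max(abs(a - b) for v in simplex[1:] for a, b in zip(v, best))
        if size < x_tol and values[-1] - values[0] < f_tol:
            break

        # Centroid of all vertices except the worst one
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_ref = func(reflected)

        if f_ref < values[0]:
            # Reflection beat the best vertex: try to expand further
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_exp = func(expanded)
            if f_exp < f_ref:
                simplex[-1], values[-1] = expanded, f_exp
            else:
                simplex[-1], values[-1] = reflected, f_ref
        elif f_ref < values[-2]:
            simplex[-1], values[-1] = reflected, f_ref
        else:
            # Contract the worst vertex toward the centroid
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_con = func(contracted)
            if f_con < values[-1]:
                simplex[-1], values[-1] = contracted, f_con
            else:
                # Shrink every vertex toward the best one
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
                values = [values[0]] + [func(v) for v in simplex[1:]]

    k_best = min(range(n + 1), key=lambda k: values[k])
    return simplex[k_best], values[k_best]


# Synthetic (hypothetical) data: vapor pressure deficits in mbar and pan
# evaporation generated from Eq. (5) with an assumed parameter of 0.5
vpd = [2.0, 5.0, 8.0, 12.0, 20.0]
observed = [0.5 * d for d in vpd]


def sum_squared_error(params):
    gamma_7 = params[0]
    return sum((gamma_7 * d - e) ** 2 for d, e in zip(vpd, observed))


# Start the search from the default Papadakis (1961) parameter
revised, error = nelder_mead(sum_squared_error, [0.5626])
print(revised, error)
```

Starting from the default value of 0.5626, the search moves the parameter toward the value used to generate the synthetic data. With real station records, the objective function would be computed from measured pan evaporation instead.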

Analysis of the efficiency of experimental equations

Error metric selection to analyze the efficiency of equations is a critical step in simulation-related research (Ansarifar et al. 2020). Various error metrics have been presented in studies related to meteorological factors. Although the error metrics presented are numerous, multiple simulation aspects should be considered in their application. In this study, two error metrics were used to analyze and compare the efficiency of the experimental equations for determining evaporation from the pan. The first error metric is the mean error (ME), which generally determines the under-determined or over-determined errors (Whang et al. 2020). The second error metric used is the normalized root mean squared deviation (nRMSD). This error metric can consider the efficiency of determining evaporation in equations and is, therefore, a good option for comparing the results of different equations (Siebielec et al. 2004). The equations related to these error metrics are presented below:

$$ \text{ME} = 100\% \times \frac{1}{p} \sum_{i = 1}^{p} \left( f_{i} - g_{i} \right) $$
(11)
$$ \text{nRMSD} = 100\% \times \frac{\sqrt{\frac{1}{p} \sum_{i = 1}^{p} \left( f_{i} - g_{i} \right)^{2}}}{\overline{g}} $$
(12)

In the above equations, \(f_{i}\) and \(g_{i}\) represent the measured and determined values, respectively, and \(p\) is the number of values. The best value for both error metrics is 0. The large number of equations studied and the use of two error metrics make it difficult to select and rank the proper equations. The performance index (PI) was used in this study to overcome this difficulty in analyzing the results (Despotovic et al. 2015). The PI, denoted \(P\), combines the error metrics \(\text{nRMSD}\) and \(\text{ME}\) in a single calculation. The error metrics must first be scaled to determine \(P\), so that the high and low values of each error metric do not have a disproportionate influence on the \(P\) value. The following equation can be used to scale each error metric (Manju and Sandeep 2019).

$$ h_{\text{scaled}} = \frac{h - h_{\text{min}}}{h_{\text{max}} - h_{\text{min}}} $$
(13)

Here, \(h_{\text{max}}\), \(h_{\text{min}}\), \(h\), and \(h_{\text{scaled}}\) are the maximum, minimum, actual, and scaled values of an error metric, respectively. After scaling, the maximum and minimum values of each error metric become 1 and 0.
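
Equations (11) to (13) can be written compactly as follows. This is a minimal sketch with hypothetical function names and made-up sample values. It follows the equations as printed, so ME keeps its sign and nRMSD is normalized by the mean of the determined values \(g\).

```python
import math


def mean_error(measured, determined):
    """Eq. (11): ME = 100 * mean(f_i - g_i)."""
    p = len(measured)
    return 100.0 * sum(f - g for f, g in zip(measured, determined)) / p


def nrmsd(measured, determined):
    """Eq. (12): root mean squared deviation divided by the mean of g,
    expressed in percent."""
    p = len(measured)
    rmsd = math.sqrt(sum((f - g) ** 2 for f, g in zip(measured, determined)) / p)
    g_mean = sum(determined) / p
    return 100.0 * rmsd / g_mean


def min_max_scale(values):
    """Eq. (13): scale a list of error metric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values equal: nothing to rank
    return [(h - lo) / (hi - lo) for h in values]


# Made-up measured (f) and determined (g) pan evaporation values, mm/day
f = [4.0, 6.0, 8.0]
g = [3.0, 6.0, 9.0]
print(mean_error(f, g), nrmsd(f, g))
print(min_max_scale([2.0, 4.0, 10.0]))
```

In this example the positive and negative deviations cancel, so ME is zero while nRMSD is not, which illustrates why the two metrics are reported together.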

Using this equation, the values of each of the two error metrics in this study, \(\text{nRMSD}\) and \(\text{ME}\), will be in the range [0, 1]. The following equation can be used to determine \(P\). In this equation, \(i\) denotes the equation under study (Manju and Mavi 2021).

$$ P = \sum_{k = 1}^{q} c_{k} \left( s_{k} - t_{ik} \right) $$
(14)

In this equation, \(c_{k}\) are constants, set to 1 for both \(\text{ME}\) and \(\text{nRMSD}\); \(q\) is the number of applied error metrics; \(t_{ik}\) is the scaled value of error metric \(k\) for equation \(i\); and \(s_{k}\) is the median of the scaled values of error metric \(k\).

From the mathematical formula of \(P\), it follows that the higher the \(P\) value of an equation relative to the other equations, the better its rank. In other words, \(P\) acts as a final, combined error metric for comparing equations on a particular dataset, and equations with lower \(P\) values for that dataset receive lower ranks (Feng et al. 2018).
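
The ranking logic of Eq. (14) can be sketched as below. The function name, the equation labels, and the metric values are all hypothetical. The sketch assumes that each metric enters in a form where lower is better, for example the absolute value of ME, which is an assumption not stated in the text.

```python
from statistics import median


def performance_index(metrics, weights=None):
    """Eq. (14): P_i = sum_k c_k * (s_k - t_ik).

    metrics : {metric_name: {equation_name: value}}, lower values are better
    weights : optional {metric_name: c_k}; every c_k defaults to 1
    """
    equations = list(next(iter(metrics.values())))
    scores = {eq: 0.0 for eq in equations}
    for name, by_equation in metrics.items():
        raw = [by_equation[eq] for eq in equations]
        lo, span = min(raw), max(raw) - min(raw)
        # Eq. (13): scale the metric to [0, 1] across the equations
        scaled = [(v - lo) / span if span else 0.0 for v in raw]
        s_k = median(scaled)  # median of the scaled values of metric k
        c_k = 1.0 if weights is None else weights[name]
        for eq, t_ik in zip(equations, scaled):
            scores[eq] += c_k * (s_k - t_ik)
    return scores


# Made-up error metrics for three hypothetical equations
sample = {
    "abs_ME": {"eq_A": 5.0, "eq_B": 10.0, "eq_C": 25.0},
    "nRMSD": {"eq_A": 20.0, "eq_B": 30.0, "eq_C": 60.0},
}
pi_values = performance_index(sample)
ranking = sorted(pi_values, key=pi_values.get, reverse=True)
print(pi_values, ranking)
```

An equation whose scaled errors lie below the median receives a positive \(P\), and one whose errors lie above the median receives a negative \(P\), so sorting by \(P\) in descending order gives the ranking.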

Results

Findings of the revision process

The revised values of the experimental equation parameters were determined in the MATLAB environment using the NMS method. The parameters were calibrated by minimizing the error. Recorded data from all stations were used simultaneously in the optimization to obtain one overall set of parameters. The minimization pattern of the objective function for these equations is shown in Fig. 6, where the horizontal and vertical axes represent the iterations and the values of the objective function, respectively. The parameter values of the equations before and after the revision are shown in Table 2. The parameter values are expected to change after revision, and the differences between the default and revised values lead to interesting findings. The largest and smallest parameter changes both belong to single-parameter equations: the Trabert (1896) parameter changed by about 900% after revision, while the Papadakis (1961) parameter changed by only 12%. This difference between the two equations is noticeable. For the Linacre (1977) and Linacre (1994) equations, one parameter in each was revised to 0, from default values of 15 and 166, respectively. With this parameter equal to zero, the Linacre (1977) equation reduces to a simpler mathematical formula, which suggests that the default mathematical formula of the equation does not work well in determining evaporation from the pan in cold regions. The remarkable changes in the parameters of most equations indicate that the default equations do not have good reliability in these regions.

Fig. 6 Minimization of the objective function of the studied equations in cold regions

Table 2 Values of parameters for studied equations

Error metrics maps for efficiency of experimental equations

The ME and nRMSD error metrics, which represent efficiency, were determined for all 14 equations (seven default and seven revised). Because the spatial distribution of efficiency matters for experimental equations that determine evaporation from the pan, these error metrics were mapped. Figure 7 shows the spatial distribution of ME, and Fig. 8 shows that of nRMSD. In addition, the fit of the equations is illustrated visually in Fig. 9. To investigate the efficiency of the experimental equations, the default equations were compared first, and then all equations were compared.

Fig. 7 Spatial distribution of error metric ME for equations in cold regions

Fig. 8 Spatial distribution of error metric nRMSD for equations in cold regions

Fig. 9 Investigation of the visual efficiency of experimental equations of evaporation in cold regions

Comparative analysis of default experimental equations

The spatial distribution of the error metrics and the visual fit are presented in Figs. 7, 8 and 9. To summarize these error metrics, the PI values are illustrated in Fig. 10, which is based on all the studied datasets. A color spectrum is used to compare the PI of different equations across the analyzed datasets. According to Fig. 10, among the default equations, the Kohler–Nordenson–Fox (1955) equation determined evaporation from the pan most properly, by a slight margin over the Papadakis (1961) equation. Both equations have simple mathematical formulas, and because the Papadakis (1961) formula is the simplest of the studied equations, its proper performance is of considerable importance. On the other hand, the default Trabert (1896) and Linacre (1977) equations had the lowest efficiency among the default equations. The Trabert (1896) equation has a formula broadly similar to that of the Kohler–Nordenson–Fox (1955) equation, so this difference in results is important. Of course, the default parameter values of experimental equations strongly influence whether they perform properly (Yan and Mohammadian 2020). Under-determination or over-determination error is important in determining evaporation from the pan (Lu et al. 2018). The ME analysis for the default equations indicates that the Antal (1973), Papadakis (1961), and Linacre (1994) equations over-determine evaporation, while the Kohler (1954), Kohler–Nordenson–Fox (1955), Linacre (1977), and Trabert (1896) equations under-determine it. The largest under-determination and over-determination errors belong to the default Trabert (1896) and Antal (1973) equations, respectively. The Kohler–Nordenson–Fox (1955), Linacre (1994), and Papadakis (1961) equations have the smallest over-determination and under-determination errors among the default equations. In other words, they behave in a balanced way in this respect. Figure 9 indicates that most of the default equations do not have the proper reliability to determine evaporation from the pan.

Fig. 10 The performance index for default equations in the cold region (red: lowest performance, blue: highest performance)

Comparing the efficiency of default and revised equations jointly

The PI values of all equations are illustrated jointly in Fig. 11. Comparing these 14 equations shows that the revised Kohler–Nordenson–Fox (1955) equation was the best for determining evaporation from the pan. The revised Linacre (1994), revised Linacre (1977), revised Kohler (1954), revised Papadakis (1961), and revised Antal (1973) equations also performed well compared to the other equations. That most of the revised equations performed well in determining evaporation from the pan in cold regions is an important finding. These equations differ in mathematical formula and input factors, so the relatively small difference in their performance was not expected. Among the 14 equations, the default Trabert (1896) and default Linacre (1977) equations had the lowest efficiency in determining evaporation from the pan, as described in the following section. The lowest under-determination or over-determination error was obtained for the revised Papadakis (1961) equation, followed by the revised Kohler (1954) equation. Of these two equations, the revised Papadakis (1961) has the more straightforward mathematical formula. A comparison of the default and revised equations in Figs. 7, 8 and 9 reveals that the revision process has increased the reliability of determining evaporation using experimental equations.

Fig. 11 The performance index for default and revised equations in the cold region (red: lowest performance, blue: highest performance)

Analysis of improvement due to revision

The difference in efficiency before and after the revision indicates that the largest change due to revision belongs to the Trabert (1896) equation. Despite this remarkable change, the revised Trabert (1896) equation still led to inefficient results in some cases, which calls for more attention to the mathematical formula of the equation. On the other hand, the Papadakis (1961) equation is less affected by revision than the other equations, and its efficiency is proper both before and after revision. This finding reflects the robust behavior of the Papadakis (1961) equation. Revising the parameters of the equations also adjusted the under-determination and over-determination errors.

Investigation in spatial analysis

The spatial distribution of the efficiency of the studied equations (presented in Figs. 7 and 8) shows that the geographical distribution of the analyzed datasets has not caused a significant difference in the efficiency of the equations. However, the efficiency of the equations in the central regions was somewhat lower than in other regions, whereas the efficiency in the eastern, northeastern, western, and northwestern regions was somewhat higher.

Discussion

Plots of the changes in the objective function during the revision process (Fig. 6) show the efficiency of the applied optimization method. Numerous studies have emphasized the effectiveness of the NMS method in hydrology (Pinnington et al. 2018), water resources (Lee 2019), water quality (Ciolofan et al. 2018), climatology (Ntale et al. 2003), and environmental sciences (Yang et al. 2006), and the findings of this study confirm them. In the two one-parameter equations, Trabert (1896) and Papadakis (1961), the revision caused completely different changes in the parameter. While the revision changed the parameter's value only slightly in the Papadakis (1961) equation, the change was extraordinary in the Trabert (1896) equation. This finding indicates an entirely different behavior of the two equations in cold regions. The Papadakis (1961) equation can be considered robust in cold regions. If sufficient data are unavailable to revise the evaporation equations, the default Papadakis (1961) equation can be a practical option in this region.
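
The revision of a one-parameter equation discussed above can be illustrated with a short sketch. It assumes that NMS denotes the Nelder-Mead simplex method and uses a simplified variant of it (reflection, expansion, inside contraction, and shrink only). The equation form E = a * D (with D the vapor pressure deficit), the sum-of-squares objective, the synthetic data, and all function names are hypothetical choices made for illustration; they are not the study's implementation or its data.

```python
def nelder_mead(f, x0, step=0.1, tol=1e-8, max_iter=500):
    """Simplified Nelder-Mead simplex search for a function of a list of floats."""
    n = len(x0)
    # Initial simplex: the start point plus one vertex offset along each axis.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)

    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Stop once every vertex lies within tol of the best one.
        if max(abs(v[i] - best[i]) for v in simplex for i in range(n)) < tol:
            break
        # Centroid of all vertices except the worst.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the centroid: away from the worst
            # vertex for t > 0, toward it for t < 0.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink all vertices halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=f)


# Synthetic, made-up data: "observed" evaporation generated with a = 0.5.
deficits = [0.4, 0.9, 1.3, 2.1, 2.8]
observed = [0.5 * d for d in deficits]


def objective(params):
    """Sum of squared errors of the hypothetical equation E = a * D."""
    a = params[0]
    return sum((a * d - o) ** 2 for d, o in zip(deficits, observed))


default_a = 1.0  # hypothetical default parameter value
revised_a = nelder_mead(objective, [default_a])[0]
```

In this toy setting the search moves the parameter from its assumed default toward the value that generated the data. A parameter that barely moves during such a search would correspond to the robust behavior described for the Papadakis (1961) equation, and a large move to the behavior described for the Trabert (1896) equation.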

Another finding regarding these two equations is that, despite the significant difference in their efficiency in the default mode, their efficiency was relatively similar after the revision process. The Trabert (1896) and Papadakis (1961) equations were the most and the least affected by revision, respectively. Together, these findings indicate that in cold regions, the inefficiency of the default Trabert (1896) equation was caused far more by its default parameters than by its mathematical formula.

A comparison of the efficiency of the default equations shows that the Kohler–Nordenson–Fox (1955) and Papadakis (1961) equations were more efficient than the others. The most critical difference between the two is that the Kohler–Nordenson–Fox (1955) equation uses wind velocity as an additional factor. The second difference is that the relationship between vapor pressure deficit and evaporation is linear in Papadakis (1961) and nonlinear in Kohler–Nordenson–Fox (1955). In addition, the high efficiency of the simple Papadakis (1961) formula compared to the other default equations, together with the small difference between its results and those of the Kohler–Nordenson–Fox (1955) equation, indicates that vapor pressure deficit is the most critical factor affecting evaporation from the pan in cold regions. This finding is confirmed by the study of Matsoukas et al. (2011). These findings also indicate that, in the absence of data to revise the default equations, the Kohler–Nordenson–Fox (1955) equation is a suitable option for determining evaporation from the pan in cold regions. With only a slight difference in efficiency, the Papadakis (1961) equation is also a valuable option under such conditions.
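
The distinction between a linear and a nonlinear dependence on vapor pressure deficit can be sketched as follows. The power-law form is only one possible nonlinear relationship and is assumed here for illustration; the function names and the coefficients `a` and `b` are hypothetical and are not the published parameters of either equation.

```python
def linear_response(deficit, a):
    """Evaporation proportional to the vapor pressure deficit (linear form)."""
    return a * deficit


def power_response(deficit, a, b):
    """Evaporation as a power law of the deficit (one possible nonlinear form)."""
    return a * deficit ** b


# Doubling the deficit doubles the linear estimate, while the power-law
# estimate is multiplied by 2**b (less than 2 when b < 1).
linear_ratio = linear_response(2.0, 0.5) / linear_response(1.0, 0.5)
power_ratio = power_response(2.0, 0.5, 0.9) / power_response(1.0, 0.5, 0.9)
```

When the exponent is close to one, the two forms give similar estimates over a moderate range of deficits, which is consistent with a small difference in results between a linear and a nonlinear equation.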

Comparative analysis of all default and revised equations reveals that the highest performance belonged to the revised Kohler–Nordenson–Fox (1955) equation. The revised equations are generally reliable in determining evaporation from the pan. Although the studied equations were revised to varying magnitudes, the revised equations ultimately show no extraordinary differences in efficiency. Therefore, it can be concluded that the efficiency of pan evaporation equations in the studied cold regions is mainly affected by the default parameters of these equations, and that their mathematical formula does not cause a large difference in efficiency. This finding is important because it does not support the assumption that different revised equations perform very differently in cold regions. It was also unexpected, considering the different mathematical formulas and inputs of the studied equations, and it contradicts research findings in coastal areas (Mohammadi et al. 2023) and arid and semi-arid regions (Mohammadi et al. 2023). Another finding is the remarkable efficiency of the revised Trabert (1896) equation. As the equation's mathematical formula shows, wind velocity enters the equation in a radical form that does not change in the revision process. Although other parameters of the equation change in the revision process, this constraint was expected to lead to poor performance of the revised Trabert (1896) equation. The findings of this study showed otherwise. Studies by Mohammadi et al. in coastal areas (2021) and arid and semi-arid regions (2021) indicate low efficiency of the revised Trabert (1896) equation, which contradicts the results of this study.
The mathematical relationship between wind velocity and evaporation from the pan in the Trabert (1896) equation includes a multiplier such that the evaporation equals zero when the wind velocity is zero. This can cause significant errors in determining evaporation in regions where the wind velocity is zero on a significant part of the days of the year. It seems that days with zero wind velocity were rare in the cold-region data, which made the efficiency of the revised Trabert (1896) equation acceptable.
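
The effect of a multiplicative wind term can be sketched in a few lines. The form E = a * sqrt(u) * D used below is an assumed Trabert-type structure chosen for illustration, with a hypothetical coefficient `a`; the function names and the sample wind series are likewise made up and are not taken from the study.

```python
import math


def wind_multiplier_evaporation(wind_speed, deficit, a=1.0):
    """Hypothetical Trabert-type form: E = a * sqrt(u) * D."""
    return a * math.sqrt(wind_speed) * deficit


def calm_fraction(wind_series):
    """Share of records in which the wind velocity is exactly zero."""
    return sum(1 for u in wind_series if u == 0.0) / len(wind_series)


# With zero wind the estimate collapses to zero, whatever the deficit is.
calm_estimate = wind_multiplier_evaporation(0.0, 1.5)
windy_estimate = wind_multiplier_evaporation(4.0, 1.5, a=0.5)

# Made-up wind record: half of the days are calm.
share_of_calm_days = calm_fraction([0.0, 1.0, 2.0, 0.0])
```

Checking the share of calm days in a dataset before applying an equation of this form gives a quick indication of how often the estimate would be forced to zero.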

One parameter in each of the Linacre (1977) and Linacre (1994) equations was revised to zero during the revision process. This revision simplifies the Linacre (1977) equation's mathematical formula; therefore, it can be concluded that the default Linacre (1977) equation is not suitable for determining evaporation in cold regions. Examining the ME error metric for the equations before and after revision indicates that the revision process was able to correct this problem. This finding emphasizes the importance of adjusting the default equations to determine evaporation from the pan in cold regions.
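
How a mean error (ME) metric separates over-determination from under-determination can be sketched as follows. The sign convention (estimated minus observed) and the function name are assumptions made for this illustration, and the numbers are made up.

```python
def mean_error(estimated, observed):
    """Mean of (estimated - observed); this sign convention is an assumption."""
    if len(estimated) != len(observed):
        raise ValueError("series must have the same length")
    return sum(e - o for e, o in zip(estimated, observed)) / len(observed)


# Made-up values: the first estimate series runs high, the second runs low.
over_determined = mean_error([3.0, 5.0], [2.0, 4.0])
under_determined = mean_error([1.0, 2.0], [2.0, 3.0])
```

Under this convention a positive ME indicates over-determination and a negative ME indicates under-determination, so a value closer to zero after revision reflects a reduced systematic error.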

Taken together, the above findings make clear that revising the default equations can improve all the studied equations if valid data are available. Revising the mathematical formula of equations to improve results has been emphasized in several studies, such as Dubovský et al. (2021) and Metcalfe et al. (2019), and the results of this study confirm those findings.

A comparison of the results of the equations in different regions shows that geographical distribution has a limited influence on the efficiency of the equations in the cold region. This finding is important because the study areas had different meteorological characteristics, so the spatial distribution was expected to influence the efficiency of the equations significantly. This finding contradicts the results of Li et al. (2020).

A joint investigation of the default and revised equations indicates that most of the default equations are not reliable and that the reliability of the equations in determining evaporation increases with the revision process. For the Linacre (1994) equation, the revision process eliminates one parameter, while for the Linacre (1977) equation, the mathematical formula is simplified by removing one parameter. In the Trabert (1896) equation, the mathematical relationship between wind velocity and evaporation sets the evaporation to zero on days when the wind velocity is zero, which reduces the reliability of this equation. Therefore, it can be concluded that, among the studied equations, the Linacre (1977), Linacre (1994), and Trabert (1896) equations are less reliable than the others.

Conclusion

Evaporation has a significant influence on many environmental science processes. Using an evaporation pan is the standard method of determining the amount of evaporation. However, many regions have no evaporation measuring stations for various reasons, so the use of experimental mathematical equations is necessary. Most of the data used in various studies to derive experimental mathematical equations are not from cold regions, which raises serious doubts about applying these equations in cold regions. In this study, using datasets from the cold regions of Iran, the efficiency of seven equations, Trabert (1896), Kohler (1954), Kohler–Nordenson–Fox (1955), Papadakis (1961), Antal (1973), Linacre (1977), and Linacre (1994), was examined in both default and revised modes. The results show that revision has increased the efficiency of the studied equations. Although the magnitude of the improvement varied among the equations, the revised equations ultimately do not differ much in efficiency. Nevertheless, among the 14 equations, the highest efficiency belongs to the revised Kohler–Nordenson–Fox (1955) equation. This equation is characterized by a nonlinear relationship between vapor pressure deficit and evaporation and by the inclusion of wind as another meteorological factor determining evaporation from the pan. Revision has reduced the equations' under-determination and over-determination errors, which emphasizes the need to revise default experimental equations in cold regions. Although the efficiency of the equations was slightly lower in the central regions than in other regions, the spatial distribution in general did not significantly influence the efficiency of the equations in the cold regions. The comparison between the studied equations shows that they differ in reliability in determining evaporation from the pan in the cold region.
The study used data recorded in Iran's cold regions; therefore, the results may not cover all relatively similar climates around the world. Extending the research with datasets from other cold regions worldwide is recommended.