GMDH, also called self-organizing modelling, is a non-linear modelling technique first introduced by Ivakhnenko (1968) that establishes an input-output relationship within a complex system. The GMDH approach integrates elements from both statistics and neural networks, combining their strengths to improve the imputation of missing hydrological data (Valença et al., 1998). It requires only small data samples and is able to optimise model structure objectively (Anastasakis & Mort, 2001). GMDH has been used widely in diverse fields such as pattern recognition, physiological experiments, cybernetics, medical science, education, safety science, economics, tool-life testing in gun drilling, forecasting of mobile communication, ecology, weather modelling and hydraulic field engineering systems (Onwubolu, 2016); however, its use in hydrological modelling and forecasting is still less explored (Aghelpour et al., 2020).
GMDH Algorithm
The GMDH algorithm follows the principle of polynomial approximation to connect input and output values. The general relationship between the input and output variables is expressed by the polynomial function known as the Kolmogorov-Gabor polynomial:
\(Y = \alpha_0 + {\sum }_{i=1}^{n}{\alpha }_{i}{x}_{i} + {\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{\alpha }_{ij}{x}_{i}{x}_{j} + {\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{\sum }_{k=1}^{n}{\alpha }_{ijk}{x}_{i}{x}_{j}{x}_{k} + \ldots\) (1)
where \(X = ({x}_{1}, {x}_{2}, \ldots, {x}_{n})\) denotes the input variables, \(\alpha = ({\alpha }_{1}, {\alpha }_{2}, \ldots, {\alpha }_{n})\) represents the weights, and \(Y\) is the output variable (Li et al., 2020).
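To make Eq. (1) concrete, the following minimal Python/NumPy sketch evaluates the Kolmogorov-Gabor polynomial truncated after the second-order term; the coefficient values are purely illustrative, not fitted model weights:

```python
import numpy as np

def kolmogorov_gabor(x, a0, a_lin, a_quad):
    """Evaluate Eq. (1) truncated after the second-order term.

    x      : input vector (x1, ..., xn)
    a0     : bias coefficient alpha_0
    a_lin  : length-n vector of first-order coefficients alpha_i
    a_quad : n x n matrix of second-order coefficients alpha_ij
    """
    x = np.asarray(x, dtype=float)
    y = a0
    y += np.dot(np.asarray(a_lin, dtype=float), x)   # sum_i alpha_i x_i
    y += x @ np.asarray(a_quad, dtype=float) @ x     # sum_i sum_j alpha_ij x_i x_j
    return y

# Illustrative coefficients (not fitted values)
y = kolmogorov_gabor([1.0, 2.0], a0=0.5,
                     a_lin=[0.1, 0.2],
                     a_quad=[[0.0, 0.05], [0.05, 0.0]])
```

Higher-order terms follow the same pattern with higher-dimensional coefficient arrays, which is exactly why the full expansion becomes data- and computation-hungry as the paragraph below notes.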
The above equation necessitates a substantial amount of data and computationally intensive calculations involving high-dimensional matrices to determine a large number of coefficients. However, the GMDH algorithm overcomes this limitation by employing a multilayered perceptron-type structure. In this structure, the functions utilized are either linear or second-order polynomial functions in two or three variables, such as:
\({F}_{1}({x}_{i}, {x}_{j}) = \alpha_0 + \alpha_1 {x}_{i} + \alpha_2 {x}_{j}\) (2)

\({F}_{2}({x}_{i}, {x}_{j}) = {F}_{1}({x}_{i}, {x}_{j}) + \alpha_3 {x}_{i}{x}_{j} + \alpha_4 {x}_{i}^{2} + \alpha_5 {x}_{j}^{2}\) (3)
The model parameters are determined through the least-squares method, where the residues between the model outputs and targets are minimized. To achieve this, all possible combinations of two inputs are utilized to generate an output (as illustrated in Fig. 1). The outputs are evaluated, and the best results meeting a specified threshold are selected. This process is repeated for subsequent layers until the threshold criteria are satisfied.
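The least-squares fit of the quadratic neuron in Eq. (3) for one pair of inputs can be sketched as follows (a Python/NumPy illustration with synthetic data, not the toolbox implementation):

```python
import numpy as np

def fit_pair(xi, xj, y):
    """Fit the quadratic neuron of Eq. (3) by ordinary least squares.

    Builds the design matrix [1, xi, xj, xi*xj, xi^2, xj^2] and solves
    for the six coefficients a0..a5 minimising the squared residuals
    between the neuron output and the target y.
    """
    A = np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs, A @ coeffs  # fitted coefficients and model outputs

# Synthetic check: recover a known noiseless polynomial exactly
rng = np.random.default_rng(0)
xi, xj = rng.normal(size=50), rng.normal(size=50)
y = 1.0 + 2.0 * xi - 0.5 * xj + 0.3 * xi * xj + 0.1 * xi**2
coeffs, y_hat = fit_pair(xi, xj, y)
```

In the full algorithm this fit is repeated for every pair of inputs in a layer, and only the neurons whose outputs best match the target survive to the next layer.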
GMDH algorithms have been successfully applied for short-term prediction and forecasting of river flow (Ikeda et al., 1976; Valença et al., 1998; Aghelpour et al., 2020; Bonakdari et al., 2020; Khodakhah et al., 2022), water quality (Haghiabi et al., 2018), turbidity (Tsai et al., 2017), dissolved oxygen (Tomić et al., 2018) and river water level (Li et al., 2020). For instance, Dag et al. (2016) developed the R package GMDH for short-term forecasting and observed improved performance compared to ARIMA models and exponential smoothing methods. Changzheng et al. (2009) combined GMDH with the Expectation-Maximization (EM) algorithm to address missing values in noisy data and found it superior to four other popular imputation methods.
Moreover, Acock et al. (2000) demonstrated the utility of GMDH in filling gaps in weather data obtained from weather stations. Meghwar et al. (2016) utilized GMDH for analyzing missing data in river flow. Additionally, Aghelpour et al. (2020) employed the GMDH method for modeling and forecasting river flow, concluding that it is a reliable predictor for daily river flow, except during severe floods.
Overall, these studies highlight the versatility and effectiveness of GMDH in various applications, including both forecasting and addressing missing data challenges in hydrological analysis.
The objective of this research is to assess the efficacy of the GMDH technique in imputing missing values in long-term hydrological series data. The study employed historical hydrological datasets from four gauging stations located on the Mahanadi river and its tributaries to construct and evaluate the performance of the GMDH model.
Study Area:
The Mahanadi river, a significant east-flowing river in India, originates in the Dhamtari district of Chhattisgarh and empties into the Bay of Bengal. The Mahanadi basin is divided into three sub-basins: upper, middle, and lower Mahanadi. In this study, hydrological datasets from four Hydrological Observation (HO) sites operated by the Central Water Commission (CWC) (Table 2) in the lower sub-basin of the Mahanadi river were utilized to develop and evaluate the GMDH model. Among these sites, Tikarapara and Boudh are situated along the main course of the Mahanadi river, while Kantamal and Padampur are located on its tributaries, namely the Tel and the Ong, respectively. Each site has its own set of missing daily discharge data. Fig. 2 below illustrates the geographical locations of the hydrological observation sites in the lower sub-basin of the Mahanadi river.
Table 2
Hydrological Observation Site Locations

| River-Gauging Station | River/Tributary | Latitude | Longitude |
| --- | --- | --- | --- |
| Tikarapara | Mahanadi | 20.6019 | 84.7761 |
| Boudh | Mahanadi | 20.860 | 84.322 |
| Kantamal | Mahanadi/Tel | 20.6527 | 83.7234 |
| Padampur | Mahanadi/Ong | 21.0154 | 83.1043 |
Methodology:
In this paper, we use the GMDH toolbox provided by the Yarpiz organization to create the GMDH polynomial neural network (http://yarpiz.com/323/ypml120-time-series-prediction-using-gmdh). The code was executed in MATLAB using the following fundamental steps:
Step 1: Split the initial dataset into training and testing sets.
In the first step, continuous data X = {x1, x2, ..., xM} without missing values is chosen as the input variables, as shown in Fig. 3. The available data is then divided into separate training and testing datasets, used to learn the system structure and to select the best results, respectively. This helps avoid overfitting in prediction and is known as regularization. The first 70% of the time series was used as the training set and the last 30% as the testing set (Dag et al., 2016).
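The 70/30 chronological split can be sketched as follows (a minimal Python illustration with a placeholder series; the actual study uses the daily discharge records described above):

```python
import numpy as np

def chronological_split(series, train_frac=0.7):
    """Split a time series into training and testing sets without shuffling,
    preserving temporal order: the first 70% trains the model, the last 30%
    is held out for selecting the best candidate neurons."""
    n_train = int(len(series) * train_frac)
    return series[:n_train], series[n_train:]

flows = np.arange(100.0)  # placeholder daily-discharge series
train, test = chronological_split(flows)
```

Keeping the split chronological (rather than random) matters for hydrological series, since shuffling would leak future information into the training set.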
Step 2: Create combinations of input variables within each layer.
The second step involves constructing \(\binom{M}{2} = M(M-1)/2\) new variables within the training dataset, one for each pair of inputs. Additionally, a regression polynomial for the first layer is constructed by creating a quadratic expression that approximates the output y, as given in Eq. (3).
Step 3: Apply an optimization principle to determine the elements within each layer.
In step three, the contributing nodes in each hidden layer are identified based on their root mean square error (RMSE) values. The least effective variables are then eliminated by replacing the old columns of X with the new columns Z.
Step 4: Stopping rule for multilayer structure generation.
Step four involves the iterative process of the GMDH algorithm, repeating steps 2 and 3. The iteration continues until the errors of the test data in each layer no longer decrease; at that point, the iterative computation is terminated.
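Steps 2-4 above can be sketched together as a simplified selection loop (a Python/NumPy illustration, not the Yarpiz toolbox code; for brevity it selects neurons on training residuals, whereas the full algorithm evaluates candidates on the held-out testing set):

```python
import numpy as np
from itertools import combinations

def fit_quadratic(xi, xj, y):
    """Least-squares fit of the Eq. (3) neuron; returns fitted outputs."""
    A = np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])
    c, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ c

def gmdh_sketch(X, y, keep=4, max_layers=3):
    """Simplified GMDH loop: build all M(M-1)/2 pairwise neurons (Step 2),
    keep the `keep` best by RMSE and replace X with their outputs Z (Step 3),
    and stop when the best RMSE no longer decreases (Step 4)."""
    best_rmse = np.inf
    for _ in range(max_layers):
        candidates = []
        for i, j in combinations(range(X.shape[1]), 2):
            z = fit_quadratic(X[:, i], X[:, j], y)
            rmse = np.sqrt(np.mean((z - y) ** 2))
            candidates.append((rmse, z))
        candidates.sort(key=lambda t: t[0])
        layer_best = candidates[0][0]
        if layer_best >= best_rmse:  # Step 4: errors no longer decrease
            break
        best_rmse = layer_best
        # Step 3: surviving neuron outputs Z become the next layer's inputs
        X = np.column_stack([z for _, z in candidates[:keep]])
    return best_rmse

# Sanity check: a target expressible by one Eq. (3) neuron is fit almost exactly
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
err = gmdh_sketch(X, X[:, 0] * X[:, 1])
```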
For each dataset, the next 30 days of values were imputed and compared with the original data using the performance indices discussed below. Figure 5 shows the
Performance Indices:
The evaluation of model performance for both training and forecasting data is conducted using commonly used metrics such as root-mean-square error (RMSE), correlation coefficient (R) and Mean Absolute Error (MAE). These indices, widely employed in time series forecasting evaluations, provide insights into the accuracy and correlation of the results. The definitions of RMSE, R and MAE are as follows:
The coefficient of determination R2, the square of the correlation coefficient R, measures how well the regression model fits the data, with values ranging from 0 to 1; an R2 approaching 1 indicates a better fit. The correlation coefficient R is given by:

\(R = \dfrac{\mathrm{Cov}(X, Y)}{{\sigma }_{X}\,{\sigma }_{Y}}\)
where Cov(X, Y) is the covariance between variables X and Y, and \({\sigma }_{X}\) and \({\sigma }_{Y}\) are the standard deviations of X and Y, respectively.
\(\mathrm{RMSE} = \sqrt{\dfrac{1}{n}{\sum }_{i=1}^{n}{\left({y}_{\mathrm{pred},i} - {y}_{\mathrm{actual},i}\right)}^{2}}\)
where n is the number of data points, \({y}_{\mathrm{pred}}\) represents the predicted values and \({y}_{\mathrm{actual}}\) represents the actual values.
The RMSE provides a measure of the overall error between the predicted and actual values, with lower RMSE values indicating better model performance and higher accuracy (Aghelpour et. al., 2020).
MAE
The MAE is calculated by taking the average of the absolute differences between the predicted and actual values:
\(\mathrm{MAE} = \dfrac{1}{n}{\sum }_{i=1}^{n}\left|{y}_{\mathrm{pred},i} - {y}_{\mathrm{actual},i}\right|\)
where n is the number of data points, and \({y}_{\mathrm{pred}}\) and \({y}_{\mathrm{actual}}\) are the predicted and actual values, respectively.
As the RMSE and MAE criteria approach zero and R2 simultaneously approaches one, the accuracy of the model is higher.
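The three indices defined above can be computed together as in the following minimal Python/NumPy sketch (sample ddof=1 is used for both covariance and standard deviations so the two are consistent):

```python
import numpy as np

def performance_indices(y_actual, y_pred):
    """Compute RMSE, MAE and R^2 as defined above."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_pred - y_actual) ** 2))
    mae = np.mean(np.abs(y_pred - y_actual))
    # R = Cov(X, Y) / (sigma_X * sigma_Y); np.cov defaults to ddof=1
    r = (np.cov(y_actual, y_pred)[0, 1]
         / (np.std(y_actual, ddof=1) * np.std(y_pred, ddof=1)))
    return rmse, mae, r**2

# Perfect prediction: RMSE = MAE = 0 and R^2 = 1
rmse, mae, r2 = performance_indices([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```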