Integrated Deep Neural Networks-Based Complex System for Urban Water Management

National Research Base of Intelligent Manufacturing Service, Chongqing Technology and Business University, Chongqing 400067, China
Chongqing Sino French Environmental Excellence Research & Development Center Co., Ltd., Chongqing 400067, China
Chongqing South-to-Thais Environmental Protection Technology Research Institute Co., Ltd., Chongqing 400069, China
Computer School, Hubei University of Arts and Science, Xiangyang 441000, China
Global Information and Telecommunication Institute, Waseda University, Shinjuku, Tokyo 169-8050, Japan


Introduction
Due to rapid economic development and explosive population growth, water shortages have appeared in many parts of the world [1,2]. An urban water management system is of great significance to the sustainable development of urban water resources [3,4]. It can encourage the water sector to allocate water resources rationally and to maximize the long-term value and reliability of the available water resources [5][6][7]. However, urban water management is an extremely complex process that is difficult to control [8,9]. It includes the storage, distribution, and discharge of urban water [10][11][12], and its main purpose is the rational development, unified management, and optimal allocation of water resources [13,14]. To realize an urban water management system, it is necessary to understand urban water consumption [15,16]. The expansion of industry and the rise in global temperature have increased water consumption [17], whereas government water-saving measures and higher water prices curb its growth [18]. Because these forces act in opposite directions, urban water consumption shows a certain volatility and ambiguity. Therefore, it is necessary to analyze urban water consumption from multiple angles, which greatly increases the difficulty of prediction [19].
Many scholars have made contributions to the field of urban water management. They have successively proposed a series of water management models, including statistical analysis models (autoregressive integrated moving average model [20,21], exponential smoothing [22,23], etc.), gray theory models (GM(1, 1) [24], NGBM(1, 1) [25], etc.), and neural network models (artificial neural network (ANN) [26,27], long short-term memory network (LSTM) [28][29][30], etc.). Statistical analysis methods usually require datasets to follow certain assumptions or distributions [31,32], but real collected datasets are unlikely to meet such assumptions. Gray theory models generally address modeling with small sample sizes and sparse data [33,34]. Water consumption, however, is affected by many factors, and deep learning methods are better suited to prediction problems of this kind, where the data are complex and the fluctuation patterns are difficult to fit [35,36]. Al-Zahrani and Abo-Monasar [26] proposed a neural network prediction model coupled with time series that takes climate variables into account. This model mainly reflects the drought characteristics of the Middle East and is helpful for efficiently predicting water consumption in that region. Surendra and Deka [37] used an improved fuzzy-wavelet model to predict urban residential water consumption while considering climate change. This model introduces denoising and compression methods and uses climate data and water consumption over a certain period for training and testing, which greatly improves performance compared with ordinary fuzzy models.
However, many factors influence urban water consumption, and measuring the relationships among the factors and between each factor and water consumption is quite complex. In addition, current research rarely classifies impact factors through feature selection or considers the combined impact of each category on water consumption. Therefore, this paper proposes a new Urban Water management Mechanism based on Integrated deep neural network (UWM-Id). First, principal component analysis (PCA) and gray relational analysis (GRA) are used to reduce the dimensionality of, and classify, 15 impact factors of urban water consumption. With PCA, the factors with greater correlation are combined to obtain three principal components. With GRA, the impact factors are sorted according to their gray correlation degree, and the indicators more relevant to water consumption are selected by setting a threshold. Then, the indicators contained in the three principal components are input into three separate LSTMs, and the three LSTMs are fused to output the predicted value. Finally, the indicators selected by GRA are input into several baseline methods to evaluate the performance of the proposed UWM-Id. The rest of this paper is organized as follows. The framework and problem of this paper are illustrated in Section 2. The detailed principle and process of feature selection and the integration model are described in Section 3. Section 4 discusses the data, the experimental settings, and a series of experiments for UWM-Id. Section 5 presents the conclusions.

Framework of UWM-Id.
This research puts forward a novel architecture, UWM-Id, whose framework design is illustrated in Figure 1. The architecture includes four layers: the data layer, the feature selection layer, the model computing layer, and the model integration layer.
(1) The data layer is the interface through which the various data sources enter the model. The data sources are the historical water consumption data required for this study and the indicators that affect water consumption. The selected indicators should match the content of the evaluation, reflect the relevant content, and be scientifically grounded. Therefore, this study selects 15 indicators from the perspectives of economic development, political culture, and living facilities based on these rules.
(2) After data acquisition, the data are passed to the feature selection layer for feature dimensionality reduction and classification. The impact factors in the dataset are integrated into three principal components through PCA, and each principal component contains one type of factor. The gray correlation degree between each impact factor and water consumption is calculated through GRA; the factors are then sorted by gray correlation degree, a threshold is set, and the factors more relevant to water consumption are selected.
(3) Each principal component $P_i$ produced by the feature selection layer is input into the model computing layer together with the water consumption data. The model computing layer uses LSTM, which is commonly used for time-series prediction.
(4) Finally, the prediction results of the three primary LSTMs are synthesized, optimized, and integrated into a new model. Analyzing and predicting water consumption comprehensively from three aspects in this way improves the prediction accuracy.

Problem Statement.
This paper proposes a management mechanism based on an integrated neural network (UWM-Id). It supports urban water management systems by predicting water consumption. The flowchart of the management mechanism is shown in Figure 2.
UWM-Id includes two stages: training and prediction of the primary LSTMs (Model 1 to Model 3), and training and prediction of the integrated model (Model 4). The task of the first stage is to predict the annual water consumption from three different dimensions. Model 1 to Model 3 can each be regarded as a weak individual learner; the difference between them comes from the different perspectives on water consumption that they capture. The goal of the second stage is to integrate the influence of the three perspectives. On the basis of the first stage, weights are assigned according to the size of the loss so as to obtain a predicted value closer to the true value of water consumption. At this stage, the individual learners are combined into a strong learner, which makes the integrated Model 4 more effective.
After the second stage, this paper will use GRA to reduce the dimensionality of features, then input the retained features into a series of baseline models, and take them as a comparative experiment to evaluate the performance and superiority of UWM-Id.

Feature Selection
The factor analysis method is a multivariate statistical analysis method based on the dependence relationships within the correlation matrix of the research indexes. It reduces variables with overlapping information and intricate relationships to a few uncorrelated comprehensive factors [38,39].
Principal component analysis (PCA) is a type of factor analysis method, and its principle is shown in Figure 3. The basic idea is as follows: (1) Under the premise of losing little information, multiple indexes are transformed into several uncorrelated comprehensive indexes (principal components P) through dimensionality reduction (a linear transformation). (2) Each principal component is a linear combination of the original variables, and the principal components are uncorrelated with each other. This gives the principal components some properties superior to those of the original variables.
X is the initial impact factor matrix of water consumption, $X = (x_1, x_2, x_3, \ldots, x_p)'$, where $x_{i,j}$ is the value of factor $i$ in year $j$ ($i = 1, 2, \ldots, p$; $j = 1, 2, \ldots, m$), $p$ is the dimension of $X$, and $m$ is the total number of years in the sample set.
Step 1. First standardize the sample set:
$$x^*_{i,j} = \frac{x_{i,j} - \bar{X}_j}{\sigma_j},$$
where $x^*_{i,j}$ is the normalized value of $x_{i,j}$, and $\bar{X}_j$ and $\sigma_j$ are the sample mean and standard deviation, respectively.
Step 2. Calculate the correlation coefficient matrix $R$ of the standardized impact factor set $X^*$.
Step 3. Calculate the eigenvalues of $R$ and the corresponding eigenvectors. From the characteristic equation $|R - \lambda E| = 0$, the $p$ eigenvalues $\lambda_i$ of $R$ can be obtained. Arrange the eigenvalues from large to small, $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$, and calculate the eigenvector corresponding to each eigenvalue.
Step 4. Calculate the variance contribution rate $\alpha_i$ and the cumulative variance contribution rate $\alpha(i)$ of each principal component:
$$\alpha_i = \frac{\lambda_i}{\sum_{k=1}^{p} \lambda_k}, \qquad \alpha(i) = \frac{\sum_{k=1}^{i} \lambda_k}{\sum_{k=1}^{p} \lambda_k}.$$
When the cumulative variance contribution rate exceeds 85%, the leading factors up to that point are selected as the principal components that affect water consumption and are used in the integrated LSTM model. Specifically, the original feature space is reduced in dimensionality by formula (8) to simplify the solution of the model. $P_i$ is the variable obtained from factor $i$ after the principal component transformation, which effectively reduces the number of types of impact factors.
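The four PCA steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `pca_select`, the synthetic 20-year, 15-factor matrix, and the layout (rows are years, columns are factors) are assumptions made for the example, and the 0.85 threshold follows the 85% rule stated in the text.

```python
import numpy as np

def pca_select(X, threshold=0.85):
    """X has shape (m years, p factors). Returns (scores, alpha, cum_alpha, k)."""
    # Step 1: z-score standardization of each factor (column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: correlation coefficient matrix R of the factors, shape (p, p).
    R = np.corrcoef(Z, rowvar=False)
    # Step 3: eigenvalues and eigenvectors; eigh is suited to the symmetric R.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]            # sort from large to small
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: variance contribution rate and cumulative contribution rate.
    alpha = eigvals / eigvals.sum()
    cum_alpha = np.cumsum(alpha)
    # Smallest number of components whose cumulative rate reaches the threshold.
    k = min(int(np.searchsorted(cum_alpha, threshold)) + 1, len(eigvals))
    scores = Z @ eigvecs[:, :k]                  # component scores, shape (m, k)
    return scores, alpha, cum_alpha, k

# Synthetic stand-in data (assumption): 15 factors driven by 3 hidden drivers.
rng = np.random.default_rng(0)
m, p = 20, 15
latent = rng.normal(size=(m, 3))
loadings = np.repeat(np.eye(3), 5, axis=1)       # shape (3, 15), five factors per driver
X = latent @ loadings + 0.1 * rng.normal(size=(m, p))

scores, alpha, cum_alpha, k = pca_select(X)
print("components kept:", k)
print("cumulative contribution:", np.round(cum_alpha[:k], 3))
```

Working from the correlation matrix rather than the covariance matrix matches the standardization in Step 1, so factors measured in different units contribute on an equal footing.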

GRA-Based Method.
GRA is a multifactor analysis method that originated in the 1980s and usually takes uncertain systems as its research object. Water consumption prediction is characterized by limited data and strong uncertainty [40][41][42]. Therefore, GRA can be used to analyze the factors affecting water consumption. This paper uses water consumption as the reference sequence and the 15 indicators that affect water consumption as the comparison sequences. It then calculates the gray correlation coefficient and gray correlation degree between each comparison sequence and the reference sequence and, based on these values, identifies the key factors affecting urban water consumption and analyzes the degree of influence of each index.
The specific calculation steps are as follows.
Step 1. Determine the reference sequence $x_0$ (urban water consumption) and the comparison sequences $x_i$ ($i = 1, 2, \ldots, p$), which are the impact factors of water consumption.
Step 2. Apply dimensionless processing to the reference sequence and the comparison sequences.
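Step 3. Calculate the correlation factor between the reference sequence and each comparison sequence. The correlation factor between the comparison sequence $x_i$ ($i = 1, 2, \ldots, p$) and the reference sequence $x_0$ at component $k$ is defined as
$$\vartheta_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|},$$
where $\rho \in (0, +\infty)$ is the resolution coefficient; the smaller $\rho$ is, the greater the resolution.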
Step 4. Calculate the relevance degree. The correlation factor $\vartheta_i(k)$ gives one value per component of the reference and comparison sequences, so there are many values and they are not easy to compare. Therefore, the correlation factors of each comparison sequence are condensed into a single value, called the gray correlation degree. The gray correlation degree between $x_0$ and $x_i$ ($i = 1, 2, \ldots, p$) is defined as the mean of the correlation factors over all components, $\frac{1}{m} \sum_{k=1}^{m} \vartheta_i(k)$.
Step 5. Sort by the gray correlation degree. Arrange the gray correlation degrees between the impact factors and water consumption in order of magnitude to form the correlation sequence, denoted as G. It reflects the degree of correlation between each impact factor and water consumption.
Step 6. Set the threshold and select features. Set an appropriate threshold and select the features that meet it. Then input them into the baseline methods for comparison with the UWM-Id proposed in this study.
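The GRA steps above can be illustrated with a short NumPy sketch. Several details here are assumptions rather than choices confirmed by the text: the dimensionless processing divides each sequence by its mean (one common option), the resolution coefficient defaults to 0.5, the function name `gray_relational_degrees` is hypothetical, and the two toy comparison sequences are made up for the example. The 0.8 threshold is the strong-correlation level used later in the paper.

```python
import numpy as np

def gray_relational_degrees(x0, X, rho=0.5):
    """x0: reference sequence, shape (m,). X: comparison sequences, shape (p, m)."""
    # Step 2: dimensionless processing by mean normalization (assumed variant).
    x0 = x0 / x0.mean()
    X = X / X.mean(axis=1, keepdims=True)
    # Step 3: absolute differences and the two-level minimum and maximum.
    delta = np.abs(X - x0)                       # shape (p, m)
    d_min, d_max = delta.min(), delta.max()
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    # Step 4: gray correlation degree is the mean coefficient per sequence.
    return coeff.mean(axis=1)

# Toy data (assumption): one factor proportional to the reference, one reversed.
x0 = np.array([1.0, 2.0, 3.0, 4.0])
X = np.array([[2.0, 4.0, 6.0, 8.0],
              [4.0, 3.0, 2.0, 1.0]])

degrees = gray_relational_degrees(x0, X)
# Steps 5 and 6: sort by degree and keep factors at or above the threshold.
ranking = np.argsort(-degrees)
selected = [int(i) for i in ranking if degrees[i] >= 0.8]
print(np.round(degrees, 4), selected)
```

In this toy case the proportional factor coincides with the reference after normalization and receives degree 1, while the reversed factor scores about 0.47 and is filtered out by the threshold.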

Model Integration.
The basic neural network predictor used by the proposed UWM-Id is LSTM. Refer to Figure 4 for the internal neuron structure of LSTM. LSTM addresses the vanishing gradient problem of the original recurrent neural network (RNN). It has strong time-series data processing capabilities and is widely used in time-series data modeling [43][44][45].
LSTM consists of an input gate, a forget gate, and an output gate. The input gate controls the input of information; the forget gate determines how much of the historical state information of the cell is retained; the output gate controls the output of information. The activation function $\sigma(\cdot)$, usually the sigmoid function, keeps the output of the forget gate within [0, 1]. When the output of the forget gate is 0, all information in the previous state is discarded; when it is 1, all information in the previous state is retained.
In the LSTM update equations, $f_t$ is the forget gate coefficient at timestamp $t$; $i_t$ is the input gate coefficient at timestamp $t$; $C'_t$ is the input data obtained through the tanh function; $C_t$ is the updated cell state at timestamp $t$; $o_t$ is the output gate coefficient; $h_t$ and $h_{t-1}$ are the output data at timestamps $t$ and $t-1$; $x_t$ is the input data at timestamp $t$; and $C_{t-1}$ is the cell state at timestamp $t-1$. For the neuron from timestamp $t-1$ to timestamp $t$, $W_f(\cdot)$ and $b_f$ are the weight function and bias of the forget gate, $W_i(\cdot)$ and $b_i$ are those of the input gate, $W_c(\cdot)$ and $b_c$ are those of the input data, and $W_o(\cdot)$ and $b_o$ are those of the output gate.
Each $P_i$ obtained from the PCA dimensionality reduction is input into an LSTM basic predictor. Weights are then assigned according to the prediction accuracy of each predictor, and the predictors are connected by a fully connected layer. The output is a linear transformation of $\varsigma_i(t)$:
$$y(t) = \sum_{i=1}^{3} w_i(t)\, \varsigma_i(t) + b(t),$$
where $w_i(t)$ and $b(t)$ are the weight vector and the bias vector of Model 1 to Model 3 at the $t$-th timestamp, respectively, and $y(t)$ denotes the output of the fully connected layer.
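To make the gate roles and the fusion step concrete, the following NumPy sketch runs one conventional LSTM cell over three short input sequences and combines the three outputs with a linear layer. It follows the standard LSTM formulation, which is assumed to match the symbols defined above. The input widths of the three factor groups, the random untrained weights, the scalar readout (mean of the last hidden state), and the fusion weights are all placeholders for illustration, not values from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W[g] has shape (hidden, hidden + inputs), b[g] has shape (hidden,)."""
    z = np.concatenate([h_prev, x_t])            # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])           # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])           # input gate
    c_cand = np.tanh(W["c"] @ z + b["c"])        # candidate input C'_t
    c_t = f_t * c_prev + i_t * c_cand            # updated cell state C_t
    o_t = sigmoid(W["o"] @ z + b["o"])           # output gate
    h_t = o_t * np.tanh(c_t)                     # output h_t
    return h_t, c_t

def run_lstm(seq, hidden, rng):
    """Run a randomly initialised (untrained) LSTM over seq of shape (T, n_in)."""
    n_in = seq.shape[1]
    W = {g: rng.normal(scale=0.1, size=(hidden, hidden + n_in)) for g in "fico"}
    b = {g: np.zeros(hidden) for g in "fico"}
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x_t in seq:
        h, c = lstm_step(x_t, h, c, W, b)
    return h

rng = np.random.default_rng(0)
T, hidden = 5, 4
# Hypothetical input widths for the three factor groups P1, P2, P3.
groups = [rng.normal(size=(T, n)) for n in (6, 5, 4)]
# Placeholder readout: each primary model emits the mean of its last hidden state.
outputs = np.array([run_lstm(g, hidden, rng).mean() for g in groups])
# Placeholder fusion weights and bias standing in for the fully connected layer.
w, bias = np.array([0.5, 0.3, 0.2]), 0.0
y = float(w @ outputs + bias)
print("primary outputs:", np.round(outputs, 4), "fused:", round(y, 4))
```

In a trained system the fusion weights and bias would be learned together with the LSTM parameters rather than fixed by hand as they are here.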
To avoid overfitting, the model is trained by minimizing a loss function that serves as the optimization objective. In this objective, $y_\kappa(t)$ and $y(t)$ are, respectively, the predicted values and the real values of urban water consumption, $\kappa$ denotes a conventional parameter in the gradient descent method, $\alpha$ refers to the learning rate, and $h_\kappa$ represents the internal function of the selected model. The principle of the optimization algorithm is to select data and calculate the difference between the value produced by the model and the actual value. If the difference is large, the parameter update is large, and vice versa. The optimization algorithm adopted here is stochastic gradient descent (SGD). Traditional gradient descent (GD) uses all the training data in each iteration, whereas SGD randomly selects only a subset of the training data to update the parameters in each iteration.
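The difference between GD and SGD described above can be seen in a small standard-library sketch. It fits a toy linear model rather than the LSTM used in the paper, and the squared-error loss, batch size, step count, and data are assumptions chosen only to show the update rule: each iteration draws a random subset, and the update size scales with the prediction error.

```python
import random

def sgd_linear(xs, ys, lr=0.1, batch_size=4, steps=5000, seed=0):
    """Fit y = w * x + b by minibatch SGD on the squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # SGD: use a random subset instead of the full training set.
        batch = rng.sample(range(n), batch_size)
        # Gradients are proportional to the error (prediction minus real value).
        grad_w = sum((w * xs[i] + b - ys[i]) * xs[i] for i in batch) / batch_size
        grad_b = sum((w * xs[i] + b - ys[i]) for i in batch) / batch_size
        # Large errors give large updates; small errors give small updates.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Noise-free toy data generated from y = 2x + 1 (assumption for the example).
xs = [i / 19 for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear(xs, ys)
print(round(w, 3), round(b, 3))
```

Replacing `batch` with all indices in every iteration would turn this loop into traditional full-batch gradient descent.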
This study uses the PCA method to process the matrix of water consumption impact factors. The eigenvalues of the principal components, their variance contribution rates, and their cumulative variance contribution rates obtained by PCA are shown in Table 2. This paper selects the principal components for which $\alpha(i) > 85\%$. Therefore, the eigenvectors corresponding to the first three principal components are selected to transform the original feature matrix for further load decomposition. It can be observed from Table 2
Figure 4: Internal neuron structure of the long short-term memory neural network model.
The impact factors contained in each principal component are shown in Figure 5. $P_1$ contains indicators such as population, total number of employed persons, and GDP, so it is defined as a socioeconomic factor. $P_2$ contains indicators such as precipitation and average relative humidity, so it is defined as a natural weather factor. $P_3$ includes electricity consumption and gas consumption, so it is defined as an energy factor.
GRA is used to obtain the gray correlation degree between total water consumption and each impact factor, and the factors are sorted by gray correlation degree, as shown in Figure 6. According to gray theory, a correlation degree greater than or equal to 0.8 indicates a strong correlation. Therefore, the top 9 correlation factors in Figure 6, namely $x_{14}$, $x_{11}$, $x_1$, $x_4$, $x_7$, $x_{15}$, $x_3$, $x_6$, and $x_{13}$, are selected as the input of the baseline models.

Experimental Settings.
This paper selects the mean absolute error (MAE) and root mean square error (RMSE) of the predictions as the evaluation measures:
$$\mathrm{MAE} = \frac{1}{N} \sum_{o=1}^{N} |y_o - y'_o|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{o=1}^{N} (y_o - y'_o)^2},$$
where $y_o$ is the real value of water consumption in year $o$, $y'_o$ is the predicted value of water consumption in year $o$, and $N$ is the total number of test samples in the water consumption prediction model. For both metrics, lower values represent better performance. To evaluate the performance of UWM-Id, several water consumption prediction models are selected as baseline methods. We compare the proposed UWM-Id with these baselines to verify whether integrating predictions from different perspectives yields good performance. The selected baselines are as follows:
(1) Random Forest (RF): a predictive model that integrates multiple decision trees through the idea of ensemble learning.
(2) Multilayer Perceptron (MLP): a feedforward ANN in which every neuron is fully connected.
(3) LSTM: a sequential neural network model specially designed to solve the long-term dependence problem.
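The two evaluation measures described in this section can be computed with a few lines of standard-library Python. The function names and the toy values below are illustrative assumptions and are not results from the paper.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error over paired real and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error over paired real and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Toy values (assumption): only the last prediction is off, by 2 units.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred))
```

Because RMSE squares each error before averaging, the single large miss weighs more heavily in RMSE (about 1.15) than in MAE (about 0.67), which is why the two measures are usually reported together.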
This experiment adopts the SGD optimizer to optimize the model, with a default learning rate ($\alpha$) of 0.005. During the experiment, we randomly split the dataset into a training set and a test set; the default ratio is 7 : 3.
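A random split at the default 7 : 3 ratio can be sketched as follows. The function name `random_split`, the fixed seed, and the ten placeholder samples are assumptions for illustration; the paper does not state how the split was implemented.

```python
import random

def random_split(samples, train_ratio=0.7, seed=0):
    """Shuffle sample indices and cut them into training and test subsets."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    cut = round(train_ratio * len(samples))      # number of training samples
    train = [samples[i] for i in idx[:cut]]
    test = [samples[i] for i in idx[cut:]]
    return train, test

# Ten placeholder samples standing in for yearly records (assumption).
samples = list(range(10))
train, test = random_split(samples)
print(len(train), len(test))
```

Passing `train_ratio=0.6` or `train_ratio=0.8` gives the 6 : 4 and 8 : 2 splits that are also examined in the experiments.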

Results and Analysis
First, we evaluate and compare the proposed UWM-Id with three baseline methods. Tables 3-5 list the model performance (MAE and RMSE) of UWM-Id and the baselines for training-to-test ratios of 6 : 4, 7 : 3, and 8 : 2, respectively. For each ratio, the learning rate is set to 0.01, 0.008, and 0.005 in turn. In these tables, each result cell shows the evaluation value and its rank among UWM-Id and the baseline methods. According to our calculations, the performance of the proposed UWM-Id model is improved by 40%, 33%, and 20% compared with RF, MLP, and LSTM, respectively. It can be directly observed from these tables that the proposed UWM-Id is better than the baselines in almost all cases. The baseline methods used for comparison can also predict urban water consumption well. In the few cases where UWM-Id does not achieve the best performance, the gap is small.
This phenomenon can be attributed to two main reasons. First, the proposed UWM-Id uses the PCA method to reduce and classify the impact factors of urban water consumption, while the other methods use the GRA method to reduce the dimensionality of the features. The advantage of PCA lies in its more accurate and fine-grained classification and representation of features. Second, the proposed UWM-Id adopts an integrated model. The three types of features produced by the PCA method are input into the LSTM models, which are then integrated by a weighted fully connected layer to capture the characteristic structure of water consumption from different perspectives.
In addition, we conducted another set of experiments to evaluate the stability of the proposed UWM-Id. In this set of experiments, UWM-Id was evaluated only for parameter sensitivity and was not compared with the baseline methods. The process is as follows: set a series of values for the two parameters in each parameter combination; input them into the UWM-Id model one by one; finally, test whether the experimental results are stable overall. Figures 7 and 8 show the MAE and RMSE values of UWM-Id, respectively, under different parameter settings. Each figure has three subgraphs, corresponding to the following parameter combinations: (a) training set ratio and optimizer, (b) learning rate and optimizer, and (c) learning rate and training set ratio. These subgraphs show that, after adjusting the learning rate, optimizer, and dataset split, the experimental results hardly change, which indicates the stability of the proposed UWM-Id. The primary reason for this is that UWM-Id integrates features from different perspectives of water consumption, making it less sensitive to parameter changes.
Based on the above two sets of experiments, the UWM-Id model was assessed from model performance and stability. Experimental results indicate that the UWM-Id proposed in this paper can effectively predict urban water consumption.

Conclusions
The urban water management system is a nonlinear, changeable process that is affected by many factors, and it is complicated to consider the influence of multiple factors at the same time. Moreover, there are few or no examples of managing urban water resources by analyzing urban water consumption from multiple perspectives. Therefore, this paper established an urban water management framework based on integrated deep neural networks (UWM-Id). This study took Chongqing, China, as an example and established a water management mechanism based on water consumption forecasts.
Through PCA, the influencing factors of water consumption are divided into three categories: the socioeconomic factor, the natural weather factor, and the energy factor. These three types of factors are then input into the UWM-Id established in this paper, and a series of experiments is conducted. The experimental results show that the proposed UWM-Id achieves better accuracy and stability than the baselines and can assist urban water management systems.

Data Availability
The source codes and datasets used to support the findings of this study are available from the submitting author upon request via email (zwguo@ctbu.edu.cn).

Conflicts of Interest
The authors declare that they have no conflicts of interest.