Prediction for the Inventory Management Chaotic Complexity System Based on the Deep Neural Network Algorithm



Introduction
Researchers have found chaos in physics, chemistry, ecology, geography, and economics data [1], and discrete nonlinear management systems have been widely studied [2][3][4][5][6][7][8]. The concept of chaotic strategic management dates back to 1983. In 1994, Feichtinger [4] studied chaotic planning, queuing, and scheduling in management operations. Murphy [5] used chaos to study public relations problems and crises. After reviewing chaos management research, Joseph [6] pointed out that chaos management requires a change in rules and adaptability [1].
The main purpose of inventory is to meet demand, so demand forecasting is the basic premise of inventory management. Boardman and others used a clustering algorithm to compare new and existing similar products and predict the sales volume of new products [9]. Van der Auweraer et al. utilized auxiliary installed base data to predict spare parts demand [10]. Yu et al. proposed a support vector machine (SVM) model to predict the newspaper demand of different stores by including 32 features in the model [11]. Shimmura and Takenaka used the SVM method to forecast the demand for convenience store inventory data by reducing the feature dimension and data quantity [12]. Tanizaki et al. used POS, Bayesian linear regression, and other methods to predict hotel passenger flow [13].
In the era of big data, the cost of acquiring, storing, and processing large amounts of data has fallen significantly. Decision makers can observe historical demand and acquire data such as weather, prices, holidays, promotion information, and demographic information to improve demand forecasting accuracy [14,15]. In recent years, the advantages of machine learning in processing large datasets and high-dimensional feature data have attracted the attention of scientists. The rapid increase in data has shifted prediction from traditional forecasting approaches to deep learning [16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31]. For example, Kong et al. used the restricted Boltzmann machine (RBM) algorithm based on deep learning to predict traffic flow; with phase space reconstruction, the RBM algorithm constructed a polymorphic long-term model of chaotic time series [17]. Wei and Wang proposed an anomaly detection method based on a deep hierarchical spatiotemporal feature learning network [18]. Zhang et al. used a residual neural network framework to model the temporal proximity, period, and trend characteristics of crowd flow [19]. Haq et al. [29] utilized the multilayer bidirectional LSTM algorithm to identify the mitochondrial proteins of the Plasmodium falciparum parasite. Khan et al. [30] used deep learning algorithms to predict residential and commercial energy consumption. Azar and Vaidyanathan [1] used a new deep learning algorithm to predict and analyze renewable energy power generation. However, complex inventory management is a typical nonlinear system: its time series is chaotic and nonlinear, with high complexity and small changes in amplitude. Traditional machine learning cannot predict such series accurately. Thus, finding a suitable deep learning algorithm for prediction is necessary.
The deep learning algorithms mentioned above can also be applied to other chaotic systems [32][33][34][35].
This paper aims to (1) analyze the nonlinear characteristics of inventory management using nonlinear dynamics theory and (2) verify the characteristics of the inventory data and forecast the inventory using the LSTM, bi-LSTM, and DLSTM algorithms.
This paper predicts inventory data under a complex chaotic system. The prediction results indicate that the bi-LSTM algorithm performs best on the chaotic nonlinear dataset and provide a reference for other chaotic datasets. The rest of this paper is organized as follows. Section 2 presents the chaotic inventory management system and the inventory data and examines the irregularity of the data with the 0-1 test. Section 3 introduces the prediction models: LSTM, bi-LSTM, CNN-LSTM, and DLSTM. Section 4 verifies these algorithms experimentally and identifies the best model by comparing three indexes. Finally, the results are summarised in Section 5.

Inventory Management Model.
Many enterprises face inventory problems, which can be represented as a complicated chaotic system of equations, system (1) [36], where $s$, $p$, $q$, and $r$ are the system parameters: $s$ represents the initial sales base, $p$ the inventory fund transfer rate, $q$ the product resource rate, and $r$ the inventory efficiency. $x_i$ represents the resources for sales in period $i$, $y_i$ the number of customers in period $i$, and $z_i$ the inventory capital of the company in period $i$. After normalizing the parameters of the inventory management model [36], the variables satisfy $0 < x_i < 1$, $0 < y_i < 1$, and $0 < z_i < 1/r$, with $p = 0.43$, $q = 0.38$, $s = 0.11$, and $r = 0.72$. The attractors of system (1) are shown in Figure 1.

0-1 Test.
This study implemented the 0-1 test to investigate whether the data are chaotic. He et al. used the 0-1 test algorithm for correlation analysis of the time series of a fractional-order system [8]. If $\varphi(n)$ ($n = 1, 2, 3, \ldots$) represents one-dimensional observable iterative data, then two real-valued functions $p(n)$ and $s(n)$ are defined from it [36], where $\theta(i) = i\omega + \sum_{j=1}^{i} \varphi(j)$; the trajectories are visualised in Figure 2. If the trajectory in Figure 2 is bounded, the data are regular; if the trajectory is unbounded and resembles Brownian motion, the data are chaotic. This method was used to study the $y$ and $z$ sequences of system (1). Its parameters were the same as those in Figure 2. The $p$-$s$ relationship is displayed in Figure 3. The inventory safety threshold changes irregularly as the stock of goods changes over time, so it cannot be accurately predicted by traditional algorithms [36].

LSTM Model.
The LSTM network improves on the RNN. RNN neurons are shown in Figure 3. A memory cell structure is added to the hidden layer of the RNN, which allows the model to retain information over a long time and effectively overcomes the problem of vanishing or exploding gradients [29]. LSTM introduces a memory cell structure in the hidden layer with three gate controllers, the input, forget, and output gates [37], allowing the network to forget historical information and update the memory state with new information. The structural diagram of LSTM neurons is shown in Figure 4.
The three gates adopt the sigmoid function, and all of them are nonlinear summation units that include the activation functions inside and outside the module. Multiplication operations are used to control the activation of the units. The calculation consists of the following steps: (1) calculate the value $f_t$ of the forget gate; (2) calculate the value of the input gate; (3) calculate the current memory cell state $C_t$; (4) calculate the output gate and the output $h_t$ of the LSTM unit. LSTM and RNN infer subsequent data from preceding information. Using both forward and backward information to predict the current time step strengthens the connection between the feature information and the predicted value and improves the model's prediction accuracy. Research shows that the LSTM network performs well in multivariate classification and prediction.
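The four calculation steps can be written out in the standard textbook form of the LSTM cell. This is a sketch under assumed notation rather than a reproduction of the original equations: $\sigma$ is the sigmoid function, $\odot$ is element-wise multiplication, $[h_{t-1}, x_t]$ is the concatenation of the previous output and the current input, and the weight matrices $W_f, W_i, W_C, W_o$ and biases $b_f, b_i, b_C, b_o$ are illustrative names.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{C}_t &= \tanh\left(W_C [h_{t-1}, x_t] + b_C\right) && \text{(candidate state)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(memory cell state)} \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(C_t) && \text{(unit output)}
\end{aligned}
```

The additive form of the $C_t$ update is what lets gradients pass through many time steps without repeatedly being squashed, which is the usual explanation for why the LSTM mitigates vanishing gradients.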

Bi-LSTM.
The LSTM prediction model predicts only from the data in one direction, so it cannot fully mine the temporal feature information, and its prediction accuracy needs further improvement. To address this deficiency, a bidirectional LSTM (bi-LSTM) prediction model is proposed. The structural diagram of bi-LSTM neurons is shown in Figure 5. Bi-LSTM [37] uses two independent LSTM models to process the data in the forward and backward directions. The outputs of the hidden layers of the two models are used as the input of the output layer, and the built-in function of the output layer produces the final predicted value.
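The bidirectional idea can be illustrated independently of the cell type. The sketch below is an assumption-laden toy: it uses a plain tanh recurrent cell as a stand-in for a full LSTM, and the names `run_rnn`, `bidirectional`, and `make_params` are hypothetical. It shows only how one pass reads the sequence forward, a second pass reads it backward, and the two sets of hidden states are aligned and concatenated before the output layer.

```python
import numpy as np

def run_rnn(xs, W_x, W_h, b):
    """Run a plain tanh recurrent cell over xs (T, d_in); return states (T, d_h)."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x in xs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return np.stack(states)

def bidirectional(xs, fwd_params, bwd_params):
    """Concatenate forward states with backward states re-aligned to forward time."""
    h_fwd = run_rnn(xs, *fwd_params)
    # The backward pass reads the reversed sequence; reversing its output
    # puts the state for time step t back at row t.
    h_bwd = run_rnn(xs[::-1], *bwd_params)[::-1]
    return np.concatenate([h_fwd, h_bwd], axis=1)

def make_params(rng, d_in, d_h):
    """Random weights and zero bias for one direction (illustrative only)."""
    return (rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h)), np.zeros(d_h))

rng = np.random.default_rng(0)
d_in, d_h, T = 1, 4, 6
fwd = make_params(rng, d_in, d_h)
bwd = make_params(rng, d_in, d_h)
xs = rng.normal(size=(T, d_in))
H = bidirectional(xs, fwd, bwd)   # shape (T, 2 * d_h)
```

Each row of `H` therefore carries information from both the past (first `d_h` columns) and the future (last `d_h` columns) of that time step.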
Bi-LSTM, based on the time window method, predicts the next time step from the historical values within a time window. The time window step parameter specifies how many historical values are used to predict the future value. For example, with a window of three, the current value $x_t$ and the previous values $x_{t-1}$ and $x_{t-2}$ are used to predict the value of the next period, $x_{t+1}$.
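The time window method amounts to slicing the series into overlapping input windows and one-step-ahead targets. A minimal sketch follows; the function name `make_windows` and the toy series are illustrative assumptions, not part of the original experiments.

```python
import numpy as np

def make_windows(series, window=3):
    """Split a 1-D series into (inputs, targets) for one-step-ahead prediction."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - window
    if n_samples <= 0:
        raise ValueError("series must be longer than the window")
    # Row i holds series[i], ..., series[i + window - 1]
    X = np.stack([series[i:i + window] for i in range(n_samples)])
    # The target for row i is the value that immediately follows the window
    y = series[window:]
    return X, y

X, y = make_windows([0.1, 0.2, 0.3, 0.4, 0.5], window=3)
# X rows: [0.1, 0.2, 0.3] and [0.2, 0.3, 0.4]; targets: 0.4 and 0.5
```

A recurrent model would typically receive `X` reshaped to (samples, window, 1) so that each window is read as a short sequence.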
Regularization helps avoid overfitting in prediction. The L1 and L2 regularization methods add a penalty for excessively large model parameters. The most widely used regularization technique in deep learning is dropout, which randomly deactivates some neurons. Each training pass is then equivalent to training a different weak classifier, which improves the model's generalization ability; the dropout method is therefore used to improve the model's applicability.
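Dropout itself is a short operation. The sketch below shows the common "inverted dropout" variant as an assumption about implementation detail (the original does not specify one): kept activations are rescaled during training so that no change is needed at prediction time. The function name `dropout` and the rate of 0.2 are illustrative.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of units while training."""
    if not training or rate == 0.0:
        return activations              # prediction time: use all units unchanged
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    # Rescale the kept units so the expected activation stays the same
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
h = np.ones((1000, 10))
h_train = dropout(h, rate=0.2, rng=rng)                  # about 20% of entries zeroed
h_eval = dropout(h, rate=0.2, rng=rng, training=False)   # unchanged
```

Because a different random mask is drawn on every call, each training pass effectively trains a different thinned network, which is the "different weak classifier" view described above.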
Khan et al. [38] proposed the hybrid network DB-Net, which combines a dilated convolutional neural network (DCNN) with a bidirectional long short-term memory (bi-LSTM) network. Sagheer and Kotb [39] put forward "CL-Net", based on a new hybrid structure of ConvLSTM and LSTM. All of these improve on the LSTM and bi-LSTM deep learning models.

CNN-LSTM.
A convolutional neural network (CNN) comprises five parts: the input layer, convolution layer, pooling layer, fully connected layer, and output layer. $X = [x_1, x_2, \ldots, x_n]$ is the input data matrix, where $n$ represents the length of the time series and $m$ represents the number of data features. The time-series data are convolved to give the feature mapping result $o_c$, obtained by applying the convolution layer activation function $f_c(\cdot)$ to the convolution $X \otimes W_c$ plus the bias $b_c$ of this layer, where $\otimes$ is the convolution operation, the convolution kernel $W_c \in \mathbb{R}^{j \times m}$ is the weight vector, and $j$ is the convolution kernel size. The pooling operation selects the most critical features of the convolution layer sequence to form the pooling layer. There are two kinds of pooling operations: maximum pooling and average pooling. Maximum pooling is the commonly used method, and global maximum pooling is used in the last pooling operation; $o_p(k)$ is the output of the $k$th pooling operation, and $o_p$ is the output of global maximum pooling. The temporal features are combined in the fully connected layer, where $W_d$ is the weight matrix of the fully connected layer, $b_d$ is the bias, and the activation function $f_d(\cdot)$ of the fully connected layer can be ReLU, tanh, or sigmoid.
The output layer produces the final result from the output of the fully connected layer, where $W_o$ is the weight matrix of the output layer, $b_o$ is the bias, and the activation function $f_o$ is the softmax function. CNN-LSTM is a combination of CNN and LSTM and is divided into four layers:
(1) Input layer: the data are input after normalization.
(2) CNN layer: this layer extracts the data features through the CNN, where the convolution layer and pooling layer extract the features that most clearly reflect the inventory changes and reduce overfitting. The fully connected layer summarises and outputs these features.
(3) LSTM layer: the extracted features are converted into the data format required by the LSTM, and time series data mining is carried out through the three gate mechanisms of the LSTM to obtain the internal change rule and the prediction model.
(4) Output layer: the activation function of the output layer is the sigmoid function, and the LSTM prediction result is the output.
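The convolution and pooling steps performed by the CNN layer can be made concrete with a small NumPy sketch. It is written under assumptions: ReLU is chosen as $f_c$, the convolution is the unpadded sliding dot product used by deep learning libraries (strictly a cross-correlation), and the function names and toy numbers are illustrative.

```python
import numpy as np

def conv1d_relu(X, W_c, b_c):
    """Slide a (j, m) kernel over an (n, m) series and apply ReLU as f_c."""
    n = X.shape[0]
    j = W_c.shape[0]
    # One output per kernel position: sum of element-wise products plus bias
    out = np.array([np.sum(X[t:t + j] * W_c) + b_c for t in range(n - j + 1)])
    return np.maximum(out, 0.0)

def global_max_pool(o_c):
    """Keep only the strongest response of the feature map."""
    return o_c.max()

X = np.array([[1.0], [2.0], [4.0], [3.0]])   # n = 4 time steps, m = 1 feature
W_c = np.array([[1.0], [-1.0]])               # kernel size j = 2
o_c = conv1d_relu(X, W_c, b_c=0.0)            # responses -1, -2, 1 clipped to 0, 0, 1
o_p = global_max_pool(o_c)                    # 1.0
```

In a CNN-LSTM, many such kernels run in parallel and their feature maps, rather than a single pooled number, are passed on to the LSTM layer as a sequence.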

DLSTM.
In the Deep LSTM (DLSTM) architecture, as shown in Figure 6 [40], the input at time $t$, $x_t$, is introduced to the first LSTM block along with the previous hidden state $s_{t-1}^{(1)}$, where the superscript (1) refers to the first LSTM. The hidden state at time $t$, $s_t^{(1)}$, is computed and moves forward to the next time step and up to the second LSTM block. The second LSTM uses the hidden state $s_t^{(1)}$ along with its previous hidden state $s_{t-1}^{(2)}$ to compute $s_t^{(2)}$, which goes forward to the next time step and up to the third LSTM block, and so on until the last LSTM block in the stack.
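The layer-to-layer flow described above can be summarised compactly. In the sketch below, $\mathrm{LSTM}^{(l)}$ is a hypothetical shorthand for the update performed by the $l$-th block and $L$ for the number of stacked blocks; neither symbol appears in the original, and the cell state carried inside each block is omitted for brevity.

```latex
\begin{aligned}
s_t^{(1)} &= \mathrm{LSTM}^{(1)}\left(x_t,\; s_{t-1}^{(1)}\right), \\
s_t^{(l)} &= \mathrm{LSTM}^{(l)}\left(s_t^{(l-1)},\; s_{t-1}^{(l)}\right), \qquad l = 2, \ldots, L .
\end{aligned}
```

The prediction at time $t$ is then read from the top block's state $s_t^{(L)}$.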
The benefit of such a stacked architecture is that each layer can process some part of the desired task and pass it on to the next layer until the last layer provides the output. Another benefit is that the architecture allows the hidden state at each level to operate differently. These two benefits have a significant impact in scenarios involving data with long-term dependencies or multivariate time series datasets.
The prediction results of bi-LSTM can be compared with those of LSTM. The LSTM model structure is itself relatively complex, and training is more time-consuming than for a CNN. By their nature, RNN networks cannot process the time steps of a sequence in parallel. Furthermore, although LSTM alleviates the long-term dependence problem of RNN to some extent, it still has difficulty with longer sequences.

Data Sources.
The experimental data in this paper come from dynamic equation (1). According to the definition of the state variables of dynamic system (1), the state variable $z$ is the inventory data. A total of 10000 samples of state $z$ of system (1) were selected, as shown in Figure 7; the first 7000 were used as the training set and the last 3000 as the test set. The analysis above showed that the inventory data are chaotic. To make full use of the temporal ordering of the data, this paper predicts and evaluates the inventory data and verifies the predictions against the actual data.
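The chaos check referred to above can be run on any scalar series. The following is a minimal sketch of the 0-1 test in the widely used simplified form attributed to Gottwald and Melbourne, with a fixed frequency $c$ and the correlation method. This is an assumption: it is not necessarily the exact variant of [36], whose phase $\theta$ also includes the running sum of the data. Since the equations of system (1) are not available here, the demonstration uses the logistic map and a sine wave as stand-in series; the function name `zero_one_test` and its parameters are illustrative.

```python
import numpy as np

def zero_one_test(phi, n_freq=50):
    """Return the median growth indicator K (near 0: regular, near 1: chaotic)."""
    phi = np.asarray(phi, dtype=float)
    phi = phi - phi.mean()               # centring removes a bounded oscillatory term
    N = len(phi)
    n_cut = N // 10                      # displacement lags n = 1 .. N/10
    j = np.arange(1, N + 1)
    lags = np.arange(1, n_cut + 1)
    ks = []
    for c in np.linspace(np.pi / 5, 4 * np.pi / 5, n_freq):
        p = np.cumsum(phi * np.cos(j * c))    # translation variable p(n)
        s = np.cumsum(phi * np.sin(j * c))    # translation variable s(n)
        # Mean square displacement of the (p, s) trajectory for each lag
        msd = np.array([
            np.mean((p[n:N - n_cut + n] - p[:N - n_cut]) ** 2
                    + (s[n:N - n_cut + n] - s[:N - n_cut]) ** 2)
            for n in lags
        ])
        # Linear growth of the displacement (Brownian-like motion) gives K near 1
        ks.append(np.corrcoef(lags, msd)[0, 1])
    return float(np.median(ks))

# Stand-in chaotic series: logistic map x -> 4x(1 - x)
x = np.empty(2000)
x[0] = 0.3
for i in range(len(x) - 1):
    x[i + 1] = 4.0 * x[i] * (1.0 - x[i])

k_chaotic = zero_one_test(x)
k_regular = zero_one_test(np.sin(0.3 * np.arange(2000)))   # periodic series
```

Applied to the $z$ sequence of system (1), a value of K close to 1 would correspond to the unbounded, Brownian-like $p$-$s$ trajectory that indicates chaos.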

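Evaluation Index and Model Parameters.
This paper used LSTM, bi-LSTM, GRU, CNN-LSTM, and other algorithmic models for prediction. To evaluate the effectiveness of these methods, the mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) were used. These indicators are defined as follows [19]:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}, \qquad \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|,$$

where $y_i$ is the observed inventory quantity, $\hat{y}_i$ is the forecast quantity of the inventory, and $N$ is the number of test samples.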
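The three indicators can be computed directly from arrays of observed and predicted values. The sketch below uses the standard definitions; the function name `evaluate` and the toy numbers are illustrative assumptions, not data from the experiments.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return MSE, RMSE, and MAE between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    return {
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),          # RMSE is the square root of MSE
        "MAE": float(np.mean(np.abs(err))),
    }

scores = evaluate([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
# errors are 0, -0.5, and 1.0, so MSE = 1.25 / 3 and MAE = 0.5
```

Since RMSE is the square root of MSE, reported pairs can be cross-checked; for example, an MSE of 0.005315 corresponds to an RMSE of about 0.0729.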
In this paper, LSTM, DLSTM, GRU, CNN-LSTM, and bi-LSTM algorithms were adopted, and the main parameter values in the algorithms are shown in Table 1.

Results
The inventory forecasting model first adopted the LSTM algorithm; the comparison between the predicted results and the actual values is shown in Figure 8. The change of the loss function over 50 training cycles is displayed in Figure 9. Figure 8 shows the last 150 data points of the test set so that readers can check the predicted and actual values. The MSE was 0.005315, the RMSE was 0.072905, and the MAE was 0.060346, so the prediction errors were quite small.
The comparison between the results predicted by the bi-LSTM algorithm and the actual values is shown in Figure 10. The change of the loss function over 50 training cycles is shown in Figure 11. Figure 10 shows the last 150 data points of the test set so that readers can check the predicted and actual values. The MSE was 0.001475, the RMSE was 0.038405, and the MAE was 0.029732, so the forecasting errors were small.

The inventory forecasting model next adopted the CNN-LSTM algorithm. The comparison between the predicted results and the actual values is shown in Figure 12. The change of the loss function over 50 training cycles is shown in Figure 12. Figure 12 shows the last 150 data points of the test set so that readers can check the predicted and actual values. The MSE was 0.027766, the RMSE was 0.166631, and the MAE was 0.117720, so the forecasting errors were relatively small.

The inventory forecasting model finally adopted the DLSTM algorithm. Figure 14 shows the last 150 data points of the test set so that readers can check the predicted and actual values. The MSE was 0.462163, the RMSE was 0.6798, and the MAE was 0.570947.

The evaluation indicators are compared in Table 2. The results obtained by bi-LSTM were the best, with the smallest error, although the other algorithms could also be used given their relatively small errors. Because the data fluctuation was not particularly large, DLSTM had no apparent advantage in this scenario. At the same time, we found no correlation between the complexity and the performance of the model. For example, the DLSTM algorithm is more complex but is not the best for inventory safety prediction.

There are often uncertain factors in the production process, such as sudden orders, temporary increases in consumption, sudden advances in delivery dates, late deliveries, and so on. Comparing the four algorithms above, we can see that the bi-LSTM algorithm accurately predicted the inventory capacity, which is of substantial value for enterprises making purchase and demand plans.

Conclusion
Excessive inventory capacity causes an inventory backlog, directly affecting the company's production efficiency. In this paper, we focused on the prediction of inventory capacity. We used an inventory management dynamical system to obtain 10000 inventory data points and trained and tested four deep learning prediction algorithms: LSTM, bi-LSTM, CNN-LSTM, and DLSTM. The prediction results showed that bi-LSTM performed best. This study contributes to the literature by comparing different forms of neural network for predicting dynamic, chaotic, nonlinear inventory management data. It also provides theoretical support for other predictions. The predicted results offer practical guidance for enterprises' production planning and for inventory officers when they decide on the optimal inventory of goods, and they reduce the likelihood of accidents due to excessive amounts of goods in warehouses. In future work, other algorithms, such as CNN-BiLSTM and CNN-DLSTM, as well as AutoML as per Li et al. [41,42], could be used to predict inventory and be compared with the four deep learning methods in this research.

Data Availability
The data used to support the findings of this study are available from the corresponding authors upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.