A Hybrid Big Data Analytics and Industry 4.0 Approach for Projecting The Cycle Time Range

Abstract: To enhance the effectiveness of projecting the cycle time range of a job in a factory, a hybrid big data analytics and Industry 4.0 (BD-I4) approach is proposed in this study. As a joint application of big data analytics and Industry 4.0, the BD-I4 approach is distinct from existing methods in this field. In the BD-I4 approach, first, each expert constructs a fuzzy deep neural network (FDNN) to project the cycle time range of a job, which is an application of big data analytics (i.e., deep learning). Subsequently, fuzzy weighted intersection (FWI) is applied to aggregate the cycle time ranges projected by experts to consider their unequal authority levels, which is an application of Industry 4.0 (i.e., artificial intelligence). After applying the BD-I4 approach to a real case, the experimental results showed that the proposed methodology improved the projection precision by up to 72%. This result implied that instead of relying on a single expert, seeking collaboration among multiple experts may be more effective and efficient.


Introduction
After a job is released into the factory, it may be delayed for a variety of reasons (e.g., waiting, machine downtime, and product quality problems), resulting in a long cycle time [1]. These delays are often unpredictable, which leads to uncertainty in the cycle time of a job. Nevertheless, accurate job cycle time forecasts are an input required for many managerial activities [2][3][4][5]. For this reason, various methods, such as fuzzy logic [6], artificial neural networks [6], case-based reasoning and expert systems [7], regression [6], data mining and big data analytics [8][9][10], production simulation [11], and hybrid methods [2][12][13][14], have been proposed to predict the cycle time of a job.
The experimental results of some past studies showed that it is not easy to accurately predict the cycle time of a job, especially in a large and complex factory [14][15][16][17][18]. Some researchers therefore tried to project the range of the cycle time of a job by constructing a confidence interval [6][10][11][15] or generating a fuzzy cycle time forecast [2][12][13][19], and then took some measures to narrow the range.
The ranges of cycle times should include actual values to maximize the hit rate and be as narrow as possible to minimize the average range. These two goals are not easy to balance. Blindly narrowing the average range often leads to a sharp decline in the hit rate. To address this issue, a hybrid big data analytics and Industry 4.0 (BD-I4) approach is proposed in this study. The BD-I4 approach combines deep learning and artificial intelligence, which are two critical subfields of big data analytics and Industry 4.0 [20][21][22][23][24].
In the proposed BD-I4 approach, a group of experts (or agents) collaborate to project the cycle time range of a job following the steps below. First, each expert constructs a fuzzy deep neural network (FDNN) to project the cycle time range of a job:
(1) The configurations of the FDNNs constructed by different experts may not be the same, because it is a challenging task to determine the optimal configuration of a FDNN. A deeper and larger FDNN increases the accuracy of cycle time forecasting for the training data, but is trained more slowly and may over-fit, resulting in a low hit rate for test data.
(2) The FDNN is constructed by fuzzifying the parameters of a deep neural network (DNN). The parameters fuzzified by different experts may not be the same.
Subsequently, the fuzzy weighted intersection (FWI) operator [25] is adopted to aggregate the cycle time ranges projected by all experts. Compared with existing aggregators such as fuzzy intersection (FI), FWI has the following advantages [26]:
(1) Sometimes experts have unequal authority levels, which can be considered by the FWI operator.
(2) If experts lack a consensus, the FI result is an empty set, but the FWI result may not be.
The contributions of the BD-I4 approach include:
(1) It is the combination and application of big data analytics and Industry 4.0 to projecting the range of the cycle time. Similar treatments have rarely been taken in the past.
(2) How to construct a precise FDNN by fuzzifying a DNN is investigated theoretically in this study.
(3) The BD-I4 approach is a general method that can be applied to other fields.
The organization of this research is outlined as follows. First, a review of past studies is presented in Section 2. Then, the proposed BD-I4 approach is detailed in Section 3. To illustrate the applicability of the BD-I4 approach and compare it with some existing methods, a real case study has been conducted. The results are summarized in Section 4. The conclusions of this research are given in Section 5.

Past Work
Deng et al. [27] proposed a fused FDNN network approach to classify images, in which two fuzzy neural networks (FNNs) were constructed. One FNN was a shallow adaptive network-based fuzzy inference system (ANFIS), while the other was a FDNN.
However, all parameters in the two FNNs were crisp, which is distinct from the BD-I4 approach proposed in this study. In addition, one FNN was shallow, while the other was deep. In contrast, all FDNNs in the BD-I4 approach are deep. Subsequently, the outputs from the two FNNs were aggregated by a crisp feedforward neural network.
Most FDNNs in the literature are of Takagi-Sugeno-Kang (TSK) type. For example, Rajurkar and Verma [28] constructed a three-layer FDNN, in which each node of the hidden layer represented a TSK fuzzy inference system (FIS). Therefore, the FDNN can be considered as an ensemble of several ANFISs.
Mudiyanselage et al. [29] constructed a FDNN to detect cancers. The FDNN was a deep ANFIS with seven layers. All parameters in the FDNN were crisp. However, the ANFIS was of TSK type, the rules in which may not be able to approximate the relationship between the cycle time of a job and its attributes. In addition, FISs or ANFISs cannot establish a range that definitely contains the actual value.
Liang et al. [30] proposed an artificial neural network ensemble approach to predict passenger demand. The FNN ensemble was composed of a (crisp) convolutional long short-term memory network (Conv-LSTM) and an ANFIS. The Conv-LSTM predicted passenger demand as a time series, while the ANFIS modelled the effects of external factors (such as weather, weekday, and day or night) on passenger demand. Similar to the previous references, all parameters in the two artificial neural networks were given in crisp values. The outputs from the two artificial neural networks were aggregated using a convolution method.
Qasem and Mohammadzadeh [31] constructed a deep learned type-2 FNN to identify a hyperchaotic system, predict chaotic time series, and predict the glucose level of a type-1 diabetes patient. The deep learned type-2 FNN was an ANFIS with seven layers. It was special because the output was mapped to an interval fuzzy number.
However, the output did not necessarily contain the actual value.
The novelties of the proposed BD-I4 approach are described as follows. First, the FDNN constructed in the BD-I4 approach is an FNN with nonlinear activation functions (i.e., sigmoid functions), which is distinct from the ANFISs constructed in the past studies.
In addition, all parameters of the FDNN are fuzzy values that are to be optimized to minimize the average range of fuzzy forecasts, while guaranteeing that every fuzzy forecast contains the actual value.

Methodology
The procedure for the BD-I4 approach comprises the following steps:
Step 1. (Expert) Configure (or re-configure) a DNN to predict the cycle time of a job according to his/her own preference.
Step 3. (Expert) Fuzzify the DNN to construct a FDNN based on his/her own preference.
Step 4. (Expert) Apply the FDNN to project the cycle time ranges of jobs.
Step 5. (Coordinator) Aggregate the projection results by all experts using the FWI operator.
Step 7. (Coordinator) If the projection precision is satisfactory, stop; otherwise, return to Step 1.
An activity diagram is provided in Fig. 1 to illustrate the procedure.

Projecting the cycle time range of a job using a FDNN
In the BD-I4 approach, experts construct FDNNs with different configurations to project the cycle time range of a job, as illustrated in Fig. 2. These FDNNs have four layers: the input layer, two hidden layers, and the output layer. In addition, fuzzified parameters in these FDNNs may be different.
In the first hidden layer, $\tilde{w}^{(1)}_{pl}(m)$ is the connection weight between input node p and first-hidden-layer node l; $\tilde{\theta}^{(1)}_{l}(m)$ is the threshold on first-hidden-layer node l; $\tilde{h}^{(1)}_{jl}(m)$ is the output from first-hidden-layer node l; and (−) denotes fuzzy subtraction.
Between the two hidden layers, the corresponding operations are performed, where $\tilde{w}^{(2)}_{lq}(m)$ is the connection weight between first-hidden-layer node l and second-hidden-layer node q, and $\tilde{\theta}^{(2)}_{q}(m)$ is the threshold on second-hidden-layer node q.
Outputs from the second hidden layer are aggregated on the output node, and the result is then outputted as $\tilde{o}_{j}(m)$, where $\tilde{w}^{o}_{q}(m)$ is the connection weight between second-hidden-layer node q and the output node, and $\tilde{\theta}^{o}(m)$ is the threshold on the output node.
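The layer-by-layer operations described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: every node is assumed to apply a sigmoid to its weighted input minus its threshold, only the threshold on the output node is fuzzified (as a triangular fuzzy number given by its lower bound, core, and upper bound), and all names and parameter values (`dense`, `fdnn_forward`, `w1`, `th_out`, and so on) are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic activation, the nonlinearity named in the text."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, thresholds):
    """Crisp layer: node k outputs sigmoid(sum_i weights[i][k] * inputs[i] - thresholds[k])."""
    outputs = []
    for k, threshold in enumerate(thresholds):
        net = sum(row[k] * x for row, x in zip(weights, inputs))
        outputs.append(sigmoid(net - threshold))
    return outputs


def fdnn_forward(x, w1, th1, w2, th2, w_out, th_out):
    """Return (lower, core, upper) of the network output for one job.

    th_out is an assumed triangular fuzzy threshold (lower, core, upper);
    all other parameters are kept crisp in this sketch.
    """
    h1 = dense(x, w1, th1)
    h2 = dense(h1, w2, th2)
    net = sum(w * h for w, h in zip(w_out, h2))
    th_lo, th_core, th_hi = th_out
    # Fuzzy subtraction: the lowest output pairs with the highest threshold.
    return sigmoid(net - th_hi), sigmoid(net - th_core), sigmoid(net - th_lo)


# Hypothetical parameters: 2 inputs and 2 nodes in each hidden layer.
x = [0.3, 0.7]
w1 = [[0.5, -0.2], [0.1, 0.4]]
th1 = [0.0, 0.1]
w2 = [[0.3, 0.6], [-0.4, 0.2]]
th2 = [0.05, -0.1]
w_out = [0.8, -0.5]
th_out = (-0.2, 0.0, 0.3)

lower, core, upper = fdnn_forward(x, w1, th1, w2, th2, w_out, th_out)
print(lower, core, upper)
```

Because the sigmoid is monotonic, widening the fuzzy threshold widens the projected output range, which is why the bounds of the threshold have to be chosen carefully.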

Training the FDNN
The training of the FDNN is composed of two stages. First, the cores of fuzzy parameters are derived by training the FDNN as a crisp DNN using the Levenberg-Marquardt (LM) algorithm, so as to minimize the mean squared error (MSE). The optimal solution is indicated with an asterisk. Subsequently, the lower and upper bounds of fuzzy parameters are determined, so as to minimize the average range (AR(m)) of fuzzy cycle time forecasts while ensuring that every fuzzy forecast contains the actual value. However, deriving the optimal values of all fuzzy parameters at the same time is a nonlinear programming (NLP) problem [13] that is computationally intensive. As an alternative, the optimal values of fuzzy parameters are derived separately as follows.

Deriving the optimal value of $\tilde{\theta}^{o}(m)$
$\tilde{\theta}^{o}(m)$ is the parameter closest to the output node, fuzzifying which is the most effective way to estimate the range of the output. The core of $\tilde{\theta}^{o}(m)$ has been obtained. The lower and upper bounds of $\tilde{\theta}^{o}(m)$ can be derived according to the following theorems.

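Proof.
Substituting Equation (9) into Equation (8) gives the upper bound of the output, $o_{j3}(m)$, in terms of the fuzzy threshold on the output node. The other parameters are not fuzzified, so the input aggregated on the output node is a fixed value. To minimize AR(m), $o_{j3}(m)$ should be as low as possible, while it must still be no lower than the actual value. Substituting Equation (13) into Constraint (17) gives the limit that this requirement places on the bound of $\tilde{\theta}^{o}(m)$. Theorem 1 is proved.
Proof.
The required proof is similar to that of Theorem 1.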
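The reasoning behind the two theorems can be illustrated with a short Python sketch. It is not a reproduction of the theorems: it assumes that the output of job j is sigmoid(I_j − θ), where I_j is the fixed input aggregated on the output node, that actual values are normalized into (0, 1), and that the fuzzy threshold is a triangular fuzzy number. The function name `fuzzify_output_threshold` and the sample data are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(a):
    """Inverse of the sigmoid, defined for 0 < a < 1."""
    return math.log(a / (1.0 - a))


def fuzzify_output_threshold(net_inputs, actuals, core):
    """Return (lower, core, upper) of the output threshold.

    Under the assumed model, the upper output bound sigmoid(I_j - lower)
    covers the actual value a_j exactly when lower <= I_j - logit(a_j),
    and the lower output bound sigmoid(I_j - upper) stays below a_j exactly
    when upper >= I_j - logit(a_j). The tightest choice over all jobs is
    therefore the min and the max of I_j - logit(a_j).
    """
    slack = [i - logit(a) for i, a in zip(net_inputs, actuals)]
    lower = min(core, min(slack))  # keep lower <= core
    upper = max(core, max(slack))  # keep core <= upper
    return lower, core, upper


# Hypothetical training data: aggregated inputs and normalized actual values.
net_inputs = [0.4, 1.1, -0.3]
actuals = [0.55, 0.80, 0.35]
th_lo, th_core, th_hi = fuzzify_output_threshold(net_inputs, actuals, core=0.1)

for i, a in zip(net_inputs, actuals):
    print(sigmoid(i - th_hi), a, sigmoid(i - th_lo))
```

Each printed triple shows the lower bound, the actual value, and the upper bound of one job; at least one job touches each bound, which is what makes the range as narrow as the inclusion requirement allows.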

Deriving the optimal value of $\tilde{w}^{o}_{q}(m)$
The next closest parameter to the output node is the connection weight between a second-hidden-layer node and the output node, i.e., $\tilde{w}^{o}_{q}(m)$.
Proof.
According to Equation (7), $\tilde{w}^{o}_{q}(m)$ may be negative. Substituting Equation (22) into Equation (13) gives the output in terms of this weight. Only $\tilde{w}^{o}_{q}(m)$ is fuzzified; the other parameters are set to their optimized cores. Substituting Equation (24) into Constraint (17) gives the condition under which every fuzzy forecast contains the actual value. Widening the fuzzy weight beyond what this condition requires only widens the output range. Theorem 3 is proved.
Proof.
The required proof is similar to that of Theorem 3.
If $\tilde{\theta}^{o}(m)$ and $\tilde{w}^{o}_{q}(m)$ are fuzzified sequentially, then the following theorem holds.
The required proof is straightforward since the other variables in Equations (21) and (29) are not affected by the fuzzification of $\tilde{\theta}^{o}(m)$.
Other parameters can be fuzzified in similar ways. However, these parameters are far from the output node. Fuzzifying these parameters has very little effect on the range of the output. In the proposed methodology, if an expert fuzzifies multiple parameters, these parameters are fuzzified sequentially according to their closeness to the output node.

Aggregating the projected cycle time ranges using FWI
The cycle time ranges projected by all experts are aggregated using FWI [25], with a membership function in which $\omega_m$ is the authority level of expert m. The FWI result meets the requirements specified in [25].
Aggregating the cycle time ranges projected by experts using FWI has the following advantages over existing aggregation methods such as FI and partial-consensus FI (PCFI):
(1) The FWI result exists even if experts lack an overall consensus.
(2) Some experts are more authoritative than others, which can be considered by applying FWI.
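The two advantages listed above can be illustrated with a small Python sketch. The weighting formula used here is an assumption made for illustration and may differ from the operator defined in [25]: it starts from the minimum membership (the plain FI) and adds each expert's excess membership scaled by how far that expert's authority level exceeds the lowest one, so that it reduces to FI when all authority levels are equal. The function names and the three triangular ranges are hypothetical; the authority levels 0.5, 0.3, and 0.2 are those used in the case study.

```python
def tri_membership(x, tfn):
    """Membership of x in a triangular fuzzy number (lower, core, upper)."""
    lo, core, hi = tfn
    if x <= lo or x >= hi:
        return 0.0
    if x <= core:
        return (x - lo) / (core - lo)
    return (hi - x) / (hi - core)


def fi_membership(x, tfns):
    """Plain fuzzy intersection: the minimum membership over all experts."""
    return min(tri_membership(x, t) for t in tfns)


def fwi_membership(x, tfns, weights):
    """Assumed weighted intersection (illustrative, not necessarily that of [25])."""
    mus = [tri_membership(x, t) for t in tfns]
    mu_min = min(mus)
    w_min = min(weights)
    if w_min >= 1.0:  # a single expert holding all the authority
        return mu_min
    return mu_min + sum(
        (w - w_min) / (1.0 - w_min) * (mu - mu_min)
        for w, mu in zip(weights, mus)
    )


# Hypothetical ranges (hrs) from three experts without an overall consensus:
# the first and third ranges do not overlap at all.
ranges = [
    (900.0, 1000.0, 1100.0),
    (1050.0, 1150.0, 1250.0),
    (1200.0, 1300.0, 1400.0),
]
authority = [0.5, 0.3, 0.2]

fi_value = fi_membership(1000.0, ranges)
fwi_value = fwi_membership(1000.0, ranges, authority)
print(fi_value, fwi_value)
```

In this example the FI membership is zero everywhere because the experts share no common value, whereas the weighted result remains positive around the range of the most authoritative expert.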

Background
To evaluate the effectiveness of the proposed methodology, it has been applied to a wafer fabrication factory (wafer fab) to project the cycle time ranges of jobs. Although advanced forecasting methods have been proposed in recent studies to predict the cycle time of a job in a wafer fab, the forecasting accuracy, in terms of mean absolute percentage error (MAPE), was seldom less than 10% [16,18]. For this reason, the range of the cycle time of a job needed to be projected. Three experts (an industrial engineering manager, a production planning engineer, and a management information system engineer) applied the proposed methodology to fulfill this task in a collaborative manner.
The cycle time of a job was predicted by fitting the relationship between six parameters (job size, fab work-in-process (WIP), the queue length before bottlenecks, the queue length on the processing route, the average waiting time of recently completed jobs, and fab utilization) and the cycle time. Therefore, P = 6. The data of 120 jobs have been collected. Among them, the data of the first two-thirds of the jobs (80 jobs) were adopted to train the FDNN, while the remaining data (40 jobs) were left for evaluating the forecasting performance.

Application of the proposed methodology
At first, experts configured FDNNs according to their own preferences, which are summarized in Table 1. MATLAB R2017a was applied to construct these FDNNs on a PC with an i7-7700 CPU of 3.6 GHz and 8 GB of RAM.
The training results are shown in Fig. 3. These DNNs predicted the cycle times of jobs very accurately for the training data. The forecasting accuracy, measured in terms of mean absolute error (MAE), MAPE, and root mean squared error (RMSE), is presented in Table 2. However, when the DNNs were applied to test data, the forecasting accuracy was not satisfactory, showing the need for projecting the range of the cycle time.
The projection precision achieved by these experts is compared in Table 3. It was noteworthy that although Expert #1 optimized the forecasting accuracy in terms of MAE, MAPE, and RMSE, his projection precision was inferior to that of Expert #2. In addition, although Expert #3 achieved a poor performance in minimizing AR, he elevated the hit rate to 95% for test data. These facts showed the necessity for aggregating the projection results by all experts.
The aggregation result was much narrower than the original three ranges. This result supported our belief that instead of relying on a single expert who fuzzified all parameters, seeking multiple experts to fuzzify selected parameters and then collaborate may be more efficient and effective. However, experts had unequal authority levels in this collaborative forecasting task. In particular, the industrial engineering manager had a higher authority level than the other experts. For this reason, FWI was applied instead, for which the authority levels of experts were set to 0.5, 0.3, and 0.2, respectively.
Taking Job #88 as an example, the aggregation result using FWI is shown in Fig. 7. The results are provided in Fig. 8. Obviously, the width of the aggregation result was narrower when FI or FWI was applied. The aggregation results for all jobs in test data are summarized in Fig. 9. The average range was 429.9 (hrs), and the hit rate was 87.5%.

Comparison with existing methods
To further demonstrate the effectiveness of the proposed methodology, its projection performance was compared with those of four existing methods: the 6σ confidence interval method [33], the fuzzy linear regression (FLR)-quadratic programming (QP) method [24], the FBPN method [2], and the selectively fuzzified FBPN approach [34].
In the 6σ confidence interval method, the lower bound on the cycle time was established by subtracting three times the standard error from the core, while the upper bound was established by adding the same allowance to the core. The standard error of forecasting was approximated by a function of RMSE and V, where V is the number of parameters in the cycle time forecasting method. In this way, the cycle time range was always six times the standard error of forecasting. Although such a range was wide, it did not necessarily contain the actual value [10]. In addition, the cycle time range determined in this way was symmetric, which might be impractical [19].
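The construction of this symmetric range can be written down directly. The sketch below takes the standard error as a given input, because the function of RMSE used to approximate it is not reproduced here; the function name `six_sigma_range` and the numerical values are hypothetical.

```python
def six_sigma_range(core, standard_error):
    """Return (lower, upper) bounds placed three standard errors around the core."""
    allowance = 3.0 * standard_error
    return core - allowance, core + allowance


# Hypothetical forecast of 1200 hrs with a standard error of 50 hrs.
lower, upper = six_sigma_range(1200.0, 50.0)
print(lower, upper, upper - lower)
```

The width of the range depends only on the standard error and not on the individual job, which is one reason such a range can be wide and still miss the actual value.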
In the FLR-QP method, the cycle time of a job was predicted using a FLR function of the six factors, the support of which was used to project the cycle time range. In theory, there are various ways to fit a FLR equation. In the FLR-QP method, a QP problem was solved. In the FBPN method, a FBPN was constructed and trained to project the cycle time range of a job. However, only the threshold on the output node was fuzzified to simplify the required computation.
In the selectively fuzzified FBPN method, both thresholds on the hidden and output layers of the FBPN were fuzzified, which further narrowed the cycle time range. To this end, a NLP problem was solved, with the aid of a random search and local optimization algorithm. The random search and local optimization algorithm first randomized the values of thresholds on the hidden layer, and then optimized the threshold on the output node.
The performances of projecting the cycle time range using various methods are compared in Table 5. To combine AR and hit rate, the cost for inclusion (CFI) [35] was also calculated for each method:
$\text{CFI} = \dfrac{\text{AR}}{\text{Hit Rate}}$ (45)
According to experimental results, the following discussion was made:
(1) The case of Expert #2 showed that a more complicated (or deeper) DNN did not necessarily result in better forecasting accuracy for test data.
(2) The two traditional methods based on probability theory or statistics, the 6σ confidence interval method and FLR-QP, generated very wide ranges of cycle times. In contrast, the fuzzified artificial neural network methods, FBPN and the selectively fuzzified FBPN method, were more effective in shrinking the average range. However, their performances in maximizing the hit rate were not satisfactory.
(3) The proposed methodology overcame the difficulty of existing fuzzified artificial neural network methods by elevating the hit rate to a sufficiently high level.
(4) In addition, the proposed methodology minimized CFI, showing its ability to elevate the hit rate without considerably increasing the average range. The advantage was up to 72% over the 6σ confidence interval method.
(5) For comparison, FI was also applied to aggregate the cycle time ranges projected by experts. The results are shown in Fig. 10. Although AR was reduced to a very low level, 213.2 (hrs), the hit rate was only 57.5%. Therefore, aggregating the projected cycle time ranges using FWI seemed to be a more promising treatment.
Fig. 9. Aggregation results using FI.
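The three performance measures used in this comparison can be computed as in the following Python sketch. It assumes that the hit rate is expressed as a fraction between 0 and 1 and that a hit means the actual value lies within the closed range; the function names and the four sample jobs are hypothetical, while the figures 429.9 (hrs) and 87.5% are the FWI results reported above.

```python
def hit_rate(ranges, actuals):
    """Fraction of jobs whose actual cycle time lies inside the projected range."""
    hits = sum(1 for (lo, hi), a in zip(ranges, actuals) if lo <= a <= hi)
    return hits / len(actuals)


def average_range(ranges):
    """Mean width of the projected ranges."""
    return sum(hi - lo for lo, hi in ranges) / len(ranges)


def cost_for_inclusion(ar, hr):
    """CFI as the average range divided by the hit rate (hit rate as a fraction)."""
    return ar / hr


# Hypothetical projected ranges and actual cycle times (hrs) for four jobs.
ranges = [(1000.0, 1400.0), (900.0, 1300.0), (1100.0, 1500.0), (1200.0, 1700.0)]
actuals = [1250.0, 1350.0, 1120.0, 1600.0]

hr = hit_rate(ranges, actuals)
ar = average_range(ranges)
print(hr, ar, cost_for_inclusion(ar, hr))

# CFI of the FWI aggregation result reported in the text.
cfi_fwi = cost_for_inclusion(429.9, 0.875)
print(round(cfi_fwi, 1))
```

Dividing the average range by the hit rate penalizes a method that narrows its ranges by simply missing more actual values.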

Conclusion and Future Research Directions
The BD-I4 approach is proposed in this study to enhance the effectiveness of projecting the cycle time range of a job, which is a critical task when facing the difficulty of accurately predicting the cycle time of the job. The proposed methodology is a joint application of big data analytics and Industry 4.0, which makes it distinct from existing methods in this field. In the BD-I4 approach, first, each expert constructs a FDNN to project the cycle time range of a job. Subsequently, FWI is applied to aggregate the cycle time ranges projected by experts to consider their unequal authority levels.
The BD-I4 approach has been applied to a real case to evaluate its effectiveness and compare it with several existing methods. According to the experimental results:
(1) The BD-I4 approach effectively tightened the ranges of fuzzy cycle time forecasts, while achieving a very high hit rate for test data.
(2) FWI was shown to be a more effective aggregator than FI, which led to a very low hit rate. In addition, the unequal authority levels of experts could be reasonably reflected in the aggregation result using FWI.
The authority levels are subjectively assigned in FWI. In future studies, other aggregators can be devised to aggregate the cycle time ranges projected by experts with unequal authority levels. In addition, other types of deep-learning neural networks can be constructed to project the cycle time range of a job.

Declarations
Acknowledgment: This work was supported by the Ministry of Science and Technology of Taiwan.
Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflicts of interest.

Availability of data and materials:
There is no original data associated with this review.
Code availability: Not applicable.
Ethics approval: Not applicable.
Writing-original draft: Toly Chen and Yu-Cheng Wang; writing-review and editing: Toly Chen and Yu-Cheng Wang.