Reliability Analysis of Intelligent Manufacturing Systems Based on Improved FMEA Combined with Machine Learning

With the rapid growth of intelligent manufacturing, the reliability management of intelligent manufacturing systems is becoming increasingly important. Failure mode and effects analysis (FMEA) is a proactive reliability management tool widely used to manage failure modes of systems, products, processes, and services in various industries. However, the conventional FMEA method has been criticized for its inherent limitations. Therefore, this paper devises a method based on an improved FMEA model combined with machine learning for complex systems and applies it to the reliability management of intelligent manufacturing systems. A structured network of failure modes is constructed based on the knowledge graph for intelligent manufacturing systems. Grey relational analysis (GRA) is applied to determine the risk prioritization of failure modes, and clustering analysis is then employed to extract the features of failure modes. The results show that the proposed method reflects the coupling relationship between failure modes more accurately than the conventional FMEA method. This research supports the reliability and risk management of complex systems such as intelligent manufacturing systems.


Introduction
The upsurge of the global intelligent manufacturing revolution has promoted the continuous deepening of the integration of advanced manufacturing technology and new-generation information technology [1]. Countries around the world are actively participating in this revolution and have formulated relevant strategic plans, such as the National Strategic Plan for Advanced Manufacturing in the US, Industry 4.0 in Germany, New Industrial France, and The Future of Manufacturing: A New Era of Opportunity and Challenge for the UK. In response to the reshaping of the international industrial pattern and the challenges confronting the Chinese traditional manufacturing industry, China put forward the Made in China 2025 plan in 2015. In recent years, China has continuously issued relevant policies, laws, and regulations to support manufacturing enterprises and promote their transformation and upgrading, so as to accelerate the transition from a large manufacturing country to a manufacturing powerhouse. Intelligent manufacturing systems are crucial to realizing the digitalization, networking, and automation of the manufacturing industry. Because of their complexity, the risks of intelligent manufacturing systems are difficult to manage. Therefore, the reliability analysis of intelligent manufacturing systems is of great significance to the high-quality development of intelligent manufacturing, and many scholars have conducted related research. For instance, He et al. [2] proposed an integrated predictive maintenance (PdM) strategy to improve the mission reliability of manufacturing systems and the quality of products. Wan et al. [3] developed an extended FMEA, based on rough sets and TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution), to evaluate risks in the intelligent manufacturing process, and introduced environmental impacts as one of the risk factors. Chen et al. [4] established a reliability evaluation model for multistate intelligent manufacturing systems based on operational quality data.
Identifying and evaluating the risks of failures is essential to improving maintenance strategy and management. This is even more important in complex engineering systems [5]. To achieve this, several techniques such as failure mode and effects analysis (FMEA), fault tree analysis (FTA), reliability block diagram (RBD), Monte-Carlo simulation (MCS), Markov analysis (MA), and Bayesian networks (BN) have been developed and applied [6]. Among them, FMEA is a structured and proactive reliability management technique used to enhance the safety and reliability of systems, products, processes, and services [7].
So far, FMEA is still one of the most valuable and effective reliability analysis methods used in various industries [7, ]. Arabian-Hoseynabadi et al. [9] verified that the FMEA method had the potential to improve the reliability of wind turbine (WT) systems, especially in the offshore environment. Li and Zhou [10] used FMEA to construct a reliability analysis method for urban gas transmission and distribution systems. Tazi et al. [11] developed a hybrid cost-FMEA analysis for the reliability analysis of wind turbine systems. Wang et al. [12] presented a new FMEA model combined with the house of reliability (HoR) and the rough VIsekriterijumska optimizacija i KOmpromisno Resenje (VIKOR) approach and demonstrated the effectiveness of the model for the transmission system of a vertical machining center.
In the conventional FMEA method, the risk ranking of each failure mode is determined by the risk priority number (RPN), which is calculated by multiplying the values of the risk factors severity (S), occurrence (O), and detection (D) [13]. However, the conventional FMEA method has inherent limitations, such as ignoring the weights of the risk factors, producing identical RPN values for failure modes whose risks differ, and lacking a scientific basis for the calculation of the RPN [8, 14-16]. To overcome these deficiencies, various methods have been used to improve the FMEA model. On the one hand, the weights of the risk factors are considered in the risk analysis. Subjective weighting methods such as the analytic hierarchy process (AHP) [7], objective weighting methods such as the entropy weight method [17], and comprehensive weighting methods [18][19][20] assign different weights to the risk factors. On the other hand, many approaches have been applied to increase the reliability and rationality of risk prioritization. Bian et al. [21] proposed a new risk prioritization model based on D numbers and TOPSIS to evaluate the risk in FMEA. Kumar et al. [22] applied fuzzy FMEA and fuzzy logic with the grey relational approach (GRA) to rank the identified failure modes. Baghery et al. [23] prioritized manufacturing processes based on process failure mode and effects analysis (PFMEA), interval data envelopment analysis, and grey relational analysis. Tian et al. [24] established an integrated fuzzy MCDM approach for FMEA, in which a fuzzy VIKOR approach was employed to obtain the risk priorities of failure modes. Huang et al. [8] integrated probabilistic linguistic term sets and the TODIM (an acronym in Portuguese for interactive multi-criteria decision making) method to evaluate and prioritize the risk of failure modes. Because of its powerful data processing capabilities, machine learning has also become one of the directions for improving the conventional FMEA method. For example, Ku et al. [25] proposed a BPN-based FMEA system (N-FMEA) for failure mode classification and reliability calculation. Keskin and Ozkan [26] introduced Fuzzy Adaptive Resonance Theory (Fuzzy ART), which was developed for clustering problems in artificial neural networks. Jomthanachai et al. [27] integrated DEA and machine learning for risk assessment.
When conducting reliability analysis of large-scale systems, the large number of failure modes and their complex relationships make it worthwhile to build a structured network of failure modes. The knowledge graph is a structured semantic network used to represent the relationships between entities, with strong capabilities for visualization and knowledge reasoning. It provides semantically structured information that is interpretable by computers, which is regarded as an important ingredient for building more intelligent machines [28]. Applying the knowledge graph to failure mode and effects analysis can greatly promote intelligent reliability management. Compared with the conventional FMEA method, the FMEA method based on the knowledge graph has significant advantages in forming a failure mode knowledge base, fault reasoning, fault range analysis, and multi-level fault analysis.
Inspired by the aforementioned discussions, this paper explores a reliability analysis method based on improved FMEA combined with machine learning for intelligent manufacturing systems.
The main contributions of the paper are as follows.
(1) The knowledge graph of failure modes of intelligent manufacturing systems is constructed, which is of great value for establishing the structured network of failure modes and the knowledge base of intelligent manufacturing systems. In addition, the knowledge reasoning and knowledge retrieval abilities of the knowledge graph provide guidance and reference for evaluating and preventing failure modes.
(2) Combining grey relational analysis (GRA) and K-means clustering, an improved FMEA model is established. The improved model reflects the risk prioritization of failure modes more reasonably than the conventional FMEA method and provides a theoretical basis for the prevention and monitoring of failure modes of complex systems such as intelligent manufacturing systems.
The remainder of this paper is organized as follows. In Section 2, a reliability analysis model utilizing knowledge graph theory, GRA, and machine learning is developed for prioritizing and classifying failure modes by improving FMEA. In Section 3, the improved FMEA approach is applied to the reliability analysis of intelligent manufacturing systems. In Section 4, the results are discussed and suggestions are made for the risk prevention and monitoring of intelligent manufacturing systems. Finally, concluding remarks and further research proposals are presented in Section 5.

The proposed method
In this section, we propose a new reliability analysis method for FMEA based on the knowledge graph, GRA, and machine learning. The proposed method consists of three phases: evaluating the risk of failure modes, determining the risk prioritization of failure modes, and extracting the features of failure modes. The method is described in detail in the following sections.

Evaluate the risk of failure modes
To evaluate the risk of failure modes, it is necessary to identify the potential failure modes, determine the standard of the evaluation linguistic terms, and organize an FMEA team for the evaluation. Therefore, a structured network of failure modes based on the knowledge graph is proposed in this phase.
Step 1. Identify the potential failure modes and construct the knowledge graph
The construction process of the knowledge graph includes five key technical modules: knowledge extraction, knowledge representation, knowledge fusion, knowledge reasoning, and knowledge storage, which structure scattered data and integrate them into a complete knowledge base. In the early stage, the knowledge graph of failure modes depends, to a certain extent, on experts' subjective judgment. Consequently, experts are required to work manually to form training sets for information extraction, information processing, and information fusion. The architecture of the knowledge graph of failure modes is shown in Figure 1.
Experts evaluate the failure modes on a five-point Likert scale and establish the FMEA evaluation matrix $F = (f_{ij})_{m \times n}$, where $f_{ij}$ is the rating of the i-th failure mode on the j-th risk factor. The five levels of linguistic terms correspond to 1, 3, 5, 7, and 9 points, respectively. Each failure mode is rated on three risk factors: S, O, and D. The linguistic terms for rating failure modes are shown in Table 1.
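As a minimal sketch of how linguistic ratings could be turned into the numerical matrix F, the snippet below maps five terms to 1, 3, 5, 7, and 9 and combines several experts' ratings by a weighted average. The term labels, the `evaluation_matrix` helper, the sample ratings, and the weighted-average aggregation are all assumptions made for illustration; the actual terms are those of Table 1.

```python
# Hypothetical labels; the actual linguistic terms are defined in Table 1.
SCALE = {"very low": 1, "low": 3, "medium": 5, "high": 7, "very high": 9}


def evaluation_matrix(ratings, expert_weights):
    """Build F with one row per failure mode and columns (S, O, D).

    ratings[e][i] is expert e's (S, O, D) triple of linguistic terms for
    failure mode i. expert_weights are assumed to sum to 1.
    ASSUMPTION: expert opinions are combined by a weighted average.
    """
    n_modes = len(ratings[0])
    matrix = []
    for i in range(n_modes):
        row = []
        for j in range(3):  # 0 = S, 1 = O, 2 = D
            score = sum(
                w * SCALE[ratings[e][i][j]]
                for e, w in enumerate(expert_weights)
            )
            row.append(round(score, 4))
        matrix.append(row)
    return matrix


# Illustrative ratings for two failure modes from three experts.
ratings = [
    [("high", "low", "medium"), ("very high", "medium", "low")],       # expert A
    [("high", "medium", "medium"), ("high", "medium", "low")],         # expert B
    [("very high", "low", "high"), ("very high", "low", "very low")],  # expert C
]
F = evaluation_matrix(ratings, expert_weights=[0.3, 0.3, 0.4])
print(F)
```

Each row of `F` then plays the role of one failure mode's (S, O, D) ratings in the later steps.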

Determine the risk prioritization of failure modes
In this phase, the weights of the risk factors are taken into consideration, and GRA is used to rank the risk of failure modes. GRA is a multi-factor statistical analysis method that usually takes an uncertain system as its research object and quantitatively describes the changing trend of the system. It can greatly reduce the analysis difficulties caused by unclear and missing information and is often used to improve the ranking accuracy in FMEA.

Step 3. Calculate the weights of risk factors by AHP
The concrete procedure of the AHP method [19] is summarized as follows.
(1) Establish the judgment matrix
Each expert of the FMEA team compares the importance of the three risk factors S, O, and D, and establishes a judgment matrix. The judgment matrix of the k-th expert is $A^k = (a_{ij}^k)_{n \times n}$ $(k = 1, 2, \dots, l)$, in which each pair of factors is compared using a numerical rating. $a_{ij}^k$ represents the relative importance of the i-th risk factor over the j-th risk factor, and $a_{ji}^k = 1 / a_{ij}^k$.
(2) Calculate the consistency ratio (CR)
CR is the ratio between the matrix's consistency index and the random index. It indicates the probability that the matrix judgments were randomly generated and in general ranges from 0 to 1. A CR of 0.1 or less is considered acceptable. Otherwise, the judgments are untrustworthy and the judgment matrix needs to be reconstructed. CR is defined as:
$$CR = \frac{CI}{RI} \tag{1}$$
where RI is the random consistency index, which depends on the dimension of the matrix and is obtained by table look-up; when $n = 3$, $RI = 0.52$. CI is the consistency index and can be expressed as:
$$CI = \frac{\lambda_{\max} - n}{n - 1} \tag{2}$$
where $\lambda_{\max}$ is the largest (principal) eigenvalue of the matrix and $n$ is the order of the matrix.
(3) Obtain the weight vector
Through normalization, the weights of the risk factors based on the k-th expert's opinion, $w_j^k$, are obtained by Eq. (3), where $i, j = 1, 2, \dots, n$ and $k = 1, 2, \dots, l$. The weights of the risk factors obtained by combining the opinions of the experts, $w_j$, are calculated by Eq. (4), where $j = 1, 2, \dots, n$. The weight vector of risk factors S, O, and D is given by Eq. (5).
Step 4. Calculate the risk score of failure modes by GRA
GRA is adopted as a tool for risk prioritization, and the specific description is given below.
(1) Set the reference sequence and the comparative sequences
In the first stage, the values in the FMEA evaluation matrix for each failure mode are processed into comparability sequences. The reference sequence, which indicates the ideal state, is denoted $x_0 = (x_0(1), x_0(2), \dots, x_0(n))$, and the i-th failure mode is expressed as the comparative sequence $x_i = (x_i(1), x_i(2), \dots, x_i(n))$. The matrix entries must first be normalized by non-dimensional treatment. Because a lower risk priority means a more reliable failure mode, each risk factor is treated as a cost criterion [29]. The normalization is defined by Eq. (6), where $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$.
(2) Calculate the grey relational coefficient for each failure mode
Based on the normalized matrix, the grey relational coefficient is constructed by Eq. (7), where $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$. The coefficient measures how close $x_i(j)$ is to $x_0(j)$: the larger the coefficient, the closer $x_i(j)$ is to $x_0(j)$. The distinguishing coefficient $\zeta \in (0, 1)$ is usually set to 0.5; it affects the relative value of risk without changing the priority.
(3) Calculate the grey relational grade
The grey relational grade $\gamma_{0i}$ of the i-th failure mode is calculated by Eq. (8). The larger the value of $\gamma_{0i}$, the higher the risk priority of the failure mode.
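The three GRA stages above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's exact formulas: the cost-type min-max normalization, the textbook (Deng) form of the grey relational coefficient, and the weighted sum used for the grade are all assumed, and the function name and sample matrix are hypothetical. Only the weight vector (0.4900, 0.2698, 0.2402) and the default 0.5 for the distinguishing coefficient come from the paper.

```python
def grey_relational_grades(F, weights, zeta=0.5):
    """Return one weighted grey relational grade per row (failure mode) of F."""
    m, n = len(F), len(F[0])
    lo = [min(row[j] for row in F) for j in range(n)]
    hi = [max(row[j] for row in F) for j in range(n)]

    # ASSUMPTION: cost-type min-max normalization, so a higher raw rating
    # gives a smaller normalized value. The paper's Eq. (6) may differ.
    X = [
        [(hi[j] - row[j]) / ((hi[j] - lo[j]) or 1.0) for j in range(n)]
        for row in F
    ]

    # Reference sequence: column-wise minimum of the normalized matrix,
    # i.e. the highest-risk value observed for each risk factor.
    x0 = [min(row[j] for row in X) for j in range(n)]

    # Absolute differences between each comparative sequence and x0.
    delta = [[abs(x0[j] - row[j]) for j in range(n)] for row in X]
    d_min = min(min(r) for r in delta)
    d_max = max(max(r) for r in delta)
    if d_max == 0:  # every failure mode has identical ratings
        return [sum(weights)] * m

    # ASSUMPTION: textbook grey relational coefficient.
    coef = [
        [(d_min + zeta * d_max) / (d + zeta * d_max) for d in r]
        for r in delta
    ]
    # ASSUMPTION: grade = weighted sum of the coefficients.
    return [sum(w * c for w, c in zip(weights, r)) for r in coef]


# Hypothetical (S, O, D) ratings for three failure modes.
F_demo = [[9, 7, 7], [3, 3, 5], [5, 9, 3]]
grades = grey_relational_grades(F_demo, weights=(0.4900, 0.2698, 0.2402))
ranking = sorted(range(len(grades)), key=lambda i: grades[i], reverse=True)
print(grades, ranking)
```

A larger grade means the failure mode is closer to the highest-risk reference sequence, so `ranking` lists row indices from highest to lowest risk priority.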

Extract the features of failure modes
Failure modes arise from different causes. In addition to identifying failure modes, we hope to carry out targeted maintenance according to the different types of quality problems. Therefore, the classification and feature extraction of failure modes is the basis of this research. In this phase, we use the K-means clustering algorithm to classify failure modes, based on the RPN calculated by the conventional FMEA and the grey relational grade obtained in the previous step.
K-means is a well-known unsupervised machine learning algorithm for clustering problems. It discovers the cluster structure in a data set such that similarity is greatest within each cluster and dissimilarity is greatest between clusters [30]. The K-means method is often applied to the cluster analysis of scattered points in a two-dimensional coordinate system, which makes it suitable for the classification of failure modes and provides a theoretical basis for fault maintenance and continuous improvement.
Step 5. Classify failure modes by the K-means algorithm
(1) Normalize the RPN calculated by the conventional FMEA and the grey relational grade obtained in the previous step to form m two-dimensional data points representing the failure modes.
(3) Let $U = \{u_1, u_2, \dots, u_s\}$ be the initial centroids of the clusters, chosen by random selection.
According to the K-means clustering results, the failure modes are divided into s categories, and corresponding measures for monitoring and preventing failure modes can be put forward based on the features of each category.
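For orientation, the clustering step can be sketched with a plain implementation of the standard K-means (Lloyd) iteration on two-dimensional points. The helper names, the optional `init` argument, and the sample RPN and grade values are hypothetical; the paper itself reports using MATLAB, so this is an illustration rather than the authors' code.

```python
import random


def min_max(values):
    """Scale a list of numbers to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values]


def kmeans(points, s, init=None, seed=0, max_iter=100):
    """Cluster 2-D points into s groups with the standard K-means iteration.

    If init is None, the initial centroids are drawn at random from the
    points (random selection, as in the text); passing init makes a run
    reproducible.
    """
    rng = random.Random(seed)
    centroids = list(init) if init is not None else rng.sample(points, s)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(
                range(s),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(s):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centroids


# Hypothetical RPN values and grey relational grades for six failure modes.
rpn_values = [10, 20, 150, 160, 400, 420]
grade_values = [0.40, 0.42, 0.60, 0.63, 0.85, 0.90]
points = list(zip(min_max(rpn_values), min_max(grade_values)))
labels, centroids = kmeans(points, s=3, seed=1)
print(labels)
```

Because random initialization can converge to different partitions, it is common to repeat the run with several seeds and keep the partition with the smallest within-cluster distance.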

Implementation
In this section, the model proposed in the previous part is utilized to analyze the reliability of the intelligent manufacturing systems, including constructing the knowledge graph and evaluating failure modes, grey relational ranking of failure modes, and K-means clustering analysis.
Generally, a complete intelligent manufacturing system consists of multiple subsystems, which belong to different system dimension layers. With reference to the intelligent manufacturing standardization system, a 5-tier architecture is established, consisting of the network layer, enterprise layer, management layer, control layer, and equipment layer.
(1) The network layer refers to the data information network based on Ethernet, which can realize the information interaction between enterprises and the data transmission and storage within the enterprise.
(2) The enterprise layer refers to the management and operation system built by the enterprise itself under the network layer. It is the most comprehensive and central functional layer in the enterprise, including subsystems such as enterprise resource planning (ERP), supply chain management (SCM), and customer relationship management (CRM).
(3) The management layer, as the link between the layers above and below it, realizes the transition from enterprise management to workshop production. It is mainly composed of subsystems that control the overall production of the enterprise, including the manufacturing execution system (MES), product lifecycle management (PLM), etc.
(4) The control layer is the functional layer that realizes production in specific workshops. It is also one of the main characteristics that distinguish intelligent manufacturing systems from conventional manufacturing. It includes subsystems such as supervisory control and data acquisition (SCADA), the distributed control system (DCS), the programmable logic controller (PLC), etc.
(5) The equipment layer refers to the frontline production workshop units, including a series of intelligent production equipment, which can most intuitively reflect the intelligence and informatization of the production process.
Based on the division of functional layers above, combined with relevant literature and field investigations, a total of 26 failure modes of intelligent manufacturing systems are determined in this paper. The specific failure modes and their causes are shown in Table 2. In intelligent manufacturing systems, the failure corresponding to each failure mode may result from a combination of causes in different subsystems. Therefore, combining the literature and actual conditions, this paper summarizes four types of failure causes: human error, design defect, configuration defect, and force majeure. These four failure causes cover most of the defects and problems that may occur in the actual production process. Among them, force majeure includes not only natural accidents and disasters but also some unavoidable situations, such as changes in the relationship between supply and demand. As long as an enterprise operates in the macro environment of the market, the relationship between supply and demand will exist and continue to change, which is difficult to avoid or eliminate. The classification of failure causes is shown in Table 3.
Through the knowledge graph, the location and relevant information of each failure mode in the entire intelligent manufacturing system can be traced. Furthermore, the knowledge graph provides semantic search and semantic inference. In the knowledge base, users can retrieve failure modes by keyword and infer possible coupling relationships with other failure modes at the superior, subordinate, or peer level. The broader and more accurate the knowledge base, the stronger the reasoning ability of the knowledge graph. The failure mode network can serve as the basis of the knowledge base and greatly improve the efficiency of knowledge reuse. For instance, for $FM_{21}$ (inaccurate equipment condition monitoring), the most likely failure cause lies in the testing equipment or the system itself. According to the knowledge graph of failure modes, some failure modes at the network layer, such as $FM_5$ (transmission failure of the communication line), may appear as superior-level failure modes of $FM_{21}$.
At present, the application of the knowledge graph in intelligent manufacturing systems still faces the following obstacles.
(1) There is a lack of relevant databases in the field of intelligent manufacturing. Knowledge graph technology requires a large amount of labeled data to construct training sets. However, intelligent manufacturing is an emerging field. Because of the confidentiality of databases in the intelligent manufacturing field, the particularity of intelligent manufacturing enterprises, and the lack of relevant industry standards, databases often cannot be effectively integrated, which makes it difficult to provide a good training database for the application of the knowledge graph.
(2) The advantages of knowledge graph theory applied to intelligent manufacturing systems are not yet clear enough. The knowledge graph has powerful processing capabilities for huge data sets, which reflects its potential for applications in the field of intelligent manufacturing. However, it is not yet clear how to apply knowledge graph technology to the design, assembly, manufacturing, and other processes of intelligent manufacturing systems, or how effective it is for improving those processes.
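The keyword retrieval and superior/subordinate inference described earlier in this section can be illustrated with a very small in-memory graph. This is only a sketch: a production knowledge graph would normally live in a graph database, the `HYP1` node is purely hypothetical, and the link from FM5 to FM21 is one reading of the example in the text rather than a confirmed edge of the paper's graph.

```python
from collections import deque

# Failure-mode names; FM5 and FM21 are taken from the text, HYP1 is a
# hypothetical node added only to show multi-level inference.
NAMES = {
    "FM5": "transmission failure of the communication line",
    "FM21": "inaccurate equipment condition monitoring",
    "HYP1": "hypothetical downstream failure mode",
}

# (superior, subordinate) pairs. ASSUMPTION: illustrative edges only.
SUPERIOR_OF = [("FM5", "FM21"), ("FM21", "HYP1")]


def search(keyword):
    """Return the IDs of failure modes whose name contains the keyword."""
    key = keyword.lower()
    return sorted(fm for fm, name in NAMES.items() if key in name.lower())


def reachable(start, downstream=True):
    """Collect all subordinate (or, if downstream=False, superior) modes."""
    graph = {}
    for sup, sub in SUPERIOR_OF:
        src, dst = (sup, sub) if downstream else (sub, sup)
        graph.setdefault(src, []).append(dst)
    seen, queue = set(), deque([start])
    while queue:  # breadth-first traversal
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(search("monitoring"))
print(reachable("FM21", downstream=False))
print(reachable("FM5"))
```

The same traversal supports fault range analysis: the set returned for a failure mode is the part of the network that it can affect, or be affected by.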
An FMEA team consisting of three experts, A, B, and C, was organized to undertake the risk evaluation. A is an intelligent manufacturing system implementation consultant, B is an enterprise informatization consultant, and C is a university expert in intelligent manufacturing system reliability. In view of the experts' different knowledge backgrounds and professional fields, distinct weights are allocated to them to reflect their importance in the FMEA process, i.e., $\lambda = (0.3, 0.3, 0.4)$.
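Given expert weights such as (0.3, 0.3, 0.4), the AHP weighting of Step 3 can be sketched as follows. The three judgment matrices are hypothetical, the per-expert weights are taken as the normalized principal eigenvector (a common AHP choice that may differ from the paper's normalization), and the experts' weight vectors are combined by a weighted sum, which is also an assumption. Only RI = 0.52 for n = 3, the 0.1 threshold for CR, and the expert weights come from the paper.

```python
RI = {3: 0.52}  # random index for order 3, as stated in the paper


def ahp_weights(A, iterations=100):
    """Return (weights, CR) for a reciprocal judgment matrix A.

    ASSUMPTION: weights are the normalized principal eigenvector,
    approximated here by power iteration.
    """
    n = len(A)
    w = [1.0 / n] * n
    for _ in range(iterations):
        Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(Aw)
        w = [v / total for v in Aw]
    # Estimate the principal eigenvalue from A w = lambda_max * w.
    Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam_max = sum(Aw[i] / w[i] for i in range(n)) / n
    ci = (lam_max - n) / (n - 1)  # consistency index
    cr = ci / RI[n]               # consistency ratio
    return w, cr


def combine(weight_vectors, expert_weights):
    """ASSUMPTION: combine the experts' weight vectors by a weighted sum."""
    n = len(weight_vectors[0])
    return [
        sum(lam * w[j] for lam, w in zip(expert_weights, weight_vectors))
        for j in range(n)
    ]


# Hypothetical judgment matrices over (S, O, D) for experts A, B, and C.
judgments = [
    [[1, 2, 2], [1 / 2, 1, 1], [1 / 2, 1, 1]],
    [[1, 3, 2], [1 / 3, 1, 1 / 2], [1 / 2, 2, 1]],
    [[1, 2, 3], [1 / 2, 1, 2], [1 / 3, 1 / 2, 1]],
]
results = [ahp_weights(A) for A in judgments]
for w_k, cr_k in results:
    # A CR above 0.1 would mean the judgment matrix must be rebuilt.
    print([round(v, 4) for v in w_k], round(cr_k, 4), cr_k <= 0.1)

w = combine([w_k for w_k, _ in results], expert_weights=(0.3, 0.3, 0.4))
print([round(v, 4) for v in w])
```

The first hypothetical matrix is perfectly consistent, so its CR is essentially zero; the other two contain mild inconsistencies but stay well under the 0.1 threshold.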
The weight vector of risk factors S, O, and D is calculated using Eqs. (3)-(5), and the result is $w = (0.4900, 0.2698, 0.2402)$. The linguistic evaluations of the failure modes by the FMEA team members are shown in Table 4. The evaluation results can be transformed into a numerical FMEA evaluation matrix $F = (f_{ij})_{26 \times 3}$.
TABLE 4: Linguistic evaluations on failure modes by the FMEA team members.
Combining the linguistic evaluations in Table 4 with the expert weights yields the FMEA evaluation matrix shown in Table 5. The matrix is normalized using Eq. (6). In FMEA, the smaller the risk factor value in the normalized matrix, the larger the risk of the failure mode. Thus, the reference sequence consists of the minimum value of each risk factor in the normalized matrix, which is (0.1183, 0.0818, 0.1231). The grey relational grades are calculated using Eqs. (7)-(8). The results and the comparison of risk rankings between the RPN method and the GRA method are shown in Table 6.
In practice, failure modes are generally divided into three categories: low risk, moderate risk, and high risk. Therefore, let s = 3. On the basis of the above data, K-means clustering analysis programmed in MATLAB yields the clustering results shown in Figure 3 and Table 7.

Comparison and discussion
According to Table 6, the comparison of failure mode rankings between the conventional FMEA method and the improved FMEA method is shown in Figure 4. The RPN calculated by the conventional FMEA method has duplicate values. For example, $FM_{12}$ and $FM_{20}$ are both ranked third and $FM_3$ and $FM_{13}$ are both ranked fifth, which makes it difficult to judge the specific risk priority of each failure mode and to give suggestions on reliability management. This is because the conventional FMEA method does not consider the weights of the risk factors, and because the RPN is a product of three ratings, different rating combinations easily yield the same value. In contrast, the improved FMEA method combined with AHP, GRA, and K-means clustering reflects the risk prioritization of failure modes more precisely and assigns a higher risk priority to the failure modes that can propagate to other functional layers.
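The duplication problem is easy to reproduce. In the sketch below, the four failure modes and their ratings are hypothetical; three of them have different (S, O, D) profiles but the same product. The weighted sum at the end is only a simple stand-in to show that weighting the factors separates the ties. It is not the GRA procedure used in this paper, although it reuses the paper's weight vector.

```python
from collections import defaultdict


def rpn(s, o, d):
    """Conventional risk priority number: the product of S, O, and D."""
    return s * o * d


def tied_groups(ratings):
    """Group failure modes that share the same RPN value."""
    groups = defaultdict(list)
    for mode, (s, o, d) in ratings.items():
        groups[rpn(s, o, d)].append(mode)
    return {value: modes for value, modes in groups.items() if len(modes) > 1}


# Hypothetical (S, O, D) ratings on the 1-3-5-7-9 scale.
ratings = {"A": (9, 1, 5), "B": (5, 3, 3), "C": (3, 3, 5), "D": (7, 7, 7)}
ties = tied_groups(ratings)
print(ties)

# Stand-in for a weighted method (NOT the paper's GRA): a weighted sum
# with the risk-factor weights reported in the paper.
weights = (0.4900, 0.2698, 0.2402)
weighted = {
    mode: sum(w * r for w, r in zip(weights, sod))
    for mode, sod in ratings.items()
}
print(sorted(weighted, key=weighted.get, reverse=True))
```

Modes A, B, and C all have an RPN of 45 although A is far more severe; once severity carries the largest weight, A is clearly separated from B and C.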
From the perspective of risk priority, cluster 1 contains the failure modes with low RPN values and low grey relational grades; that is, the risk priority of these failure modes is low in both the conventional and the improved FMEA method. From the perspective of risk factors, the occurrence and detection ratings of the failure modes in cluster 1 are, on the whole, relatively low. From the perspective of the structured network of failure modes, the failure modes in cluster 1 are distributed across all functional layers of the intelligent manufacturing system, mainly in the downstream functional layers. From the perspective of risk causes, the failure modes in cluster 1 are mostly caused by equipment failures and network failures, and most of them involve human error. For intelligent manufacturing systems, human error is avoidable. With the dynamic development of intelligent manufacturing systems, human error will show a decreasing trend. $FM_9$, $FM_{25}$, and $FM_{26}$ are caused only by force majeure, and they can be prevented by taking precautions in advance. In general, this type of failure mode has a relatively small impact on the reliability of intelligent manufacturing systems.
In cluster 2, from the perspective of risk priority, the rankings by RPN and by grey relational grade are both at the medium level, or one is high and the other is low. From the perspective of risk factors, the severity (S) and occurrence (O) ratings of the failure modes in cluster 2 are relatively high. From the perspective of the structured network of failure modes, the failure modes in cluster 2 are mostly in the upstream position of the failure mode network. They affect not only their own functional layer but also the downstream functional layers, causing overall function loss and downtime of the intelligent manufacturing system. This is difficult to reflect in the conventional FMEA method and is the main reason for the difference in risk priority between the conventional and the improved FMEA method.
From the perspective of risk priority, cluster 3 contains the failure modes with high RPN values and high grey relational grades. From the perspective of risk factors, the failure modes in cluster 3 have high values of S, O, and D, indicating that they have a strong impact on intelligent manufacturing systems. Meanwhile, most of them are failure modes of the enterprise layer, the management layer, and the control layer, which easily cause bidirectional diffusion of failure modes and require more resources for prevention and monitoring.
In conclusion, the improved FMEA method can reflect the fact that the impact of failure modes changes dynamically with the development of intelligent manufacturing systems. In addition, the improved FMEA method assigns a higher risk priority to the failure modes that affect other functional layers and then cause subsequent failures. The improved FMEA method is therefore more suitable for the reliability analysis of intelligent manufacturing systems than the conventional FMEA method, and its results are more reasonable and closer to reality.

Failure modes in cluster 1
The risk priority of the failure modes in cluster 1 is not high; they can be detected in time, and a series of control measures can be taken. Such failure modes are often within the scope of the existing monitoring system and have a relatively sound prevention mechanism. Enterprises only need to improve the existing prevention and control mechanism, and there is no need to invest additional resources to prevent and control the failure modes in cluster 1.
Among them, the failure causes of $FM_{14}$, $FM_{17}$, $FM_{19}$, and $FM_{24}$ all include human error. Therefore, it is necessary to strengthen the process review mechanism and the relevant business training of employees in the various departments and to adopt a more comprehensive staff management mechanism. As the management and operation levels of intelligent manufacturing systems continue to improve, the requirements for relevant personnel will also increase, and the severity, occurrence, and detection ratings of such failure modes will continue to decline. In addition, because failure modes caused by force majeure are uncontrollable and difficult to predict, intelligent manufacturing enterprises need to formulate complete emergency plans and always pay attention to relevant early warning information, so that the emergency plans can be activated and production can be restored in an orderly manner.

Failure modes in cluster 2
A significant common feature of the failure modes in cluster 2 is that most of them are upstream in the structured network of failure modes. This type of failure mode often leads to a series of subsequent failure modes and has a serious impact on the integrity of intelligent manufacturing systems. The suggestion for such failure modes is to focus on prevention and control, carry out feedback treatment as soon as possible, and minimize the impact before the failure modes propagate to the whole system. Therefore, it is necessary to establish a complete predictive monitoring system and a maintenance system that can respond to such failure modes in a timely manner. As the field of intelligent manufacturing matures, this type of failure mode will gradually move to cluster 1, that is, to controllable risks that intelligent manufacturing systems can supervise automatically.
Among them, $FM_2$, $FM_4$, and $FM_6$ are all caused by system design defects. It is necessary to optimize the relevant system design and improve the control system, which will help enterprises minimize the impact of failure modes in time. Two failure modes, including $FM_8$, are affected by the macro environment of the market. Real-time changes in the market make these two failure modes unavoidable and difficult to predict, so it is difficult to establish a monitoring mechanism. The better strategy is therefore to enhance the flexibility and resilience of the enterprise itself so that it can adjust production in time to adapt to changes in market supply and demand. $FM_{11}$ and $FM_{23}$ are related to human error, and their impact can be reduced by establishing a complete information verification system.

Failure modes in cluster 3
The failure modes in cluster 3 have the greatest impact on intelligent manufacturing systems, and enterprises should invest the most resources to prevent them. Intelligent manufacturing systems contain a huge number and variety of related equipment, and each piece of equipment operates in close coordination with the others. Once a certain link goes wrong, it will halt current production and, at worst, affect the function of the whole intelligent manufacturing system. $FM_3$, $FM_7$, $FM_{10}$, $FM_{12}$, $FM_{13}$, $FM_{15}$, $FM_{16}$, $FM_{18}$, and $FM_{20}$ all involve important links in the internal operation of intelligent manufacturing systems. For these failure modes, it is necessary to establish an early warning and monitoring system to find failures in time and to formulate a complete feedback mechanism. When a fault occurs, it should be repaired in time to restore the function before irreparable losses are caused, so as to ensure the continuous operation of the whole intelligent manufacturing system.

Conclusion
Taking the transformative development of the manufacturing field as an entry point, this paper proposes an improved FMEA method, combines it with machine learning techniques for data processing, and applies it to the reliability analysis of intelligent manufacturing systems.
In this study, we group the failure modes into three clusters. The results show that, for intelligent manufacturing systems, failure modes in cluster 1 have a low risk priority. They are mostly caused by equipment failures and network failures, and most of them involve human error, a factor that will decline as intelligent manufacturing systems develop. Enterprises only need to improve their existing prevention and control mechanisms; there is no need to invest additional resources in reliability management.
Failure modes in cluster 2 have a high severity index and often sit upstream in the failure mode network. Their occurrence radiates to the downstream functional layers, resulting in loss of overall function and downtime of the intelligent manufacturing systems. As intelligent manufacturing systems mature, failure modes of this type will gradually move to cluster 1, that is, they will become controllable risks that the systems can supervise automatically. Failure modes in cluster 3 involve important links in the internal operation of the intelligent manufacturing systems. They have a strong impact on the systems, and more resources need to be invested in their prevention and monitoring.
Compared with the conventional model, the improved FMEA can reflect the fact that the impact of failure modes changes dynamically as intelligent manufacturing systems develop. Moreover, the improved FMEA assigns a higher risk priority to failure modes that can radiate to other functional layers, which is more in line with reality. In addition, the greatest advantage of machine learning lies in its ability to process large and complex databases. Although the knowledge base of failure modes of intelligent manufacturing established in this paper is small, a complete structured network will form as the knowledge base grows, and the improved FMEA method will then be able to bring its data-processing strengths into full play.
In summary, the proposed method is effective and robust for the reliability management of complex systems, including intelligent manufacturing systems. In future research, we will further consider how to build a knowledge base that can describe more complex relationships between failure modes, and how to establish a more rigorous classification logic for failure modes in reliability analysis.

FIGURE 1: The construction process of the knowledge graph.

Step 2. Evaluate failure modes using linguistic terms. Suppose a general FMEA problem includes $m$ failure modes $FM_i$ $(i = 1, 2, \dots, m)$ assessed on $n$ risk factors, which are evaluated by $l$ FMEA team members $TM_k$ $(k = 1, 2, \dots, l)$. To reflect the relative importance of the experts in the evaluation process, each team member should be assigned a weight $\lambda_k > 0$ $(k = 1, 2, \dots, l)$ satisfying $\sum_{k=1}^{l} \lambda_k = 1$.

(4) Keep iterating the following steps until the optimal centroids are found, that is, until the clusters no longer change.
a. Calculate the sum of the squared distances between data points and centroids, and assign each data point $x_i$ to the nearest cluster.
b. Re-compute the centroid of each cluster by taking the average $c_j$ $(j = 1, 2, \dots, k)$ of all data points in that cluster.
c. K-means terminates when the centroids converge and no longer change.
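The iteration in steps a to c can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function name `kmeans`, the random initialization from data points, the handling of empty clusters, and the six two-dimensional sample points are all assumptions introduced here.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means (Lloyd's algorithm) on an (n, d) array of data points."""
    rng = np.random.default_rng(seed)
    # Assumption: initialize the centroids with k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Step a: squared distance from every point to every centroid,
        # then assign each point to its nearest centroid.
        dist2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        # Step b: recompute each centroid as the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step c: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical feature vectors forming two well-separated groups.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The first three points end up in one cluster and the last three in the other; which cluster receives index 0 depends on the random initialization.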

In Figure 2, the yellow lines indicate unidirectional diffusion of the effects of failure modes, and the red lines indicate bidirectional diffusion. That is, failure modes of the network layer may cause failure modes of the control layer and the equipment layer, so the possible superior-level failure modes of the control layer and the equipment layer can be traced back to the network layer. As the link between the enterprise layer and the control layer, the management layer is critical, and its failure modes spread in both directions between the enterprise layer and the control layer.
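The layer-level diffusion described here can be illustrated with a small directed graph and a breadth-first search. The sketch below is only one reading of the text: the edge list `EDGES`, the treatment of bidirectional diffusion as a pair of opposite edges, and the helper names `reachable`, `reverse`, and `upstream_layers` are assumptions, and individual failure modes are not encoded.

```python
from collections import deque

# Layer-level effect edges, as read from the description of the figure.
# Unidirectional: network -> control, network -> equipment.
# Bidirectional (modeled as two opposite edges): management <-> enterprise,
# management <-> control.
EDGES = {
    "network": ["control", "equipment"],
    "management": ["enterprise", "control"],
    "enterprise": ["management"],
    "control": ["management"],
    "equipment": [],
}

def reachable(start, edges):
    """Return every layer reachable from `start` by following edges (BFS)."""
    seen, queue = set(), deque([start])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def reverse(edges):
    """Flip every edge so that a search walks from effects back to causes."""
    rev = {node: [] for node in edges}
    for src, targets in edges.items():
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev

def upstream_layers(layer, edges=EDGES):
    """Layers whose failure modes may be superior-level causes for `layer`."""
    return reachable(layer, reverse(edges))

print(upstream_layers("equipment"))   # {'network'}
print(reachable("equipment", EDGES))  # set()
```

Searching the reversed graph is what "tracking" a superior-level failure mode amounts to at this level of abstraction; a failure-mode-level graph would use the same traversal with failure modes as nodes.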

FIGURE 2: Knowledge graph of failure modes of the intelligent manufacturing systems.

Similarly, $FM_{21}$ is also the superior-level failure mode of $FM_{19}$ (inaccurate product quality control).

FIGURE 3: The results of K-means clustering.

FIGURE 4: Comparison of failure mode ranking between the conventional FMEA method and the improved FMEA method.

TABLE 1: Linguistic terms for rating failure modes.

TABLE 2: The FMEA of the intelligent manufacturing systems.

TABLE 3: The classification of failure causes.

TABLE 5: The FMEA evaluation matrix of failure modes.

TABLE 6: Comparison of risk rankings.
$FM_{1}$, $FM_{5}$, $FM_{14}$, $FM_{19}$, $FM_{21}$, and $FM_{22}$ are mainly caused by equipment failures and network failures. Improving the software and hardware of intelligent manufacturing systems can greatly reduce the occurrence of failure modes in cluster 1, and is an inevitable trend in the development of intelligent manufacturing systems. Since $FM_{9}$, $FM_{25}$, and $FM_{26}$ is