Bayesian Belief Network-based approach for diagnostics and prognostics of semiconductor manufacturing systems

https://doi.org/10.1016/j.rcim.2011.06.007

Abstract

Semiconductor manufacturing is a complex process that requires many types of equipment (referred to as tools in the semiconductor industry), each with numerous control variables under monitoring. As the number of sensors grows, a huge amount of data is collected from production; yet the relations among these control variables, and their effects on the finished wafer, are still not fully understood for either equipment monitoring or quality assurance. Meanwhile, because the wafer goes through multiple periods with different recipes, a failure that occurs during the process can both cause tremendous loss to the manufacturer and compromise product quality. Failures should therefore be detected as soon as possible, and their root causes identified, so that corrections can be made in time to avoid further loss. In this paper, we propose to apply a Bayesian Belief Network (BBN) to investigate the causal relationships among process variables on the tool and to evaluate their influence on wafer quality. By building BBN models at different periods of the process, the causal relations between control parameters, and their influence on the wafer, can be both qualitatively indicated by the network structure and quantitatively measured by the conditional probabilities in the model. In addition, with BBN probability propagation, one can diagnose root causes when a bad wafer is produced, or predict the wafer quality when an abnormality is observed during the process. Our tests on a Chemical Vapor Deposition (CVD) tool show that the BBN model achieves a high classification rate for wafer quality and accurately identifies problematic sensors when a bad wafer is found.

Highlights

► Apply BBN to investigate the relationships between process variables on a semiconductor tool.
► Quantify the impact of process variables on wafer quality with BBN conditional probabilities.
► Diagnose root causes and predict wafer quality from signals observed during the process.

Introduction

The semiconductor process is one of the most delicate processes in the manufacturing industry. It is characterized by expensive equipment, with batches of wafers processed through a series of chemical operations. In today's “high-mix” fabrication facility, because of high equipment cost and capacity limitations, different products with their own recipes are processed on various types of tools [24]. This creates extremely complex operating conditions for the tools and makes equipment degradation highly unpredictable. Although sophisticated control programs are designed to ensure that product specifications are met, faults still occur during operation and cause tremendous loss to the manufacturer, because in most cases rework is infeasible for a mis-processed wafer. Given the batch processing scheme and the run-to-run control strategy widely adopted in the semiconductor industry, a failure at any time during the process should be detected promptly so that engineers can stop the operation to avoid further loss and make corrections based on the diagnosis before the process is resumed.

Over the years, methodology for fault detection, diagnosis and prognosis has evolved from univariate statistical process control (SPC) to multivariate system analysis. Traditional quality control methods such as Shewhart and CUSUM charts [21] have encountered great challenges as the amount of data and the number of variables increase. Since control charts are normally created for critical quality parameters of the process (e.g., film uniformity in deposition), when an out-of-control alarm occurs one needs to identify its cause so that a correction can be made. However, root cause identification is not trivial without engineering expertise, as many factors can affect the process. Furthermore, different control charts are needed for different problems, and monitoring them simultaneously becomes impractical when the number of charts is large. Goodlin et al. [8] proposed building type-specific control charts for different classes of failures so that the root cause of a fault can be identified immediately when a failure is detected by any of the charts. Although the method was shown to be effective on an etching process with 6 types of faults, choosing features out of 19 process variables that coincide with each failure still requires a lot of expert knowledge. Cherry et al. [3] developed a multivariate approach using Principal Component Analysis (PCA) and applied it to monitor a post-lithography metrology process with 18 critical dimension measurements on the wafer. The overall process health is indicated by a scalar index derived from the PCA.
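As an illustration of the univariate charts mentioned above, the following minimal Python sketch implements a two-sided tabular CUSUM. The readings, the target, the slack value k and the decision interval h are hypothetical values chosen for illustration; they are not taken from [21] or from any tool data.

```python
def tabular_cusum(samples, target, sigma, k=0.5, h=5.0):
    """Return the index of the first out-of-control sample, or None.

    k (slack value) and h (decision interval) are expressed in units of sigma.
    """
    upper = lower = 0.0
    for i, x in enumerate(samples):
        z = (x - target) / sigma            # standardized deviation from target
        upper = max(0.0, upper + z - k)     # accumulates upward shifts
        lower = max(0.0, lower - z - k)     # accumulates downward shifts
        if upper > h or lower > h:
            return i
    return None


# Hypothetical film-thickness readings: in control at first, then a sustained
# upward shift of roughly 1.5 sigma starting at index 5.
readings = [100.2, 99.7, 100.1, 99.9, 100.3,
            101.6, 101.4, 101.5, 101.7, 101.5, 101.6]
alarm_index = tabular_cusum(readings, target=100.0, sigma=1.0)
```

A chart of this kind signals that the monitored parameter has shifted, but it says nothing about which upstream variable caused the shift, which is the gap the diagnostic methods reviewed here try to close.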

PCA captures only the statistical dependency of a static system. Because parameters drift and shift during manufacturing as a result of tool aging or recipe changes, adaptive capability is desired in order to accommodate such insignificant variation and avoid false alarms. Building on the PCA approach, Spitzlsperger et al. [31] proposed an adaptive method with two update strategies for the covariance matrix: updating the mean and standard deviation used to rescale the testing data, or rebuilding the covariance matrix from recent data. Applied to a plasma stripper, the model was shown to reduce the number of false alarms while accommodating small parameter drifts during the process. Yan [35] also applied PCA to controller data from an aluminum gate CMOS process with 36 parameters. Because controller parameters are correlated with many process factors, PCA was used to decouple the correlation and transform the original parameters into orthogonal variables that have a one-to-one relation with process factors.
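One way to picture the first update strategy is a scaler whose mean and variance slowly follow the process. The sketch below uses an exponentially weighted update with a forgetting factor; this particular update rule, the class name AdaptiveScaler and the numbers are assumptions made for illustration and are not claimed to be the scheme used in [31].

```python
import math


class AdaptiveScaler:
    """Rescales readings with a mean and variance that track slow drift."""

    def __init__(self, mean, std, forgetting=0.2):
        self.mean = mean
        self.var = std ** 2
        self.forgetting = forgetting  # weight given to the newest sample

    def scale(self, x):
        return (x - self.mean) / math.sqrt(self.var)

    def update(self, x):
        # Exponentially weighted update of mean and variance.
        lam = self.forgetting
        delta = x - self.mean
        self.mean += lam * delta
        self.var = (1 - lam) * (self.var + lam * delta ** 2)


# Hypothetical sensor readings drifting slowly upward from a baseline of 100.
scaler = AdaptiveScaler(mean=100.0, std=1.0)
for reading in [100.5, 101.0, 101.5, 102.0]:
    scaler.update(reading)

static_score = (102.0 - 100.0) / 1.0   # score against the frozen baseline
adaptive_score = scaler.scale(102.0)   # score against the updated baseline
```

In practice the update would presumably be applied only to samples already judged normal, so that a genuine fault does not get absorbed into the baseline.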

In addition to statistical methods, artificial intelligence algorithms have also been investigated over the years for semiconductor data analysis. Chen and Liu [2] proposed a neural network based approach specifically to detect spatial defect types such as ring, semi-ring and scratch. Although this approach provides automatic checking and eliminates manual visual inspection of the wafer, its applicability is limited to spatial defects. A hybrid expert system was developed by Kim and May [16] to incorporate engineering expertise for fault diagnosis. Neural networks are used to approximate the functional form of the failure history distribution of each component, based on which maintenance is scheduled. Predicted failure rates are then converted to belief levels. He and Wang [9] developed a k-nearest neighbor (kNN)-based fault detection approach. The algorithm characterizes the sum of squared distances of each normal sample to its k nearest neighbors by a non-central χ² distribution. An unknown sample is considered normal if the sum of squared distances to its k nearest neighbors is below a threshold at a given confidence level; otherwise it is considered a fault. Independent Component Analysis (ICA) has also been reported for semiconductor process monitoring. Unlike PCA, ICA seeks a different data representation by expressing the data as a linear combination of independent components, such that the variables of the reconstructed data are also as independent as possible. Lee et al. [18] proposed three different indices for fault detection. Based on this philosophy, several other approaches have been proposed to enhance the performance of ICA-based methods. A dynamic ICA (DICA) is developed in [19] to augment the observed data matrix by adding time-lagged observations. In [12] the Adjusted Outlyingness (AO) metric is utilized for rejecting outliers and for online process monitoring. In [13] Support Vector Machine (SVM) classification is used instead of the aforementioned fault detection indices to address the issue that kernel density estimation yields poor performance because the index is usually autocorrelated over the process. A real-time malfunction detection and diagnosis approach was developed by Hong et al. [11], which applies time series neural networks (TSNNs) to in-situ metrology data for feature prediction. Rather than identifying process faults from the electrical measurements obtained from the finished product, the TSNN provides real-time predictions for process variables, based on which evidential reasoning is performed with Dempster–Shafer theory.
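The kNN detection rule described in this paragraph can be sketched as follows. For simplicity the threshold here is an empirical quantile of the training scores, which stands in for the non-central χ² fit of [9]; the data and the function names are hypothetical.

```python
import math


def knn_sq_distance(sample, reference, k):
    """Sum of squared Euclidean distances from sample to its k nearest reference points."""
    sq = sorted(sum((a - b) ** 2 for a, b in zip(sample, r)) for r in reference)
    return sum(sq[:k])


def fit_threshold(normal, k, quantile=0.95):
    """Empirical quantile of the leave-one-out kNN scores of the normal samples."""
    scores = sorted(
        knn_sq_distance(s, normal[:i] + normal[i + 1:], k)
        for i, s in enumerate(normal)
    )
    idx = min(len(scores) - 1, math.ceil(quantile * len(scores)) - 1)
    return scores[idx]


def is_fault(sample, normal, k, threshold):
    return knn_sq_distance(sample, normal, k) > threshold


# Hypothetical normal runs: two scaled process variables on a 3 x 3 grid.
normal_runs = [(x / 10, y / 10) for x in (-1, 0, 1) for y in (-1, 0, 1)]
threshold = fit_threshold(normal_runs, k=2)

inside = is_fault((0.05, 0.05), normal_runs, k=2, threshold=threshold)
outside = is_fault((1.0, 1.0), normal_runs, k=2, threshold=threshold)
```

A sample lying among the normal runs scores below the threshold, while one far from every normal run scores above it and is flagged.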

Our review shows that current methodologies for fault detection and diagnosis in the semiconductor industry are concentrated mainly in two areas. Expert systems were used in the early years, built on engineering expertise of the process with some enhancements from evidential reasoning algorithms such as neural networks. Machine learning methods based on multivariate statistical analysis have also demonstrated their versatility in applications to various processes. Although both kinds of method have become very popular in the industry, their limitations also stand out. For expert systems, engineering knowledge is the critical foundation; as the complexity of modern fabrication tools grows, expertise can play only a small role in identifying the important factors behind failures. Statistical analysis, on the other hand, is based purely on numerical association and lacks an interpretation of physical significance: what we obtain from the data is correlation rather than the causal relationships of the system. Furthermore, both methods focus on fault diagnosis, with little emphasis on prognosis.

Since its introduction by Pearl [25], [26] in the early 1980s, BBN has been successfully applied in the domains of knowledge discovery and probabilistic inference [32]. Its recent application in engineering stems from the biomedical field, where enormous amounts of human gene data need interpretation without sufficient prior knowledge, and the challenge is to uncover the gene/protein interactions and key biological features of cellular systems. Some examples of BBN applications in the biomedical area can be found in [22], [6], [5], [30]. In addition, there are industrial diagnosis and prognosis applications using BBN (e.g., nuclear power plant control systems [15], sensor fault detection and identification [20], military vehicles [23], aircraft engines [27], machining operations [29]).

Although several applications of Bayesian-related inference for process monitoring and fault detection have been reported in the semiconductor realm [34], [36], [7], the BBN method has not specifically been applied to process diagnosis and prognosis. Therefore, we propose to apply BBN for fault diagnosis and prognosis in the semiconductor process. By building BBN models at different periods of the process, the causal relations between control parameters, and their influence on the wafer, can be both qualitatively indicated by the network structure and quantitatively measured by the conditional probabilities in the model. In addition, with BBN probability propagation, one can diagnose root causes when a bad wafer is produced, or predict the wafer quality when an abnormality is observed during the process.
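The two directions of inference can be illustrated on a toy network in which two sensor states (temperature and pressure) are the parents of wafer quality. All probabilities and names below are invented for illustration, and inference is done by brute-force enumeration of the joint distribution rather than by whatever propagation algorithm the paper uses.

```python
from itertools import product

# Hypothetical priors and conditional probability table; not taken from the paper.
P_TEMP_ABNORMAL = 0.10
P_PRES_ABNORMAL = 0.05
P_BAD = {  # P(wafer bad | temperature abnormal?, pressure abnormal?)
    (False, False): 0.02,
    (True, False): 0.40,
    (False, True): 0.30,
    (True, True): 0.80,
}


def joint(temp, pres, bad):
    """Joint probability of one full assignment, using the network factorization."""
    p = P_TEMP_ABNORMAL if temp else 1 - P_TEMP_ABNORMAL
    p *= P_PRES_ABNORMAL if pres else 1 - P_PRES_ABNORMAL
    p *= P_BAD[(temp, pres)] if bad else 1 - P_BAD[(temp, pres)]
    return p


def posterior(query, evidence):
    """P(query variable is True | evidence), summing the joint over all states."""
    names = ("temp", "pres", "bad")
    num = den = 0.0
    for values in product((False, True), repeat=3):
        state = dict(zip(names, values))
        if any(state[name] != value for name, value in evidence.items()):
            continue  # state contradicts the evidence
        p = joint(**state)
        den += p
        if state[query]:
            num += p
    return num / den


# Diagnosis: a bad wafer was produced; which sensor is the likelier culprit?
diag_temp = posterior("temp", {"bad": True})
diag_pres = posterior("pres", {"bad": True})

# Prognosis: temperature is observed abnormal; how likely is a bad wafer?
prog_bad = posterior("bad", {"temp": True})
```

With these made-up numbers, the posterior for an abnormal temperature given a bad wafer (about 0.58) exceeds that for an abnormal pressure (about 0.24), and observing an abnormal temperature raises the probability of a bad wafer to 0.42. Enumeration is only practical for a handful of nodes.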

The remainder of this paper is organized as follows: Section 2 briefly introduces the concept of BBN and discusses its applicability for fault diagnosis and prognosis. Section 3 addresses several issues associated with the process in the semiconductor industry and discusses our solutions for properly implementing the BBN algorithm. Section 4 provides an example application to a Chemical Vapor Deposition (CVD) process. Section 5 concludes the paper and identifies future work for improvement.

Section snippets

Bayesian Belief Networks

A Bayesian Belief Network is a statistical model that employs a graphical representation to quantify probabilistic relationships among random variables. In this section, we briefly review the BBN concept, its graphical properties, probability propagation given an observed status on any random variable, and the modeling techniques for structure learning and parameter training.
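For reference, the standard factorization that underlies a BBN, and the conditioning step behind probability propagation, can be written as follows. Here Pa(X_i) denotes the parents of node X_i in the graph and E = e is the observed evidence; this notation is assumed for illustration and is not taken from the paper.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr),
\qquad
P(X_j \mid E = e) = \frac{P(X_j,\, E = e)}{P(E = e)}
```

The first expression is what lets the network structure carry the qualitative information, while the conditional probabilities on the right-hand side carry the quantitative part.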

BBN applicability for semiconductor process

Based on the review in the last section, there are three benefits that one can obtain by applying a BBN model to the semiconductor process:

  1. The BBN structure reveals the causal relations among parameters and gives the user a qualitative understanding of the system. In addition, the conditional probabilities associated with the nodes allow a quantitative judgment about the strength of these relations.

  2. When a particular status is observed on a node or several nodes, through

Application example

Chemical Vapor Deposition (CVD) is a process for producing thin films on the wafer in semiconductor manufacturing. Based on engineering expertise, wafer quality can be attributed to several critical parameters such as temperature, pressure and flow rate. However, the impacts of many other control parameters remain unclear, and their effects on the final product need to be quantified to ensure that the process complies with specification. In this application we apply BBN to identify the cause-and-effect

Discussion and future work

In this paper, we propose a Bayesian Belief Network based solution for fault diagnosis and prognosis in semiconductor manufacturing, and give an application to a CVD process. The results of our work validate the effectiveness of using BBN for semiconductor process diagnosis as well as prognosis. The relations between process parameters discovered from the BBN structure give the user a better understanding of the interactions between the various control variables. Furthermore, statistical inference

References (37)

  • T. Yuan et al. Spatial defect pattern recognition on semiconductor wafers using model-based clustering and Bayesian inference. European Journal of Operational Research (2008)
  • H. Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control (1974)
  • F. Chen et al. A neural-network approach to recognize defect spatial pattern in semiconductor fabrication. IEEE Transactions on Semiconductor Manufacturing (2000)
  • G. Cherry et al. Multiblock principal component analysis based on a combined index for semiconductor fault detection and diagnosis. IEEE Transactions on Semiconductor Manufacturing (2006)
  • G. Cooper et al. A Bayesian method for the induction of probabilistic networks from data. Machine Learning (1992)
  • N. Friedman et al. Using Bayesian networks to analyze expression data. Journal of Computational Biology (2000)
  • R. Ganesan et al. A multiscale Bayesian SPRT approach for online process monitoring. IEEE Transactions on Semiconductor Manufacturing (2008)
  • B. Goodlin et al. Simultaneous fault detection and classification for semiconductor manufacturing tools. Journal of the Electrochemical Society (2003)

Cited by (87)

    • Cyber–physical systems framework for AI in smart manufacturing and maintenance

      2024, Artificial Intelligence in Manufacturing: Applications and Case Studies
    • A novel health prognosis method for system based on improved degenerated Hidden Markov model

      2022, Robotics and Computer-Integrated Manufacturing
      Citation Excerpt :

      Data-driven models use existing data to predict the health of equipment and are suitable for highly complex systems even with little prior knowledge. They rely solely on past observed trajectories [8], including neural networks [9–12], support vector machines [13,14], and Gaussian process regression [15]. The difficulty in finding the effective part of raw data, slow convergence, and local minimum value are the primary drawbacks of the applicability of these models.

    • An interpretable unsupervised Bayesian network model for fault detection and diagnosis

      2022, Control Engineering Practice
      Citation Excerpt :

      Bayesian Network (BN) is also one of the popular methods among these causal approaches (Cai, Huang, & Xie, 2017). Yang and Lee (2012) considered a Bayesian network based on several discretized sensor variables, and these variables consist of different states: normal, warning, or error. By entering quality data in the evidence node, the faults can be isolated by analyzing the posterior probabilities of other nodes.
