Smart Tool Wear Monitoring of CFRP/CFRP Stack Drilling Using Autoencoders and Memory-Based Neural Networks

Caggiano, Alessandra; Mattera, Giulio; Nele, Luigi

doi:10.3390/app13053307


by Alessandra Caggiano ¹,², Giulio Mattera ³,* and Luigi Nele ³

¹ Center for Advanced Metrological and Technological Services (CESMA), University of Naples Federico II, Corso Nicolangelo Protopisani 70, 80146 Naples, Italy

² Fraunhofer Joint Laboratory of Excellence on Advanced Production Technology (Fh J_LEAPT UniNaples), Piazzale Tecchio 80, 80125 Naples, Italy

³ Department of Chemical, Materials and Industrial Production Engineering, University of Naples Federico II, Piazzale Tecchio 80, 80125 Naples, Italy

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(5), 3307; https://doi.org/10.3390/app13053307

Submission received: 14 February 2023 / Revised: 1 March 2023 / Accepted: 2 March 2023 / Published: 5 March 2023

(This article belongs to the Special Issue Innovative Approaches for Machining Technologies of Composite Materials)


Abstract:

The drilling of carbon fiber-reinforced plastic (CFRP) materials is a key process in the aerospace industry, where ensuring high product quality is a critical issue. Low quality of final products may be caused by drilling-induced defects such as delamination, which are strongly affected by the tool condition. The abrasive carbon fibers generally produce very fast tool wear with negative effects on the hole quality. This suggests the need to develop a method able to accurately monitor the tool wear development during the drilling process in order to set up optimal tool management strategies. Nowadays, different types of sensors can be employed to acquire relevant signals associated with process variables which are useful to monitor tool wear during drilling. Moreover, the increasing computational capacity of modern computers allows the successful development of procedures based on Artificial Intelligence (AI) techniques for signal processing and decision making aimed at online tool condition monitoring. In this work, an advanced tool condition monitoring method based on autoencoders and gated recurrent unit (GRU) recurrent neural networks (RNN) is developed and implemented to estimate tool wear in the drilling of CFRP/CFRP stacks. This method exploits the automatic feature extraction capability of autoencoders to obtain relevant features from the sensor signals acquired by a multiple sensor system during the drilling process and the memory abilities of GRU to estimate tool wear based on the extracted sensor signal features. The results obtained with the proposed method are compared with other neural network approaches, such as traditional feedforward neural networks, and considerations are made on the influence that memory-based hyperparameters have on tool wear estimation performance.

Keywords:

composite materials; drilling; tool wear; sensor monitoring; deep learning; autoencoder; gated recurrent unit

1. Introduction

The use of new structural materials such as carbon fiber reinforced plastics (CFRP) allows for substantial weight reduction on aircraft, which positively impacts CO₂ emissions as well as management costs due to lower fuel consumption, consistent with today’s requirements for environmental sustainability [1]. For the assembly of aeronautical CFRP components, mechanical joining techniques such as riveting are widely employed to achieve strong and reliable joints. Consequently, the most widespread CFRP machining process in the aerospace industry is represented by drilling, which is needed to realize the holes for subsequent riveting. However, the anisotropic nature of composite materials, the very rapid tool wear growth due to the abrasive carbon fibers, and the intense stresses and vibrations, which can cause damage to material integrity, surface quality, and part aspects, make drilling of CFRP components a major manufacturing challenge [2].

In the aeronautical industry, where severe requirements are applied to geometrical and dimensional tolerances as well as surface integrity, the current practice for CFRP drilling consists of manual, semi-automated, or automated drilling processes where tools are replaced long before the end of tool life to avoid any risk of material damage. As a matter of fact, tool wear estimation is a complex task to solve because several wear mechanisms occur simultaneously during machining [3], and direct measurement is generally unfeasible in an industrial context. Hence, the problem of tool wear is often managed in the industry by using lower cutting parameter values with the aim of slowing the wear process and replacing the tool well before the expected end of life [4]. To overcome these inefficient procedures, different methodologies, including empirical [5] and stochastic modeling techniques [6], have been proposed in the literature, suggesting new strategies for tool management. One of the most effective approaches to fully exploit tool life and increase productivity while preserving the integrity of the work material is the implementation of a procedure able to accurately monitor the tool wear development online in order to set up optimal tool management strategies based on the actual tool conditions. Online real-time process monitoring based on the employment of sensor systems and advanced sensor signal processing procedures is a valuable solution for tool condition monitoring [7], allowing for in-process control of tool wear growth which is critical for hole quality assessment and drilling process effective automation. Multiple sensors of different natures can be employed to acquire various signals associated with the most relevant process variables, e.g., force, torque, vibrations, etc., which can be useful to monitor tool wear during machining. 
A fundamental matter in machining process monitoring is the identification of relevant sensor signal features (SFs) from the acquired sensor signals, which are well correlated with process conditions. As a matter of fact, the sensor signals need to be described by a reasonable number of SFs that keep the relevant information on the monitored machining process with the aim of effectively supporting decision making. Different signal processing methodologies, which generally include a data pre-processing step followed by feature extraction, selection, classification, and validation, can be employed. Decision making on the basis of the extracted SFs can be performed using different methodologies, e.g., based on machine learning and deep learning approaches. In this case, it is particularly important to automatically generate SFs that are as independent as possible from each other and then select the most effective ones to reduce the problem dimensionality and hence the computational effort and time. Nowadays, the increasing computational capacity of computers allows the development of different approaches based on Artificial Intelligence (AI) methods [8]. These approaches are increasingly used for the development of industrial applications in several fields, such as welding [9], additive manufacturing [10], and in-line defect detection [11]. In the literature, different solutions have been proposed to deal with tool condition monitoring using these techniques. Hegab et al. [12] compared the results in tool wear estimation of the regression tree, support vector machine (SVM), Gaussian process regression (GPR), and artificial neural network (ANN) algorithms using as inputs the cutting speed and feed rate process parameters. Simon et al. [13] used thirteen statistical features extracted from the raw audio signals, namely mean, standard error, median, mode, standard deviation, sample variance, kurtosis, skewness, range, minimum, maximum, sum, and count, as input to a binary K-star classifier, aiming to identify whether a tool is still able to work or needs to be changed. Caggiano et al. [14] compared the performance of different ANN architectures in the estimation of CFRP drilling tool wear for different process parameters. Statistical features, namely mean, standard deviation, energy, kurtosis, and skewness, were extracted from the force, torque, acoustic emission, and vibration signals to construct a Sensor Fusion Pattern Vector (SFPV) composed of the principal components obtained via principal component analysis (PCA) of 8 statistical features, selected from the original 20 using the Pearson correlation coefficient, and the number of holes made by the tool. The SFPV constructed in this way was used as input to ANN architectures. Patra et al. [15] used the process parameters and the RMS current absorbed by the spindle motor, acquired during the drilling of different materials, to train different ANN architectures and compared the results with a simple regression model. The results show that the current absorbed by the spindle motor has a high correlation with tool wear since it is associated with an increment of the drilling force. Wu et al. [16] used statistical features, namely max value, median, mean, and standard deviation, of the cutting force and vibration signals along different directions (x, y, and z), and the same values for the acoustic emission signal, to train different algorithms such as Random Forest, Support Vector Machine, and different ANN architectures and compare their tool wear estimation results.

Thanks to the huge improvements achieved in deep learning approaches, especially due to the capabilities of automatic feature representation, also Convolutional Neural Networks [17], Recurrent Neural Networks [18], and autoencoder architectures [19] have attracted considerable attention for tool wear prediction research. In particular, Marani et al. [18] proposed a prediction model for tool flank wear using an LSTM model network, using the spindle motor current signals during a machining process as input feature. Sun et al. [19] employed an autoencoder architecture to automatically extract features correlated with the failure of a cutting tool using an unsupervised learning approach. Shah et al. [20] compared the results for tool wear estimation using vibration and audio signals with LSTM and bidirectional LSTM. In particular, the input vector for the networks was composed of features extracted by a scalogram generated by a Morlet transform and a generative network, as a data augmentation method.

The literature review about the state-of-the-art methods for tool condition monitoring, with particular reference to the machining of composite materials, shows that the employment of automatic feature extraction techniques, which allow the complete automation of the tool wear estimation process, has not been fully explored. As regards the employment of recurrent neural networks, some applications have been reported in the literature to estimate tool wear in the machining of metals, but very few applications have been proposed in the machining of composite materials. Moreover, most of the proposed applications generally employ manual feature extraction methods, which, on the one hand, require a strong knowledge of the problem by the developer to extract useful information, and, on the other hand, imply some simplification hypotheses, such as considering the data Gaussian distributed or linearly related. To tackle this issue, the specific contribution of the present work is related to the development of a new integrated method for tool wear prediction in drilling based on the employment of autoencoders for automatic feature extraction and gated recurrent unit (GRU) memory-based recurrent neural networks for tool wear prediction. This method can exploit the potential of both models for tool wear prediction and represents an advancement compared to the literature, allowing the full automation of both feature extraction and tool wear prediction tasks. The proposed method is developed in this work to realize tool condition monitoring in the drilling of CFRP/CFRP stacks for aeronautical assembly based on force, torque, vibration, and acoustic emission sensor signals.

Different models are developed and assessed, also by comparing the results obtained with the proposed method with those achieved using other neural network approaches, such as traditional feedforward neural networks, and considerations are made of the influence that memory-based hyperparameters have on tool wear estimation performance.

2. Materials and Experimental Procedures

In this work, a multi-sensor monitoring system [21] was used to acquire different signals during drilling of aeronautical stack-ups composed of two overlaid, symmetrical, and balanced CFRP laminates of 5 mm thickness (Toray T300 carbon fibers, CYCOM 977-2 epoxy matrix). Each laminate is composed of 26 unidirectional prepreg plies with stacking sequence $[\pm 45_{2}/0/90_{4}/0/90/0_{2}]_{s}$.

The multi-sensor system comprised the following sensing units:

Kistler 9257A piezoelectric dynamometer to acquire the thrust force along the vertical direction, $F_{z}$ ;
Kistler 9277A25 piezoelectric dynamometer to acquire the cutting torque along the vertical axis, $T_{z}$ ;
Montronix BV100 sensor to acquire the acoustic emission RMS, AE RMS, and the vibration acceleration, V.

The analog signals were amplified and sent to an NI USB-6361 DAQ board with a sampling rate of 10 kS/s. The whole system employed for the experimental tests is presented in Figure 1.

The setup shown in Figure 1 was employed to conduct experimental drilling tests on the CFRP/CFRP stacks under different process conditions. In this work, four tests, namely T1, T2, T3, and T4, are considered due to the industrial interest in these cutting conditions. For all tests, tungsten carbide twist drills with a point angle of 120°, a helix angle of 30° and a diameter of 4.85 mm were employed to drill 60 sequential holes each on the CFRP/CFRP stacks with different process parameters reported in Table 1.

2.1. Tool Wear Monitoring

Since the most used parameter for tool wear monitoring is flank wear, expressed in terms of VB and VBmax values in mm [22], a Tesa Visio V-200 optical measuring machine was used to perform direct measurements on the drill bits during the experimental campaign. In particular, in order to evaluate tool wear and create the target vector for the deep learning agent, the flank wear (VB) was measured after every 10 consecutive drilled holes, obtaining 6 values for each test. A third-order interpolation curve was used to interpolate all values, as reported in Figure 2.
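The construction of the target vector described above can be sketched as follows. This is a minimal illustration, assuming a least-squares cubic polynomial fitted with NumPy; the authors' exact interpolation routine is not stated, and the six VB values below are hypothetical placeholders rather than measurements from the experimental campaign.

```python
import numpy as np

# Holes after which flank wear was measured (every 10 holes, 6 values per test).
measured_holes = np.array([10, 20, 30, 40, 50, 60], dtype=float)
# Hypothetical VB readings in mm, used only to make the sketch runnable.
measured_vb = np.array([0.020, 0.034, 0.045, 0.054, 0.062, 0.070])

# Third-order polynomial fitted to the six measurements (least squares).
coeffs = np.polyfit(measured_holes, measured_vb, deg=3)

# Evaluate the curve at every hole to obtain one target value per drilling operation.
all_holes = np.arange(1, 61, dtype=float)
vb_target = np.polyval(coeffs, all_holes)
```

The resulting `vb_target` array holds one VB value per hole and could serve as the regression target for the wear estimation network.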

2.2. Sensor Signal Pre-Processing

In this work, an autoencoder neural network was employed to process each acquired sensor signal with the aim of extracting features automatically. As this approach is based on the use of a feedforward network, a fixed input length is needed, which implies that all the acquired sensor signals must be characterized by the same number of samples. Accordingly, an online acquisition module was developed so that the acquisition is saved for each channel of the sensor signal acquisition system starting from the moment when the thrust force $F_{z}$ reaches a threshold value $F_{thr}$, as per Equation (1). This condition indicates the tool–workpiece contact and assures the actual start of the drilling process.

$$F_{z} > F_{thr} \tag{1}$$

The sensor signal acquisition is then stopped when a predefined number of samples $n_{s}$ has been acquired, in accordance with the process time duration calculated on the basis of the process parameters, as described in Equation (2).

$$n_{s} = \frac{60 \cdot h \cdot f_{s}}{f_{r} \cdot \omega} \tag{2}$$

where $h$ is the specimen thickness (mm), $f_{s}$ is the sampling rate (samples/second), $f_{r}$ is the feed (mm/rev), and $\omega$ is the spindle speed (rev/min). The above-defined thresholds were employed to perform the segmentation procedure on the thrust force signal. All the other sensor signals, including torque, vibrations, and acoustic emission RMS, were segmented using the same start and end point identified on the thrust force signal. An example of signal segmentation performed according to this procedure is shown in Figure 3 for the thrust force signal of hole no. 1 for the experimental test T2 carried out at 6000 rpm and 0.20 mm/rev.
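The segmentation rule of Equations (1) and (2) can be sketched as follows. This is an illustrative sketch rather than the authors' acquisition module: the function names, the synthetic signals, the threshold of 5 N, and the stack thickness of 10 mm (two 5 mm laminates) are assumptions introduced here.

```python
import numpy as np

def samples_per_hole(h_mm, fs_hz, feed_mm_rev, spindle_rpm):
    """Equation (2): number of samples covering one drilling pass."""
    return int(round(60.0 * h_mm * fs_hz / (feed_mm_rev * spindle_rpm)))

def segment(force, other_channels, f_thr, n_s):
    """Cut all channels using the start point found on the thrust force."""
    above = np.flatnonzero(force > f_thr)        # Equation (1): F_z > F_thr
    if above.size == 0:
        raise ValueError("thrust force never exceeds the threshold")
    start = int(above[0])
    end = start + n_s
    if end > force.size:
        raise ValueError("recording too short for n_s samples")
    return force[start:end], [ch[start:end] for ch in other_channels]

# Test T2 parameters from the text; h = 10 mm is an assumption.
n_s = samples_per_hole(h_mm=10.0, fs_hz=10_000, feed_mm_rev=0.20, spindle_rpm=6000)

# Synthetic recording: 1000 idle samples followed by 8000 samples under load.
force = np.concatenate([np.zeros(1000), np.full(8000, 50.0)])
torque = np.linspace(0.0, 1.0, force.size)
force_seg, (torque_seg,) = segment(force, [torque], f_thr=5.0, n_s=n_s)
```

Under these assumptions every segmented signal has the same length, which is the fixed input size the feedforward autoencoder requires.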

3. Smart Sensor Monitoring Approach

In this work, a new methodology for smart sensor monitoring to estimate tool wear during the drilling of composite materials is proposed. The logic scheme of the proposed framework is reported in Figure 4. The raw sensor signals acquired during the drilling process are provided as input to a pre-trained autoencoder network, which performs feature extraction from the signal and acts as a denoising filter. The features extracted by the encoder part of the autoencoder architecture are used as input for the subsequent software modules, which, using a pre-trained memory-based neural network, provide as output the estimated flank wear value, VB. It is worth mentioning that, in this approach, the memory-based neural network receives as input both current and past information coming from the autoencoder.

3.1. Sensor Signal Feature Extraction Based on Autoencoder Architecture

An autoencoder is an unsupervised machine learning algorithm that uses a neural network to learn efficient codings, also called latent representations, of unlabeled data. Given a set of data, the autoencoder learns a representation (encoding) by training the network to ignore irrelevant data (noise), generally for dimensionality reduction. The encoding is validated and refined by trying to reconstruct (decoding) the original input data from the encoding. A scheme of a generic autoencoder architecture consisting of an encoder and decoder is presented in Figure 5.

The training process that is employed to learn the weights of the encoder and decoder hidden layers is not different from the usual backpropagation used for standard feedforward neural networks [23]. For autoencoders, a reconstruction loss, L, given by a mean squared error function in which the target is the input vector itself, is used as a loss function, as shown in Equation (3).

$$L = \frac{1}{N} \sum_{i = 1}^{N} (X_{i} - X_{i}')^{2} = \frac{1}{N} \sum_{i = 1}^{N} (X_{i} - f_{\theta}(X_{i}))^{2} \tag{3}$$

where $N$ is the batch size, i.e., the number of training samples in one forward pass, $X$ is the original input vector, $X'$ is the autoencoder output vector, and $f_{\theta}$ is the neural network with weights $\theta$. For this purpose, the autoencoder uses two different parts, namely the encoder $c(X)$, in which the input vector is summarized in a set of features $F$, and the decoder $d(F)$, in which the features $F$ in the latent space are used to reconstruct the input vector.

$$F = c(X) = w_{xf} X + b_{xf} \tag{4}$$

$$X' = d(F) = w_{fx'} F + b_{fx'} \tag{5}$$

The weights reported in Equations (4) and (5) are the weights of the network that extract features from the original signal ($w_{xf}$) and the weights that allow reconstructing the signal starting from the extracted features ($w_{fx'}$). The parameters $b_{xf}$ and $b_{fx'}$ are the biases of the encoder $c(X)$ and decoder $d(F)$ networks.

Autoencoders have been widely used since 1991, when Kramer [24] used this type of architecture to conduct nonlinear principal component analysis (PCA). Wang et al. [25] compared the ability of autoencoders to perform dimensionality reduction with that of other methods, such as PCA, highlighting the capability of autoencoders to detect repetitive structures, which is a valuable attribute for anomaly detection tasks [26]. The effectiveness of autoencoders in feature extraction is more evident when the input data do not have a Gaussian distribution, since in that case the first- and second-order statistics used in PCA are not sufficient to describe the data. Furthermore, autoencoder architectures have been used to reduce signal noise [27].
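Equations (3) to (5) can be illustrated with a small NumPy sketch. It is not the trained network of this work: the layer sizes, the random weights, and the row-vector convention (samples stored as rows, so the products read `X @ w`) are assumptions, and the squared term of Equation (3) is interpreted here as the squared Euclidean norm of the reconstruction error of each sample.

```python
import numpy as np

def encode(X, w_xf, b_xf):
    """Equation (4): map each input signal to its latent features F."""
    return X @ w_xf + b_xf

def decode(F, w_fx, b_fx):
    """Equation (5): reconstruct the signal from the latent features."""
    return F @ w_fx + b_fx

def reconstruction_loss(X, X_rec):
    """Equation (3): squared error per sample, averaged over the batch."""
    return float(np.mean(np.sum((X - X_rec) ** 2, axis=1)))

rng = np.random.default_rng(0)
n_batch, n_in, n_latent = 8, 200, 5          # illustrative sizes only
X = rng.standard_normal((n_batch, n_in))

w_xf = 0.01 * rng.standard_normal((n_in, n_latent))
b_xf = np.zeros(n_latent)
w_fx = 0.01 * rng.standard_normal((n_latent, n_in))
b_fx = np.zeros(n_in)

F = encode(X, w_xf, b_xf)                    # latent features
X_rec = decode(F, w_fx, b_fx)                # reconstructed signals
loss = reconstruction_loss(X, X_rec)
```

Training would adjust the weights by backpropagation to reduce `loss`; with untrained random weights, as here, the value only demonstrates how the quantity is computed.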

In this research work, an autoencoder architecture was employed to solve a double task: extracting features from the different acquired sensor signals and performing signal denoising. The first innovation introduced in this work is the use of an autoencoder architecture for automatic feature extraction from the acquired sensor signals, since industrial process data are usually nonlinear and, therefore, standard PCA loses the capability to generalize. The autoencoder was developed with the architecture described in Table 2, using a ReLU activation function in all the hidden layers and no activation for the output layer. The number of layers was chosen through a trial-and-error procedure, in which the aim was to reduce the mathematical complexity and the reconstruction error at the same time. Since the signal patterns in the time domain are similar in shape, a simple architecture is capable of generalizing and reconstructing the signal.

The training procedure was developed as follows. Once each signal, namely thrust force, torque, vibration, and acoustic emission RMS, was acquired as presented in the previous section for 60 drilling operations starting with a new tool, the dataset was randomly split into training data and test data subsets using a 75:25 ratio. An autoencoder was trained separately for each signal, and the reconstruction result obtained for one of the signals, the thrust force signal, is given in Figure 6 for one of the holes executed in the experimental test T4 with process parameters 7500 rpm—0.2 mm/rev.

As demonstrated in Figure 6, the signal is reconstructed and filtered at the same time. Subsequently, only the encoding part of each autoencoder was saved and used in the second step. In this second phase, the features for all signals were simultaneously extracted through their own autoencoders, and the 20 total extracted features were concatenated with the hole number, which provides information on the number of holes already processed by the tool, as proposed by Caggiano et al. [14]. The feature extraction module is shown in Figure 7.
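The feature extraction module can be sketched as a simple concatenation step. This is a hypothetical illustration: the split of the 20 features into 5 per signal is inferred from the four signals and is an assumption, and the random linear projections below merely stand in for the trained encoders.

```python
import numpy as np

SIGNALS = ("force", "torque", "vibration", "ae_rms")
N_LATENT = 5  # assumption: 20 extracted features shared equally by 4 signals

def build_feature_vector(segments, encoders, hole_number):
    """Encode each segmented signal and append the hole number."""
    parts = [encoders[name](segments[name]) for name in SIGNALS]
    return np.concatenate(parts + [np.array([float(hole_number)])])

rng = np.random.default_rng(0)
n_s = 5000  # samples per segmented signal (illustrative)

# Stand-in encoders: fixed random projections instead of trained networks.
projections = {name: rng.standard_normal((n_s, N_LATENT)) / n_s for name in SIGNALS}
encoders = {name: (lambda seg, P=projections[name]: seg @ P) for name in SIGNALS}

segments = {name: rng.standard_normal(n_s) for name in SIGNALS}
vector = build_feature_vector(segments, encoders, hole_number=7)
```

The resulting vector has 21 entries, matching the input size later used for the recurrent network.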

Correlation analysis was conducted to estimate the quality of the extracted features and their correlation with tool wear. For the latter purpose, the Spearman correlation coefficient was used, and the results are reported in Table 3. By defining as highly correlated the features presenting a correlation coefficient $S_{s} > 0.7$ and as well correlated the features with $S_{s} > 0.5$, it appeared that 74% of the features were highly correlated while a further 17% were at least well correlated, so for the whole dataset more than 90% of the extracted features had at least a good relation with tool wear. This analysis showed that the acquired signals are all related to tool wear and that the unsupervised and automated feature extraction procedure proposed in this work was able to retrieve useful features to solve the task of tool wear estimation. Moreover, by observing the signal reconstruction performance in terms of the mean squared error in Table 4 and in terms of superposition with the original acquired signal in Figure 6, it is possible to notice how, using the extracted features, the autoencoder was able to reconstruct the time-domain signal without losing relevant information.
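The correlation check can be sketched with a rank-based implementation of the Spearman coefficient. The sketch below is illustrative only: it assumes no tied values, applies the 0.7 and 0.5 thresholds to the absolute value of the coefficient (an assumption, since the sign convention is not stated), and uses synthetic data in place of the extracted features.

```python
import numpy as np

def ranks(a):
    """Ranks from 1 to n; assumes the values contain no ties."""
    order = np.argsort(a)
    r = np.empty(len(a), dtype=float)
    r[order] = np.arange(1, len(a) + 1)
    return r

def spearman(x, y):
    """Spearman coefficient as the Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])

def correlation_label(s):
    """Thresholds from the text, applied to |s| (assumption)."""
    a = abs(s)
    return "high" if a > 0.7 else "good" if a > 0.5 else "weak"

# Synthetic, strictly increasing wear curve over 60 holes (hypothetical).
holes = np.arange(1, 61, dtype=float)
vb = 1e-3 * holes + 1e-7 * holes ** 3

feat_up = np.log(vb)      # monotonically increasing with wear
feat_down = -vb ** 2      # monotonically decreasing with wear

s_up = spearman(feat_up, vb)
s_down = spearman(feat_down, vb)
```

Because the coefficient depends only on ranks, any monotonic relation with tool wear, linear or not, yields a magnitude close to one.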

3.2. Tool Wear Estimation Based on Recurrent Neural Networks

Recurrent neural networks (RNN) are a family of machine learning algorithms designed to deal with sequential input data, such as time series [28], text [29], and video [30], and act as memory-based networks. As described by Lipton et al. [31], although Markov chains, which model transitions between states in an observed sequence, can be used to solve the same task, their implementation raises practical problems: each hidden state can depend only on the immediately previous state, and the operations become computationally infeasible when the set of possible states grows. These problems can be overcome using an RNN, which can capture long-range time dependencies. A simple RNN has three layers: input-sequence, recurrent hidden, and output layer, as presented in Figure 8.

The activation of the hidden states at time step t, $h_{t}$, is computed as a function of the current input $x_{t}$ and the previous hidden states $h_{t-1}$. This function is a composition of an element-wise nonlinearity with an affine transformation of both $x_{t}$ and $h_{t-1}$, as described in Equation (6):

$$h_{t} = \Phi(W_{t} x_{t} + U_{t} h_{t-1} + b_{t}) \tag{6}$$

where $W_{t}$ is the input-to-hidden weight matrix, $U_{t}$ is the state-to-state recurrent weight matrix, $b_{t}$ is the bias vector, and $\Phi$ is usually a logistic sigmoid function ($\sigma$) or a hyperbolic tangent function (tanh). As usual, the output layer, $y_{t}$, is an element-wise nonlinear combination of the hidden state, as described by Equation (7).

$$y_{t} = \Phi(V_{t} h_{t} + b_{o}) \tag{7}$$

However, the tanh and $\sigma$ activation functions saturate quickly and can cause the gradient to vanish [32], which makes standard RNNs difficult to train [33]. Therefore, new architectures were studied to solve the problem of long-term sequence inputs, such as long short-term memory (LSTM) [34] and gated recurrent unit (GRU) [35]. GRU was proposed to make each recurrent unit adaptively capture dependencies at different time scales. As shown in Figure 9, a GRU is composed of several hidden paths that modify the behavior of the hidden layer compared to a standard RNN.

A GRU cell has gating units that modulate the flow of information inside the unit. From a mathematical perspective, an update gate decides how much of the past information (from previous time steps) needs to be passed along to the future. This part of the network takes as input the input vector at time step t, $x_{t}$, and the cell hidden layer output at time step $t-1$, $h_{t-1}$, and generates an output at time step t, $z_{t}$, which is used in another part of the network to generate the outcome, following Equation (8):

$$z_{t} = \sigma(W_{z} x_{t} + U_{z} h_{t-1}) \tag{8}$$

where $W_{z}$ is the input-to-hidden weight matrix of the update gate, and $U_{z}$ is the state-to-state recurrent weight matrix of the update gate. Another gate, called the reset gate, is used to decide how much of the past information to forget, as described in Equation (9):

$$r_{t} = \sigma(W_{r} x_{t} + U_{r} h_{t-1}) \tag{9}$$

where $x_{t}$ is the input vector at time step t, $h_{t-1}$ is the hidden layer output at time step $t-1$, $W_{r}$ is the input-to-hidden weight matrix of the reset gate, and $U_{r}$ is the state-to-state recurrent weight matrix of the reset gate. Even though Equations (8) and (9) have the same form, the different usage of these two gates in the output computation allows them to reach the different goals reported above. Following the graph in Figure 9, the candidate hidden state of the cell, $\tilde{h}_{t}$, is computed following Equation (10):

$$\tilde{h}_{t} = \tanh(W_{h} x_{t} + r_{t} \otimes U_{h} h_{t-1}) \tag{10}$$

The candidate state is obtained by applying the tanh nonlinearity to the sum of two terms: the linear transformation of the input vector at time step t, $W_{h} x_{t}$, and the Hadamard product (⊗) between the reset gate output $r_{t}$ and the linear transformation of the hidden state at the previous time step, $U_{h} h_{t-1}$; in this way, what to remove from previous time steps is determined. Finally, the update gate $z_{t}$ sets the hidden state value at time t, $h_{t}$, by determining which information of the past needs to be conserved. From a mathematical perspective, this is obtained using Equation (11).

$$h_{t} = z_{t} \otimes h_{t-1} + (1 - z_{t}) \otimes \tilde{h}_{t} \tag{11}$$

From a computational perspective, the GRU architectures can store and filter the information using their update and reset gates in a simpler way compared to LSTM architectures, providing similar or even better performance depending on the applications.
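A single GRU step following Equations (8) to (11) can be sketched in NumPy as shown below. This is a didactic sketch, not the TensorFlow implementation used in this work: the parameter dictionary, the random initialization, and the layer sizes are assumptions, and biases are omitted because the equations above do not include them.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, p):
    """One GRU time step; p holds the six weight matrices."""
    z_t = sigmoid(p["W_z"] @ x_t + p["U_z"] @ h_prev)              # Eq. (8)
    r_t = sigmoid(p["W_r"] @ x_t + p["U_r"] @ h_prev)              # Eq. (9)
    h_cand = np.tanh(p["W_h"] @ x_t + r_t * (p["U_h"] @ h_prev))   # Eq. (10)
    return z_t * h_prev + (1.0 - z_t) * h_cand                     # Eq. (11)

def init_params(n_in, n_hidden, rng, scale=0.1):
    """Small random weights for the update (z), reset (r) and state (h) paths."""
    p = {}
    for g in "zrh":
        p[f"W_{g}"] = scale * rng.standard_normal((n_hidden, n_in))
        p[f"U_{g}"] = scale * rng.standard_normal((n_hidden, n_hidden))
    return p

rng = np.random.default_rng(0)
params = init_params(n_in=21, n_hidden=21, rng=rng)

# A memory window of two feature vectors processed in order.
window = rng.standard_normal((2, 21))
h = np.zeros(21)
for x_t in window:
    h = gru_step(x_t, h, params)
```

With all weights set to zero both gates equal 0.5 and the candidate state is zero, so the new hidden state is simply half of the previous one, which offers a quick sanity check of the update rule.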

3.3. Architecture Design and Training Setup

In this research work, three different GRU architectures were used to investigate the influence of different hyperparameters on tool wear estimation performance:

The first architecture was characterized by 21 hidden layer neurons, equal to the number of features extracted by the autoencoder from the force, torque, vibration, and acoustic emission RMS signals plus the hole number, as reported in Figure 7. A hyperbolic tangent activation function was selected for the hidden layer, and a rectified linear unit was employed for the output layer. The output of the hidden layer is used as a feedback branch for the GRU network. A memory window of two samples was selected for the input layer. The output layer had only one neuron corresponding to the tool wear value predicted by the GRU network;
In the second architecture, a deeper structure was realized by introducing an additional hidden layer. In the first hidden layer of this second architecture, 42 neurons were employed, and a hyperbolic tangent activation function was selected. The second hidden layer was characterized by 21 neurons, and the output layer had only 1 neuron;
A third architecture was developed to study the effect of the memory window on the network performance. Since the first results showed that the best performance was provided by the deeper architecture, the input layer of the second architecture was changed from (2, 21) to (4, 21), i.e., using a memory window of four samples instead of two samples.
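The input shapes (2, 21) and (4, 21) mentioned above correspond to sliding windows over consecutive holes. The helper below is a hypothetical sketch of how such windows could be built; pairing each window with the wear value of its last hole is an assumption, and the random arrays stand in for the extracted features and measured wear.

```python
import numpy as np

def make_windows(features, targets, w):
    """Stack w consecutive feature vectors per sample.

    features: array of shape (n_holes, n_features)
    targets:  array of shape (n_holes,)
    Returns X of shape (n_holes - w + 1, w, n_features) and the matching y.
    """
    n = features.shape[0]
    X = np.stack([features[i:i + w] for i in range(n - w + 1)])
    y = targets[w - 1:]   # wear at the last hole of each window (assumption)
    return X, y

rng = np.random.default_rng(0)
features = rng.standard_normal((60, 21))   # 60 holes, 21 features each
targets = np.linspace(0.0, 0.07, 60)       # placeholder wear curve

X2, y2 = make_windows(features, targets, w=2)
X4, y4 = make_windows(features, targets, w=4)
```

A window of w holes yields its first sample only after w holes have been drilled, so 60 holes give 59 samples for w = 2 and 57 samples for w = 4.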

The three architectures were trained using the four datasets defined in Table 1, on an industrial AI-based computer with the characteristics reported in Table 5. The datasets were randomly divided into training and validation subsets according to a 75:25 split ratio, meaning 75% of the data were in the training subset and 25% in the validation subset. For all the training sessions, the Adam optimizer was selected [36], and a learning rate of $1 \cdot 10^{-6}$ was used for 500 k epochs to avoid unlearning problems. Finally, gradient clipping in the range $(-1, 1)$ was employed to avoid exploding gradient problems. For the neural network development, training, and inference, the TensorFlow library was used.

4. Results and Discussion

The final training results obtained for all the datasets and the diverse architectures are summarized in Table 6 and Figure 10 in terms of root mean square error (RMSE) between the VB values predicted by the different neural network architectures and the experimental VB values. The results show that the proposed methodology allows us to achieve an accurate tool wear prediction for all the datasets, with very low RMSE values always in the range between $3.45 \cdot 10^{-4}$ and $1.2 \cdot 10^{-3}$.

In particular, the results show that (i) deeper architectures generalize the tool wear behavior better and (ii) observing more drilling operations before the tool wear estimation, i.e., increasing the memory window w, can improve the estimation performance. The drawback of a larger memory window is that w holes must be drilled before the first prediction becomes available. Given the excellent tool wear estimation performance achieved even with a small memory window of two samples, it may not be necessary to increase w further. Based on the results obtained in this work, a memory window of two samples appears sufficient for an accurate tool wear prediction, with performance comparable in most cases to that obtained with a memory window of four samples (see Figure 10). In any case, this choice has to be justified in relation to the application requirements in terms of prediction time and performance. An example of a result obtained with architecture no. 2, i.e., with a memory window of two samples, is reported in Figure 11 for dataset T2, corresponding to the experimental drilling tests carried out at 6000 rpm and 0.2 mm/rev, for which an RMSE of $5.3 \cdot 10^{-4}$ was obtained. The close agreement between the RNN-predicted VB values and the experimental VB values confirms the high accuracy of the RNN prediction, which can be effectively used for decision making on tool replacement.
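The role of the memory window w can be made concrete with a small sketch that groups per-hole feature vectors into sequences of w consecutive holes, each paired with the wear value after the last hole of the sequence. The function and variable names are hypothetical, and the pairing convention is an assumption consistent with the description above, not the authors' implementation.

```python
def make_windows(features, targets, w=2):
    """Build (sequence, target) pairs from per-hole data.

    features: one feature vector per drilled hole, in drilling order
    targets:  tool wear (VB) after each hole
    w:        memory window, i.e., number of consecutive holes per sequence
    """
    if w < 1:
        raise ValueError("memory window must be at least 1")
    sequences, labels = [], []
    for end in range(w, len(features) + 1):
        sequences.append(features[end - w:end])  # holes end-w+1 .. end
        labels.append(targets[end - 1])          # wear after hole `end`
    return sequences, labels


# Five hypothetical holes with a single feature each.
features = [[1.0], [2.0], [3.0], [4.0], [5.0]]
targets = [10.0, 20.0, 30.0, 40.0, 50.0]

for w in (2, 4):
    X, y = make_windows(features, targets, w)
    print(f"w = {w}: {len(X)} sequences, first prediction after hole {w}")
```

With n drilled holes, this construction yields n - w + 1 sequences, which shows the trade-off directly: a larger window delays the first estimate and leaves fewer training samples.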

Compared with the results obtained in [37] using several machine learning algorithms such as Linear Regression (LR), k-Nearest Neighbors (kNN), and Random Forest (RF), where an RMSE of 0.22 was achieved, the methodology proposed in this work achieves better tool wear prediction results for CFRP drilling while using a more automated data workflow. As a final comparison, the presented methodology was compared with a standard feedforward neural network. The same architecture, composed of 42 neurons in the first hidden layer and 21 neurons in the second hidden layer, both with a hyperbolic tangent activation function, and with a ReLU activation function in the output layer, was used for both networks. A memory window of two samples was chosen for the memory-based network. For both learning processes, a learning rate of $1 \cdot 10^{-5}$ was used for 500 k epochs, and gradient clipping was applied in both training phases. The results for dataset T2 are shown in Figure 12. These results, similar to those obtained for the other datasets, show that the memory-based network performs better than a standard feedforward network, confirming that information on previous tool wear values significantly improves the estimation of future tool wear values. Finally, it is worth mentioning that the presented methodology makes it possible to use the raw time series data coming from the acquisition systems directly to estimate tool wear, without developing additional software modules dedicated to signal processing and feature extraction, which is the approach usually proposed in the literature. Moreover, the proposed methodology maintains good performance when the process parameters are changed, as suggested by the results in Table 6 and Figure 10.
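To put the comparison in relative terms, the sketch below computes the fractional RMSE reduction from the values reported in the Figure 12 caption (feedforward versus memory-based network on T2) and from the Table 6 entries for the (42, 21) architecture (memory window of two versus four samples). The helper name is hypothetical, and experiments 1 to 4 of Table 6 are assumed to correspond to tests T1 to T4.

```python
def relative_reduction(baseline, improved):
    """Fractional RMSE reduction of `improved` with respect to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline RMSE must be positive")
    return (baseline - improved) / baseline


# RMSE values for test T2 as reported in the Figure 12 caption.
ann_rmse, rnn_rmse = 6.4e-4, 5.3e-4
print(f"GRU vs feedforward: {relative_reduction(ann_rmse, rnn_rmse):.1%}")

# RMSE of the (42, 21) architecture from Table 6: (w = 2, w = 4) per test.
table6 = {
    "T1": (9.5e-4, 9.5e-4),
    "T2": (5.3e-4, 5.1e-4),
    "T3": (4.56e-4, 3.45e-4),
    "T4": (9.2e-4, 9.1e-4),
}
window_gain = {test: relative_reduction(w2, w4) for test, (w2, w4) in table6.items()}
for test, gain in window_gain.items():
    print(f"{test}: w = 2 -> 4 reduces RMSE by {gain:.1%}")
```

On these numbers, the memory-based network lowers the RMSE on T2 by roughly 17%, whereas enlarging the memory window from two to four samples gives a sizeable gain only for T3 (about 24%) and little or none for the other tests.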

5. Conclusions

In this work, an experimental drilling campaign on CFRP/CFRP stacks was conducted by acquiring force, torque, acoustic emission, and vibration signals with a multi-sensor acquisition system, with the aim of developing a smart tool condition monitoring procedure for an optimal tool replacement strategy. The acquired signals were processed through an innovative AI-based approach using autoencoders for automatic sensor signal feature extraction and gated recurrent unit (GRU) memory-based recurrent neural networks for tool wear prediction. The results obtained with the different proposed architectures showed that, by using a memory-based neural network such as GRU, the estimation errors were reduced compared to standard feedforward neural networks as well as to alternative machine learning methodologies presented in the literature. In particular, a minimum RMSE of $3.45 \cdot 10^{-4}$ was reached. The results showed that increasing the architecture complexity, e.g., by adding a hidden layer or increasing the number of hidden layer nodes, improved performance for all datasets. Moreover, a further improvement was obtained for some experimental tests by increasing the memory window w from two to four samples. The main limitation of the presented methodology is the need to drill a defined number of holes (equal to the memory window w) before obtaining the first tool wear prediction, an aspect that has to be weighed against the requirements of the application. In conclusion, this work showed the potential of the proposed AI-based approach, which uses autoencoder architectures for automatic feature extraction and GRU memory-based neural networks for tool wear estimation, for smart tool condition monitoring in the drilling of composite materials. By combining different architectures, this approach avoids intermediate manipulations of the acquired signals, such as manual feature extraction, and can make the development of the application simple and fast without reducing the estimation performance.

Author Contributions

Conceptualization, A.C. and G.M.; methodology, A.C. and G.M.; software, G.M.; validation, G.M.; formal analysis, A.C. and G.M.; investigation, A.C. and G.M.; resources, A.C. and L.N.; data curation, A.C.; writing—original draft preparation, A.C. and G.M.; writing—review and editing, A.C. and L.N.; visualization, A.C. and G.M.; supervision, L.N.; project administration, A.C. and L.N.; funding acquisition, L.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author, A.C., upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. M’Saoubi, R.; Axinte, D.; Soo, S.L.; Nobel, C.; Attia, H.; Kappmeyer, G.; Engin, S.; Sim, W.M. High performance cutting of advanced aerospace alloys and composite materials. CIRP Ann. 2015, 64, 557–580.
2. Fleischer, J.; Teti, R.; Lanza, G.; Mativenga, P.; Möhring, H.C.; Caggiano, A. Composite materials parts manufacturing. CIRP Ann. 2018, 67, 603–626.
3. Fernandez-Vidal, S.R.; Fernandez-Vidal, S.; Batista, M.; Salguero, J. Tool wear mechanism in cutting of stack CFRP/UNS A97075. Materials 2018, 11, 1276.
4. Yılmaz, B.; Karabulut, Ş.; Güllü, A. Performance analysis of new external chip breaker for efficient machining of Inconel 718 and optimization of the cutting parameters. J. Manuf. Process. 2018, 32, 553–563.
5. Iliescu, D.; Gehin, D.; Gutierrez, M.; Girot, F. Modeling and tool wear in drilling of CFRP. Int. J. Mach. Tools Manuf. 2010, 50, 204–213.
6. Halila, F.; Czarnota, C.; Nouari, M. Analytical stochastic modeling and experimental investigation on abrasive wear when turning difficult to cut materials. Wear 2013, 302, 1145–1157.
7. Teti, R.; Mourtzis, D.; D’Addona, D.; Caggiano, A. Process monitoring of machining. CIRP Ann. 2022, 71, 529–552.
8. Hornby, A.; Michael, A.; Joanna, T.; Diana, L.; Dilys, P.; Patrick, P.; Victoria, B. Oxford Advanced Learner’s Dictionary; Oxford University Press: Oxford, UK, 2010.
9. Nele, L.; Mattera, G.; Vozza, M. Deep Neural Networks for Defects Detection in Gas Metal Arc Welding. Appl. Sci. 2022, 12, 3615.
10. Mattera, G.; Paolela, D.; Nele, L. Monitoring and control the Wire Arc Additive Manufacturing process using artificial intelligence techniques: A review. J. Intell. Manuf. 2023; in press.
11. Caggiano, A.; Zhang, J.; Alfieri, V.; Caiazzo, F.; Gao, R.; Teti, R. Machine learning-based image processing for on-line defect recognition in additive manufacturing. CIRP Ann. 2019, 68, 451–454.
12. Hegab, H.; Hassan, M.; Rawat, S.; Sadek, A.; Attia, H. A smart tool wear prediction model in drilling of woven composites. Int. J. Adv. Manuf. Technol. 2020, 110, 2881–2892.
13. Simon, G.D.; Deivanathan, R. Early detection of drilling tool wear by vibration data acquisition and classification. Manuf. Lett. 2019, 21, 60–65.
14. Caggiano, A. Tool wear prediction in Ti-6Al-4V machining through multiple sensor monitoring and PCA features pattern recognition. Sensors 2018, 18, 823.
15. Patra, K.; Pal, S.K.; Bhattacharyya, K. Artificial neural network based prediction of drill flank wear from motor current signals. Appl. Soft Comput. 2007, 7, 929–935.
16. Wu, D.; Jennings, C.; Terpenny, J.; Gao, R.X.; Kumara, S. A comparative study on machine learning algorithms for smart manufacturing: Tool wear prediction using random forests. J. Manuf. Sci. Eng. 2017, 139, 071018.
17. Duan, J.; Duan, J.; Zhou, H.; Zhan, X.; Li, T.; Shi, T. Multi-frequency-band deep CNN model for tool wear prediction. Meas. Sci. Technol. 2021, 32, 065009.
18. Marani, M.; Zeinali, M.; Songmene, V.; Mechefske, C.K. Tool wear prediction in high-speed turning of a steel alloy using long short-term memory modelling. Measurement 2021, 177, 109329.
19. Sun, C.; Ma, M.; Zhao, Z.; Tian, S.; Yan, R.; Chen, X. Deep transfer learning based on sparse autoencoder for remaining useful life prediction of tool in manufacturing. IEEE Trans. Ind. Inform. 2018, 15, 2416–2425.
20. Shah, M.; Vakharia, V.; Chaudhari, R.; Vora, J.; Pimenov, D.Y.; Giasin, K. Tool wear prediction in face milling of stainless steel using singular generative adversarial network and LSTM deep learning models. Int. J. Adv. Manuf. Technol. 2022, 121, 723–736.
21. Teti, R.; Segreto, T.; Caggiano, A.; Nele, L. Smart multi-sensor monitoring in drilling of CFRP/CFRP composite material stacks for aerospace assembly applications. Appl. Sci. 2020, 10, 758.
22. Park, K.H.; Beal, A.; Kwon, P.; Lantrip, J. Tool wear in drilling of composite/titanium stacks using carbide and polycrystalline diamond tools. Wear 2011, 271, 2826–2835.
23. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
24. Kramer, M.A. Nonlinear principal component analysis using autoassociative neural networks. AIChE J. 1991, 37, 233–243.
25. Wang, Y.; Yao, H.; Zhao, S. Auto-encoder based dimensionality reduction. Neurocomputing 2016, 184, 232–242.
26. Fan, J.; Wang, W.; Zhang, H. AutoEncoder based high-dimensional data fault detection system. In Proceedings of the 2017 IEEE 15th International Conference on Industrial Informatics (INDIN), Emden, Germany, 24–26 July 2017; pp. 1001–1006.
27. Xiong, P.; Wang, H.; Liu, M.; Liu, X. Denoising autoencoder for eletrocardiogram signal enhancement. J. Med. Imaging Health Inform. 2015, 5, 1804–1810.
28. Shao, H.; Deng, X.; Jiang, Y. A novel deep learning approach for short-term wind power forecasting based on infinite feature selection and recurrent neural network. J. Renew. Sustain. Energy 2018, 10, 043303.
29. Zhang, Y.; Liu, Q.; Song, L. Sentence-state lstm for text representation. arXiv 2018, arXiv:1805.02474.
30. Xu, Z.; Hu, J.; Deng, W. Recurrent convolutional neural network for video classification. In Proceedings of the 2016 IEEE International Conference on Multimedia and Expo (ICME), Seattle, WA, USA, 11–15 July 2016; pp. 1–6.
31. Lipton, Z.C.; Berkowitz, J.; Elkan, C. A critical review of recurrent neural networks for sequence learning. arXiv 2015, arXiv:1506.00019.
32. Hochreiter, S. The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 1998, 6, 107–116.
33. Bengio, Y.; Simard, P.; Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 1994, 5, 157–166.
34. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
35. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Gated feedback recurrent neural networks. In Proceedings of the International Conference on Machine Learning, Lille, France, 7–9 July 2015; pp. 2067–2075.
36. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
37. Domínguez-Monferrer, C.; Fernández-Pérez, J.; De Santos, R.; Miguélez, M.; Cantero, J. Machine learning approach in non-intrusive monitoring of tool wear evolution in massive CFRP automatic drilling processes in the aircraft industry. J. Manuf. Syst. 2022, 65, 622–639.

Figure 1. Experimental setup for the CFRP/CFRP stack drilling tests: CNC drill press and sensor monitoring system.

Figure 2. Tool wear curves obtained via third-order polynomial interpolation of the measured values for the experimental tests T1, T2, T3, and T4 reported in Table 1.

Figure 3. An example of thrust force signal segmentation for the first hole of the experimental test T2.

Figure 4. Scheme of the smart sensor monitoring architecture developed in this work. Signals from different sensors are acquired using the acquisition system described in the introduction and the pre-processing methodology presented in Section 2.2. The raw signals are fed as input to an autoencoder neural network that extracts features from the signals. These features are fed as input to a memory-based neural network that estimates the tool wear at the end of the currently drilled hole. The two software modules are trained, tested, and deployed on an industrial GPU.

Figure 5. A typical autoencoder architecture consisting of encoder (extracting features from the input in the latent space) and decoder (reconstructing the original input from the latent representation).

Figure 6. Original and autoencoder-reconstructed thrust force signal for one of the holes executed in the experimental test T4 with process parameters 7500 rpm—0.2 mm/rev.

Figure 7. Feature extraction module: the outputs of the encoding parts of the four trained autoencoders are concatenated with the number of holes processed by the tool to form the feature vector.
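The concatenation described in the Figure 7 caption can be sketched as follows. The latent sizes are read from the middle (bottleneck) entries of Table 2, giving 8 + 4 + 4 + 4 = 20 encoded features plus the hole count, i.e., a 21-element vector; this reading of Table 2, the signal ordering, and all names in the code are assumptions made for illustration.

```python
# Bottleneck sizes assumed from the middle entries of Table 2.
LATENT_SIZES = {"force": 8, "torque": 4, "acoustic_emission": 4, "vibration": 4}


def build_feature_vector(latents, hole_number):
    """Concatenate the four encoder outputs and append the hole count."""
    vector = []
    for name, size in LATENT_SIZES.items():
        code = latents[name]
        if len(code) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(code)}")
        vector.extend(code)
    vector.append(float(hole_number))
    return vector


# Hypothetical encoder outputs for one drilled hole.
latents = {
    "force": [0.1] * 8,
    "torque": [0.2] * 4,
    "acoustic_emission": [0.3] * 4,
    "vibration": [0.4] * 4,
}
features = build_feature_vector(latents, hole_number=12)
print(len(features))  # 21
```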

Figure 8. A standard RNN architecture. The feedback loop provided by the previous output $U_{t}$ gives memory capabilities to these architectures.

Figure 9. Gated Recurrent Unit (GRU) scheme.

Figure 10. RMSE performance of the different RNN architectures for the four experimental tests. The inputs for all architectures are the output features of the four autoencoders plus the hole number.

Figure 11. Comparison between the RNN prediction for architecture (42, 21) with memory window w = 2 and the experimental tool wear target for experiment T2, with process parameters reported in Table 1. Final root mean squared error (RMSE): $5.3 \cdot 10^{-4}$.

Figure 12. Comparison between the RNN and ANN predictions for architecture (42, 21) and the experimental tool wear target for experiment T2, with process parameters reported in Table 1. Final root mean squared error (RMSE) of the ANN: $6.4 \cdot 10^{-4}$. RMSE of the RNN: $5.3 \cdot 10^{-4}$.

Table 1. Summary of experimental conditions.

Experiment ID	Feed Rate [mm/rev]	Spindle Speed [rpm]
T1	0.11	2700
T2	0.2	6000
T3	0.15	6000
T4	0.2	7500

Table 2. Architecture used for each autoencoder.

Signal	Hidden Layer Sizes (Neurons)
Force ($F_{z}$)	(16, 8, 16)
Torque ($T_{z}$)	(16, 4, 16)
Acoustic emission RMS ($AE$)	(16, 4, 16)
Vibration ($V$)	(16, 4, 16)

Table 3. Spearman correlation coefficient, $S_{s}$, between autoencoder-extracted features and tool wear values.

Feature	$S1_{s}$	$S2_{s}$	$S3_{s}$	$S4_{s}$
1	0.664	0.928	−0.753	0.452
2	−0.511	0.873	0.884	0.832
3	0.762	0.691	0.855	−0.666
4	0.593	0.692	−0.918	−0.311
5	0.821	−0.795	0.744	−0.749
6	−0.971	0.944	0.826	0.812
7	0.854	−0.88	0.261	0.803
8	−0.662	−0.857	0.903	−0.803
9	−0.679	0.880	−0.863	−0.821
10	0.693	−0.896	0.858	−0.802
11	−0.869	−0.872	−0.960	0.802
12	0.873	−0.972	−0.884	0.262
13	0.811	−0.932	−0.969	0.831
14	0.677	−0.933	−0.978	−0.210
15	0.714	−0.916	−0.892	0.886
16	−0.757	−0.975	−0.969	−0.951
17	0.662	−0.973	−0.205	0.744
18	0.674	−0.973	−0.793	0.701
19	0.757	−0.947	0.685	0.733
20	0.986	−0.353	−0.966	0.984
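For readers who wish to reproduce this kind of analysis, the Spearman coefficient between one extracted feature and the tool wear values can be computed as the Pearson correlation of their ranks. The sketch below is a generic stdlib implementation with average ranks for ties; the function names and the example data are hypothetical and do not come from the experimental datasets.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical feature values and wear values for six holes.
feature = [0.91, 0.85, 0.80, 0.82, 0.70, 0.65]
wear = [0.010, 0.014, 0.019, 0.023, 0.028, 0.033]
print(round(spearman(feature, wear), 3))
```

A coefficient close to +1 or −1 indicates that the feature varies monotonically with tool wear, which is the property the table is used to check.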

Table 4. Reconstruction performance for all datasets, in terms of mean squared error (MSE), for each acquired signal: force, torque, acoustic emission RMS, and vibration.

Signal	MSE T1	MSE T2	MSE T3	MSE T4
Force	14.9	14.3	19.5	20.26
Torque	4.88	1.47	1.04	4.01
Acoustic emission	2.05 · $10^{- 6}$	1.27 · $10^{- 5}$	4.88 · $10^{- 5}$	4.87 · $10^{- 5}$
Vibration	1 · $10^{- 4}$	1.39 · $10^{- 3}$	2.32 · $10^{- 3}$	2.22 · $10^{- 3}$

Table 5. Main characteristics of the NVIDIA Jetson Nano device.

CPU	Quad-core ARM A57, max frequency 1.43 GHz
GPU	128-core NVIDIA Maxwell architecture-based GPU, 512 GFLOPS (FP16)
Memory	4 GB 64-bit LPDDR4; 25.6 GB/s
OS	Linux for Tegra

Table 6. Comparison of the different architectures used in this work. The inputs for all architectures are the outputs of the four autoencoders (features) and the number of holes made by the tool.

Experiment	Architecture	Memory Window	RMSE
1	(21)	2	1.2 · $10^{- 3}$
1	(42, 21)	2	9.5 · $10^{- 4}$
1	(42, 21)	4	9.5 · $10^{- 4}$
2	(21)	2	6.2 · $10^{- 4}$
2	(42, 21)	2	5.3 · $10^{- 4}$
2	(42, 21)	4	5.1 · $10^{- 4}$
3	(21)	2	5.4 · $10^{- 4}$
3	(42, 21)	2	4.56 · $10^{- 4}$
3	(42, 21)	4	3.45 · $10^{- 4}$
4	(21)	2	1.1 · $10^{- 3}$
4	(42, 21)	2	9.2 · $10^{- 4}$
4	(42, 21)	4	9.1 · $10^{- 4}$

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
