Structural Vibration Tests: Use of Artificial Neural Networks for Live Prediction of Structural Stress

One of the ongoing tasks in space structure testing is the vibration test, in which a given structure is mounted onto a shaker and excited by a certain input load on a given frequency range, in order to reproduce the rigor of launch. These vibration tests need to be conducted in order to ensure that the devised structure meets the expected loads of its future application. However, the structure must not be overtested to avoid any risk of damage. For this, the system’s response to the testing loads, i.e., stresses and forces in the structure, must be monitored and predicted live during the test. In order to solve the issues associated with existing methods of live monitoring of the structure’s response, this paper investigated the use of artificial neural networks (ANNs) to predict the system’s responses during the test. Hence, a framework was developed with different use cases to compare various kinds of artificial neural networks and eventually identify the most promising one. Thus, the conducted research accounts for a novel method for live prediction of stresses, allowing failure to be evaluated for different types of material via yield criteria.


Introduction
In the space industry, the launch evidently dominates structural requirements. Therefore, in order to demonstrate that a structure will survive the launch, it is analyzed using the finite element method (FEM) and tested in vibration test facilities [1]. During a vibration test, accelerations are usually monitored in order to assess the loads that the structure is experiencing. Ideally, load cells are also installed at the interface of the structure to directly monitor the interface loads and compare them against the design loads. This, however, is not always possible because the use of load cells or strain gauges has many technical, operational, and financial drawbacks [2]. Consequently, the input of the vibration test, i.e., the excitation load of the structure under test, is often adjusted based on the measured accelerations rather than on loads or stresses [3].
One specific example concerns the case where loads need to be monitored at the interface of a subsystem that is part of a larger complex system such as the James Webb Space Telescope (JWST) (Figure 1). The JWST is composed of several subsystems, each of which was tested separately before integration on the JWST. The bottom of Figure 1 depicts the mechanical test campaigns in which the Near-Infrared Spectrograph (NIRSpec) has been involved: the NIRSpec optical assembly stand-alone test (OA), followed by the integrated science and instrument module test (ISIM) and the optical telescope assembly test (OTE + ISIM = OTIS). The last mechanical test prior to launch has recently been conducted.
One approach to estimating the loads is to use the coil current from the shaker, since the applied load can be correlated with the shaker current. However, this approach can only be used to estimate the load in the excitation direction [6]. Strain gauges could be used to recover strains at interfaces and thus loads. However, they require careful calibration to provide a robust indirect measurement of the interface loads. A force measurement device provides six global interface forces or moments and local load cell forces during vibration testing, allowing the measurement of the local forces in three orthogonal directions [6]. However, such devices are not available in every test facility. Moreover, they are costly, take up space that is not accounted for in the design, and often change the system's response, so they must be included in all test prediction analyses [2,5].
Appl. Sci. 2020, 10, 8542
The mass operator is a mathematical tool used to derive loads from measured accelerations [3]. It uses measured accelerations to calculate the interface loads or stresses representative of the physical state of a structure. A simple example of a mass operator approach is the sum of weighted accelerations (SWA), which is simply an application of Newton's second law, F = ma, where a is a vector of measured accelerations, m is an equivalent mass matrix, and F is the vector of loads at chosen interfaces. Generally, the mass operator is created before a vibration test based on the results of finite element analyses. The actual computation of the mass matrix can be performed using one of several techniques. With these data, it is then possible during the vibration test to calculate interface loads from the accelerations actually measured by the sensors, with no additional hardware [2,3,5]. The authors of [3] provide an extensive review and comparison of mass operators, among them the fitted SWA, the frequency-dependent SWA, and the artificial neural network (ANN).
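The SWA relation F = ma can be illustrated with a few lines of Python. This is only a minimal sketch: the function name and all numerical values are hypothetical, and a real equivalent mass matrix would come from finite element results rather than being typed in by hand.

```python
def sum_of_weighted_accelerations(mass_matrix, accelerations):
    """Return the load vector F = m a.

    mass_matrix   -- one row of mass coefficients per interface load
    accelerations -- one measured acceleration per sensor
    """
    if any(len(row) != len(accelerations) for row in mass_matrix):
        raise ValueError("each row needs one coefficient per sensor")
    # Each load is the sum of the accelerations weighted by its row.
    return [sum(m * a for m, a in zip(row, accelerations))
            for row in mass_matrix]


# Hypothetical example: two interface loads, three accelerometers.
mass_matrix = [[2.0, 1.0, 0.5],
               [0.0, 3.0, 1.0]]        # equivalent masses in kg
accelerations = [10.0, -4.0, 2.0]      # measured accelerations in m/s^2
loads = sum_of_weighted_accelerations(mass_matrix, accelerations)  # in N
```

Because the operation is a single matrix-vector product, it is cheap enough to evaluate at every measurement step during a test.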
The fitted SWA is the most straightforward method to calculate mass coefficients. It consists of defining the mass coefficients as design variables of a minimization or curve-fitting problem [2,3], in which the error E between the response calculated with the finite element method and the response provided by the mass operator is minimized [3]. However, this method works well only over small frequency ranges with few modes. To solve this issue, the authors of [3] considered a frequency-dependent SWA where the frequency range is split into subranges and a fitted SWA is created independently for each subrange. This method is, however, not well suited to closely spaced modes. In order to generalize the definition of the mass operator, the authors of [3] presented the use of an artificial neural network (ANN) in two different approaches. The ANN can be used to calculate mass coefficients based on input frequencies and accelerations; this is then a generalization of the frequency-dependent SWA. The ANN can also be used to directly provide the force from accelerations and frequency inputs; this is the most general definition of an operator that can convert measured accelerations into quantities of interest such as forces. Both approaches showed great potential for load estimation [3], but the latter approach has shown many drawbacks, especially regarding the ability to generalize the mass operator as an ANN if the tested structure differs from the analyzed one due to uncertainties such as boundary conditions or material properties. Furthermore, a mass operator as an ANN has not been investigated for the estimation of internal structural stresses.
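As an illustration of the fitted SWA minimization described above, a least-squares form of the error can be written down. This is an assumed form and not necessarily the expression used in [3]; the symbols K (number of frequency steps f_k), n (number of accelerometers), m_j (mass coefficients), and F^FEM (load computed by the finite element method) are introduced for this sketch only.

```latex
E(m_1, \dots, m_n) = \sum_{k=1}^{K} \left| F^{\mathrm{FEM}}(f_k) - \sum_{j=1}^{n} m_j \, a_j(f_k) \right|^2,
\qquad
(m_1, \dots, m_n)^{*} = \arg\min_{m_1, \dots, m_n} E
```

Under the same assumption, the frequency-dependent SWA would repeat this fit with a separate coefficient set for each frequency subrange.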
In the last few years, ANNs have shown many successful applications in various domains, from monitoring structural health [7] to predicting tool life [8]. In [9], convolutional ANNs are used to predict vibrations. In a civil structure, vibration-based structural damage can meanwhile be detected using methods based on machine learning [10]. This paper aims to contribute to this expanding field in structural mechanical engineering by expanding the work done in [3] on the use of ANNs. First and foremost, research work was performed on a large-scale structure within an industrial environment. Second, in addition to standard responses such as acceleration response, stresses were successfully predicted. Moreover, several types of neural networks were investigated that could be used to directly convert measured accelerations into structural stresses and hence enable the live prediction of stress during the vibration test. First, the general methods and considered ANNs are presented. Then, a use case is considered in order to test the different ANNs and get a better understanding and confidence about their ability to predict interface loads or stresses in a robust way. Finally, the paper concludes with a discussion on the findings and potential operational use of the proposed approaches.

Materials and Methods
In practice, mass operators are built using accelerations and stresses or loads. In this case, the accelerations, loads, and stresses were computed using the finite element method [11,12]. Once the mass operators were built and verified, they were deployed during the test to compute stresses and loads based on measured accelerations. In this paper, only ANNs are considered for creating mass operators and MATLAB 2018b (Mathworks, Natick, MA, USA) was used to create and train the proposed ANN.
A prediction of the structure's response is provided by the finite element analysis (FEA) data. The FE model in this specific case needed to cover two main aspects. One aspect was the accurate modeling of NIRSpec's ceramic bench, as this is the instrument being designed by AIRBUS and is one of the most sensitive parts. The other aspect was the compliance of the surrounding structure, i.e., the structural elements onto which NIRSpec was mounted. The latter aspect was addressed by conducting a coupled load analysis (CLA), where NIRSpec was represented by a standard FE model and the remaining structures, for instance, the instrument module, the optical telescope, and the spacecraft elements, were represented by stiffness-representative super-elements. From this CLA, only the forces and moments acting on NIRSpec were derived. To capture the dynamic compliance of the overall structure, phase information was considered as well. Next, the interface load input was condensed by only considering frequency steps in the vicinity of peaks in the direct response as well as in the cross-response. This condensation reduced the input size from roughly 42,000 frequency support points to 700 (1.7%). This considerably reduced the computational effort for stress calculation and post-processing on the detailed FE model, thereby allowing detailed investigation of the mechanically interesting frequency ranges and addressing the first aspect of the FE approach, namely the detailed stress prediction on the ceramic bench.
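The condensation step can be sketched as follows. The paper does not state how peaks were detected or how wide the retained vicinity was, so the local-maximum test and the `half_width` parameter below are assumptions, and only a single response curve is considered instead of the direct and cross-responses.

```python
def condense_near_peaks(amplitudes, half_width):
    """Return the sorted indices within half_width steps of a local maximum."""
    keep = set()
    # Only interior points can be tested against both neighbours.
    for i in range(1, len(amplitudes) - 1):
        if amplitudes[i - 1] < amplitudes[i] >= amplitudes[i + 1]:
            first = max(0, i - half_width)
            last = min(len(amplitudes) - 1, i + half_width)
            keep.update(range(first, last + 1))
    return sorted(keep)


# Hypothetical response amplitudes over ten frequency steps.
response = [1.0, 1.2, 5.0, 1.1, 1.0, 0.9, 0.8, 3.0, 0.7, 0.6]
kept = condense_near_peaks(response, half_width=1)
```

Only the retained indices would then be passed on to the detailed stress calculation.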
However, to use FE models for deriving predictions, one has to assume damping. This assumption is the major contributor to potential discrepancies, together with overall system nonlinearities stemming from interface mechanics, secondary structures such as harnesses, implemented damper elements, and the like.
This implies that the real-life physical state of the structure, namely the interface forces and stresses, needs to be predicted live during the vibration test based on the actual response of the system. Only then is it possible to adequately adapt the testing level to protect the structure. Unfortunately, only a limited set of data about the state of the structure is available, such as the measured accelerations at discrete locations on the structure [2]. From these accelerations, the stresses or forces acting in the tested structure need to be derived using a dedicated method, such as mass operators or ANNs, which is the subject of this investigation. Any method must meet the following requirements:

• Robustness with regard to natural frequency shifts during testing as compared to the ones computed with FEA.
• Fast deployment during vibration testing in order to react adequately to the resulting responses; the effort of the post-processing model during the test must be small.
• Accuracy for stresses and the interface forces. In this investigation, the von Mises yield criterion was used to monitor the state of the NIRSpec module.
• Fast training, data acquisition, processing, and configuration.
• Robustness with regard to the lack of sensors. As the number of available sensors during the test is restricted, the method must be accurate with a limited number of sensors and, at the same time, potentially inconveniently positioned ones [2,3].
Artificial neural networks (ANNs) mimic the mechanisms by which the human brain transfers data from one neuron to another (see Figure 2). They consist of connected layers, each with a defined number of neurons. A neuron is similar to a computing block defined by an activation function, a set of weights and biases, an input, and an output. For more complex problems, a number of hidden layers can be inserted. Data are propagated through the ANN, and the output of each layer represents the input of the next layer. The data supplied to an ANN for training usually comprise the features and the targets. The feature data are used to predict the target data. In the case of mass operators, the features are the accelerations while the targets are the stresses. Such an ANN architecture can be described as a feedforward neural network [13]. If p is considered to be the input to a neuron and b the neuron's bias, then the output of that neuron is a = f(w · p + b), where f represents the neuron activation function and w is a weighting factor. While f is chosen with regard to the problem to be solved, w and b are both parameters that are calculated based on a learning rule during the training [13]. During training, the network's neurons are first initialized, i.e., a random set of weights and biases is attributed to each neuron, and an activation function is assigned to each neuron in the layer. Then, the training data are forward-propagated through the network; each neuron applies its weights and biases and its activation function to the input and produces an output, which is further propagated until the data reach the output layer. Afterward, an error function E is evaluated, usually the mean squared error (MSE) between the calculated outputs $Y_i$ and the target values $T_i$:

$$E = \frac{1}{N} \sum_{i=1}^{N} \left( T_i - Y_i \right)^2$$

where N is the number of compared values. Finally, the error is back-propagated through the network in order to identify the neurons that are responsible for the error. The latter are then adapted to minimize the error; specifically, their weights and biases are altered, while the connections of the neurons producing a low error are reinforced in this process [14].
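The neuron output a = f(w · p + b) and the mean squared error can be written out directly. The sketch below is illustrative only: the function names and numbers are hypothetical, the hyperbolic tangent is chosen as activation because it is the one named later in the paper, and no training step is included.

```python
import math


def neuron_output(weights, inputs, bias, activation=math.tanh):
    """Output of one neuron: a = f(w . p + b)."""
    weighted_sum = sum(w * p for w, p in zip(weights, inputs))
    return activation(weighted_sum + bias)


def mean_squared_error(outputs, targets):
    """Mean of the squared differences between targets and outputs."""
    squared = [(t - y) ** 2 for y, t in zip(outputs, targets)]
    return sum(squared) / len(squared)


# Hypothetical values: the weighted sum is 0.5*2 - 0.25*4 = 0, so a = tanh(0).
a = neuron_output([0.5, -0.25], [2.0, 4.0], bias=0.0)
# Errors are 0 and 2, so the MSE is (0 + 4) / 2.
mse = mean_squared_error([1.0, 2.0], [1.0, 4.0])
```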
The recurrent ANN is capable of exhibiting a dynamic behavior where the output of one layer can also be used as the input for a preceding layer. This makes it then possible for the neural network to create a temporary memory and process sequences of inputs [15]. This is particularly relevant as vibration tests are performed using frequency sweep where, for example, the frequency increases with time.
In this study, four different neural network models are compared to each other:

• A frequency-dependent ANN (see Figure 3a): a feedforward ANN with the frequency values as additional feature data, as in [3]. This ensures that the data are associated with the corresponding frequency.
• A pretrained ANN: a feedforward ANN trained in two steps, in order to give special attention to the natural frequencies, which represent the most critical frequencies during the vibration test with respect to accelerations and stresses.

• A nonlinear autoregressive exogenous (NARX) model (see Figure 3b): a recurrent ANN to depict the sequential nature of the input data, taking into account the last time step before making a prediction about the next one. Besides the external feature sequence $u_t, u_{t-1}, u_{t-2}, u_{t-3}, \ldots$, the targets $y_t$ of the network are also used as features: a delayed version of them, $y_{t-1}, y_{t-2}, y_{t-3}, \ldots$, is fed back into a feedforward network, according to [16], by $y_t = f(y_{t-1}, y_{t-2}, y_{t-3}, \ldots, u_t, u_{t-1}, u_{t-2}, u_{t-3}, \ldots)$. While the benefit of such an ANN is its memory of the past values, the disadvantage of the NARX model is that each time step t of the sequence is treated as an independent layer. This can lead to an extremely deep ANN, resulting in an increase in computational time.

• A recurrent ANN with a bidirectional long short-term memory layer (biLSTM): a recurrent ANN with a biLSTM layer to depict the sequential nature of the input data, taking into account both the preceding and the following time steps for every prediction. The biLSTM layer consists of a cell state and three different gates, namely the input, the output, and the forget gate. With this structure, an ANN with an LSTM layer is able to work with a memory. The prefix bi comes from the fact that it is able to use data from prior as well as following time steps. The input gate determines how much of a new value is used as input into the cell, the forget gate determines how much of the cell state is to be forgotten, and the output gate determines how much of the cell state is used to compute the output (hidden state) passed on to the next cell. These elements are combined through several functions as well as matrix operations. More information regarding the mechanisms of biLSTM layers can be found in [17].

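To make the NARX relation y_t = f(y_{t-1}, ..., u_t, u_{t-1}, ...) from the list above more concrete, the following sketch builds the feature vectors that such a model would see. It is an illustration under simplifying assumptions: the function name and delay parameters are hypothetical, u is a scalar sequence rather than a vector of accelerations, and known targets are fed back (open-loop form).

```python
def narx_open_loop_samples(u, y, input_delays, feedback_delays):
    """Build (features, target) pairs for y_t = f(y_{t-1..}, u_t, u_{t-1..}).

    u -- external feature sequence (e.g. one acceleration channel)
    y -- target sequence (e.g. one stress channel)
    """
    start = max(input_delays, feedback_delays)
    samples = []
    for t in range(start, len(y)):
        # Delayed targets y_{t-1}, ..., y_{t-feedback_delays}.
        past_targets = [y[t - k] for k in range(1, feedback_delays + 1)]
        # Current and delayed inputs u_t, ..., u_{t-input_delays}.
        inputs = [u[t - k] for k in range(0, input_delays + 1)]
        samples.append((past_targets + inputs, y[t]))
    return samples


# Hypothetical short sequences.
u = [0.1, 0.2, 0.3, 0.4, 0.5]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
samples = narx_open_loop_samples(u, y, input_delays=1, feedback_delays=2)
```

Each feature vector grows with the number of delays, which is one way to see why long memories increase the computational cost.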

Data Generation
The data used to develop the proposed method represent the harmonic response of the system over the frequency range over which the structure will be tested, typically 5 Hz to 100 Hz. In this study, in order to train and evaluate the networks to compare the different ANNs, data had to be generated for the three different scenarios. The training, testing, and validation data were generated by conducting a finite element harmonic analysis to compute the accelerations and stresses or forces at given nodes and elements, respectively, over a determined frequency range (5-200 Hz, step of 2 Hz).
This data set was complemented by another set of data that was generated by conducting a finite element harmonic analysis over the same frequency range, with the same structure but different material properties. The Young's modulus of the JWST's optical bench was decreased by 5% in order to shift the natural frequencies of the structure, and to account for material property uncertainty. The remaining material properties were left unchanged. These artificial data helped the trained models to generalize and make better predictions when the material of the test structure was not identical to the material data considered in the finite element analysis.
The data were then divided into a training set, a testing set, and a validation set to enable the assessment of the training progress and process. To improve training, the data at the natural frequencies of the structure were always included in the training data set, while the remaining frequencies were randomly distributed between the training and the validation data sets. This ensured that the model learned the relationships at the natural frequencies, which are the most critical because the structure experiences its highest stress amplitudes there. In general, a small random number of frequencies can also be used as a test set to evaluate the model's accuracy. However, in this investigation, the models were assessed on independently generated test data with a changed Young's modulus. In this way, uncertainties, as experienced in reality, were taken into account.
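The split described above, with natural frequencies forced into the training set and the rest distributed at random, might look as follows. This is a sketch only: the validation fraction, the random seed, and the natural frequencies are hypothetical values, since the paper does not report them.

```python
import random


def split_by_natural_frequency(frequencies, natural_frequencies,
                               validation_fraction, seed=0):
    """Keep natural frequencies in training; split the rest at random."""
    natural = set(natural_frequencies)
    forced_training = [f for f in frequencies if f in natural]
    remaining = [f for f in frequencies if f not in natural]
    rng = random.Random(seed)  # fixed seed for a reproducible split
    rng.shuffle(remaining)
    n_validation = int(round(validation_fraction * len(remaining)))
    validation = sorted(remaining[:n_validation])
    training = sorted(forced_training + remaining[n_validation:])
    return training, validation


# Frequency grid from 5 Hz in steps of 2 Hz, staying below 200 Hz.
frequencies = list(range(5, 201, 2))
natural = [35, 87, 141]  # hypothetical natural frequencies in Hz
training, validation = split_by_natural_frequency(frequencies, natural, 0.2)
```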

Data Processing
To improve training and reduce the complexity of the problem to be solved, while increasing accuracy and speeding up the training process, the data of the various observations should be normalized. Every observation was scaled to be in a range from minus one to one. To make usable predictions during the test, the scaling parameters should be stored to denormalize the predictions to real-life figures [13].
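The scaling to the range from minus one to one, and the inverse step needed to turn predictions back into physical values, can be sketched as below. The function names are hypothetical, and the sketch scales one channel at a time; whether the authors scaled per channel, or which MATLAB routine they used, is not stated in the text.

```python
def fit_minmax(values):
    """Return the scaling parameters (minimum, maximum) to be stored."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("constant data cannot be scaled to [-1, 1]")
    return low, high


def normalize(values, low, high):
    """Map values linearly so that low -> -1 and high -> +1."""
    return [2.0 * (v - low) / (high - low) - 1.0 for v in values]


def denormalize(scaled, low, high):
    """Invert normalize() using the stored scaling parameters."""
    return [(s + 1.0) * (high - low) / 2.0 + low for s in scaled]


# Hypothetical stress values in MPa.
stress = [10.0, 20.0, 30.0]
low, high = fit_minmax(stress)
scaled = normalize(stress, low, high)
restored = denormalize(scaled, low, high)
```

The parameters returned by `fit_minmax` must come from the training data and be reused unchanged during the test, otherwise the denormalized predictions would be on the wrong scale.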


Academic Use Case
For the first scenario, the theoretical case consisted of a very simple structure. It served as a benchmark to determine whether the method would be successful. The structure used for this scenario can be seen in Figure 4. The accelerations of 68 of the 90 nodes of the structure were used to predict the base force of the structure in element 100 (highlighted in Figure 4). The use of this excessive number of sensors, which is never available in reality, enabled the assessment of the overall feasibility of the method. If the method failed to predict the structure's base force even in this case, it could be deemed impractical. Furthermore, for this first scenario, the base force and not the stress was predicted from the accelerations because its relation to the measurable acceleration is more straightforward.
The second scenario represented a variation of the first scenario, where only six sensors were used to predict the base force, as highlighted in Figure 4. This reduced number of sensors reflects reality, where the number of available measuring points is highly restricted. Thus, it makes it possible to estimate the method's performance in a more realistic case with a limited number of sensors.

Industrial Use Case
Last but not least, the NIRSpec use case represented an application of the method to a real and complex structure with a reduced number of sensors while predicting the element stress. Consequently, if the models are able to make accurate predictions for all three scenarios, the method can be considered useful.
The considered use case scenario concerns an actual structure corresponding to the NIRSpec instrument's optical bench. The optical bench is equipped with ten sensors to predict the stress in one element (see Figure 5). This case makes it possible to evaluate the potential of the method for a real and complex structure with more complex eigenmodes and a limited number of sensors. In this use case, the stress is to be predicted because it represents a good indicator for the structure's physical state and enables the evaluation of the model's performance to predict other metrics than the force, as in [2].

In order to determine the stresses at the highlighted elements in Figure 5, an FEA was conducted with MSC NASTRAN version 2018.1.0. The structure was discretized by 96,073 nodes and 104,182 elements, ranging from one-dimensional elements (i.e., rods and beams) through shell elements (triangular and quadrangular) to solid elements (tetrahedral, hexahedral, and pentahedral). As boundary conditions, the FE model was subjected to forces and moments for each kinematic mount derived from the CLA, where phase information was provided as well. This approach is referred to as the multi-excitation method (MEM). All dynamic analyses were based on modal decomposition, and they are therefore modal frequency response analyses ranging from 5 Hz to 200 Hz. For each of these frequency steps, the von Mises stress was evaluated at the ten selected elements and used to train the ANN. It should be noted that this equivalent stress was used for this paper only. AIRBUS has developed a dedicated equivalent stress suited to predicting ceramic failure.
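For reference, the von Mises equivalent stress mentioned above can be computed from the six components of a stress tensor as sketched below. The function name and the numerical values are illustrative, and the sketch covers a single real-valued stress state only; how the amplitude and phase of a harmonic response were combined in the actual analysis is not described in the text.

```python
import math


def von_mises_stress(sx, sy, sz, txy, tyz, tzx):
    """Von Mises equivalent stress from normal and shear components."""
    normal_part = 0.5 * ((sx - sy) ** 2 + (sy - sz) ** 2 + (sz - sx) ** 2)
    shear_part = 3.0 * (txy ** 2 + tyz ** 2 + tzx ** 2)
    return math.sqrt(normal_part + shear_part)


# Uniaxial tension: the equivalent stress equals the applied stress.
uniaxial = von_mises_stress(100.0, 0.0, 0.0, 0.0, 0.0, 0.0)
# Pure shear: the equivalent stress is sqrt(3) times the shear stress.
pure_shear = von_mises_stress(0.0, 0.0, 0.0, 50.0, 0.0, 0.0)
```

Comparing such a scalar against an allowable value is what makes a live stress prediction usable as a test abort criterion.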
Table 1 summarizes the number of neurons for the different models. The ideal number of neurons was determined in a trial and error way, aiming for the best performance of the MSE while keeping the number of neurons small. The number of delays of the NARX model was determined in the same way. As objective function, the mean squared error (MSE) was used for all the models. While the input differs for each model (see Table 1), the element stress or the base force were used as feature data for all ANN. Furthermore, except for the biLSTM, the Nguyen-Widrow layer initialization function [18] was used to generate the initial weights and biases of the neurons for all ANNs. For the biLSTM, the input weights were initialized with the Glorot/Xavier initializer [19], using an orthogonal initialization for the recurrent weights, while the forget gate bias was initialized with ones and the remaining biases with zeros. All models also share the same activation function, namely the hyperbolic tangent sigmoid, except for the biLSTM, which uses the sigmoid function for the gate, the hyperbolic tangent function for the cell state and hidden state, and the linear activation function for the regression layer. The used training algorithm is also indicated in Table 1. The NARX model was designed in open-loop form, where the input targets were used as feedback features. The model used as many inputs as sensors and had one hidden layer, a defined number of delays, and one output layer per stress (see Table 1). The network with a biLSTM layer consisted of a sequence input layer with as many neurons as inputs, followed by a biLSTM layer. Then, there was one fully connected layer and, lastly, the regression output layer with its linear activation function and as many neurons as outputs.
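To make the open-loop NARX arrangement described above more concrete, the following minimal sketch builds the regressor rows that such a model would see: delayed sensor readings together with delayed true targets used as feedback. The function name `build_open_loop_samples`, the ordering of the delayed values, and the toy data are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from typing import List, Sequence, Tuple


def build_open_loop_samples(
    inputs: Sequence[Sequence[float]],
    targets: Sequence[float],
    n_delays: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build (regressor, label) pairs for an open-loop (series-parallel) NARX model.

    inputs[k]  : sensor readings at frequency step k (one value per sensor)
    targets[k] : known target (e.g., element stress) at frequency step k
    n_delays   : number of past steps fed to the network (hypothetical setting)
    """
    if n_delays < 1:
        raise ValueError("n_delays must be at least 1")
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")

    regressors: List[List[float]] = []
    labels: List[float] = []
    # The first n_delays steps only fill the delay lines and yield no sample.
    for k in range(n_delays, len(targets)):
        row: List[float] = []
        for d in range(1, n_delays + 1):
            row.extend(inputs[k - d])    # delayed sensor readings
        for d in range(1, n_delays + 1):
            row.append(targets[k - d])   # delayed true targets (open-loop feedback)
        regressors.append(row)
        labels.append(targets[k])
    return regressors, labels


if __name__ == "__main__":
    # Toy data: 4 frequency steps, 2 sensors, 2 delays (all values invented).
    X, y = build_open_loop_samples(
        [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
        [1.0, 2.0, 3.0, 4.0],
        n_delays=2,
    )
    for row, label in zip(X, y):
        print(row, "->", label)
```

In this sketch, the first `n_delays` frequency steps produce no training sample because they are consumed by the delay lines, which is one plausible reading of why the first steps of a NARX prediction are treated separately.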

Results
After the setup of the architecture of the different models, they were trained on the settings listed in Table 1. Figure 6a,b shows an example of the learning curves for the pretrained ANN and the NARX model for the industrial use case, respectively. The blue curves represent the MSE over the training epochs for the training data, the green curves represent the error for the validation data, and the red line represents the error for the test data. The green circle marks the optimal validation performance. The training curves of every model decrease (as clearly shown in Figure 6 for the pretrained ANN and the NARX model), indicating that the models are able to learn the underlying data. The remaining gap between the validation curves and the training curves can be ascribed to the generalization of the data. As the final validation error is not too large, training can be concluded to be successful. After the training, the models were deployed on the test data. The results of their predictions can be seen in the following section.
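The notion of an "optimal validation performance" on a learning curve can be illustrated with a short sketch: the MSE is computed per epoch, and the epoch with the lowest validation MSE is selected. The helper names `mse` and `best_validation_epoch` and the example curve are hypothetical and serve only to illustrate the idea; the actual training tool may apply a different stopping rule.

```python
from typing import Sequence


def mse(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """Mean squared error between target and predicted values."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("targets and outputs must be non-empty and equally long")
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def best_validation_epoch(val_mse: Sequence[float]) -> int:
    """Return the index of the epoch with the lowest validation MSE.

    If several epochs share the minimum, the earliest one is returned.
    """
    if not val_mse:
        raise ValueError("val_mse must not be empty")
    return min(range(len(val_mse)), key=lambda epoch: val_mse[epoch])


if __name__ == "__main__":
    # Invented validation curve: the error decreases, then rises again.
    validation_curve = [0.90, 0.50, 0.30, 0.35, 0.40]
    epoch = best_validation_epoch(validation_curve)
    print("best epoch:", epoch, "validation MSE:", validation_curve[epoch])
```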
The evaluation of the training on the theoretical cases shows that the NARX model is extremely sensitive to the resolution of the frequency range. Dividing the frequency range into 600 rather than 100 steps considerably improves the quality of training. For the feedforward and the biLSTM networks, the finer resolution makes no difference in quality and only increases computation time.

Discussion
In this section, the results of the three different use cases are discussed. To this end, the different models' predictions of the test data are compared to the FEA and evaluated with a regression analysis.

Theoretical Case with 68 Sensors
The trained models were deployed to make predictions using the testing feature data. These data were generated by conducting the second FEA and reducing the Young's modulus of the academic structure's material by 5%. As can be seen in Figure 7b, the NARX model makes inaccurate predictions of the first three frequency steps. These steps were used as delays for training. The remaining frequency steps are predicted accurately. The ANN with biLSTM layer was the most delicate to train, and it makes more or less accurate predictions. It wrongly predicts the heights of some peaks, for instance, the peaks at frequency steps 40 and 90, as can be seen in Figure 7c. The frequency-dependent ANN predicts the heights of the peaks correctly (see Figure 7a), whereas the form of the peak at frequency step 50 is poorly predicted. The pretrained ANN (Figure 7d) makes slightly inaccurate predictions about the height and the form of the peak at frequency step 50 as well as the peak at step 90, corresponding to the peaks that are shifted the most in the testing data set compared to the training data set.
Figure 7. Actual and predicted element force over frequency steps for the academic case with 68 sensors: (a) frequency-dependent ANN, (b) NARX with more frequency steps, (c) ANN with bidirectional long short-term memory (biLSTM) layer, and (d) pretrained ANN.
This evaluation can be illustrated by a regression analysis. For this purpose, the predicted values, called output in Figure 8, are plotted against the base force calculated by FEA, referred to as targets, and a regression line is computed. Figure 8 shows the resulting regression plots, where the black dots represent the data points and the blue line represents the regression line. An overview of the respective regression coefficients of the models, i.e., the slope of the regression line in Figure 8, and the root mean square error (RMSE) between the target and the predicted base force allows a quantitative comparison of the models. Table 2 summarizes these metrics for the prediction on the test data. The second values (R = 0.9644 and RMSE = 0.0305 N) for the NARX model are the regression coefficient and the RMSE, respectively, for the data without the first three values representing the delays. It can be determined that the pretrained model has the lowest performance, whereas the frequency-dependent ANN makes the most accurate predictions. The evaluation of the first theoretical case leads to the conclusion that the method has proven to be successful, even though these first predictions were made with an unrealistically large number of sensors.
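The regression metrics discussed above can be sketched in a few lines. The example below computes the slope of the least-squares line of outputs over targets, the Pearson correlation coefficient, and the RMSE, with an optional `skip` argument to drop the leading values that serve as delays. Whether the reported R corresponds to the slope or to the correlation coefficient is not fully clear from the text, so both are returned; the function name `regression_metrics` and the sample numbers are assumptions for illustration only.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    targets: Sequence[float],
    outputs: Sequence[float],
    skip: int = 0,
) -> Dict[str, float]:
    """Compare predictions (outputs) with reference values (targets).

    skip : number of leading samples to ignore, e.g., the frequency steps
           that were used as delays by a NARX model (hypothetical option).
    """
    t = list(targets)[skip:]
    o = list(outputs)[skip:]
    if len(t) != len(o) or len(t) < 2:
        raise ValueError("need at least two paired samples after skipping")

    n = len(t)
    mean_t = sum(t) / n
    mean_o = sum(o) / n
    cov = sum((a - mean_t) * (b - mean_o) for a, b in zip(t, o))
    var_t = sum((a - mean_t) ** 2 for a in t)
    var_o = sum((b - mean_o) ** 2 for b in o)
    if var_t == 0.0 or var_o == 0.0:
        raise ValueError("targets and outputs must not be constant")

    return {
        "slope": cov / var_t,                  # least-squares slope of outputs over targets
        "r": cov / math.sqrt(var_t * var_o),   # Pearson correlation coefficient
        "rmse": math.sqrt(sum((a - b) ** 2 for a, b in zip(t, o)) / n),
    }


if __name__ == "__main__":
    # Invented data: the first value is badly predicted, the rest are exact.
    reference = [0.0, 1.0, 2.0, 3.0, 4.0]
    predicted = [9.0, 1.0, 2.0, 3.0, 4.0]
    print("all samples:   ", regression_metrics(reference, predicted))
    print("without delays:", regression_metrics(reference, predicted, skip=1))
```

Reporting the metrics with and without the leading samples mirrors the two sets of NARX values given in Table 2.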

Theoretical Case with 6 Sensors
The trained models were deployed to predict the test data with the shifted frequencies, resulting in the predictions seen in Figure 9. While, in this case, the biLSTM was not able to make adequate predictions, as can be seen in Figure 9c, the other models predicted the element force mostly accurately. It can be noted that the NARX model's predictions of the first frequency steps used as delays are not accurate, while the remaining curve is correctly predicted, as in Figure 9b. The pretrained ANN as well as the frequency-dependent ANN make slightly wrong predictions about the height and the form of some of the peaks (see Figure 9a,d). A regression analysis reinforces the above observations, as can be seen in Figure 10. The regression coefficients and the RMSE for the predictions on the testing data are summarized in Table 3. While the ANN with biLSTM layer performs worst, resulting from its delicate training, the NARX model makes the most adequate predictions, even with the delays included in the above calculation. The frequency-dependent ANN and the pretrained ANN perform similarly. This second theoretical case shows that most of the ANNs are also successful when the number of sensors is restricted, as is often the case in reality.

NIRSpec Use Case
After the successful training of the ANN models, they were then applied to the NIRSpec use case. The test data with varied stiffness were generated by reducing the Young's modulus of the material of the optical bench plate by 5% of the initial value in the FE model. Figure 11 shows the prediction of the four considered models against the calculated stress with FEA with respect to the frequency steps. The frequency steps divide the considered frequency range (5-200 Hz) into equal steps, the steps and the corresponding normalized stress values both resulting from the FEA.
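As a small illustration of how a frequency range is divided into equal steps, the sketch below generates an equally spaced grid between 5 Hz and 200 Hz. It assumes that both end frequencies are included in the grid, which the text does not state; the function name `frequency_steps` is likewise hypothetical.

```python
from typing import List


def frequency_steps(f_min: float, f_max: float, n_steps: int) -> List[float]:
    """Return n_steps equally spaced frequencies from f_min to f_max (inclusive)."""
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    if f_max <= f_min:
        raise ValueError("f_max must be larger than f_min")
    spacing = (f_max - f_min) / (n_steps - 1)
    return [f_min + i * spacing for i in range(n_steps)]


if __name__ == "__main__":
    # Compare a coarse and a fine resolution of the same range.
    for n in (100, 600):
        grid = frequency_steps(5.0, 200.0, n)
        print(n, "steps, spacing in Hz:", grid[1] - grid[0])
```

With more steps over the same range, the spacing between neighboring frequencies shrinks, so narrow resonance peaks are sampled by more points.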
As can be seen from Figure 11b, the NARX model predicts the element stress curve without major deviation. It only seems to struggle slightly with the first frequency step, which can be ascribed to the use of that first step as delay in the model's architecture. The frequency-dependent ANN in Figure 11a struggles to predict the peak stress values, for instance, around frequency step 400 and also with the shapes of a few peaks, mainly at the last frequency steps from steps 500 to 600. The pretrained ANN, as seen in Figure 11d, also seems to struggle with the shapes of a few peaks, especially at the end of the frequency range. The fact that both these models struggle at the end of the frequency range can be ascribed to the fact that this represents the mode that was shifted the most by changing the material properties. The NARX model, however, does not face any difficulties with this. Figure 11c shows that the biLSTM also makes more or less accurate predictions, while it was the most delicate to train. However, it also struggles with some peak stress values and completely omits the mode at frequency step 500. Figure 12 shows the resulting regression plots.

Conclusions
In this work, four different artificial neural network models were tested for their ability to predict stresses related to the excitation frequency for the launch scenario of the Near-Infrared Spectrograph. In addition, they were tested on a theoretical case with differing numbers of sensors. With correctly trained ANNs, the monitoring of real shaker tests and thus the avoidance of overstressing the test specimens are possible.
The conducted investigation allowed the comparison of all ANN models with respect to the requirements formulated in Section 2. From Tables 2-4 in Section 4, it can be clearly deduced that the NARX model is the most promising one. Figure 7, Figure 9, and Figure 11 illustrate this conclusion. Thus, a trained NARX model could be used during vibration tests and decrease the time of prediction of the given structural parameters, which is crucial for adapting and notching the input load of the shaker in time.
As could also be seen, the recurrent ANN generally performs better than the feedforward ANN, handling the input as concurrent data. The ANN with biLSTM layer is able to make accurate predictions, even though its training is not conducted thoroughly due to the lack of data for a deep ANN. However, if such an ANN is trained with more data and more varied data, it could possibly make the most accurate predictions. In future studies, the potential of this network can be further investigated. For instance, the training data set for this model could be increased by including training data from FEA with several varied Young's moduli or varied damping parameters or by varying other material parameters that have an impact on the natural frequency.
While the NARX model performs the best, its performance is highly dependent on the number of available frequency steps. For example, if the frequency range to be predicted (from 5 to 200 Hz) is poorly resolved and only divided into 100 instead of 600 frequency steps, this has a negative effect on the quality of the NARX model's predictions. The other networks are not as sensitive to the division of the frequency range. In particular, the ranges of eigenmodes should have a higher resolution by having additional frequency steps. Each time step t of the sequence is treated as a single layer, which can lead to an extremely deep ANN. On the one hand, this increases the computational time, but on the other hand, it increases the performance of the network. The performance of the NARX model can thus be maximized by training it with as many frequency steps as possible. However, in practice, this can be a hurdle, as the required higher resolution may not be available during the test. Table 5 outlines the qualitative evaluation for the different models in terms of the requirements introduced in the introduction.

All in all, it can be stated that the conducted research was able to outline a methodology capable of live predicting equivalent stresses of a structure under vibration testing, thereby allowing failure to be evaluated for different types of material via yield criteria.
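To illustrate how a predicted equivalent stress could feed a yield criterion, the following sketch computes the von Mises stress from the six components of a stress state and derives a margin of safety against an allowable stress. The von Mises formula is standard; the function names, the allowable value, and the safety factor are hypothetical and not taken from this study, which also notes that a different equivalent stress may be preferable for ceramic materials.

```python
import math


def von_mises(sx: float, sy: float, sz: float,
              txy: float, tyz: float, tzx: float) -> float:
    """Von Mises equivalent stress from normal (s) and shear (t) components."""
    normal_part = 0.5 * ((sx - sy) ** 2 + (sy - sz) ** 2 + (sz - sx) ** 2)
    shear_part = 3.0 * (txy ** 2 + tyz ** 2 + tzx ** 2)
    return math.sqrt(normal_part + shear_part)


def margin_of_safety(equivalent_stress: float, allowable: float,
                     safety_factor: float = 1.0) -> float:
    """Margin of safety; a negative value means the allowable is exceeded."""
    if equivalent_stress <= 0.0 or safety_factor <= 0.0:
        raise ValueError("equivalent_stress and safety_factor must be positive")
    return allowable / (safety_factor * equivalent_stress) - 1.0


if __name__ == "__main__":
    # Invented stress state in MPa; allowable and safety factor are placeholders.
    sigma_eq = von_mises(80.0, 20.0, 0.0, 30.0, 0.0, 0.0)
    print("equivalent stress:", sigma_eq)
    print("margin of safety:", margin_of_safety(sigma_eq, allowable=250.0, safety_factor=1.25))
```

During a test, such a check could be evaluated per frequency step on the predicted stress in order to decide whether the input load needs to be notched.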

Author Contributions: L.W. carried out the presented research within her thesis, while R.O., M.S., and K.M.d.P. supported this study as supervisors. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.