Domain Adaptation via Alignment of Operation Profile for Remaining Useful Lifetime Prediction

Effective Prognostics and Health Management (PHM) relies on accurate prediction of the Remaining Useful Life (RUL). Data-driven RUL prediction techniques rely heavily on the representativeness of the available time-to-failure trajectories. Therefore, these methods may not perform well when applied to data from new units of a fleet that follow different operating conditions than those they were trained on. This problem is known as domain shift. Domain adaptation (DA) methods aim to address the domain shift problem by extracting domain-invariant features. However, DA methods do not distinguish between the different phases of operation, such as steady states or transient phases, which can result in misalignment due to the under- or over-representation of different operation phases. This paper proposes two novel DA approaches for RUL prediction, based on an adversarial domain adaptation framework, that consider the different phases of the operation profiles separately. The proposed methodologies align the marginal distributions of each phase of the operation profile in the source domain with its counterpart in the target domain. The effectiveness of the proposed methods is evaluated on the New Commercial Modular Aero-Propulsion System Simulation (N-CMAPSS) dataset, where sub-fleets of turbofan engines operating in one of three different flight classes (short, medium, and long) are treated as separate domains. The experimental results show that the proposed methods improve the accuracy of RUL predictions compared to current state-of-the-art DA methods.


Introduction
Remaining Useful Life (RUL) prediction, also referred to as prognostics, is a key element of Prognostics and Health Management (PHM). Prognostics aims to estimate how long a system can maintain its specified functionality before reaching its end of life [1]. Accurate RUL predictions enable maintenance to be scheduled with operational and resource availability in mind, avoid costly downtime, and prevent critical failures. In recent years, technological advancements and decreasing sensor costs have led to an increase in the amount of data collected from industrial assets [2]. This increased data availability allows taking advantage of advanced data-driven approaches such as Deep Neural Networks (DNNs), which have recently shown great potential in PHM applications such as fault detection and diagnosis [3].
Traditional supervised machine learning methods assume that, on the one hand, a representative labeled dataset is available and, on the other hand, that the training and test datasets stem from similar distributions. Unfortunately, in real-world scenarios, both assumptions fail to hold for many applications [4,5]. While these challenges are also present in fault detection and diagnostics, they are particularly pronounced in prognostics. On the one hand, collecting a sufficiently representative dataset of run-to-failure trajectories is often impossible due to the potentially catastrophic consequences of such failures in reality. On the other hand, deep learning models still face the challenge of domain shift due to the wide variety of operating conditions and limited training samples. These two challenges considerably impact the generalization ability of a model trained on a specific dataset and applied to a different but related dataset.
Unsupervised Domain Adaptation (UDA) has been shown to be effective in addressing the domain gap problem by adapting a model that has been trained on one domain (the source) to a different, unlabeled domain (the target). Previous research on domain adaptation has largely focused on classification tasks [6], and UDA has therefore been extensively applied to fault diagnostics tasks [7,8]. Recently, several research studies have also proposed applying UDA to prognostics tasks. These approaches typically involve aligning the feature distributions between the source and target domains through adversarial training [9] or by reducing the discrepancy between feature distributions [10,11]. Although adversarial training has been widely used in prognostics, it has not taken into account the distinct phases of cyclical operation profiles, such as the take-off, cruise, and landing phases of a flight. Each phase of the operation profile is characterized by a distinct marginal distribution, and adversarial adaptation methods may fail to capture such multi-modal structures in condition monitoring data. Aligning the source and target domains with DANN [12] implicitly assumes that the marginal distributions of each phase of the operation profile are also aligned. In practice, however, this may lead to misalignment of the operation profile phases due to under- or over-representation, for example, because the different operation phases may have different durations in the different operation profiles.
To improve the ability to adapt between different domains, we propose utilizing the common sequence of operation profile phases that each system undergoes. We take advantage of prior knowledge of the operating cycles: the phases always occur in the same order, share specific characteristics, and their existence and sequence are invariant across all domains. For instance, in the aviation domain, a flight can be divided into several phases, such as take-off, cruise, and descent. The flight phases differ in terms of, e.g., altitude, duration, and speed, and have distinct marginal distributions. Within the same flight class, the cruise phase, for example, will have similar characteristics; between different flight classes, the characteristics of the flight phases will differ more strongly. Each phase has a different impact on engine stress and degradation, and consequently also on the RUL. By incorporating the operation profile into domain adaptation, we aim to improve the target RUL prediction in the case of a domain shift.
In this paper, we consider the scenario where time-to-failure trajectories for complex systems, such as aircraft jet engines and power plants, are available within one fleet but not for units from a different fleet. To address this challenge, we define domains as sub-fleets of these systems that are operated under similar conditions. Our focus is on the domain adaptation problem, where we aim to transfer knowledge from a labeled source domain to an unlabeled target domain. The units within the same fleet may have a heterogeneous composition of operating conditions during their respective missions. These operating conditions can include differences in operating speed, machine load, working temperature, and environmental noise. Such differences can lead to diverse marginal distributions, making it difficult to adapt models to new domains. Previous research has shown that when the feature distribution is multimodal, adapting only the feature representation can be challenging for adversarial networks [13,14].
To tackle this issue, we propose to replace the single domain discriminator of the DANN [15] method with one discriminator per phase of the operation profile. This approach is inspired by the success of multi-task learning [16] and auxiliary tasks [17] in facilitating adaptation between domains. The proposed methodologies align the marginal distributions of each phase of the operation profile in the source domain with its counterpart in the target domain. This work investigates in depth the learning of invariant features by adversarial learning for the alignment of sub-fleets (units operated in a similar way), taking into account their operation profiles. Two methods are proposed to deal with the above scenarios: 1) each part of the operation profile is assigned to a specific regime (hard assignment), or 2) the parts of the operation profile are smoothly assigned to all regimes with a defined probability (soft assignment), which is particularly useful for transition phases.
We evaluate the performance of the proposed algorithm on the NASA New Commercial Modular Aero-Propulsion System Simulation (N-CMAPSS) turbofan degradation dataset [18]. The dataset contains three flight classes: short (S), medium (M), and long (L), with a total of 15 units, five in each flight class. The proposed domain adaptation methods were evaluated on three adaptation tasks of increasing difficulty. The models were applied between sub-fleets, each consisting of five units, flying in a specific flight class. The adaptation scenarios consist of transferring from short to medium flights (S → M), from short to long flights (S → L), and from medium to long flights (M → L). The adaptation scenarios always involve the adaptation from a flight class where run-to-failure trajectories are available to a flight class without any labels. The results of the three tasks demonstrate that the proposed algorithm, which takes into account the operation profile, can improve over the traditional DANN [15] and outperform other comparative benchmark methods.

Unsupervised Domain Adaptation
Unsupervised domain adaptation (UDA) [19] aims to improve the performance of a model on a target domain in the presence of a domain shift between the labeled source domain and the unlabeled target domain. Several UDA methods have been proposed to align the feature distributions between the two domains during training, using either discrepancy losses or adversarial training.
Prior works mainly relied on distribution alignment. While Deep Adaptation Networks (DAN) [20] minimize the Maximum Mean Discrepancy (MMD) over domain-specific layers, Joint Adaptation Networks [21] align the joint distributions of domain-specific layers across different domains based on a Joint Maximum Mean Discrepancy (JMMD). In contrast, adversarial-based methods strive to obtain domain-invariant representations through adversarial training. For instance, Domain-Adversarial Training of Neural Networks (DANN) [15] aims to create a domain-invariant representation by using a domain discriminator. To achieve this, the model is trained with a gradient reversal layer to make the feature space indistinguishable between the domains. Maximum Classification Discrepancy (MCD) [22] is another adversarial-based method that aims to reduce domain divergence and align the distribution of a target domain by considering task-specific decision boundaries through the use of task-specific classifiers in an adversarial training approach. Other approaches focus on the distribution of features of both domains across the batch normalization layers. Such approaches include Adaptive Batch Normalization (AdaBN) [23] and Automatic Domain Alignment Layers (AutoDIAL) [24], which align the distributions via modified batch normalization layers.
Most DA approaches have been developed for classification tasks, and domain adaptation for regression has received little attention. While classification approaches aim at generating decision boundaries to separate data into different classes, regression methods aim at predicting continuous numerical outputs with a defined ordinal relationship. Early works on domain adaptation for regression introduced a weighting scheme that assigns a weight according to the similarity between the training and test samples [25,26]. Recent works in the deep representation learning regime proposed different strategies to close the domain gap. For example, the Weighting Adversarial Neural Network (WANN) [27] relies on an adversarial weighting approach to minimize the Y-discrepancy [28]. DeepDAR [29] adopted the MMD metric in combination with a semi-supervised loss to fit labeled and unlabeled data smoothly. Moreover, Representation Subspace Distance (RSD) [30] explores the Riemannian geometry of the Grassmann manifold to reduce the domain gap using the orthogonal bases representation of the subspace as opposed to instance representations.
Many of the mentioned classification methods can be naturally extended to regression problems. Nevertheless, when dealing with complicated regression problems, there are still no straightforward solutions to the fundamental problem of unsupervised domain adaptation for regression.

Domain Adaptation applied to PHM
In recent years, domain adaptation has been increasingly applied to PHM applications [31], particularly in the area of fault diagnostics classification. Three main domain gaps exist in the PHM context: between varying operating conditions [32,33], between different units of a fleet [34], and between simulations and real data [35,36,37,38].
Most research in this area has focused on the transfer between discrete operating conditions. Classic domain adaptation methods, such as distribution alignment [39,40] and adversarial alignment [41,42], have been widely used in this context and have been found to be beneficial [8]. Proposed approaches include discrepancy-based domain adaptation using various criteria, such as Maximum Classification Discrepancy (MCD) [42] and Maximum Mean Discrepancy (MMD) [43,44,45]. Deep domain adversarial frameworks such as DANN [46,47] have been developed and applied for fault diagnosis of machines under varying working conditions. Some works have sought to improve upon adversarial alignment through the use of conditional discriminators [48,49].
In addition to fault diagnosis, transfer learning has also increasingly been applied to remaining useful life (RUL) prediction. For example, a transfer learning algorithm based on Bidirectional Long Short-Term Memory (BLSTM) recurrent neural networks was proposed in [50] for RUL estimation. The algorithm fine-tunes a model, trained on a large amount of data from a source task, with a small amount of data from a target task, usually a different but related task. The results showed that transfer learning is effective, except when transferring from a dataset of multiple operating conditions to a dataset of a single operating condition, which leads to negative transfer. In addition to cases where labels are available for the target, unsupervised domain adaptation approaches have also been applied to RUL prediction. Different adversarial methods have been proposed to align the source and target domains and improve the performance of RUL prediction [9]. For example, [9] used adversarial training to extract domain-invariant features from a Long Short-Term Memory (LSTM) model. Recent work [51] introduced a novel Wasserstein distance-based weighted domain adversarial neural network (WD-WDANN) for RUL prediction under different operating conditions, measuring the similarity of source samples to the target domain to determine the sample quality. Multiple-kernel maximum mean discrepancies (MK-MMD) were proposed and shown to be more robust than single-kernel methods [52,11], helping to minimize the distribution discrepancy between different failure behaviors in the feature space.
Previous approaches proposed for RUL prediction in the context of PHM have mainly focused on applying domain adaptation to the entire operating cycle without considering the distinct phases of the operation profile of each domain. Such an approach, aligning source and target domains via domain-invariant feature learning, implicitly assumes that the marginal distributions of the different operating conditions are also aligned. We propose replacing the single domain discriminator with multiple domain discriminators, one for each operating condition. In this way, we ensure that the different marginal distributions are aligned with their counterparts in the other domain.

Terminology
Before delving into the details of the proposed algorithm, it is important to define some key terms that will be used throughout the text. In this study, we make use of the terms operation profile and operating condition. An operating condition refers to the specific state or set of conditions under which a system is operating and can be thought of as a different domain. An operation profile is a detailed and specific way of describing how industrial assets are operated and controlled under different conditions while sharing similar characteristics in the sequences in which the operating conditions occur. An operation profile is typically composed of distinct and discrete phases, which are characterized by definite control and condition parameter ranges defining the states of the system. These states are retained for specific periods of time, and each phase is characterized by a unique marginal distribution of its characteristics.
Together, the operation profile (with different phases) and operation conditions provide a comprehensive understanding of how industrial assets function and perform over time.

General Problem Definition
In this paper, we study the unsupervised domain adaptation problem in the context of regression. During training, we are given access to a set of n_s labeled samples D_s = {(x_i, y_i)}_{i=1}^{n_s} from a source domain and an unlabeled set D_t from a target domain. Each input x_i ∈ X ⊂ R^{p×T} is a multivariate time sequence of p raw measurements of length T, where T is the length of the observation window. Finally, y_i ∈ Y ⊂ R denotes the Remaining Useful Lifetime (RUL) value of the observed sequence x_i. Labels of target data are not available during training. The goal of the task is to train a model using the labeled set D_s and the unlabeled set D_t of multivariate time-series data of aircraft engines operated under different operating conditions.

Operation profile
Industrial assets may operate under different operating conditions but share similar phases of the operation profile. For example, in aviation, a flight is usually composed of different phases, such as take-off, cruise, and descent, requiring different thrust levels to propel the aircraft forward (Fig. 1) and thus applying different types and levels of stress to the jet engine. More generally, the operation profile can be determined by discretizing a particular measured or control parameter based on domain knowledge, or by using an unsupervised clustering algorithm on the control parameters or multivariate observations. The final objective is to obtain a set of n_p discrete and common phases of the operation profile for each of the two domains.
This research assumes that a finite number of discrete operation profile labels can either be provided based on domain knowledge or identified using the measured or control parameters. We argue that taking into account the different phases of the operation profile can improve the alignment process.
The following section presents how these operation profile labels can be exploited to extend upon the existing DANN methods to support the alignment process.

Domain-Adversarial Neural Networks
As the operating conditions of the source and target sub-fleets differ, a model trained on the source data would be able to easily distinguish the feature vectors of the target domain from those of the source domain due to the domain shift, leading to poor performance when applied to the target domain. Adversarial distribution alignment methods aim to address this problem by ensuring that the feature extractor is unbiased with respect to the characteristics of the source and target domains.
DANN [12] is a broadly applied domain adaptation method that aligns the distributions of source and target features by adding a domain discriminator and introducing adversarial training.

As illustrated in Figure 2, the DANN model is composed of a feature extractor f_e, a regressor f_r, and a domain classifier f_d, parameterized by θ_e, θ_r, and θ_d, respectively.
The feature extractor f_e takes the input data x and learns a feature representation denoted by f_e(x).
The regressor predicts the RUL label for each input sample, ŷ = f_r(f_e(x)). The RMSE loss is used to minimize the error between the true and the predicted RUL on the source data (for the RUL prediction task):

L_r = sqrt( (1/n_s) Σ_{i=1}^{n_s} (y_i − ŷ_i)² )    (1)

The second part of the model includes a domain discriminator, which is responsible for aligning the two domains. The domain classifier predicts the domain label for each input sample, d̂ = f_d(f_e(x)), where d ∈ {0, 1} is a domain label assigned to each training example to indicate its origin. The domain classifier uses the binary cross-entropy loss function shown in Equation 2:

L_d = −(1/n) Σ_{i=1}^{n} [ d_i log(d̂_i) + (1 − d_i) log(1 − d̂_i) ]    (2)
While the domain discriminator weights θ_d are trained to minimize the binary cross-entropy loss, the feature extractor weights θ_e are updated to maximize it. The feature extractor thus learns to extract domain-invariant features, rendering the domain discriminator incapable of predicting the true domain label. These two contradicting objectives are trained in an adversarial procedure utilizing a gradient reversal layer (GRL). The GRL acts as an identity function during the forward pass and reverses the sign of the gradient during the backward pass, which allows implementing this optimization problem easily in practice [12].
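The GRL described above fits in a few lines of code. The following PyTorch sketch (an illustrative implementation, not the authors' original code) shows the identity forward pass and the sign-flipped, λ-scaled backward pass:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lambda backwards."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse and scale the gradient flowing back to the feature extractor;
        # lam receives no gradient.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# Minimal check: forward is the identity, backward flips the gradient sign.
x = torch.ones(3, requires_grad=True)
y = grad_reverse(x, lam=1.0).sum()
y.backward()
print(x.grad)  # tensor([-1., -1., -1.])
```

Placed between the feature extractor and the domain discriminator, this single operation lets a standard optimizer minimize the discriminator loss while simultaneously pushing the feature extractor toward domain-invariant features.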

Operation profile-specific Alignment
In this research, we propose the Operation Profile-Specific (OPS) alignment framework, which aims to align the marginal distributions of the specific phases of the operation profile between different domains. These phases are often overlooked by other domain-invariant feature-learning techniques. Our approach extends the alignment component of DANN to consider each operating phase individually when aligning the source and target domains. Previous research has demonstrated that when the feature distribution is multimodal, adapting only the feature representation can be challenging for adversarial networks [13,14]. We chose DANN as the basis for our extensions because it has a strong track record in domain adaptation tasks and has previously been successful in regression tasks for PHM [9]. We propose two different approaches for operation profile-specific alignment:
• Hard assignment: OPS-DANN (hard)
• Soft assignment: OPS-DANN (soft)

OPS-DANN (hard)
Our first proposed extension to implement OPS alignment involves using dedicated domain discriminators for each phase of the operation profile. A hard assignment is made for each sample to its respective phase of the operation profile. The goal of OPS alignment is to separately align the distinct marginal distributions of each regime of the operation profile across the two domains.
This can be achieved by replacing the single domain discriminator that discriminates between the source and the target domain with as many individual discriminators as there are discrete operating phases. Each individual domain discriminator still aims to differentiate between the source and the target domain; however, it only does so for the samples belonging to one operating phase. This ensures that the marginal distribution of each operating phase is aligned with its counterpart in the other domain. The proposed framework is visualized in Figure 3. The domain discriminators are separated for each of the n_p operating phases. Each domain discriminator is parameterized with its own set of weights θ_dj with j ∈ {1, ..., n_p} and aims to predict the domain label d̂_{j,i} = f_dj(f_e(x_i)) for each sample i assigned to the j-th domain discriminator. The operating phase label z_i assigns each sample to one of the n_p discriminators. The binary cross-entropy loss function shown in Equation 5 is utilized to determine the loss of the j-th domain discriminator, where z_{i,j} is a binary variable that is one if sample i belongs to operating phase j and zero otherwise:

L_dj = −(1/n_j) Σ_i z_{i,j} [ d_i log(d̂_{j,i}) + (1 − d_i) log(1 − d̂_{j,i}) ]    (5)
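As an illustration, the per-phase discriminator loss of Equation 5 can be sketched as follows. The NumPy function below is a simplified sketch with hypothetical array shapes; it assumes precomputed discriminator outputs rather than showing the full adversarial training loop:

```python
import numpy as np

def ops_hard_discriminator_loss(d_true, d_pred, phase, n_phases):
    """Per-phase binary cross-entropy: each sample contributes only to the
    discriminator of the operating phase it is hard-assigned to.

    d_true   : (N,) domain labels in {0, 1} (e.g. 0 = source, 1 = target)
    d_pred   : (N, n_phases) predicted domain probability from each
               phase-specific discriminator f_dj
    phase    : (N,) hard operating-phase assignment z_i in {0, ..., n_phases-1}
    n_phases : number of discrete operating phases n_p
    """
    eps = 1e-8
    losses = np.zeros(n_phases)
    for j in range(n_phases):
        mask = phase == j
        if not mask.any():
            continue  # no samples of this phase in the batch
        p = np.clip(d_pred[mask, j], eps, 1 - eps)
        y = d_true[mask]
        # standard BCE, averaged over the samples assigned to phase j
        losses[j] = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    return losses.sum()
```

In an actual training loop this loss would be combined with the regression loss and back-propagated through the gradient reversal layer, so that each phase's marginal distribution is aligned with its counterpart.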

OPS-DANN (soft)
A soft assignment of samples to multiple discriminators can be useful when the operation phase labels are not certain. This can happen, for example, when a system is transitioning between two phases and samples can be considered to belong to both operation phases. Instead of using a hard assignment of each sample to a single phase of the operation profile, we propose using a probabilistic assignment (soft assignment). To achieve this, we add an additional classifier after the feature extractor to classify the operation phase. This classifier is trained in a supervised manner using the available operating phase labels and outputs a probability distribution over the n_p phases. The soft assignment allows samples to be assigned to multiple domain discriminators when the operating phase classifier is uncertain. The predicted probabilities are then used to weigh each sample's contribution to each of the n_p domain discriminators. This model is depicted in Figure 4.
For OPS-DANN (soft), the previously introduced equations change slightly due to the additional operating phase classifier parameterized by θ_z. For each source and target sample, it predicts the probability of the sample belonging to each of the n_p operating phases, ẑ = f_z(f_e(x)). The prediction ẑ can be written as an n_p-dimensional vector ẑ = [ẑ_1, ..., ẑ_{n_p}]. The cross-entropy loss function compares the model's predictions to the true operating phase labels.
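A minimal sketch of the soft-assignment weighting is given below (again assuming precomputed discriminator and phase-classifier outputs; names and shapes are illustrative):

```python
import numpy as np

def ops_soft_discriminator_loss(d_true, d_pred, z_prob):
    """Soft-assignment version: each sample contributes to all n_p phase
    discriminators, weighted by the predicted phase probabilities.

    d_true : (N,) domain labels in {0, 1}
    d_pred : (N, n_p) domain predictions of the n_p phase discriminators
    z_prob : (N, n_p) softmax output of the operating-phase classifier f_z
    """
    eps = 1e-8
    p = np.clip(d_pred, eps, 1 - eps)
    y = d_true[:, None]
    # per-sample, per-discriminator binary cross-entropy, shape (N, n_p)
    bce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    # weight each sample's per-discriminator loss by its phase probability
    return np.mean(np.sum(z_prob * bce, axis=1))
```

When z_prob is one-hot, this reduces to the hard-assignment loss; for uncertain samples, e.g. during a transition between phases, the loss is spread across the involved discriminators.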
The final loss for OPS-DANN (soft) combines the regression loss, the operating phase classification loss, and the n_p domain discriminator losses, where each sample's contribution to the j-th discriminator is weighted by its predicted phase probability ẑ_j.

N-CMAPSS Dataset
We evaluate our proposed domain adaptation methods on the New Commercial Modular Aero-Propulsion System Simulation (N-CMAPSS) dataset, which contains run-to-failure trajectories of large turbofan engines [18], and compare their performance to previously proposed DA approaches. The dataset was created using NASA's high-fidelity simulation model [53], which allows the simulation of flight data over a wide range of flight conditions. Concretely, the flight data covers take-off, cruise, and descent flight conditions corresponding to different commercial flight routes. Each engine unit is assigned to one of three flight classes based on the duration of the individual flights. Short flights last between one and three hours, medium flights are between three and five hours long, and long flights last longer than five hours.
The degradation behavior of each engine unit is modeled as the combination of three contributors: an initial degradation, a normal degradation, and an abnormal degradation due to a fault. In the first phase, the engine degrades due to normal degradation, which is modeled linearly, until the onset of a fault, whereby different fault types can occur. Once a fault is initiated, the engine enters an abnormal degradation phase, during which the engine health decays exponentially, until it ultimately reaches its end of life (EOL). Figure 5 shows sample degradation trajectories of six engine units.
The degradation trajectories are given as multivariate time series containing measurements from 14 sensors of the turbofan engine, as well as 4 parameters characterizing the operating condition of the flight, also referred to as scenario descriptors. The signals and the scenario descriptors are sampled at a frequency of 1 Hz. The sensors include various temperature and pressure measurements, fan speeds, and fuel flow. The scenario descriptors comprise the altitude, speed, throttle-resolver angle, and total temperature at the fan inlet. Each turbofan engine consists of five main components: the fan, the low- and high-pressure compressor, and the low- and high-pressure turbine. During the simulated lifetime of each unit, one or a combination of several sub-components fails, leading to the onset of one of seven possible fault types, whose severity increases over time. The N-CMAPSS dataset introduced several improvements compared to the popular original CMAPSS dataset [54], frequently used by researchers working on RUL prediction tasks [55,9,56,57]. N-CMAPSS differs with respect to two main aspects: first, it considers actual flight conditions recorded on board a commercial aircraft; second, it extends the modeling of degradation by linking the degradation process to its operational history. The RUL prediction aims to indicate how many flights (i.e., cycles) of a particular engine (unit) are left before its EOL.
The N-CMAPSS dataset [18] contains eight sub-datasets, each with different fault types and flight classes. In this study, we consider the different flight classes as distinct domains and focus on the third dataset (DS03) of N-CMAPSS, which is characterized by a single fault type (an HPT efficiency failure combined with an LPT flow and efficiency failure) and equally distributed flight classes. We define the sub-fleets operated under different flight classes as domains. The subset used for this paper contains an equal number of five units per flight class. Despite the identical number of units, the total number of samples in the three domains varies considerably due to the different lengths of the flights, as shown in Table 1.
In aviation, engines are subject to high stress during take-off. In this dataset, short-haul flights therefore involve more take-offs and landings than long-haul flights over the same operating time, resulting in a different type of degradation. In addition, the sensor measurements of long-haul flights differ significantly from those of short-haul flights, as they reach higher altitudes and faster speeds, as can be seen in Figure 6. Given the three domains, there are six possible domain adaptation tasks, of which only the three presumably most difficult ones are considered in this research. These comprise the transfer from a sub-fleet of short-range flights to the sub-fleets of mid-range and long-range flights, as well as from mid-range to long-range flights. These three tasks are particularly difficult because the marginal distribution of the characteristics of shorter flights does not cover those of longer flights, as visualized using the four scenario descriptors in Figure 6. The corresponding adaptation tasks are abbreviated as S → M, S → L, and M → L, respectively.
In general, it seems intuitive to subdivide a flight into take-off, cruise, and landing operating conditions. However, in this work, we found a more suitable division of the operation profile: into ascending, steady, and descending operating conditions. This division allows a more fine-grained separation and is more closely related to how the airplane is operated. For example, with the more intuitive partitioning approach, the operating conditions of an airplane that changes its flight altitude during a flight would only be assigned to the cruise condition. In contrast, the partitioning proposed in this research allows separating the two steady flight sequences from the descending one in between (see Figure 7). Ascending, steady, and descending flight conditions can be identified using the first-order derivative of the altitude measurement. To that end, the change in altitude between neighboring sampling points was considered and grouped into the three operating conditions using a threshold value, which was experimentally set to T = 0.5 ft/s. The operation phase label z is then defined as ascending if the rate of climb exceeds T, descending if it falls below −T, and steady otherwise. A median filter with a length of 51 was applied to smoothen the predictions.
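This phase-labeling rule can be sketched as follows (a simplified reconstruction; the integer label encoding and function names are illustrative, not the authors' code):

```python
import numpy as np
from scipy.signal import medfilt

def operation_phase_labels(altitude, fs=1.0, thresh=0.5, kernel=51):
    """Assign each time step to ascending (0), steady (1), or descending (2)
    based on the first-order derivative of the altitude signal.

    altitude : (N,) altitude in ft, sampled at fs Hz
    thresh   : rate-of-climb threshold in ft/s (0.5 ft/s in this work)
    kernel   : length of the median filter used to smoothen the labels
    """
    roc = np.gradient(altitude) * fs          # rate of climb, ft/s
    z = np.ones_like(roc, dtype=int)          # default: steady
    z[roc > thresh] = 0                       # ascending
    z[roc < -thresh] = 2                      # descending
    # median filter removes short, spurious label switches
    return medfilt(z.astype(float), kernel_size=kernel).astype(int)
```

Applied to a flight with a climb, a cruise segment, and a descent, the function recovers the three phases from the altitude channel alone.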
The key idea behind the proposed OPS alignment is to consider each operation phase separately during the alignment to ensure that its marginal distribution is matched with its counterpart in the other domain. Figure 8 shows the marginal distributions of three exemplary sensors for the short, medium, and long flight domains. Most of the other sensor values follow similar patterns.

Preprocessing
Before a fault initiates, the normal degradation process is slow, and it can be assumed that the RUL is very large. Thus, for this work, we only start predicting the RUL after the onset of the fault. This results in a two-step process in which fault detection is performed first, and the RUL prediction is only initiated after the fault has been detected. In this paper, we focus only on the second step and assume that the fault onset detection is given. Therefore, the RUL prediction task aims to predict the remaining cycles after a fault has been initiated and detected.
Sensor measurements and scenario descriptors are taken as inputs to ensure a realistic usage scenario, resulting in 18 input signals. Other quantities, such as virtual sensors and model health parameters, which require internal parameters of a simulator, are not considered in this research.
The dataset was downsampled by a factor of ten using a Chebyshev filter of order eight, which applies anti-aliasing before the downsampling step. This ensured the same 0.1 Hz sampling frequency that was used in [56].
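A minimal sketch of this step using SciPy, whose `decimate` routine applies exactly this kind of Chebyshev type-I anti-aliasing filter before downsampling (the wrapper function is illustrative):

```python
import numpy as np
from scipy.signal import decimate

def downsample_tenfold(x):
    """Downsample a (time, sensors) array by a factor of ten after an
    order-8 Chebyshev type-I anti-aliasing filter (SciPy's default IIR)."""
    return decimate(x, q=10, n=8, ftype='iir', axis=0, zero_phase=True)
```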
Each signal measurement was scaled to the range [−1, 1] using min-max normalization. The scaler was fitted on the source domain data for each adaptation task and subsequently applied to the target domain data.
The RUL was normalized for each engine unit to decrease from one to zero between the point in time when the fault occurs and the end of life. To that end, all samples belonging to one engine unit were divided by the maximum number of cycles of the respective unit.
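The two normalization steps above can be sketched as follows; the helper names are illustrative, and note that target-domain values may fall slightly outside [−1, 1] because the scaler is fitted on the source domain only:

```python
import numpy as np

def fit_minmax(source):
    """Fit min-max statistics on the source-domain data only."""
    return source.min(axis=0), source.max(axis=0)

def apply_minmax(x, lo, hi):
    """Scale each signal to [-1, 1] using the source-domain statistics."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def normalize_rul(cycle, eol_cycle):
    """Per-unit RUL label decreasing linearly from 1 (fault onset) to 0
    (end of life), dividing by the unit's maximum number of cycles."""
    rul = eol_cycle - cycle
    return rul / rul.max()
```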

Model Architecture
In this research, we compare the proposed methodology to other commonly used DA approaches. All DA methods use the same basic architecture to ensure a fair comparison. In particular, we used a one-dimensional convolutional neural network (1D-CNN) with an architecture inspired by the one proposed in [56]. The feature extractor consists of three 1D-convolutional layers (L = 3). The first two layers have ten channels each, while the third layer condenses the feature representation into a single channel. Filters of size ten are used along with a stride of one, and zero-padding is added to ensure that inputs keep the same length when passing through the network. The network uses ReLU as the activation function. The feature extractor has a total of about 3k trainable model parameters.
The RUL regressor contains two fully connected layers (L = 2). The first one takes the flattened output from the feature extractor and passes it through a fully connected layer with 50 neurons. Then, after another ReLU activation function, the last layer predicts a single output value, the RUL, which is subsequently normalized to a range between 0 and 1 using a Sigmoid activation function. In total, there are approximately 3k parameters in the RUL regressor.
The domain discriminator has a similar architecture to the RUL regressor but contains an additional layer. A first fully connected layer with 50 neurons is followed by a second one with 30 neurons. The last layer is also fully connected and ends with a single output neuron. Again, a Sigmoid activation function is applied to the final output, and ReLU activation functions are used between the fully connected layers. The number of layers and neurons is loosely inspired by an earlier work, which also used a DANN architecture on a similar dataset [9]. The resulting domain discriminator has roughly 4k parameters.
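A PyTorch sketch of the three components, matching the layer sizes above; the input length of 50 with 18 channels follows from the preprocessing, while the exact 'same'-padding scheme is our assumption:

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Three 1D-conv layers: 10, 10, and 1 channels, kernel size 10,
    stride 1, 'same' padding, ReLU activations (~3k parameters)."""
    def __init__(self, in_channels=18):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 10, 10, stride=1, padding='same'), nn.ReLU(),
            nn.Conv1d(10, 10, 10, stride=1, padding='same'), nn.ReLU(),
            nn.Conv1d(10, 1, 10, stride=1, padding='same'), nn.ReLU())
    def forward(self, x):               # x: (batch, 18, 50)
        return self.net(x).flatten(1)   # -> (batch, 50)

class RULRegressor(nn.Module):
    """50 -> 50 -> 1 with ReLU and a final Sigmoid (~3k parameters)."""
    def __init__(self, feat_dim=50):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 50), nn.ReLU(),
                                 nn.Linear(50, 1), nn.Sigmoid())
    def forward(self, f):
        return self.net(f)

class DomainDiscriminator(nn.Module):
    """50 -> 50 -> 30 -> 1 with ReLU and a final Sigmoid (~4k parameters)."""
    def __init__(self, feat_dim=50):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 50), nn.ReLU(),
                                 nn.Linear(50, 30), nn.ReLU(),
                                 nn.Linear(30, 1), nn.Sigmoid())
    def forward(self, f):
        return self.net(f)
```

The parameter counts of this sketch (about 2.9k, 2.6k, and 4.1k) are consistent with the rounded figures reported above.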
OPS-DANN (hard): This model utilizes three domain discriminators, one for each of the three phases of the operation profile in the dataset used in this work. The architecture of each discriminator is identical to the one described above, leading to a total of 18k parameters.
OPS-DANN (soft): Similarly to the OPS-DANN (hard) version, this model utilizes three distinct domain discriminators. Additionally, it uses an operating phase classifier, which is added after the feature extractor. This classifier uses the same architecture as the domain discriminator, except for the last activation function: a Softmax activation is used instead of a Sigmoid, since the classifier must distinguish between multiple classes rather than just two, and Softmax is designed for multi-class classification. OPS-DANN (soft) has the largest number of trainable model parameters, with a total of 22k.
Multi-Class OPS-DANN: We propose to explore the impact of the domain alignment of the operation phases in an ablation analysis. Compared to the architecture introduced above, this model only requires changes to the domain discriminator of DANN. The domain discriminator predicts both the domain of each sample and its operating phase. As there are three operating conditions, the discriminator requires six output neurons, one for each domain and operating phase pair. Consequently, the Sigmoid function must be exchanged for a Softmax activation to handle the multi-class output. This model has slightly more parameters than the original DANN, with roughly 10k parameters.

Comparison Methods
To fairly evaluate the performance of our proposed methods, we first establish a baseline using a feature extractor with a regressor that is trained only on the source data. We then compare the results of our Operation Profile-specific Domain Adaptation Network (OPS-DANN) and its variants to established domain adaptation techniques such as AdaBN, MK-MMD, and DANN. These methods have previously been applied to similar tasks in the field of prognostics and health management and have shown strong performance, as reported in previous research studies such as [8].
MK-MMD: This DA method is a domain-invariant feature learning technique. However, instead of using adversarial training, it aims to minimize a divergence measure between the two domains. To that end, the feature representation f, found by a feature extractor, is used to compute the MMD measure as shown in Equation 9. Inner products of the feature transformation φ(·) can be readily computed by taking advantage of the kernel trick, as demonstrated in Equation 10. For the multiple-kernel version of MMD, K Gaussian kernels with different bandwidth parameters γ are summed to enhance model performance, as shown in Equation 11.
For this work, five kernels were selected, similar to previous applications as reported in [10]. The bandwidth parameters were set to 0.01, 0.1, 1, 10, and 100.
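A sketch of the resulting multi-kernel MMD loss; we interpret the listed values as the Gaussian kernel coefficients γ in k(x, y) = exp(−γ‖x − y‖²), which is one common convention and therefore an assumption:

```python
import torch

def mk_mmd(fs, ft, gammas=(0.01, 0.1, 1.0, 10.0, 100.0)):
    """Biased MMD estimate between source (fs) and target (ft) feature
    batches, summing five Gaussian kernels via the kernel trick."""
    def k(a, b):
        d2 = torch.cdist(a, b) ** 2                     # pairwise squared distances
        return sum(torch.exp(-g * d2) for g in gammas)  # sum of Gaussian kernels
    return k(fs, fs).mean() + k(ft, ft).mean() - 2.0 * k(fs, ft).mean()
```

The loss is (near) zero for identical feature batches and grows as the two feature distributions diverge.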
AdaBN: Unlike the other baseline methods introduced above, AdaBN [58] is not a domain-invariant feature learning technique. Instead, it replaces the normalization statistics of batch-norm layers computed on the source domain with those of the target domain. Notably, the target domain data is only used to update the normalization statistics, while all other model parameters are trained using source data only. A necessary condition for utilizing AdaBN is an architecture with batch-norm layers; consequently, the feature extractor's standard architecture had to be adapted: a 1D batch-norm layer was placed after each of the three convolutional layers.
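AdaBN can be sketched in PyTorch as follows: re-estimate the running statistics of every batch-norm layer on target-domain batches while leaving all learned weights untouched (the helper function is illustrative, not from the paper):

```python
import torch

@torch.no_grad()
def adapt_batchnorm(model, target_batches):
    """Replace source-domain BatchNorm statistics with target-domain ones.
    Only running_mean/running_var change; all learned weights stay fixed."""
    for m in model.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
    model.train()                 # train mode -> BN layers update running stats
    for x in target_batches:
        model(x)                  # forward passes only, no optimizer step
    model.eval()
    return model
```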
In summary, except for AdaBN, which uses additional batch-norm layers, all other baseline methods use the same architecture for the feature extractor and the RUL regressor to ensure a fair comparison.

Training Procedure
Before training, all models were initialized using Xavier normal initialization [59]. All data from each domain are considered during training for each adaptation task, and no train-test split is used.
In this research, several consecutive measurement points are combined into one input sample: sequences of length 50 are extracted from the multivariate time series with a step size of one.
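The windowing step above can be sketched with NumPy; the (time, sensors) input and channel-first output shape conventions are our assumption:

```python
import numpy as np

def make_windows(series, window=50, step=1):
    """Slice a (time, sensors) series into overlapping samples of shape
    (n_windows, sensors, window), matching a 1D-CNN's channel-first input."""
    starts = np.arange(0, len(series) - window + 1, step)
    return np.stack([series[s:s + window].T for s in starts])
```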
Model updates were performed using mini-batch gradient descent with batches of size 256. One batch of source and one batch of target domain data were processed for each training step.
Table 2: Range of the hyperparameter search for all considered DA methods. The optimal hyperparameters found on the S → L adaptation task are emphasized.
All DA methods were trained for the same number of epochs for each adaptation task to ensure a fair comparison. For the S → L and M → L adaptation tasks, the models were trained for 15 epochs. However, in the S → M adaptation task, the models were trained for 25 epochs due to the lower number of model updates per epoch.
The baseline was trained for 40 epochs in case the source domain was short flights S, and for 20 epochs in case the source domain was medium flights M .
All DA methods based on DANN additionally require the definition of the reverse gradient factor ρ. For this work, the same update rule for ρ is used as the one proposed in the original DANN [15], which gradually increases ρ from 0 to 1 according to Equation 12.
The learning rate is reduced after each epoch using a learning rate schedule, likewise adopted from the original DANN paper [15], to ensure smooth convergence. In this schedule, p once again describes the linear training progress. Unlike the gradient reversal factor, which only applies to DANN-based models, the learning rate schedule is used for all models. The initial learning rate α0 is found using a hyperparameter search.
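Both schedules come from the original DANN paper [15]; a sketch with the standard constants (10 and 0.75), where p ∈ [0, 1] is the linear training progress:

```python
import math

def grl_factor(p):
    """Gradient-reversal factor: rho = 2 / (1 + exp(-10 p)) - 1,
    rising smoothly from 0 to (almost) 1 over the course of training."""
    return 2.0 / (1.0 + math.exp(-10.0 * p)) - 1.0

def lr_schedule(p, alpha0):
    """Learning-rate annealing: alpha_p = alpha0 / (1 + 10 p)^0.75."""
    return alpha0 / (1.0 + 10.0 * p) ** 0.75
```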
The hyperparameters and training specifications mentioned above were primarily selected based on prior work and applied to all models without further refinement. Other hyperparameters, such as the learning rate, momentum, and the trade-off between multiple loss objectives, were explicitly tuned for each domain adaptation method. Table 2 summarizes the grid searches performed for each method and indicates the optimal hyperparameters found. It is important to emphasize that, for all methods, the grid search was only performed on the S → L adaptation task. The resulting optimal hyperparameters were then used for the other two adaptation tasks. The S → L task was selected for the grid search because it has the largest domain gap and is considered the most challenging task. For AdaBN, the learning rate and momentum found on the S → L adaptation task were likely too large for the M → L adaptation task and did not lead to convergence. Only in this case, the second-best set of hyperparameters found on the S → L task was used for the M → L task, which is also indicated in Table 2.

Evaluation Metrics
In this work, all experiments were evaluated using the two common evaluation metrics used for RUL prediction [54]: the root mean square error (RMSE) and the NASA scoring function.
The RMSE is defined as the square root of the mean squared prediction error. NASA's scoring metric is asymmetric and penalizes over-estimation of the RUL more than under-estimation.
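Both metrics can be sketched as follows; the constants 13 and 10 are the standard C-MAPSS scoring parameters, under which a positive error (over-estimated RUL) is penalized more heavily:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error of the RUL predictions."""
    return np.sqrt(np.mean((np.asarray(y_pred) - np.asarray(y_true)) ** 2))

def nasa_score(y_true, y_pred, a1=13.0, a2=10.0):
    """Asymmetric NASA scoring function: d > 0 (RUL over-estimated,
    i.e. a late prediction) costs more than d < 0."""
    d = np.asarray(y_pred) - np.asarray(y_true)
    return np.sum(np.where(d < 0, np.exp(-d / a1) - 1.0, np.exp(d / a2) - 1.0))
```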

Representation Transferability
The Proxy A-distance (PAD) proposed in [60] is a suitable metric to measure the divergence in domain adaptation tasks. Thus, we can use it to evaluate the transferability of representations. Using the PAD, the divergence between two domains is computed by evaluating how well a classifier can separate the source from the target domain. If separation is easy, the two domains are likely dissimilar, and their discrepancy is large. On the other hand, if samples can hardly be discriminated between the two domains, the domains likely have a small discrepancy. Using the classification error of this domain classifier, the PAD dA can be found according to Equation 16.
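Under the standard definition, with ε the domain classifier's test error, the PAD is dA = 2(1 − 2ε); a minimal sketch:

```python
def proxy_a_distance(err):
    """Proxy A-distance from the domain classifier's test error:
    err = 0.5 (chance level) gives 0, a perfect classifier gives 2."""
    return 2.0 * (1.0 - 2.0 * err)
```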
In general, all DA methods that contain a domain-invariant feature learning component aim, in some way, to minimize the divergence between the source and the target domain. For example, DANN uses the PAD as a discrepancy measure and aims to minimize it during training in the alignment component [15].

Experimental Results
In this section, we compare the prediction accuracy of the considered models in terms of the RMSE and the NASA score. The results are shown in Table 3. Additionally, the upper bound RMSEs for the medium-haul and long-haul flight domains are 3.16 and 2.17, respectively.
In addition to evaluating the performance, we also evaluate the PAD (qualitatively measuring the divergence between the domains), calculated from the learned feature embeddings. As described in Section 4.7, the PAD is used to evaluate the ability of the applied DA methods to extract domain-invariant features (see Figure 9). It is similarly important to compare the performance of a novel DA method to other state-of-the-art methods. Even though DANN has the lowest RMSE of the three considered baseline methods, the proposed OPS-DANN methods outperform DANN by at least 9% in RMSE and 2% in S-score. Thus, all models benefit from unlabeled target domain data, as expected in this setup. However, comparisons with the upper bound RMSE of 2.17 show that, when using labeled short-haul flight data for adaptation, there is still a considerable performance gap to models trained with the labeled long-haul flight data. Evaluation of the alignment: Figure 9 shows that all three proposed OPS-DANN methods can extract features that contain less discriminative information about the two domains than the baseline on the S → M DA task. Comparing the PAD values in Figure 9, it also becomes apparent that none of the methods is able to substantially reduce the domain distinctiveness compared to the models trained without adaptation.

Discussion
The findings of this study demonstrate that, on two out of the three tasks, the proposed three variants of the operation phase alignment methods perform similarly well and achieve better results than both the DANN model and the other domain adaptation methods. This demonstrates the effectiveness of the proposed methods in adapting to different domains, particularly in addressing the challenges posed by multimodal feature distributions and the variability of operating conditions and system configurations. Furthermore, it should be noted that DANN is already an effective method for domain adaptation; the proposed operation phase alignment improves upon it further.
In addition to evaluating the performance of the proposed methods for RUL prediction, we also investigate the impact of each method on the embeddings generated by the alignment process. To enable 2D visualizations, we applied a principal component analysis (PCA) and present the first two principal components in Figures 10 and 11. This visualization provides insights into how the different phases of the operation profile are represented in the embeddings and how the proposed methods affect this representation.
As shown in Figure 10, DANN and our proposed approaches are able to effectively align the source and target embeddings, as demonstrated by the overlap of the source and target data points in the figures. However, it is worth noting that there is a distinct difference between the methods with respect to how distinguishable the operation phases are in the embedding space.
In Figures 10 and 11, the OPS-DANN soft and hard approaches both produce distinguishable phases, although the separation is more pronounced for the soft approach. In contrast, the Multi-Class OPS-DANN approach has only one cluster, similar to the one obtained with the DANN approach, which is not able to distinguish between the operating phases. However, DANN and Multi-Class OPS-DANN produce different shapes in the embedding space. Moreover, DANN aims to learn domain-invariant features, while Multi-Class OPS-DANN aims to learn domain- and operating-condition-invariant features. This difference in the ability to distinguish between the operation phases highlights the unique characteristics of each proposed approach and its ability to align the operating characteristics and phases of the different domains.
It should be noted that the Multi-Class OPS-DANN uses the operation phase labels in a distinctly different way than the other two proposed methods. Instead of aligning the marginal distributions of all three operation phases separately, it aims to extract features invariant to both the domain and the operation phase of the input data simultaneously. Figure 11 shows that Multi-Class OPS-DANN successfully learns such a feature representation that is invariant to the operation phases.
In contrast to the Multi-Class OPS-DANN, the OPS-DANN (soft) model contains an additional classifier that learns to distinguish between the different operating phases while simultaneously learning to extract domain-invariant features. Figure 11 illustrates this behavior: each operating phase forms a separate cluster, while the source and target domains overlap to a large extent. This behavior is similar to that of OPS-DANN (hard).
This observation highlights the effectiveness of the proposed approach in aligning the operating characteristics and conditions of the different domains, leading to improved performance in RUL prediction.

Conclusion
In this paper, we propose a novel approach that utilizes domain adaptation techniques to align the operating phases to improve the accuracy of RUL predictions.
The main novelty of our proposed approach is the integration of information on the different phases of the operation profile into the alignment process. The proposed approaches align the marginal distributions of each phase of the operation profile in the labeled source domain with its counterpart in the unlabeled target domain. Two novel domain adaptation approaches are proposed based on an adversarial domain adaptation framework that considers the different phases of the operation profile separately.
The proposed methods have been shown to be effective in improving the performance of deep learning models for RUL prediction by transferring the models between sub-fleets that are operated under different conditions. The results of this study demonstrate the potential of these methods to improve the accuracy and reliability of prognostics and health management in real-world applications. Furthermore, the proposed methods perform better than state-of-the-art domain adaptation methods such as DANN, MK-MMD, and AdaBN.
This research opens three interesting directions for future work. First, the proposed methods can be extended to tackle challenges arising from imbalanced operating conditions, which are very common in practical applications. Second, it would be interesting to investigate whether learning a soft assignment before performing adaptation is beneficial, by training an operating phase classifier separately from the DA model. Third, while this research focused on regression, the proposed methodology can easily be extended to classification tasks, such as fault diagnostics problems. We leave these open research directions for future work.
[Figure 1 panels: (a) altitude profile of a short flight; (b) altitude profile of a long flight; (c) altitude levels at the different phases of the flight: take-off, cruise, and landing. Axes: duration of flight in [min] vs. altitude in [m].]

Figure 1 :
Figure 1: Figures 1a and 1b show two flights that were operated differently and have different profiles. Figure 1c illustrates the presence of similar phases in both flights, which were used for the alignment process.

Figure 2 :
Figure 2: Standard architecture of DANN approaches. The Gradient Reversal Layer (GRL) ensures that the feature distributions over the two domains are indistinguishable for the domain classifier.

Figure 3 :
Figure 3: Proposed OPS-DANN (hard) approach. The source and target samples of each operation phase are aligned with a dedicated domain discriminator.

Figure 5 :
Figure 5: Example of degradation trajectories of engine units over time for engines from different domains. Vertical lines denote the fault initialization of each engine unit.

Figure 6 :
Figure 6: The kernel density estimate (KDE) shows the probability distribution of the four scenario descriptors for each domain separately. In three out of four cases, it can be seen that the long-haul flights span the widest range of feature values, while the short-haul flights cover a noticeably smaller part.

Figure 7 :
Figure 7: Direct comparison of two possible separations of the operating conditions. Ascent and take-off are colored in turquoise; cruise and steady flight in dark blue; and descent and landing in green. The short periods of steady flying are differentiated from the rest using the change in altitude.

Figure 8 :
Figure 8: Kernel density estimates of the marginal distributions of each operation phase for the three domains. Three exemplary sensor values were selected: physical core speed, total temperature at the LPT outlet, and total pressure at the HPC outlet.
The A-distance varies from one task to another, showing a larger domain gap for the task S → L, a smaller one for the task S → M, and almost no domain gap for the task M → L.
5.1.1. S → L task
Performance evaluation: The adaptation task from the short to the long flight domain is the most challenging because it has the largest domain gap. The baseline model trained solely on source data for this task (without any adaptation) exhibits an

Figure 9 :
Figure 9: A-distance for each method and task, respectively.

Figure 10 :
Figure 10: Visualizing the Impact of the Operation Profile Alignment on Domain alignment: A Comparison of DANN, OPS-DANN (soft), OPS-DANN (soft), and Multi-Class OPS-DANN Models using PCA on the Embedding space

Figure 11 :
Figure 11: Visualizing the Impact of the Operation Profile Alignment on the Phase alignment: A Comparison of DANN, OPS-DANN (hard), OPS-DANN (soft), and Multi-Class OPS-DANN Models using PCA on the Embedding space

Figure 4: Proposed OPS-DANN (soft) approach. An additional operation profile classifier is added to allow for the assignment of each sample to multiple domain discriminators depending on its probability of belonging to the respective operating regime.

Table 1 :
Number of units and samples of the three considered domains.

Table 3 :
The results for the three considered Domain Adaptation tasks.
…randomness of the training procedure. Overall, none of the compared DA methods could significantly improve on this task's target-free baseline. None of the methods reaches the upper bound RMSE value of 2.17, which is achieved when the model is trained on the labeled target dataset. Evaluation of the alignment: Comparing the PAD values of all applied methods in Figure 9…
5.1.2. S → M task
Performance evaluation: Compared to the S → L task, the S → M task has a smaller domain gap.