A comparative study of energy graph-based fault detection and isolation techniques applied to a lignite plant

Energy and exergy interactions in industrial systems hold meaning across physical domains. This paper builds on the notion that capturing the energy and exergy interactions of a system, while retaining physical structural context, enables fault detection and isolation. To this end, three energy graph-based visualisation methods were developed for the purpose of fault detection and isolation. This paper presents a comparative study of the three analysis methods, designated the 1) distance parameter method, 2) eigenvalue decomposition method, and 3) residual method. The study utilises data from a physical lignite plant in Jänschwalde, Germany, in combination with simulation data of specific faults in order to compare the sensitivity and robustness of the three methods. The comparison is done firstly in terms of detection and secondly in terms of isolation. The distance parameter and eigenvalue decomposition methods showed high sensitivity and robustness for fault detection, while the residual method showed moderate comparative performance. In terms of fault isolation, the distance parameter method showed high sensitivity and robustness, while the eigenvalue decomposition method had irregular isolation performance. The residual method isolation results proved inconclusive.


Introduction
Modern automated industrial plants, as designed for Industry 4.0, have seen an ever-increasing need for process monitoring and health management across various operational modes, as well as reliable and efficient fault-tolerant control schemes [1][2][3][4]. Since fault detection and isolation (FDI) is a sub-field of process monitoring, the methods and techniques of FDI are also key to Industry 4.0 applications [5]. The typical methods for FDI can be categorised as first-principle and data-driven methods, with the intermediate category of hybrid methods combining aspects of both [6,7].
Regardless of the method used, FDI evaluates a system's state or condition by comparing process measurements with a predefined reference. In the case of first-principle methods, the reference is usually obtained from a priori physical insight of the system and a corresponding mathematical model [8]. On the other hand, data-driven methods rely on the availability of historical data obtained from the monitored, fault-free system [9]. Either of these two methods is generally implemented in the framework of a larger diagnostic system. The process of a typical diagnostic system can be expressed as a flow of transformations from its measurement space to a feature space [5,10]. This feature space is then transformed into the decision space and finally the class space. These transformations would typically be applied to process measurements in the measurement space, such as temperature, pressure, or other process variables that convey information regarding the health of the system (a physical system or simulation thereof). Two major challenges of FDI are minimising the dimensionality of the measurement space, and accurately isolating faults without degrading the robustness and sensitivity of the diagnostic system [5,11]. Another challenge that arises in data-based approaches is interpretability. These methods often perform complex calculations on the data, such as data fusion and mapping, reducing the interpretability of the data with respect to the plant and complicating the analysis thereof [12][13][14]. These difficulties can be partially overcome by a hybrid approach or a combination of FDI methods [15].
In energy conversion systems, thermomechanical monitoring techniques are typically used. The purpose of these techniques is to identify and measure any efficiency discrepancies and to pinpoint the primary sources of these discrepancies [16]. However, in many cases, the ultimate cause cannot be detected by thermodynamic parameters alone, and complementary techniques are required to aid in fault isolation [16]. Thermoeconomics or exergoeconomics is a combination of economic and thermodynamic analysis that applies the concept of cost to exergy [17,18]. This technique uses linear exergy equations to create a thermoeconomic model based on the purpose of each component. This approach has the potential to be very beneficial in power conversion systems, such as coal-fired power plants [19].
The use of energy as a variable to characterise industrial systems is common and widely established [20], making it well suited to FDI applications. Hu et al. [21] emphasise the importance of encoding expert knowledge into data-based FDI methods for improved performance, which can be seen as part of the energy characterisation step in the diagnostic method. Additionally, including structural information about the system allows for improved fault isolation [11,22,23]. A similar approach is also followed in a method called formal concept analysis (FCA), which facilitates the analysis of the relationship between a set of components and a particular set of attributes [24]. Another interesting concept also used to relate different variables is virtual sensing (also called soft sensing). Kim and Lee demonstrate how this approach can be used to improve fault isolation capacity [25].
FDI techniques should be able to store and manage system and interconnection data proficiently. Graph-based techniques are quite effective in semantic causal inference, heterogeneous association, and visualised explanation [26]. The combination of energy- and graph-based methods resulted in a hybrid technique called energy graph-based visualisation (EGBV) [1,[27][28][29][30]. The advantage of EGBV comes from the fact that it processes and presents the causal system information using energy in such a way that it can be used for FDI. Furthermore, energy and exergy are proposed as process variables since they are unifying attributes that apply over different physical domains and facilitate the reduction of the data from process measurements.
This paper presents a comparative study of three FDI analysis methods that follow on the EGBV system characterisation approach described in [1,22,23]. Previous research by Wolmarans, which compares principal component analysis (PCA) to the eigenvalue decomposition method, motivates this research as a logical next step [31]. The paper contributes in terms of the insights gained from the comparison of the three analysis methods within the decision and class spaces of FDI. The methods are compared using recognised metrics for sensitivity and robustness within the FDI context. The layout of the paper is as follows: Section 2 describes the system layout and operation principles with a detailed discussion of the considered faults. Section 3 discusses the topic of energy characterisation in the context of energy graph-based FDI, with the intent of constructing a node signature graph as a visualisation instrument. Section 4 focuses on the three FDI techniques and the application thereof. Section 5 describes the experimental design with a focus on fault simulation and data conditioning, or feature extraction. The application of the FDI techniques is discussed in Section 6. The results are given in Section 7 and discussed in Section 8. Finally, the paper is concluded in Section 9.

Plant layout
In this work, the measurement space and data mentioned in Section 1 are drawn from a physical lignite-fuelled electrical generation facility in Jänschwalde, Germany. Due to physical operation and access constraints, actual fault data could not be obtained. To overcome this constraint, faults were simulated using an existing, validated, steady-state model built in Ebsilon®. This model was composed and validated by an external company during the planning phase of later expansions of the facility to allow for evaluation of the planned changes. Fig. 1 shows a simplified system diagram of a single block of the facility. The facility consists of 6 such identical blocks, each with an electrical production capacity of 500 MW. This paper only considers a single block (henceforth referred to as the Plant). All energy exchanges between the environment (node 20) and the Plant are denoted by Ė in Fig. 1. Table 1 summarises the energy-significant components of the Plant.

Plant operational description
Thermodynamically, the system cycle can be described as a complex reheat Rankine cycle consisting of multi-stage turbines, pre-heaters, and a turbine-driven main water pump (SPAT, from Speisepumpenantriebsturbine). The boiler (components 1 and 3) consists of 3 main stages (combustion chamber, re-heater, and super-heater) and has both an airflow and a coal input. In this paper, the boiler is considered as two parts: the main part that heats the fluid fed to the high-pressure turbine (component 2), and a reheating part that feeds the rest of the turbine stages (components 4-10). The two stages of pre-heaters in the system consist of a low-pressure pre-heater (component 15) and a high-pressure pre-heater (component 19). The hot fluid used for the pre-heating is obtained by bleeding a small part of the outlet fluid after each stage of the turbine.
The thermodynamic changes of the thermo-fluid (water) as it moves through the system cycle can be described as follows. Starting at atmospheric conditions at the exit of the cooling tower, water enters the small pump (component 14) and moves to the low-pressure pre-heater (component 15). The water is then heated to approximately 150 °C and enters the water storage tank (component 16) at approximately 10 bar. This water storage tank serves as central water storage to which more fluid can be added if required. The fluid is then compressed, while moving through the main pump (component 17), to a pressure of approximately 190 bar. Due to the addition of hot water from the high-pressure pre-heater (component 19) into the storage tank, the fluid temperature when exiting the tank is approximately 170 °C. This fluid enters the high-pressure pre-heater, increasing the temperature to approximately 260 °C. The boiler (components 1 and 3) forms the final heating stage, which increases the fluid temperature to a superheated steam temperature of 530 °C. Due to pressure drops within the system, the superheated steam enters the high-pressure turbine (component 2) at 160 bar. After the adiabatic expansion within the high-pressure turbine, the fluid temperature and pressure decrease to 350 °C and 45 bar respectively. This fluid stream is then reheated within the boiler to a temperature of 540 °C. Several adiabatic expansion cycles follow as the fluid moves through the various stages of the medium- and low-pressure turbines (components 4-10).

Fault types and simulation
In this paper, four main fault types (FTs) associated with steam turbine systems, with variations in both the location and size of the faults, are considered [33]: 1) solid particle erosion of the turbine blades; 2) leakage of the turbine's overflow valves; 3) overall wear and aging; and 4) cavitation in the main pump.
Dynamic normal operation data for the Plant were readily available, but fault data were not. Therefore, normal operational data from the Plant and simulated fault data, from a validated steady-state model in Ebsilon®, were used for feature extraction, as described in Section 4. Due to limited access to information on the Ebsilon® model's composition and validation, only a high-level validation of the model is reported in this paper. Actual measured normal operating condition, steady-state data at nominal load are compared with the model data for equivalent conditions. The main components identified for validation in terms of process variable values include the generator, the high-pressure turbine (HPT), the medium-pressure turbine (MPT), the boiler, and the SPAT. The validation results are summarised in Table 2. The results illustrate an acceptable correlation between operational data and model data. The subsections that follow describe the four faults in further detail.

Solid particle erosion (FT1)
Solid particle erosion (SPE) commonly occurs in the first stages of the high- and medium-pressure turbine blades due to the exfoliation of iron oxide and magnetic particles that contaminate the steam path earlier in the high-pressure parts of the boiler [34]. SPE causes an increase in the swallowing capacity of the turbine and decreases efficiency. The effects of SPE were modelled by setting the blade parameters of the various turbine stages in the Ebsilon® model to allow for such changes. The parameter values were changed to effect specified changes in mass flow and efficiency drop, as seen in Table 3.

Leakage of the overflow valve (FT2)
The overflow valve is usually placed in parallel with the turbine and is used for bypassing steam from the turbine, effectively controlling the imbalance between the turbines' steam consumption and the boiler's steam generation [35]. According to Bellanca [34], overflow valve leakage is a possible consequence of SPE; however, Karlsson et al. [33] state that it can also be caused by a broken spindle. The fault causes high-quality steam to bypass a turbine stage and then either enter the turbine at a later stage (at lower pressure) or become lost as condensate. This results in a downstream temperature change, which can be monitored to detect such a fault. To simulate this fault, the Ebsilon® model was adjusted to incorporate a regulated overflow valve between the turbine stages. This valve was adjusted per simulation as described in Table 3 for FT2.

Overall wear and ageing (FT3)
Overall ageing is a degradation of the system manifesting as increased surface roughness and mechanical degradation of various components, lowering performance and efficiency. Temperature gradients, SPE, and low-quality steam all contribute to this fault. Because this fault presents changes in the system that correlate with some of the other faults (e.g. SPE), it is important to either monitor the change in system performance over time or include multiple measurements that allow differentiation between faults.

Pump cavitation (FT4)
Cavitation occurs in a fluid at a localised low-pressure zone when the fluid pressure falls below the vapour pressure of that fluid [36]. These cavities contain droplets and vapour that form and collapse many times per second. As a cavity moves downstream to a higher-pressure region, it collapses and causes a momentary, localised pressure increase, which can place stress on nearby component walls (sometimes beyond their elastic capacity). According to Karassik et al., pump cavitation occurs where the pump pressure is 3% or more below the vapour pressure (i.e. the net positive suction head (NPSH) drops by 3% or more) [37]. This was modelled by changing the fluid pressure at the input of the pump using an adjustable pressure valve, effecting the percentage pressure drops indicated in Table 3.

Fault locations and intensity
Each of the four faults was simulated in Ebsilon Ⓡ at various locations in the system.The intensity of each fault was changed to obtain data for sensitivity and robustness analysis.Table 3 summarises the faults in terms of their locations and intensity in the system.Note that the term COMP# refers to the component and is numbered as shown in Fig. 1.In total, there are 17 unique fault IDs, each representing a specific FT, location, and intensity.The variations in the fault sizes are chosen to be between 1% and 10% beyond the normal operating limits, allowing for a large enough range to evaluate the robustness and sensitivity of the FDI techniques.

Energy characterisation
In the context of energy graph-based visualisation (EGBV), the energy characterisation of a system is considered a four-step process [1,22,38]: 1) Composing a system diagram indicating the component interactions and system layout; 2) Obtaining the energy attributes that describe the energy interactions across and between components; 3) Constructing an attributed graph; and 4) Composing a node signature matrix.

Graph-based visualisation
Energy distribution in the system can be visualised using modern graph theory, which retains both the system structure and the qualitative information encapsulated within the energy attributes. The final node signature matrix is then a generalised mathematical representation of the system in terms of energy at a specific time. In the context of this paper, a graph is formally defined as G = (V, E), where V is a finite set of vertices (also called nodes) and E is a finite collection of edges (also called links) [39]. This allows for the construction of a node-and-link graph (the attributed graph), where each node is a system component and each link a connection between components. A graph can be inexpedient when it is either too large (many nodes and links) or too small (few nodes and links). To facilitate the appropriate sizing of the attributed graph for the purpose of FDI, the following four criteria are applied: a) include the locations where the system needs to be monitored for fault detection; b) include components that have, relative to other components, a large number of energy interactions regardless of the energy quantity (e.g. a turbine with one energy inflow and two outflows vs. a turbine with multi-stage inflows and outflows); c) include components that have, relative to other components, large effects on the control of the system; and d) include components that have large energy quantities associated with them.

Energy attributes
The energy interaction at any of the components in the system is illustrated in [1] as a transformation of energy between the inputs and outputs of each component from one form to another, with a change in exergy happening within the component. For the Plant, the rate of exergy change in each component is only due to physical exergy. The general expression for this rate of exergy change is given by equation (1):

Δ Ḃ = ṁ[(h_i − h_e) − T_0(s_i − s_e)]  (1)

In equation (1), ṁ is the mass flow rate, h and s are respectively the fluid stream enthalpy and entropy for which the exergy change is calculated (subscript i denoting the component input and e the exit), and T_0 the reference temperature. In the absence of chemical-related energy changes, the net work done by, or heat delivered to, any component is calculated using equation (2):

Ẇ = Ė_i − Ė_e  (2)

In equation (2), Ė_i = ṁh_i is the input heat transfer rate and Ė_e = ṁh_e is the exit heat transfer rate. By substituting equation (2) into equation (1), the rate of exergy change over a component as a function of the work done by the component is obtained as equation (3):

Δ Ḃ = Ẇ − ṁT_0(s_i − s_e)  (3)

This allows the calculation of the rate of exergy change over each component.
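The per-component calculation in equations (1) to (3) can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names and the turbine-stage values below are invented for the example and do not come from the Plant data (SI units assumed throughout).

```python
def exergy_change_rate(m_dot, h_i, h_e, s_i, s_e, T0=298.15):
    """Equation (1): rate of physical exergy change over a component [W],
    dB = m_dot * ((h_i - h_e) - T0 * (s_i - s_e))."""
    return m_dot * ((h_i - h_e) - T0 * (s_i - s_e))

def net_energy_rate(m_dot, h_i, h_e):
    """Equation (2): net work done by / heat delivered to a component,
    W = E_i - E_e = m_dot * (h_i - h_e)."""
    return m_dot * (h_i - h_e)

# Invented turbine-stage example: 400 kg/s, 500 kJ/kg enthalpy drop,
# entropy change 6.50 -> 6.55 kJ/(kg.K)
dB = exergy_change_rate(400.0, 3400e3, 2900e3, 6.50e3, 6.55e3)
W = net_energy_rate(400.0, 3400e3, 2900e3)
```

Equation (3) then holds by construction: `dB` equals `W - m_dot * T0 * (s_i - s_e)` for the same inputs.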

Node signature matrix
To define the node signature matrix, consider a directed attributed graph G = (V, E, Ė, Δ Ḃ), where V = {v_1, v_2, …, v_n} is a set of n nodes. The set of links is defined as E ⊆ V × V, where the cartesian product of V with itself is the set of all ordered pairs (v_i, v_j) such that v_i ∈ V and v_j ∈ V. Ė ∈ ℝ^(n×n) is a link attribute matrix where Ė_ij represents the power flow attribute of link (v_i, v_j) = e_ij. Δ Ḃ ∈ ℝ^(n×1) is a node attribute vector where Δ Ḃ_i represents the rate of exergy change attribute of node v_i. An attributed node signature matrix may then be defined by equation (4):

N_s = [Δ Ḃ | Ė] ∈ ℝ^(n×(n+1))  (4)
Consider an example graph in Fig. 2. For this graph, the attributed node signature matrix can be represented by equation (5).
For the system diagram in Fig. 1 and the component inclusion criteria provided, the attributed graph can be constructed as shown in Fig. 3.The 20 nodes are numbered 1 to 20, with node 20 indicating an interaction with the environment.From Fig. 3, it is clear that the links are numbered as directional energy transfers from the originating node to the receiving node.
A node signature matrix (N_s) can then be composed as given in equation (6).
Nodes that represent components with zero energy exchange are simply equated to zero, as is the environment rate of exergy change Δ Ḃ_20.
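The composition of N_s from the node and link attributes can be sketched as follows for a tiny three-node example. This is a minimal illustration, not the Plant graph: the attribute values are invented, and plain nested lists stand in for matrices.

```python
n = 3
dB = [10.0, -4.0, 0.0]            # node attribute: rate of exergy change per node
E = [[0.0, 8.0, 0.0],             # link attribute matrix: E[i][j] is the power
     [0.0, 0.0, 5.0],             # flow on link (v_i, v_j); zero means no link
     [0.0, 0.0, 0.0]]

# Equation (4): N_s is n x (n+1) - first column dB, remaining n columns E
N_s = [[dB[i]] + E[i] for i in range(n)]
```

A node with zero energy exchange simply contributes a row of zeros, as described above for the environment node.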

FDI techniques
Section 3 explained the EGBV method, showing the relation between the graph structure, the Plant layout, and the energy attributes of the Plant. These relations were shown to encapsulate causal information about the health of the Plant [1,27,31,40].
Analysing the attributed graph in a specified way allows for the extraction of information that can be utilised for FDI. The three analysis methods in this paper can be categorised into two groups: the first utilising a cost matrix to quantify the difference between two node signature matrices, and the second utilising a residual matrix. These difference matrices are then analysed to enable FDI.
The work of [41] illustrated the usefulness of exergy for this process; however, that paper only focused on steady-state scenarios, testing the distance parameter method. The work of [31] expanded the research by comparing the performance of the eigenvalue decomposition method with that of PCA on a two-tank system. This paper aims to contribute to the research by evaluating the FDI performance of the three mentioned EGBV methods. This section focuses on the details of EGBV FDI.

Overview of the FDI methods
For online process monitoring, the operational node signature matrix, N_s,k, is calculated at every sample k in time. For all three methods in this study, a graph comparison step is first executed, requiring the relevant N_s,k and a reference node signature matrix, N_ref. For the cost matrix methods, these two graphs are compared using a distance metric to yield a square cost matrix. For the residual approach, a residual matrix is obtained by a direct element-wise comparison of the two node signature matrices, resulting in a matrix with the same dimensions as the input matrices. In this study, the operational time series data are available in sequences of length 2880, i.e. 1 ≤ k ≤ 2880 for each system state (i.e., 2880 samples for normal state data and the same for each fault ID, equating to 18 sets of N_s,k).
For fault detection, the reference node signature matrix should represent the normal condition. A match of the operational node signature matrix to this normal reference resulting in a zero or short relative distance indicates a normal operational state, and vice versa for larger relative distances. For fault isolation, the reference node signature matrix should represent the specific fault that is being isolated. A match of the operational node signature matrix with the fault reference node signature matrix resulting in a zero or relatively short distance implies possible isolation of that fault, given that the resulting distance is not similar to a match with another fault reference node signature matrix. The set of reference node signature matrices is mathematically expressed by equation (7):

N_ref = {N_s,NO, N_s,F1, …, N_s,Fq}  (7)

In equation (7), each matrix represents a reference node signature matrix for a different system state (i.e. normal or faulty), where N_s,NO is the normal state reference and N_s,Fj is the reference node signature matrix for the j-th fault ID. q represents the number of fault IDs, which in this case is 17. Fault states in N_s,k are isolated by comparing their node signature matrices to each of the reference node signature matrices. In this study, the reference node signature matrices were compiled using the averaged data from 70% (the training set) of the respective operational datasets N_s,k.
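The training-set averaging used to compile a reference matrix can be sketched as below. This is an illustrative pure-Python sketch under stated assumptions: the 70% split is taken from the start of the sequence, matrices are plain nested lists, and the tiny sequence of 2×3 matrices is invented for the example.

```python
def average_matrices(mats):
    """Element-wise average of a list of equally sized matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[r][c] for m in mats) / len(mats) for c in range(cols)]
            for r in range(rows)]

def reference_from_sequence(seq, train_frac=0.7):
    """Compile a reference node signature matrix from the training portion
    (first train_frac) of a sequence of operational matrices."""
    cut = int(len(seq) * train_frac)
    return average_matrices(seq[:cut])

# Invented sequence of 10 tiny 2x3 "node signature matrices"
seq = [[[float(k), 1.0, 0.0], [0.0, 2.0, float(k)]] for k in range(10)]
N_ref = reference_from_sequence(seq)
```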

Cost matrix composition
In this study, the heterogeneous Euclidean overlap metric (HEOM) is used as the distance metric for composing cost matrices (C_k). The function allows for graph comparison of continuous attributes (e.g. temperature, pressure, etc.) [42], while also normalising the data. A square cost matrix can be compiled using the HEOM metric that links an attributed graph's nodes to the cost matrix columns. The HEOM metric is given by equation (8):

C_k(i, j) = √( Σ_a ( (N_s,k(i, a) − N_ref(j, a)) / r_a )² )  (8)

In equation (8), N_s,k(i, a) refers to the (i, a) entry in the operational signature matrix, N_ref(j, a) to the corresponding entry in the reference signature matrix, and a the index of a row's a-th column. The factor r_a normalises the attribute, as given by equation (9):

r_a = max(a) − min(a)  (9)

In equation (9), max(a) and min(a) are the respective maximum and minimum values in the a-th column of N_ref. Using equation (8), the first two FDI methods can be applied, as described in Subsections 4.3 and 4.4.
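The cost-matrix composition can be sketched as follows. This is a hedged pure-Python illustration of a HEOM-style distance restricted to continuous attributes; the exact index convention of equation (8) may differ in the original implementation, and the 2×2 matrix contents are invented.

```python
def heom_cost_matrix(N_op, N_ref):
    """Square cost matrix: HEOM-style range-normalised Euclidean distance
    between every row of N_op and every row of N_ref (continuous attributes)."""
    n, m = len(N_ref), len(N_ref[0])
    # r_a = max - min of the a-th column of N_ref (guard against zero range)
    r = []
    for a in range(m):
        col = [N_ref[i][a] for i in range(n)]
        r.append((max(col) - min(col)) or 1.0)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            C[i][j] = sum(((N_op[i][a] - N_ref[j][a]) / r[a]) ** 2
                          for a in range(m)) ** 0.5
    return C

N_ref = [[0.0, 1.0], [2.0, 3.0]]
C = heom_cost_matrix(N_ref, N_ref)   # comparing a matrix with itself
```

Comparing a matrix with itself yields a zero diagonal, which is the baseline behaviour the detection step relies on.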

Distance parameter
The idea of using a distance parameter based on the diagonal values of the cost matrix was suggested by [43]; it calculates the average size of the diagonal values to indicate the distance between the two compared states as a single value. The distance parameter is defined by equation (10):

D_C = (1/n) Σ_{j=1}^{n} C_jj  (10)

In equation (10), C_jj denotes the j-th diagonal cost matrix entry and n the number of diagonal elements. FDI is then achieved by evaluating the distance parameter values for a specific N_s,k compared to each N_ref. The reference node signature matrix associated with the smallest distance parameter reveals the likely current state. This can be visually illustrated with a box-plot of the distance parameters obtained by comparing all known operational node signature matrices N_s,k, which include time series of normal and known fault states, with the set of reference node signature matrices N_ref. Since the states of each operational graph are known in this study, the results can be summarised quantitatively by calculating the percentage of correct detections or isolations of a fault out of the total number of variations in each system state.
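The decision rule can be sketched as below: compute the distance parameter of equation (10) per reference comparison and pick the smallest. The cost matrices and state names in this example are invented for illustration.

```python
def distance_parameter(C):
    """Equation (10): average of the cost-matrix diagonal."""
    n = len(C)
    return sum(C[j][j] for j in range(n)) / n

def classify(cost_matrices):
    """cost_matrices: dict mapping a state name to the cost matrix obtained
    by comparing the operational graph with that state's reference.
    Returns the state with the smallest distance parameter."""
    dps = {state: distance_parameter(C) for state, C in cost_matrices.items()}
    return min(dps, key=dps.get), dps

state, dps = classify({
    "normal": [[0.1, 0.5], [0.4, 0.2]],
    "FT1":    [[0.9, 0.3], [0.2, 0.8]],
})
```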

Eigenvalue decomposition
The approach, according to De Bruin et al. and Uren et al., uses the eigenvalues of the cost matrices to generate a qualitative fault signature [23,44]. The first step of the implementation involves a comparison of each of the 18 sets of operational node signature matrices N_s,k with N_s,NO. Each of these comparisons results in a set of 2880 cost matrices on which eigenvalue decomposition is done. This results in a set of 2880 eigenvalue vectors on which the qualitative FDI method can be applied. The first step thus yields a set of eigenvalue vectors, as portrayed in equation (11), with each column representing a set of 2880 eigenvalue vectors, one column per N_s,k set. Each eigenvalue is then standardised according to equation (12):

λ*_{i,j} = (λ_{i,j} − λ̄_i) / σ_i  (12)

In equation (12), λ̄_i and σ_i are the per-eigenvalue average and standard deviation respectively over all 2880 samples. The next steps involve a repetition of the first step for the remaining 17 N_ref matrices, compared to all N_s,k matrices. Since the number of nodes in the attributed graph (Fig. 3) is 20 and the number of system states is 18, the dimensions of the standardised eigenvalue vectors are 20 × 18 for each sample, making the above qualitative approach impractically complex to interpret.
Instead, a quantitative method is followed whereby the average of each column of the standardised eigenvalue vectors is calculated.
The smallest average value indicates the best match between N_ref and N_s,k, implying fault detection or isolation.
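Only the standardisation step of equation (12) is sketched below; in practice the eigenvalue vectors would first be obtained by decomposing each cost matrix (e.g. with `numpy.linalg.eig`), which is omitted here. The two-sample, two-eigenvalue sequence is invented for illustration, and the zero-deviation guard is an assumption of this sketch.

```python
def standardise(samples):
    """samples: list of eigenvalue vectors, one per time sample.
    Returns, per eigenvalue index i, the sequence (v - mean_i) / std_i
    over all samples (population standard deviation; guard against std = 0)."""
    n = len(samples[0])
    out = []
    for i in range(n):
        vals = [s[i] for s in samples]
        mu = sum(vals) / len(vals)
        sd = (sum((v - mu) ** 2 for v in vals) / len(vals)) ** 0.5 or 1.0
        out.append([(v - mu) / sd for v in vals])
    return out

# Two invented samples of a 2-eigenvalue vector
samples = [[1.0, 4.0], [3.0, 4.0]]
z = standardise(samples)
```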

Residual method
The third approach, as proposed in Neser et al., compiles a residual matrix N_res,k by direct element-wise comparison of an operational node signature matrix N_s,k to a reference node signature matrix N_ref [38]. Using the same notation as defined for the node signature matrix of a graph in Section 4.1, N_res,k is mathematically expressed by equation (13):

N_res,k(i, j) = (N_ref(i, j) − N_s,k(i, j)) / N_ref(i, j)  (13)

In equation (13), i = {1, 2, …, n} and j = {1, 2, …, (n+1)} for all the graphs, representing an n × (n+1) matrix. The division by N_ref(i, j) normalises the residual matrix with respect to the specific reference. The reference graph for this approach requires only the normal state reference, and thus no comparison to any of the fault state reference graphs is necessary. Next, the columns of the residual matrix are normalised with respect to the column maxima, as portrayed in equation (14).
Each row of the resulting N_res,k represents the normalised distances between some operational state and the normal state. It is therefore argued that, when analysing a residual graph, specific rows of each residual have a unique signature attributed to the state of the operational graph. By identifying these unique signatures, the faults can be detected and isolated. The method of identification used in this study can be described by the following steps applied to the normalised residual matrices:
Step 1: Calculate the average and standard deviation for column 1 (Δ Ḃ average and standard deviation) and for the table formed by columns 2 to (n+1) (Ė average and standard deviation).
Step 2: For the entries in column 1, count the number of indices outside the range of the column-1 average plus or minus a set number of standard deviations.
Step 3: Repeat the count of Step 2 for the entries in columns 2 to (n+1), using the Ė average and standard deviation.
Step 4: The results from Steps 2 and 3 form a vector indicating the number of instances in the normalised residual graph that changed beyond the indicated limits or range. This vector is referred to as a frequency vector (f_k).
Step 5: Group and count the number of identical frequency vectors into a number of groups equal to the number of states (normal and fault states) plus one. The extra group represents the frequency vectors that could not be classified or linked to a specific state. The number of unidentified frequency vectors is directly dependent on the ranges used in Steps 2 and 3: enlarging the range from 2 to 3 or more standard deviations above and below the average would decrease the number of unidentified frequency vectors but would also increase the number of incorrectly identified states.
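The steps above can be sketched as a small pure-Python pipeline. This is an illustrative sketch under stated assumptions: tiny invented 2×2 matrices stand in for the n × (n+1) node signature matrices, zero-valued reference entries and zero column maxima are guarded with 1.0, and a 2-standard-deviation range is used.

```python
def residual_matrix(N_ref, N_op):
    """Relative residual against the normal reference (guarding zero entries)."""
    n, m = len(N_ref), len(N_ref[0])
    return [[(N_ref[i][j] - N_op[i][j]) / (N_ref[i][j] or 1.0)
             for j in range(m)] for i in range(n)]

def normalise_columns(R):
    """Normalise each column by its maximum absolute value."""
    n, m = len(R), len(R[0])
    out = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col_max = max(abs(R[i][j]) for i in range(n)) or 1.0
        for i in range(n):
            out[i][j] = R[i][j] / col_max
    return out

def outlier_count(values, k=2.0):
    """Count entries outside mean +/- k standard deviations (Steps 2 and 3)."""
    mu = sum(values) / len(values)
    sd = (sum((v - mu) ** 2 for v in values) / len(values)) ** 0.5
    return sum(1 for v in values if abs(v - mu) > k * sd)

N_ref = [[2.0, 4.0], [2.0, 4.0]]
N_op = [[2.0, 4.0], [1.0, 4.0]]
R = normalise_columns(residual_matrix(N_ref, N_op))
# Frequency vector: one count for the exergy column, one for the power-flow column
f = [outlier_count([row[0] for row in R]), outlier_count([row[1] for row in R])]
```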

Experimental design
The experimental design consists of three main steps that result in the composition of time series data sets: obtaining physical system data from the Plant under normal operating conditions, simulating each fault using the Ebsilon® model, and combining the physical and simulated data. This results in data sets for each FT representing the faulty conditions, encapsulating typical actual plant noise and variations. A flow diagram illustrating this three-part process and the data variables for this study is shown in Fig. 4.

Step 1: physical data measurement
Due to the sensor placement and layout of the system, it was not possible to measure data at each of the inputs and outputs of the components shown by the node locations in Fig. 3. These components, referred to as the "missing components" and listed in Table 4 as available or not, do not have physical data available. Instead, it was assumed that the variation in the data typically measured at, for example, the MPT and LPT would be closely related to the components before and after the missing components in the cycle (following a logical path through the system based on the node numbering). It was further assumed that the variation in the data of a missing component can be approximated by the average of the closest measured components on both sides of that component. For instance, the variation in the data of the high-pressure pre-heater (node 19) can be obtained by taking the average variation between the main water pump output data (node 16) and the boiler input data (node 1). Following this approach for all the components, where required, resulted in 19 sets of data, one set for each node.
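The neighbour-averaging approximation can be sketched as below. The variable names and the short series are invented for illustration; the node numbers in the comments follow the example given in the text.

```python
def approximate_missing(upstream, downstream):
    """Element-wise average of two equal-length variation time series,
    standing in for an unmeasured node between them."""
    return [(u + d) / 2.0 for u, d in zip(upstream, downstream)]

pump_out = [1.0, 2.0, 3.0]    # e.g. main water pump output variation (node 16)
boiler_in = [3.0, 4.0, 5.0]   # e.g. boiler input variation (node 1)
hp_preheater = approximate_missing(pump_out, boiler_in)  # node 19 estimate
```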

Step 2: fault simulations
The goal of simulating faults in the Plant was to obtain a set of data from which the energy attributes could be composed.
Consequently, the model outputs the pressure, temperature, and mass flow rate at specific locations in the system, which were then used to calculate the change in exergy, heat flow rate, and mechanical power. Since the efficiencies of each component are specified, changing these component efficiencies, input parameters, or characteristics delivers results for the typical conditions at which the system would operate. The process followed in Ebsilon® for each fault is discussed next.
Fig. 4. Flow diagram illustrating the process followed to obtain the time series data (obtain a window size for the moving average filter; remove noise from the data using the filter; expand the data to include components where measurements were not possible; simulate each fault type at the relevant nodes).

Simulating solid particle erosion (FT1)
FT1 is SPE and was simulated in both the HPT and the first stage of the MPT, which both have a direct connection to the boiler in Fig. 1. This typically manifests as a reduced efficiency and an increase in suction capacity that increases the mass flow rate. For each simulation run, only one of the two components' (HPT or MPT) parameters is adjusted while the other is kept at its default values.

Simulating leakage of the overflow valve (FT2)
The valve component between the turbine stages was given a general input value that ensured a specific mass flow rate leaking out of the valve. Since the result from the model is at steady state, simply specifying the mass flow rate is adequate for this simulation.

Simulating overall wear and ageing (FT3)
The simulation of overall wear and ageing can be challenging with a steady-state model, since this fault type typically develops over time due to several underlying faults such as SPE. A typical workaround is to simulate wear in multiple components at once by adjusting the efficiency and performance parameters of multiple components in the system, resulting in a change in the system parameters similar to SPE or cavitation.

Simulating pump cavitation (FT4)
Accurate simulation of pump cavitation can be extremely difficult, especially if the dynamic aspects of cavitation are to be simulated as well [45]. For the purpose of this paper, however, the steady-state effects of cavitation on the total system can be simulated by adjusting the thermodynamic variables at the pump inlet such that its input pressure falls within a range where cavitation would take place. To enable the Ebsilon Ⓡ model to do this, a simple piping component with adjustable physical properties is added before the pump. By adjusting the pressure drop over the piping component to 3% or more, the net positive suction head (NPSH) of the fluid drops sufficiently for cavitation to occur within the pump [45].
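The steady-state cavitation criterion can be illustrated with a minimal sketch. All numerical values (inlet pressure, vapour pressure, density, and the 3 m required NPSH) are hypothetical placeholders chosen for illustration; the only element taken from the text is the idea that a pressure drop over an upstream piping element can push the available NPSH below the pump's requirement.

```python
def inlet_pressure_after_pipe(p_upstream_pa, drop_fraction):
    """Pressure at the pump inlet after the adjustable piping element."""
    return p_upstream_pa * (1.0 - drop_fraction)

def cavitation_expected(p_inlet_pa, p_vapour_pa, rho_kg_m3,
                        npsh_required_m, g=9.81):
    """Steady-state check: cavitation is expected once the available
    NPSH falls below the pump's required NPSH."""
    npsh_available_m = (p_inlet_pa - p_vapour_pa) / (rho_kg_m3 * g)
    return npsh_available_m < npsh_required_m

# Hypothetical feed-water conditions: a 3% pressure drop over the piping
# component lowers the inlet pressure enough that the available NPSH
# (about 2.7 m here) falls below a required 3 m.
p_inlet = inlet_pressure_after_pipe(40e3, 0.03)   # 38.8 kPa
expect = cavitation_expected(p_inlet, 12.3e3, 988.0, 3.0)
```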

Step 3: data conditioning
Since practical fault data was not available, representative fault data was obtained in the data conditioning phase. This data was composed by calculating the energy attributes for both model and physical data and then constructing the time series of each fault set by combining the variations in the physical data with the model data. The outcome is sets of time series data that approximate the typical variation in the energy variables under fault conditions. An array of physical data and simulation data of a single variable (e.g. temperature) is provided as input. The time series consists of a total of 2880 data points, implying that each system state has at least 2880 variations for each intensity of the various FTs. The physical data is then filtered using a moving average filter, given by equation (16).
In equation (16), x̄(i) is the filtered value, n the number of neighbouring data points on each side, and 2n + 1 the span or window over which the average is calculated. A 10-point moving average filter was selected for the physical data, based on a sweep analysis comparing the sum of absolute differences between the original data and the filtered data for various window sizes. A clear knee in the sweep plot was seen for all data sets at window sizes between 7 and 10. After applying this approach to each of the temperature, pressure, and mass flow rate variables of each system state, 18 data sets are obtained. A general example of such a data set for a specific operation is given in Table 5. Such data sets are obtained for each of the system states. Using this data as the input to a lookup function that utilises the steam/water property (IAPWS) tables, the enthalpy and entropy values required for (2) and (3) are obtained. These equations are then used to calculate the energy and rate of exergy change, thus acquiring all the necessary data to populate the node signature matrix. Note that the sub-heading in Table 5 (i.e. row 2) indicates the node/location, as specified in Fig. 3, that is associated with the various columns of data. The energy attributes calculated at each of these locations are therefore directly associated with the various nodes and links in Fig. 3 and therefore also with the node signature matrices.
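A centred moving average filter of the kind described by equation (16) can be sketched as follows. This is a generic implementation, not the authors' code; the handling of the edge windows (using only the available points) is an assumption, since the paper does not specify it.

```python
import numpy as np

def moving_average(x, n):
    """Centred moving average with n neighbouring points on each side,
    i.e. a window span of 2n + 1. Edge windows use only the points
    that actually exist in the series."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for i in range(len(x)):
        lo, hi = max(0, i - n), min(len(x), i + n + 1)
        out[i] = x[lo:hi].mean()
    return out

smoothed = moving_average([1, 2, 3, 4, 5], n=1)
```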

Application of FDI techniques
The performance metrics used to evaluate the detection and isolation performance are sensitivity and robustness. For the purposes of this study, sensitivity is defined as a measure of the FDI method's true alarm rate (TAR) [46,47], i.e. sensitivity to changes in the size and type of the fault. A high TAR is indicative of a highly sensitive FDI method. This can be quantified using equation (17). Similarly, the robustness of a method is defined as a measure of the FDI method's false alarm rate (FAR), with the goal of minimising the FAR [46,48], and is quantified using equation (18). Robustness is therefore a measure of how disturbances and noise affect the FDI method's performance, with a low FAR indicative of a robust FDI method. There is thus an expected trade-off between the robustness and sensitivity of an FDI method. To calculate these performance metrics, four outcomes of the FDI technique are considered, as described in Table 6.

True negative (TN): A fault-free condition is detected while the true condition was fault-free.
False negative (FN): A fault-free condition is detected while the true condition was faulty.
False positive (FP): A fault condition is detected while the true condition was fault-free.
True positive (TP): A fault condition is detected while the true condition was faulty.
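Since equations (17) and (18) are not reproduced here, the sketch below assumes the standard alarm-rate definitions in terms of the four outcomes in Table 6: TAR as the fraction of truly faulty conditions correctly flagged, and FAR as the fraction of truly fault-free conditions incorrectly flagged.

```python
def true_alarm_rate(tp, fn):
    """TAR (sensitivity): fraction of truly faulty conditions
    that were correctly flagged as faulty."""
    return tp / (tp + fn)

def false_alarm_rate(fp, tn):
    """FAR: fraction of truly fault-free conditions
    that were incorrectly flagged as faulty."""
    return fp / (fp + tn)

# Hypothetical counts: 70 TP, 30 FN, 10 FP, 90 TN.
tar = true_alarm_rate(70, 30)   # high TAR -> sensitive method
far = false_alarm_rate(10, 90)  # low FAR -> robust method
```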

Results
To evaluate the aforementioned methods, the set of operational node signature matrices is purposefully composed of graphs representing only specific states and then compared to each reference graph (defined in equation (7)). The matching results are tabulated in Fig. 5, where the rows indicate the results from a match between an operational matrix and a reference matrix (the Ψ operation). The following approach is used to analyse the results in Fig. 5(a, b and c): a fault is isolable if no match with another fault or the normal state is established. In terms of Fig. 5 this implies that all cells in the same row and column of the system state under consideration, except for the cell of the system state itself, are empty. Since a fault that is isolable is also detectable, only the results in column 1 of Fig. 5(a, b and c) are evaluated to determine detectability (isolating the normal state implies detection of all fault states). Thus, if the TAR is 100% (dark green dot in Fig. 5), there will also be no other highlighted cells in that same row of Fig. 5, implying that all the faults could be detected.
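The isolability criterion above (empty row and column apart from the state's own cell) can be expressed as a simple check on a match table. The table below is a hypothetical stand-in for the counts tabulated in Fig. 5, with rows and columns indexed by system state.

```python
import numpy as np

def is_isolable(match_table, i):
    """State i is isolable if every cell in row i and column i of the
    match table, other than cell (i, i) itself, is empty (zero)."""
    m = np.asarray(match_table)
    row_clear = np.all(np.delete(m[i, :], i) == 0)
    col_clear = np.all(np.delete(m[:, i], i) == 0)
    return bool(row_clear and col_clear)

# Hypothetical 2-state match tables.
clean = [[100, 0], [0, 100]]       # state 0 isolable
overlap = [[80, 20], [0, 100]]     # state 0 also matches state 1
```
Because an isolable fault is also detectable, this check subsumes detectability for the state under consideration.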

Distance parameter results
The box-plot results of the distance parameter method in Fig. 6 show the variation in the distance parameters for each state. The box body contains the average (red centre-line), 75th percentile, and 25th percentile, with red crosses indicating outliers. The detection of each fault was determined by comparing each operational matrix with the normal-state reference, as depicted by the smallest distance parameter in Fig. 6(a). Since the normal state can be isolated with no overlap with any other matches, all faults were detectable. Similarly, the faults were then isolated by comparing the operational matrices to each fault reference graph. In Fig. 6(b) the comparison with the FT1B reference gave the smallest distance parameters, implying that FT1B is isolable. However, some overlap with other faults is seen, especially with FT1C, which is a variation in the intensity of FT1B.
The results are summarised in Fig. 5(a), showing the number of smallest distance parameters obtained from each comparison as a percentage of the 2880 variations of each state. To evaluate the robustness and sensitivity of the method, the TAR and FAR calculations were done for the results from Fig. 5(a) and are shown in Fig. 7 when distinctions in fault variations are neglected (i.e., for example, FT1B and FT1C are the same fault with different intensities but are evaluated as if they represent different faults), and in Fig. 8 when distinctions in fault variations are accounted for. Note that Figs. 7 and 8 also show the results for the eigenvalue decomposition and residual methods that will be discussed next.

Eigenvalue decomposition results
The same comparison sequence was followed here as in section 7.1. The results, obtained by selecting the smallest average value in each column of the standardised eigenvalue vectors from (11), are shown in Fig. 5(b), and are expressed as a percentage of the 2880 time-dependent variations in terms of correct detection or isolation rate. For the performance evaluation, the TAR and FAR were calculated using (17) and (18). The isolation TAR results for all FTs are shown in Fig. 7 for the case where fault variations are not accounted for. Fig. 8 shows the results accounting for fault variations.
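The selection rule, picking the reference state whose standardised eigenvalue vector has the smallest average value, can be sketched as below. The matrix layout (one column of standardised eigenvalues per reference state) and the use of magnitudes are assumptions, as equation (11) is not reproduced here.

```python
import numpy as np

def match_state(std_eig_vectors, state_labels):
    """Select the reference state whose column of standardised
    eigenvalues has the smallest average magnitude."""
    cols = np.abs(np.asarray(std_eig_vectors, dtype=float))
    return state_labels[int(np.argmin(cols.mean(axis=0)))]

# Hypothetical standardised eigenvalue vectors for two reference states.
vectors = [[0.1, 0.5],
           [0.2, 0.4]]
best = match_state(vectors, ["NO", "FT1A"])
```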

Residual method results
The results for the residual method are presented in Fig. 5(c), obtained for the ranges defined in Steps 2 and 3 of section 4.5. A range of two standard deviations was used, based on the statistical significance of such a range; it offered the best accuracy compared to six variations of the standard-deviation multiplier, n ∈ {0.5, 1, 1.5, 2, 3, 3.5}. The calculated TAR results are shown in Fig. 7 for the case where fault variations are not accounted for, and in Fig. 8 where variations in fault size and location are accounted for.
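The two-standard-deviation range can be illustrated as a simple symmetric band test on the residuals. This is a generic sketch of the thresholding idea, not the ranges defined in Steps 2 and 3 of section 4.5; the residual values and sigma below are hypothetical.

```python
import numpy as np

def residual_alarms(residuals, sigma, n_std=2.0):
    """Flag residuals that fall outside the +/- n_std * sigma band."""
    r = np.asarray(residuals, dtype=float)
    return np.abs(r) > n_std * sigma

# Hypothetical residuals with sigma = 1.0: only values beyond +/- 2 flag.
flags = residual_alarms([0.1, 2.5, -3.0], sigma=1.0)
```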

Discussion of results
This section compares the three techniques with respect to their detection and isolation performance, based on the results in Fig. 5 and Figs. 7 and 8. The detection and isolation performance are characterised in terms of sensitivity and robustness.

Distance parameter detection performance
Sensitivity: Changing the size or location of any fault did not change the TAR, as seen in Fig. 5(a). The method's detection sensitivity is therefore high.

Robustness: The FAR for detection was 0% regardless of the change in fault size or location, as seen in Fig. 5(a). The distance parameter method is therefore also highly robust in terms of detection.

Conclusion
Based on the detection and isolation results, it is concluded that the distance parameter and eigenvalue decomposition methods outperformed the residual method. The residual method also had high inaccuracies in terms of overall isolation when compared to the other two methods, rendering it ineffective by comparison. Given the small performance improvement of the eigenvalue decomposition method over the distance parameter method, it is further concluded that the mathematical and implementation complexity of the eigenvalue decomposition method outweighs its effectiveness. However, since the distance parameter method has overlapping values, a set of faults that cause less variation in the energy parameters of a system may well result in lower performance, in which case the eigenvalue decomposition method is more accurate. The residual method produced false isolation results for all fault types, rendering the results inconclusive with respect to sensitivity and robustness.
The energy graph-based FDI methods proposed in this article are not regarded as mature, and further work is warranted. Firstly, generalising the techniques, taking into account different working points and operational modes and determining the effect of sample size and frequency, is seen as paramount. Secondly, the availability of energy data and the consideration of machine learning and data fusion to facilitate the estimation of energy attributes are deemed important. Finally, comparing the energy graph-based methods with standard techniques such as PCA, and more pertinently establishing the role of spectral theory in this context, is warranted future work.

J.H. Smith, G. van Schoor, K.R. Uren et al.

Fig. 7. Isolation TAR results for the distance parameter, eigenvalue, and residual method while neglecting fault variations.

Fig. 8. Isolation TAR results for the distance parameter, eigenvalue, and residual method while accounting for fault variations.

Table 1
List of the energy-significant components in ℎ   and their corresponding component numbers.

Table 2
Comparison of the Ebsilon Ⓡ model with the actual plant data.

Table 3
List of all considered fault types, with corresponding location and intensity.

Table 4
Availability of component data from ℎ  .

Table 5
Illustration of the data set consisting of temperature, pressure and mass flow rate data of a system state.

Table 6
Description of the four outcomes of the FDI techniques.
Each outcome rate is expressed as a percentage of the total number of comparisons, i.e. the outcome count divided by (TN + FN + FP + TP), multiplied by 100%. As an example, consider hypothetical operational data obtained for fault ID1. Suppose a total of 100 measurements were taken during the considered time interval, resulting in a set of 100 operational node signature matrices. Next, a comparison operation, denoted by Ψ, is executed whereby the operational node signature matrices are compared with the reference node signature matrices. When compared with the FT1A reference, 70 out of the 100 matches correctly identified FT1A as the fault condition under which the plant is operating, implying a true positive rate of 70% (TP = 70%). When compared with the FT1B reference, 25 out of the 100 comparisons were incorrectly identified as FT1B (FP = 25%), since the fault occurring in this example is FT1A. When compared with the FT3C reference, 5 out of the 100 comparisons were incorrectly identified as FT3C (FP = 5%). For all other comparisons, no fault is identified and therefore FP = 0%. Since FT1B is simply a variation in the location of FT1, two interpretations of the data are possible. If a distinction between the fault intensities or locations of a specific fault type is required (i.e. FT1A ≠ FT1B ≠ FT3C), the false positive rate is obtained by adding the 25% and 5% (FP = 30%).
The isolation TAR and FAR are then calculated using equations (19) and (20).
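The arithmetic of the worked example can be reproduced directly; the counts are the hypothetical ones from the example itself (70 correct FT1A matches, 25 FT1B mismatches, and 5 FT3C mismatches out of 100 comparisons).

```python
total = 100      # operational node signature matrices for the FT1A example
tp_count = 70    # matches correctly identifying FT1A
fp_ft1b = 25     # matches incorrectly identifying FT1B
fp_ft3c = 5      # matches incorrectly identifying FT3C

tp_rate = 100.0 * tp_count / total             # true positive rate, in %
# When fault intensities/locations are distinguished (FT1A != FT1B != FT3C),
# the false positives from FT1B and FT3C are added together.
fp_rate = 100.0 * (fp_ft1b + fp_ft3c) / total  # false positive rate, in %
```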