Digital twins for design in the presence of uncertainties

Successful application of digital twins in the design process requires a tailored approach to identify high-value information from uncertain data. We propose a non-intrusive sensitivity metric toolbox that integrates black-box digital twins into the design and decision process under uncertainty. The toolbox captures the evolving nature of the key design performance indicators (KPIs) and provides both KPI-free and KPI-based metrics. The KPI-free metrics, which are based on entropy and Fisher information but independent of design KPIs, are shown to give a good indication of the most influential data for the KPI-based metrics. This suggests a consistent identification of high-value data throughout the design process.


Introduction
Research relating to Digital Twins (DTs) has mostly focused on manufacturing, operation and maintenance, where the corresponding physical entities exist. However, at the design stage, the virtual part of a DT predates its physical part. As a result, the DT at the design stage is simulation based and mirrors the expected functions and events of the physical entity.
Although computational models are commonly used in the design process to simulate expected design performance, DTs offer an enhanced approach as they are based on a fusion of models and data. Data at the design stage can be historical, being based on existing knowledge and experience, or bespoke, being based on targeted measurements, such as environmental data or material properties. Data can come from multiple sources and can have different levels of relevance. For the successful and efficient application of a DT in design, there is therefore a need to identify data with a high value of information that can be used to support decision-making at different stages of the design process.
With an emphasis on data integration, an overview of the application of a DT in the design process under uncertainty is given in Fig. 1. At each step of the design process, the quantities of interest (QoIs) are estimated from the DT. Decision makers then extract useful information from the QoIs using design metrics and decide either to accept the design, to improve it, or to obtain more data to better inform the decision-making process. One of the key steps in this process is to identify high-value information, denoted as b*, and that is the focus of this paper.
In this paper, we propose new design sensitivity metrics that identify the most influential parameters from the DT to support decision-making. The essence of the design metrics, as highlighted in the shaded hexagons in Fig. 1, is to couple the design process (vertical axis) and the decision process (horizontal axis). This has prompted the development of a metric toolbox for DTs in the design process that focuses on data: the Toolbox for Engineering Design Sensitivity (TEDS). The unique features of TEDS are that (a) it identifies the important uncertain data that the design should focus on; (b) it is non-intrusive and thus allows the integration of black-box DTs; and (c) it is developed for the design process and thus allows more tailored design decisions. The toolbox mainly includes two design metrics, namely the probability of acceptance and the design entropy, together with their sensitivities (more discussion of TEDS is given in Section 2 below). The relationship of the present approach to the previous literature is discussed in what follows.
The acceptability of a design is a widely used metric to evaluate a design against requirements. However, design requirements can evolve during the design process. The implications of changing requirements are mainly twofold. First, designers are uncertain about the specific level of design acceptance. To capture this uncertainty, a probabilistic acceptability model is proposed in [1,2] to provide designers with an intuitive way to evaluate design performance, where a single quantitative output representing the probability that the design's performance will be acceptable is computed. This metric, the probability of acceptance, thus evaluates a design directly against uncertain requirements and is included in TEDS.
Another implication of the evolving requirements is that the acceptability function changes as the understanding of the problem evolves, and this is especially common at the early stages of design. In this case, it becomes more important to measure the internal structure of the design, such as the complexity of the design, without comparing to specific requirements. In [3], design complexity is defined as a function of the design's information content, as captured by intelligent computer-aided design systems. The information content can be interpreted as the least number of bits needed to represent the design. By assuming equal probability for the variables (operators and operands), the information content metric is shown to be equivalent to entropy. In [4], the entropy of the design solutions related to the functional requirements, which are assumed to be continuous random variables, is used to measure the complexity of the overall design solution. In [5], the design complexity is measured using an exponential information entropy. This metric is based on the information content in the estimated quantities of interest and is intended to be used mainly in simulation-based design activities. The exponential entropy can be related to the length of the support of the joint probability density function of the random design quantity of interest. Entropy thus measures the design complexity and its information content. This makes entropy most suitable for design under uncertainties, and it is included in TEDS as a design metric. Note that, differently from the applications described above, entropy can also be applied to the probability of acceptance to measure the complexity of the design related to the design requirements [6].
Neither of the design metrics discussed above, the probability of acceptance or the entropy, can guide design decisions directly. In order to identify high-value information or the most influential parameters, it is desirable to conduct a sensitivity analysis, where the change of a design metric is quantified subject to a change in the data or the design parameters. The sensitivity of the probability of failure is often a key element in reliability-based optimization, where the gradients of the objective function are required [7,8]. The gradients can be obtained using a polynomial chaos method, where a functional relationship between the uncertain response and the input variables is represented as a polynomial expansion [9]. However, the application of the polynomial chaos method can be difficult because the failure region is often unknown. In [10], an indicator function is introduced to transform the failure region to the full random space, and a smooth approximation is adopted to compute the gradients. An extended polynomial chaos formalism has been applied to evaluate the sensitivities of output distribution functions to input distribution parameters in [11], where the application to reliability sensitivity requires the characterisation of the failure probabilities as random variables themselves. When Monte Carlo methods are used, the numerical effort involved in the calculation of the failure probability and its sensitivities can be prohibitive [12]. One of the approaches to compute sensitivities using the Monte Carlo method is the Likelihood Ratio (LR) method [13], where the sensitivities of the response function are computed with respect to the distribution parameters of the random variables. Although a well-known method in stochastic optimization [13], there have not been many applications of the LR method in engineering. Some examples of applying the LR method in reliability sensitivity analysis can be found in [14-17,37]. In this paper, we will be using the LR method for non-intrusive computations of the probability gradients, so that the toolbox TEDS can be applied efficiently to black-box digital twins.
The sensitivity of entropy can be evaluated using the Kullback-Leibler (K-L) divergence (aka relative entropy). Drawing ideas from sensitivity analysis based on variance decomposition, an entropy-power-based decomposition method is proposed in [18] for design sensitivity analysis. The sensitivity index takes account of both the design complexity, quantified by the entropy power (proportional to the square of the exponential entropy), and the non-Gaussianity, quantified using the K-L divergence. In [19,20], a sensitivity analysis using the K-L divergence is proposed, measuring the divergence between the two PDFs (probability density functions) before and after uncertainty reduction of the random variable of interest. This utilises the concept of omission sensitivity, where a random variable is made deterministic to eliminate its uncertainty. Depending on the design objective, the proposed K-L divergence can be applied to either the full range of the PDF (integration of the K-L entropy over the entire distribution) or a partial range of the PDF (partial integration). As explained below, the K-L divergence will be used in a different way in the present work.
In this paper, a design-process-tailored toolbox, TEDS, is developed, which includes both the probability of failure and the entropy as design metrics. The K-L divergence is used in TEDS to measure the sensitivity of the entropy to the data. Different from [19,20], however, we propose a new design metric using the Fisher information. The Fisher information was first introduced by Fisher [21] and is widely used for parameter estimation and statistical inference. Estimated parameter values will show variability, and the Fisher information provides a lower bound on the variance of any unbiased estimate (the Cramér-Rao lower bound). The Fisher information is used in a different way in the present work. The distribution parameters b of the design variables x_design are assumed to be known from data, as shown in Fig. 1. The Fisher information is used to measure the sensitivity of the uncertain quantities of interest, y_QoI, with respect to the input distribution parameters b. To consider the interdependency of the uncertain parameters, the Fisher Information Matrix (FIM) is formed. The eigenvalues and eigenvectors of the FIM are found to indicate the principal components and the principal directions of the entropic sensitivities. The principal directions thus provide us with the most sensitive parameters b* and correspondingly indicate the data of most importance to the design.
Note that it is assumed in this paper that x_design has a probability distribution that can be characterised by the distribution parameters b. This includes deterministic variables, which can be approximated using a delta distribution function. A lack of knowledge and information about the design variables is then reflected in the uncertainty in the values of the parameters b. It is thus out of the scope of the present sensitivity framework to consider imprecise probabilities, which might be applicable when only very scarce data are available, such that a probability distribution cannot be identified without significant assumptions. Recent sensitivity studies in terms of imprecise probability are described in [22], within the paradigm of probability bounds, and in [23], which proposes probabilistic Sobol' sensitivity indices.
In what follows, TEDS is introduced in Section 2, with a qualitative overview of the relationship between the proposed design metrics and the design process. The mathematical framework of the design metrics and their sensitivities is presented in detail in Section 3. A case study for the dynamic design of an offshore structure is conducted in Section 4 to illustrate the application of TEDS, where two design scenarios are considered and compared. In Section 5, the relationship between the two types of metrics and their impact on design decision-making is discussed. Concluding remarks are given in Section 6.

Fig. 2. TEDS (Toolbox for Engineering Design Sensitivity) provides design-process-tailored metrics.

TEDS: toolbox for engineering design sensitivity
In this section, we give a qualitative overview of how the proposed design metrics are tailored to the design process; the corresponding quantitative details are given in Section 3 below.
The design process is commonly divided into multiple phases, ranging from problem definition to design verification. Although the detailed design steps vary from case to case, the essence of the design process is to have a converging solution. Note that the design solution in the presence of uncertainties will have probabilistic design requirements. As the design requirements evolve during the design process, we have divided the design process into two stages: the early conceptual design stage and the detailed design stage. At the detailed stages of the design process, we normally have a specified design target or key performance indicator (KPI). The need is to find the probability that the design will meet the target, P_f, and also the sensitivity of this probability to the input data, so that design decisions can be made accordingly. In this case, the design metrics are denoted as KPI-based and are most suitable for the detailed design stages, as indicated in Fig. 2. Note that the requirement (KPI) needs to be specific but can be uncertain. For example, one of the KPIs used in the case study in Section 4 is the design life of the structure due to fatigue failure. The design criterion is specific, i.e., a fatigue failure rather than a failure due to excessive displacement, but the expected life duration can itself be uncertain.

On the other hand, at the early design stages, a specific design requirement is not normally available. This could be due to either too much uncertainty to fix the requirements, or a preference to waive specific requirements so that a much wider design space can be explored. As there are no specific design KPIs, the design metrics that are most suitable at the conceptual design stage are called KPI-free metrics, as indicated in Fig. 2. The KPI-free metrics are the design entropy H and its sensitivity to perturbations of the design parameters, which are independent of specific design targets (KPIs).
One of the key elements needed to implement TEDS is the partial derivative of a probability measure (more details in Section 3.4). In a standard setting, especially for black-box models, the function has to be evaluated twice to compute the derivative using the finite difference method. This is time consuming for complicated simulations using the Monte Carlo method. In TEDS, the Likelihood Ratio (LR) method [13], also known as the score function method [14], is applied. In the LR method, the partial derivative of a probability measure with respect to (w.r.t.) continuous parameters can be obtained efficiently in a single simulation run. As it is a sampling-based method, TEDS is non-intrusive and can be applied to black-box digital twins, as indicated in Fig. 2.

Methodology
In this section, the theory underpinning the sensitivity toolbox TEDS is presented. An overview of the mathematical framework is shown in Fig. 3. It is assumed that the input uncertainties can be described by parametric distribution models, where x is a vector of random variables and b is a vector of corresponding distribution parameters (e.g. mean or standard deviation). y is the design quantity of interest output from a numerical model h, where h can be a black-box model.
In accordance with Fig. 2, two sensitivity metrics are estimated using TEDS. The first one is the KPI-free design entropy, denoted as H in Fig. 3. The sensitivity of the design entropy is linked to the eigenvalues of the Fisher Information Matrix (FIM) (denoted as F in Fig. 3), which will be introduced in detail in Section 3.1. The other metric is the KPI-based failure probability P_f, whose sensitivity is quantified using the partial derivative of P_f with respect to the distribution parameters b. Details of these two metrics are given in Sections 3.1 and 3.2 respectively. In Section 3.3, the relationship between the two metrics is explored via projections of the sensitivity vectors, and the numerical implementation steps are given in Section 3.4.

Fig. 3. Overview of the mathematical framework of TEDS, where the impact of Δb on both the KPI-free metric H and the KPI-based metric P_f is evaluated. The eigenvectors of the matrix F form the eigenbasis for projections of the sensitivity vectors.

Design entropy
Entropy can be used to quantify the information contained in an uncertain variable. In the KPI-free case, the expected information contained in the design quantity of interest, p(y|b), is measured using entropy:

H = − ∫ p(y|b) ln p(y|b) dy,     (1)

where H is called the design entropy in our context. Note that b is not a distribution parameter of p(y); the dependence on b is propagated through the function y = h(x). To look at the sensitivity, we can then define the perturbation of the design entropy as a relative entropy quantified using the K-L divergence:

ΔH(Δb) = ∫ p(y|b) ln [ p(y|b) / p(y|b+Δb) ] dy,     (2)

where p(y|b+Δb) is the perturbed PDF. For ease of notation, p(y|b) will be simplified as p(y) or p wherever the context is clear.
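As a rough illustration, the design entropy and its K-L perturbation can be estimated from Monte Carlo samples of a black-box model using histogram density estimates. The sketch below is a minimal, non-authoritative implementation; the one-dimensional model h and the parameter values are hypothetical stand-ins, not the riser model used later in the paper:

```python
import numpy as np

def hist_entropy(samples, bins=50):
    """Design entropy H = -integral of p(y) ln p(y) dy, from a histogram density."""
    p, edges = np.histogram(samples, bins=bins, density=True)
    w = np.diff(edges)                      # bin widths
    m = p > 0
    return -np.sum(p[m] * np.log(p[m]) * w[m])

def hist_kl(samples_p, samples_q, bins=50):
    """K-L divergence D(p || q) between two sample sets on a shared grid."""
    lo = min(samples_p.min(), samples_q.min())
    hi = max(samples_p.max(), samples_q.max())
    edges = np.linspace(lo, hi, bins + 1)
    w = np.diff(edges)
    p, _ = np.histogram(samples_p, bins=edges, density=True)
    q, _ = np.histogram(samples_q, bins=edges, density=True)
    m = (p > 0) & (q > 0)                   # restrict to bins populated by both
    return np.sum(p[m] * np.log(p[m] / q[m]) * w[m])

rng = np.random.default_rng(0)
h = lambda x: x**2 + 2.0 * x                # hypothetical black-box model y = h(x)
mu, sd = 1.0, 0.5                           # distribution parameters b
y_nom = h(rng.normal(mu, sd, 100_000))      # samples of p(y | b)
y_pert = h(rng.normal(1.05 * mu, sd, 100_000))  # samples of p(y | b + Delta b)
print(hist_entropy(y_nom))                  # design entropy H
print(hist_kl(y_nom, y_pert))               # perturbation of H (K-L divergence)
```

A histogram estimator is the simplest choice consistent with the numerical implementation discussed later; kernel density estimates could equally be used.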

Quadratic form for the perturbation of design entropy
Eq. (2) presents a metric to compute the impact of parameter perturbations, but it does not explicitly identify the most influential set of parameters. This is not an easy task because the impacts of the parameters on the design entropy are not independent. In this paper, we represent the perturbation of the design entropy as a quadratic form using the positive semidefinite symmetric Fisher Information Matrix (FIM). The eigenvectors of the FIM can then be used to identify an orthogonal set of parameters for our sensitivity analysis.
First, we can take the Taylor expansion of the following term at the current design point b:

ln p(b+Δb) ≈ ln p(b) + {∂ln p(b)/∂b}ᵀ Δb + (1/2) Δbᵀ {∂²ln p(b)/∂b ∂bᵀ} Δb,     (3)

where p(b+Δb) is the simplified notation for p(y|b+Δb) and the terms in the curly brackets are the gradient vector and the Hessian matrix respectively. The perturbation of the design entropy can then be represented by a quadratic form using the FIM:

ΔH(Δb) ≈ (1/2) Δbᵀ F Δb,     (4)

where Eq. (3) has been substituted back into Eq. (2) and third and higher order terms are ignored. Note that it has been assumed that the differential and integral operators are commutative, i.e. the order of the two operations can be exchanged under regularity conditions of continuous and bounded functions [13], and it has been noted that the PDF has unit area (cf. Eq. (6)).

Interpretation of FIM
The matrix F in Eq. (4) is the Fisher Information Matrix (FIM). The jk-th entry of the FIM can be expressed as:

F_jk = E[ (∂ln p(y)/∂b_j)(∂ln p(y)/∂b_k) ] = ∫ (∂ln p(y)/∂b_j)(∂ln p(y)/∂b_k) p(y) dy,     (5)

where the mean value of the partial derivative term, ∂ln p(y)/∂b_j, is zero:

E[ ∂ln p(y)/∂b_j ] = ∫ (∂p(y)/∂b_j) dy = (∂/∂b_j) ∫ p(y) dy = 0.     (6)

Thus, from Eq. (5), it can be seen that the jk-th entry of the FIM is the covariance between the partial derivatives of the log probability distribution with respect to (w.r.t.) the parameters b_j and b_k respectively:

F_jk = Cov( ∂ln p(y)/∂b_j, ∂ln p(y)/∂b_k ).     (7)

Therefore, the FIM is the covariance matrix of these partial derivatives.
In matrix form,

F = E[ β βᵀ ],     (8)

where for simplicity β is used to denote the distribution sensitivity vector:

β_j = ∂ln p(y)/∂b_j.     (9)

Using the quadratic representation in Eq. (4) and noting that the FIM is a covariance matrix, it can be seen that the KPI-free sensitivity measures the variance of the perturbation of the log distribution of the quantities of interest:

ΔH(Δb) ≈ (1/2) Δbᵀ E[ β βᵀ ] Δb = (1/2) Var(βᵀ Δb).     (10)

As a covariance matrix, the eigenvalues and eigenvectors of the FIM then indicate the principal components and the principal directions of the variances. The principal directions thus give us the KPI-free b* that represents the high-value data the design should focus on.
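A minimal sketch of how the normalised FIM and its eigen-decomposition might be estimated numerically is given below. It assumes independent Gaussian inputs with closed-form score functions, a hypothetical two-variable black-box model, and histogram-based estimates of p(y) and ∂p(y)/∂b_j in the spirit of the LR approach described in Section 3.4; none of the numbers relate to the case study:

```python
import numpy as np

rng = np.random.default_rng(1)
h = lambda x: x[:, 0] ** 2 + 3.0 * x[:, 1]          # hypothetical black-box model
mu = np.array([1.0, 2.0])
sd = np.array([0.2, 0.4])
b = np.concatenate([mu, sd])                        # b = (mu_1, mu_2, sd_1, sd_2)

N = 200_000
x = rng.normal(mu, sd, size=(N, 2))
y = h(x)

# Closed-form Gaussian scores d ln p(x|b)/db (cf. the LR method)
score = np.hstack([(x - mu) / sd**2,
                   ((x - mu) ** 2 - sd**2) / sd**3])  # shape (N, 4)

# Histogram estimates of p(y) and of dp(y)/db_j on a common grid
edges = np.linspace(y.min(), y.max(), 61)
w = np.diff(edges)
p, _ = np.histogram(y, bins=edges, density=True)
dp = np.stack([np.histogram(y, bins=edges, weights=score[:, j])[0] / (N * w)
               for j in range(4)])

# F_jk = integral of (dp/db_j)(dp/db_k)/p dy, scaled so beta_j = b_j d ln p/db_j
mask = p > 0
F = (dp[:, mask] / p[mask]) @ (dp[:, mask] * w[mask]).T
F *= np.outer(b, b)                                 # dimensionless normalisation

lam, Q = np.linalg.eigh(F)                          # eigen-decomposition of the FIM
print(lam[::-1])                                    # principal entropic sensitivities
```

The per-bin sum is a Gram matrix, so the estimated FIM is symmetric positive semidefinite up to rounding, as the theory requires.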
Note that the partial derivative vector in Eq. (9), where the j-th entry is β_j = ∂ln p(y)/∂b_j, provides the relative effect on the perturbations for an infinitesimal change of b_j. However, these raw partial derivatives are not directly comparable when the parameters, b_j and b_k, are of different units. In addition, the FIM tends to be ill-conditioned without scaling, because the magnitudes of the parameters can differ by many orders of magnitude. Therefore, in the numerical case studies presented in Section 4, the sensitivity vector in Eq. (9) is normalised as β_j = b_j ∂ln p(y)/∂b_j, so that it becomes dimensionless. This is equivalent to a normalised perturbation, where Δb_j in Eq. (4) is expressed as a fraction of b_j. In cases where a Gaussian distribution is used for the design variables, the standard deviation can be used instead as the normalisation constant [24], which implies that the allowable design range of the mean value is limited to a local region quantified by the standard deviation.

Probability of failure
Given a set of specific KPIs or targets, designs can be evaluated against those requirements. When uncertainties are present, a single quantitative metric representing the probability that the design will be acceptable, the probability of acceptance, should be estimated. The probability of non-acceptance is referred to as the probability of failure and denoted as P_f in this paper. This notation is compatible with reliability-based design, although 'failure' here indicates failure to satisfy the design requirements. An unconditional probability of failure can be expressed as:

P_f(b) = E[ H(z − g(y)) ] = ∫ H(z − g(y)) p(y|b) dy.     (11)

The quantity of interest y is a function of x (y = h(x) as shown in Fig. 3), H(⋅) is the Heaviside step function, g(⋅) is the performance function and z represents the threshold the design is evaluated against. Note that the unconditional failure probability is a function of the parameters b.
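The unconditional failure probability can be estimated by simple Monte Carlo sampling. The sketch below uses a hypothetical model and threshold, and takes failure to be the event that the quantity of interest exceeds the threshold; the direction of the inequality is application-specific:

```python
import numpy as np

rng = np.random.default_rng(2)
h = lambda x: x[:, 0] ** 2 + 3.0 * x[:, 1]    # hypothetical black-box model
mu = np.array([1.0, 2.0])
sd = np.array([0.2, 0.4])
z = 9.0                                       # illustrative design threshold

N = 100_000
y = h(rng.normal(mu, sd, size=(N, 2)))
# P_f as the fraction of samples violating the requirement y <= z
P_f = np.mean(y > z)
print(P_f)
```

With a black-box digital twin, all the cost sits in evaluating h; the indicator average itself is trivial.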

Sensitivity of the failure probability
The purpose of the sensitivity analysis is to identify the most important uncertain parameters or factors in the design. The designer can then make informed decisions to prioritize the resources for further analyses or measurements. With this objective, we look at the perturbed effect on the (unconditional) failure probability using a first order Taylor expansion about the nominal values of b:

ΔP_f ≈ (∂P_f/∂b)ᵀ Δb.     (12)
The partial derivative vector, ∂P_f/∂b, then provides the relative effect on the perturbations of the failure probability for an infinitesimal change of b. As mentioned in Section 3.1.3, these raw partial derivatives are not directly comparable when the parameters are of different units. One approach to overcome this issue is to normalise the raw derivative as:

r_j = (b_j / P_f) ∂P_f/∂b_j,     (13)

where the normalised derivative r is often called the proportional sensitivity or elasticity [25]; it measures the percentage change in P_f resulting from a fractional change of b. Therefore, r can be used to compare the relative sensitivities of parameters of different physical units because it is dimensionless. In addition, it is straightforward to show that r is invariant under rescaling of the variables in the equation [26]. More discussion of sensitivity normalisation, and the extension to the sensitivity of multiple (correlated) failure modes, can be found in [37].
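Under the LR method introduced later, the gradient of P_f and its elasticity can be obtained from the same Monte Carlo run used to estimate P_f itself. The following sketch assumes independent Gaussian inputs (so the scores have closed form) and the same hypothetical model and threshold as above:

```python
import numpy as np

rng = np.random.default_rng(3)
h = lambda x: x[:, 0] ** 2 + 3.0 * x[:, 1]     # hypothetical black-box model
mu = np.array([1.0, 2.0])
sd = np.array([0.2, 0.4])
z = 9.0                                        # illustrative design threshold

N = 400_000
x = rng.normal(mu, sd, size=(N, 2))
fail = (h(x) > z).astype(float)                # indicator of the failure event
P_f = fail.mean()

# LR estimator: dP_f/db_j = E[ indicator * d ln p(x|b)/db_j ], single MC run
score_mu = (x - mu) / sd**2
score_sd = ((x - mu) ** 2 - sd**2) / sd**3
dPf = np.concatenate([np.mean(fail[:, None] * score_mu, axis=0),
                      np.mean(fail[:, None] * score_sd, axis=0)])

b = np.concatenate([mu, sd])
r = b * dPf / P_f                              # dimensionless elasticity
print(P_f, r)
```

The elasticity entries for the means and standard deviations can then be ranked directly, since all are dimensionless.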

Sensitivity projections
After introducing both the KPI-free and KPI-based design metrics, the relationship between the two metrics is explored in this section via projections of the sensitivity vectors.
As a real symmetric matrix, the FIM has eigenvectors that form an orthogonal basis. The i-th eigenvalue and eigenvector of the FIM satisfy:

F q_i = λ_i q_i.     (14)
Multiplying both sides of the above equation by the transpose of the eigenvector then gives:

q_iᵀ F q_i = λ_i,     (15)

where the eigenvectors are assumed to be normalised, q_iᵀ q_i = 1. From Section 3.1.2 it can then be noted that:

λ_i = q_iᵀ E[ β βᵀ ] q_i = Var(q_iᵀ β).     (16)

Therefore, the eigenvalue λ_i represents the variance of the projection of the sensitivity vector β (cf. Eq. (9)) onto the corresponding eigenvector q_i. Note that Eq. (16) represents a variance because the mean value of the projection is zero. Therefore, the eigenvectors with the largest eigenvalues indicate the most sensitive directions, and this in turn points out the most important parameters with a high value of information for the design.
By a change of basis, the perturbation vector Δb can be written as a linear combination of the FIM eigenvectors Q:

Δb = Q c = Σ_i c_i q_i.     (17)

As a result of this transformation, the perturbation of the design entropy is:

ΔH ≈ (1/2) Σ_i λ_i c_i²,     (18)

which re-emphasizes the fact that the FIM eigenvalues represent the sensitivity of the design entropy. Similarly, the perturbation of the failure probability can be re-written as:

ΔP_f ≈ Σ_i s_i c_i,  with  s_i = q_iᵀ (∂P_f/∂b),     (19)

where the projection of the P_f sensitivity vector onto the i-th FIM eigenvector is denoted as s_i. In comparison with Eq. (18), it can be seen that s_i plays a similar role to the FIM eigenvalues and can be considered to characterise the spectrum of the P_f sensitivity.
As the FIM eigenvectors with the largest eigenvalues indicate the most sensitive directions in terms of the underlying distribution p(y|b), it is likely that the projections of the P_f sensitivity onto those eigenvectors also dominate the spectrum of ΔP_f.
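The change of basis above can be checked numerically with a small self-contained example. The FIM and gradient values below are synthetic stand-ins for quantities that would be computed from a model; the check confirms that the projected forms reproduce the direct quadratic and linear forms:

```python
import numpy as np

# Synthetic stand-ins for a computed FIM and a computed P_f gradient
F = np.array([[4.0, 1.0, 0.5],
              [1.0, 2.0, 0.2],
              [0.5, 0.2, 1.0]])
dPf = np.array([0.30, -0.10, 0.05])

lam, Q = np.linalg.eigh(F)             # F q_i = lam_i q_i, Q orthonormal
order = np.argsort(lam)[::-1]          # sort largest eigenvalue first
lam, Q = lam[order], Q[:, order]

s = Q.T @ dPf                          # spectrum of the P_f sensitivity, s_i = q_i^T dPf

# Change of basis: Delta b = Q c, so Delta H ~ 0.5 * sum lam_i c_i^2
# and Delta P_f ~ sum s_i c_i
db = np.array([0.01, -0.02, 0.005])
c = Q.T @ db
dH = 0.5 * np.sum(lam * c**2)
dPf_lin = np.sum(s * c)
print(dH, dPf_lin)

# Consistency: projected forms equal the direct quadratic/linear forms
assert np.isclose(dH, 0.5 * db @ F @ db)
assert np.isclose(dPf_lin, dPf @ db)
```

Ranking |s_i| against λ_i for the same eigenvectors is exactly the comparison drawn in the case study below.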

Numerical implementation
To implement TEDS numerically, the main elements required are p(y) and ∂p(y)/∂b_j for the KPI-free metrics, and P_f and ∂P_f/∂b_j for the KPI-based metrics. The largest computational effort is the determination of the partial derivatives. In a standard setting, especially for black-box models, the function has to be evaluated twice to approximate the derivative using the finite difference method. This is time consuming for complicated simulations using the Monte Carlo method. In this paper, we use the Likelihood Ratio (LR) method to overcome this issue. The steps of the numerical implementation to obtain the partial derivatives using the LR method are summarised in Algorithm 1 below. The LR (aka score function) method obtains a gradient estimate of a performance measure w.r.t. continuous parameters in a single simulation run [13]. The function y = h(x) is a transformation from the random variables x to the random variables y, and the joint probability density function (PDF) at the output can be written as [27,28]:

p(y|b) = ∫ δ(y − h(x)) p(x|b) dx ≈ (1/N) Σ_{i=1}^{N} δ(y − h(x_i)),     (20)

where δ(⋅) is the Dirac delta function and the second expression indicates the Monte Carlo (MC) approximation of the integral, where x_i is an MC realisation of the random variable x and N MC simulations are considered. It is noted that, although the theories introduced in the previous sections apply to dependent inputs, it is assumed for simplicity in the numerical implementation below that the components of x are independent, that is:

p(x|b) = Π_j p_j(x_j|b_j),

where p_j is the PDF considered for the random variable x_j. Applying the LR method to the density function above, the partial derivative w.r.t. the distribution parameter b_j can be computed as:

∂p(y)/∂b_j = ∫ δ(y − h(x)) [∂ln p(x|b)/∂b_j] p(x|b) dx ≈ (1/N) Σ_{i=1}^{N} δ(y − h(x_i)) ∂ln p(x_i|b)/∂b_j,     (21)

where the technique used in importance sampling has been adopted here by multiplying by the ratio of the PDF p(x|b) to itself, i.e. a ratio of one. The main advantage of expressing the sensitivities as above is that both the density function and its sensitivities can be approximated using Monte Carlo simulation, which means that almost all the computational effort of the calculation goes into obtaining h_k(x) and g(y) for each set of samples. Once the results are obtained for the desired number of samples, a histogram representation (i.e. the δ functions are replaced by bins of finite width) is constructed in this paper to estimate Eqs. (20) and (21) numerically.
For many commonly used distributions, analytical closed-form expressions can be obtained for the additional terms involving the partial derivatives w.r.t. a distribution parameter. For example, for a Gaussian distribution with mean μ and standard deviation σ:

∂ln p(x|b)/∂μ = (x − μ)/σ²,  ∂ln p(x|b)/∂σ = [(x − μ)² − σ²]/σ³,     (22)

and the LR expressions for a list of commonly used distributions can be found in [15]. For the KPI-free metric, Eqs. (20) and (21) can be used to estimate the Fisher Information Matrix (FIM) using Eq. (5). For the KPI-based metric, applying the LR method to the failure probability, the sensitivity can be obtained as:

∂P_f/∂b_j = E[ H(z − g(y)) ∂ln p(x|b)/∂b_j ] ≈ (1/N) Σ_{i=1}^{N} H(z − g(y_i)) ∂ln p(x_i|b)/∂b_j.     (23)

The function g(⋅) is often just an identity function in design practice, where the failure is evaluated against the design quantity of interest y directly. Nevertheless, the function g(⋅) can in general be any nonlinear function. In the case study presented in Section 4, the function g(⋅) computes the fatigue capacity of the structure subject to a random loading in a given time period.

[Algorithm 1, excerpt of the histogram-construction steps: store the MC results for N samples into a multi-dimensional array y; divide the range [min(y), max(y)] into equal intervals and form the grid ŷ; compare y and ŷ and count how many of y fall into each bin of ŷ.]
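The single-run property of the LR gradient can be sanity-checked against a two-run finite-difference estimate. The sketch below uses a hypothetical scalar Gaussian input and model, with g(⋅) taken as the identity and failure defined as exceedance of the threshold; common random numbers are used for the finite-difference runs:

```python
import numpy as np

rng = np.random.default_rng(4)
h = lambda x: np.sin(x) + 0.5 * x**2           # hypothetical black-box model
g = lambda y: y                                # identity performance function
mu, sd, z, N = 1.0, 0.3, 1.6, 1_000_000

# LR method: gradient of P_f w.r.t. mu from a single simulation run
x = rng.normal(mu, sd, N)
fail = g(h(x)) > z
dPf_lr = np.mean(fail * (x - mu) / sd**2)      # E[ indicator * d ln p/d mu ]

# Finite-difference check: needs two extra runs (common random numbers)
eps = 1e-3
u = rng.standard_normal(N)
dPf_fd = (np.mean(g(h(mu + eps + sd * u)) > z)
          - np.mean(g(h(mu - eps + sd * u)) > z)) / (2 * eps)
print(dPf_lr, dPf_fd)
```

The two estimates should agree to within Monte Carlo noise, while the LR estimate reuses the samples already generated for P_f itself.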
Note that in this study only the standard Monte Carlo (MC) method is used. This can become prohibitive when a large number of samples has to be evaluated. Advanced simulation methods, such as importance sampling and subset simulation, are available to mitigate this shortcoming, and a relevant survey is available in [29]. The application of MC methods in the framework of high-performance computing environments is considered in [30].

Case study
To illustrate the potential application of TEDS, the methodology described in Section 3 is applied to the sensitivity analysis of an offshore marine riser, as shown in Fig. 4. The dynamic model of the marine riser used here is based on [31,32] and is treated as a black-box digital twin on which the toolbox TEDS is applied, as indicated in Fig. 2. This illustrative case study has a few representative features. First, the uncertainties related to the marine riser design, as highlighted in Fig. 4, can be divided into reducible and irreducible uncertainties. For example, the random wave height is stochastic by nature and thus irreducible, while the uncertain material properties are in general reducible via a better production process. Different types of uncertainty would require different design decisions, as indicated in Fig. 1, when the digital twin is used in the design process. In addition, there are both x_design and x_environment input variables, as indicated in Fig. 1. The random wave loading spectrum comes from environmental data and is considered part of the digital twin. Although there can be a large number of uncertain parameters to be designed for in marine riser analysis [33], we focus on a limited subset x_design in this example, listed in Table 1. Last but not least, although illustrative, the example is of great practical significance, as uncertainty and sensitivity analysis are key elements in the design codes of offshore structures [34].
The distribution parameters of the random variables used in this example are listed in Table 1. Without loss of generality, Gaussian distributions are assumed for all the random variables, with the means taken from their nominal values [31] and the standard deviations based on an assumed coefficient of variation (CoV). The rest of the system parameters are assumed deterministic: riser length (500 m), inner diameter (0.374 m), outer diameter (0.406 m). The motion of the floating platform is described using a deterministic transfer function with respect to (w.r.t.) the wave amplitude; more details are given in [31]. A random sea state, based on the wave scatter diagram for the North Atlantic [35], was selected, with a wave height of 5.5 m and a zero upcrossing period of 10.5 s for the JONSWAP spectrum.
Two different design scenarios will be considered in this illustrative case study, as listed in Table 2. Scenario A has one design Quantity of Interest (QoI) and one Key Performance Indicator (KPI), while Scenario B considers the same KPI but with three different QoIs. Note that the structural response of the riser is stochastic, since a random wave loading is considered. For illustration purposes, we take the root mean square (r.m.s) values of the stochastic response as our random QoIs in this case study. The r.m.s response is itself a random variable, as a function of the input random variables x, and it varies spatially along the riser. A frequency domain approach is used in this study [31] and the r.m.s response of the riser can be estimated from the response spectrum S_yy:

y_rms = ( ∫₀^∞ S_yy(ω) dω )^{1/2},     (24)

where the r.m.s value is obtained for each sample of the random variables x listed in Table 1.
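Numerically, the r.m.s value is the square root of the zeroth spectral moment of the response spectrum. The sketch below uses a hypothetical narrow-band spectrum shape purely for illustration; it is not the riser's actual response spectrum:

```python
import numpy as np

# Hypothetical one-sided response spectrum: a narrow-band peak near 0.6 rad/s
omega = np.linspace(0.01, 3.0, 2000)
S_yy = 4.0 * np.exp(-0.5 * ((omega - 0.6) / 0.08) ** 2)

# r.m.s response = sqrt of the zeroth spectral moment m0 = integral of S_yy d(omega)
d_omega = omega[1] - omega[0]
m0 = np.sum(S_yy) * d_omega            # rectangle rule on the uniform grid
y_rms = np.sqrt(m0)
print(y_rms)
```

In the frequency domain approach, this integration is repeated for each Monte Carlo sample of x, since each sample produces a different response spectrum.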
As shown in Table 2, there is only one QoI in Scenario A: the maximum r.m.s stress response along the structure. The three different QoIs considered in Scenario B are the maximum r.m.s stress, the maximum r.m.s displacement and the maximum r.m.s rotation (slope) along the marine riser. The design KPI is the same for both Scenarios A and B, where a design requirement of a fatigue life of at least 80 years is specified. The fatigue analysis assumes a narrow-band Gaussian process for the stress response of the riser, and the detailed procedure follows the guidance on riser fatigue assessment using an S-N curve given in [33].
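The narrow-band Gaussian assumption admits the classical closed-form damage expression with Rayleigh-distributed stress ranges, which can serve as a stand-in for the g(⋅) function. The S-N constants below are purely illustrative assumptions and are not taken from [33] or from the case study data:

```python
import math

def narrowband_fatigue_life(sigma_rms, nu0, K, m):
    """Expected fatigue life (seconds) of a narrow-band Gaussian stress process.
    Stress ranges are Rayleigh distributed; S-N curve N(S) = K * S**(-m)."""
    # Palmgren-Miner expected damage per unit time
    damage_rate = (nu0 / K) * (2.0 * math.sqrt(2.0) * sigma_rms) ** m \
                  * math.gamma(1.0 + m / 2.0)
    return 1.0 / damage_rate

# Purely illustrative numbers (stress in Pa, zero upcrossing rate in Hz)
life_s = narrowband_fatigue_life(sigma_rms=10e6, nu0=0.1, K=1e31, m=3.0)
print(life_s / (365.25 * 24 * 3600))   # fatigue life in years
```

The strong dependence of the damage rate on the exponent m foreshadows the high sensitivity to the S-N coefficient found in Scenario A.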
These two simple scenarios represent common design situations.In scenario A, the design objectives are clear and the design QoI and KPI are closely linked and this represents a relatively mature structural design.On the other hand, we have relative large uncertainties about the design requirements in scenario B and as a result, it is often necessary to consider various QoIs at the early design stage to minimize risks.Therefore, in scenario B, three different QoIs are assessed and only one of them turns out to have a direct relation to the KPI.

Scenario A
In Scenario A, the QoI (y in Fig. 3) is the maximum r.m.s. stress along the structure (with the r.m.s. value given by Eq. (24) for a stochastic wave loading). Following the KPI-free route in Fig. 3, the Fisher Information Matrix (FIM) is estimated. The eigenvalues and eigenvectors of the FIM are shown in Fig. 5 and Fig. 6. A distinct feature of Fig. 5 is that the 1st eigenvalue (the maximum one) dominates. The eigenvectors shown in Fig. 6, which represent KPI-free sensitivities to the mean and standard deviation (Std Dev) of the variables listed in Table 1, are grouped with the corresponding random variables indicated.
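The mechanics of this KPI-free step, forming a FIM from score samples and taking its eigendecomposition, can be sketched as follows. The independent Gaussian model, its parameter values and the sample size are illustrative placeholders rather than the riser model of the case study; the closed-form Gaussian score corresponds to Eq. (22):

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_scores(x, mu, s):
    """Score vector d ln p(x|b)/db for independent Gaussians,
    with b = (mu_1..mu_d, s_1..s_d), in closed form (cf. Eq. (22))."""
    return np.concatenate([(x - mu) / s**2,
                           ((x - mu) ** 2 - s**2) / s**3], axis=-1)

def fim_monte_carlo(mu, s, n_samples=200_000):
    """FIM = E[score score^T], estimated by sampling from p(x|b)."""
    x = rng.normal(mu, s, size=(n_samples, mu.size))
    g = gaussian_scores(x, mu, s)   # (n_samples, 2d) score samples
    return g.T @ g / n_samples

mu, s = np.array([1.0, 2.0]), np.array([0.5, 1.5])
F = fim_monte_carlo(mu, s)
eigval, eigvec = np.linalg.eigh(F)  # eigenvalues in ascending order
```

For this toy model the exact FIM is diagonal, with entries 1/s² for each mean and 2/s² for each standard deviation, so the Monte Carlo estimate can be verified directly.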
The KPI-based probability of failure, where a fatigue life of 80 years is set as the design KPI, and its sensitivity vector are shown in Fig. 7. It can be seen from Fig. 7 that fatigue failure is extremely sensitive to the S-N curve coefficient δ. This is an expected result, as the total damage received by the structure depends on a factor raised to the power δ. Factors closely related to bending stress, such as Young's modulus E and the top tension T_0, are also found to be significant for the fatigue failure sensitivities. A validation analysis for the sensitivity of the fatigue failure probability presented here can be found in [37].
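The sensitivity vector of P_f can be estimated non-intrusively with the likelihood-ratio (score-function) identity dP_f/db_j = E[1_fail(x) · d ln p(x|b)/db_j], which requires only failure indicators from the black-box model. A toy scalar limit state (failure when a standard Gaussian input exceeds a threshold) stands in for the riser fatigue model; all values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def pf_and_gradient(mu, s, z, n=500_000):
    """P_f = P[x > z] for x ~ N(mu, s), and its likelihood-ratio gradient
    w.r.t. b = (mu, s):  dPf/db_j = E[ 1{x > z} * d ln p(x|b)/db_j ]."""
    x = rng.normal(mu, s, size=n)
    fail = (x > z).astype(float)
    score = np.stack([(x - mu) / s**2,           # d ln p / d mu
                      ((x - mu) ** 2 - s**2) / s**3])  # d ln p / d s
    return fail.mean(), (score * fail).mean(axis=1)

pf, grad = pf_and_gradient(mu=0.0, s=1.0, z=2.0)
# analytic values for this toy case: Pf = 1 - Phi(2) ≈ 0.0228,
# dPf/dmu = phi(2) ≈ 0.0540, dPf/ds = 2*phi(2) ≈ 0.1080
```

The same sampling loop yields the full sensitivity vector for all distribution parameters at once, with no additional model evaluations.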
The projection of the P_f sensitivity vector onto the FIM eigenvectors, denoted as s_i in Eq. (19), is also shown in Fig. 7. It can be seen from Fig. 7 that the projection is dominated by the 1st FIM eigenvector.
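Given the eigenvectors of the FIM and a P_f sensitivity vector, the projection coefficients s_i of Eq. (19) reduce to inner products; the 3-parameter matrix and vector below are hypothetical:

```python
import numpy as np

# hypothetical 3-parameter example
F = np.array([[2.0, 0.3, 0.0],
              [0.3, 1.0, 0.0],
              [0.0, 0.0, 0.5]])
grad_pf = np.array([0.3, -0.1, 0.8])

eigval, eigvec = np.linalg.eigh(F)   # columns of eigvec are eigenvectors v_i
s = eigvec.T @ grad_pf               # s_i = v_i . (dPf/db), cf. Eq. (19)
```

Because the eigenvectors form an orthonormal basis, the projection preserves the length of the sensitivity vector, so dominance of one coefficient s_i directly indicates alignment with that eigenvector.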

Scenario B
The results for Scenario B are presented in the same format as for Scenario A: the FIM eigenvalues and eigenvectors are shown in Fig. 8 and Fig. 9, and Fig. 10 shows the P_f sensitivity vector and its projection onto the FIM eigenvectors. For multiple QoIs, the joint PDF is used to estimate the FIM. Although the first few eigenvalues of the FIM (the first six in this case) are still much higher than the rest, there is no clear dominance like that in Scenario A.
The P_f sensitivity vector shown in Fig. 10 is the same as that in Fig. 7, as the same KPI is assumed for both scenarios. However, the projection onto the FIM eigenvectors is not dominated by the 1st eigenvector as in Scenario A. In Scenario B, the projections are concentrated on the first six eigenvectors, with the exception of the 4th one, and the projections onto the 1st and 3rd eigenvectors are much higher than the rest. It is evident from Fig. 9 that the 1st and 3rd eigenvectors of the FIM are dominated by E and T_0 respectively, and these are important parameters for fatigue failure. On the other hand, since the FIM is estimated from the joint PDF of three different QoIs, the 4th eigenvector is dominated by ρ_0, which has negligible influence on fatigue failure.

Relationship between KPI-free and KPI-based metrics
On one hand, the KPI-free and KPI-based metrics are different metrics that are tailored to different stages of the design process. The KPI-free metric measures the average information contained in the joint PDF of the QoIs y using the design entropy. As the design requirements often evolve during the design process, the KPI-free metric is most suitable for the early design stage, where specific design targets are not yet fixed. On the other hand, the KPI-based metric measures the probability that a certain design would be accepted (or rejected). To evaluate this probability of acceptance (or failure), a specific design requirement needs to be decided. Therefore, the KPI-based metric is most suitable for the detailed design stage, where the specific design targets are fixed. Note that the requirement (KPI) needs to be specific but can be uncertain. For example, the KPI used in the case study is the design life of the structure due to fatigue failure. The design criterion is specific, i.e., a fatigue failure rather than a failure due to excessive displacement, but the expected life duration can itself be uncertain. It should be noted in passing that, although a fixed threshold is used in the case study, when the design threshold is uncertain but can be specified by a PDF p(z), the expected failure probability ∫ P_f(z) p(z) dz can be used in Eqs. (11) and (13) instead.
On the other hand, there is a close relationship between the KPI-free and KPI-based metrics. As mentioned in Section 3.4, since the FIM eigenvectors with the largest eigenvalues indicate the most sensitive directions in terms of the underlying PDF, it is likely that the projection of the sensitivity onto the eigenvectors is dominated by those with the largest eigenvalues. This means that the KPI-based sensitivity is likely to identify a similar set of influential parameters as the KPI-free metric. This is the case in Scenario A, where the QoI y is the stress response and the design KPI of fatigue failure is also a function of the stress. It was evident from Scenario A that the KPI-based sensitivity vector points in a similar direction to the 1st FIM eigenvector, despite the fact that the S-N parameters are unique to the fatigue KPI and not included in the analysis of the FIM for the KPI-free metric. A similar conclusion can be drawn from Scenario B, where two of the three QoIs have no direct relationship with the design KPI.
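The expected failure probability over an uncertain threshold can be evaluated by simple averaging over threshold samples; the Gaussian QoI and threshold distribution below are illustrative stand-ins for p(z):

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def std_normal_sf(z):
    """Survival function P[N(0,1) > z] via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# toy model: QoI ~ N(0, 1); failure when the QoI exceeds an uncertain
# threshold z ~ N(2, 0.3), so E_z[Pf(z)] = integral of Pf(z) p(z) dz
z_samples = rng.normal(2.0, 0.3, size=100_000)
expected_pf = np.mean([std_normal_sf(z) for z in z_samples])
```

For this toy case the average can be checked in closed form, since the failure event is equivalent to a single Gaussian difference exceeding zero.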
The implications of the close relationship between the KPI-free and KPI-based metrics for the design strategy are mainly twofold. First, if the KPI-free metric is used in the early design stage to identify the most influential parameters b*, it is most likely that the same set b* will still be the parameters the design should focus on at later design stages. Second, the design at the early stage could exploit this relationship and intentionally utilise TEDS for design improvement. This would allow the design to explore a wide range of different QoIs and leave the introduction of specific KPIs until later stages of the design process, when the design uncertainties have been largely reduced.

Design metrics for decision making
Compared to traditional computational simulations, the application of digital twins to design decision making places more emphasis on the efficient processing of uncertain data. Therefore, the design metrics proposed in TEDS are intended to support decision-making focused on the data components of the digital twin. Depending on the category of the most influential parameters identified from TEDS, either b*_reducible or b*_irreducible, tailored design decisions can be made. When the uncertainties associated with b* are irreducible, e.g., the random wave loading spectrum, the structure under design has the undesirable characteristic that it is sensitive to uncertainties that are difficult to reduce. In this situation, design efforts should focus on changing the design itself, and correspondingly the model. If the uncertainties associated with b* are reducible, e.g., through reduction of material variability with an improved production line, it might be better to collect more data and re-evaluate the design before further decisions at the next design iteration. Note that in this paper no distinction is made between aleatory and epistemic uncertainties. Although irreducible uncertainty is often used as a synonym for aleatory uncertainty in the literature, in this paper it includes both aleatory uncertainty and uncertainties that are epistemic but too expensive to reduce. For example, the uncertainties of the hydrodynamic coefficients C_a and C_d used in the case study (cf. Table 1) are due to a lack of complete understanding of the underlying physics. These uncertainties are reducible in principle but are likely to be considered too expensive to reduce, because a large array of flow- and structure-related parameters would need to be identified [36]. This view links sensitivity analysis closely to the Value of Information (VoI) in decision making. If the decision is to collect more data to reduce uncertainties, the cost of data collection and the VoI need to be estimated. It is assumed implicitly in Fig. 1 that reducible uncertainty has a bigger VoI than the associated cost.
With an emphasis on decision support with high value information from DTs, one future possibility is to explore utility-based design metrics. For example, given a multi-attribute utility function u(y), the objective is to maximize the gain in utility over the possible set of Δb, subject to the cost of the decisions, where U is the expected utility function U(b) = ∫ u(y(x)) p(x|b) dx and its partial derivatives can be calculated efficiently using the Likelihood Ratio method described in Section 3.4. Note that the entropy change in Eq. (2) can be seen as a special case of the utility function described here, where the K-L divergence is used to quantify the information gain.
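The expected utility U(b) and its Likelihood Ratio gradient admit the same sampling treatment as the other metrics. In the sketch below, the linear utility, the identity QoI map and the scalar Gaussian input are placeholders chosen so the answer is known in closed form:

```python
import numpy as np

rng = np.random.default_rng(3)

def expected_utility_and_grad(u, y, mu, s, n=200_000):
    """U(b) = E[u(y(x))] for x ~ N(mu, s), with its likelihood-ratio
    gradient w.r.t. b = (mu, s):  dU/db_j = E[ u(y(x)) * d ln p(x|b)/db_j ]."""
    x = rng.normal(mu, s, size=n)
    ux = u(y(x))
    score = np.stack([(x - mu) / s**2,
                      ((x - mu) ** 2 - s**2) / s**3])
    return ux.mean(), (score * ux).mean(axis=1)

# sanity check with u(y) = y and y(x) = x: U = mu, dU/dmu = 1, dU/ds = 0
U, dU = expected_utility_and_grad(lambda y: y, lambda x: x, mu=1.0, s=0.5)
```

Because the gradient reuses the same samples as the utility estimate itself, no extra evaluations of the (potentially expensive) black-box map y(x) are needed.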

Conclusions
A design framework that incorporates a Digital Twin (DT) in the design process in the presence of uncertainty is proposed, with a focus on the identification of high value information and of the most influential parameters to support decision-making. As a design process is often characterised by evolving design requirements (KPIs), TEDS (Toolbox for Engineering Design Sensitivity) has been developed to provide both KPI-free and KPI-based metrics. The unique features of TEDS are that (a) it identifies the important uncertain data on which the design should focus; (b) it is non-intrusive and thus allows the integration of black-box DTs; and (c) it is developed for the design process and thus allows more tailored decisions. The advantage of using the KPI-free metrics at the early design stage, which are based on entropy and Fisher information but independent of design KPIs, is that a similar set of most influential parameters b* is expected to be identified by the KPI-based metrics at later design stages. This is because the KPI-free sensitivity is quantified in terms of the underlying distribution of the design quantities of interest (QoIs), and the design KPIs are likely to be based on the QoIs or a subset of them. This close relationship between the KPI-free and KPI-based metrics is demonstrated in the case study of the dynamic design of an offshore structure. Therefore, it is expected that TEDS will provide a consistent identification of high value information throughout the design process and greatly reduce the risks associated with uncertain data when digital twins are integrated in the design process.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1. Overview of digital twins in the design process to support decision making in the presence of uncertainties.

Algorithm 1
1. Define the probability density function (PDF) for the input uncertain variables, p(x|b).
2. Derive the partial derivative expression ∂ln p(x|b)/∂bj for each distribution parameter bj. This is often available in closed form analytically, e.g. see Eq. (22) for the Gaussian distribution.
3. for i ← 1 to Ns
4.   draw one set of samples xi from p(x|b) and substitute xi into ∂ln p(x|b)/∂bj

Fig. 4 .
Fig. 4. Highlight of the uncertainties that need to be taken into account by marine riser designers. A marine riser is a conduit that transfers subsea oil to a surface platform.

Fig. 6 .
Fig. 6. First four eigenvectors of the FIM (the remaining twelve eigenvectors have negligible corresponding eigenvalues and are therefore not shown) - Scenario A.

Fig. 7 .
Fig. 7. Scenario A: (a) sensitivity vector of the failure probability (top figure); (b) projection of the sensitivity vector onto the eigenvectors of FIM.

Fig. 10 .
Fig. 10. Scenario B: (a) sensitivity vector of the failure probability (top figure); (b) projection of the sensitivity vector onto the eigenvectors of FIM.

Table 1
Mean and standard deviation values for the random variables.

Table 2
Two design scenarios for the case study.