A digital twin framework for civil engineering structures

The digital twin concept represents an appealing opportunity to advance condition-based and predictive maintenance paradigms for civil engineering systems, thus allowing reduced lifecycle costs, increased system safety, and increased system availability. This work proposes a predictive digital twin approach to the health monitoring, maintenance, and management planning of civil engineering structures. The asset-twin coupled dynamical system is encoded employing a probabilistic graphical model, which allows all relevant sources of uncertainty to be taken into account. In particular, the time-repeating observations-to-decisions flow is modeled using a dynamic Bayesian network. Real-time structural health diagnostics are provided by assimilating sensed data with deep learning models. The digital twin state is continually updated in a sequential Bayesian inference fashion. This is then exploited to inform the optimal planning of maintenance and management actions within a dynamic decision-making framework. A preliminary offline phase involves the population of training datasets through a reduced-order numerical model and the computation of a health-dependent control policy. The strategy is assessed on two synthetic case studies, involving a cantilever beam and a railway bridge, demonstrating the dynamic decision-making capabilities of health-aware digital twins.


Introduction
The optimal management of deteriorating structural systems is an important challenge in modern engineering. In particular, the failure or non-optimized maintenance planning of civil structures may entail high safety, economic, and social costs. Within this context, enabling a digital twin (DT) perspective for structural systems that are critical for either safety or operational reasons is crucial to allow for condition-based or predictive maintenance practices, in place of the customarily employed time-based ones. Indeed, having an up-to-date digital replica of the physical asset of interest can yield several benefits spanning its entire lifecycle, including performance and health monitoring, as well as maintenance, inspection, and management planning [1].
The DT concept [2,3,4,5,6] has been recently applied to several fields for operational monitoring, control, and decision support, including structural health monitoring (SHM) and predictive maintenance [7,8], additive manufacturing [9], smart cities [10], urban sustainability [11], and railway systems management [12]. It allows for a personalized characterization of a physical asset, in the form of computational models and parameters of interest, that evolves over time and is kept synchronized with its physical counterpart by means of data-collecting devices. Within a civil SHM framework, such a twinning perspective can be enabled by the assimilation of data through data-driven structural health diagnostics (from physical to digital), possibly accommodating the quantification and propagation of relevant uncertainties related to, e.g., measurement noise, modeling assumptions, and environmental and operational variabilities [13,14,15,16,17]. The resulting updated digital state should then enable prediction of the physical system evolution, as well as inform optimal planning of maintenance and management actions (from digital to physical).
In this work, we propose a DT framework for civil engineering structures. The overall computational strategy is based upon a probabilistic graphical model (PGM) inspired by the foundational model proposed in [18], which provides a general framework to carry out data assimilation, state estimation, prediction, planning, and learning. Formally, such a PGM is a dynamic Bayesian network with the addition of decision nodes, i.e., a dynamic decision network [19,20]. This is employed to encode the end-to-end information flow, from physical to digital through assimilation and inference, and back to the physical asset in the form of informed control actions. A graphical abstraction of the proposed DT strategy is depicted in Fig. 1. The figure shows a physical-to-digital information flow and a digital-to-physical information flow. These bi-directional information flows repeat indefinitely over time. In particular, we have:
• From physical to digital. Structural response data are gathered from the physical system and assimilated with deep learning (DL) models, see e.g., [21,22], to estimate the current structural health in terms of presence, location, and severity of structural damage. To solve this inverse problem, we refer to vibration-based SHM techniques, see e.g., [23,24,25,26], which exploit the aforementioned collected data, such as displacement or acceleration time histories. This first estimate of the digital state is then employed to estimate an updated digital state, according to control-dependent transition dynamics models describing how the structural health is expected to evolve.
• From digital to physical. The updated digital state is exploited to predict the future evolution of the physical system and the associated uncertainty, thereby enabling predictive decision-making about maintenance and management actions feeding back to the physical system.
• Offline learning phase. The DT setup considered in this work takes advantage of a preliminary offline learning phase. This phase involves the training of the DL models underlying the structural health identification, and the learning of the control policy to be applied at each time step of the online phase. The DL models are trained in a supervised fashion, with labeled data pertaining to specific damage conditions generated by exploiting physics-based numerical models. To efficiently assemble a training dataset representative of the potential damage and operational conditions the structure might undergo during its lifetime, we exploit a reduced-order modeling strategy for parametrized systems relying on the reduced basis method [27].
The health-dependent control policy is also computed offline, by maximizing the expected future rewards for the planning problem induced by the PGM. The elements of novelty that characterize this work are the following: (i) the adaptation of the PGM digital twinning framework to the health monitoring, maintenance, and management planning of civil engineering structures; (ii) the assimilation of vibration response data by means of DL models, which allow automated selection and extraction of optimized damage-sensitive features and real-time assessment of the structural state. This work shows how to incorporate in the DT framework high-dimensional multivariate time series describing the sensor measurements, while tracking the associated uncertainties. The proposed computational framework is made available in the public repository digital-twin-SHM [28]. The code implements the PGM framework as a dynamic decision network. This enables us to easily specify the graph topology from a few time slices, and then unroll it for any number of time steps in the future.
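To illustrate the unrolling mechanism just described, the following minimal sketch (not the repository's actual code) replicates a two-slice edge list over an arbitrary number of time steps; the node names and the edge list are hypothetical placeholders for the D^NN, D, U, Q, and R nodes of the network:

```python
# Illustrative two-slice topology of a dynamic decision network.
# Each entry reads: (source node, source slice, target node, target slice).
TWO_SLICE_EDGES = [
    ("D_nn", 0, "D", 0),   # data assimilation outcome -> digital state
    ("D", 0, "U", 0),      # digital state -> control belief (policy)
    ("D", 0, "Q", 0),      # digital state -> quantities of interest
    ("D", 0, "R", 0),      # digital state -> reward
    ("U", 0, "R", 0),      # control -> reward
    ("D", 0, "D", 1),      # transition dynamics: D_t -> D_{t+1}
    ("U", 0, "D", 1),      # enacted control conditions the transition
]

def unroll(edges, n_steps):
    """Replicate the two-slice edges over n_steps time slices."""
    out = []
    for t in range(n_steps):
        for (a, ta, b, tb) in edges:
            if tb + t < n_steps:   # drop inter-slice edges past the horizon
                out.append(((a, ta + t), (b, tb + t)))
    return out
```

Specifying only the edges between two consecutive slices is enough because, under the Markov assumption discussed in Sec. 2.1, the topology is time-invariant.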

The remainder of this paper is organized as follows. In Sec. 2, we describe the proposed DT framework. In Sec. 3, the computational procedure is assessed on two test cases, respectively related to a cantilever beam and a railway bridge. Conclusions and future developments are drawn in Sec. 4.

Predictive digital twins using physics-based models and machine learning
In this section, we describe the methodology characterizing our DT framework in terms of: the PGM encoding the asset-twin coupled dynamical system (Sec. 2.1); the population of training datasets exploiting physics-based numerical models (Sec. 2.2); and the DL models underlying the structural health identification (Sec. 2.3).

Probabilistic graphical model for predictive digital twins
The digital twin assimilates vibration recordings shaped as multivariate time series U(µ) = [u_1(µ), …, u_{N_u}(µ)] ∈ R^(L×N_u), consisting of N_u time series made of L sensor measurements equally spaced in time, for instance in terms of accelerations or displacements. The vector µ ∈ R^(N_par) comprises the parameters representing the operational, damage, and (possibly) environmental conditions. Each recording refers to a time interval (0, T), within which measurements are recorded with a sampling rate f_s. For the problem settings we consider, the interval (0, T) is short enough for the operational, environmental, and damage conditions to be considered time-invariant, yet long enough not to compromise the identification of the structural behavior.
The PGM that defines the elements comprising the asset-twin coupled dynamical system, and mathematically describes the relevant interactions via observed data and control inputs, is the dynamic decision network sketched in Fig. 2. Circle nodes in the graph denote random variables at discrete times, square nodes denote actions, and diamond nodes denote the objective function. Bold outlines denote observed quantities, while thin outlines denote estimated quantities. The directed acyclic structure of the PGM encodes the assumed conditional dependencies. Edges in the graph represent dependencies between random variables. Solid edges represent dependencies encoded via conditional probability distributions, while dashed edges represent dependencies encoded via deterministic functions.
We consider a non-dimensional time discretization, and denote discrete time steps by t. The physical time duration between successive time steps may vary depending on the application, and is governed by the update frequency of the DT via data assimilation, so that the DT is updated once per time step. Thanks to the modeled conditional dependencies between random variables, the graph topology is specified from the first two time steps, and can then be unrolled for any number of time steps. The physical state S_t ∼ p(s_t), with s_t denoting the realization of the random variable S_t at time t, encapsulates the variability in the health state of the asset, which is usually only partially observable. The probability distribution encoding the relative likelihood that S_t = s_t, for any possible s_t, is denoted with p(s_t). The digital state D_t ∼ p(d_t) is characterized by those parameters employed to capture the variability of the physical asset by means of the computational models comprising the DT. In our framework, the digital state is given as a vector of length two, describing the presence/location and magnitude of damage in the asset. The physical-to-digital information flow is governed by the observed data O_t = o_t, which are assimilated by the DT to update the digital state. The assimilation is carried out using the DL models described in Sec. 2.3, providing a first estimate of the digital state D^NN_t ∼ p(d^NN_t). This estimate is then used in a Bayesian inference formulation, together with the prior belief D_{t−1} from the previous time step, to estimate an updated digital state D_t, according to a control-dependent transition dynamics model describing how the digital state is expected to evolve. The updated digital state can thus be exploited to compute quantities of interest Q_t ∼ p(q_t), such as modal quantities or other response features, through the computational models comprising the DT. For instance, quantities of interest can be useful to perform
posterior predictive checks on the tracking capabilities of the DT, to assess how well it matches reality by comparing sensor measurements with the corresponding posterior estimates. However, we point out that this capability is not exploited in the present work, and that the Q_t node is kept in the graph in agreement with the foundational model proposed in [18]. Nevertheless, the updated digital state D_t is eventually exploited to inform the digital-to-physical information flow, in the form of control inputs; in Fig. 2, U_t ∼ p(u_t) and U^A_t = u^A_t denote the belief about what action to take and the control input effectively enacted on the asset, respectively. At each time step, U_t is estimated according to a health-dependent control policy, which maps the belief over the digital state onto the control actions feeding back to the physical asset. Finally, the reward R_t ∼ p(r_t) quantifies the performance of the asset for the time step, and can equivalently be perceived as a negative cost to be maximized.
The key assumptions behind our PGM are that the physical state is only observable indirectly via the sensed structural response, and that the physical and digital states evolve according to a Markovian process. This implies that the conditional probabilities associated with the random variables at one time step depend only on the random variables at the previous time step, and are independent of all past states. The resulting graph topology encodes a conditional independence structure that allows us to conveniently cast the asset tracking within a sequential Bayesian inference framework. Indeed, by exploiting conditional independence and Bayes' rule, the joint distribution over the variables can be factorized up to the current time step t_c as:

p(D^NN_0, …, D^NN_{t_c}, D_0, …, D_{t_c}, Q_0, …, Q_{t_c}, R_0, …, R_{t_c}, U_0, …, U_{t_c} | o_0, …, o_{t_c}, u^A_0, …, u^A_{t_c}) ∝ ∏_{t=0}^{t_c} ϕ^data_t ϕ^NN_t ϕ^history_t ϕ^QoI_t ϕ^control_t ϕ^reward_t.    (1)

Herein, ϕ^data_t encodes the assimilation of observed data through the DL models underlying the identification of the structural health. ϕ^history_t and ϕ^NN_t factorize the belief about the digital state D_t, conditioned on the digital state at the previous time step D_{t−1}, the last enacted control input u^A_{t−1}, and the data assimilation outcome D^NN_t. In our PGM, the spaces of the digital states and control inputs are discrete, thus the relevant causal relationships are modeled by means of conditional probability tables (CPTs). In particular, ϕ^history_t plays the role of a predictor forward in time, parametrized by means of a control-dependent CPT describing how the digital state is expected to evolve. Such a CPT should embody any a priori knowledge that the DT designer has with respect to the asset and the relevant operational conditions. ϕ^history_t can be estimated offline from historical data, see e.g., [29,30], or learned online from experience. On the other hand, ϕ^NN_t updates the digital state estimate to account for data assimilation. This is encoded by means of
a CPT mapping the estimate D^NN_t provided by the DL models onto a belief about D_t. Such a CPT is a confusion matrix measuring the offline (expected) performance of the DL models in correctly identifying the digital state, among all the possible outcomes of D_t. ϕ^QoI_t and ϕ^reward_t respectively encapsulate the evaluation of the computational models comprising the DT to estimate quantities of interest, and the computation of the reward function quantifying the performance of the asset. Finally, the control factor ϕ^control_t is encoded by means of a health-dependent control policy π(D_t), computed as described in the following. Since the spaces of the unobserved variables are discrete, we can propagate and update the relative belief exactly with a single pass of the sum-product message-passing algorithm [19].
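The exact belief update over the discrete digital state amounts to a prediction step through the transition CPT followed by conditioning on the neural-network estimate. The numpy sketch below illustrates this on a three-state digital space; all CPT entries are made-up placeholders, since the actual tables are problem-specific:

```python
import numpy as np

# Illustrative three-state digital space and prior belief p(d_{t-1}).
belief_prev = np.array([0.8, 0.15, 0.05])

# Control-dependent transition CPT (phi_history):
# T_dn[i, j] = p(D_t = j | D_{t-1} = i, u = "do nothing")
T_dn = np.array([[0.9, 0.1, 0.0],
                 [0.0, 0.85, 0.15],
                 [0.0, 0.0, 1.0]])

# Confusion matrix (phi_NN), estimated offline:
# C[i, j] = p(D_t^NN = j | D_t = i)
C = np.array([[0.95, 0.04, 0.01],
              [0.05, 0.90, 0.05],
              [0.01, 0.04, 0.95]])

def update_belief(belief_prev, T, C, d_nn):
    """One sum-product pass: predict with the transition CPT, then
    condition on the DL-model outcome d_nn."""
    predicted = belief_prev @ T         # marginalize over d_{t-1}
    posterior = predicted * C[:, d_nn]  # likelihood of observing d_nn
    return posterior / posterior.sum()  # normalize

belief_t = update_belief(belief_prev, T_dn, C, d_nn=1)
```

Because all variables involved are discrete and the graph is a chain in time, this single forward pass is exact, consistent with the sum-product argument above.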
The control policy π(D_t) is computed offline under the simplifying assumption of sufficient sensing capability to provide an accurate estimate of the structural health, allowing us to decouple the sensing and control problems. This involves solving the planning problem induced by the expected evolution of the structural health, maximizing the expected reward over the planning horizon. Considering an infinite planning horizon, this can be stated as the optimization problem:

π* = argmax_π E[ Σ_{t=t_c}^{∞} γ^(t−t_c) R_t ],

where γ ∈ [0, 1] is the discount factor. Here, this is solved using the dynamic-programming value iteration algorithm [31]. The reward function to be optimized is chosen as the weighted combination:

R_t = R^control_t(U_t) + α R^health_t(D_t).

Herein, R^control_t(U_t) and R^health_t(D_t) quantify the rewards relative to control inputs and health state, respectively, and α ∈ R is a weighting factor, useful to tune the trade-off between risk-averse and risk-seeking behavior. After learning π(D_t), U^A_t is selected as the best point estimate of U_t. Starting from the updated digital state D_{t_c} at the current time step t_c, future prediction is achieved by unrolling, until a prediction time t_p, the portion of the PGM relative to D_t, Q_t, R_t, and U_t (see Fig. 3). All other nodes are removed from the prediction graph, as neither data assimilation nor actions are performed on the asset while forecasting its evolution. The factorization in Eq. (1) is accordingly extended over the prediction horizon, by unrolling the ϕ^history_t, ϕ^QoI_t, ϕ^control_t, and ϕ^reward_t factors from t_c + 1 up to t_p. The algorithmic description of the online phase of the proposed digital twinning framework is reported in Algorithm 1. The operations repeat each time new observational data are provided. Note that the considered PGM digital twinning framework is general, and can easily be adapted to deal with physical assets other than civil engineering structures by reorganizing the topology of the graph, if necessary.
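A minimal value-iteration sketch for computing such a health-dependent policy is given below. The state space, action set, transition CPTs, and rewards are illustrative placeholders (three digital states, two actions), not the paper's actual tables; the weighting factor α is assumed already folded into the reward matrix:

```python
import numpy as np

N_d, N_a = 3, 2          # illustrative: 3 digital states, 2 actions
gamma = 0.95             # discount factor

# T[a][i, j] = p(D_{t+1} = j | D_t = i, action a); toy numbers
T = np.array([
    [[0.9, 0.1, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 1.0]],  # do nothing
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]],  # repair
])
# R[i, a]: health reward minus control cost, alpha folded in (toy numbers)
R = np.array([[0.0, -1.0],
              [-0.5, -1.2],
              [-5.0, -2.0]])

# Value iteration: fixed-point iteration on the Bellman optimality operator
V = np.zeros(N_d)
for _ in range(2000):
    Qsa = R + gamma * np.einsum("aij,j->ia", T, V)  # action values, (N_d, N_a)
    V_new = Qsa.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new
policy = Qsa.argmax(axis=1)  # greedy action index per digital state
```

With these toy numbers the policy does nothing in the healthy state and repairs in the most damaged one, which is the qualitative behavior a health-dependent policy should exhibit.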
return: control input to be enacted u^A_t, expected evolution of D_t and U_t.

Numerical models for simulation-based damage identification
As anticipated in the previous section, the assimilation of structural response data to identify the structural state is carried out through DL models. A simulation-based strategy is exploited to train the DL models on the basis of vibration responses. The training data are numerically generated by simulating physics-based models, so that the effect of damage on the structural response can be systematically reproduced [32]. In particular, the structure to be monitored is modeled as a linear-elastic continuum, discretized in space through finite elements. Its dynamic response to the applied loadings, under the assumption of linearized kinematics, is described by the following semi-discretized form of the elasto-dynamic problem, which is referred to as the full-order model (FOM):

M ẍ(t) + C(µ) ẋ(t) + K(µ) x(t) = f(t, µ), t ∈ (0, T), with x(0) = x_0, ẋ(0) = ẋ_0.    (11)

Here t ∈ (0, T) denotes time; x(t), ẋ(t), ẍ(t) ∈ R^(N_FE) are the vectors of nodal displacements, velocities, and accelerations, respectively; N_FE is the number of degrees of freedom (dofs); M ∈ R^(N_FE×N_FE) is the mass matrix; C(µ) ∈ R^(N_FE×N_FE) is the damping matrix, assembled according to the Rayleigh model; K(µ) ∈ R^(N_FE×N_FE) is the stiffness matrix; f(t, µ) ∈ R^(N_FE) is the vector of nodal forces induced by the external loadings; and x_0 and ẋ_0 are the initial conditions (at t = 0), in terms of nodal displacements and velocities, respectively. The mass matrix M is not a function of µ because the mass properties of the structure are unaffected by the employed damage description or by the operational conditions. The solution of Problem (11) is advanced in time using the Newmark integration scheme (constant average acceleration method) [33], providing x_l, ẋ_l, and ẍ_l, for l = 1, …, L, with x_l being the vector of nodal displacements at time step l.
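The constant average acceleration variant of the Newmark scheme (β = 1/4, γ = 1/2) can be sketched as follows; a toy 2-dof system with illustrative matrices and load stands in for the FOM:

```python
import numpy as np

# Newmark, constant average acceleration: beta = 1/4, gamma = 1/2
beta, gam = 0.25, 0.5
M = np.diag([1.0, 1.0])                        # toy mass matrix
K = np.array([[20.0, -10.0], [-10.0, 10.0]])   # toy stiffness matrix
C = 0.01 * M + 0.01 * K                        # Rayleigh damping
dt, n_steps = 0.01, 200

x = np.zeros(2)
xd = np.zeros(2)
f = lambda t: np.array([0.0, np.sin(2 * np.pi * 5 * t)])  # toy loading
# consistent initial acceleration from the equation of motion at t = 0
xdd = np.linalg.solve(M, f(0.0) - C @ xd - K @ x)

# Effective stiffness is constant for a linear system: factor once
K_eff = K + gam / (beta * dt) * C + 1.0 / (beta * dt**2) * M
history = []
for l in range(1, n_steps + 1):
    t = l * dt
    rhs = (f(t)
           + M @ (x / (beta * dt**2) + xd / (beta * dt)
                  + (1.0 / (2 * beta) - 1.0) * xdd)
           + C @ (gam / (beta * dt) * x + (gam / beta - 1.0) * xd
                  + dt * (gam / (2 * beta) - 1.0) * xdd))
    x_new = np.linalg.solve(K_eff, rhs)
    # update velocity and acceleration from the displacement increment
    xd_new = (gam / (beta * dt) * (x_new - x) + (1.0 - gam / beta) * xd
              + dt * (1.0 - gam / (2 * beta)) * xdd)
    xdd_new = ((x_new - x) / (beta * dt**2) - xd / (beta * dt)
               - (1.0 / (2 * beta) - 1.0) * xdd)
    x, xd, xdd = x_new, xd_new, xdd_new
    history.append(x.copy())
```

This variant is unconditionally stable for linear problems, which is why it is a common default for structural dynamics.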
With reference to civil structures, we focus on the early detection of damage patterns characterized by a small evolution rate, whose prompt identification can reduce lifecycle costs and increase the safety and availability of the structure. In this context, a localized reduction of the material stiffness stands as the simplest damage mechanism resulting from a time scale separation between damage growth and damage assessment, see e.g., [34,35,36]. Here, local stiffness reduction is obtained by parametrizing the stiffness matrix via two variables y ∈ N and δ ∈ R, included in the parameter vector µ, respectively describing the location and magnitude of the applied stiffness reduction, similarly to [37,38,39]. In particular, y ∈ {0, …, N_y} labels the specific damage region, among a set of predefined N_y damage locations, where y = 0 identifies the damage-free baseline. The parameter δ ∈ R describes the magnitude of the stiffness reduction taking place within the predesignated region associated with y.
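One way to realize this parametrization, sketched below under the assumption that the global stiffness matrix is assembled as a sum of per-region contributions (the 4×4 SPD blocks here are random placeholders for actual finite element contributions):

```python
import numpy as np

# Placeholder per-region stiffness contributions K_regions[j-1] for region j.
N_y = 3
rng = np.random.default_rng(0)
K_regions = []
for _ in range(N_y):
    A = rng.standard_normal((4, 4))
    K_regions.append(A @ A.T + np.eye(4))  # symmetric positive definite block

def assemble_stiffness(y, delta=0.0):
    """K(mu): scale the contribution of region y by (1 - delta).
    y = 0 is the damage-free baseline; y in {1, ..., N_y} selects a region."""
    K = np.zeros((4, 4))
    for j, Kj in enumerate(K_regions, start=1):
        scale = (1.0 - delta) if j == y else 1.0
        K += scale * Kj
    return K

K_healthy = assemble_stiffness(0)
K_damaged = assemble_stiffness(2, delta=0.5)  # 50% reduction in region 2
```

The mass matrix needs no such parametrization, consistent with the observation above that M does not depend on µ.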
As N_FE increases, the computational cost associated with the solution of the FOM for any sampled µ also grows, and the generation of synthetic datasets becomes prohibitive. To address this challenge, a projection-based reduced-order model (ROM) is exploited in place of the FOM to speed up the offline dataset population phase, similarly to [38,39]. The ROM is obtained by a proper orthogonal decomposition (POD)-Galerkin reduced basis method [27,40,41,42]. This reduced-order modeling strategy is chosen because POD has been investigated and validated in the context of structural dynamics [43,44] and structural analysis [45,46], because of its appealing offline-online decoupling, and because of the availability of efficient criteria for the selection of POD basis functions.
The ROM approximation to the solution of Problem (11) is obtained by linearly combining N_RB ≪ N_FE POD basis functions w_k ∈ R^(N_FE), k = 1, …, N_RB, as x(t, µ) ≈ W x_r(t, µ), where W = [w_1, …, w_{N_RB}] ∈ R^(N_FE×N_RB) is the basis matrix collecting the POD basis functions and x_r(t, µ) ∈ R^(N_RB) is the vector of unknown POD coefficients. By enforcing the orthogonality between the residual and the subspace spanned by the first N_RB POD modes through a Galerkin projection, the following N_RB-dimensional semi-discretized form is obtained:

M_r ẍ_r(t) + C_r(µ) ẋ_r(t) + K_r(µ) x_r(t) = f_r(t, µ).    (12)

The solution of this reduced-order system is advanced in time using the same strategy employed for the FOM, and then projected onto the original FOM space as x(t, µ) ≈ W x_r(t, µ). Here, the reduced matrices M_r, C_r, and K_r, and the reduced vector f_r play the same role as their high-fidelity counterparts, yet with dimension N_RB × N_RB instead of N_FE × N_FE, according to the following relationships:

M_r = WᵀMW, C_r(µ) = WᵀC(µ)W, K_r(µ) = WᵀK(µ)W, f_r(t, µ) = Wᵀf(t, µ).

The basis matrix W is obtained by POD, exploiting the so-called method of snapshots, as follows. First, a snapshot matrix S = [x_1, …, x_{N_S}] ∈ R^(N_FE×N_S) is assembled from N_S solution snapshots, computed by integrating the FOM in time for different values of the parameters µ. The computation of an optimal reduced basis is then carried out by factorizing S through a singular value decomposition. We use a standard energy-based criterion to set the order N_RB of the approximation. For further details see, e.g., [27,42,23,21].
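The method of snapshots with an energy-based truncation can be sketched as follows; the snapshot matrix here is synthetic low-rank data standing in for FOM solutions, and the mass matrix is taken as the identity purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
N_FE, N_S = 200, 60
# synthetic snapshots: rank-5 structure plus tiny noise, mimicking
# correlated FOM solutions across parameter samples
S = rng.standard_normal((N_FE, 5)) @ rng.standard_normal((5, N_S))
S += 1e-6 * rng.standard_normal((N_FE, N_S))

# POD via singular value decomposition of the snapshot matrix
U_svd, sv, _ = np.linalg.svd(S, full_matrices=False)
energy = np.cumsum(sv**2) / np.sum(sv**2)
eps = 1e-3                                  # fraction of energy to discard
N_RB = int(np.searchsorted(energy, 1.0 - eps) + 1)
W = U_svd[:, :N_RB]                         # POD basis matrix

# Galerkin projection of a FOM operator onto the POD subspace;
# with M = I this reduces to the identity, since W has orthonormal columns
M = np.eye(N_FE)
M_r = W.T @ M @ W
```

The same `W.T @ A @ W` projection applies to C(µ) and K(µ), and `W.T @ f` to the load vector, matching the relationships stated above.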
To populate the training dataset D, the parametric space of the vector µ is taken as uniformly distributed, and then sampled via the Latin hypercube rule. The number of samples equals the number I of instances collected in D as:

D = {(U_i, y_i, δ_i)}, i = 1, …, I,    (14)

where the vibration recordings U_i, associated with the i-th sampling of µ, are labeled by the corresponding values of y_i and δ_i, and are obtained as follows. With reference to displacement recordings, nodal values in (0, T) are first collected by solving Problem (12) and projecting the solution back onto the FOM space. The relevant vibration recordings U_i are then obtained as:

U_i = [T x_1, …, T x_L]ᵀ ∈ R^(L×N_u),

where T ∈ R^(N_u×N_FE) is a Boolean matrix whose (n, m)-th entry is equal to 1 only if the n-th sensor output coincides with the m-th dof. In order to mimic the measurement noise, each vibration recording in D is corrupted by adding an independent, identically distributed Gaussian noise, whose statistical properties depend on the target accuracy of the sensors. In the following, the index i is dropped for ease of notation, unless necessary.
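The dataset-population loop can be sketched as below: a hand-rolled Latin hypercube sampler (one stratum per sample and parameter, randomly paired), Boolean extraction of the monitored dofs, and additive Gaussian noise. Parameter bounds, sizes, sensor dofs, and the noise level are illustrative, and the nodal displacement matrix is a random placeholder for actual ROM solutions:

```python
import numpy as np

rng = np.random.default_rng(2)

def latin_hypercube(n_samples, bounds):
    """Stratified sampling: each parameter range is split into n_samples
    intervals, one point drawn per interval, strata randomly paired."""
    dim = len(bounds)
    strata = np.tile(np.arange(n_samples), (dim, 1))
    u = (rng.permuted(strata, axis=1).T + rng.random((n_samples, dim)))
    u /= n_samples
    lo, hi = np.array(bounds).T
    return lo + u * (hi - lo)

bounds = [(40.0, 80.0), (10.0, 60.0)]   # e.g. load amplitude Q, frequency f
mu = latin_hypercube(10, bounds)

# Boolean selection of the monitored dofs (illustrative sizes)
L_steps, N_FE, N_u = 50, 12, 3
sensor_dofs = [2, 7, 11]
T = np.zeros((N_u, N_FE))
T[np.arange(N_u), sensor_dofs] = 1.0

X = rng.standard_normal((L_steps, N_FE))  # placeholder nodal displacements
U_clean = X @ T.T                         # (L, N_u) sensor recordings
U_noisy = U_clean + 0.01 * rng.standard_normal(U_clean.shape)
```

Stratification guarantees that every sub-interval of every parameter range is sampled exactly once, which is what makes Latin hypercube sampling more space-filling than plain Monte Carlo for the same budget.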

Data assimilation via artificial neural networks
The ϕ^data_t factor in our PGM encodes the assimilation of observed data through the DL models underlying the identification of the structural health. In this section, we describe the adopted DL models, the aspects related to their training, and how they are used to assimilate observational data to detect, locate, and quantify the presence of structural damage.
Every time new observational data U are acquired, they are first processed with a classification model NN_CL to address damage detection/localization. Classification involves the prediction of an output class to categorize a given input; here, the classes are those described through the y parameter. Whenever damage is identified in the j-th region, j = 1, …, N_y, the observational data U are further processed with regression models NN^j_RG, one for each damageable region, to quantify the associated amount of damage δ.
The aforementioned classification and regression tasks are addressed by means of DL models. The use of DL models for SHM purposes has the advantage of automating the feature engineering stage characterizing the pattern recognition paradigm for SHM [35,47]. Indeed, a DL model is trained to select and extract optimized damage-sensitive features from raw sensor recordings through an end-to-end learning process. Moreover, since the DL models are learned offline, the structural state can subsequently be assessed in real time, whether the variables to be identified are continuous or discrete; this would be difficult to achieve with other optimization techniques, such as nonlinear programming, stochastic optimization, and metaheuristic methods.
The model NN_CL addresses the multi-class classification task underlying the damage detection/localization problem, namely NN_CL : U → b ∈ R^(N_y+1). The target label b categorizes one of the N_y + 1 predefined damage scenarios described through the parameter y. In particular, b is a one-hot encoding b = [b_0, …, b_{N_y}]ᵀ, with entries b_m equal to 1 if the target class y is m and 0 otherwise, with m = 0, …, N_y. This encoding is needed because DL models cannot operate on nominal data directly: all input and output variables are required to be numeric. The one-hot encoding converts the nominal feature described by the y parameter into a binary vector with one dimension per category, in which the entry corresponding to the target category is 1 and the rest are 0. When NN_CL is exploited for prediction, the most likely class is selected as the one that best categorizes the processed measurements U.
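The encoding and the class-selection step can be sketched as follows; the logits are made-up numbers standing in for the output of a trained NN_CL, whose final layer is assumed (as is typical for multi-class classifiers) to be a softmax:

```python
import numpy as np

N_y = 7  # number of damageable regions, as in the beam test case

def one_hot(y):
    """One-hot encoding of the class y in {0, ..., N_y}."""
    b = np.zeros(N_y + 1)
    b[y] = 1.0
    return b

b = one_hot(3)  # e.g. damage located in region 3

# Class selection from illustrative network outputs: softmax, then argmax
logits = np.array([0.1, 0.2, 0.3, 2.5, 0.1, 0.0, 0.2, 0.1])
probs = np.exp(logits) / np.exp(logits).sum()
y_hat = int(np.argmax(probs))  # most likely class
```

The full probability vector, rather than only the argmax, is what feeds the confusion-matrix-based ϕ^NN_t update described in Sec. 2.1 with a calibrated notion of classifier reliability.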
The model NN^j_RG addresses the regression task underlying the damage quantification problem, namely NN^j_RG : U → δ ∈ R, with j = 1, …, N_y. An estimate of δ is obtained as NN^j_RG(U). Hence, the regression models, one for each damageable region, map the vibration recordings U associated with the j-th damage region onto the estimated magnitude of the stiffness reduction taking place within that region. Since all NN^j_RG models are learned following the same procedure, the index j is dropped in the following for ease of notation.
Since the space of digital states in the PGM is discrete, the outcomes of NN_CL and NN_RG are accommodated within the PGM by discretizing the range in which the damage level δ can take values into N_δ uniform intervals, thus resulting in N_d = 1 + N_δ N_y possible states. The same reasoning is followed to compute the confusion matrix encoding the ϕ^NN_t factor. In particular, ϕ^NN_t measures the offline performance of NN_CL and NN_RG in assimilating noisy FOM data to classify the digital state, among the N_d possible outcomes of D_t.
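The mapping from a (y, δ) pair to a digital-state index can be sketched as below, assuming the lexicographic ordering used later in Sec. 3; the interval edges are those of the beam test case, taken here only for illustration:

```python
# State 0 is the damage-free baseline; each of the N_y regions then
# contributes N_delta damage-level bins, ordered lexicographically.
N_y, N_delta = 7, 6
edges = [0.30, 0.35, 0.45, 0.55, 0.65, 0.75, 0.80]  # delta bin edges

def digital_state(y, delta=None):
    """Map (y, delta) to an index in {0, ..., N_d - 1}."""
    if y == 0:
        return 0
    for b in range(N_delta):
        if delta <= edges[b + 1]:          # bin b covers (edges[b], edges[b+1]]
            return 1 + (y - 1) * N_delta + b
    raise ValueError("delta outside the parametrized range")

N_d = 1 + N_delta * N_y   # 43 states for the beam test case
```

This index is the common coordinate shared by the DL-model outputs, the confusion matrix, and the transition CPTs, which is what lets all factors of the PGM operate on the same discrete space.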
The models NN_CL and NN_RG are trained separately, on datasets derived from the dataset D in Eq. (14) as follows. The dataset used to learn NN_CL is obtained as:

D_CL = {(U_i, b_i)}, i = 1, …, I,

while the dataset used to learn NN_RG is obtained as:

D_RG = {(U_i, δ_i)}, i = 1, …, I_RG,

where I_RG is the number of training instances in D_RG, all characterized by a structural damage within the same predefined region. The set of weights and biases parametrizing NN_CL is denoted as Θ_CL. This is optimized by minimizing the probabilistic categorical cross-entropy [48,37] L_CL between the predicted and target class labels over D_CL:

L_CL(Θ_CL) = −Σ_{i=1}^{I} Σ_{m=0}^{N_y} (b_i)_m log (NN_CL(U_i))_m,

which provides a measure of the distance between the discrete probability distribution describing b and its estimated counterpart NN_CL(U). The set of weights and biases Θ_RG parametrizing NN_RG is learned through the minimization of the following mean squared error loss function:

L_RG(Θ_RG) = (1/I_RG) Σ_{i=1}^{I_RG} (δ_i − NN_RG(U_i))²,

which provides a measure of the distance between the target magnitude of the stiffness reduction δ and its approximated counterpart NN_RG(U).
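The two losses are standard and can be written in a few lines of numpy; the predictions below are made-up toy values, not outputs of the actual trained models:

```python
import numpy as np

def categorical_cross_entropy(b_true, b_pred, eps=1e-12):
    """Mean over instances of -sum_m b_m log(b_hat_m); eps avoids log(0)."""
    return -np.mean(np.sum(b_true * np.log(b_pred + eps), axis=1))

def mse(delta_true, delta_pred):
    """Mean squared error for the damage-quantification regression."""
    return np.mean((delta_true - delta_pred) ** 2)

# Toy batch: two instances, three classes
b_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
b_pred = np.array([[0.1, 0.8, 0.1], [0.7, 0.2, 0.1]])
loss_cl = categorical_cross_entropy(b_true, b_pred)

# Toy regression targets and predictions for delta
delta_true = np.array([0.40, 0.55])
delta_pred = np.array([0.42, 0.50])
loss_rg = mse(delta_true, delta_pred)
```

In the Keras-based implementation mentioned in Sec. 3, these correspond to the built-in categorical cross-entropy and mean-squared-error losses.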
The algorithmic description of the procedures and computations characterizing the preliminary offline phase of the proposed digital twinning framework is reported in Algorithm 2. The implementation details of the deep learning models are reported in Appendix A.

Algorithm 2 Preliminary offline phase -algorithmic description
Input: parametrization of the operational and damage conditions; PGM implementing the prediction graph.
1: set up the physics-based numerical model of the structure to be monitored.

Numerical experiments
This section demonstrates the proposed methodology for two test cases: an L-shaped cantilever beam and a railway bridge.
The FOM and ROM in Problem (11) and Problem (12) are implemented in the Matlab environment, using the redbKIT library [49]. The PGM framework for predictive digital twins is implemented in Python, using the pgmpy library [50]. All computations have been carried out on a PC featuring an AMD Ryzen™ 9 5950X CPU @ 3.4 GHz and 128 GB RAM. The NN architectures are implemented through the Tensorflow-based Keras API [51], and trained on a single Nvidia GeForce RTX™ 3080 GPU card.

L-shaped cantilever beam
The first test case deals with the L-shaped cantilever beam depicted in Fig. 4. The structure is made of two arms, each having a length of 4 m, a width of 0.3 m, and a height of 0.4 m. The assumed mechanical properties are those of concrete: Young's modulus E = 30 GPa, Poisson's ratio ν = 0.2, density ρ = 2500 kg/m³. The structure is excited by a distributed vertical load q(t), acting on an area of (0.3 × 0.3) m² close to its tip. The load varies in time according to q(t) = Q sin(2πft), with Q ∈ [40, 80] kPa and f ∈ [10, 60] Hz being the load amplitude and frequency, respectively. Following the setup described in Sec. 2, these parameters have a uniform distribution within their respective ranges.

Dataset assembly
Synthetic displacement time histories U are obtained for N_u = 8 dofs along the bottom surface of the structure, to mimic the monitoring system depicted in Fig. 4. Each recording is provided for a time interval (0, T = 1 s) with an acquisition frequency f_s = 200 Hz. Recordings are corrupted with an additive Gaussian noise yielding a signal-to-noise ratio of 100.
In addition to the damage-free baseline condition, damage is simulated by considering N_y = 7 possible damage classes, each referring to a reduction of the material stiffness within a subdomain Ω_j, with j = 1, …, N_y, as depicted in Fig. 4. The stiffness reduction can occur with a magnitude δ ∈ [30%, 80%], and is held constant within the considered time interval.
The FOM is obtained with a finite element discretization using linear tetrahedral elements, resulting in N_FE = 4659 dofs. The basis matrix W is obtained from a snapshot matrix S, assembled through 400 evaluations of the FOM at varying values of the input parameters µ = (Q, f, y, δ)ᵀ, sampled via the Latin hypercube rule. By prescribing a tolerance ε = 10⁻³ on the fraction of energy content to be disregarded in the approximation, the order of the ROM approximation turns out to be N_RB = 56.
The dataset D is built with I = 10,000 instances collected using the ROM. This is then employed to train NN_CL and NN_RG, as described in the previous section. In the absence of experimental data, the testing of NN_CL and NN_RG is carried out through noise-corrupted FOM solutions. In particular, the asset is monitored by processing batches of N_obs = 10 noisy observations, relative to the same damage location y and damage magnitude δ, yet featuring varying operational conditions set by Q and f. As the health of the asset evolves over time, the DT assimilates a batch of noisy observations {U_k}, k = 1, …, N_obs, at each time step, to dynamically estimate the variation in the structural health parameters underlying the digital state.

Digital twin framework
The two structural health parameters within the digital state are d = (y, δ)⊤. In order to accommodate the outcome of the DL models within the PGM and to compute the CPT encoding the ϕ_NN_t factor, the range in which the damage level δ can take values is discretized into N_δ = 6 intervals {[30%, 35%], [35%, 45%], [45%, 55%], [55%, 65%], [65%, 75%], [75%, 80%]}, thus resulting in N_d = 43 possible digital states. The number of δ intervals and the width of each interval are chosen arbitrarily, and there are no restrictions in this respect. The resulting digital states are then sorted in lexicographic order.
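The enumeration of the N_d = 43 digital states (one damage-free state plus 7 locations times 6 δ intervals) can be made explicit as follows; the state labels are illustrative, not the paper's notation.

```python
from itertools import product

# δ intervals as listed in the text (percent stiffness reduction)
delta_intervals = [(30, 35), (35, 45), (45, 55), (55, 65), (65, 75), (75, 80)]
n_y = 7  # candidate damage regions Ω_1, ..., Ω_7

# State 0 is the damage-free condition; the remaining states pair a
# damage location with a δ interval, in lexicographic order.
digital_states = [("undamaged", None)] + [
    (f"Omega_{j}", interval)
    for j, interval in product(range(1, n_y + 1), delta_intervals)
]
```

With 7 locations and 6 intervals each, the list holds 1 + 7 × 6 = 43 entries, matching N_d.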
The confusion matrix reported in Fig. 5 measures the offline performance of NN_CL and NN_RG in assimilating noisy FOM data to classify the digital state, among the N_d possible outcomes of D_t. The (unknown) ground truth digital state is detected by the DL models with an overall classification accuracy of 93.61%. Moreover, the confusion matrix shows that most of the misclassifications involve damage scenarios related to a stiffness reduction within Ω_6 or Ω_7. This is an expected outcome, since measurements closer to the clamped side are only marginally affected by damage close to the free end of the beam, thus yielding a smaller sensitivity of sensor recordings to such damage. This confusion matrix then serves as the CPT encoding the ϕ_NN_t factor. For the present case, we consider four possible control inputs, each provided with a CPT modeling the transition probability p(D_t+1 | D_t, U_A_t = u_A_t) from D_t to D_t+1 after taking the action u_A_t, and collectively encoding the ϕ_history_t factor. These internal models of how the structural health is expected to evolve do not reflect the prescribed ground truth evolution, which is unknown to the DT. The considered control inputs are the following:
• Do nothing (DN) action. No maintenance action is planned and the physical state evolves according to a stochastic deterioration process.
• Minor imperfect maintenance (MI) action. A maintenance action is performed and the asset may be restored from its current condition to a healthier damage state. This can be traced back to, e.g., patching and sealing cracked surfaces, rectifying and replacing expansion joints, or tightening/replacing loose/missing bolts for steel members.
• Major imperfect maintenance (MA) action. A maintenance action is performed and the asset may be restored from its current condition to a healthier damage state, with a higher probability of improvement than in the previous case. This can be traced back to, e.g., repairing heavily damaged slabs, piers, and steel members, and retrofitting compromised structural elements.
• Perfect maintenance (PM) action. A maintenance action is performed and the asset is restored from its current condition to the damage-free state. This can be traced back to the replacement of excessively compromised structural elements.
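Turning the confusion matrix into the CPT p(D_NN_t | D_t) encoding the ϕ_NN_t factor amounts to a row-wise normalization. A minimal sketch, with a hypothetical three-state confusion matrix of test-set counts:

```python
import numpy as np

def confusion_to_cpt(confusion):
    """Convert a confusion matrix (rows: ground-truth digital state,
    columns: DL-model output) into the CPT p(D_NN_t | D_t) encoding
    the ϕ_NN_t factor, by normalizing each row to sum to one."""
    counts = np.asarray(confusion, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

# Toy 3-state example (counts from a hypothetical test set):
cpt = confusion_to_cpt([[90, 8, 2], [5, 85, 10], [1, 9, 90]])
```

Each row of the result is a valid conditional distribution over the DL-model outputs given the true digital state.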

Results: two available actions
We first illustrate the DT capabilities to assimilate observational data and track the structural health evolution, by restricting the available actions to DN and PM. We prescribe a stochastic degradation process featuring a probability of damage inception (y ≠ 0) equal to 0.5. Damage may develop in any of the predefined regions with δ = 30%, and then propagate with δ increments sampled from a Gaussian probability density function (pdf) centered at 1.5% and featuring a standard deviation equal to 1% (negative increments are rounded to zero). At each time step during operation, new observational data are simulated according to the (unknown) ground truth structural health and the most recently enacted control input. The DT assimilates the data and estimates the digital state, eventually suggesting the next control input to enact. Note that the prescribed trajectory of the structural health parameters is arbitrarily chosen to fully display the capabilities of the DT. Nevertheless, the DT would be equally capable of tracking the structural health evolution under either more or less aggressive degradation processes. The transition model p(D_t+1 | D_t, U_A_t = u_A_t) associated with the DN action assumes that damage may start in any subdomain Ω_j, with j = 1, ..., N_y, with probability 0.05, and then grow to the next δ interval with the same probability. The transition model assumed for the PM action instead maps the D_t belief to a belief over D_t+1 associated with the damage-free condition, independently of the current condition.
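One sequential Bayesian inference step of the dynamic Bayesian network combines the action-conditioned transition model (ϕ_history) with the DL-model output through the confusion-matrix CPT (ϕ_NN). A sketch on a hypothetical two-state example; the function name and the toy numbers are illustrative only:

```python
import numpy as np

def update_belief(belief, T_action, cpt_nn, d_nn):
    """One filtering step for the digital-state belief.

    belief   : prior belief over the digital states at time t-1
    T_action : transition matrix p(D_t | D_t-1, u_A_t-1) for the last action
    cpt_nn   : CPT p(D_NN_t | D_t) derived from the confusion matrix
    d_nn     : index of the digital state output by the DL models
    """
    predicted = belief @ T_action              # prediction through ϕ_history
    posterior = predicted * cpt_nn[:, d_nn]    # correction through ϕ_NN
    return posterior / posterior.sum()

# Toy 2-state example: healthy (0) vs damaged (1), after a DN action.
T_dn = np.array([[0.95, 0.05], [0.0, 1.0]])   # damage may start, never heals
cpt = np.array([[0.9, 0.1], [0.2, 0.8]])
b = update_belief(np.array([0.5, 0.5]), T_dn, cpt, d_nn=1)
```

Observing the "damaged" output shifts the belief toward the damaged state while retaining the prediction uncertainty.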
At each time step, the DT selects a control input u_A_t ∈ {DN, PM} to be enacted on the asset. Taking a DN action yields a positive reward, but also leaves the chance of the asset's structural health worsening. On the other hand, the PM action responds to the degrading structural health, yet yields a negative reward. In particular, of the two reward functions in Eq. (9), R_control_t targets the cost assigned to each control input, while R_health_t measures the cost associated with the structural health state. These non-dimensional rewards represent indicative values the decision-maker is charged due to the condition of the structure. Although these values are not based on real data, realistic values are not hard to obtain: state agencies and companies provide lists of services and costs [52]. The three cases in R_health_t distinguish between the absence of damage, the presence of damage within the half closer to the clamped side, and the presence of damage far from the clamp, respectively. Note how these penalize the progressive deterioration of the structural health as a function of δ. R_health_t can reflect a variety of aspects, such as reduction in the level of service due to deterioration, working accidents, structural reliability, and structural failure probability [52].
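The trade-off between the two reward terms can be illustrated with a myopic, one-step-greedy selection rule. This is a deliberately simplified stand-in for the policy computed offline by solving the planning problem; the reward numbers and transition matrices below are illustrative, not the paper's values:

```python
import numpy as np

def greedy_action(belief, transitions, r_control, r_health):
    """Pick the action maximizing the expected one-step reward
    R_control(u) + E[R_health(D_t+1)] under the current belief."""
    scores = {u: r_control[u] + belief @ T @ r_health
              for u, T in transitions.items()}
    return max(scores, key=scores.get)

# Toy 2-state example: healthy (0) earns reward, damage (1) is costly.
r_health = np.array([0.0, -10.0])
transitions = {"DN": np.array([[0.95, 0.05], [0.0, 1.0]]),
               "PM": np.array([[1.0, 0.0], [1.0, 0.0]])}
r_control = {"DN": 1.0, "PM": -5.0}
```

A mostly-healthy belief favors DN, while a mostly-damaged belief makes the repair cost worth paying.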

Results: four available actions
We now consider all four possible control inputs. We prescribe a stochastic degradation process with a probability of damage inception (y ≠ 0) equal to 0.5. Damage may develop in any of the predefined regions with damage level sampled from a uniform distribution δ ∈ [30%, 70%], and then propagate as in the previous case. This more aggressive degradation process is used to display, within a few time steps, the effectiveness of the decision-making capabilities of the DT. The effect of a MI action on the asset is simulated with δ decrements sampled from a Gaussian pdf centered at −12.5% and featuring a standard deviation equal to 1%. We model the effect of a MA action with δ decrements sampled from a Gaussian pdf centered at −17.5% and featuring a standard deviation equal to 1%. In both cases, the damage-free condition is assumed to be recovered if the resulting structural state features δ < 30%. The transition model p(D_t+1 | D_t, U_A_t = u_A_t) associated with the MI action assumes no improvement in the structural health with probability 0.1, improvement of one δ interval with probability 0.75, and improvement of two δ intervals with probability 0.15. Similarly, the MA action assumes no improvement with probability 0.05, improvement of one δ interval with probability 0.3, improvement of two δ intervals with probability 0.4, and improvement of three δ intervals with probability 0.25.
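The interval-shifting transition models for the imperfect maintenance actions can be assembled mechanically. A sketch under the stated probabilities; the helper name and the state indexing (0 for undamaged, 1..N_δ for the δ intervals at a fixed damage location) are assumptions for illustration:

```python
import numpy as np

def repair_transition(n_delta, improve_probs):
    """Transition matrix over {0 (undamaged), 1..n_delta (δ intervals)}
    for an imperfect maintenance action. improve_probs[k] is the
    probability of improving by k δ intervals; improvements past the
    first interval land in the undamaged state."""
    n = n_delta + 1
    T = np.zeros((n, n))
    T[0, 0] = 1.0  # an undamaged asset stays undamaged
    for i in range(1, n):
        for k, p in enumerate(improve_probs):
            T[i, max(i - k, 0)] += p
    return T

# MI action as in the text: no improvement 0.1, one interval 0.75, two 0.15.
T_mi = repair_transition(6, [0.10, 0.75, 0.15])
# MA action: no improvement 0.05, one 0.3, two 0.4, three 0.25.
T_ma = repair_transition(6, [0.05, 0.30, 0.40, 0.25])
```

From the first δ interval any improvement reaches the undamaged state, so the corresponding row mass accumulates at state 0.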
Fig. 8 depicts a simulated online phase of the DT up to t_c = 50. The DT accurately tracks the digital state evolution and suggests the appropriate control input in a timely fashion most of the time. In particular, the DT proposes the optimal control input at every step except t = 43 and t = 50, where a sub-optimal action is selected. In both cases, a MI action is proposed in place of a DN, because the DT estimates δ ∈ [35%, 45%] instead of δ ∈ [30%, 35%] for a stiffness reduction within Ω_7. This is in line with what was observed in the confusion matrix of Fig. 5, due to the limited sensitivity of recordings to damage scenarios affecting the terminal region of the beam. This type of misclassification turns out to be the most pathological in the confusion matrix, and is therefore potentially capable of spoiling the assimilation of observational data. Nevertheless, the DT reverts to correctly tracking the structural health of the asset within one time step.
Fig. 9 depicts the predicted evolution of the digital state and control inputs, starting from t_c = 21 and extending 20 time steps into the future. The DT correctly suggests taking, with high probability, a MA action followed by two MI actions, and accordingly predicts the corresponding evolution of the structural health. Comparing the DT prediction with what is effectively experienced during the online phase (see Fig. 8), note how the prediction closely resembles the actual evolution of the digital state and control inputs. This is a remarkable result in terms of DT prediction capabilities, since the DT is not aware of the future values of the structural health parameters, and the relative transition models do not match their real (stochastic) evolution.
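The prediction engine behind such figures amounts to rolling the digital-state belief forward through the action-conditioned transition models, selecting actions according to the policy. A minimal sketch on a two-state toy problem; `predict_evolution`, the deterministic per-state policy, and the toy matrices are assumptions for illustration:

```python
import numpy as np

def predict_evolution(belief, transitions, policy, horizon):
    """Roll the digital-state belief forward over `horizon` future steps.
    policy[d] is the action the control policy selects in state d; the
    belief is propagated through the action-conditioned transitions."""
    beliefs, action_probs = [belief.copy()], []
    for _ in range(horizon):
        # expected action distribution p(U_t) under the current belief
        p_u = {u: float(belief[np.array(policy) == u].sum()) for u in transitions}
        action_probs.append(p_u)
        # mix the action-conditioned transitions according to the policy
        belief = sum((belief * (np.array(policy) == u)) @ T
                     for u, T in transitions.items())
        beliefs.append(belief)
    return beliefs, action_probs

transitions = {"DN": np.array([[0.95, 0.05], [0.0, 1.0]]),
               "PM": np.array([[1.0, 0.0], [1.0, 0.0]])}
policy = ["DN", "PM"]  # repair only when damaged
beliefs, action_probs = predict_evolution(np.array([1.0, 0.0]), transitions, policy, 3)
```

Each future belief remains a valid probability distribution, and the predicted action probabilities follow directly from the belief mass assigned to each state.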

Railway bridge
The second case study concerns the railway bridge depicted in Fig. 10. The bridge is subjected to the transit of trains of the Gröna Tåget type, at a speed υ ∈ [160, 215] km/h. Only trains composed of two wagons are considered, thus characterized by 8 axles, each carrying a mass ψ ∈ [16, 22] ton. The corresponding load model is described in [38], and consists of 25 equivalent distributed forces transmitted by the sleepers to the deck through the ballast layer with a slope of 4:1, in accordance with Eurocode 1 [55].

Dataset assembly
Synthetic displacement time histories U are obtained from N_u = 10 sensors deployed as depicted in Fig. 11. Each recording is provided for a time interval (0, T = 1.5 s) with an acquisition frequency f_s = 400 Hz. This setting allows recording train passages at the lowest speed of 160 km/h, while properly capturing the structural response at the maximum speed of 215 km/h. Recordings are corrupted with an additive Gaussian noise yielding a signal-to-noise ratio of 120.
In addition to the undamaged condition, the presence of damage in the structure is accounted for through a localized stiffness reduction that can take place within N_y = 6 predefined subdomains Ω_j, with j = 1, ..., N_y, as depicted in Fig. 11. The stiffness reduction can occur with a magnitude δ ∈ [30%, 80%], and is kept fixed while a train travels across the bridge. The FOM features N_FE = 17,292 dofs, resulting from a finite element discretization with an element size of 0.80 m, reduced to 0.15 m for the deck to enable a smooth propagation of the traveling load. The presence of the ballast layer is accounted for through an increased density for the deck and the edge beams. The embankments are accounted for through distributed springs, modeled as a Robin mixed boundary condition (with elastic coefficient k_robin = 10^8 N/m^3) applied on the surfaces facing the ground. The structural dissipation is modeled by means of a Rayleigh damping matrix, assembled to account for a 5% damping ratio on the first two structural modes.
The ROM is obtained from a snapshot matrix S, assembled through 400 evaluations of the FOM for different values of the parameters µ = (υ, ψ, y, δ)⊤. By setting the error tolerance to ϵ = 10^−3, N_RB = 133 POD modes are to be considered.
The training dataset D is built with I = 10,000 instances collected using the ROM. Also in this case, the testing phase of NN_CL and NN_RG is carried out considering noisy FOM solutions. The monitoring of the asset is then simulated by assimilating N_obs = 1 noisy observation at each time step. As the structural health of the bridge evolves over time, the DT estimates the variation in the structural health parameters every time a train travels across the bridge.

Digital twin framework
As in the previous case, the two structural health parameters within the digital state are d = (y, δ)⊤. The range in which the damage level δ can take values is discretized into N_δ = 6 intervals. The resulting N_d = 37 possible digital states are sorted first by damage location and then by damage level.
The confusion matrix measuring the offline performance of NN_CL and NN_RG in correctly categorizing the digital state is reported in Fig. 12. The ground truth digital state is detected with an overall classification accuracy of 91.39%. In this case, the majority of misclassifications are due to confusing adjacent digital states relative to the same damage location, thus yielding a tridiagonal band matrix. For the present case, we consider the following three possible control inputs:
• Do nothing (DN) action. No maintenance action is planned and the physical state evolves according to a stochastic deterioration process.
• Perfect maintenance (PM) action. A maintenance action is performed and the asset is restored from its current condition to the damage-free state.
• Restrict operational conditions (RE) action. The operational conditions of the bridge are restricted by allowing only lightweight trains, carrying less than 18 ton per axle, to travel across the bridge. Such a restriction results in a lower deterioration rate, but also yields a lower revenue generated by the infrastructure.
We prescribe a stochastic degradation process featuring a probability of damage inception (y ≠ 0) equal to 0.5. Damage may develop in any of the predefined regions with damage level sampled from a uniform distribution δ ∈ [30%, 35%], and then propagate with δ increments sampled from a Gaussian pdf centered at 1.5% and featuring a standard deviation equal to 1% (negative increments are rounded to zero). When the operations are restricted and only lightweight trains are allowed to travel across the bridge, we instead assume a probability of damage inception equal to 0.25. In this eventuality, damage may develop with damage level sampled from a uniform distribution δ ∈ [30%, 35%], and then propagate with δ increments sampled from a Gaussian pdf centered at 0.95% and featuring a standard deviation equal to 0.5%. The resulting trajectory of the structural health parameters is intended to represent periods of gradual degradation in the structural health, as well as sudden changes due to discrete damage events.
The transition model p(D_t+1 | D_t, U_A_t = u_A_t) associated with the DN action assumes that damage may start in any subdomain Ω_j, with j = 1, ..., N_y, with probability 0.1, and then grow to the next δ interval with the same probability. For the transition model associated with the RE action, this probability is assumed to decrease to 0.03. The transition model assumed for the PM action instead maps the D_t belief to a belief over D_t+1 associated with the damage-free condition, independently of the current condition.
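The prescribed ground-truth degradation process can be simulated directly from the stated distributions. A sketch for the unrestricted (DN) regime only, with illustrative function and variable names; the 80% cap follows the δ range used throughout the text:

```python
import numpy as np

def simulate_degradation(steps, p_incept=0.5, n_regions=6, rng=None):
    """Ground-truth degradation: damage starts in a random region with
    probability p_incept and δ ~ U(30, 35); it then grows by Gaussian
    increments (mean 1.5, std 1.0, negative increments rounded to zero),
    capped at the 80% upper bound of the damage range."""
    if rng is None:
        rng = np.random.default_rng()
    y, delta, history = 0, 0.0, []
    for _ in range(steps):
        if y == 0:
            if rng.random() < p_incept:       # damage inception
                y = int(rng.integers(1, n_regions + 1))
                delta = rng.uniform(30.0, 35.0)
        else:                                  # damage propagation
            delta = min(delta + max(rng.normal(1.5, 1.0), 0.0), 80.0)
        history.append((y, delta))
    return history

hist = simulate_degradation(50, rng=np.random.default_rng(42))
```

Under this process (no maintenance), δ is non-decreasing once damage has started.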
In this case, the two reward functions in Eq. (9) are chosen such that the last contribution in R_health_t penalizes excessively compromised structural states with a significantly negative reward.

Results
During the offline phase, we solve the planning problem in Eq. (8) by assuming a discount factor γ = 0.90 and a weighting factor α = 1. The resulting control policy π(D_t) recommends that the asset operate in ordinary conditions as long as δ ∈ [30%, 35%], after which point it should fall back to the more conservative RE regime in order to minimize further degradation. Once δ ≥ 65% is reached, the bridge should finally be repaired.
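The flavor of this offline planning step can be conveyed with plain value iteration on a fully observed MDP over the digital states; this is a simplified stand-in for the planning problem of Eq. (8) (which operates on beliefs), and the three-state chain with its reward numbers is illustrative only:

```python
import numpy as np

def value_iteration(transitions, rewards, gamma=0.90, tol=1e-8):
    """Compute a policy π(D_t) for the discounted planning problem.
    transitions[u] is p(D_t+1 | D_t, u); rewards[u] is the per-state
    reward of taking action u."""
    actions = list(transitions)
    n = transitions[actions[0]].shape[0]
    V = np.zeros(n)
    while True:
        Q = np.stack([rewards[u] + gamma * transitions[u] @ V for u in actions])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return [actions[i] for i in Q.argmax(axis=0)]

# Toy 3-state chain: healthy, damaged, failed.
T = {"DN": np.array([[0.9, 0.1, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 1.0]]),
     "PM": np.array([[1.0, 0.0, 0.0]] * 3)}
R = {"DN": np.array([1.0, 0.0, -50.0]), "PM": np.array([-5.0, -5.0, -8.0])}
policy = value_iteration(T, R)
```

The converged policy keeps operating while healthy and repairs once deterioration makes the discounted future cost of inaction dominate.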
Fig. 13 reports a sample simulation of the DT online phase up to time step t_c = 60. The DT correctly tracks the digital state with relatively low uncertainty. Damage initially develops within Ω_5, and the DT follows its evolution with a limited delay of at most two time steps with respect to the ground truth, owing to the need to update the prior belief carried over from the previous time steps. The RE action is suggested as soon as the DT estimates δ ∈ [35%, 65%], after which point the DT keeps tracking the structural health parameters as they evolve at a lower deterioration rate. A PM action is finally suggested due to an excessively compromised structural state. A similar behavior can be observed for the following damage scenario affecting Ω_6. Fig. 14 reports the predicted evolution of the digital state and control inputs, starting from t_c = 5 and extending 20 time steps into the future. The DT predicts the expected degradation of the structural health according to the transition model associated with the DN action, before predicting a RE action with relatively high probability after a few time steps. The DT prediction is close to what is effectively experienced online (see Fig. 13). However, besides the estimated digital state lagging two time steps behind the ground truth, the prediction is also too optimistic in terms of deterioration rate, which suggests the use of a more refined transition model.

Conclusions
In this work we have proposed a predictive digital twin approach to the health monitoring, maintenance, and management planning of civil structures, to advance condition-based and predictive maintenance practices. The presented strategy relies upon a probabilistic graphical model inspired by [18]. This framework is used to encode the asset-twin coupled dynamical system, the relevant end-to-end information flow via observational data (physical to digital) and control inputs (digital to physical), and its evolution over time, all with quantified uncertainty. The assimilation of observational data is carried out with deep learning models, leveraging the capabilities of convolutional layers to automatically select and extract damage-sensitive features from raw vibration recordings. The structural health parameters comprising the digital state are used to capture the variability of the physical asset. They are continually updated in a sequential Bayesian inference fashion, according to control-dependent transition dynamics models describing how the structural health is expected to evolve. The updated digital state is eventually exploited to predict the future evolution of the physical system and the associated uncertainty. This enables predictive decision-making about maintenance and management actions. The computational procedure takes advantage of a preliminary offline phase which involves: (i) using physics-based numerical models and reduced-order modeling to overcome the lack of experimental data for civil applications under varying damage and operational conditions, while populating the datasets for training the deep learning models; (ii) learning the health-dependent control policy to be applied at each time step of the online phase, to map the belief over the digital state onto actions feeding back to the physical asset.
The proposed strategy has been assessed against the simulated monitoring of an L-shaped cantilever beam and a railway bridge. In the absence of experimental data, the tests have been carried out considering high-fidelity simulation data corrupted with an additive Gaussian noise. The obtained results have demonstrated the digital twin's capability to accurately track the digital state evolution under varying operational conditions, with relatively low uncertainty. The framework is also able to promptly suggest the appropriate control input, within at most two time steps of when the (unknown) ground truth structural health demands it.
Future research lines will investigate the ability of the digital twin to update the transition dynamics models by learning from previous data. As suggested by the railway bridge case study, this will allow for a more accurate prediction of the expected evolution of the digital state, thus enabling predictive decision-making better tailored to the monitored asset. Another aspect of interest concerns solving the planning problem induced by the probabilistic model using reinforcement learning algorithms, capable of taking into account a finite planning horizon representing the design lifetime of the asset.

NN_CL and NN_RG are trained by minimizing the following loss functions, respectively: where λ_CL and λ_RG denote the L2 regularization rates over the relative model parameters Θ_CL and Θ_RG. The loss functions L_R_CL and L_R_RG are minimized using the first-order stochastic gradient descent optimizer Adam [58], for a maximum of 250 allowed epochs. The corresponding learning rates η_CL and η_RG are initially set to 10^−3 and 10^−4, respectively, and decreased over 4/5 of the allowed training steps using a cosine decay schedule with weight decay equal to 0.05. The optimization is carried out considering an 80:20 splitting of the dataset for training and validation purposes, with 20% of the data randomly set aside to monitor the learning process. We use an early stopping strategy to interrupt learning whenever the loss function value attained on the validation set does not decrease for a prescribed number of patience epochs in a row. The hyperparameters and training options for NN_CL and NN_RG are reported in Tab. 1b and Tab. 2b, respectively.
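The learning-rate schedule and early-stopping rule described above can be sketched framework-free. `cosine_decay` and `EarlyStopping` are illustrative helpers, not the paper's code, and the Adam optimizer with weight decay is omitted:

```python
import math

def cosine_decay(step, total_steps, eta0, decay_fraction=0.8):
    """Cosine decay of the learning rate over the first `decay_fraction`
    (here 4/5) of the allowed training steps, constant afterwards."""
    t = min(step / (decay_fraction * total_steps), 1.0)
    return eta0 * 0.5 * (1.0 + math.cos(math.pi * t))

class EarlyStopping:
    """Stop when the validation loss has not decreased for `patience`
    consecutive epochs."""
    def __init__(self, patience):
        self.patience, self.best, self.bad = patience, math.inf, 0
    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad = val_loss, 0
        else:
            self.bad += 1
        return self.bad >= self.patience  # True -> interrupt training
```

The schedule starts at the initial learning rate (e.g. 10^−3 for NN_CL) and reaches zero at 4/5 of the allowed steps.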

Figure 1 :
Figure 1: Predictive digital twin framework for civil engineering structures: graphical abstraction of the end-to-end information flow enabled by the probabilistic graphical model.

Figure 2 :
Figure 2: Dynamic decision network encoding the asset-twin coupled dynamical system. Circle nodes denote random variables, square nodes denote actions, and diamond nodes denote the objective function. Bold outlines denote observed quantities, while thin outlines denote estimated quantities. Directed solid edges represent the variables' dependencies encoded via conditional probability distributions, while directed dashed edges represent the variables' dependencies encoded via deterministic functions.

Figure 3 :
Figure 3: Dynamic decision network employed to predict the future evolution of the digital state and the associated uncertainty. Circle nodes denote random variables, and diamond nodes denote the objective function. Directed solid edges represent the variables' dependencies encoded via conditional probability distributions, while directed dashed edges represent the dependencies encoded via deterministic functions.
The estimated counterpart of b is obtained as b̂ = NN_CL(U). By employing a Softmax activation function for the output layer of NN_CL, the entries of b̂ = (b̂_0, ..., b̂_{N_y})⊤ ∈ R^{N_y+1} are interpreted as the confidence levels b̂_m by which U is assigned to the m-th damage class, with m = 0, ..., N_y. In particular, the Softmax activation function converts the real-valued vector a = (a_0, ..., a_{N_y})⊤ ∈ R^{N_y+1}, provided by the output layer of NN_CL, into a discrete probability distribution as: b̂ = Softmax(a), with b̂_m(a) = exp(a_m) / Σ_{k=0}^{N_y} exp(a_k), for m = 0, ..., N_y.
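The Softmax conversion above can be implemented in a numerically stable way by subtracting the maximum logit before exponentiating, a standard trick that leaves the result unchanged:

```python
import numpy as np

def softmax(a):
    """Convert the output-layer vector a into a discrete probability
    distribution over the N_y + 1 damage classes."""
    e = np.exp(a - a.max())   # subtract the max for numerical stability
    return e / e.sum()

b_hat = softmax(np.array([2.0, 1.0, 0.1]))
```

The entries of `b_hat` are positive and sum to one, so the largest logit maps to the highest confidence level.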

Figure 5 :
Figure 5: L-shaped cantilever beam - Confusion matrix measuring the offline performance of the DL models in correctly categorizing the digital state. Results are reported in terms of classification accuracy, measuring how observational data are classified with respect to the ground truth digital state. Digital states are ordered first by damage location and then by damage level.

Fig. 6 depicts a simulated online phase of the DT up to time step t_c = 50. Results are reported in terms of the (unknown) ground truth digital state and the corresponding DT estimate after assimilating the observational data. The graphs report the evolution of the digital state only for the damaged regions; nevertheless, damage can potentially affect any of the Ω_1, ..., Ω_7 predefined regions. The DT proves capable of accurately tracking the digital state evolution with relatively low uncertainty. The corresponding estimation of the control inputs is reported in the bottom part of the figure, demonstrating that the DT is able to promptly suggest the PM action within one time step of when the (unknown) ground truth structural health demands it.

Figure 6 :
Figure 6: L-shaped cantilever beam - Online phase of the digital twin framework with two possible actions: DN (do nothing) and PM (perfect maintenance). Probabilistic and best point estimates of: (top) digital state evolution against the ground truth digital state; (bottom) control inputs informed by the digital twin, against the optimal control input under ground truth. In the top panels the background color corresponds to p(D_t | D_t−1, D_NN_t, U_A_t−1 = u_A_t−1). In the bottom panel it corresponds to p(U_t | D_t).

Fig. 7 depicts the predicted evolution of the digital state and of the corresponding informed control inputs, starting from t_c = 50. The prediction horizon extends 20 time steps into the future, so that t_p = t_c + 20. The DT prediction engine informs about the expected future degradation of the structural health, allowing future interventions to be planned.

Figure 8 :
Figure 8: L-shaped cantilever beam - Online phase of the digital twin framework with four possible actions: DN (do nothing), PM (perfect maintenance), MI (minor imperfect maintenance), and MA (major imperfect maintenance). Probabilistic and best point estimates of: (top) digital state evolution against the ground truth digital state; (bottom) control inputs informed by the digital twin, against the optimal control input under ground truth. In the top panels the background color corresponds to p(D_t | D_t−1, D_NN_t, U_A_t−1 = u_A_t−1). In the bottom panel it corresponds to p(U_t | D_t).

Figure 9 :
Figure 9: L-shaped cantilever beam - Digital twin future predictions with four possible actions: DN (do nothing), PM (perfect maintenance), MI (minor imperfect maintenance), and MA (major imperfect maintenance). The starting time is t_c = 21. In the top panel the probability p(D_t | D_t−1, U_t−1) relates to the amount of damage in Ω_5. In the bottom panel it corresponds to p(U_t | D_t).

Figure 10:
It is an integral concrete portal frame bridge located along the Bothnia Line near Hörnefors, Sweden. It features a span of 15.7 m, a free height of 4.7 m, and a width of 5.9 m (edge beams excluded). The thickness of the structural elements is 0.5 m for the deck, 0.7 m for the frame walls, and 0.8 m for the wing walls. The bridge is founded on two plates connected by stay beams and supported by pile groups. The concrete is of class C35/45, with mechanical properties E = 34 GPa, ν = 0.2, and ρ = 2500 kg/m^3. The superstructure consists of a single track with sleepers spaced 0.65 m apart, resting on a ballast layer 0.6 m deep, 4.3 m wide, and with density ρ_B = 1800 kg/m^3. The geometrical and mechanical modeling data have been adapted from former research activities on the relevant soil-structure interaction, see [53, 54].

Figure 12 :
Figure 12: Railway bridge - Confusion matrix measuring the offline performance of the DL models in correctly categorizing the digital state. Results are reported in terms of classification accuracy, measuring how observational data are classified with respect to the ground truth digital state. Digital states are ordered first by damage location and then by damage level.

Figure 13 :
Figure 13: Railway bridge - Online phase of the digital twin framework with three possible actions: DN (do nothing), PM (perfect maintenance), and RE (restrict operational conditions). Probabilistic and best point estimates of: (top) digital state evolution against the ground truth digital state; (bottom) control inputs informed by the digital twin, against the optimal control input under ground truth. In the top panels the background color corresponds to p(D_t | D_t−1, D_NN_t, U_A_t−1 = u_A_t−1). In the bottom panel it corresponds to p(U_t | D_t).

Figure 14 :
Figure 14: Railway bridge - Digital twin future predictions with three possible actions: DN (do nothing), PM (perfect maintenance), and RE (restrict operational conditions). The starting time is t_c = 5. In the top panel the probability p(D_t | D_t−1, U_t−1) relates to the amount of damage in Ω_5. In the bottom panel it corresponds to p(U_t | D_t).
Algorithm 1 Online phase - algorithmic description
Input: observational data O_t = o_t
1: assimilate o_t with the DL models to provide d_NN_t. ▷ (O_t) → (D_NN_t)
2: infer D_t and U_t by updating d_t−1, given u_A_t−1, d_NN_t, and the CPTs encoding the ϕ_history_t and ϕ_control_t factors. ▷ (D_t−1, D_NN_t, U_A_t−1) → (D_t, U_t)
3: infer the future evolution of D_t and U_t, given the updated d_t, and the CPTs encoding the ϕ_history_t and ϕ_control_t factors.

Algorithm 2 Offline phase - algorithmic description
1: assemble the snapshot matrix of the structural response via FOM analyses.
2: compute the POD basis functions via singular value decomposition of the snapshot matrix.
3: use the ROM to populate the training dataset D with vibration recordings at the sensor locations.
4: use the recordings and labels in D to derive D_CL and D_RG.
5: train the classification model NN_CL on D_CL and the regression models NN_RG on D_RG.
6: test the generalization capabilities of NN_CL and NN_RG on noisy FOM data.
7: compute the confusion matrix encoding the ϕ_NN_t factor.
8: compute the control policy π(D_t) by solving the planning problem induced by the PGM.
9: return trained DL models, ϕ_NN_t factor, control policy π(D_t).