A frequency domain approach for local module identification in dynamic networks

In classical approaches of dynamic network identification, in order to identify a system (module) embedded in a dynamic network, one has to formulate a Multi-input-Single-output (MISO) identification problem that requires identification of a parametric model for all the modules constituting the MISO setup including (possibly) the noise model, and determine their model order. This requirement leads to model order selection steps for modules that are of no interest to the experimenter which increases the computational complexity for large-sized networks. Also, identification using a parametric noise model (like BJ method) can suffer from local minima, however neglecting the noise model has its impact on the variance of the estimates. In this paper, we provide a two-step identification approach to avoid these problems. The first step involves performing a non-parametric indirect approach for a MISO identification problem to get the non-parametric frequency response function (FRF) estimates and its variance as a function of frequency. In the second step, the estimated non-parametric FRF of the target module is smoothed using a parametric frequency domain estimator with the estimated variance from the previous step as the non-parametric noise model. The developed approach is practical with weak assumptions on noise, uses the available toolbox, requires a parametric model only for the target module of interest, and uses a non-parametric noise model to reduce the variance of the estimates. Numerical simulations illustrate the potentials of the introduced method in comparison with the classical identification methods.


Introduction
Real-life systems are becoming increasingly complex and largely interconnected, leading to high interest in the field of identification of large-scale interconnected systems known as dynamic networks.These networks can be considered as a set of measurable signals (the node signals) interconnected through linear dynamic systems and can be possibly driven by external excitation signals and/or process noise.The node signals can possibly be measured with sensor/measurement noise.Data-driven modeling in dynamic networks can be broadly classified into full network identification and local module identification.In the former, we deal with identification of the full network dynamics [17,34,38,40,42], including the aspects of identifiability [1,4,15,18,37,39], while the latter deals with identifying a particular module (system) in a dynamic network assuming that the topology of the network is given [5, 11-14, 20, 23, 28-31, 35, 36, 41].
In this paper, we deal with local module identification.In [12,35], the classical direct method [21] for closed loop identification has been generalized to the framework of dynamic network.A local direct method has been introduced in [30] for identification of dynamic networks with correlated noise.While in direct methods node signals are selected as predictor inputs, in indirect methods external excitation signals are the predictor inputs.Indirect identification methods for dynamic networks have been presented in [12,14].An errors-in-variables (EIV) identification framework for dynamic networks has been introduced in [11].The direct method [35] and the simultaneous minimization of prediction error approach [16] have been extended to a Bayesian setting in [28] and [13] respectively, where a regularized kernel-based method has been used to minimize the mean-squared error of the estimated target module.A generalized method that relaxes the sensing and actuation schemes in a dynamic network by combining the framework of direct and indirect method has been introduced in [31].All the aforementioned approaches are time-domain methods.Frequency domain methods for identification in dynamic networks are scarcely explored.A fully non-parametric frequency domain approach has been presented in [10].
In this paper, we aim at identifying a single target module in a dynamic network with reduced complexity, whose model order is assumed to be known.In direct approaches [12,30,35], we typically need to perform a MISO or MIMO identification that requires a parameterized model for all modules in the MISO or MIMO structure.This embedding of the estimation problem in a larger MISO/MIMO setting is necessary for dealing with the network environment and for handling confounding variables1 .This requires the modelling of additional modules that are not of prime interest to the experimenter, so called nuisance modules, including their model order selection using complexity criteria like -as elaborated in [21] -AIC (Akaike information criterion), BIC (Bayesian information criterion), CV (cross-validation).For a large-sized network, this task can be computationally demanding (see [28] for an example).Moreover, direct approaches lead to biased estimates in an EIV setting (i.e.dynamic networks with node signals measured with sensor noise).
Indirect approaches like the two-stage method [35] and the Instrumental Variable (IV) method [11] do not require handling of confounding variables and can be used in an EIV setting, but they also require the embedding of the single module in a larger MISO structure.In comparison with the direct method it is possible to slightly reduce the complexity by refraining from estimating a noise model, however it comes with the cost of increasing the variance of the estimates.This issue has been dealt with for general identification of SISO systems in frequency domain by a semi-parametric approach with a parametric model for the plant and a non-parametric noise model [32,33], and has been extended to MIMO systems in [26,27].This latter philosophy we are going to follow for addressing the above mentioned problems in the identification of a single module in a dynamic network.
In this paper, we develop a semi-parametric frequency domain approach that requires a parametric model only for the target module of interest and incorporates noise modeling through a non-parametric noise model.We follow a two-step approach, where the first step involves a MISO identification problem to get the non-parametric frequency response function (FRF) estimate of the target module and also its variance, thereby avoiding parametric models in the MISO setup.In the second step, the FRF estimate of the target module is smoothed using a parametric frequency domain estimator with the estimated variance from the previous step as the noise model, thereby reducing the variance of the estimate.The particular advantages of the approach are that we do not require any model order selection procedure, we do not need to parametrically model nuisance modules, and we obtain reduced variance results due to effective noise modeling.Consequently, the complexity of the estimation problem is not affected by the complexity of the network, but only determined by the local environment of the target module.The approach is practical, implemented using the already available MATLAB toolbox and can be implemented for a network with correlated noise and an EIV setting.In terms of achieved properties, our method comes close to the recently introduced regularized kernel-based method for dynamic networks [29], in which nuisance models are modeled as Gaussian processes.However this approach requires tuning of hyperparameters.After the presentation of the new method, numerical simulations are performed for a dynamic network example to show the effectiveness of the developed method compared with both the direct method and the regularized kernel-based method.

Problem statement
Following the setup of [35], we consider a dynamic network that is built up of L scalar measurable internal variables or nodes w j ptq, j = 1, . . ., L. Each internal variable is described as: where, ‚ q ´1 is the shift (delay) operator i.e. q ´1w j ptq " w j pt1 q; ‚ N j is the set of node indices k such that G 0 jl ı 0 ; ‚ G 0 jl pqq are proper rational transfer function for j " 1, . . ., L and k " 1, . . ., L; ‚ There are no self-loops in the network, i.e. nodes are not directly connected to themselves G jj " 0; ‚ v j ptq is the process noise, where the vector process v " rv 1 ¨¨¨v L s T is modelled as a stationary stochastic process with rational spectral density Φ v pωq, such that there exists a white noise process e :" re 1 ¨¨¨e L s T , with covariance matrix Λ ą 0 such that vptq " Hpqqeptq, where Hpqq is monic, square, stable and minimum-phase.The situation of correlated noise refers to the situation that Φ v pωq and H are non-diagonal; ‚ r j ptq is a measured external excitation signal entering node w j ptq.In some nodes, it may be absent.
On combining the equation for all nodes in the network, we obtain the network equation (time and frequency dependence is omitted below), The representation in ( 2) is an extension of the Dynamic Structure Function (DSF) representation [15].The dynamic network is assumed to be stable, i.e. pI ´G0 pqqq
Every node signal is measured using a sensor, due to which there can be measurement errors in the recorded measured node signal of w k .Therefore the measurement of w k accounting for the sensor noise is given by wk " w k `sk , where s k is the sensor noise and is modelled as a stochastic process with rational power spectral density.This will be the EIV setting in dynamic networks.The identification problem that we consider in this paper is to identify the target module of interest G 0 ji from measurements of w and possibly r.To this end, we choose a parameterization of G 0 ji pqq, denoted as G ji pq, θq, that describes the dynamics of the module of interest for a certain parameter vector θ " θ 0 P R n θ .
3 The direct approach

The standard direct method
The equation (1) represents a MISO structure and is the starting point of the direct method in [35].Let D j denote the set of indices of the internal variables that are chosen as predictor inputs and let R denote the nonzero external signals in r.Let w Dj denote the vector rw k1 . . .w kn s J , where tk 1 , . . ., k n u " D j and let r R denote the vector rr k1 . . .r kn s J , where tk 1 , . . ., k n u " R. In the standard direct method for dynamic networks [35], identification of a module G ji can be done by selecting a set of predictor inputs D j such that i P D j , and subsequently estimating a multiple-input single output model for the transfer functions in G jDj .This can be done by considering the one-step-ahead predictor 2 ŵj pt|t ´1q :" Ētw j ptq | w t´1 j , w t Dj u, and the resulting prediction error ( [21]) ε j pt, θq " w j ptq ´ŵ j pt|t ´1, θq, given by where arguments q and t have been dropped for notational clarity.The parameterized transfer functions G jk pq, θq, k P D j and H j pθq are estimated by minimizing the sum of squared (prediction) errors: V j pθq " where N is the length of the data set.Let θN denote the minimizing argument of V j pθq.We note that in the above formulation, the prediction error depends also on the additional parameters entering the modules that are not of interest, and on the noise model, which needs to be identified to guarantee consistent estimates of θ.Therefore, all these modules require a model order selection step that could have a detrimental effect in large-scale networks due to computational complexity.

Predictor input selection
Predictor input selection (i.e.selecting D j ) plays an important role in the direct method, for guaranteeing that the identified transfer function is an estimate of the target module G 0 ji and not a biased estimate of the target module.We call this as module invariance in networks.In [35], the set D j is chosen to be N j (i.e.all in-neighbors of the output of the target module).However, it is possible to choose a subset of N j in D j as predictor inputs provided certain conditions are satisfied [12].The main condition is given in Property 1.
Property 1 To identify a target module G ji , consider a set of internal variables w k , k P D j .Let D j satisfy the following properties: (1) i P D j and j R D j ; (2) Every path from w i to w j , excluding the path through G ji , pass through a node w k , k P D j ( parallel path condition); (3) Every loop through w j pass through a node w k , k P D j ( loop condition).
When the above property is satisfied, the direct method provides a consistent estimate of the target module, if data informativity conditions are satisfied, and in addition there are no confounding variables for the estimation problem w Dj Ñ w j .We use the identification setup 2 Ē refers to limNÑ8 1 N ř N t"1 E, and w ℓ j and w ℓ D j refer to signal samples wjpτ q and w k pτ q, k P Dj, respectively, for all τ ď ℓ.
of the direct method as the basis of the approach developed in this paper.4 The frequency domain approach

Introduction
A special semi-parametric two-step indirect approach can also be used for the identification of a single local module.In the first step, a nonparametric indirect method is used to get FRF and variance estimates of the local module that is involved in the problem.In a second step, the estimated nonparametric FRF of the local module of interest is smoothed using a parametric frequency domain estimator using the estimated variance as a nonparametric noise model.
The semi-parametric approach offers two major advantages.First, the method complexity is independent of the complexity of the network.The complexity of the first non-parametric step is set by the number of local modules that need to be estimated in order to isolate the local module of interest.Second, in the parametric step, a parametric model is only estimated for the local module of interest, in other words, the other modules are not entering into the problem.Moreover, only the plant model needs to be identified because a nonparametric noise model that is obtained in the first step is used as a frequency weighting.

Nonparametric FRF estimation
In this section we will use a special implementation of the Local Polynomial Method (LPM) [26] for estimating the FRF of the target module.The basic idea of LPM is to cope with leakage and transient issues and it brings two advantages: ‚ The approach is not restricted to periodic excitations, it works well with noise excitation as well.Of course, the advantages of using a periodic excitation still hold true and we still advocate to use periodic excitations whenever this is possible.‚ The LPM estimate provides an FRF estimate for the modules together with their co-variance matrices (as a function of the frequency).In this step there is almost no user interaction needed, and the complexity of the identification problem is (almost) independent of the size of the network.
We consider a straightforward, iteration free, direct implementation of LPM calculated per excited frequency lines as discussed in [8] and implemented in the SAMI toolbox (Simplified Analysis for Multiple Input [7] [9]).This toolbox offers a user friendly environment (GUI or Command line driven) to make a full nonparametric analysis of MIMO (non)linear systems.
Next, the LPM method is briefly elaborated.The LPM relies on the following frequency domain nonparametric baseline model at the excited frequency lines, for a generic input-output relationship: where, ‚ k represents an excited frequency line (bin) ‚ Y is the output signal in the frequency domain with n y outputs (dimension of Y is n y ˆN , dimension of Y pkq is n y ˆ1); ‚ U is the excitation signal in the frequency domain with Because the FRF and the transient are smooth [6], it is possible to use polynomial approximations such as the LPM.In the process, a (narrow) sliding processing window is used with a polynomial degree of d.First, an excited frequency line is selected (denoted by frequency index f ), this is called the central frequency.This is the middle point of the processing window.Around this frequency line, in a ˘d{2 radius, a narrow band is selected such that the r (radius) is given by In this band (f `r) all the (excited and non-excited) frequency lines are used to estimate the transfer functions in polynomial form.The measured output is at the excited frequency index f in a radius of r as follows: Y pf `rq " G p pf `rqU pf `rq `Tp pf `rq `Epf `rq where G p and T p are polynomials of order d.The polynomials are given by: The LPM parameters at the center frequency f can be estimated by the following LS cost function: where ηpf q represents the estimates of the LPM polynomials (i.e. the terms in (7,8)) at frequency index f .In (7) the estimated value of g 0 corresponds to G, i.e. the quantity of interest.
The LS problem stated in ( 9) is recommended to be solved output channel per output channel.In this case, for each frequency line there are pd `1qpn i `1q unknown parameters to be estimated; (pd `1qn i in G p , and d `1 in T p ).This means, for the algorithm to work, the bandwidth r should be at least as wide as the number of unknown parameters.In this work we consider second degree polynomials (d " 2).The restriction on the minimum bandwidth also implies that in the FRF measurements, in the (-3 dB) bandwidth of the lowest damped mode (the sharpest peak in the FRFs) should be at least pd `1qpn i `1q points measured.By further increasing the bandwidth of LPM, an increased smoothing of the nonparametric estimate over the frequency is obtained.However, a too large bandwidth will create a bias.Since the FRFs will be further smoothed in the second parametric step, the perfectly chosen parameters of the smoothing processing is less crucial.Note that ( 9) must be solved with the help of a numerically stable inversion method.

The nonparametric indirect method
The previous subsection contains the formal definition of the LPM method.In the dynamic network setup, an indirect method is used.
First, the transfer is estimated between the non-zero external excitation signals r k , k P R and the nodes w k , k P D j .This is reflected in the frequency domain through the following MIMO problem: W Dj pf `rq " S DjR pf `rqR R pf `rq `T pf `rq (10) where, This problem is solved with the help the proposed LPM method such that the estimated FRM ŜDjR is obtained between the reference signals r R and the nodes w Dj .Note, that for a unique solution, it is required that R R is full rank.
In the next step we use the estimated FRM to construct a disturbance-free version of the node signals that are going to be used for estimating the target module, ŴDj pf `rq " ŜDjR pf `rqR R pf `rq.(11) This is done by estimating the transfer between ŴDj and the output of the target module using the following relationship: W j pf `rq " G jDj pf `rq ŴDj pf `rq `T pf `rq (12) where ‚ W j is the measured node value of the output of the target module in the frequency domain; ‚ G jDj is the FRM between the target module and the nodes in w Dj .
This problem should be solved again with the help of the proposed LPM method such that the estimated FRM ĜjDj is obtained.We can now isolate the local module Ĝji in the FRM estimate ĜjDj .

Parametric processing
One must post-process the non-parametric local module Ĝji in order to obtain the parametric rational transfer function form Ĝji pθq.In this work, a weighted least squares estimation is used for the identification: where ‚ θ represents the parameters in the parametric local module estimate; ‚ F is the number of frequency components; ‚ σ2 Ĝji is the non-parametric co-variance estimate of the the local module estimate obtained from the LPM method [8,26].
In order to minimize the cost function given in ( 13) the FDIDENT toolbox is used [19].

Reflection
The presented method combines a two-stage indirect method for estimating a local module, and combines this with a non-parametric noise model to reduce the variance of the final parametric module estimate.The use of the non-parametric noise model is particularly enabled by the fact that the problem is addressed in the frequency domain.Additionally there is no need for parametrically modeling nuisance modules, typically present in MISO network estimation problems, and therefore extensive model order selection procedures are avoided.
In the next section, we compare the performance of the developed method in this paper with the nonparametric regularized kernel-based method developed in [28].In this method, keeping a parametric model for the module of interest, we model the impulse response of the remaining modules in the MISO structure as zero mean Gaussian processes with prior covariance matrix (kernel) given by the first-order stable spline kernel [3,25].This kernel encodes the properties like stability and smoothness of an impulse response.The kernels are dependent on hyperparameters.By tuning the hyperparameters, it is possible to reduce the mean squared error (MSE) of the estimates through bias-variance trade off [2].The hyperparameters need to be tuned such that the MSE is reduced.Using an Empirical Bayes (EB) approach [22], the hyperparameters are tuned and the target-module parameters are estimated by maximizing the marginal likelihood of the data.This method, similar to the method developed in this paper, circumvents the problem of model order selection for the nuisance modules in the identification setup.Even though the compared regularized kernel-based method circumvents the problem of model order selection, it involves the complexity of tuning the hyperparameters.However, this is avoided in the developed method.

Numerical Simulations
Numerical simulations are performed with a dynamic network example as given in Figure 1, to evaluate the performance of the developed method discussed in Section 4, which we abbreviate as "iLPM + ELiS" (indirect Local Polynomial Method combined with ELiS cost function).We compare the method with the standard direct method "DM+TO" [35] and the Empirical Bayes Direct Method ("EBDM") [28].The goal is to identify G 0 31 .The network example is the same network as in [29], with the only difference in the presence of r 1 signal.The network modules in Figure 1 are given by, We run 100 independent Monte Carlo experiments where the data is generated using known reference signals r 1 ptq, r 2 ptq and r 4 ptq that are realizations of independent white noise with variance of 0.1.The number of data samples is N = 500.The noise sources e 1 ptq, e 2 ptq, e 3 ptq and e 4 ptq have variance 0.05, 0.08, 1, 0.1, respectively.We assume that we know the model order of G 0 31 pqq.In the case of the direct method, we solve a 3-input/1-output MISO identification problem with w 1 ptq, w 2 ptq and w 4 ptq as inputs.For the direct method, we consider that the model orders of all modules in the MISO setup are known.The EBDM [28] considers the target module G 0 31 in the above MISO identification problem as parameterized and the rest of the modules (nuisance modules) in the MISO setup are modeled as zero mean independent Gaussian processes, whose covariance matrices are described by a stable spline kernel which encodes stability and smoothness of its impulse response.Each nuisance module is described only by two hyperparameters of the kernel, thereby reducing the number of parameters.The model order of the target module is considered to be known.The length of the impulse response of each module in the EBDM is considered to be ℓ " 50.The indirect LPM (iLPM) method uses a second order polynomial.For this configuration the minimal required bandwidth should be at least 12 (i.e.pd `1qpn i `1q " 3 ˆ4).Due to the excessive noise on the signals, the bandwidth should be higher.Using a simple Least Squares fitting method, the bandwidth has been set to 24, which is sufficient to reduce the effect of the noise and still give a small bias error.
To evaluate the performance of the methods, we use the standard goodness-of-fit metric, where g 0 ji is the true value of the impulse response of G 0 ji , ĝji is the impulse response of the estimated target module and ḡji is the vector with the sample mean of values in g 0 ji as its elements.The box plots of the fits of the estimated impulse response of G 31 pqq are shown in Figure 2, where we have compared the Direct method with true model orders ('DM+TO'), the Empirical Bayes Direct Method ('EBDM'), and the developed method ('iLPM+ELiS').A Box plot shows the following distribution properties: minimum/maximum values (buttom/top lines), 1st-3rd quartiles (rectangular area), median (line in the middle), and (dotted) outliers.It can be noted that the median of the fit of the developed method is higher that the other considered methods.Figure 3 shows the mean and standard deviation of the parameter estimates of G 31 .It is evident that the developed method ('iLPM+ELiS') gives a smaller bias and a reduced variance (in particular when considering parameter b 2 ) compared to the other considered identification methods.This can also be confirmed from the box plot of the parameters of Ĝ31 in figure 4. The reduction in variance is attributed to the implementation of a nonparametric noise model and it can be witnessed that it is on par with the EBDM which is expected to provide reduced variance due to incorporation of regularization in its approach.The mean and standard deviation on the mean of the FRF estimated from 100 MC simulations is shown in figure 5.It can be observed frequency response plot that the error between the true value and the mean value is far below the standard deviation, indicating that the developed method is unbiased.
Considering a relatively small sized network with 3 mod- G31-mean G EBDM Fig. 5. FRF plot showing 1) the error between the true value and the mean value (dotted lines) and 2) the standard deviation on the mean of the FRF (dashed lines), for the developed method (red) and EBDM (blue) calculated from 100 realizations.
ules in the MISO structure, the developed method proves effective.When the size of the network grows, the results of the direct method may deteriorate further due to increase in variance; furthermore, it is expected that in large networks the model order selection step contributes to inaccurate results.The developed method requires a parametric model only for the target module and circumvents the problem of model order selection.Also, the incorporation of a non-parametric noise model substantially reduces the variance in the target module estimate and also avoids the problem of local minima that can occur in a parametric noise model as in the direct method.Thus the developed method is expected to serve as an effective local module identification method also in large dynamic networks.
Remark 1 Being an indirect approach, the developed method also provides consistent estimates in an EIV setting [35], thereby allowing our measurements to be distorted by sensor noise.

Conclusions
A two-step indirect local module identification method has been developed in this paper.The developed method ('iLPM+ELiS') avoids the problem of model order selection for all the modules that are of not interest to the experimenter, but still needs to be estimated in order to obtain consistent estimate of the target module.Due to the incorporation of a non-parameteric noise model, we achieve a clear benefit in the mean square error of the estimated target module and also the benefit of avoiding possible local minima when a parameteric noise model in used.Therefore, the developed method is less complex and scalable to large sized networks.This is the most important advantage: the complexity of the problem is independent of the complexity of the network.It only depends on the complexity of the block to be identified.Furthermore, the method is built on using the already available toolbox SAMI and FDIDENT, making it practical and user-oriented.Numerical simulations performed with a dynamic network example illustrate the potentials of the developed method on comparison with already available methods.The developed method provide target module estimates with greatly reduced bias and variance in comparison with the other compared methods for local module identification.

W
Dj is the node value in the frequency domain for the node signals in w Dj ; ‚ R R represents the reference signals r R in the frequency domain; ‚ S DjR is the Frequency Response Matrix (FRM) i.e. collection of FRFs, between the reference signals in r R and node signals w Dj ; ‚ T represent the (autonomous) transient term.

Fig. 2 . 2 Fig. 3 .
Fig. 2. Box plot of the fit of the impulse response of Ĝ31 obtained by the developed method, Direct method and EBDM.Number of data samples used for estimation is N = 500.

2 Fig. 4 .
Fig. 4. Box plot of the parameters of Ĝ31 obtained by the developed method, Direct method and EBDM.Number of data samples used for estimation is N = 500.