Optimal Design for Parameter Estimation in Eeg Problems in a 3d Multilayered Domain

The fundamental problem of collecting data in the " best way " in order to assure statistically efficient estimation of parameters is known as Optimal Experimental Design. Many inverse problems consist in selecting best parameter values of a given mathematical model based on fits to measured data. These are usually formulated as optimization problems and the accuracy of their solutions depends not only on the chosen optimization scheme but also on the given data. We consider an electromagnetic interrogation problem, specifically one arising in an electroencephalography (EEG) problem, of finding optimal number and locations for sensors for source identification in a 3D unit sphere from data on its boundary. In this effort we compare the use of the classical D-optimal criterion for observation points as opposed to that for a uniform observation mesh. We consider location and best number of sensors and report results based on statistical uncertainty analysis of the resulting estimated parameters.


Introduction
In a series of recent works [6,7,8,11,12] several authors have developed a design framework based on the Fisher Information Matrix (FIM) for a system of differential equations to determine when and where an experimenter should take samples and what variables to measure in collecting information on a physical or biological process that is modeled by a vector dynamical system.This framework has also been proposed for use in inverse problem methodologies in the context of dynamical system or mathematical model parameter estimation when one investigates the sufficiency of the number of observations of one or more states (variables).Experimental design using the Fisher Information Matrix (FIM), which is based on sensitivity functions (traditional and generalized), is described in [7] for the case of scalar data.In [8], the authors develop an experimental design theory using the FIM to identify optimal sampling times for experiments on physical processes (modeled by an ODE system) in which scalar or vector data will be taken.The methodology can be readily applied to problems involving ordinary, partial and delay differential equations dynamics but requires both a mathematical model and a statistical model.Early efforts published in this area were concerned with parameter estimation for one dimensional dynamic systems in one space or time dimension.More recently, these ideas were successfully applied in [11] for an experimentally validated six-compartment HIV model and a thirty-eight dimensional enzyme kinetics model of the Calvin Cycle in spinach.
In [13] and [14] proof-of-concept numerical results for a distributed parameter system in a 3D one layer spherical domain are presented for several different design criteria (D-optimal, SE-optimal, IGSF-optimal).In this present effort, a more general case in 3D is studied.Motivated again by classical problems in EEG analysis, we consider a stationary process modeled by a Poisson type equation with multiple interfaces given by ∇ • (κ(x)∇u(x, θ)) = g(x, θ) x ∈ Ω. ( Here θ ∈ R p is the parameter to be estimated, Ω is the multilayered sphere in R 3 , κ(x) is a piecewise positive constant conductivity function, u(x, θ) ∈ R, x ∈ ∂Ω is the output and g : R 3+p → R is the source.For this investigation we suppose that there exists a real value θ 0 such that the equation (1) accurately describes the process.
For the estimation process, we formulate a statistical model [35] of the form (absolute error) where θ 0 is the vector of true values of the unknown parameters and E is a vector random process that represents observation error for the measured variables.Realizations of the statistical model (2) can be written as In almost every real problem only a discrete set of output data is available.We denote {y 1 , ..., y n } the measurements at Λ = {x 1 , . . ., x n } ⊂ ∂Ω.In this context, the parameter value θ 0 may be estimated by an Ordinary Least Square (OLS) procedure yielding an estimate θ.That is, θ = arg min θ∈A J Λ (θ) where A is the set of admisible parameter values and J Λ (θ) denotes the square errors between the measured and simulated outputs at the observation points, i.e., Different choices of the points x 1 , . . ., x n ∈ Λ may lead to different estimates, and thus it is important to look for the optimal set of observation points that will lead to an accurate parameter estimation.This is the purpose of the so-called Optimal Design methods, like D-Optimal, SE-Optimal, E-optimal design methods (see [6,7]).For the sake of clarity, in the remainder of this article we omit the subindexes Λ on J.
The results presented here will clearly illustrate the advantages of performing some type of design such as D-Optimal design in connection to the estimation of the parameters in the problem described by (1).

The EEG Inverse Problem
The electrical activity of the brain consists of currents generated by biochemical sources at the cellular level.The electric and magnetic fields that they produce can be estimated by means of Maxwell equations (see [27,33]).Based on the properties of the tissues involved, the velocity of propagation of the electromagnetic waves caused by potential changes within the brain is such that the effect may be detected simultaneously at any point in the brain or in the surrounding tissues.This fact justifies the use of a static approximation of Maxwell equations.This approximation uncouples the equations for the magnetic and electric fields leading to a 3D Poisson-type equation with interfaces that relates the electric potential u in the head with the impressed current C.
The equation that relates the electric potential u and the impressed current C reads: where the volume Ω represents the head, ν is the external normal vector and κ is the conductivity function.
The impressed current is often represented by an electric dipole, C(x) = q δ(x − r q ), where δ is the Dirac distribution, r q is a fixed point in the brain which represents the dipole location, and q is the dipole moment [27].Observe that, since a static approximation is considered, the values of u(x) do not depend on time; instead they correspond to a precise instant of the underlaying process.
A simplified geometric model is usually considered where the head is represented by a 3D three-layered sphere.The nested subsets Ω 3 , Ω 2 \Ω 3 and Ω 1 \Ω 2 correspond to the brain, skull and scalp, respectively.The location of the dipole r q ∈ Ω 3 (see Fig. 1) is in the brain.The function κ(x) represents the conductivity of Assuming that the potential and its normal derivative multiplied by the conductivity function are continuous across the transition surfaces, the following interface conditions are imposed where [•] Si denotes the difference between the values of the functions inside the brackets through the interface surfaces S i = ∂Ω i , i = 1, .., 3 that surround the different subsets.
In this framework, the forward problem of EEG consists in finding the electric potential u in the head for a given current source C. Conversely the inverse problem of EEG consists in finding the location r q of the electric dipole and its moment q, for a given scalp potential u.The remaining parameters of the model κ 1 , κ 2 , κ 3 , as well as the radii of the spheres, are supposed to be known.We recall that the values of u(x), x ∈ ∂Ω, do not depend on time, since they are collected at a precise instant.
Figure 2: Forward and inverse problems of EEG Electroencephalographic source localization by noninvasive techniques is an area of interest in clinic epileptology, in particular concerning models of dipolar and distributed sources for the investigation of focal epilepsy.In this context the scalp data corresponding to an instant of a seizure should be considered.An accurate solution to the inverse problem could provide useful information to determine the location of epileptogenic zones within the brain from EEG recordings.In the last decade several authors have been working in this area.Source models were analyzed in [20,34,38] while forward and inverse problem solutions were studied in [28,31,32], among others.In [16], the authors developed a method for EEG source localization based on rational approximation techniques in the complex plane and they presented results using simulated data.We refer to the references therein for a review of the principal results concerning the inverse problem in EEG.
Since in practice the scalp potential u is measured only at a finite set of points x 1 , • • • , x n , on the scalp where the electrodes are placed, the inverse problem of EEG consists in recovering r q and q from u(x 1 ), Challenged by this problem, we analyze the corresponding mathematical model considering that only a finite set of potential values are available.Parameter estimation techniques and optimal design schemes are performed to solve the inverse problem accurately.

Mathematical model
Motivated by the above formulation we consider the following second order elliptic partial differential equation (PDE) where Ω is the unit ball of R 3 , and κ is a positive piecewise constant function defined by where Ω 1 = Ω and Ω 2 , Ω 3 are balls centered at the origin with radii r 2 , r 3 such that 0 < r 3 < r 2 < 1.The mathematical formulation is completely defined with the following interface and boundary conditions: where [•] Si denotes the difference between the values of the functions inside the brackets through the surface S i = ∂Ω i and η is the external normal vector.This type of system appears in a number of applied problems such electrostatic interrogation systems, medical imaging, geophysical exploration and nondestructive testing, etc, [2,3,4,5,10,15,17,18,24,25,26,29,30].
Once again based on the formulations discussed above, we consider a function F of the form F (x, θ) = qδ(x − r q ), where δ denotes the dirac distribution.Then the source ∇ • F (x, θ) could represent for example an electric dipole with moment q = (q 1 , q 2 , q 3 ) ∈ R 3 and location r q = (r q1 , r q2 , r q3 ) ∈ Ω 3 .The parameter of the model is then θ = (r q , q) ∈ R 6 .Existence and uniqueness of a solution u(x, θ) to this case have been studied in [19] and its dependence with respect to Ω and κ appears in [36,37], respectively.
In [19] the author gives an explicit formula in terms of a series, for u and its derivatives with respect to the parameters for spherical domains.This formula allows one to compute u at any point of Ω given a dipole that can be located anywhere in Ω.Since we are only interested in computing u(x) for x in the boundary ∂Ω of the domain and with a dipole located in the innermost layer, i.e., r q ∈ Ω 3 , we will only recall the expression of u in this situation.The formula for u given in [19] is where cos ω = rq rq x x is the cosine of the angle formed by the observation point x, the dipole location r q and the origin, and S 0 , S 1 are given by Here r 0 = r q , r = x , P n is the n-th Legendre polynomial and P n its derivative which can be computed from the recursive formula Finally, r q ∈ Ω 3 implies r 0 < r 3 , hence the definitions of R n and R n given in [19] reduce to where A = {U 1 (r 1 , r 2 )U 2 (r 2 , r 3 )U 3 (r 3 , 0)} 22 , and the matrix U j (r a , r b ) is defined as (13) We can thus rewrite R n and R n as so that The formula (9) with S 0 and S 1 given by the previous formula ( 15) allow us to compute easily the solution to (5) when r q ∈ Ω 3 and x ∈ ∂Ω.

Inverse problem formulation
We consider the inverse problem or parameter estimation for the mathematical model described by (5).It consists in estimating the unknown parameter θ = (r q , q) ∈ R 6 from the data y 1 , .., y n , corresponding to n observation points at x 1 , .., x n on the boundary ∂Ω.
We choose the Ordinary Least Square (OLS) method to calculate estimates θ of θ 0 ; that is, we minimize where A = {(r, q)|r ∈ Ω 3 }.We compute the solution u of our model by means of ( 9).The OLS-estimator θ is then θ = arg min θ∈A J(θ).
As already noted, a statistical model is necessary to study and implement inverse problem techniques properly.Here we take the absolute error model (2) with realizations (3).In this context the inverse problem consists of the estimation of the unknown parameters θ 0 from the data where we assume that the additive noises 1 , .., n are independent realizations of a centered normal random variable with variance σ 2 .
Recall that θ is a realization of a random variable Θ.It can be proved that under suitable hypothesis (see [9,35]), Θ has asymptotically normal distribution where F (x 1 , .., x n , θ) ∈ R 6×6 is the usual Fisher Information Matrix defined by The partial derivatives ∂u ∂θj (x k , θ) are the traditional sensitivity functions that, assuming smoothness on u, quantify the variations in u with respect to changes in the j th component of the parameter θ.A precise discussion of the hypothesis and the approximations involved in the above theory is given in [9].
For our model representation, the sensitivities of u with respect to θ can be easily computed by directly differentiating (9).The computation of the derivative ∇ q u of u with respect to the moment q of the dipole is immediate from (9).For the derivative ∇ rq u the author in [19] found that where S 0 and S 1 are given in (15), and S 3 , S 4 are defined as Here P n is the 2nd derivative of P n that can be calculated from (11), and one can use the expressions ( 14) of R n and R n to make the simplifications, and obtain Then using we find

Optimal Design
It is often useful to have some criteria to determine where samples should be taken for any type of interrogation problem, especially those that can be expensive and/or invasive.This is the goal of the optimal design techniques: to search for the optimal set of observation points in order to carry on the estimation procedures.Different criteria generally give rise to different selections of observation points.In most cases optimal design methods for parameter estimation problems choose the sampling distribution by minimizing a specific cost function related to the error or to the accuracy in parameter estimates.Data collected in this optimal way will lead to parameter estimates with increased accuracy.
In view of the asymptotic distribution ( 17)- (18) it is natural to choose the points x i that minimize F (x 1 , .., x n , θ 0 ) in some sense (see [6,7]).A well-known and widely used optimal design method is the D-optimal criteria that consists in minimizing det F (x 1 , ...x n , θ) −1 .Geometrically, this corresponds to minimizing the volume of the confidence ellipsoid for the covariance matrix Cov = σ 2 F −1 .
With that purpose we choose the set Λ D = {x 1,D . . ., x n,D } ⊂ ∂Ω of n observation points where the measurements {y 1 , ..., y n } are to be obtained by minimizing the function starting with some initial set of points and considering θ 0 as an initial guess value for the parameter.After having selected the observation points, we perform OLS with the optimal set Λ D and the initial guess θ 0 .In the next section we present a detailed description of the way our numerical calculations were performed for the problems of interest to us in this presentation.

Numerical Preliminaries
Since we consider simulated data, we are able to study the actual errors and analyze the performance of Optimal Design for an optimal selection of data.
The values used in the numerical simulation are given in the following two tables.Two different electric dipoles were chosen to illustrate a general behaviour.The first potential dipole (Example 1) is defined by F (x, y, z) = (3, 4, 0) δ((x, y, z) − (0.3, 0.4, 0))).It is a parallel dipole since its moment q = (3, 4, 0) is parallel to the vector position r q = (0.3, 0.4, 0).The second electric dipole (Example 2) is given by F (x, y, z) = (2, −1, 1) δ((x, y, z) − (0.3, 0.4, 0))).Note that we kept the dipole position while its moment is significantly different.These parameter values and their respective initial relative errors are shown in Table 2 The observation data is simulated as where the values u i = u(x i , θ 0 ), i = 1, • • • , n are computed as a sum of 20 terms of the form given in (9), for θ 0 = (0.3, 0.4, 0, 3, 4, 0) in the first example and θ 0 = (0.3, 0.4, 0, 2, −1, 1) in the second.The observation noise i at each observation point x i is calculated by the Matlab function randn with standard deviations σ = 0, 0.05, 0.1, 0.15 and 0.2 to produce noise free data as well as data sets with nontrivial increasing noise levels.
Note that the observation data is taken for the outer surface of the domain which is considered as a unit sphere (r 3 = 1).Hence, the observation points can be written as (x, y, z) = (sin(α) cos(φ), sin(α) sin(φ), cos(α)), with (α, φ) ∈ [0, π]×[0, 2π].We define a uniform grid on spherical coordinates with constant step ∆α = π/30 for α and ∆φ = 2 * π/30 for φ.This gives us 31 × 31 grid points in the rectangle [0, π] × [0, 2π] of the form (α i , φ j ) where α i = (i − 1) * ∆α, i = 1, ..., 31, φ j = (j − 1) * ∆φ, j = 1, ..., 31.The gridpoints are numbered on the rectangular grid from left to right and from bottom to top as shown in Fig. 3 (left).The spherical gridpoints are shown in Fig. 3, on the right.Note that the bottom line of gridpoints correspond to α = 0 and the top line to α = φ.Each of these lines correspond to only one point on the unit sphere, they are (0, 0, 1) and (0, 0, −1) in cartesian coordinates, which are the poles.On the other hand, the extreme points of the same line are the same, due to the periodicity of trigonometrical functions.One needs to take these facts into account when considering observation points.
As mentioned before, the estimated value for the parameter is based on the initial set Λ of observation points.
We ran numerical experiments considering different numbers of measurements from n = 7 to n = 20.For each case, the points x i , i = 1, .., n in Λ are chosen from the set of observable gridpoints as follows.First, we fix the first and the last points of Λ at the gridpoints numbered as 100 and 650, respectively; that is x 1 = xg 100 , x n = xg 650 .In this way we are sure that they fall far from the poles.The rest of the points are uniformly distributed on the numbered grid, which means

Parameter estimation
We obtain two estimates θΛ and θD of θ 0 performing OLS with the initial guess θ g and two different sets of observations points x 1 , . . ., x n : • the estimate θΛ is obtained performing OLS with the initial guess θ g and the set Λ = {x 1 , . . ., xn } of observation points given by ( 21).
• we obtain a second estimate θD of θ 0 applying OLS using the observation points arising from the D-optimal design criteria as follows.We first look for a set Λ D = {x D 1 , . . ., x D n } of n observation points minimizing the function det F (x 1 , ...x n , θ g ) −1 starting with Λ as initial observation points.We then perform OLS with θ g as initial guess for θ and the observation points in Λ D .
The OLS-estimate is computed by the Matlab function lsqnonlin with the same options as fmincon except that in this case we set TolFun=1e-10.
and {u i,Λ } 1≤i≤n defined in (20) using the set of observation points Λ D and Λ respectively.We then compute the estimated variances σ2 Λ and σ2 D defined as The standard errors SE( θΛ ), SE( θD ) ∈ R 6 are then defined as where k = 1, . . ., 6, and F is the Fisher matrix defined in (18).The approximate CI at the (1 − α)% level for the k-th component θ 0,k of θ 0 corresponding to θΛ and θD are then given respectively by [ θD,k The results for r q (1) (the first component of r q ) and q(1) (the first component of q) are shown in Figures ( 6)-( 9). Figure 6 depicts the results when σ = 0.05.From top to bottom: The mean relative errors, length of the confidence intervals (built using formula (24) (in blue) and ( 25)) and the confidence intervals.On the left are the results for r q (1) and on the right are the results for q(1).Similar results for σ = 0.1, 0.15, 0.2 are depicted in Figures 7 -9. ) and ēD (q) (using D-Optimal design) as functions of n and σ. Figure 6: Results for noisy realizations with σ = 0.05 using D-optimal design (red) and no optimal design (blue).From top to bottom: Mean relative errors ēD (r q ) and ēΛ (r q ) (right) and ēD (q) and ēΛ (q) (left); length of confidence intervals for r q (1) (right) and q(1) (left); confidence intervals at 90 % confidence level for r q (1) (right) and q(1) (left).Figure 7: Results for noisy realizations with σ = 0.1 using D-optimal design (red) and no optimal design (blue).From top to bottom: Mean relative errors ēD (r q ) and ēΛ (r q ) (right) and ēD (q) and ēΛ (q) (left); length of confidence intervals for r q (1) (right) and q(1) (left); confidence intervals at 90 % confidence level for r q (1) (right) and q(1) (left).Figure 8: Results for noisy realizations with σ = 0.15 using D-optimal design (red) and no optimal design (blue).From top to bottom: Mean relative errors ēD (r q ) and ēΛ (r q ) (right) and ēD (q) and ēΛ (q) (left); length of confidence intervals for r q (1) (right) and q(1) (left); confidence intervals at 90 % confidence level for r q (1) (right) and q(1) (left).Figure 9: Results for noisy realizations with σ = 0.2 using D-optimal design (red) and no optimal design (blue).From top to bottom: Mean relative errors ēD (r q ) and ēΛ (r q ) (right) and ēD (q) and ēΛ (q) (left); length of confidence intervals for r q (1) (right) and q(1) (left); confidence intervals at 90 % confidence level for r q (1) (right) and q(1) (left).

Conclusions
In this paper we have investigated a typical interrogation problem such as those used in EEG.We have shown the value of using some type of optimal design criterion (such as those studied in [6,7,8,11,12]) in determining how to best collect data.From the numerical results, we would conclude that: • D-optimal design techniques provide a set observations points leading to a more accurate estimate of the parameters of interest.The results of this paper emphatically demonstrate the benefits of using some type of optimal design criterion (D-optimal in this case) in deciding how data should be collected in a specific application.
• For the specific EEG problem investigated, the length of the confidence intervals as well as the mean relative error do not decrease significantly for more than 10 or 11 observation points.Hence an optimal array of sensors of this size is sufficient in practice.Moreover, there are dramatic differences between estimation accuracies with a smaller number of sensors ( ≈ 7 or 8) and the optimal values of 10 or 11 sensors.
points Length of the confidence intervals for the 1st component of r q -σ = 0.05 Using D-optimal design No optimal points Length of the confidence intervals for the 1st component of q -σ = 0.05 Using D-optimal design No optimal Confidence Intervals for q(1), σ = 0.05 Example 1 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design)

1 Using 1 Using
points Length of the confidence intervals for the 1st component of r q -σ = 0.points Length of the confidence intervals for the 1st component of q -σ = 0.Confidence Intervals for r q (1), σ = 0.1 Example 1 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design) Confidence Intervals for q(1), σ = 0.1 Example 1 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design)

15 Using 15 Using
pointsLength of the confidence intervals for the 1st component of r q -σ = 0.points Length of the confidence intervals for the 1st component of q -σ = 0.Confidence Intervals for q(1), σ = 0.15 Example 1 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design)

2 Using 2 Using
pointsLength of the confidence intervals for the 1st component of r q -σ = 0.points Length of the confidence intervals for the 1st component of q -σ = 0.Confidence Intervals for q(1), σ = 0.2 Example 1 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design)
points Length of the confidence intervals for the 1st component of r q -σ = 0.05 Using D-optimal design No optimal points Length of the confidence intervals for the 1st component of q -σ = 0.05 Using D-optimal design No optimal Confidence Intervals for r q (1), σ = 0.05 Example 2 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design) Confidence Intervals for q(1), σ = 0.05 Example 2 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design)

1 Using 1 Using
points Length of the confidence intervals for the 1st component of r q -σ = 0.points Length of the confidence intervals for the 1st component of q -σ = 0.Confidence Intervals for r q (1), σ = 0.1 Example 2 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design) Confidence Intervals for q(1), σ = 0.1 Example 2 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design)

15 Using 15 Using
points Length of the confidence intervals for the 1st component of r q -σ = 0.points Length of the confidence intervals for the 1st component of q -σ = 0.Confidence Intervals for r q (1), σ = 0.15 Example 2 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design) Confidence Intervals for q(1), σ = 0.15 Example Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design)

2 Using 2 Using
pointsLength of the confidence intervals for the 1st component of r q -σ = 0.points Length of the confidence intervals for the 1st component of q -σ = 0.Confidence Intervals for r q (1), σ = 0.2.Example 2 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design) Confidence Intervals for q(1), σ = 0.2 Example 2 Real parameter value Initial guess value Confidence intervals (No optimal design) Confidence intervals (D-optimal design)

Table 1 :
Table 1 contains the values for the mathematical model that are used in each numerical experiments.They are fixed values of the model and they are assumed to be known.Known model values (used for all simulations).

Table 2 :
. Parameter values for inverse problem experiments