IDENTIFICATION OF MULTIPLE UNKNOWN POINT SOURCES OCCURRING IN THE 2D TRANSPORT EQUATION: APPLICATION TO GROUNDWATER POLLUTION SOURCE IDENTIFICATION

Accessing quality water is of crucial importance to both society and the environment. Deterioration in water quality through groundwater pollution presents a substantial risk to human health, plant and animal life, and detrimental effects on the local economy. To ensure groundwater quality, there is need to identify locations of unknown groundwater pollution sources. In this paper, the locations of groundwater contaminant sources have been identified using inverse problem technique. The work in this paper concerns the inverse source problem in the Advection Dispersion Reaction Equation (ADRE) with an emphasis on groundwater pollution source identification. Mathematically, inverse source problem involves the reconstruction of the source function in the ADRE from the boundary and interior measurements. An inverse source problem technique for identifying the unknown groundwater pollution source utilizing only the boundary and interior measurements is developed. The finite volume discretization ∗Corresponding author E-mail address: alpha@aims.ac.tz Received January 31, 2020


INTRODUCTION
The increasing rate of water resource usage results in their contamination by wastewater of domestic, industrial and agricultural sectors. Due to the increased need for freshwater, humans are opting for the groundwater reservoirs [14]. Thus there is a need to ensure quality groundwater for the sustenance of human health. To manage and supervise efficiently the groundwater quality in aquifers, accurate determination of the location and magnitude of pollution sources is necessary. The identification of pollution is done by monitoring the Biochemical Oxygen Demand (BOD) concentration which measures the amount of dissolved oxygen consumed by the microorganisms during the oxidation process [2]. BOD transport through groundwater is governed by the ADRE [4]. Identifying the sources of contamination from the spatial and temporal measurements of the BOD concentrations in the aquifer is an inverse problem which requires solving the ADRE backwards in time. Several approaches have been proposed in the last three decades to solve the ADRE backward in time to identify the contaminant sources in the groundwater. Mazaheri et al. (2015) [28], provided a comprehensive review of the available methodologies and classified them into three major groups; Simulation-optimization approach, Probabilistic approach and Mathematical approach.
The simulation-optimization approach has been employed by Mahar and Datta (2000) [26], who investigated pollution source locations in groundwater. Sequential unconstrained minimization technique was employed to solve the optimization problem. Recently, Huang et al. (2018) [20] used the simulation optimization approach by integrating numerical simulators MODFLOW and MT3DMS into the grids traversal and shuffled complex evolution optimization algorithms in determining unknown pollution source from observed groundwater data. Despite the simplicity of formulation in the simulation-optimization approach, the computational cost is relatively high and the global optimum solution is not guaranteed and dealing with ill-posed conditions encountered in the approach is much more complicated.
In the probabilistic approach, statistical distributions are employed in model development. The most important feature of this approach is that the pollution source parameters are usually treated as random variables and probability distribution functions are used to predict them. Zeng et al. (2018) [23], used the backward probability method for characterizing the instantaneous pollutant source location and releasing time within a ventilation system. The physical problem was formulated in terms of a one-dimensional advection-diffusion model with branches and it was verified by the results of a purposely designed experiment. The backward method requires schemes which can stabilize the solution process and needs prior source information such as location. The backward method also has the limitation as it requires a limited number of pollution sources which has known locations. The information being required is usually difficult to obtain in physical applications.
In the mathematical approach, initially, the governing mass transfer equation is solved for each unit intensity pollution source independently. The final problem leads to a linear, ill-posed algebraic system of equations in which the intensities and locations of the pollution sources are unknown. The ill-posed condition is solved using regularization methods. Recently, Adel (2018) [1] used the mathematical approach in establishing a non-iterative method for identifying multiple unknown time-dependent sources compactly supported occurring in a 2D parabolic equation. It is noteworthy that the mathematics and formulation used in this method are relatively more complex with respect to the other methods but it is much less time-consuming.
The originality of the present study consists in addressing the linear inverse source problem of identifying multiple unknown time-dependent point sources occurring in a 2D ADRE with the solitary velocity profiles. Although the 2D mathematical model is subject to a lot of interest, the identification of multiple unknown time-dependent point sources in 2D evolution transport equations remains an open problem. The paper is organized as follows: Section 2 is devoted to stating the problem, assumptions and proving some technical results for later use.
In Section 3, we derive the adjoint differential equation which have been used in formulating the inverse problem. In Section 4, we establish an inverse problem identification method that uses the concentration records to compute the unknown source position and intensity function.
Some numerical experiments done on a variant of groundwater pollution BOD are presented in Section 5.

GOVERNING EQUATION AND PROBLEM FORMULATION
The model concerns the measurement of pollutant concentration by the use of BOD 5 values denoted herein by C. Let Ω ∈ R 2 be any connected and bounded open set with boundary ∂ Ω.
The boundary ∂ Ω is assumed to be of the form Equation (1).
where Γ 1 represent the inflow boundary, Γ 4 represent the outflow boundary, Γ 2 and Γ 3 represents the lateral boundaries. Let Γ L regroup the two lateral lower and upper boundaries as Equation (2):  Thus the model is defined in the space Ω × (0, T ). The evolution of the BOD 5 concentration in a porous medium is governed by the Equation (3) [31]: where C is the pollutant concentration measured in mg/L, V is the flow velocity measured in m/s, R is the reaction coefficient measured in s −1 , F represents the set of all occurring pollution sources measured in mgL −1 s −1 and D denotes the hydrodynamic dispersion tensor measured in m 2 /s. Since Equation (3) is the equation for groundwater, it can be further simplified by applying the following assumptions; firstly, the groundwater is incompressible that is its density does not change when the pressure changes. Mathematically, this means that the density, ρ, of the groundwater is constant and therefore the continuity equation implies the Equation (4): secondly, V satisfies the no-slip boundary condition which means that along the lateral boundaries, the groundwater has zero velocity relative to the lateral boundary Finally, the V satisfies the solitary vibrations of the Korteweg-de Vries equation (KdV) available on the interaction between groundwater and surface water. Thus the velocity V = (ν 1 , ν 2 ) T is a spatial-temporal varying field that satisfies the Equation (6): in Ω The Equation (6) is to model groundwater flow velocity which is independent of the pollution concentrations. For detailed derivations for the velocity profile see [21,30]. Simplifying Equation (3) using the Equation (4) gives the Equation (7); Hydrodynamic dispersion refers to the stretching of a solute band in the flow direction during its transport by an advecting fluid [32]. It occurs as a consequence of two processes; molecular diffusion which results from the random molecular motion and mechanical dispersion which is caused by non-uniform velocities. With these two processes the hydrodynamic dispersion is given by the Equation (8): where the spatially varying entries D xx , D yy , D xy , D yx satisfy the Equations (9) to (11) [4,27]: Where a T and a L are the transverse and longitudinal dispersion coefficients respectively and || · || is the usual Euclidean norm. For the reaction coefficient, R, we use the linear model due to Dobbins (1964) [10], that combines groundwater de-oxygenation rate and groundwater sedimentation rate as in the Equation (12); where K d , K s are the de-oxygenation and sedimentation rate respectively.
The major challenge for an inverse problem regarding the identification of a function source F in the Equation (7) is the fact that F cannot be uniquely determined in its general form. For instance, let f ∈ D(Ω) be an infinity differentiable function with compact support in Ω and g = −∇ 2 f . Then ν(x, y,t) = t f (x, y) satisfies the Equations (13) to (15): Without having necessarily f + tg null [11]. Therefore, the same boundary records may lead to the identification of different sources. This means that for a well-posed inverse source problem a priori information on the sources should be available. Hamdi (2017) noted that the information takes the form of some conditions on the admissible sources [18]. Yamamoto (1993) considered sources of the form F(x, t) = α(t) f (x) where f ∈ L 2 and the time-dependent function α ∈ C 1 ([0, T ]) was assumed to be known and satisfying the condition α(0) = 0 [35].
In this paper, point sources of the Equation (16): are considered. In Equation (16), w i is the ith source intensity function, δ is the Dirac delta function, S i is the ith source location.
For the initial condition, without loss of generality, the assumption that no pollution occurs at the initial monitoring time is reasonable since at this time the pollutant has yet to mix with the water. Thus a null initial BOD concentration. Since the pollutants are introduced on the boundaries and as the convective transport generally dominates the diffusion process,an assumption that no pollution concentration on the inflow boundary is reasonable. Finally, a null gradient concentration at the downstream boundary which represent no variation in BOD concentration. Therefore, given the Equation (7), the BOD 5 concentration C satisfies the Equations (17) to (19): The problem in the Equations (17) to (19) admits a unique solution C that belongs to the functional space in the Equation (21) [1,25]: As the source position S i is assumed to be in the interior of the domain Ω, the state C is smooth on the boundary ∂ Ω, which allows defining the boundary observation operator in the Equation (22): where a and b are observation points satisfying 0 < a < b < l. This is called a direct problem. The inverse problem the paper is dealing with is; assuming available the records {d a (t), d b (t) for 0 < t < T } of the concentration C at the two observation points a and b, find the source F satisfying the observation in the Equation (23): Before proceeding, we need to show that if two measured concentrations of BOD coincide on Γ 4 , then they are generated by the same source. This allows us to know whether the inverse problem Proof. We recall the unique continuation theorem of Mizohata [34]: If Ω is a connected open set in R 2 , ω an open subset of Ω and if, C ∈ L 2 0, T ; H 2 loc (Ω) , verifies the Equations (27) and (28):  (17) to (19): is uniquely determined by the observation Equation (23).
Proof. Let C j for j = 1, 2 be the solutions to the Equations (17) to (19) with the time dependent point source in the Equation (29): and the same observation Equation (23). Assume that M(F 1 ) = M(F 2 ), we have to prove that Clearly ϕ satisfies the system of Equations (30) to (33): Since the points S i for i = 1, 2, . . . Ns are in the interior of Ω, we choose B such that and S j i / ∈ B for j = 1, 2. Therefore the open , is a solution of the Equations (34) and (35): Thus according to Lemma 2 (36):

GREENS FUNCTION METHOD TO SOLVE DIRECT PROBLEM
Letn be an outward normal of unit length to the space Ω. The non-homogeneous Equation (17) is equivalent to the Equation (37): where x is a vector in more than one dimension, L is a linear partial differential operator in more than one independent variables given by the Equation (38) f (x) is the right-hand side of Equation (17). The Green's function for Equation (37) will be obtained from using the Greens identity which gives the relationship between the operator, L , and its Adjoint Operator L * . For an arbitrary test function, g(x) and an independent number of the variables N, the fundamental relation which defines the adjoint operator L * is given by the Equation (39) [22]; where function M(C, g) is a boundary term which involves the values of C(x), g(x), and some of their partial derivatives on the boundary. Multiply the Equation (39) by the volume element dσ on both sides then integrate over Ω and on the right side apply, the Gaussian integral transformation in the Equation (40): wheren is the outside normal of the boundary surface ∂ Ω. After integration, the Equation (39) gives the Equation (41): In the search for the Green's function, the starting point in the solution of Equation (37) by the method of Green's function, is the Equation (41) with g substituted by G, which is the Green's function. Substituting the integration variables x by the dummy variables ξ , gives the Equation (42): Where dσ ξ is the new volume element resulting from the variable change. The boundary terms in Equation (42) and L * ξ are found by carrying out the integration by parts of the left-hand side with the given operator L ξ . At the same time, L ξ C = f (ξ ), this gives the Equation (43); The boundary terms are given by the Equation (44): Depending on the choice of G, Equation (42) provides the solution to the original problem.
Specifically, if, G, satisfies the Equation (45): Then the last term in Equation (42) Since ∇C is non-zero on the boundary Γ 1 , it suffices that G = 0 on Γ 1 . Similarly, as C is non zero on the boundary Γ L , then ∇G = 0 on Γ L . Finally, as C is non-zero on the boundary Γ 4 then G = 0 on Γ 4 . We summarize this in the Definition 3.1: The whole point of this method is that the Boundary Value Problem (BVP) governing G is generally simpler than the original governing C.

METHOD
Simplifying Equation (52) by taking the summation out of the integral sign and applying the shifting property of the delta function gives the Equation (53): The greens function in Equation (53) must be computed for each point source on the various sources Ns using the FVM method. Using Equation (53), the concentration of the ith source point located at x s , is given by the Equation (54): where w(τ) is the intensity function of the source. Equation (54) (17) to (19) and is calculated at the finite number of points x 1 , x 2 , . . . , x m . The Greens function is calculated from Equation (47) on the points is the only unknown that should be calculated to identify the pollution source. In-order to discretize Equation (54), the rectangular method is employed [24,3]. On this method, an assumption that the averages of the intensities of the source for the points x s = x 1 , x 2 , . . . , x m are given by w 1 , w 2 , . . . , w m respectively, changes Equation (54) into the Equation (55): The integral in the Equation (55) is then approximated by Equation (56): where: The system in Equation (57) is extended to Ns sources occurring in multiple locations, N p concentration-time measurements in the groundwater to produce the Equation (59): The System in the Equation (59) is a block system in which every element of the matrix is a matrix and every element of the vectors is a vector, that is, Q i j for 1 ≤ i ≤ N s and 1 ≤ j ≤ N p , is the coefficients matrix associated with the jth measurement point and the ith source, R i for 1 ≤ i ≤ N s , is the vector of unknown average intensities of the ith source and U j for 1 ≤ j ≤ N p is the vector that represents concentration versus time at the jth measurement point.
The number of equations in Equation (59) should be equal to or greater than the number of unknowns to be able to find a solution. In the latter case, the system of equations is overdetermined and can be solved using the linear least-squares method; however, the problem of source identification, encountered in this problem is ill-posed as it does not satisfy the wellposedness conditions of Hadamard [17]. According to [17] an ill-posed problem is the one that at least has one of these conditions: (i) nonexistence of answer, (ii) non-uniqueness of answer and (iii) no continuous relationship between input and output. In this case conditions (i) and (ii) are satisfied but not condition (iii) because of lack of stability of the system. To solve the ill-posed system, Tikhonov Regularization Method (TRM) is used to convert the ill-posed system to well-posed system. This is described in Section 4.2.

Tikhonov Regularization of the Inverse Problem.
This section examines the solution of the linear system of equations from Equation (57) using the TRM. For convenience, Equation (57) is rewritten into Equation (60); Since the entries of C are obtained through observation, they typically are contaminated by measurement errors and discretization errors. Let e ∈ R m be the sum of all the errors. Decompose the BOD concentration data, C, into the sum of error-free data, c 0 ∈ R m , and the errors due to measurements and discretization in the Equation (61): Let, w 0 , be the solution to the error-free linear system in the Equation (62). Since the vector c 0 contains no perturbations and the matrix G is not ill-conditioned, it follows that the linear system Equation (62); which is meaningless as it is dominated by the propagated error. Tikhonov regularization seeks to determine an approximation of w 0 by minimizing the quadratic functional in Equation (64) [5,6,7,13]: where λ is a constant chosen to control the size of the solution vector, and Γ ∈ R k×n , k ≤ n, is the regularization matrix. The operator || · || 2 represents the Euclidean norm. Since the desired solution, w 0 , has known properties, then according to [5], Γ has to be chosen as a scaled finite difference approximation of order equal to the order of the equation [29,33]. Thus the regularization matrix is the tri-diagonal second-order finite difference approximation in the It is well known that the solution w λ to the Equation (64) solves the Equation (66) [16]: From Equation (66), note that if λ is close to zero, then due to the ill-conditioning, w is badly computed. On the other hand, if λ is far away from zero, w is well computed but the error w − w 0 is quite large. Thus, the choice of a good value for λ is a difficult problem. Several methods have been proposed to obtain an effective choice of λ . For instance, if the norm of the error e on C is known, numerical results suggest that the optimal value of λ is given in Equation (67): || Γw || 2 (67) Equation (67) expresses the fact that the error on C must be close to the error introduced by the regularizing term [6]. If the norm of the error is not known, the most well-known methods to approximate λ are the L − curve method [19,6] and the Generalized Cross-Validation (GCV) [9,15] As the matrix D is invertible, setting ψ(0, 0) = ψ ⊥ (0, 0) = 0 it follows that the two dispersion functions are given by the Equation (70): where the constants α and β are given by the Equation (71): The following two functions in the Equation (72): solves the adjoint Equations (73) to (76): Multiply the Equation (17) by z(x, y,t) and integrate by parts over Ω × (0, T ) with the boundary conditions Equations (18) to (20) gives the Equation (77): where on the lateral boundary, the no-slip condition, V = 0, have been used. By substituting z(x, y,t) = e Rt u(x, y) in Equation (77) The time-dependent intensity function w n (t) is known from the previous section and to locate the source position, an assumption that the integral in Equation (79) is non-zero was made.
where the coefficients, P e ψ , P 0 and P ψ ⊥ in the Equation (80) Since D is invertible, the system of Equation (80) has a solution given by the Equation (84): In the following sections we implement the source identification procedure outlined so far.

NUMERICAL RESULTS AND DISCUSSION
In this section, we carry out numerical experiments in the case of a rectangular domain defined by the Equation (85): Substituting the variables in Equation (86) in the Equations (17) to (19) gives the Equations (87), (89) and (90).
Multiply the Equation (87) by t c /C c gives the Equation (91): The choice of the coefficient t c is based on the convective transport, that is, t c = L/V , which is the characteristic time it takes to transport a signal by convection through the domain. Also as the convection term dominates over the diffusion term. Meanwhile, C c is carefully chosen to make the coefficient in the source term unity, that is, C c = t c f c = L f c /V . This gives the Equation (92); where Pe is the Peclet number. The Peclet number measures the ratio of the convection and the diffusion terms, that is, the Equation (93). To identify the elements S and w(t) defining the source occurring in the controlled portion of a an aquifer represented by the segment (0, L). We assume controlling this portion of a aquifer for T = 4h. To generate the records C(a, t) and C(b, t) at the two observation points a and b, we solve the problem Equation (92) using the finite volume method with the source located at (900m, 10m) loading the following time-dependent intensity function in the Equation (95): where b 1 = 1.2, b 2 = 0.4, b 3 = 0.6, u 1 = 10 −6 , u 2 = 5 × 10 −5 , u 3 = 10 −6 , q 1 = 4.5 × 10 3 , q 2 = 6.5 × 10 3 , q 3 = 9 × 10 3 .
Over the whole control time T , we employ 100 measures of C at each of the two observation points a and b. Those measures have been taken at the regularly distributed discrete times with equal time steps. As far as the source position S = (Sx, Sy) is concerned, we employ the approximation of the Dirac mass in the Equation (96) [12]; 1 + cos(π(y − Sy)) 2ε (96) We set the parameter, ε = 10 −5 in Equation (96). Then, to apply the identification method established in the previous section, we start by computing the intensity function as outlined in Section 4.1 after which we apply the Tikhonov Regularization method as outlined in Section 4.2.
For the L-curve method, we use λ = {e −4 , e −5 , e −3 }. After the identification of the intensity, we compute the three integrals in Equations (81) to (83) involving the unknown data C(x, y, T ).
We end by using the computed integrals to solve the system of Equation    Table 1 gives the percentage error obtained on obtaining the intensity function with the regularization parameter of λ = e −4 .   Table 2 gives the percentage error obtained on obtaining the intensity function with the regularization parameter of λ = e −5 . As from Table 2, the percentage error is very small but the recovered intensity function is not very close to the initial intensity this is due to stability of the method. For practical purposes, the value of the regularization parameter to use would be λ = e −4 . we test λ = e −3 on Figure 6. The Table 2 gives the percentage error obtained on obtaining the intensity function with the regularization parameter of λ = e −3 .  Tables 1 to 3 clearly, the optimal value of the regularization parameter we to use in determining source position is λ = e −4 since the other values bring instabilities. To locate the source position, we use the values of C to compute the integrals in Equations (81) to (83) and use the result to solve the system Equation (80). Table 4, presents some numerical results obtained from the localization of a source  The numerical results presented in Table 4 show that the source localization procedure established enables us to identify the source position with relatively good accuracy. The observed error on the localized source position is due to the boundary records are not generated by a point source that is the Dirac mass but rather by its approximation given in Equation (96) and an approximation of the integrals Equations (81) to (83) introduces errors.

CONCLUSION
In this study, a solution for an inverse source problem of identifying multiple unknown timedependent point sources occurring in the two-dimensional ADRE was developed. Validation of the model was demonstrated by using hypothetical examples as there was no groundwater data in the literature. To generate the hypothetical data FVM was used to solve the forward problem and approximation made at discreet points. Numerical experiments on the BOD data were carried out. The obtained numerical results show that the developed identification method is accurate.
In this paper, we only considered the transport of the pollutant in the groundwater without considering groundwater flow. An outlook for the results established in the present study is their extension towards at least the following directions: Firstly, incorporating groundwater flow equations for the transport of groundwater, ADRE for the transport of the pollutants and solitary vibrations available on the interaction between groundwater and surface water. Secondly, apply the developed identification method using other reference geometries and real-life measurements taken on a flow crossing a monitored domain of arbitrary geometric shape. Thirdly, treat the three-dimensional case for the ADRE. Finally, apply the developed methodology into other applications.

ACKNOWLEDGMENT
This work was supported by Pan African University Institute for Basic Sciences Technology and Innovation (PAUSTI) for which the authors express their sincere gratitude. We also thank Dr.
Mark Kimathi for his valuable comments and help.