Full Reciprocity-Gap Waveform Inversion in the frequency domain, enabling sparse-source acquisition

We perform quantitative sub-surface Earth-imaging by setting up the Full Reciprocity-gap Waveform Inversion (FRgWI ) method. The reconstruction relies on iterative minimization of a misfit functional which is specifically designed for data obtained from dual-sensors devices. In addition to the pressure field, the dual-sensors devices are able to measure the normal velocity of the waves and have been deployed in exploration geophysics. The use of reciprocity-based misfit functional provides additional features compared to the more traditional least-squares approaches with, in particular, that the observational and computational acquisitions can be different. Therefore, the source positions and wavelets that generate the measurements are not needed for the reconstruction procedure and, in fact, the numerical acquisition (for the simulations) can be arbitrarily chosen. We illustrate our method with three-dimensional experiments, where we first show that the reciprocity-gap inversion behaves better than the Full Waveform Inversion (FWI) in the same context. Next, we investigate arbitrary numerical acquisitions in two ways: firstly, when few measurements are given, we use a dense numerical acquisition (compared to the observational one). On the other hand, with a dense observational acquisition, we employ a sparse computational acquisition, with multiple-point sources, to reduce the numerical cost. We highlight that the reciprocity-gap functional is efficient in both situations and is more robust with respect to cross-talk than shot-stacking.


Introduction
The Full Waveform Inversion (FWI) method has been extensively developed in the last decades for quantitative recovery of sub-surface Earth media in seismic exploration. The concept of FWI is to find models from which simulations of wave propagation fit the measured seismograms (i.e. the 'full waveform') and it relies on an iterative minimization problem. The method was originally introduced in the work of [5,6] for one-dimensional wave equation, and was extended in the work of [33] and [50,52]. The method was first used with time-domain wave propagation, and the frequency-domain formulation of FWI, which requires a Fourier transform of the original time-dependent seismic traces, was established by [43,42].
In marine seismic, the data usually consist in measurements of the pressure field from hydrophones, but new devices, the dual-sensors, have been recently deployed and also give access to the vertical velocity, see [16,53]. This additional information has shown advantages to reduce noise for image processing, cf. [57,45] and has motivated new analysis, [2,1], and numerical methodology in [25,61] for the seismic inverse problem. In our work, we implement a misfit functional dedicated to the dual-sensors data and demonstrate its efficiency for the recovery of sub-surface physical parameters.
The FWI relies on an iterative minimization of the misfit functional which, in the traditional least-squares approach, is simply the L 2 difference between the observations and the simulations. In addition to the numerical challenges induced by the large scale domain, one main difficulty of FWI (which is a nonlinear and ill-posed inverse problem) is that the misfit functional suffers from local minima, in particular when the low frequency data are missing and/or in absence of a priori information, as detailed by [8]. In order to mitigate the 'cycle-skipping' effect, selection of increasing frequency content in the data is commonly employed, cf. [15,49]. In our work, we follow the frequency domain formulation, where this approach is natural, and further employ increasing sequential frequency, as advocated in [7].
Several alternatives to the least-squares have been studied, to enhance the convexity of the misfit functional, such as logarithmic function, mentioned in [51], that is particularly appropriate in complex-frequency FWI, see [47,46,25]. Comparisons of misfit using the phase and amplitude of signals are carried out in [48,10,44], while the signal envelop is advocated in [13], and is shown to be efficient for the reconstruction of attenuation properties in global Earth seismology by [28]. The use of L 1 criterion is studied in [14]. The optimal transport distance is considered in the work of [36,58] and is shown to improve the misfit functional convexity. In order to avoid the local minima, one can also rely on specific model parametrization such as the smooth background/data space reflectivity decomposition of the MBTT (or DSR) method, [19], which is shown to increase the attraction basin of the least-squares problem, cf. [8].
In the case where Cauchy data are available on the boundary, e.g. in inverse scattering problems, the reconstruction can use a combination of Dirichlet and Neumann traces, see [31,21,2]. This approach is labeled as "reciprocity-gap" in the literature, see, e.g., [23,21]. Lipschitzstability can be obtained for the partial Cauchy data inverse problem, in particular in the seismic context with surface measurements, as we proved in [2,1]. This result further holds for piecewise-linear parameters, employed for the numerical experiments of [1,25]. In the context of acoustic wave governed by the Euler's equations, we shall see that Cauchy data are in fact equivalent to the pressure field and the normal velocity, i.e., dual-sensors data. We further connect the boundary measurements to the volume properties of the media, see Appendix A. In our work, we employ the reciprocity-gap formulation and further combine all sources, defining the Full Reciprocity-gap Waveform Inversion (FRgWI) framework for seismic imaging using dual-sensors data.
In the time-harmonic formulation, the reciprocity-gap results in a misfit functional where observations and simulations are multiplied, and it translates in correlation of fields in the time-domain. Correlation-based misfit functional has been studied in seismic tomography, for instance with [54] and in the work of [18,60], however using a single type of measurements (i.e. the acoustic pressure fields). The use of the velocity fields is studied in the time-domain by [62], assuming that all directional velocities are available and convolving the same fields. A fundamental difference with these earlier studies, and in particular with [62], is that in our work, we correlate different fields (i.e. pressure with velocity). This is the essence of reciprocitygap and it is crucial in order to relate surface measurements to global model reconstruction using Green's identity, see Appendix A. In addition, our framework is naturally designed for the normal velocity, in accordance with the dual-sensors devices. More generally, approaches based upon a specific filtering of the data, such as [56,26], also rely on a minimization where the fields are not directly the quantity to consider.
The main feature of FRgWI is to allow different observational and numerical acquisitions, cf. [25,1,62]. Minimal information regarding the observational sources is required: the original wavelet and source positions are not needed to conduct the reconstruction. This flexibility in the choice of numerical probing sources opens up many perspectives and, in particular, to use denser or sparser computational acquisitions than the given observational one. The use of sparse acquisition relates to the shot-stacking approach for data decimation, which sums several single point-source data using the linearity of the wave equation. It has early been employed in seismic to reduce the computational cost, e.g., [37]. It is based upon the redundancy of information in the data, but has a major drawback as it is difficult to avoid the cross-talk between the encoded sources, cf. [32,59]. It has motivated several works to efficiently assemble the encoded sources in seismic (i.e. source blending, [11]), we mention, for instance, the random combination of sources changing with iteration in [32]; approach based upon compress sensing by [34]; wavelet encoding is used in [59]; see also the references therein. In our applications, we will see that FRgWI, by using arbitrary numerical sources, can employ multiple-point sources (for computational or observational acquisition) while being naturally robust to the cross-talk. In fact, it is not really cross-talk in this context because the measurements are not modified, i.e., they are still tested independently with respect to the numerical simulations.
In this paper, we study the seismic inverse problem associated with time-harmonic waves, using dual-sensors data. We first state the mathematical problem in Section 2, and define the two misfit functionals that we analyze for the iterative reconstruction: the traditional leastsquares difference and the reciprocity-based functional. Next, we detail the features provided by the reciprocity-gap formulation in Section 3. We carry out three-dimensional reconstruction experiments in Section 4 with a layered medium, where we compare the performance of reciprocity-gap and full waveform inversion. We further demonstrate the efficiency of FRgWI with simultaneous point-sources to reduce the computational cost, and its robustness compared to traditional shot-stacking. Finally, we carry out a large-scale experiment including salt domes using the SEAM benchmark in Section 5. In particular, we highlight that FRgWI performs equally well in the case of acquisitions that are sparse for observations/dense for simulations and when acquisitions are dense for observations/sparse for simulations.

Time-harmonic seismic inverse problem with dual-sensors data
We work with time-harmonic wave propagation for the identification of the physical parameters in a seismic context. The quantitative reconstruction is conducted using an iterative minimization of a misfit functional. We first give the misfit as the traditional L 2 difference and further design the reciprocity-gap version of the functional, which combines pressure and normal velocity data.

Acoustic wave equation, forward problem from dual-sensors
We consider a three-dimensional domain Ω ⊂ R 3 with boundary Γ. The propagation of waves in acoustic media is represented with the scalar pressure and vectorial velocity fields, that satisfy the Euler's equation (e.g., [22,30,23]): in Ω, in Ω, The frequency is denoted by ω, f is the (scalar) interior (harmonic) source term and ∂ ν denotes the normal derivative. The propagation is characterized by the two physical parameters of the medium: the density ρ and the bulk modulus κ. In addition, the wave speed c is defined such that For boundary conditions, we follow a geophysical context where the boundary is separated into two: Γ = Γ 1 ∪ Γ 2 . We denote by Γ 1 the interface between the air and the acoustic medium, where a free surface boundary condition holds (1c). On the other part of the boundary, Γ 2 , we implement an Absorbing Boundary Conditions (ABC, [24]), (1d), to ensure that waves reaching the boundary are not reflected back into the domain. It corresponds to the fact that the domain Ω is a numerical restriction of the global Earth.
The dual-sensor devices have been recently introduced in marine seismic exploration, and allow the recording of both the pressure field and vertical velocity, see [16,53]. Consequently we define the forward problem (which maps the parameters to the data) F (f ) ω at frequency ω for a source f such that The model parameters are referred to by m = (κ, ρ), the normal velocity by v ν and Σ corresponds with the (discrete) set of receivers location: where x i is the position of the i th receiver for a total of n rcv receivers. For notation, we introduce the restriction operator R, which reduces the fields to the set Σ, such that

Misfit functionals
The inverse problem aims the quantitative reconstruction of the subsurface medium parameters (κ and ρ) from data measured at the receivers location. Using pressure and vertical velocity measurements, we denote the data at frequency ω for a source f by where d (f ) ω,p and d (f ) ω,v are vectors of C nrcv and respectively refer to the pressure and normal velocity records, following (3). We further denote by d (f ) ω (x i ) the data recorded for the source f at the i th receiver.
In the following, we omit the frequency index and space dependency for the sake of clarity (in the experiments, we use increasing sequential frequency, as suggested in [7]). We introduce two misfit functionals which evaluate the difference between the observations and simulations.
Firstly, the functional J L2 , which follows the traditional least-squares, is where η is a scaling factor to adjust between the amplitudes of the pressure and velocity. In our applications, it is taken such that d p = η d v . Secondly, we define an alternative misfit functional based upon the reciprocity-gap, motivated by Green's identity: where T denotes the transpose. The misfit functional is motivated by the Green's identity and has been introduced in the context of inverse scattering, from Cauchy data, as mentioned in the introduction, cf. [31,21], and used with partial seismic data in [1]. In Appendix A, we justify the formulation of the misfit functional J r using variational formulation of Problem (1). Let us insist that the mathematical foundation of the reciprocity-gap formulation relies naturally on the normal velocity (because of the trace appearing in the integration by part, see Appendix A). Therefore, it perfectly matches the dual-sensors data, and does not necessitate specific directional components (i.e. v x , v y or v z ).

Iterative reconstruction procedure
The reconstruction procedure follows an iterative minimization of the selected misfit functional. For least-squares formulation such as (7), it is traditionally referred to as the Full Waveform Inversion (FWI) method in seismic, as one makes use of the full data seismograms, see the review of [55] and the references therein. Consequently, we shall refer to the minimization of (8) as Full Reciprocity-gap Waveform Inversion: FRgWI. In any case, the iterative minimization follows successive updates of the physical models, such that, at iteration l, the new model is given by m l+1 = m l − α l s l , where α is the scalar step size typically computed via a line search method, see [39], and s is the search direction. The search direction depends on the gradient of the cost function, which is computed using the adjoint state method. The method has its foundation in the body of work of [35], and is reviewed for seismic application in [40]. Application with complex fields is further described in [9,8]. The adjoint-state method for reciprocity-gap functional is briefly reviewed in Appendix B, see also [1]. In our implementation, the search direction only depends on the gradient of the misfit functional, and we employ the Limited-BFGS algorithm, see [38,39]. We review the steps for the iterative minimization in Algorithm 1.
Algorithm 1: Iterative reconstruction procedure: for the minimization of (7), the computational acquisition (source positions and wavelet) must follow the one employed to generate the measurements. For the minimization of (8), the user can prescribe computational acquisition, i.e., the positions and wavelet of the probing sources g j in (8).
for ω i = ω 1 , . . . , ω Nω do for j = 1, . . . , n iter do set l := (i − 1)n iter + j ; solve Problem (1) using current models m k and frequency ω i ; compute the misfit functional, (7) or (8); compute the gradient of the misfit functional using the adjoint-state method; compute the search direction, s l , using Limited-BFGS; compute the step length, α l , using line search method; update the model: m l+1 = m l − α l s l . end end

Features of FRgWI and data acquisition
The fundamental feature of the reciprocity-gap misfit functional (8) compared to (7) is that the set of computational sources is separated from the observational ones (respectively g j and f i in (8)), [25,1]. It implies that 1. we do not have to know the observational source positions (the location of the f i in (8)) for the reconstruction algorithm; 2. we do not have to know the observational source signatures (wavelet); Consequently, and contrary to least-squares misfit such as (7), minimal information is required regarding the observational acquisition (only the positions of the receivers). The recovery of the source wavelet in parallel to the iterative minimization procedure usually performs well in seismic exploration, using formula prescribed in [41], see Remark 2. However, incorrect knowledge of the position of the sources is a strong difficulty which can lead to the failure of the reconstruction procedure. Namely, FRgWI increases robustness by being free of such considerations. In addition, 3. the set of computational sources can be arbitrarily taken, hence can differ from the observational one (respectively n obs src and n sim src in (8)).
Therefore, it provides high flexibility regarding the choice of computational sources. It is important to notice that whatever computational sources are selected, the observed data are still independently tested one by one against the simulations, i.e., the measurements are not modified in any way.

Data acquisition
In our experiments, we consider the source term for the wave propagation (f or g) to be a Ricker function in time and a delta-Dirac function in space. In the frequency domain, it translates to a delta-Dirac function δ (with value the discrete Fourier transform of the time-domain signal). We refer to a point-source when the source is localized at the position x k : We further refer to multiple-point source when it is composed of n pt point-sources, such that The observational setup uses a fixed lattice of receivers and sources located slightly above, as illustrated in Figure 1(a). Furthermore, the position of the receivers remains the same for all sources. This configuration is consistent with the TopSeis acquisition system for marine seismic developed by CGG (Compagnie Générale de Géophysique) and Lundin Norwary AS. This principle has been recently deployed on field, showing benefits for near-offset coverage, and it has also won the 'Exploration Innovation Prize 2019' 1 . Namely, it consists in having two boats: one that moves and carries the sources, and one that carries the receivers and remains fixed.
For the reconstruction using FRgWI, we can employ arbitrary numerical sources, which do not have to correspond with the observational acquisition. Here, we investigate the two following configurations.

Dense point-source computational acquisition for sparse multiple-point measurements
In the case where the observational acquisition is composed of few sources (e.g., to reduce the cost of field experiments), FRgWI can use a dense coverage of computational point-sources to enhance the sensitivity. The sources for the measurements can be multiple-points, e.g., if several air-guns are excited at the same time. In our experiments, the choice of positions for the multiple-points follows a structured decomposition, cf. Figure 1(b). It appears more natural to employ such structured decision, e.g. the boat will carry the air-gun sources all together. In addition, note that we have observed in our experiments that this structured partition gives a better performance than using a random combination of sources.

Sparse multiple-point computational acquisition for dense point-source measurements
In the case where the observational acquisition is composed of many (point) sources (as it can happen in exploration seismic), FRgWI can instead use multiple-point sources in the computational acquisition to reduce the computational cost. It consists in a sparsification of the observational acquisition.
In order to compare with the traditional FWI approach, one can use the linearity of the wave equation and the multiple-point source can be assimilated to the well-known shot-stacking approach, which rewrites the misfit functional (7) such that 1 We refer to https://www.cgg.com/en/What-We-Do/Offshore/Products-and-Solutions/TopSeis and https://expronews.com/2019/05/23/topseis-a-worthy-winner/.   where we have a total of n stack multiple-point sources, each composed of n pt points.
Therefore, we will investigate in the following numerical sections different situations where one of the acquisition (observational or computational) is composed of point-sources while the other is made of multiple-point sources. The main advantage of FRgWI is that it does not incur any modification of the data, which are tested one by one against the computational acquisition in (8). However, with the shot-stacking version of FWI, it requires the summation of data in (11), which results in the possible loss of information, i.e. cross-talk (e.g., [59]).

Numerical experiment 1: layered medium
In this section and the next, we carry out three-dimensional experiments of geophysical reconstruction, to study the performance of FRgWI. In this first test, we compare with the traditional least-squares functional, and we employ multiple-point sources to probe the robustness of arbitrary source positions; we also test the combination of sparse and dense acquisitions. All experiments have been performed on the cluster PlaFRIM 2 .
Remark 1 (Discretization of the forward problem). We use the Hybridizable Discontinuous Galerkin (HDG) method for the discretization of Problem (1), and follow the implementation described in [12], where we first mesh the domain using simplex (i.e. tetrahedra). The method relies on a global matrix which only consists in the degrees of freedom located on the faces of discretization mesh (i.e., the degrees of freedom inside the cells are not taken for the global matrix), see [20]. Consequently, it has been shown to be more efficient (i.e., less memory consumption for the matrix factorization due to a smaller global matrix) compared to Continuous Galerkin (CG) approaches, for high order discretization, cf. [29,12].
In addition, the main motivation of the HDG method is that it gives the solutions to the first order Problem (1), i.e., the pressure and velocity, but at the same level of accuracy while using only one variable for the global matrix (i.e. it does not increase the size of the discretized problem). On the contrary, in CG or Finite Differences approaches where, if one wants both fields with same accuracy, it requires to solve a global matrix with the four fields (velocity and pressure) as unknowns. As an alternative, one can solve only the pressure to retrieve the velocity in postprocessing, but due to the derivative in (1a), it means that the velocity would be represented with one order of accuracy less than the pressure. Therefore, the HDG method is favorable method for this problem.
Then, For the resolution of the subsequent linear system, we use the direct solver Mumps, designed for sparse matrix, see [3,4], with the multiple right-hand sides feature.

Wave speed model
We consider a three-dimensional acoustic medium (provided by Statoil) of size 2.55 × 1.45 × 1.22 km, in the x, y, and z axes respectively (i.e., z is the medium depth). We assume a constant density ρ = 1000 kg m −3 , and the subsurface wave speed is pictured in Figure 2. It consists in different geophysical layers, with non-monotone variations, from 1500 to 5200 m s −1 . The first 500 m are mostly constant with a wave speed of about 1600 m s −1 . For the iterative reconstruction, we start with a one-dimensional wave speed profile that is pictured in Figure 3. This initial guess does not encode any a priori information, and only has increasing wave speed with the depth. Moreover, the range of values is lower compared to the true medium of Figure 2

Time-domain data with noise
We work with time-harmonic wave propagation but follow a seismic context where measurements are obtained in the time-domain. Furthermore, we incorporate noise in the synthetic seismograms to have a more realistic setup. The observational acquisition consists in 160 pointsources, excited one by one. They are positioned at the depth z = 10 m, and forms regular lattice such that sources are every 160 m along the x axis, and every 150 m along the y axis. There is a total of 1376 receivers, which are located at a fixed depth of 100 m, every 60 m along the x axis, and every 50 m along the y axis. Note that the receivers remain at the same position for all sources, cf. Figure 1. We generate the time-domain seismograms 3 and incorporate noise in the resulting traces, with a signal-to-noise ratio of 10 dB. Then, we proceed with the discrete Fourier transform to feed Algorithm 1. In Figure 4, we picture the noisy time-domain trace for a single-point source, and the corresponding Fourier-transform that we employ for the reconstruction. We respect the seismic constraint that the low-frequencies are not available from the time-domain data (because of noise and listening time) and we only work with data from 5 to 15 Hz frequency. In Figure 5, we picture two-dimensional time-space sections of the traces for the pressure and normal velocity, and illustrate the effect of the added noise (10 dB signal-to-noise ratio in our experiments).

Performance comparison of the misfit functionals
We first compare the performance of both misfit functionals, J L2 and J r , in the same context: the numerical acquisition is taken to be the same as the observational one (which is anyway mandatory for J L2 ). Therefore, we take n sim src = n obs src and the same set for f and g in (8). We use Algorithm 1, using sequential frequency progression from 5 to 15 Hz, more precisely, we use {5, 6, 7, 8, 9, 10, 12, 15} Hz data, and 30 minimization iterations per frequency. In Figures 6  and 7, we show the final reconstruction, after 15 Hz iterations when minimizing J L2 and J r respectively. For the discretization, we employ a mesh of about 75 thousands tetrahedra, and polynomial order three or five (depending on the frequency) for accuracy (note the mesh differs from the one employed to generate the time-domain data).

Remark 2.
For the minimization of J L2 (7), one has to use the same source wavelet for the observations and simulations. Because the observational source wavelet is not precisely known, we need to reconstruct the source during the iterative process as well. We employ the update formula given in [41] for the iterative point-source reconstruction. This can however induce additional difficulty in the case where the source characteristic is not well recovered or, more important, when the source positions are not precisely known.
In this experiment, we use the same acquisition context for the two choices of misfit functional, to observe intrinsic differences between the two methods. We observe the following:  -The full reciprocity-gap waveform inversion provides slightly better results, in particular for the parts that are near the boundaries (see the vertical and horizontal sections of Figures 6 and 7). This can be explained as FRgWI formulation (8) tests every simulation source with each observational one (the two sums in (8)) and it somehow compensates for the limited illumination of the boundary zones.
-Regarding the computational time, both approaches are similar (the only difference is the two sums in the misfit functional (8), which is a computationally cheap operation compared to the matrix factorization and linear system resolution).
Therefore, in this first test-case, we have demonstrated the efficiency of FRgWI, which produces better reconstruction compared to the traditional approach, without incurring any increase of computational time. But more important, the reciprocity-gap offers flexibility for the choice of computational acquisition, which we now study.

Comparison of multiple-point sources FRgWI with shot-stacking
The main feature of FRgWI is to enable the use of arbitrary probing sources for the numerical simulations (g in (8)). Here we investigate the use of multiple-point sources in order to reduce    i , according to (9). For the computational acquisition (g in (8)), we now consider n stack multiple-point sources, each composed of n pt points, cf. (10).
We assume that the multiple-point sources all have the same number of points. The positions of the multiple-points are taken to coincide with the observational acquisition, and we consider a group of sources in a structured partition of the original configuration, as illustrated in Figure 1. We have where x j corresponds with the multiple-point source, composed of the point-sources located in x i . The FWI counterpart is obtained using shot-stacking, following (11), i.e., it requires the summation of the observed data. We perform the iterative reconstruction using n stack = 5 multiple-point sources, each of them composed of n pt = 32 points. The reconstruction using the shot-stacking version of FWI (11) is shown Figure 8 and the result using FRgWI (i.e., where only the computational acquisition is changed) is shown Figure 9.
We observe that FRgWI performs much better than the shot-stacking in FWI: -the vertical and horizontal sections shown in Figure 9 are actually very close from the reconstruction obtained using the same computational and observational acquisition, see Figure 7. The pattern of increasing and decreasing velocity is well captured, with the appropriate values.
-On the other hand, FWI with shot-stacking looses accuracy, in particular the vertical section in Figure 8 where the non-monotone variation is not even recovered.
FRgWI, by testing each of the observational sources independently with the arbitrary computational ones, is robust with multiple-point sources, and allows a sharp decrease in the number of numerical sources for the simulations.
Remark 3. The results using the shot-stacking with FWI could be improved by using more advanced strategy of shot blending as mentioned in the introduction (e.g., [32,34]). We illustrate here that even a naive approach (i.e. simple to implement numerically) of shot-stacking is sufficient for the full reciprocity-gap waveform inversion to produce satisfactory results. Also, note that we have tried to use a random selection of positions for the multiple-point but it was not performing as well as the structured criterion. This would however require more testing to confirm.
We can further reduce the number of computational sources, the reconstruction using FRgWI with n stack = 2, n pt = 80 is shown in Figure 10. In Figure 11, we show the reconstruction using only one source: n stack = 1, n pt = 160. We see that FRgWI still behaves quite well, in particular for two sources the reconstruction carries similar accuracy than before: the vertical structures are accurately discovered, with pertinent variations of wave speed. For the reconstruction using only one source, Figure 11, we see some deterioration in the recovered layers but it remains more precise than the shot-stacking FWI using 5 sources (Figure 8). The essence of FRgWI is that it does not modify the observational acquisition and instead it probes every measured seismogram independently. It increases the robustness of the reconstruction procedure, and allows for the design of arbitrary computational acquisitions to reduce the computational cost. In this experiment, we have used a naive approach of shot summation, which requires no effort for its implementation, and it already shows accurate reconstructions.

Sparse observational acquisition
We now investigate a different situation: when the measurements are obtained from only a few set of multiple-point sources. This happens, for example, in marine seismic when several air guns are excited at the same time, i.e. the source boat carries an array of air guns. It consequently reduces the amount of data obtained, and the cost of the field acquisition.
Hence, we now consider that the measurements are obtained from 5 multiple-point sources. we follow the configuration of the previous subsection but inverting the computational and observational acquisitions: it is now f in (8) that is represented with a multiple-point sources (12) while the computational sources g consist of 160 single point sources. It means that the observed measurements are acquired from the physical phenomenon corresponding with multiple-point sources. Note that from this type of measurements, the FWI reconstruction coincides with the shot-stacking result of Figure 8. In Figure 12, we show the reconstruction where the observations are obtained from n stack = 5 multiple-point sources and the computational acquisition consists of 160 single-point sources. In Figure 13, we show the reconstruction where the measured data result from one multiple-point source, composed of 160 points (i.e., all sources are excited at the same time) while the numerical acquisition remains with 160 point-sources. We observe that the reconstructions obtained from a drastically reduced set of measurements, resulting from multiple-point sources, retrieve the appropriate wave speed variation. It displays similar accuracy compared to the use of single-point source measurements with multiple-point source simulations (respectively Figures 9 and 11 for 5 and 1 computational sources). Namely, it seems that FRgWI is insensitive to which acquisition is made sparse (i.e., with multiple-point sources).

Numerical experiment 2: salt-body SEAM model
In this experiment, we consider a three-dimensional wave speed model encompassing salt-domes. The velocity is extracted from the seismic SEAM 4 benchmark, and is of size 7×6.5×2.1 km. The velocity model is depicted in Figure 14, and varies from 1500 to 4800 m s −1 . In this experiment, the density is heterogeneous, and pictured in Figure 15. Compared to previous test-case, it is of different nature (salt-domes), it is larger and it has a heterogeneous density. Therefore, we expect this numerical experiment to be more challenging.

Time-domain data
Similarly to the previous experiment, we work with noisy time-domain data and compute their Fourier transform to generate the harmonic data used in the iterative reconstruction. There is a total of 272 point-sources in the observational acquisition, which are located at 10 m depth. The sources are placed on a two-dimensional plane with 400 m between each source along the x and y-axes. For multiple-point acquisition, we follow the structured combination illustrated in Figure 1(b). In this experiment, we generate the time-domain data and we use 20 dB signalto-noise ratio. The resulting seismograms are illustrated in Figure 16 for a single source, where we show the pressure and vertical velocity measurements. The data are acquired by a total of 2805 receivers. For the reconstruction, we perform a Fourier transform of the time-domain noisy data, and use frequency content from 2 to 7 Hz.

Reconstruction using FRgWI
For the reconstruction, we start with initial guesses that correspond with smooth version of the true models, they are shown in Figures 17 and 18 for the starting wave speed and density respectively. We actually only focuses on the reconstruction of the wave speed model, and keep the density as its initial representation of Figure 18. The density is known to be more complicated to recover than the wave speed because of lack of sensitivity in the data, [27], but it should not prevent us from recovering the wave speed, see, e.g., [25]. We use the FRgWI algorithm with the minimization of (8) using sequential frequency content from 2 to 7 Hz. We perform 30 iterations per frequency which leads to a total of 180 iterations. We analyze the two following configurations.
-We first use the dense observational acquisition composed of 272 point-sources, and multiplepoint sources for the computational acquisition (g in (8)). The multiple-point sources use n pt = 8 points, resulting in n stack = 34 sources. The reconstructed wave speed after the iterations at 7 Hz is shown in Figure 19.   -Next, we consider the opposite situation: we assume a sparse observational acquisition of n stack = 34 sources each composed of n pt = 8 points. Here, the computational acquisition is made dense, with 272 point-sources. The recovered wave speed is shown in Figure 20.
The minimization performs well in both situations: the salt dome is appearing with appropriate values (about 4500 m s −1 ). The upper boundary of the salt is accurately recovered (cf. the right of Figures 19 and 20) while the deepest parts are more difficult to obtain. It is possible that the high velocity contrast prevents us from imaging below the salt. The horizontal section (bottom of Figures 19 and 20) also shows the appropriate pattern, in particular with the decrease of wave speed in the middle, and only the parts near the boundary are slightly less accurate. It confirms the results of our first experiment that FRgWI is relatively insensitive to whether the observational or computational acquisitions is taken sparse or dense: the reconstructions display similar resolution for the two cases. In terms of numerical cost, sparse computational acquisition reduces the computational time but, in the frequency domain with the use of direct solver, this gain remains marginal. In order to further improve the reconstruction, we shall include a regularization parameter in the minimization, e.g. with Total Variation approach, which can indeed be supported by our misfit functional.

Conclusion
In this paper, we have performed waveform inversion using a reciprocity-gap misfit functional. The functional is adapted to the dual-sensors devices which capture both the pressure field and normal velocity of the waves. In this case, the measurements can be seen as Cauchy data, making our choice of misfit natural. The main feature of the Full Reciprocity-gap Waveform Inversion (FRgWI) is that it does not require any information regarding the observational acquisition (i.e., nor the source positions nor the wavelet). This flexibility further extends to the choice of arbitrary computational probing sources.
We have implemented the method in the frequency domain, with the HDG method for the first-order problem discretization. Our preliminary experiments of reconstruction have shown that FRgWI performs better than FWI in the same configuration, i.e. when we only rely on the observational acquisition. More important, the use of multiple-point sources appears very effective, even in a simple implementation which, contrary to the traditional shot-stacking in FWI, is robust to cross-talk. In fact, FRgWI does not exactly suffer from cross-talk as the observational data remain untouched and tested one-by-one against the simulations, one should better speak about some leakage. In addition, FRgWI allows for no information on the observational sources (positions and characteristics). We have carried out three-dimensional experiments of different nature, one with a layered wave speed, and another with a salt dome reconstruction, extracted from the SEAM benchmark. We have demonstrated that FRgWI can use sparse acquisition either for the observational acquisition (i.e. to reduce the cost of the field acquisition) or for the computational one (i.e. to reduce the numerical cost of the reconstruction), while not altering the resolution of the reconstruction. We plan to pursue in this direction by implementing more advanced criterion for the design of computational acquisition, and to further study the choice of the source positions (e.g., not only close to the near surface area). Note also that the concept of full reciprocity-gap waveform inversion extends to elasticity.

A Reciprocity-gap formula with Euler's equations in seismic
In this appendix, we motivate the reciprocity-gap misfit, constructed from the variational formulation of Problem (1). In the context of seismic, we assume that the data (i.e. pressure and normal velocity) are acquired on a line Σ which is slightly underneath the surface Γ 1 . The domain is decomposed into above and below area from the line of receivers, such that Ω = Ω + ∪Ω − , we illustrate in Figure 21.
x Figure 21: Illustration of the domain of interest for the Euler's equation (14), the line of measurements cuts the domain into Ω = Ω + ∪ Ω − which are colored in red and blue respectively. We also consider the sources (x f and x g ) to be located in Ω + .
Let us first rewrite Problem 1 introducing continuity conditions at Σ, using exponent + and − for the fields that are in Ω + and Ω − respectively, such that and similarly for the velocity. Problem 1 is equivalent to, We consider two couples (p 1 , v 1 ) and (p 2 , v 2 ) solutions to Problem 14. They are respectively associated to two sources, f and g, and two different sets of physical parameters: (κ 1 , ρ 1 ) and (κ 2 , ρ 2 ) respectively. We take p k ∈ H 1 (Ω) ad v k ∈ (H 1 (Ω)) 3 , for k = 1, 2, where H refers to the Hilbert space. For the derivation of the reciprocity-gap formula, we write the variational formulation on Ω − only. The variational formulation of (14) for (p 1 , v 1 ), using for the test functions (p 2 , v 2 ) gives where we omit the − exponent in the fields p and v for the sake of clarity. Furthermore we assume that the source f is supported outside Ω − , e.g., by using delta-Dirac functions in some position x f ∈ Ω + , according to Figure 21. We use an integration by part for both equations, and subtract the first one from the second one to get In the volume integral, we replace (∇ · v 2 ) and (∇p 2 ) using the fact that (p 2 , v 2 ) solves (14): Next, the integral on the boundary in (16) is decomposed between Γ − 2 = Γ 2 ∩ ∂Ω − and Σ such that On Γ − 2 , we first use (1a) to replace (v 1 · ν) and (v 2 · ν) and then the absorbing boundary condition (1d): Eventually, injecting the new formulas for the volume and integral equations in (16), we obtain The right-hand side coincides with our choice misfit functional for full reciprocity-gap waveform inversion cf. (8) (where we take the squared of the expression to have a positive functional), and it equates to zero if and only if ρ 1 = ρ 2 and κ 1 = κ 2 (meaning that c 1 = c 2 ) over the whole domain Ω − . We further refer to [2,1] for stability results.

B Gradient formulation using adjoint-state method
The gradient of the misfit functional is computed using the adjoint-state method, which comes from the work of Lions in optimal control, cf. [35], with early applications in [17]. The method is popular in seismic as it avoids the computation of the Jacobian matrix, thus requiring limited numerical effort; it is reviewed in [40]. For generality, we consider the misfit functional J • , which is either J L2 of J r , and define the constrained minimization problem, where A is the linear wave operator corresponding with the Euler's equations (1). We denote by f i the sources and use the notation z , p (f i ) } for the solution associated with the volume source F i = {0, 0, 0, f i }, in accordance with (1). For the sake of notation, we first consider a single source and drop the index f i , the formulation with Lagrangian gives L(m,Ũ ,γ) = J • + AŨ − F ,γ Ω .
Here, < . , . > Ω denotes the complex inner product in L 2 (Ω) such that < a, b > Ω = Ω a b dΩ, with the complex conjugate. The first step of the adjoint-state method is to takeŨ = U such that ∇ m L(m,Ũ = U,γ) = ∇ m J • .
Then, the adjoint-state γ is selected such that the derivative of the Lagrangian with respect to U equates zero, i.e., γ is solution to where * denotes the adjoint (transposed of the complex conjugate). With this choice of adjointstate, the gradient of the misfit functional (which coincides with the one of the Lagrangian from (23)) is We further refer to [8,Appendix A], and [25,9] for more details on the complex-variable adjointstate.
When we incorporate back the different sources, the adjoint problem (24) for the misfit functional J L2 is, for each source f i in the acquisition, Here we use ν to indicate the normal direction (in the case of a flat surface, i.e. x − y plane, ν x = ν y = 0 and ν z = 1). The gradient using all sources is For the misfit functional J r of (8), the adjoint-state solves, for each computational source g j , with the scalar ξ i,j given by The gradient is ∇ m J r = Re n sim src j=1 ∂ m A U (g j ) , γ (g j ) Ω .
We refer to [1] for more details on the adjoint-state method using Cauchy data. Therefore, we see that the adjoint-state for a single source in the computational acquisition for J r encodes information from all measurements because of the reciprocity-gap (the sum over i in (28)) while it only takes the current source with J L2 , see (26).