Least-squares diffraction imaging using shaping regularization by anisotropic smoothing

We have used least-squares migration to emphasize edge diffractions. The inverted forward-modeling operator is the chain of three operators: Kirchhoff modeling, azimuthal plane-wave destruction, and the path-summation integral filter. Azimuthal plane-wave destruction removes reflected energy without damaging edge-diffraction signatures. The path-summation integral guides the inversion toward probable diffraction locations. We combine sparsity constraints and anisotropic smoothing in the form of shaping regularization to highlight edge diffractions. Anisotropic smoothing enforces continuity along edges. Sparsity constraints emphasize diffractions perpendicular to edges and have a denoising effect. Synthetic and field data examples illustrate the effectiveness of the proposed approach in denoising and highlighting edge diffractions, such as channel edges and faults.

Consideration of diffraction phenomena in 3D (Keller, 1962;Klem-Musatov et al., 2008;Hoeber et al., 2010) requires taking into account edge diffractions. Due to lateral symmetry, they kinematically behave as reflections when observed along the edge and as diffractions when observed perpendicular to the edge (Moser, Manuscript received by the Editor 11 November 2019; revised manuscript received 3 April 2020; published ahead of production 21 June 2020; published online 11 September 2020. 1 2011). Thus, edge-diffraction signature is neither a "pure" reflection nor a "pure" diffraction but rather a combination of both; therefore, it requires a special processing procedure to be emphasized. Serfaty et al. (2017) separate reflections, tip and edge diffractions, and noise using principal component analysis and deep learning. Klokov et al. (2011) and Bona and Pevzner (2015) investigate 3D signatures of different types of diffractors. Alonaizi et al. (2013) and Merzlikin et al. (2017a) propose workflows to properly process energy diffracted on the edge. Keydar and Landa (2019) propose a method for edge-diffraction imaging based on the time-reversal principle and the stacking operator directly targeting edge diffractions. Dell et al. (2019a) extract edge diffraction responses from full-wavefield data by analyzing amplitude distribution along different azimuths on a 3D prestack time Kirchhoff migration stacking surface. Znak et al. (2019) develop a common-reflection-surface-based framework for distinguishing between point and edge diffractions and separating them from reflections.
Separation of reflections and diffractions can be done as a part of least-squares migration (Nemeth et al., 1999;Ronen and Liner, 2000). Harlan et al. (1984) pioneer in separating diffractions from noise by comparing the observed data and data modeled from a migration image in a least-squares sense.  perform least-squares migration chained with plane-wave destruction (PWD) and path-summation integral filtering and enforce sparsity in a diffraction model. Merzlikin et al. (2019) extend the approach and simultaneously decompose the input wavefield into reflections, diffractions, and noise. Decker et al. (2017) denoise diffractions by applying semblance-based weights estimated in the dip-angle gather domain. Yu et al. (2016) use common-offset Kirchhoff least-squares migration with a sparse model regularization to emphasize diffractions. Yu et al. (2017a) extract diffractions based on PWD and dictionary learning for sparse representation. Yu et al. (2017b) use two separate modeling operators for diffractions and reflections and impose a sparsity constraint on diffractions. Sparse inversion is an efficient tool to perform the extraction and denoising of diffractions because scatterers have spiky and intermittent distribution. However, a simple sparsity constraint does not account for the signature of the energy scattered on the edge, which is kinematically similar to a reflection when observed along the edge, and thus can distort it.
We combine sparsity constraints and structure-oriented smoothing in the form of shaping regularization (Fomel, 2007) to highlight edge diffractions and account for their signature. Structure-oriented smoothing performs smoothing along the edges emphasizing their continuity (Hale, 2009). Sparsity constraints imposed by thresholding in the model space force the model to describe the data with the fewest parameters and therefore denoise and emphasize edge-diffraction signatures observed perpendicular to the edge. Thus, we properly account for edge-diffraction kinematic behavior for the parallel and perpendicular to the edge directions.
For forward modeling, we use a chain of operators introduced by . We extend this workflow to 3D and modify the reflection destruction operator to account for an edge-diffraction signature by suppressing the reflected energy perpendicular to the edges. Edge orientations are determined through a PWD-based structure tensor (Merzlikin et al., 2017a). We start with a method introduction and then validate its performance on a synthetic, on a noisy marine field data set, and on a land field data set, by separating edge diffractions from reflections and noise.

Objective function
To solve for a seismic diffraction image m d , we extend the approach developed by where Jðm d Þ is the objective function, d PI ¼ PDd and d is "observed" data. Here, forward modeling corresponds to the chain of operators: 3D path-summation integral filter P Fomel, 2015, 2017), azimuthal plane-wave destruction (AzPWD) filter D (Merzlikin et al., , 2017b, and 3D Kirchhoff modeling L. The path-summation integral filter P can be treated as the probability of a diffraction at a certain location. AzPWD filter D removes reflected energy perpendicular to the edges. Therefore, AzPWD emphasizes the edge-diffraction signature, which when measured in the direction perpendicular to the edge exhibits a hyperbolic moveout and kinematically behaves as a reflection when observed along the edge. AzPWD application is the key distinction from the 2D version of the workflow , in which the PWD filter (Fomel, 2002) is applied along the timedistance plane Merzlikin et al., 2018). After weighting the data d PI ¼ PDd by path-summation integral P and AzPWD filter D, model fitting is constrained to the most probable diffraction locations.

Determination of edge-diffraction orientation
For the AzPWD workflow, Merzlikin et al. ( , 2017b show that a volume with the PWD filter applied in arbitrary direction x 0 corresponding to the azimuth θ can be generated as a linear combination of PWDs applied in the inline (D x ) and crossline (D y ) directions: D x 0 ¼ D x cos θ þ D y sin θ. Due to migration procedure linearity, the same relationship holds for the images of the corresponding PWD volumes. Azimuth θ should be perpendicular to the edge at each location to remove reflections but preserve edge-diffraction signatures, which are kinematically similar to reflections when observed along the edge.
This azimuth can be determined from the structure tensor (Van Vliet and Verbeek, 1995;Weickert, 1997;Fehmers and Höcker, 2003;Hale, 2009;Wu, 2017;Wu and Janson, 2017), which is defined as an outer product of migrated PWD filter volumes in the inline and crossline directions (Merzlikin et al., , 2017b: where hi denotes smoothing of structure-tensor components, which is done in the edge-preserving fashion (Liu et al., 2010). Smoothing stabilizes structure-tensor orientation determination in the presence of noise (Weickert, 1997;Fehmers and Höcker, 2003), whereas edge-preservation keeps information related to geologic discontinuities, which otherwise would be lost due to smearing. Here, p x and p y are the samples of inline-and cross-migrated PWD volumes (P x and P y ) at each location. The PWD filter can be treated as a derivative along the dominant local slope (Fomel, 2002;Fomel et al., 2007). Thus, a 2D PWD-based structure tensor (equation 2) effectively represents 3D structures without the need for the third dimension because the orientations are determined along the horizons.

S314
The edge orientation can be determined by an eigendecomposition of a structure tensor (Fehmers and Höcker, 2003;Hale, 2009) If a linear feature (edge) is encountered, eigenvector u corresponding to a larger eigenvalue λ u points in the direction perpendicular to the edge. Eigenvector v of a smaller eigenvalue λ v points along the edge. Thus, azimuth θ of a direction x 0 perpendicular to the edge can be computed from either u or v. If no linear features are observed, there is no preferred PWD direction.
The PWD-based tensor (equation 2) describes 3D structures. Its components (p x and p y ) are computed along the "structural frame" defined by the reflecting horizons. Thus, vectors u and v "span" the surfaces, which at each point are determined by the dominant local slopes. Eigenvectors of a PWD-based structure tensor (equation 2) are parallel to a reflection surface at each point.

Regularization
We use shaping regularization to constrain the model (Fomel, 2007). We penalize edge diffractions by two shaping operators: thresholding and smoothing by anisotropic diffusion. Iterative application of the first operator forces the model to describe the data with the fewest parameters possible (Daubechies et al., 2004) and therefore denoises and emphasizes edge diffraction signatures in the direction perpendicular to the edge. The second operator emphasizes the signatures of edge diffractions along the edge by enforcing their continuity.
Smoothing based on anisotropic diffusion enhances flow-like structures and completes interrupted lines (Weickert, 1998). The input is an unfiltered image, the smoothness of which is increasing while the diffusion is proceeding in time (Fehmers and Höcker, 2003). In a discrete form, anisotropic diffusion can be formulated as follows: where U k is an image at a diffusion time step k with Δt sampling interval, D im is a matrix operator combining PWDs in the inline and crossline directions in the image domain, and V is a matrix of eigenvectors v, which are parallel to the linear features' orientations along the dominant local slopes. Classic anisotropic diffusion partial differential equation (Weickert, 1998) requires 3D structure tensor and gradient. In equation 3, we use the PWD-based structure tensor instead of its 3D counterpart based on derivatives along the coordinate axes and a combination of PWDs in the inline and crossline directions in the image domain instead of a 3D gradient. After rearrangement of terms, equation 3 takes the form of a linear least-squares solution with the forward modeling operator corresponding to the identity matrix: where ϵ corresponds to the time step Δt. At diffusion time step k, we update the previous image U k−1 with N conjugate gradients iterations until the desired smoothness is achieved in the output image U k . The total number of diffusion time steps K, the number of conjugate gradients iterations N, and parameter ε control the smoothness of the final result. Figure 1 shows the anisotropic smoothing operator action on a zigzag pattern contaminated with Gaussian noise. The isotropic smoothing operator (for details, see, e.g., Weickert, 1998) action is equal for all of the directions and thus results in sharpness loss incurred by smoothing across the edges ( Figure 1c). Figure 1d shows that the anisotropic smoothing operator (equation 4) preserves edges by smoothing along them and thus results in S/N enhancement and sharpness of the image. Flow-like coherent noise patterns in Figure 1d are induced by using the "true" azimuths of edges across the whole image including the signal and noise regions. The result of combining both regularization operatorsthresholding and anisotropic smoothing -is shown in Figure 1f: Noise and flow-like artifacts are suppressed in regions with no signal. Thus, the combination of thresholding and anisotropic smoothing is required for proper edge diffraction regularization.
In the following examples, edge orientation estimation is done for each location individually based on a structure tensor.

Optimization
For inversion, we adopt a conjugate-gradient scheme : where H ϵ;N;K and T λ are the anisotropic-smoothing and thresholding operators, −∇Jðm j d Þ is the gradient at iteration j, s j is a conjugate direction, α j is an update step length, and β j is designed to guarantee that s j and s j−1 are conjugate. After several internal iterations j of the conjugate gradient algorithm, we generate m j d , to which we apply thresholding to drop samples corresponding to noise with values lower than the threshold λ, and which we then smooth along the edges by applying anisotropic smoothing operator H ϵ;N;K . The outer model shaping iterations are denoted by i.
The inversion results also depend on the numbers of inner and outer iterations: Their trade-off determines how often shaping regularization is applied and therefore controls its strength. Regularization by early stopping also can be conducted. The optimization strategy with H ϵ;N;K removed corresponds to the iterative thresholding approach (Daubechies et al., 2004).

Workflow
The workflow takes stacked data as the input. To generate all of the inputs necessary for the inversion, we propose the following sequence of procedures: 1) Estimate the inline and crossline dips describing dominant local slopes associated with reflections in the stack. 2) Perform PWD filtering on the stack in the inline and crossline directions. 3) Migrate the corresponding volumes. 4) Combine the migrated volumes in a structure tensor (equation 2). 5) Smooth structure-tensor components along structures with edge preservation. 6) Perform eigendecomposition of a structure tensor and determine the orientations of the edges. 7) Apply AzPWD and the path-summation integral to the stacked data. 8) Apply conventional full-wavefield migration to the data set stack (the same as the input to step 1); estimate dips and generate PWD volumes in the inline and crossline directions in the image domain (D im in equation 3) for anisotropic smoothing regularization.
The sequence of procedures with their corresponding inputs and outputs is shown in Figure 2. In the first step, dips are estimated using PWD  and two volumes one for the inline dips and another for the crossline dips, are produced and further used for reflection removal in step 2. In the second step, based on the two input dip volumes, reflections are predicted and suppressed: Figure 2. Workflow chart illustrating the sequence of procedures and the relation of their corresponding inputs and outputs. Here, D x and D y correspond to inline and crossline PWD volumes of the input stack, P x and P y correspond to inline and crossline PWD volumes after migration, ⊙ corresponds to the Hadamard (elementwise) matrix product, hi corresponds to the edge-preserved smoothing, Θ corresponds to the volume of edge diffraction azimuths, PDL corresponds to the forward modeling operator corresponding to the chain of the path-summation integral, AzPWD, and Kirchhoff modeling operators respectively; d PI ¼ PDd, where d corresponds to the input stack and H ϵ and T λ are the anisotropic-smoothing and thresholding operators. The term m d is the edge diffractivity that we invert for: m j d describes the model updates from internal iterations minimizing the misfit, and m i d is the result of regularization applied during external iterations. Notice that in step 8, the inline and crossline PWDs are computed in the image domain (D im in equation 3) and then are used instead of the Cartesian derivatives in the anisotropic smoothing operator H ϵ (equation 3). The edge diffraction azimuths Θ are used in the AzPWD operator D (step 6) and in the anisotropic smoothing operator H ϵ to prevent smearing across the edges.

S316
Two outputs are generated, which correspond to the PWD filter application in the inline (D x ) and in crossline (D y ) directions using the corresponding dip distributions. These two volumes with reflections removed then are migrated (step 3) using conventional full-wavefield migration, for example, 3D poststack Kirchhoff migration. For step 4, instead of explicitly computing the structure tensor for each data sample according to equation 2, volumes for each of its components can be precomputed. The term p x p x structure-tensor component volume can be generated by the Hadamard product between the migrated inline PWD volume (P x ) and itself, the p y p y -component volumeby the Hadamard product between the migrated crossline PWD volume (P y ) and itself, and the p x p y -component volumeby the Hadamard product between migrated inline PWD volume (P x ) and the migrated crossline PWD volume (P y ). Then, in step 5, the p x p x , p y p y , and p x p y structure-tensor component volumes are input to edge-preserving smoothing, which outputs the hp x p x i, hp y p y i, and hp y p y i volumes. In step 6, structure-tensor (equation 2) eigendecomposition is performed "on the fly" by combining structure-tensor component values from the hp x p x i, hp y p y i, and hp y p y i volumes for each data sample. The result is the smaller eigenvalue eigenvector volume, which then is converted to edge-diffraction orientations Θ.
Step 7 gives the data to be fit by the inversion (equation 1). Orientations of structures for AzPWD and for anisotropic smoothing regularization are estimated in step 6. In anisotropic diffusion instead of derivatives in Cartesian coordinates, we use PWDs in the inline and crossline directions in the image domain (D im in equation 3)the dip estimation for which should be performed on a "conventional" image of a full-wavefield stack (step 8). Then, we invert the data for edge diffractions.

SYNTHETIC DATA EXAMPLE
We test the approach on a synthetic data example with a reflectivity model shown in Figure 3. The synthetic seismic data are generated by the Kirchhoff-modeling method with a 15 Hz peakfrequency Ricker wavelet, the reflectivity distribution shown in Figure 3, and a 2.0 km/s constant velocity model, which is further used in the migration and the inversion. Zero-offset geometry with 1 m sampling in the inline and crossline directions is used. The synthetic data are shown in Figure 4a. Random noise with a maximum amplitude of 30% of that of the signal, filtered to the signal frequency band, is added to the synthetic. After 3D Kirchhoff migration, diffractions become focused and a channel-like structure with a zigzag pattern becomes prominent (Figure 4b). However, channel edges appear to be somewhat blurred and masked by the reflections. We follow the workflow described above. The data to be fit by the inversion are shown in Figure 5a. We use the following parameters for the inversion: λ ¼ 100, ϵ ¼ 10, N ¼ 5, and K ¼ 1. In total, we use 20 iterations: five inner by four outer. The inversion result is shown in Figure 5b. The edges are highlighted and denoised.

FIELD DATA EXAMPLE 1
We test the capabilities of the proposed approach on the field data set acquired with a high-resolution 3D (HR3D) marine seismic acquisition system (P-cable) in the Gulf of Mexico to characterize the structure and stratigraphy of the shallow subsurface (Meckel and Mulcahy, 2016;Klokov et al., 2017b;Merzlikin et al., 2017b;Greer and Fomel, 2018). The acquisition geometry is defined by a "cross cable" of a catenary shape with paravans (diverters) at the ends oriented perpendicular to the inline (transit) direction that tows twelve 25 m long streamers with separation of 10-15 m. Source-receiver offsets are on the order of 100-150 m, whereas the source sampling is 6.25 m. The target interval of interest in this paper is associated with the 0.222 s time slice, which approximately corresponds to 160 m in depth. Diffraction imaging is applied to the data set stack preprocessed with the following procedures: 40 Hz low-pass filter, wavelet deconvolution, surface-consistent amplitude corrections, predictive deconvolution, missing trace interpolation, and acquisition footprint elimination using f-k filtering in the image domain. More detailed description of the acquisition geometry, geology, and interpretation of diffraction images can be found in Klokov et al. (2017b) and Merzlikin et al. (2017b). Here, we focus on the interval with a channel of high-wavelength sinuosity (Merzlikin et al., 2017b). For simplicity, a constant velocity model of 1.5 km/s is used for the migration and the inversion. Higher accuracy diffractionbased velocity estimation for the same data set is described in Merzlikin et al. (2017b).
The stacked volume is shown in Figure 6. The poststack 3D Kirchhoff time migration image of the target slice is shown in Figure 7a. Channel delineation is hindered by stronger reflections and acquisition footprinthorizontal lines. We eliminate the footprint in the inline and crossline PWD diffraction images using f-k filtering before combining them in a structure tensor (equation 2) and forming the AzPWD linear combination (Merzlikin et al., 2017b). The result of AzPWD volume migration is shown in Figure 7b. The channel and other small-scale features (e.g., the fault at inlines 2.5-3.0 km and crosslines 6.0-6.4 km) are highlighted, but the image still is noisy.
We follow the proposed workflow and first generate the data to be fit by the inversion (Figure 8a) by applying AzPWD and the pathsummation integral migration to the stack ( Figure 6). We run five outer and two inner iterations and use λ ¼ 0.008, ϵ ¼ 20, N ¼ 30, and K ¼ 1. Due to the low S/N of the data set, a small number of inner iterations is used to prevent leakage of noise to the diffraction image domain. The result of the proposed approach is shown in Figure 8b. The channel appears to be highlighted and denoised. Most of the low-amplitude events also are preserved and highlighted including the fault feature (inlines 2.5-3.0 km and crosslines 6.0-6.4 km). The discontinuities of edges intersecting each other at inlines 1.0-1.5 km and crosslines 4.0−5.0 km are caused by their rapidly varying orientations prohibiting smoothing from emphasizing different strikes simultaneously.  preconditioned by AzPWD and path-summation integral migration and (b) inversion result (strong smoothing is used to highlight the continuity of the edge diffractions). A significant improvement in the S/N of the edge diffractions has been achieved (compare to Figure 7b). Figure 9a and 9b shows the diffractivity model after thresholding and after thresholding followed by anisotropic smoothing applied correspondingly. The figures illustrate that anisotropic smoothing merges neighbor samples and thus is necessary for edge-diffraction regularization. The result is consistent with the experiments shown in Figure 1. Figure 10 shows the stack after reflection elimination, modeled diffractions from denoised diffractivity shown in Figure 8b, and their difference. Denoised diffractions in Figure 10b show clear hyperbolic signatures. Reflection energy remainders after the AzPWD application prominent in Figure 10a (e.g., 0.222 s two-way traveltime (TWT), inline 1.625 km and crosslines 5.5-6.25 km and 0.222 s TWT, inlines 2.4-2.7 km and crosslines 6.25-6.5 km) also are removed. The difference in Figure 10c is predominated by noise, reflection remainders (e.g., regions mentioned above), and the acquisition footprint and proves the effectiveness of the approach.
To account for amplitude differences and for the higher continuity of denoised diffractions (Figure 10b) in comparison to the stack with AzPWD (Figure 10a), signal and noise orthogonolization (Chen and Fomel, 2015) has been applied. The similarity between the restored signal ( Figure 10b) and the restored noise (Figure 10c) is measured. Then, the signal energy, which leaked to the difference volume ( Figure 10c) and which has high similarity with the signal events actually predicted (Figure 10b), is withdrawn from the noise volume and added to the signal estimate.

FIELD DATA EXAMPLE 2
The second field data example comes from the Cooper Basin onshore Western Australia. The data set corresponds to stacked (with 25 × 25 m bin size) preprocessed data acquired as a 3D land seismic survey with fully azimuthal distribution of offsets and a far offset of 4000 m. The preprocessing sequence includes noise attenuation, near-surface static corrections, despike, surface-consistent deconvolution, Q-compensation whitening, and time-variant filter applied after stacking. The target horizon slice picked by the interpreter and used for the performance evaluation of the method corresponds to the interface between tight-gas sand and coals (Tyiasning et al., 2016). The depth of the interface is approximately 2500 m. The structural complexity of the overburden can be characterized as high. A detailed geologic description of the area as well as a comparison of diffraction imaging results to discontinuity-type attributes is given by Tyiasning et al. (2016). Here, we apply the developed approach to the data set. The prestack time migration velocity is used for full-wavefield migration and the proposed inversion approach for edge-diffraction imaging. Figure 11 shows a stacked volume of the data set. We focus on the window around the target horizon, which on average corresponds to 1.72 s TWT. Figure 12a shows a conventional image of the target horizon slice generated with 3D poststack time Kirchhoff migration. We then apply the reflection removal procedure to the stack (Figure 11), migrate it, and generate the image shown in Figure 12b. Smaller scale features become highlighted, for example, faults between 8-10 km inlines and 0-4 km crosslines. The acquisition footprint is noticeable on both of the images ( Figure 12) and corresponds to the low-amplitude events aligned in a grid-like fashion (easy to notice between inlines 0-2 km and crosslines 6-8 km). We apply the developed workflow to further highlight and denoise the diffractions.
First, we generate the data to be fit by the inversionthe stack weighted by a path-summation integral after reflection elimination (Figure 13a). We run 20 outer and 2 inner iterations and use λ ¼ 4, ϵ ¼ 10, N ¼ 20, and K ¼ 1. As in the previous data example, we use a low number of internal iterations to avoid noise fitting in general and footprint in particular. The inversion result along the target horizon is shown in Figure 13b. The 3D cubes of the migrated stack, the migrated stack after reflection elimination, and the inversion result are shown in Figures 14, 15, and 16. Edges masked by the specular energy on the full-wavefield image ( Figure 14) are highlighted on the image after reflection elimination (Figure 15). At the same time, Figure 9. Last iteration diffractivity model: (a) after thresholding and (b) after thresholding followed by anisotropic smoothing. Although thresholding allows for denoising, anisotropic smoothing emphasizes edge diffraction continuity and accounts for its kinematic behavior when observed along the edge. Figure 10. (a) Stack with the reflections removed, (b) Kirchhoff modeling of diffractivity from Figure 9b, and (c) the difference between (a and b) is predominated by noise and, in particular, by reflection remainders (notice the coherent events with slowly varying amplitudes, e.g., 0.222 s TWT, inline 1.625 km and crosslines 5.5-6.25 km and 0.222 s TWT, inlines 2.4-2.7 km and crosslines 6.25-6.5 km) and the acquisition footprint. edge diffractions produced by the inversion (Figure 16) appear to be clearer and have a higher S/N. Further, we apply Kirchhoff modeling to the inversion result (Figures 13b and 16), restore diffractions (Figure 17), and subtract the result from the stack after reflection elimination (Figure 18). The result of the subtraction shown in Figure 19 can be treated as noise eliminated during the inversion. Signal and noise orthogonalization (Chen and Fomel, 2015) is applied to account for aperture difference between the observed and restored diffractions (for instance, often only a single diffraction flank can be observed in the data leading to spuriously high difference or the "noise estimate") and to bring back some of the energy accidentally leaked to the noise domain.

a) b) c)
The acquisition footprint evident in Figure 19 suggests the high performance of the method. Improvement can be noticed between the 6-10 km inlines and 2-4 km crosslines associated  Figure 11 and (b) 3D poststack Kirchhoff migration of the stack after AzPWD. Subsurface discontinuities appear to be highlighted in (b) in comparison to (a), in which they are masked by specular energy. At the same time, the diffraction image in (b) still has some noise present. with the extraction of small-scale discontinuities. Compare the inversion result in Figure 13b and the result of migration shown in Figure 12b. The same subtle events are located at crossline 3.25 km between inlines 7-9 km (Figures 17  and 18), and they have a clear hyperbolic shape (Figure 17), which appears when edge diffractions are observed perpendicular to edges. The good restoration quality also can be inferred from the "circular" structure between the 6-8 km inlines and 0-2 km crosslines. Both of these areas have a high similarity between the initial diffraction stack ( Figure 18) and the stack with diffractions "restored" and denoised by inversion (Figure 17) leading to low amplitudes in the difference section ( Figure 19) primarily associated with noise. These features also follow "frowning"-focusing-"smiling" behavior under migration velocity perturbation supporting their diffraction nature in the plane perpendicular to the edge.
Some features are lost in the area between the 10-11 km inlines and 6-8 km crosslines. The event between inlines 12-14 km and crosslines 6-8 km is not predicted (Figure 19). High-magnitude events observed between inlines 1-3 km at crossline 4-5 km and between crosslines 4-5 km at inline 5.35 km (Figures 16 and 17) can be associated with reflections leaked to the diffraction image domain or actually can be edge diffractions observed in the plane not perpendicular to the edge and thus having an elongated signature. The event located at crossline 3.25 km between inlines 5-7 km inversion is interpreted as a reflection and is removed from the diffraction imaging result (Figures 16 and 17). High-difference values following the channel in the difference cube (Figure 19) can also be associated with the reflection remainders removed from the diffraction image domain.
It should be mentioned that the target horizon has a highly oscillating pattern including a highmagnitude drop in the left part of the cube (inlines 1-3 km) making the dip estimation for "ideal" reflection-diffraction separation simultaneously for the whole area challenging. More careful dip estimation possibly with different parameters in different regions of the data set should further improve the result. Inversion parameters definitely can be tweaked to improve the results, but even with these trial values most of the edges including some subtle features (e.g., events at crossline 3.25 km between inlines 7-9 km; Figure 16) hardly are noticeable on the migrated stack after reflection elimination (Figure 15) has been extracted and highlighted.

DISCUSSION
The dependency of the workflow's ability to produce sharp images of edge diffractions upon Figure 16. Inversion result. Edge diffractions are highlighted and denoised (compare to Figure 15).  the migration velocity accuracy requires further investigation. On one hand, the approach incorporates the least-squares migration framework known to be quite sensitive toward the velocity model (Nemeth et al., 1999). On the other hand, the path-summation integral provides a velocitymodel-independent weighting of the misfit, which is expected to increase the method's tolerance to velocity model errors.
The second field example illustrates the efficiency of the proposed approach in a complex geologic environment. Most edge diffractions, and especially those associated with major discontinuities, are extracted and denoised. Some reflection energy remainders are present but are limited to locations characterized by a peculiar reflection pattern, which was not picked up by the dip estimation tuned to perform reflectiondiffraction separation over the whole area. This is the dip estimation problem, the results of which can be improved, for instance, by subdividing the area into smaller fragments, and, thus, it does not question the validity of the approach presented. The latter statement also is supported by the high performance of the developed approach when applied to the first field data example. The first field data example is characterized by a low S/N and is highly contaminated by the acquisition footprint. The approach allows successfully untangling edge diffractions from reflections and noise and provides high-resolution images of subsurface discontinuities. We expect that in a production-like environment, geologic knowledge can be used to further adjust the parameter values. For instance, in the first field data example, the expected channel sinuosity could guide the smoothing strength for the edges. In this paper, we focus on highlighting the advantages of the method rather than on delivering the final results for drilling decisions.
The natural extension of the approach is to include reflection modeling into the inversion. As demonstrated by Merzlikin et al. (2019), the reflections and diffractions can be inverted for by the same forward modeling operator whereas the separation into the components can be done on the regularization level. Regularization of diffractions can stay the same, whereas reflections, for instance, can be penalized by a strong isotropic smoothing operator along the dominant local slopes: Specular events locally do not exhibit lateral symmetry as opposed to edges. Extension of the model space to include reflections can help to eliminate the reflection remainders in the diffraction image domain.
The high complexity of the overburden often leads to interference between seismic events, which results in the presence of multiple dominant local slopes at a single data sample. Although, in   Figures 11 and 17). Reflection remainders and noise including the acquisition footprint are apparent (compare to Figure 17). Figure 19. The difference between Figures 17 and 18 is predominated by noise and thus supports the validity of the denoised edge diffractions (Figure 17). this case, the effectiveness of the proposed inversion scheme in general and of AzPWD in particular will be degraded, the performance could be improved by preapplying migration with an approximate velocity model to untangle interfering events, running AzPWD, and then going back to the original data domain.
Anisotropic smoothing is capable of emphasizing one edge direction at once. Edges with conflicting orientations can be a challenge. The "brute force" way to tackle the challenge can be scanning for various edge-diffraction orientations and picking the desired ones . Then, inversion results with alternative orientations can be compared. At the same time, if coherent noise with a consistent spatial orientation is present in the data, it can be emphasized by anisotropic smoothing. Structure-tensor orientation determination will treat this noise as signal. Poor illumination and velocity model errors also can reduce the accuracy of structure-tensor-based edge-diffraction orientation determination. The latter will degrade the performance of the anisotropic-smoothing regularization operator. We expect that the problem can be alleviated by using a priori information about geologic discontinuities' orientation, for example, by using the predominant azimuths of the faults in the region extracted from a geomechanical model.
The workflow described in this paper is a 3D extension of the method proposed by . In 2D, one cannot discriminate between point and edge diffractions. The distribution of scatterers is spiky and intermittent, which leads to a natural choice of sparsity constraints on the diffractivity model. The new inversion scheme is based on two regularization operators: thresholding and anisotropic smoothing. Although the thresholding operator imposing sparsity constraints applicable to point and edge diffractions remains the same as in the 2D counterpart of the workflow, anisotropic smoothing enforces continuity along the directions picked up by the structure tensor and thus is applicable only to edge diffractions. Point diffractions, which are not elongated in space, will be smeared under anisotropic-smoothing operator action. Currently, our method is biased toward edge diffractions.
Anisotropic smoothing can have spatially variable diffusion coefficient defining its strength (Weickert, 1998;Hale, 2009). For instance, the coefficient and the direction of smoothing can depend on the linearity, which can be computed as a ratio of eigenvalues of the PWD-based structure tensor and which can distinguish between the edges and regions of continuous amplitude variation (Hale, 2009;Wu, 2017). The spatially variable diffusion coefficient could help alleviate smearing of point diffractions. Point diffractions in the diffractivity model will exhibit low linearity. For these samples, the smoothing power can be reduced and thresholding will be a predominant regularization operator.
The proposed approach is equivalent to total variation (TV) regularization (Strong and Chan, 2003), in which minimizing the l 1 norm of a second derivative penalizes the model. The Hessian of TV is guided by a structure tensor, which forces model smoothing to be applied along the edges with no smearing across them. Thus, TV regularization is similar to the one proposed in this paper and also can be used to penalize edge diffractions. TV implementation challenges are associated with regularization term differentiation during optimization. Although this obstacle can be accommodated by using sophisticated optimization methods and representing l 1 norm as a square root with a damper (Chan et al., 1999;Burstedde and Ghattas, 2009;Anagaw and Sacchi, 2012), shaping regulariza-tion by anisotropic diffusion appears to be a viable, simple-to-implement, and fast-to-converge alternative with no approximations required. Anisotropic smoothing can be used to regularize fullwavefield images in "conventional" least-squares migration and even in iterative velocity-model building methods.
The workflow can be used to extract and denoise diffractions for their subsequent depth imaging. Alternatively, a depth imaging operator can replace Kirchhoff time migration in forward modeling to allow for depth-domain model conditioning, whereas misfit weighting by the path-summation integral still is performed in the timemigration domain.
The inversion can be extended to the prestack domain. In this case, prestack counterparts of the chained forward modeling operators should be used: the prestack path-summation integral , prestack migration engine, and prestack AzPWD. Although expressions for the former two exist, AzPWD has not been applied in the prestack domain. The corresponding method can be derived based on the approach developed by Taner et al. (2006).

CONCLUSION
We have developed an efficient approach to highlight and denoise edge diffractions based on least-squares migration. The inverted operator corresponds to the chain of path-summation integral filter, AzPWD, and Kirchhoff modeling operators. Although the combination of the path-summation integral filter and AzPWD emphasizes edge diffraction signatures in the data domain, thresholding and anisotropic smoothing precondition them in the model domain by denoising and enhancing their continuity. Forward modeling and shaping regularization operators guide the inversion toward restoration of edge diffractions. Synthetic and field data examples show the high fidelity of the approach.
The efficiency of the proposed inversion scheme comes from the workflow application in the time poststack domain and the shaping regularization framework leading to fast convergence. The inversion scheme that we propose can be thought of as an effective operator directly tailoring edge diffractions and extracting them from the full wavefield.

ACKNOWLEDGMENTS
We are grateful to the TCCS sponsors and in particular to Equinor (formerly Statoil) for financial support of this research. We thank E. Landa, A. Bona, T. Meckel, O. Ghattas, and L. Decker for the inspiring discussions. We thank the developers of and contributors to the Madagascar open-source software project. We thank the four anonymous reviewers, the associate editors, and J. Shragge for their valuable comments and help in improving the manuscript. The prestack time migration velocity and preprocessed stack are provided by GeoFrac research consortium at the Australian School of Petroleum at the University of Adelaide.

DATA AND MATERIALS AVAILABILITY
Data associated with this research are available. Results can be reproduced using Madagascar seismic processing software.