A Probabilistic Tremor Location Method

A method to locate tremor with an unclear onset is introduced. The method maps envelopes of cross‐correlated records from pairs of seismographs to signal likelihoods using Bayes' theorem. The likelihood time series are then back projected to obtain likelihoods of source location in space. Assuming that information derived from different station pairs is independent, the joint likelihood for all station pairs is the product of all individual station pair likelihoods. Its peak and spread give the most probable source location and its uncertainty. Robustness of the method has been evaluated with synthetic tests. The method recovers true location within 0.5 km with realistic noise characteristics in synthetic data. Uncertainty estimates are consistent with location deviations for sources inside the seismic network. When applied to real data from Katla and Eyjafjallajökull volcanoes in southern Iceland, their likely tremor source is also recovered within 1 km.


Introduction
Source location is one of the main and usually the first task when analyzing seismic waves from a source. Many sources, such as tectonic earthquakes, produce impulsive phases with clear onset times in seismograms and can, therefore, be located using standard earthquake location methods (Lomax et al., 2009). However, some earth processes, especially in volcanic areas, produce complex signals without a clear onset. We cannot pick the arrival times of waves from such sources to locate them. Instead, alternative methods, such as amplitude decay and array methods, have been used. The amplitude decay method (Battaglia & Aki, 2003) involves simplistic assumptions about geometrical spreading and attenuation of seismic waves, which may not be satisfied in highly heterogeneous volcanic regions . Array methods, for example, beam forming (Rost & Thomas, 2002) and frequency wave number analysis (Capon, 1973), require dense sampling of the wave field in order to obtain reliable measurements of slowness and direction of the incoming seismic waves. Combining observations from several arrays may pinpoint a location of the source. However, such dense deployments are not always possible and practical.
The back projection of interstation cross correlations has recently been proposed for locating sources without clear onsets (e.g., Haney, 2010Haney, , 2014Zhang et al., 2010). Numerous authors have applied methods based on this concept in various geological settings, such as at volcanoes (Ballmer et al., 2013;Droznin et al., 2015;Haney, 2010Haney, , 2014Soubestre et al., 2019) and in geothermal areas (Gudmundsson & Brandsdóttir, 2010). The cross correlation between two stations 10.1029/2019GL085538 is back projected from time to space assuming a velocity model. To stabilize the location estimate, some integral attribute of the correlation is often used instead of the correlation itself because it contains rapid oscillations, for example, the correlation envelope. If the velocity model is homogeneous, a peak in the correlation (or its envelope), which corresponds to the differential travel time between the stations, will distribute along a hyperbola (in two dimensions) or a hyperboloid (in three dimensions). For this reason, a single correlation does not suffice to locate a source. At least two (for two dimensions) or three (for three dimensions) correlations are required for a unique source location. The straightforward way to combine the correlation attributes is to back project and sum them (Haney, 2010). , however, proposed a double-correlation method in which correlation of two correlations from two station pairs is back projected for source location.  argued that correlated noise, present in the correlations as individual peaks, is better suppressed with double correlation than single correlation. Since noise is better suppressed in the double, compared to single correlations,  further proposed to multiply the correlation envelopes, following a reference station scheme. This way of combining information suppresses correlated noise in the correlations significantly better and therefore yields a more focused location estimate, even in volcanic environments with strong scattering .
The success of these methods lies in the fact that multiplication better suppresses noise that is inconsistent with a single source location compared to summation. Following this line of argument, one may suggest to use the product of all available correlation envelopes for source location. This approach can be justified by considering that the envelope of a cross-correlation function is related to a probability density function of a source location, since wherever the amplitude of the back-projected correlation envelope is high, a source location is likely and vice versa.
In this paper, we propose a novel method which transforms the envelope (or other attribute) of a cross -correlation function to a probability density function of time shift using a mapping derived from Bayes' theorem. The mapped time series of probability density is then back projected to obtain a map of probability density of source location in space. This map describes the likelihood of source location given the data in the correlation of a single station pair. Assuming that correlations are independently derived from multiple seismograms, the back projected maps are independent of each other. The joint probability density of all likelihood functions is then the product of the individual maps. Therefore, the maps are multiplied to obtain the joint probability density of the source location given data from all correlations. The back projection of the time series of probability density is, however, not homogeneous everywhere in space. Each time delay maps to a hyperbola and the density of the mapping varies along the hyperbola. This will be accounted for by applying a scaling factor to the likelihood function.
Since the final product is a joint probability density of the source location, the location uncertainty can be estimated directly from the spread of its distribution. Note that the location uncertainty can generally not be directly estimated in other methods using stacks of single correlations. This estimate represents an ideal situation where we have full knowledge of the velocity model of the study area. This is, however, not the case in reality. To account for the uncertainty in the velocity model, we define a stochastic process describing the uncertainty and variation of the velocity and compute the variation of predicted time for all combinations of differential paths. A Gaussian distribution of spatially varying width corresponding to that variation is then convolved with the individual back-projected likelihood function.

Method
Our tremor location method consists of four main steps: (1) mapping an envelope of a cross-correlation function to a probability density as a function of time shift; (2) back projecting the time series of temporal probability density into space using a predefined velocity model; (3) scaling the back-projected map of probability density with a factor accounting for the heterogeneity of the back-projection density and convolving with a two-dimensional Gaussian probability density function which accounts for the uncertainty of the back projection due to an inaccurate velocity model; and (4) multiplying all individual likelihood distributions to construct the joint likelihood distribution. The following subsections describe each of these steps.

Mapping Cross Correlations to Probability Densities
To map the envelope of a cross-correlation function to a probability density as a function of time shift, we need a mapping, which can be derived using Bayes' theorem. We assume that our time series of a correlation envelope (t) consists of the sum of signal s(t) and noise n(t), that is, (1) We further assume that the noise process is continuous, characterized by a probability density P n (n) and that the signal is described by two random processes, one describing its distribution P s (s), the other a binary process S describing its occurrence (S = 1 or S = 0), with a probability P S (S = 1) = 1 − P S (S = 0). S = 1 when a signal is present and S = 0 when no signal is present. Then, according to Bayes' theorem, where the notation |x denotes the probability being conditioned by x. P S (S = 1| ) is the probability that a signal is present for any given value of . This is what we want to describe. Now, assume that we have no prior knowledge about when a signal might occur. Then, it is equally likely at all times, that is, P S (S = 1) = constant, and Also, assuming that P S (S = 1) is small, P ( ) is approximately equal to P n ( ) and when a signal is present, = s + n. Therefore, at those points in time when S = 1, where * stands for convolution. Therefore, If we know nothing about what values the signal may take, but assume that they are one sided (positive) it is natural to choose P s to be a Heaviside function H(s), where Then, Thus, we can transform the signal (t) into a scaled probability density at any given point in time if we know P n (n). We can estimate P n (n) from a histogram of the values of the time series. This requires parametric simplification. We have estimated P n (n) by fitting a function of the form to the histogram. This empirical parameterization works well for our correlation envelope data and synthetic examples. The parameter a defines the width of the distribution, and the parameter k defines the rate of decay of its tail.

Back Projection
The back projection of the time series of probability density is done in the following way. First, we construct a geographical grid for our study area. For each grid point, we calculate the differential travel time of a station pair assuming a velocity model. The value that corresponds to the differential travel time is then fetched from the time series of probability density for that station pair. The procedure is repeated for all grid points. This yields a back-projected map of probability density (or likelihood) of source location due to a single correlation from a station pair. Assuming that the correlograms are independent renditions of differential time from the source(s), a joint likelihood based on all correlation data can be constructed by the product of all the individual likelihood functions. This is done after the correction factors discussed in sections 2.3 and 2.4 are accounted for in each individual likelihood function.

Accounting for Back-Projection Heterogeneity
The back projection described in the previous section is neither homogeneous nor unique. Each time delay maps to a hyperbolic curve passing through the source in two-dimensional space and the density of that mapping varies along the hyperbola. The further away from the station pair, the lower the density is. In order to interpret the back-projected, temporal probability density as a spatial probability density, one must account for the heterogeneity of this mapping. This can be done by multiplying the back-projected probability density by a scaling factor such that the probability density integrated over an infinitesimal time lag range before back projection is equal to the probability density integrated over the spatial range corresponding to that time-lag range after back projection. This scaling factor is similar to the normal moveout (NMO) stretch in reflection-seismic processing (Yilmaz, 2001), but manifests itself as spatial stretching and scaling in the back projection as opposed to temporal stretching in the vertical-incidence record section. The scaling factor is given by where Δx is the station separation, S is the distance between the source and the midpoint of the station pair, and D is the differential distance between the two stations. A detailed derivation of this scaling factor can be found in the supporting information. This scaling factor is applied at each grid point. Note, that this expression assumes homogeneous velocity. If velocity heterogeneity is present, that will distort the hyperbola and this mapping density.

Accounting for Back-Projection Uncertainty
Apart from its heterogeneity and non-uniqueness, the back projection itself is subject to uncertainty, that is, the uncertainty of the relationship between time and space, or velocity. This can be characterized by defining a stochastic process describing the uncertainty and variation of the velocity, computing the variation of predicted time for a given set of differential paths and convolving the likelihood function of source location with a distribution of width corresponding to that variation.
Assume that the medium slowness can be described by where u ≪ u 0 . u 0 is the mean slowness and u is a slowness perturbation as a function of spatial location x. If the perturbation is described as a stationary random process and the expectation of u is zero, the covariance Γ u is only a function of the distance between two points in space, not the locations themselves, that is, where ||x −x || is the length of x −x and 2 u is the variance of the slowness perturbations. The differential travel time for a station pair is given by where t 1 and t 2 are the travel times along the corresponding paths l 1 and l 2 , respectively. = Δx∕(2S) is a dimensionless factor, which only depends on the station separation Δx and the distance between the source and the midpoint of the station pair S. Our objective is to find an expression for the expectation and the  (8) to this empirical distribution. (c) Time series of temporal probability density after the translation using equation (7).
variance of t. Since the expectation of u is zero, it follows that the expectation of t is also zero. The variance of t is defined as since E{ t} = 0, where E{ t} denotes the expectation of t. Substituting equation (12) into equation (13), we have 2 Assuming that (u) = exp[−u 2 ∕(2a 2 )], that is, the random medium is described by a Gaussian autocorrelation function, equation (14) becomes Equation (15) gives the variance of differential travel time given a specific random medium and station geometry. Specifically, velocity perturbations are assumed to be small, their distribution is assumed to be stationary, Gaussian, and randomly sampled. This estimate of variance in time is then translated into an estimate of location uncertainty using the predefined velocity. To account for this uncertainty, for each grid point, the back-projected map is convolved with a Gaussian distribution having a width corresponding to that given by equation (15).

An Example
We demonstrate the steps of the method with an example in Figure 1. It shows an example correlation envelope as a function of time lag (Figure 1a) and the empirical distribution of its values (blue solid curve in Figure 1b). This distribution is fitted with a parametric function as in equation (8) (red dashed curve in Figure 1b). This allows us to translate the time series into the temporal probability density (Figure 1c) according to equation (7). Finally, the temporal probability density is back projected to two-dimensional space with the corrections described in equations (9) and (15).

Postlude
A major advantage of the proposed method is that each back-projected map represents the likelihood function of source location due to a single correlation. The product of all back-projected maps represents the joint-likelihood function due to all correlations. In other words, the peaks in the joint, back-projected map

10.1029/2019GL085538
give the most likely locations of sources given all data and the widths of the peaks provide information about the location uncertainty. Therefore, this method can be regarded as an inference method, as opposed to existing back-projection methods in the literature that are based on stacked back projections and therefore more akin to imaging methods. However, the mapping of an attribute of correlation, in this case its envelope, into likelihood is subject to simplifying assumptions and a simplifying parameterization. Furthermore, the method emphasizes the primary or strongest source.

Synthetic Tests
To quantify the robustness of the location and uncertainty estimates in the joint-likelihood function, we perform synthetic tests by locating synthetic sources at different locations by a network of seismic stations. For a given true source location, synthetic correlograms are generated. These include both random and correlated noise due to scattering and unwanted wave types. The method is then applied to the synthetic data to estimate a location and its uncertainty. These estimates are finally compared to the true source location. By repeating synthetic tests at different source locations, maps of location accuracy and uncertainty estimates are obtained, which can be used to quantify the robustness of the method.
The synthetic data are based on a two-dimensional heterogeneous velocity model, which is generated by adding a random medium to a uniform velocity of 1.2 km/s (a suitable group velocity of 0.5-2 s Rayleigh waves at some volcanoes in Iceland (Benediktsdóttir et al., 2017;Jeddi et al., 2017)). The random medium is described by a Gaussian autocorrelation function with a correlation length of 4 km and a standard deviation of 0.34 km/s based on tomographic results of Jeddi et al. (2017) and Benediktsdóttir et al. (2017). The same station geometry as that of a monitoring network at Katla volcano, southern Iceland is used. The reason for using this geometry is that the synthetic correlation envelopes will be compared to those obtained from real data gathered during an unrest period of the volcano in 2011 . Furthermore, we have chosen structural parameters appropriate for that setting.
For each test, a single source is placed in the study area. A synthetic seismogram is generated by shifting a time series of Gaussian white noise according to the predicted travel time calculated using the specified velocity model. The amplitude of the signal decays with distance from the source as r 0.5 , where r is the distance from the source, to reflect the geometrical spreading of surface waves. To simulate the existence of other wave types, such as body waves, in the signal, we delay the time series of Gaussian white noise according to a uniform body wave velocity of 2.7 km/s. These body waves can be viewed as correlated noise since they are coherent among different stations and will, therefore, produce a peak in the correlation envelopes. Their location distributions in individual back-projected likelihood functions will, however, be inconsistent when the back projection is done with a velocity that differs from theirs. We also add incoherent Gaussian random noise to the signal.
To simulate scattering effects, we randomly distribute 50 scatterers inside the station network. These scatterers have a random strength between 0 and 1 and a random scattering angle with a Gaussian width of ±40 • (1 standard deviation). The scatterers act as secondary sources producing signals that are similar to direct arrivals except that they are only seen by part of the station network. Therefore, they constitute partially correlated signal-generated noise, which is the dominant noise type in our correlation data as time integration in the correlation calculation has effectively suppressed random noise.
The synthetic correlation envelopes are generated by first filtering the synthetic seismograms between 0.8 and 1.5 Hz and then calculating the envelopes of cross correlations of the filtered seismograms for all station pairs. Figure 2 shows a comparison of synthetic correlation envelopes to those from real data obtained during an unrest period of Katla volcano in 2011, using a network with the same station geometry. The correlations are characterized by a broad distribution of energy and are similar for both synthetic and real data. The timing of the highest peak corresponds approximately to the location of the primary source in synthetics. Side lobes generally decay away from the maximum and correspond to scattered energy and body waves.
Correlations from real data are, in general, smoother and have a slower decay to large time shifts. This may be due to multiple scattering in a strongly scattering medium, which is not included in the simulation. This may also be caused by a more spatially confined distribution of scatterers in the synthetics than in the real earth. Note, that the scattering process generates noise that is not stationary in time. It therefore becomes ambiguous over what time lag range to estimate the noise distribution. We have used ±30 s.  . Each correlation envelope is normalized by its standard deviation. See Figure 3a for the station names. Figures 3a and 3b show the joint likelihood of source location for a test with the true location inside the network at (-1, -8). The width of the distribution is about 0.4 km, and the deviation from true location is around 0.3 km, i.e., similar to the width of the distribution. Figure 3c shows a map of location uncertainties estimated from individual joint-likelihood functions. Each uncertainty is estimated by first fitting a two-dimensional Gaussian distribution around the peak of the joint-likelihood function, and then averaging the standard deviations along the two principal directions of the Gaussian distribution. The uncertainty estimate is, in general, less than 1 km, except at locations outside the network, where the estimate is 1-2 km. Figure 3d shows a map of deviations of location estimates from true source locations in our synthetic test. This map is obtained from one realization of synthetic data. We have tested the ergodicity of these random deviations by generating 210 realizations of data at a few source locations inside the seismic network. The ensemble statistics of these realizations are very similar to the spatial statistics of the single realization used to produce Figure 3d. Therefore, we argue that the random process generating the synthetic results is approximately ergodic and the spatial variation in Figure 3d reflects variation in different realizations. In most cases, the deviation of the location estimate is less than 1 km, except at and beyond the periphery of the network, where the deviation is 1.5-3 km.
The location uncertainty is comparable to the location bias, except for locations outside the network, where the location uncertainty is underestimated by about a factor of 2 (and more at distance). To test the robustness of uncertainty estimates, we compare the distribution of location deviations inside the network with the distribution obtained from a theoretical 2 distribution, assuming normally distributed errors with a standard deviation equal to the mean of the uncertainty estimates (Figure 3e). The fit is impressive (Figure 3f), confirming the robustness of the uncertainty estimate.

A Real-Data Example From Katla Volcano
To further demonstrate the method, we apply it to real tremor at Katla volcano, southern Iceland, during unrest in 2011 . To compare the results with synthetics, we apply the same processing to the real data and use the same parameters for the location. We extract 1 hr of seismograms during a period when tremor correlograms and tremor amplitude ratios at different stations were stable. Local events have been removed from the data. We then filter the seismograms at 0.8-1.5 Hz and calculate envelopes of the cross correlations for every station pair. Figures 3g and 3h show the joint-likelihood function of source location from all station pairs. The estimated source location is very close to two out of three cauldrons that subsided during the tremor period, in particular, it is less than a kilometer from that cauldron which collapsed and may be volcanic or caused by pressure release boiling in an underlying geothermal system after emptying of the cauldron reservoir (see Sgattoni et al., 2017 for a more detailed discussion). In addition, the method has been applied to tremor associated with the Eyjafjallajökull eruption in 2010 (Benediktsdóttir, 2019). Again, the tremor source was located within 1 km from the eruption vent at similar frequencies. These examples demonstrate that when applied to real data from two volcanoes in southern Iceland the method locates the source with similar accuracy to the synthetic tests away from the likely source of the tremor.

Discussion and Conclusions
We have introduced a new method for locating tremor sources with unclear onsets. The method maps a correlogram envelope from a pair of recording stations to a probability density as a function of time shift. This probability density is back projected into two-dimensional space to construct a likelihood function of source location. The likelihood function is multiplied by a scaling factor, which accounts for the heterogeneity of the back projection. The scaled likelihood function is then convolved with a Gaussian distribution of width corresponding to the uncertainty of the back projection due to an inaccurate velocity model. Finally, the individual likelihood functions from individual single correlations are multiplied to obtain a joint-likelihood function of the source location. The location estimate is given by the peak of the distribution and the uncertainty estimated by the width of the peak.
The method is similar to existing methods that apply back projection and stacking to image the source location (Haney, 2010(Haney, , 2014. The difference lies in applying multiplication instead of a sum (stack). This is justified in terms of joint location likelihood, given data contained in all correlograms. The method produces both a location estimate and its uncertainty and can thus be seen as an inference method as opposed to an imaging method.
The method can in principle be applied to any attribute of the correlation of tremor. It is not restricted to tremor signals and could be tried on infrasound (Ripepe et al., 2018) as well as other seismic signals. It could be applied to event detection/location algorithms that are based on back projection of a signal attribute, such as the CMM algorithm of Drew et al. (2013) or the detection algorithm of Wagner et al. (2017). It can also be used with three-dimensional back projection using a typical body wave velocity or travel time table if the tremor contains sufficient body wave power. This would in general require a denser network to resolve the added dimension (depth), but would of course be very interesting as depth is an important parameter to understand the tremor mechanism.
Choosing a typical surface-wave group velocity at the relevant period and back projecting to a map in two dimensions assumes surface waves in the tremor. They need not be dominant. Other wave components will not be back projected consistently at the chosen velocity ) and suppressed by the multiplication of the joint likelihood. We have used a homogeneous velocity. Where a heterogeneous velocity model is available, this can be applied in the back projection in order to reduce its uncertainty.
The location bias increases substantially when sources are at and beyond the periphery of the network. This may be due to misidentification of false peaks from scatterers (noise) as the primary peak from source (signal). For sources inside the network, the lengths of the primary paths tend to be significantly shorter than the scattered paths. Therefore, the primary arrivals are usually the biggest amplitude arrivals. However, for sources outside the network, differences between lengths of the primary and the scattered paths decrease and, therefore, the primary arrival may not have the biggest amplitude. Since in the synthetic test the scattering strength is allowed to approach unity, the probability that a scattered arrival, or a number of interfering scattering arrivals, exceeds the primary arrival in amplitude becomes significant. Thus, the probability of misidentification of noise as signal becomes significant. The further away from the network the primary source is, the more likely such misidentification becomes for individual station pairs.