High‐Frequency Ground Motions of Earthquakes Correlate With Fault Network Complexity

Understanding the generation of damaging, high‐frequency ground motions during earthquakes is essential both for fundamental science and for effective hazard preparation. Various theories exist regarding the origin of high‐frequency ground motions, including the standard paradigm linked to slip heterogeneity on the rupture plane, and alternative perspectives associated with fault complexity. To assess these competing hypotheses, we measure ground motion amplitudes in different frequency bands for 3 ≤ M ≤ 5.8 earthquakes in Southern California and compare them to empirical ground motion models. We utilize a Bayesian inference technique called the Integrated Nested Laplace Approximation (INLA) to identify earthquake source regions that produce higher or lower ground motions than expected. Our analysis reveals a strong correlation between fault complexity measurements and the high‐frequency ground motion event terms identified by INLA. These findings suggest that earthquakes on complex faults (or fault networks) lead to stronger‐than‐expected ground motions at high frequencies.


Introduction
It is widely accepted that an earthquake source parameter called stress drop, Δσ, is a critical parameter that determines the level of high-frequency ground motion during an earthquake.As formulated by the conventional Brune-Haskell model (Brune, 1970;Haskell, 1964), stress drop relates average slip on a fault to a characteristic rupture dimension, often the radius r for a circular crack.The earthquake source spectrum is influenced by the spatiotemporal slip history (Kanamori & Anderson, 1975) and exhibits an f 2 falloff at high frequencies above a corner frequency f c ∼ 1/T (Haskell, 1964), where T is the total duration of the rupture process.In this model, the observed corner frequency depends on both the stress drop and seismic moment of the earthquake (Allmann & Shearer, 2009;Kanamori & Anderson, 1975).In particular, the stress drop controls the transition between the low-frequency spectrum, with an amplitude proportional to moment, and the high-frequency portion beyond the corner f c , and thus may play an important role in high-frequency ground motions (Baltay et al., 2019).
However, stress drop estimates are known to be highly uncertain (Abercrombie, 2013(Abercrombie, , 2021;;Cotton et al., 2013), perhaps due to the oversimplification of the Brune-Haskell model in capturing the nuances of the rupture process.Efforts have been made to introduce additional complexity into this model, for example, by considering smallerscale slip heterogeneity, whether stochastic (e.g., Aki, 1967;Andrews, 1980;Frankel, 1991;Haskell, 1964) or deterministic (e.g., Dunham et al., 2011).Despite the advancements, these models still face challenges in explaining the broad spectrum of observed high-frequency ground motions.Tsai and Hirth (2020) proposed an alternative paradigm called the "Impact Model" which posits that high-frequency ground motions may also arise from the radiation generated during elastic loading and unloading of fault zone structures during fault slip in the presence of geometrical incompatibilities.The impact model is distinct from rough fault frictional models in that it predicts larger P/S radiated energy, more isotropic high-frequency radiation patterns, and corner frequencies and deformation timeframes that are dictated by elasticity (Tsai et al., 2021).
The purpose of this study is to test whether damaging high frequency ground motions can be generated by the presence of fault zone complexities.Chu et al. (2021) introduced two novel metrics for defining fault complexity based on mapped fault traces, aimed at characterizing fault networks by assessing their Fault Density Ratio (R D ) and degree of alignment, referred to as Fault Misalignment Ratio (R M ).Additionally, Chu et al. (2021) demonstrated that these fault complexity metrics are correlated with spatial variations in Brune stress drop estimated from corner frequencies.Encouraged by these findings, in this study we directly investigate the influence of fault-zone geometry in the generation of high-frequency ground motion of earthquakes.Our work aims to determine the degree to which geometric complexities of fault-zone structures and frictional heterogeneity are necessary for understanding the physical processes that generate high-frequency ground motions during earthquakes.
Ground motion prediction is a fundamental task in engineering seismology and is crucial to seismic hazard assessment.The community has dedicated tremendous effort to the development of empirical ground motion models (GMMs), which relate properties of the source, path, and site to the distribution of expected ground motions for a given earthquake scenario (e.g., Douglas & Edwards, 2016).GMMs are typically developed using the ergodic assumption, where ground motion records collected at a global scale are assumed to be representative of earthquake scenarios defined at a more local scale (Anderson & Brune, 1999).However, application of such ergodic models when applied to a regional or local scale, can lead to large uncertainties, as some of the variability in ground motion observed at a global scale may not be relevant at any given location.Nonergodic GMMs were subsequently developed to help reduce these uncertainties (Atik et al., 2010).
In this study, we employ the Integrated Nested Laplace Approximation (INLA) method (Rue et al., 2009), a Bayesian inference approximation technique, to incorporate a non-ergodic component into the GMM (Lavrentiadis et al., 2022), and in particular to isolate an event term that quantifies how different the observed ground motions from earthquakes in a source region are from the model predictions.Once the event terms have been estimated, we seek to investigate the influence of fault complexity measurements on ground motion variations.Through correlation analysis, we demonstrate that fault complexity measurements are strongly associated with high-frequency ground motions.To our knowledge, this study is the first attempt to quantify the impact of geometric complexity of fault networks directly on the level of high-frequency ground motions generated by earthquakes.

Data
We compile our event catalog from earthquakes in Southern California sourced from two prior studies: Shearer et al. (2022) and Trugman (2022).This set of earthquakes occurred in between 1996 and 2020.We only include events in the moment magnitudes ranging between 3 ≤ M ≤ 5.8 (Figure 1a) to ensure that the GMMs applied later in our study are appropriate for our data set.Note that data sets do not overlap; the Shearer et al. (2022) data set includes only relatively small events (M < 5), while the Trugman (2022) data set includes a selection of 14 M5 to M5.8 earthquakes.These two data sets were specifically selected because they present a set of well-recorded earthquakes that also include Brune-type stress drop estimates, which provide useful additional context for our study of high-frequency ground motions.Although the two studies use different techniques and data sets to measure stress drop, they show similar regional patterns in stress drop variability, and we assume they are mutually consistent to reasonable approximation (Figure S3 in Supporting Information S1).Following the cautionary approach recommended by Shearer et al. (2022), we exclude events where the stress drop estimate is above 100 MPa, which signifies unreliable estimates from earthquakes with poorly resolved corner frequencies.
To measure ground motion amplitudes at different periods, we obtain seismic waveforms within a 2°distance radius of each earthquake epicenter.The data is retrieved from the Incorporated Research Institutions for Seismology Data Management Center and subsequently processed using the USGS automated ground-motion processing software gmprocess (Hearne et al., 2019).The gmprocess software is utilized to apply standard timeseries processing procedures (removal of instrument response, baseline offset corrections, etc.) and exclude records that did not meet quality control standards (signal-to-noise ratio, maximum amplitude, etc.) (Ancheta et al., 2014).To guarantee a minimum signal-to-noise ratio of 3.0 in the passband, the records are subjected to band-pass filtering using automatically selected corner frequencies.After the quality-control step we retrieve 44,362 waveforms from 351 unique events (Figure 1b).Subsequently, gmprocess is used to compute the observed 5% damped pseudo-spectral acceleration (PSA) values at oscillator periods of 0.066, 0.075, 0.1, 0.11, 0.12, 0.14, 0.16, 0.2, 0.25, 0.33, 0.46, 1.0, and 1.5s for each recording.For the same periods, using the OpenQuake Engine framework (Pagani et al., 2014) based on the Active Crustal model, we calculate the predicted PSA values.The 5% damped PSA intensity metric is regularly utilized by the seismic hazard community and provides a quantifiable index for hazard calculations relevant to structures and buildings.To calculate our fault complexity metrics, R M and R D , we used the high-resolution fault map information from the USGS Quaternary Fault Database.

Methods
We use the four "Active Crustal" GMMs from the Next Generation Attenuation West (NGA-West2) suite (Abrahamson et al., 2014;Boore et al., 2014;Campbell & Bozorgnia, 2014;Chiou & Youngs, 2014) and weight them equally to measure predicted Pseudo-Spectral Acceleration (PSA) ground motion values for a range of frequencies.This composite active crustal GMM has been shown to be well-suited for the region of Southern California (Chatterjee et al., 2022), where this study is focused.
The growing availability of ground motion data has revealed systematic differences related to site and source location.Non-ergodic models explicitly consider such location and site-specific effects to reduce uncertainty in the mean predicted ground motions.We take this into account and apply a non-ergodic adjustment to capture the systematic effects related to the source and site with an ergodic "backbone" GMM, with our "backbone" GMM being the Active Crustal GMM.The details of this non-ergodic formulation is described in the supplemental material.The median prediction of the backbone ergodic model is a function of the earthquake magnitude M, the closest distance to rupture surface R Rup , and site amplification information represented by V S30 .Non-ergodic adjustments, exemplified in models by Kuehn et al. (2019) and Lavrentiadis et al. (2022), are determined through the estimation of total ergodic residuals.The total residuals correspond to the difference between the log of the ground motion intensity measure and the median ground motion of the ergodic backbone model in log space as, where y is the observed ground motion and f erg is the predicted intensity using a GMM.Additional complexities in ground motion recordings can be expressed as the superposition of an earthquake source (or event) term, a site (or station) term, and a path effect associated with the raypath between the source and station, along with some uncertainties.Following the convention introduced by Atik et al. ( 2010), and using our formulation of the non-ergodic GMM, we solve for the total non-ergodic effects related to the source (δL2L) and site (δS2S).We do not account for repeatable path effects (δP2P), because raypath coverage in our relatively small data set (several hundred earthquakes) is not sufficient to reliably constrain the path terms.We then estimate the non-ergodic model parameters via Bayesian inference, using the INLA method (Rue et al., 2009).INLA, in conjunction with the stochastic partial differential equation (SPDE) approach (Lindgren & Rue, 2015;Lindgren et al., 2011), enables rapid and efficient inference in spatial models.In the SPDE approach for modeling spatial data, the spatial Gaussian field is approximated using basis functions evaluated on a triangular mesh.This approach implies that the spatially varying coefficients are represented at the mesh nodes and subsequently projected onto the observation locations (Figure S1 in Supporting Information S1).The implementation of INLA models utilizes the R-INLA package (Grigorios Lavrentiadis, 2022;Lindgren & Rue, 2015;Rue et al., 2017).
With the GMM event terms estimated (Figure 1b), we next focus on understanding their correlation to stress drop and fault complexity measurements.We first cluster our earthquake locations using the DBSCAN algorithm in Python (Ester et al., 1996;Schubert et al., 2017) into 18 clusters (Figure S2a in Supporting Information S1).
Clusters were created by imposing a minimum sample value of 3 (the smallest number of events required to form a cluster), with a maximum distance of 8 km between any two events for them to be deemed neighbors.Since the event terms estimated by INLA are derived from a spatial random field, which is spatially smooth, events terms located in close proximity are by design strongly correlated and do not represent independent measurements.Clustering helps us mitigate this issue by distinguishing geographic groups of events, and hence we cluster our data set, calculate the median cluster coordinates, median stress drop estimate and median event term for all the 18 clusters individually.We provide additional examples in the supplementary material with alternative clustering parameters (Figure S2b in Supporting Information S1) that demonstrate the robustness of our main findings.
Next, we calculate the fault complexities, R M and R D , from the quaternary fault database.We estimate these metrics for the entire state of California by taking coordinates on a spatial grid with 0.1°spacing and calculating an average value based on search radii ranging from 5 to 30 km.This gives us a spatial map of R M and R D values for all of California (Figures 2a and 2b).We calculate the ratio of the minimum to the maximum length of the summed fault trace projections for each possible rotation angle to estimate fault misalignments (R M ), in accordance with Chu et al.  fault density (R D ), defined as the sum of fault lengths within a region divided by the length of the perimeter.The R M and R D values at each cluster coordinate are extrapolated from this spatial map via the Nearest Neighbor algorithm using the K-dimensional (KD) tree approach (Bentley, 1975).
We next correlate the spatially varying event terms defined at 13 different periods with the fault-complexity measurements, R M and R D at the corresponding locations.A best fitting line to the data was determined using robust linear regression (Siegel, 1982;Stein & Werman, 1992).Pearson correlation coefficients are subsequently computed between the measured event terms and fault complexity metrics.The associated uncertainties in the correlation coefficients are computed by bootstrapping the data set 1,000 times (Efron, 1979).While not the main focus of our study, previous work has focused in detail on the connection between stress drop and fault complexities (Chu et al., 2021), and also between stress drop and high frequency ground motions (Baltay et al., 2013;Lior & Ziv, 2020;Oth et al., Trugman & Shearer, 2018), so for comparison we also repeat the correlation exercise between the event terms and stress drop estimates, and present the results from this analysis in the supplementary material.

Results
We use more than 40,000 quality-controlled recordings from 358 events to estimate the PSA values for 13 different periods, and separately assemble the fault complexity metrics.The non-ergodic event terms computed from INLA show notable scatter but with discernible spatial variations.At frequencies 2 Hz and greater, the total non-ergodic event terms exhibit negative values around the Salton Trough region, positive values around the San Andreas Big Bend region, and moderate values around the Ridgecrest region (Figure 1b).A similar spatial trend is also observed with the stress drop values in this region (Figure 1a and Figure S3 in Supporting Information S1).
Regarding the fault complexity metrics, R M values range from 0 to 0.8, where values closer to 1 represent a high degree of misaligned faults in the region.R D values range from 0 to 4, with higher values indicating a high fault density ratio.For both the metrics, we see significant spatial variability.There are areas where there are no mapped surface fault traces, and hence fault complexities cannot be computed.
In the next section, we describe the statistical correlations we find between our calculated event terms at different frequencies for both the fault complexity metrics, R M and R D .Figures 3a and 3b are shown as examples to illustrate the correlation of these parameters with the event term at 5 Hz.Individual correlation plots of fault complexities (Figures S6 and S7 in Supporting Information S1) and stress drop estimates (Figure S4 in Supporting Information S1) with event terms for all other frequencies are presented in the supplemental section of this paper.
We plot the calculated fault complexity for each cluster against the median event term of that cluster.We also compute the correlation coefficient between the two parameters at the 13 different frequencies used in this study.
We find correlation values ranging from 0.4 to 0.75 between fault misalignment R M and the event terms at frequencies between 2 and 15 Hz, with lower, slightly negative correlations ranging from 0.1 to 0.3 at low frequencies below 1 Hz (Figure 3d).We observe similar, albeit weaker, trends for fault density R D : correlation coefficients span a range of 0.3-0.4 for event terms between 5 and 15 Hz, range between 0.1 and 0.2 for event terms between 2 and 4 Hz and have negative correlations with event terms below 1 Hz (∼ 0.3) (Figure 3c).In summary, we find that the fault complexity metrics are well correlated with the event terms at frequencies 2 Hz with diminishing correlations below 2 Hz, an observation we discuss in some detail in the following section.We also find that the overall correlation with high-frequency ground motion is stronger for fault misalignment than it is for fault density.

Discussion and Conclusions
Several previous studies have explored the relation between stress drop and ground motion (Baltay et al., 2013;Lior & Ziv, 2020;Oth et al., 2017;Trugman & Shearer, 2018).While not the central focus of our study, we show (Figure S5 in Supporting Information S1) that for the moderate-sized events in our study, stress drop is more strongly correlated with event terms at higher frequencies (between 2 and 15 Hz).These frequency ranges are comparable to or above the corner frequencies of most earthquakes in our database, which is intuitive based on the definition of the corner frequency as a transition point in the seismic spectrum.However, when interpreting these results, it is important to recognize that the stress drop estimates are calculated from measured corner frequencies under the assumption of source spectral models that may not accurately reflect the true earthquake rupture process.Thus, it is possible that other physical processes unrelated to the variability in static stress drop may contribute to the observed correlation between stress drop and ground motion.
Indeed, in this study we demonstrate a significant correlation between fault complexity metrics and earthquake ground motion amplitudes at high frequencies.While the correlation between stress drop and ground motion noted above is now well-established, and recent work has examined the correlation between stress drop and fault complexities (Chu et al., 2021), we newly illuminate the connection between fault complexity and ground motion event terms without assuming a particular spectral model.This relationship has not, to our knowledge, been previously explored in quantitative detail and suggests the existence of an underappreciated factor controlling how ground motions from earthquakes are generated at these high frequencies.
These findings extend a growing body of evidence indicating that earthquake source properties are influenced by extrinsic characteristics of the fault system on which the event occurs.The plate tectonic context (intraplate vs. interplate faults; e.g., Scholz et al., 1986), the long-term slip rate (e.g., Anderson et al., 1996), the geometry (e.g., Stirling et al., 1996), the structural maturity of the long-term faults (Manighetti et al., 2007;Radiguet et al., 2009) and fault geometry (Chu et al., 2021;Trugman et al., 2021;Tsai et al., 2021) have all been recognized as fault properties that impart a significant effect on earthquake variability (i.e., variability in stress drop, slip amplitude, rupture length, and magnitude).Our analysis shows significant correlations between fault misalignment R M and ground motion event terms at frequencies above 2 Hz and diminishing correlations below 1 Hz.This frequencydependent correlation with ground motion is similar, albeit less intense, for R D .
Taken together, our observations suggest that regions with more complex fault systems exhibit intensified highfrequency ground motions.The particularly strong correlation of fault misalignment with high-frequency ground motions suggests that discrete fault structures themselves could be a factor.Interactions between distinct fault structures, which at short timescales can transfer momentum to each other nearly elastically, can produce radiation at a frequency characteristic of the timescale of impact and hence the structure size (Tsai & Hirth, 2020;Tsai et al., 2021).Similarly, geometric complexity in fault systems may also cause accelerations and decelerations in the rupture processes that influence the characteristics of earthquake spectra.
We conclude the manuscript by identifying some limitations of this analysis and outline some ways to improve them in the future.First, we note that due to limitations in data density, our decomposition of event terms from the ground motion residual distribution does not consider non-ergodic or regional variations in path effects that may also influence ground motions.We are limited in this work to earthquakes of sufficient size (M3 and greater) to align with the constraints of NGA-West2 GMMs, but future work could consider smaller earthquakes as a means to estimate path terms alongside non-ergodic source and site terms.While we neglect non-ergodic path terms in this work, we note that it would be difficult to explain the observed correlation between high-frequency ground motion and fault complexity as a path effect rather than a source effect.In particular, the presence of a complex fault network would suggest damage to the host rock in the near-source region, which would tend to attenuate high-frequency ground motion rather than enhance it.Thus, if path effects were the primary driving factor, we would expect negative correlations with event terms at high frequencies rather than positive ones.Any path effects associated with fault complexity that are present may have in fact suppressed the ground motion correlation values such that, if corrected for, we may anticipate higher correlations than we currently observe.
We also note that our analysis is by design geared toward examining the variability in ground motion amplitudes at frequencies equal to or surpassing the corner frequency of the earthquakes in our data set.At lower frequencies, ground motion levels are predominantly influenced by earthquake magnitude, which is a direct input to the backbone ergodic GMM and thus does not influence event terms derived from residual ground motions.While fault complexity may have some influence on earthquake size, this effect cannot be ascertained through ground motion residual analysis.However, at higher frequencies, corresponding to timescales shorter than the rupture dimension, the complexity of the rupture process can affect ground motion amplitudes in a measurable way.
Given that the earthquakes in our data set are generally small (mostly M3 and M4), with corner frequencies at 2 Hz or higher, it is this range of frequencies where we would anticipate the strongest effects to be observed.
Another limitation is the use of mapped surface faults from the USGS Quaternary Faults and Folds Database to quantify fault complexity metrics.While this database is remarkably detailed in southern California, it only captures surface rupture traces at a limited resolution.The earthquakes in our data set are small compared to surface-rupturing events and confined to seismogenic depths, and thus the surficial fault measurements may not always reflect the relevant fault complexity at hypocentral depths where ruptures occur.However, the correlations we observe persist whether or not we limit the depth extent of our earthquake catalog, suggesting that fault complexity as measured at the surface is a sufficient proxy for the relevant processes at seismogenic depth.
Despite these limitations, our analysis newly reveals strong but frequency-dependent correlations between ground motion intensity metrics and fault complexity as defined by surface traces.These novel results may aid in determining the extent to which measurements of fault zone characteristics are required for producing precise ground motion estimations.This study provides a first step toward doing so, illuminating a new avenue to explore while developing next-generation GMMs that fully capture the local and region-scale processes relevant for hazard.

Figure 1 .
Figure 1.(a) Map view of earthquakes used in our study, color-coded by stress drop estimates.The size of the circles scale with earthquake magnitude.Stations from where waveform recordings were downloaded and processed by gmprocess are plotted in blue.(b) Non-ergodic spatially varying event terms resolved from pseudospectral acceleration at 5 Hz.
(2021)'s methodology.A value of zero indicates completely parallel fault strands, and a value of one indicates completely random fault orientations.Additionally, we analyze potential relationships with

Figure 2 .
Figure 2. (a) Spatial Map of fault misalignment, R M , values across Southern California.A value of zero indicates completely parallel fault strands, and a value of one indicates random fault orientations.(b) Spatial map of fault density, R D , values across California and Nevada, with larger values corresponding to more dense fault systems.The gray areas in this plot indicate regions with no mapped surface fault traces and hence fault complexities could not be computed in these areas.Note that the events analyzed in our data set do not fall in these gray areas.

Figure 3 .
Figure 3. Ground motion event terms measured at 5 Hz plotted against (a) fault density ratio, R D (b) fault misalignment ratio, R M .The error bars indicate the standard deviation within each cluster.The black dotted line is the robust regression fit for the data.Each subplot has a title that states the correlation coefficient with uncertainties.The uncertainties in the correlation coefficient are estimated by bootstrapping the data 1,000 times.Pearson correlation coefficient at different frequencies, for (c) fault density against spatially varying event terms, and (d) fault misalignment against spatially varying event terms.The arrows from panel (a) to (c) and (b) to (d), marked by a star, shows the corresponding correlation coefficient value for the respective scatter plot for pseudo-spectral acceleration at 5 Hz.The highest correlations can be found at the higher frequencies, with correlations diminishing at the lowest frequencies.