Imaging microearthquake rupture processes using a dense array in Oklahoma

Both large and small earthquakes rupture in complex ways. However, microearthquakes are often simplified as point sources and their rupture properties are challenging to resolve. We leverage seismic wavefieldsrecordedbyadensearrayinOklahomatoimagemicroearthquakeruptureprocesses. Weconstruct machine-learning enabled catalogs and identify four spatially disconnected seismic clusters. These clusters likely delineate near-vertical strike-slip faults. We develop a new approach to use the maximum absolute SH-wave amplitude distributions (S-wave wavefields) to compare microearthquake rupture processes. We focus on one cluster with earthquakes that are located beneath the dense array and have a local magnitude range of -1.3 to 2.3. The S-wave wavefields of single earthquakes are generally coherent but differ slightly between thelow-frequency( ≤ 12Hz)andhigh-frequency( ≥ 12Hz)bands. TheS-wavewavefieldsarecoherentbetween differentearthquakesatlowfrequencieswithaveragecorrelationcoefficientsgreaterthan0.95. However, the wavefieldcoherencedecreaseswithincreasingfrequencyfordifferentearthquakes. Thisreducedcoherenceis likely due to the rupture differences among individual earthquakes. Our results suggest that earthquake slip of the microearthquakes dominates the radiated S-wave wavefields at both low and high frequencies. Our method suggests a new direction in resolving small earthquake source attributes using dense seismic arrays without assuming a rupture model. Non-technical summary The earthquake rupture process and source parameters exhibit key controls on the distribution of ground shaking. Both large and small earthquakes can rupture in complex ways. While large earthquakes occur infrequently, studying abundant small earthquakes may provide a unique opportunity to investigate earthquake rupture physics and identify similarities and differences between large and small earthquakes. Specifically, resolving earthquake rupture processes, and their complexity, is vital for hazardassessmentsandriskmanagement. Here,wedevelopanewapproachtousegroundmotionsrecorded athundredsofstationsfromsmallearthquakestoresolvetheirruptureprocesses. Wefindthatgroundmotion distributionsarecontrolledbytheearthquakeslipatbothlow-andhigh-frequencybands. Thehigh-frequency groundmotionappearstovarybetweendifferentearthquakes, whichmayhavebeencausedbydifferentrup-tureprocessesoftheseearthquakes. Ourresultsshowthatusingdenseseismicarrayobservationscanresolve small earthquake rupture processes, suggesting future research opportunities.


Introduction
Understanding rupture processes of small magnitude earthquakes can help inform earthquake physics, fault zone conditions, and fault stress states because of their frequent occurrence and their relatively-uniform spatial distributions (e.g. Thatcher and Hanks, 1973;Kanamori and Rivera, 2004;Allmann and Shearer, 2007).Small earthquakes are often evaluated as simple sources because of observational limits and modeling challenges (Abercrombie, 2021).However, when examined in detail, small to moderate earthquakes often show surprising rupture complexities (McGuire, 2004;Dreger et al., 2007;Abercrombie et al., 2017;Wu et al., 2019;Meng and Fan, 2021;Pennington et al., 2023).Their key rupture parameters are primarily esti-mated using seismic observations (Abercrombie, 2021).For example, spectral fitting methods measure the corner frequency of the source spectrum in the frequency domain and use pre-assumed rupture models to infer the earthquake rupture extent (e.g., Abercrombie, 1995;Prieto et al., 2004;Trugman and Shearer, 2017b;Shearer et al., 2022).In the time domain, the second-degree seismic moments can be used to image moderate to small earthquake rupture processes (McGuire, 2004;McGuire and Kaneko, 2018;Fan and McGuire, 2018).These simple source models approximate the true rupture processes of events, and the model uncertainty and resolution are often challenging to quantify (Abercrombie, 2021;Pennington et al., 2023).
Conventional approaches rely on aggregating measurements from individual seismic records, and the spatial patterns of the measurements are rarely exam-Figure 1 Seismicity in the study region.Squares show the nodal stations arranged in three cross-wise lines, orange triangles show the broadband stations, and blue inverted triangles show fluid injection sites.Grey circles depict existing regional earthquake catalog with marker size scaled by magnitude (Park et al., 2022).Focal mechanism solution is for the M 2.3 earthquake analyzed in Fan and McGuire (2018).Black line shows orientation of maximum principal stress measured from drillinginduced tensile fractures (Alt and Zoback, 2017;Heidbach et al., 2010).Right inset shows the 1D velocity model used in the location procedure.Left inset shows the location of the study region in Oklahoma.
ined to infer the earthquake ruptures, often limited by instrumental coverage.Large-N nodal arrays provide an opportunity to image earthquakes and fault zone structures at a high spatial resolution (e.g., Ben-Zion et al., 2015;Meng and Ben-Zion, 2018;Sweet et al., 2018;Dougherty et al., 2019;Yang et al., 2021).For example, such arrays have been used to illuminate complex networks of basement faults near wastewater injection sites (e.g., Wang et al., 2020), investigate the physical cause of near-source high-frequency ground motions (e.g., Trugman et al., 2021), and study local site effects and the shallow velocity structures (Johnson et al., 2020;Chang et al., 2023).
The 2016 Incorporated Research Institutions for Seismology (IRIS) Community Wavefield Demonstration Experiment in Oklahoma installed 381 seismographs in a geologically uniform area (Sweet et al., 2018).Here, we use the experiment data to develop a new data-driven method to observe microearthquake sources that requires few assumptions.We construct new catalogs for the study site to investigate the faulting complexities.We measure the distribution of maximum SH-wave amplitudes (S-wave wavefields) at varying frequencies.We identify four earthquake clusters and find that microearthquake wavefields can be used to distinguish small earthquake rupture complexities.

Data
The IRIS Community Wavefield Demonstration Experiment deployed a dense array of all-in-one nodal, broadband, and infrasound stations in northern Oklahoma in 2016 to evaluate the scientific potential of nodal arrays (Sweet et al., 2018).There are 227 nodal stations, deployed in three crosswise lines with one trending eastwest (E-W) and two trending north-south (N-S).The N-W trending lines are 4.8 km in length and the E-W trending line is 12.9 km in extent (Fig. 1).There are 18 broadband stations surrounding the three lines which are deployed in a Golay array configuration.We analyze four months of the continuous broadband records from July to October and one-month continuous nodal records in June/July, overlapping the start of the broadband deployment.The nodal stations have an inter-station spacing of 100 m along the three lines with densification at intersection points.

Earthquake Catalog
We apply a machine-learning-enabled workflow to detect, associate, locate, and re-locate earthquakes (Gong et al., 2023).The broadband and nodal arrays have different deployment periods, and we generate two catalogs separately.The 18 broadband stations result in a  (Alt and Zoback, 2017;Heidbach et al., 2010).Top-left insets show a zoom-in view of seismicity at the ok1 location for the catalogs.
four-month catalog between July 1st to October 28th, 2016, named as Catalog-18.We combine the 18 broadband stations with 39 nodal stations to generate a second, one-month catalog (June 21st to July 18th), named as Catalog-57.The nodal stations are selected to provide an even coverage of the region, with an average interstation spacing of 564 m (Fig. 2).
We first resample the records to 100 Hz, detrend the data, and fill data gaps with zeros.We then apply PhaseNet (Zhu and Beroza, 2018), a machine-learning phase picker, to detect P-and S-wave arrivals at each station independently.We use a probability threshold of 0.3 to select successful phase picks, leading to 252,127 and 254,913 P and S picks,respectively,275 and 481,602 for Catalog-57.The phase picks are associated using GaMMA (Zhu et al., 2022), leading to 10,522 and 14,798 candidate earthquakes for Catalog-18 and Catalog-57, respectively.The GaMMA algorithm assumes a uniform half-space for the association exercise, and we set a P wave velocity as 6 km/s and a V P /V S ratio of 1.75 for the procedure.
With the associated candidates, we locate earthquakes using COMPLOC (Lin and Shearer, 2006) and the same 1D velocity model used in Fan and McGuire (2018) (Fig. 1).The 1D velocity model is constructed based on the velocity model used in Schoenball and Ellsworth (2017b) and by the Oklahoma Geological Survey (OGS; Darold et al., 2015) with the top of basement set as 2270 m.The COMPLOC method uses the Source Specific Station Term (SSST) method to reduce the structureinduced location error.We iterate the location procedure ten times, and remove events within depths less than 0.1 km (Gong et al., 2023).We locate the earthquakes in Catalog-18 and Catalog-57 simultaneously to assure the SSST terms are consistent for the 18 broadband stations.The resulting catalogs agree well, leading to 8,031 and 7,325 earthquakes for Catalog-18 and Catalog-57 respectively.
We further refine the earthquake locations using differential arrival times between earthquake pairs and the GrowClust method (Trugman and Shearer, 2017a).We cross-correlate both P-and S-waves to obtain differential travel times for the nearest 100 earthquakes, and we remove measurements with cross-correlation values less than 0.6 from further analysis (Gong et al., 2022).Given the array configuration, we discard earthquakes located beyond 9.4 km from the center of the array (36.617 • /-97.671• in latitude and longitude).The final relocated catalogs have 5,996 and 5,885 earthquakes for Catalog-18 and Catalog-57, respectively.
Following Gong et al. (2022), we calculate local magnitudes (M L ) for both relocated catalogs using threecomponent displacement waveforms.We remove the instrument response and bandpass filter the records be- tween 4 to 12 Hz.We compute a signal to noise ratio (SNR) for each station as the maximum-amplitude ratio between the signal and noise windows.The noise window is defined as 5 s to 2 s preceding the P wave arrival, and the signal window is 1 s before to 5 s after the S wave arrival.Waveforms with SNR greater than 5 are used to compute local magnitudes, and the final M L assigned to an earthquake is the median value if there are more than three magnitude measurements at different stations.In the OGS catalog, one of the largest earthquakes in our catalog is reported with a magnitude of 2.3 (Fan and McGuire, 2018;Walter et al., 2019).We compute the magnitude difference, and use that to empirically correct all our magnitude values.
Because of our strict quality control procedure, there are approximately 60% and 40% of events with local magnitudes in Catalog-18 and Catalog-57, respectively.Following Goebel et al. (2017b), we calculate the magnitude of completeness (M c ) and Gutenberg-Richter parameters (Aki, 1965;Goebel et al., 2017b;Clauset et al., 2009).We obtain M c values of 0.3 and 0.6 for Catalog-18 and Catalog-57, respectively.Their associated Gutenberg-Richter b values are 0.92 and 0.96 for Catalog-18 and Catalog-57, respectively.The two catalogs have similar b values, while Catalog-57 has a greater M c .This M c difference results from our strict quality criteria, and the majority of the events in Catalog-57 do not have magnitude assignments.The b-value esti-mates of Oklahoma earthquakes span a wide range of values from 0.6 to 1.4 (e.g., Langenbruch and Zoback, 2016;Gable and Huang, 2024).These estimates may depend on the magnitude of completeness, dynamic range of the instruments, and the calculation method (e.g., van der Elst, 2021;Geffers et al., 2022).Our estimates are within the range of previously reported values, and are similar to those of tectonic earthquakes (Hardebeck, 2013).

Spatiotemporal Behaviors of the Seismicity
Both Catalog-18 and Catalog-57 have approximately 6,000 relocated earthquakes (Fig. 2).These earthquakes cluster in space, and we apply a density-based clustering algorithm to group these events with respect to latitude and longitude, requiring a minimum cluster size of 100 earthquakes and a neighbourhood of 0.003 degrees.In Catalog-18, the four largest clusters within the array footprint are named ok1, ok2, ok3, and ok4 with 227, 1176, 407, and 230 earthquakes, respectively.Some earthquakes in ok1, ok3 and ok4 are also observed in Catalog-57, while ok2 is absent in Catalog-57.The locations of these clusters align well for both catalogs.These four clusters are spatially disconnected and form linear features that are parallel to the NEE direction (Fig. 2a).Further, it appears all four clusters were located in Park et al. (2022), however, the locations differ slightly which may be due to the different station configuration and location procedure.
To measure the strike of the seismicity clusters, we solve a linear regression to obtain the best fitting line through the horizontal location of each cluster.The four clusters share a similar strike direction.Given the regional earthquakes have left-lateral strike-slip focal mechanisms (Fan and McGuire, 2018;Schoenball and Ellsworth, 2017b;Herrmann et al., 2011;McNamara et al., 2015), the strikes of these clusters are likely around 254 • .These strikes agree well with fault features determined in other studies (Schoenball and Ellsworth, 2017a;Park et al., 2022), orienting approximately clockwise 15 • from the maximal principal stress direction of the region (Alt and Zoback, 2017;Heidbach et al., 2010).These angles are smaller than the optimal angle of 25-30 • predicted from Byerlee's law for a friction coefficient of 0.6-0.8(Fialko and Jin, 2021), suggesting that these faults are not optimally orientated or have been rotated.Other regional earthquakes also occur on similarly orientated faults throughout Oklahoma (Schoenball and Ellsworth, 2017a;Park et al., 2022), which could be due to high pore pressure reactivation of such faults from wastewater injection (Qin et al., 2019).Pore pressures are likely elevated in our study region due to the close proximity between the faults and injec-tion wells.Moreover, Goebel et al. (2017a) finds that the 2016 M w 5.1 Prague earthquake strikes at approximately 15-20 • to where the coulomb stress is maximized and report large differences in the strike of its nearby faults, suggesting that a locally heterogeneous stress field controls the fault network.Lastly, pairs of conjugate strikeslip faults with a dihedral angle of 50 • are well documented in Oklahoma (e.g., Schoenball and Ellsworth, 2017a) and the smaller angles we observe may be compensated by larger dihedral angles on the conjugate fault.
We focus on Catalog-18 to analyze the spatiotemporal behaviors of the earthquakes because of its longer duration.Earthquakes in ok1, ok3, and ok4 spread near vertically.These features likely represent a strike-slip faulting style (Figs. 1 and 3), as being observed in the regional seismicity (Schoenball and Ellsworth, 2017b;Park et al., 2022).These three faults extend approximately 1 km along strike and less than 1 km along dip.The ok1 cluster has an average depth of 3.60 km, right beneath the top of the basement at 2.27 km (Fan and McGuire, 2018).The ok3 and ok4 clusters are away from the stations, and their earthquake depths are less well constrained.The ok2 cluster appears to comprise a shallower 73 • dipping fault plane (visual interpretation in Fig. 3a) and a deeper, vertically-dipping plane.The shallower fault plane may permit dip slip motion (Fig. 3a,c), and the high dipping angle suggests that it is likely a nor- mal fault.Earthquakes in the four clusters appear shallower in depth than the regional earthquakes (Schoenball and Ellsworth, 2017b), which most commonly occur around 4.5 km beneath the top of the basement.There are numerous wastewater injection sites in the study region (Fig. 1), and the four clusters are all within four kilometers of these sites.The close spatial proximity indicates that the seismicity and their locations may correlate with the wastewater injection activities (e.g., Ellsworth, 2013;Keranen et al., 2013Keranen et al., , 2014;;Weingarten et al., 2015;Walsh and Zoback, 2015).
The four clusters all include episodic bursts of seismicity but with varying temporal behaviours (Fig. 4).There are one, four, and three episodes in ok1, ok3 and ok4, respectively, which are short-lived and have few earthquakes between the episodes.These episodes occur within a 1 km stretch of the fault zone.Seismicity in these three clusters appears to be confined within a small footprint without clear migratory behaviors.In contrast, ok2 appears to experience continuous seismicity, spanning the observational period, and there are two intense episodes of seismicity where earthquakes occur at the shallower dipping fault and the deeper vertical fault, respectively.Following Vidale and Shearer (2006), we calculate the average duration of each sequence as the mean of the time lags of the events after the burst initiation, and normalize the occurrence time of the largest event within each burst with the average duration.If this normalized time is less than 0.15, indicating that the largest magnitude event is close to the start of the sequence, we consider the burst as a mainshock-aftershock sequence.Otherwise, we categorize the bursts as swarms because the largest event occurs later in the sequence.In total, four mainshock-aftershock sequences and six swarms are identified in the four clusters.The shallower and deeper bursts of ok2 display different behaviour with the largest events having normalized times of 2.4 and 0.13, respectively, suggesting the shallow burst was a swarm while the deeper burst was a mainshock-aftershock sequence.Additionally, there are four and one swarms in ok3 and ok4, respectively.These swarm-like sequences might represent fault-fluid interaction due to the nearby injection wells (Vidale and Shearer, 2006;Chen and Shearer, 2011;Horton, 2012;Goebel et al., 2016), and ok3 is located in close proximity to two injection wells with all its bursts as swarms.We also use the coefficient of variation of interevent time (C v ) to evaluate the temporal clustering behaviour of each of the ten bursts (Cochran et al., 2018;Kagan and Jackson, 1991).The C v is the ratio of the standard deviation of the interevent time to the average interevent time of the bursts (Kagan and Jackson, 1991).A C v value of 1 represents a random Poissonian earthquake sequence, a value less than 1 suggests a quasi-periodic occurrence, and a value greater than 1 indicates temporal clustering of seismicity.The C v values of the ten bursts are between 1.26 and 3.16, which are similar to those observed in Cochran et al. (2018), suggesting all sequences display temporal clustering behaviors, for example, in both swarms and main-shock aftershock sequences.

SH Wavefield
We use a cluster of 204 earthquakes in Catalog-57 (Fig. 2b) to explore the spatial variation in the SH velocity wavefield using the 245 nodal and broadband stations deployed between June and July 2016.P-waves in the region have complex signals with long-lasting coda waves that are likely reverberations due to the sediment, and are omitted from this study (Fan and McGuire, 2018).We focus on the SH waves of the Catalog-57 earthquakes that occur below the array.These events are the best resolved and have a good azimuthal coverage.To select the candidate earthquakes, we apply the same density-based clustering algorithm that is used to group earthquakes in Catalog-18 and identify events located within the ok1 cluster footprint.The selected 204 earthquakes have a local magnitude range of -1.3 to 2.3.
For each earthquake, we measure the peak ground ve-locity (PGV) of SH waves at different frequency bands from the transverse component records.We rotate the two horizontal components to obtain the radial and transverse components for each earthquake-station pair, and both the broadband and nodal stations are used in this analysis.To ensure measurement consistency, we remove the instrument response from the broadband traces and then convolve them with the nodal station instrument response.Further, we decimate the nodal data to 100 Hz as the sampling frequency of the broadband data.We apply an 8 th -order Butterworth bandpass filter to the data and use seven oneoctave filters with filter bands from 3-6 Hz to 24-48 Hz (e.g., Fig. 5).We focus on discussing results from four non-overlapping frequency bands, 3-6 Hz, 6-12 Hz, 12-24 Hz, and 24-48 Hz.
For each record in each frequency band, we define a signal window as 0.5 s before to 0.5 s after the predicted S wave arrival and a noise window as 3 s to 2 s before the predicted P-wave arrival using the same 1D velocity model from the location procedure.We measure the largest absolute S-wave amplitude in this signal window as the SH PGV.We compute a SNR as the ratio of the root-mean-square value of the signal win- dow to that of the noise window.If the SNR is less than four, the PGV measurement is discarded from further analysis.We further limit our comparisons between wavefields by enforcing a minimum of 100 common stations between them.Out of the 204 earthquake cluster, 59 events are qualified for further analyses all of which have assigned magnitudes.Before correlating the wavefields, we apply a spatial moving average with a radius of 0.0027 • (a total of ∼ 5 nodal stations on average).The nodal stations are deployed at the surface with varying degrees of coupling to the ground (Sweet et al., 2018).As this experiment was one of the first few to explore nodal array campaign experiments, a few deployment strategies were used, which can impact the observed wavefield amplitudes in this study (Sweet et al., 2018).As the station spacing is around 100 m on average, spatial coherence between adjacent stations is expected due to the simple geological structure of the region.We have tested a variety of radii, ranging from 0.00135 to 0.0054 • , and the wavefield results are about the same (Supplemental Fig. S1).We opt with a radius threshold of 0.0027 to balance smaller scale features and the wavefield coherence between adjacent stations.With the qualified measurements, we obtain a collection of SH wavefield functions for each earthquake in seven frequency bands, which describe the frequency-dependent, spatial distribution of the SH-wave peak ground velocity.We repeat this process for all frequency bands and all earthquakes in the cluster.

Wavefield Comparisons
Figure 6 shows the wavefields of an M 2.3 event at the four example frequency bands.This M 2.3 earthquake is the largest event in the target cluster.At low frequencies (e.g., 3-6 Hz), the associated wavefields show structured spatial patterns that vary azimuthally with respect to the earthquake epicenter.As expected, the measured PGV amplitudes decay with increasing distance from the epicenter.These structured patterns are highly similar to the SH wave radiation pattern for the M 2.3 (Kwiatek and Ben-Zion, 2013) that is predicted using a left-lateral strike-slip focal mechanism solution (Fig. 6a; Fan and McGuire, 2018).At higher frequencies (e.g., above 12 Hz), small-scale heterogeneities are present in the wavefields, which become more pronounced with increasing frequency.Visually, we observe some variations between the low and high frequency wavefields but as we discuss quantitatively later, the correlation values are high.
To quantitatively explore the pattern variations as a function of frequency, we compute the correlation values of the wavefields at different frequencies for the same earthquake (Fig. 7) as the normalized dot product between two wavefields.We use qualified wavefields from 17 earthquakes.In Figure 7, we include results from all seven frequency bands in this analysis.For a wavefield at a given frequency band, we compute the correlation values with wavefields from the other frequency bands for the same event (e.g., Fig. 7a).We further compute the geometric mean of the correlation values from these events to obtain an average estimate of the pattern variations across different frequencies.The 3-6 Hz wavefields of the earthquakes are strongly correlated with the 4-8 Hz bands with an average correlation coefficients of 0.986 (Fig. 7a).The correlation coefficients decrease gradually with increasing frequency, and the average correlation coefficient is 0.9 between the wavefields at the 3-6 Hz and 24-48 Hz bands.We observe a similar behaviour when using wavefield functions in other frequency bands as references, and the correlation coefficients decrease with increasing frequency separations.We find that despite the frequency dependence, the intra-event correlation coefficients remain above 0.839 for all frequency band pairs indicating a general similarity between the wavefields at all frequencies for the same earthquakes.
We further compare the SH wavefields of different earthquakes in the same frequency bands (Fig. 8).Visually, the wavefields are highly coherent, showing little variability at low frequencies (e.g., 3-6 Hz).We correlate the wavefields of 59 different earthquakes for four frequency bands (Fig. 9), following the same method as applied to obtain the intra-event correlation values.We perform 1505 inter-event correlations using qualified wavefield functions for each frequency band, and we only correlate wavefield pairs which are qualified at all four frequency bands.The inter-event correlation values have a mean of 0.965 at the 3-6 Hz band.At higher frequencies, the wavefield functions show more variability (Fig. 9).For the 24-48 Hz band, the mean correlation value is 0.918, and the standard deviation of the correlation values double that of the 3-6 Hz band (Fig. 9).The inter-event distance also appears to inversely scale with the wavefield correlation value (Figs.9e-h).However, we note that the wavefield spatial features remain broadly consistent between different earthquakes as shown in Figure 8. Similar to the intra-event comparisons, the mean correlation values of the inter-event pairs are above 0.918 for all four frequencies.These observations show that wavefields of different earthquakes share similar structured patterns and they likely have the same focal mechanisms.In conjunction with the similarity between the predicted SH wave radiation pattern and the observed SH wavefield functions, the overall high intra-event and inter-event correlation values suggest that the observed SH wavefields at multiple frequency bands are primarily caused by the fault slip instead of fault zone elastic collisions (Tsai and Hirth, 2020;Trugman et al., 2021).

Wavefield Ratios
The empirical Green's functions (eGf) method seeks to use records of small earthquakes as proxies of the Green's function to isolate the source time functions or source spectra of a larger target event (e.g.Mori and Frankel, 1990;Abercrombie, 2021).Here, we assimilate this approach in the time domain by computing the wavefield ratios between earthquake pairs (Fig. 10).This approach is similar to the method for obtaining source spectral estimate used in Al-Ismail et al. (2023).For a target event, we select appropriate eGf events in the same cluster with magnitudes between 0.7 and 1.5 units smaller than the target event.We require the eGf events to have more than 100 common and qualified stations with the target earthquake at all four frequency bands.We then point-wisely divide the target earthquake wavefield by that of an eGf event at every frequency band.We further compute a geometric mean to stack the wavefield ratios observed at each station for a given target earthquakes, and stations with less than five ratio measurements are removed from the final stack results.The ok1 cluster contains five qualified target earthquakes with magnitude greater than one and their wavefield ratios with the associated eGfs are analyzed in this study (Figs. 10 and S3).

Discussion
The agreement between the 3-6 Hz SH wavefield and radiation pattern of the M 2.3 earthquake in Figure 6 suggests that the low-frequency wavefield is modulated by its focal mechanism.Similar results have been observed in other wavefield experiments (Trugman et al., 2021), albeit using the P-wave wavefields.The general similarity between the low-and high-frequency wavefields for the same earthquakes in this study suggests that their high-frequency ground motions remain being controlled by the earthquake focal mechanism, therefore, fault slip.These observations differ from the results showing that the fault slip (double-couple components of the moment tensor) can only be observed in the SH wavefield at low frequencies (e.g., Takenaka et al., 2003;Satoh, 2002;Takemura et al., 2009;Castro, 2006).
The low frequency limit could be related to mixing of the SH and SV waves due to subsurface structural heterogeneities (e.g., Kennett, 1986;Takenaka et al., 2003) or the earthquake source (Tsai and Hirth, 2020).Further, the frequency limits may be related to hypocentral distance.For example, using peak SH velocity measurements for events with hypocentral distances between 4.9-20 km, Takenaka et al. (2003) observe a limit of 2 Hz, while Vidale (1989) study the peak acceleration data of two earthquakes, both with depths greater than 10 km, and find the fault slip is most observable in the 3-6 Hz limit.In contrast, Trugman et al. (2021) use P wave wavefields of the LASSO array in Oklahoma (Dougherty et al., 2019), with hypocentral distances between ∼3-35 km, and find that high frequency wavefields become more isotropic with a higher transition frequency limit of 15 Hz and the median double couple component is approximately 70 % at the 25-35 Hz band.Given the similar hypocentral distances, the higher frequency limit in Trugman et al. ( 2021) might be due to that P waves have more high frequency signals than SH waves.However, we observe SH high-frequency wavefields retain a structure pattern similar to the SH radiation pattern up to 24-48 Hz for events in ok1 that have a median hypocentral distance of 3.95 km and a maximum of 9.43 km.
As we focus on examining SH wavefields of co-located earthquakes within 1 km beneath the nodal array, their path and site effects are likely the same, indicating that the variations in the wavefield functions are solely related to variation in the earthquake sources.The agreement between low frequency wavefields (Fig. 8) suggests that these earthquakes likely share a similar focal mechanism.The wavefields become increasingly variable at higher frequencies, and this variation may relate to the faulting and rupture complexities of individual earthquakes (Fig. 9).For example, the stacked wavefield ratios of the M 2.3 event (Figs.10d-f) aligns well with the up-dip rupture model from Fan and McGuire (2018) that higher corner frequencies and short apparent-sourcetime-function durations are observed at stations where we observe large wavefield ratios.This agreement indicates that our approach can also delineate rupture directivity effects for small earthquakes.Moreover, wavefield ratios of the other target events (Fig. 10 and S3) appear to have different spatial patterns.These distinct wavefield ratio patterns may relate to different rupture processes between the earthquakes.
Our reported wavefield patterns could be from site effects that are not linked to source effects (e.g., Chang et al., 2023).To quantify the effects of this potential bias, we follow Ibs-von Seht and Wohlenberg (1999) to com-pute the horizontal to vertical (H/V) amplitude spectral ratio of 80 minute-long noise time series.The noise time series are randomly selected as 1-min long segments during a 10 day period from 1st to 10th of July.These H/V spectral ratios are used to infer spatial variations in site amplification for the nodal stations.Figure 11 shows the spatial distribution of the maximum spectral ratio in the 2-20 Hz range and the corresponding peak frequency.We observe that the spectral-ratio estimates are relatively uniform across the array, and the peak frequencies are higher towards the west.The uniform spectral-ratio distribution suggests that site effects do not bias our wavefield and wavefield-ratio results much, and the peak frequency variation may relate to the local shallow structure complexities (e.g., Zhong and Zhan, 2020).We experiment with removing sites with spectral ratio amplitudes deviating two standard deviations from the mean for the intra-event, interevent, and wavefield ratio analyses, and find the results do not change very much.

Conclusions
We develop two earthquake catalogs using the 2016 IRIS Wavefield Experiment data and apply a new data-driven method to evaluate microearthquake source properties using their SH wavefield functions.We find four earthquake clusters that have episodic seismicity bursts, including one cluster beneath the nodal array.The SH wavefields of this cluster are modulated by the earthquake focal mechanism and their wavefields at high frequencies are also impacted by the earthquake rupture complexities.These complexities may have caused reduced coherence in wavefields at higher frequencies for the group of earthquakes.Our results suggest that wavefield observations can be used to probe microearthquake source properties.Given its simplicity, this wavefield approach can be applied to new observations, such as distributed acoustic sensing, to infer microearthquake rupture processes.

Figure 2
Figure 2 Map view of (a) Catalog-18 over 120 days and (b) Catalog-57 over 27 days.Legends are similar to those in Fig. 1 with open and solid squares and triangles as seismic stations.Stations shown as solid black symbols are used to generate the catalogs.Four seismicity clusters, ok1 to ok4, are highlighted in different colors and unclustered seismicity is in grey.Circle size denotes magnitude and events with no magnitude are plotted as M -1.5 for visual purposes.Colored lines show the strike estimates of the corresponding clusters and the black line shows the maximum principal stress of the region(Alt and Zoback, 2017;Heidbach et al., 2010).Top-left insets show a zoom-in view of seismicity at the ok1 location for the catalogs.

Figure 3
Figure 3 Vertical cross section of seismicity for Catalog-18.Circular markers show earthquakes and their sizes are scaled by magnitude and events with no magnitude are plotted as M -1.5 for visual purposes.(a) strike-orthogonal (74 • azimuth as the positive direction) projection of the seismicity.Black dashed line shows the inferred dip of the shallow cluster in ok2.Seismicity is colored according to the cluster.(b) strike-parallel projection of the seismicity.(c,d) seismicity color-coded with their occurrence time as days in 2016.

Figure 4
Figure 4 Temporal evolution of seismicity in (a) ok1, (b) ok2, (c) ok3 and (d) ok4.Grey histogram shows daily seismicity rate.Circular markers show earthquakes and their sizes are scaled by magnitude and events with no magnitude are plotted as M -1.5 for visual purposes.Horizontal and vertical axes show the occurrence time and the along-strike distance to the cluster centroid location.Horizontal bars denote the span of bursts in each cluster and the number indicates the normalized time of the largest event in the burst.

Figure 5
Figure 5 SH wave record sections for the M 2.3 event along the E-W trending line of the nodal stations.Records are filtered in the (a) 3-6 Hz, (b) 6-12 Hz, (c) 12-24 Hz, and (d) 24-48 Hz bands.Records are normalized by the maximum amplitude of the respective frequency band.Red box shows the position of the maximum amplitude for each frequency band, which shifts with varying frequency bands.

Figure 7
Figure7Intra-event correlations for 17 earthquakes at the ok1 location.For each event, we take the wavefield recorded in the (a) 3-6 Hz (b) 6-12 Hz (c) 12-24 Hz and (d) 24-48 Hz band and correlate it with wavefields, for the same event, at all seven frequency bands.Grey lines show correlation values, for each event, as a function of frequency (we plot the midpoint frequency of each band).Black line shows the geometric mean of the grey lines.Note that the correlation coefficients are unity where the band is correlated with itself.

Figure 8
Figure 8 Wavefield functions for 100 example events recorded along the E-W trending line at (a) 3-6 Hz , (b) 8-16 Hz, (c) 16-32 Hz and (d) 24-48 Hz.Red line shows the wavefield function of the M 2.3 earthquake.For visual purpose, we plot all qualified data points including events with less than 100 stations.

Figure 9
Figure 9 Inter-event SH wavefield correlations for 59 earthquakes at the ok1 location.(a-d) Grey histogram show the pairwise SH wavefield correlations for the qualified wavefield functions.(e-h) Contour plot of correlation values as a function of inter-event distance.Colors correspond to the measurement density with a minimum value of 10 data points.

Figure 10
Figure 10 SH wavefield ratios.(a-c) wavefield ratio of the M 2.3 wavefield to that of one eGf event at (a) 3-6 Hz, (b) 6-12 Hz and (c) 12-24 Hz.Beach ball shows the M 2.3 event focal mechanisms.(d-f) geometric stack of wavefield ratios of the M 2.3 earthquake.(g-i) geometric stack of wavefield ratios of an example M 1.5 target earthquake.Grey circles show eGf events.

Figure 11
Figure 11 Quantifying site effects in the study region.(a) geometric mean of peak frequencies for the H/V ratios.Inset shows the peak frequency histogram with a mean of 7.484 Hz and a standard deviation of 2.364 Hz.(b) geometric mean of H/V ratio that corresponds to the peak frequency.Inset shows the H/V ratio histogram with a mean of 7.094 and a standard deviation of 3.187.No moving average is applied.