Evaluating and correcting short-term clock drift in data from temporary seismic deployments

Temporary seismic network deployments often suffer from incorrect timing records, which makes it challenging to fully utilize the valuable data. To inspect and fix such timing problems, the ambient noise cross-correlation function (NCCF) computed from daily waveforms has been widely adopted. However, it is still challenging to detect short-term clock drift and to overcome the influence of local noise on the NCCF. To address these challenges, we conduct a study on two temporary datasets: an ocean-bottom-seismometer (OBS) dataset from the southern Mariana subduction zone and a dataset from a temporary dense network in the Weiyuan shale gas field, Sichuan, China. We first inspect the teleseismic and local event waveforms to evaluate the overall clock drift and data quality for both datasets. For the OBS dataset, the NCCF is computed using different time segments (3, 6, and 12 h) in addition to daily waveform data to select the data length with optimal detection capability. Eventually, the 6-h segment is the preferred choice, with high detection efficiency and a low noise level. For the land dataset, higher drift detection is achieved by the NCCF using daily waveforms. Meanwhile, we find that NCCF symmetry on the dense array is strongly influenced by localized intense noise for large interstation distances (>1 km) but is well preserved for short interstation distances. The results show that the use of different segments of daily waveform data in the OBS dataset and the careful selection of interstation distances in the land dataset substantially improve the NCCF results. All the clock drifts in both datasets are successfully corrected and verified with waveforms and NCCFs. The newly developed strategies using short-segment NCCFs help to overcome the existing issues in correcting the clock drift of seismic data.


Introduction
Investigation of earthquakes and seismic imaging of the Earth's interior have been significantly improved by the development of modern instruments, expanding station coverage, and advanced processing techniques. In recent years, temporary seismic networks have rapidly grown in a variety of settings, including both marine and land environments, which has greatly improved our ability to understand the Earth's internal structure and underlying processes with high resolution. For instance, in the land environment, large-N arrays (a large number of stations with interstation distances of less than a few hundred meters) have been widely adopted by the scientific community. They are specifically designed for shallow structure imaging, including fault damage zones and velocity changes along faults (Jiang et al., 2021; Shao et al., 2022; She et al., 2022; Yang et al., 2014), high-resolution velocity structures of the upper crust (He et al., 2018; Jiang et al., 2020; Lin et al., 2013), monitoring the temporal variations of subsurface structures (Liu et al., 2021; Luan et al., 2022; Yang et al., 2021), and seismic site amplification (Qiu et al., 2021; Song and Yang, 2022). In addition, these networks help to characterize the source properties of small-magnitude earthquakes, such as source origin and stress drop (Kemna et al., 2020; Li et al., 2018; Sheng et al., 2020), and to monitor the spatiotemporal evolution of induced seismicity (Cochran et al., 2020; Dougherty et al., 2019; Wong et al., 2021; Zhou et al., 2021). Recently, Li et al. (2021) developed an integrated system based on artificial intelligence and 4G data transmission for monitoring the real-time evolution of aftershock sequences. On the other hand, Ocean Bottom Seismometers (OBSs) have been widely used in numerous studies to understand the seismicity and slab geometry in various subduction zones, e.g., Central/Southern Mariana (Cai et al., 2018; Chen et al., 2022; Zhu et al., 2019, 2021), Manila Trench (Zhu et al., 2022), Hikurangi (Mochizuki et al., 2021; Yarce et al., 2019), Cascadia (Alongi et al., 2021; Morton et al., 2018), Alaska (Bécel et al., 2017; Wei et al., 2021), and many other offshore structures.
With well-developed instruments and extended seismic networks, the first important step is to conduct quality control of the data, particularly of the timing accuracy. Indeed, long-term stations without real-time regulation, on land and especially in the ocean, are often affected by clock errors, which are critical for multiple geophysical applications such as earthquake relocation and seismic tomography (Hable et al., 2018) and for deriving subsurface structural changes (Liu et al., 2021; Luan et al., 2022). Continuous synchronization of the instrumental clock with the Global Navigation Satellite System (GNSS) is a standard way to ensure time accuracy (Ringler et al., 2021). Such synchronization, however, fails in OBS deployments because there is no GNSS connection (Hable et al., 2018; Xie et al., 2018; Zhu et al., 2020). As a result, these deployments often experience clock drifts at multiple stations. For example, Gardner and Collins (2012) reported an average clock drift of around ±2 s/year for 700 OBS deployments, which may exceed tens of seconds. Usually, clock drift refers to linear and continuous changes, which are easily estimated and corrected by synchronization with GPS on board before and after the deployment. However, substantial non-linear drifts still exist in the data, which can last for the entire deployment period (up to one year) or only for a short time (a few hours to a few days). For instance, out of 6 OBS stations deployed around Loki's Castle hydrothermal field, two stations experienced a linear clock drift of up to 4 s with an additional apparent jump of 0.03 s in the middle of the acquisition (Loviknes et al., 2020). Le et al. (2018) also noticed a constant clock drift of 192 s, which appeared after 5 months and remained constant in the data throughout the period. Besides linear and constant drifts, Xie et al. (2018) observed an irregular and fluctuating clock drift of up to 30 s during one month of a year-long dataset. In some cases, such drifts occurred for the entire period (Zhu et al., 2020), with oscillations between 0 and 90 s (Fig. S1).
In comparison, land stations are considered to have more reliable timing, as they have a built-in GPS antenna on the ground. However, some temporary instruments are buried directly underground without a GPS antenna (Farrell et al., 2018; Rodríguez-Pradilla and Eaton, 2020; Wang et al., 2018). Similar to OBSs, these instruments do not have real-time GPS signals for clock synchronization, which may lead to temporary or permanent clock drift. In addition, the GPS connection may also fail due to instrumental error or complex conditions, e.g., high altitude, mountainous areas, or unfavorable meteorological conditions (Jiang et al., 2019; Sens-Schönfelder, 2008). Many stations in such dense array deployments have suffered from clock drifts (Lee et al., 2017; Liu et al., 2021; Luan et al., 2022; Zhang et al., 2022), and the affected data are normally discarded if only a few stations are involved. Discarding the data is not always the best choice. If multiple stations exhibit clock drift, it might significantly affect the intended results. For instance, if the primary objective is to image the precise shallow fault geometry and the micro-seismicity along a fault, such problematic stations eventually lead to a data gap and cause a bias in the results or reduce the required resolution.
Over the years, numerous studies have been carried out to inspect and correct the clock drift in OBS data (Le et al., 2018; Liu et al., 2018; Loviknes et al., 2020; Zhu et al., 2020) and land stations (Hable et al., 2018; Sens-Schönfelder, 2008; Shapiro and Campillo, 2004; Stehly et al., 2007) by using waveform inspection and time symmetry analysis of ambient noise cross-correlation functions (NCCF). The most direct and robust clock inspection throughout the data can be performed via the differential travel time between the predicted and observed P waves of teleseismic and local earthquakes (Loviknes et al., 2020; Zhu et al., 2019). Although they provide a prompt overview, the discrete measurements based on seismic signals cannot represent timing errors over the entire deployment period due to the limited number of earthquakes. To overcome such uncertainties, the time symmetry analysis of the NCCF is commonly adopted. It is based on the principle that the NCCF between two stations consists of a causal and an acausal component, both of which should theoretically be in perfect symmetry throughout time (Bensen et al., 2007; Gouedard et al., 2014). Any clock error at a station disrupts this symmetry, so a clock drift or an abrupt change is indicated by broken symmetry of the NCCFs (Gouedard et al., 2014; Liu et al., 2018; Stehly et al., 2007).
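The symmetry principle can be illustrated with a small sketch. It assumes that a constant clock error at one station shifts the whole NCCF by that error, so the causal and acausal arrivals appear at lag times +T + δt and -T + δt, and half their sum recovers δt. The synthetic pulse shape, the function names, the sign convention, and all numerical values are illustrative assumptions and are not taken from the study.

```python
import math

def synthetic_nccf(lags, travel_time, clock_error, width=0.5):
    """Toy NCCF: Gaussian pulses at +/- travel_time, both shifted by clock_error (s)."""
    return [
        math.exp(-((t - travel_time - clock_error) / width) ** 2)
        + math.exp(-((t + travel_time - clock_error) / width) ** 2)
        for t in lags
    ]

def estimate_clock_error(lags, nccf):
    """Half the sum of the causal (t > 0) and acausal (t < 0) peak times.

    Assumes |clock_error| < travel_time, so one peak lies on each side of zero lag.
    """
    t_causal = max((a, t) for t, a in zip(lags, nccf) if t > 0)[1]
    t_acausal = max((a, t) for t, a in zip(lags, nccf) if t < 0)[1]
    return 0.5 * (t_causal + t_acausal)

dt = 0.05  # lag sampling interval in seconds (hypothetical)
lags = [k * dt for k in range(-600, 601)]
nccf = synthetic_nccf(lags, travel_time=10.0, clock_error=1.5)
estimated = estimate_clock_error(lags, nccf)  # close to the imposed 1.5 s shift
```

Peak picking is the simplest possible estimator; in practice the shift is often measured by cross-correlating the causal branch with the time-reversed acausal branch or with a reference NCCF, which is more robust at low SNR.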
In the above studies, NCCFs are derived from daily waveform data with hourly segments and are subsequently stacked over several days to improve the signal-to-noise ratio (SNR). Although this has proven to be a reliable method to inspect the instrument clock for each day, it fails to address some key problems, such as resolving sub-daily time drift or a clock drift arising within the selected stacking window. For instance, Zhu et al. (2020) used teleseismic and local event waveforms to report an irregular clock drift at one OBS station that fluctuated up to 70-90 s in different time periods. These short-period clock drifts could not be detected by the NCCF with stacked daily data, which ultimately led to part of the data being discarded. This emphasizes the possibility of short-term clock drift, which may be difficult to identify once the NCCF result has been stacked over a longer time window.
In addition, NCCF processing is strongly influenced by the distribution and strength of noise sources and by interstation distances in both OBS and land datasets (Hable et al., 2018; Hannemann et al., 2014). A previous study around the volcanic hotspot of La Réunion by Hable et al. (2018) successfully performed NCCF processing with maximum interstation distances of 20 km and >300 km for land and OBS stations, respectively. In contrast to OBS stations, land-based seismic stations are substantially affected by a non-uniform noise distribution with strong, local noise sources. Furthermore, when stations are deployed within a densely populated area, these factors might have a strong influence on temporal resolution and ultimately degrade the precision of the clock drift estimate.
In this study, we examine the OBS data from the Mariana subduction zone and land data from a dense array deployed in the Weiyuan shale gas field, Sichuan, China. Initially, we inspect the teleseismic and local event waveforms to evaluate the overall clock drift and data quality in both datasets. Based on the clock drift pattern, we carefully evaluate the NCCF using multiple time windows in addition to daily waveform data to find the optimal data length, i.e., the one with the highest drift detection that remains consistent with the waveform observations. Then, to overcome the impact of inhomogeneous noise sources on the NCCF, we select 9 reference stations with different interstation distances from the land data. We determine the maximum interstation distance with a stable NCCF and divide the dense array into subgroups accordingly. Finally, a precise clock drift for all OBS stations is calculated using the NCCF with a 6-h data length. For the land dataset, the clock drift is calculated for each group, and the inter-group NCCF is then performed using one overlapping station to verify the uncertainties in the clock drifts. After the correction, we repeat the NCCF with the corrected dataset and the same parameters to verify our results. The newly developed strategies using short-segment NCCFs help to overcome the existing issues in correcting the clock drift of seismic data, including short-period clock drift in OBS data and the influence of local intense noise on the NCCF in the land dataset.

Data description
The dehydration process at subduction zones stimulates magmatic arc formation and inter-slab seismicity at intermediate depths (Contreras-Reyes et al., 2021). The Mariana subduction zone is anticipated to be a water-rich system based on the evidence from active serpentine mud volcanoes (Fryer, 2012) and relatively low upper-mantle velocities (Cai et al., 2018; Zhu et al., 2021). To investigate these mechanisms, near-field OBS observations have been conducted and analyzed using high-resolution seismic tomographic images (K.Y. Wan et al., 2019; Zhu et al., 2021). In such deep-sea deployments, each station is valuable, as the expeditions require huge efforts. For example, Zhu et al. (2021) and Chen et al. (2022) analyzed the data from 12 OBS stations (red triangles; Fig. 1a), which were collected for the first time during two different experiments between December 2016 and June 2017 near the "Challenger Deep" (Fig. 1a). Clock drifts were detected and successfully corrected using event waveform inspection and NCCF analysis (Zhu et al., 2019, 2020), except for one station that experienced irregular clock drift (Zhu et al., 2020). Later, five stations were deployed for a period of one year (Oct 2018-Sep 2019) with an average interstation distance of ~30-100 km, primarily covering the Southern West Mariana Ridge (SWMR) (yellow triangles; Fig. 1a). The OBS instruments are the same as in the previous experiments reported by Zhu et al. (2019, 2020) and Chen et al. (2022). Each station is equipped with a three-component seismic sensor and a hydrophone. The sampling rate of the digital recorder is 100 Hz. In general, the OBS data are of good quality, except for station H33, where a low SNR has been observed (Fig. 4).
On the other hand, the Weiyuan area in the Sichuan Basin, China, is a tectonically inactive region with low historical seismicity (Lei et al., 2020). However, the basin has frequently experienced moderate-size earthquakes over the past few years, and most of them are attributed to hydraulic fracturing (HF) activities. Among several shale gas blocks, the Weiyuan shale gas field has seen a significant rise in seismicity around HF sites in recent years, including the two largest events with M_L > 5 in 2019, which led to substantial casualties and extensive damage to nearby structures (Lei et al., 2019). In addition, numerous events with M > 4 were triggered along the Molin fault at shallow depths (Fig. 1b), which is one of the major reasons they were more destructive (Sheng et al., 2020; M. Wang et al., 2020; Yang et al., 2020; Yi et al., 2020). With escalating concerns over the seismic hazard to nearby cities, it is crucial to investigate the fault geometry and associated seismicity using a dense network across the Molin fault, a region with intense seismic risk, for better hazard assessment and mitigation. For this purpose, a dense array was deployed from April to June 2020, consisting of 76 seismic stations with a total length of ~6 km across the Molin fault and an interstation distance of ~50-100 m (Fig. 1b). The sampling rate of each seismometer is 100 samples per second. Although most instruments operated regularly, 6 stations experienced recording problems, such as data gaps or ceasing to function during the acquisition.

Clock inspection through teleseismic and local events
We first inspect the teleseismic and local event waveforms of the OBS data to check the data quality, clock status, and possible clock drift. In this step, the differential time between the observed (T_obsr) and predicted (T_pred) P-phase arrival times, the latter computed using the 1D IASP91 velocity model (Kennett and Engdahl, 1991), is calculated at all stations for each event. It is assumed that the differential time at each station should be consistent with the others if the clock is operating normally, as the source location and travel path are nearly identical, as illustrated in Fig. 2b. Conversely, an abnormal differential time at any station indicates clock drift. To determine the possible clock drift in the OBSs, 18 teleseismic events with a good SNR are selected (Fig. 2a). The differential times between theoretical and observed arrival times reveal a clear signature of clock drift (around 10-15 s) at station H36 for two events, while other stations remain stable (Fig. 3a), which is further verified by waveform examination (Fig. 4a and b). However, the estimates derived from these events only depict the timing error at certain time points, and we are unable to trace the variation of the clock drift due to the large time gaps between events.
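The consistency check described above can be sketched as follows: for a single event, each station's differential time (T_obsr - T_pred) is compared with the network median, and stations that deviate by more than a tolerance are flagged. The use of the median as the reference, the 5 s tolerance for this step, the placeholder station name STA5, and all numerical values are illustrative assumptions rather than the procedure actually used.

```python
from statistics import median

def flag_clock_outliers(residuals, threshold):
    """Return stations whose differential time deviates from the network median.

    residuals: dict mapping station name -> T_obsr - T_pred (s) for one event.
    threshold: tolerated deviation from the median (s).
    """
    reference = median(residuals.values())
    return {
        station: value - reference
        for station, value in residuals.items()
        if abs(value - reference) > threshold
    }

# Hypothetical differential times (s) for one teleseismic event.
event_residuals = {"H33": 1.2, "H36": 13.4, "K06": 0.9, "K20": 1.1, "STA5": 1.0}
flagged = flag_clock_outliers(event_residuals, threshold=5.0)
# Only H36 is flagged, with a deviation of about 12.3 s from the median.
```

Using the median rather than the mean keeps a single drifting station from biasing the reference it is compared against.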
To tackle this problem, we inspect the differential arrival times of 50 local and regional events with higher SNR within a range of ~1000 km (Fig. 2c). Because the strong heterogeneity of the crustal structure in subduction zones produces larger differential times at the stations, a differential time exceeding a threshold of 5 s is considered a timing error. This threshold is set to exclude minor travel time differences caused by structural heterogeneity or wrong phase picks due to the poor SNR of local events. In total, nine events exceed the threshold at station H36 during the inspection (Fig. 3b), which is also verified with event waveforms (Fig. 4c-d). Meanwhile, the other stations operate with a stable clock. Although the time interval between the inspected events prevents an accurate drift estimation and its subsequent correction, the signature of an obvious clock drift at station H36 is clear. Station K20 is then used as the reference station for the NCCF analysis because of its higher SNR and stable clock.
Similarly, for the dense array in the Weiyuan shale gas field (WSGF), we select 36 teleseismic events with magnitude >5.5 recorded during the deployment period (Fig. 5). In a dense seismic network, it is an arduous task to pick the precise travel time at each station for multiple events, which might lead to uncertainties in the differential time (Fig. 6a). The waveforms at each station are expected to be highly similar to each other due to the close proximity of the stations and the homogeneous medium velocity (Fig. 6b). Any change in the waveform arrival time indicates clock drift at that specific station (Fig. 6c). To overcome the picking issue and detect very minor clock drifts, we adopt the event-based waveform cross-correlation method. Basic data processing is performed before cross-correlation, including detrending, mean removal, and bandpass filtering. In this process, the waveform recorded at the earliest station, which serves as the reference station, is cross-correlated with those at all other stations in the dense array for each event (Fig. 6a). If the clocks are working in stable conditions, a [...] Some problematic stations exhibit inconsistent clock drift throughout the acquisition period but, unlike OBS H36, do not show any sharp drift. For example, the observed drift pattern is identical in different stations, e.g., station ML75 suffers from a constant clock drift from the first event, and ML67 shows clock drift in the last week of the acquisition. Meanwhile, we notice that the number of stations with clock drifts increases with acquisition time, e.g., only 3 stations were detected with clock drift during the event that occurred on May 23, 2019, but the number of stations with clock drifts later increased gradually (Fig. S2). This emphasizes that comprehensive monitoring of these drifts is essential, as different stations may have experienced clock drifts in different time periods. We do not use local events for this dataset, as the teleseismic events are sufficient, with a short occurrence interval during the whole deployment period.
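The event-based waveform cross-correlation can be sketched as a search for the lag that maximizes the correlation between the reference trace and another station's trace. The sketch below is a plain time-domain implementation on a synthetic Ricker-like pulse; the function names, the wavelet, the 0.30 s delay, and the search window are hypothetical, and only the 100 Hz sampling rate comes from the text.

```python
import math

def ricker(n, center, width):
    """Synthetic Ricker-like pulse centred on sample `center` (illustrative only)."""
    return [
        (1 - 2 * ((i - center) / width) ** 2) * math.exp(-(((i - center) / width) ** 2))
        for i in range(n)
    ]

def best_lag(reference, trace, max_lag):
    """Lag in samples that maximizes the cross-correlation.

    A positive lag means `trace` arrives later than `reference`.
    """
    ref_mean = sum(reference) / len(reference)
    trc_mean = sum(trace) / len(trace)
    ref = [v - ref_mean for v in reference]
    trc = [v - trc_mean for v in trace]
    n = len(ref)
    best, best_cc = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        cc = sum(ref[i] * trc[i + lag] for i in range(n) if 0 <= i + lag < n)
        if cc > best_cc:
            best, best_cc = lag, cc
    return best

sampling_rate = 100.0                 # Hz, as stated for both datasets
reference = ricker(500, 200, 8.0)     # trace at the reference station
delayed = ricker(500, 230, 8.0)       # same pulse, 30 samples later
lag_samples = best_lag(reference, delayed, max_lag=100)
apparent_drift = lag_samples / sampling_rate  # seconds
```

For real data, the expected moveout across the array would have to be removed (or be negligible) before a residual lag is interpreted as clock drift, and a frequency-domain correlation would be considerably faster.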

Clock drift correction
Although we inspect and find obvious clock drift in both the OBS and land datasets, these approaches cannot correct it precisely. To trace the potential clock drift continuously, we adopt the ambient noise cross-correlation technique to obtain the empirical Green's functions (Yao et al., 2011; Zhu et al., 2019). Basic data processing is performed before obtaining the NCCF for both datasets, as discussed by Bensen et al. (2007), including: (1) downsampling the original 100 Hz daily waveform data to 20 Hz to save processing time; (2) removing the mean and trend from the waveform; (3) filtering to a suitable period band; and (4) applying spectral whitening and temporal normalization to suppress earthquake signals.
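Two of the steps listed above can be sketched in a few lines: removing the mean and linear trend, followed by a temporal normalization. One-bit normalization is used here because it is one of the schemes discussed by Bensen et al. (2007); whether it is the scheme applied in this study is an assumption. Downsampling, band filtering, and spectral whitening are omitted, and the function names are illustrative.

```python
def demean_detrend(x):
    """Remove the least-squares straight line (mean and linear trend) from x."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    slope = sum((i - t_mean) * (v - x_mean) for i, v in enumerate(x)) / sum(
        (i - t_mean) ** 2 for i in range(n)
    )
    return [v - x_mean - slope * (i - t_mean) for i, v in enumerate(x)]

def one_bit(x):
    """Keep only the sign of each sample, so large transients cannot dominate."""
    return [(v > 0) - (v < 0) for v in x]

# A drifting baseline with one large, earthquake-like transient (synthetic values).
raw = [2.0 + 0.5 * i + (250.0 if i == 6 else 0.0) for i in range(10)]
normalized = one_bit(demean_detrend(raw))
```

After normalization every sample has unit weight, which is why an energetic transient contributes no more to the cross-correlation than the background noise around it.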
We preferentially use the hydrophone recordings for cross-correlation of the OBS dataset, whereas the vertical component is used for the land dataset.
To obtain the optimal period band, we test the NCCFs with different narrow bands for the OBS dataset (2-5 s, 5-10 s, and 2-10 s) and the land dataset (2-5 s, 1-2 s, and 0.5-1 s), as shown in Fig. 8. The results show that the OBS data have a higher SNR and better phase symmetry in the 2-5 s period band than in the others (Fig. 8a-c). Similarly, for the land data, the same period band (2-5 s) provides acceptable symmetry with a good SNR (Fig. 8f), but the narrower bands (0.5-1 s or 1-2 s) offer higher resolution in the NCCF results and can detect very small changes with narrow amplitude symmetry (Fig. 8d and e). Based on this observation, period bands of 2-5 s and 1-2 s are used in the NCCF for the OBS and land-based data, respectively. After preliminary data processing, we evaluate and rectify the clock drift in each dataset separately in the next section.
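For readers who configure filters in frequency rather than period, the tested bands convert as in the short sketch below, which also checks them against the Nyquist frequency of the 20 Hz downsampled data. The function name is illustrative.

```python
def period_band_to_hz(t_min, t_max):
    """Convert a period band (s) into (low, high) corner frequencies in Hz."""
    return 1.0 / t_max, 1.0 / t_min

nyquist_hz = 20.0 / 2  # data are downsampled to 20 Hz before the NCCF

# Period bands (s) tested in the text for the OBS and land datasets.
tested_bands = [(2, 5), (5, 10), (2, 10), (1, 2), (0.5, 1)]
corner_frequencies = {band: period_band_to_hz(*band) for band in tested_bands}
all_below_nyquist = all(high < nyquist_hz for _, high in corner_frequencies.values())
```

The 2-5 s band corresponds to 0.2-0.5 Hz and the 0.5-1 s band to 1-2 Hz, so every tested band lies well below the 10 Hz Nyquist frequency.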

OBS data correction
It is important to assess the effects of the computation and stacking parameters on the NCCF-based estimation of clock drift. In numerous studies, the OBS clock drift is inspected by computing the NCCF from daily waveform data and stacking it over a few days (Hable et al., 2018; Le et al., 2018; Zhu et al., 2019) to several weeks (Hannemann et al., 2014; Stehly et al., 2007) to improve the SNR. However, stacking sometimes obscures a clock drift that is present only for a short period (Zhu et al., 2020). Similarly, we notice a clock drift of about 20 s at station H36 on November 11, 2018 (Fig. 9a), which disappears after ~2.5 h, after which the clock works in a stable condition (Fig. 9b). This shows that the drift can fluctuate within a single day. Furthermore, it is highly probable that such clock drifts will remain in the data if daily waveform data are used with an extensive stacking window. Therefore, it is crucial to carefully select the data length and stacking window to avoid obscuring such drifts. To overcome these problems, we compute the NCCF in time windows of 24, 12, 6, and 3 h and further stack 5, 5, 7, and 17 segments, respectively, to enhance the SNR (Fig. 10). These stacking windows are chosen carefully after testing multiple windows to achieve a robust and adequate resolution with a higher rate of drift detection. After calculating the clock drifts for each segment length, we further validate them against the results from the waveform inspection.
Fig. 8. Impact of different period bands on (a-c) OBS stations, where NCCF symmetry is preserved for 2-5 s and 2-10 s but distorted for 5-10 s, and (d-f) land stations, where NCCF symmetry is preserved for all period bands from 0.5 to 5 s, with higher resolution at 0.5-1 s and 1-2 s.
We presume that all the clock drifts identified through waveform inspection must be present in the NCCF results, and we use this as the basic criterion for choosing a specific data segment length for further clock inspection. For instance, the NCCF with a 24-h segment is not able to detect all the clock drifts found during the waveform inspection (Fig. 10a-c). For example, a clock drift of up to 18 s recorded in the first week during the waveform inspection is not detected by the NCCF. Likewise, the 16 s clock drift recorded on March 10, 2019, does not appear in the NCCF results. It is possible that such drifts occur within the stacking window or appear in the data only for a short period. The NCCF with a 12-h segment substantially improves the detection capability and is more consistent with the waveform analysis, e.g., it detects the clock drifts present in the first week of deployment and in early March 2019 (Fig. 10d, f), which the NCCF with daily data failed to detect. To further reduce the possibility of missing very short-period drifts, we obtain the NCCF using 6-h and 3-h segments (Fig. 10g-l). The NCCF with 6-h segments is able to detect more short-period clock drifts (Fig. 10g-i), with a clear similarity to the drift detection using 12-h segments. Meanwhile, the NCCF using 3-h segments apparently detects more clock drift but compromises the SNR (Fig. 10j-l).
There is a trade-off between detection efficiency and noise level in the choice of segment length, e.g., increasing the segment length (24 h) decreases the drift detection (Fig. 10a, b), while decreasing the segment length (3 h) amplifies the noise level (Fig. 10j and k). Eventually, the 6-h segment is the optimal choice, with high detection efficiency and a low noise level, which is also confirmed by the waveform inspection (Fig. 10h). Meanwhile, we find that the linear drift remains consistent for all segment lengths except the 24-h segment, where it slightly decreases, which may be related to the comparatively poor SNR.
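The trade-off can be made concrete with simple arithmetic on the segment lengths and stack sizes quoted above (24, 12, 6, and 3 h stacked over 5, 5, 7, and 17 segments). The sketch assumes that each stack is formed from consecutive, non-overlapping segments, which the text does not state explicitly, and it uses the ~2.5 h drift episode at H36 as the example.

```python
# Segment length (h) -> number of stacked segments, as quoted in the text.
stack_configs = {24: 5, 12: 5, 6: 7, 3: 17}

def stack_span_hours(segment_hours, n_segments):
    """Total time covered by one stack, assuming consecutive, non-overlapping segments."""
    return segment_hours * n_segments

def drift_fraction(drift_hours, segment_hours, n_segments):
    """Fraction of the stacked time span that a drift episode occupies (at most 1)."""
    return min(1.0, drift_hours / stack_span_hours(segment_hours, n_segments))

spans = {seg: stack_span_hours(seg, n) for seg, n in stack_configs.items()}
# spans == {24: 120, 12: 60, 6: 42, 3: 51}

episode_hours = 2.5  # duration of the short drift observed at H36
fractions = {seg: drift_fraction(episode_hours, seg, n) for seg, n in stack_configs.items()}
```

Under this assumption the 2.5 h episode occupies about 2% of a 5-day stack of daily segments but about 6% of a 6-h, 7-segment stack, which is one way to see why shorter segments preserve short-lived drifts better.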
Thereafter, the NCCF is computed for each station using a 6-h segment with respect to the reference station and further stacked over 5 segments (2 segments before and after the original segment) to improve the SNR. After analyzing all NCCF pairs, we observe that 3 out of 5 stations are affected by clock drifts, including both linear and irregular clock drifts (Fig. 11). Stations H33 and K06 have linear clock errors of 0.25 s and 0.56 s, respectively (Fig. 11c, i). In contrast, H36 experiences a considerable linear clock drift of 2 s along with short-period irregular clock drifts of varying amplitude (4-33 s) throughout the acquisition period (Fig. 11d-f), which is consistent with the inspection results of the seismic waveforms (Fig. 3). For example, the clock drift of up to 18 s during October 29-30, 2018, is confirmed by both analyses (Figs. 3 and 11f). Overall, the longest clock drift lasts up to three days, with a maximum drift of 33 s on June 19, 2019, and the shortest detected drift lasts 6 h (equal to the selected time window). After correcting the clock drift for all problematic stations based on the estimated values, the NCCF is recalculated using the corrected dataset (Fig. 11b, e, h), which reveals consistently symmetric signals and verifies the data correction step. In addition, we detect nearly all the clock drifts that appear during the differential time analysis.
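A correction of this kind can be sketched as a linear drift term plus constant offsets that apply only during short episodes. The sketch assumes that the quoted linear values (e.g., 0.56 s for K06) are the totals accumulated over the deployment, that the drift is zero at the start, and that a positive drift means the recorded clock runs ahead; none of these conventions is stated in the text, and the 365-day duration and the episode timing are hypothetical.

```python
def linear_drift(t, t_start, t_end, total_drift):
    """Drift (s) accumulated at time t, growing linearly from 0 to total_drift."""
    return total_drift * (t - t_start) / (t_end - t_start)

def correct_time(t, t_start, t_end, total_drift, episodes=()):
    """Corrected time for a recorded time t (all times in seconds).

    episodes: iterable of (t0, t1, offset) with an extra constant offset (s)
    that applies only while t0 <= t < t1.
    """
    drift = linear_drift(t, t_start, t_end, total_drift)
    drift += sum(offset for t0, t1, offset in episodes if t0 <= t < t1)
    return t - drift

day = 86400.0
deployment_end = 365 * day  # hypothetical one-year deployment

# Linear drift reached halfway through the deployment for a 0.56 s total.
midway_drift = linear_drift(182.5 * day, 0.0, deployment_end, 0.56)

# Toy example: 2 s linear drift over 1000 s plus an 18 s episode between 50 s and 150 s.
corrected = correct_time(100.0, 0.0, 1000.0, 2.0, episodes=[(50.0, 150.0, 18.0)])
```

In practice the episode offsets would come from the 6-h NCCF measurements, so the correction is piecewise constant at that resolution.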

Land data
The symmetric pattern and SNR of NCCFs can be influenced by a non-uniform noise distribution and intense, localized noise sources, as mentioned earlier. Seismometers on the ocean floor are unable to receive GPS satellite signals but operate in a more stable environment than land-based stations (Hable et al., 2018). Non-uniform noise sources and directions are common in land data, where seismic stations are deployed along roads, in residential areas, and at sites with different topography. To analyze the impact of interstation distance and noise conditions on the NCCF, we select 9 land stations with a stable clock from the dense array at different distance intervals (0-3 km) and test the NCCF results for a fixed period band (1-2 s) using daily waveform data (Fig. 12). We then evaluate the signal distortion over this distance range. Phase symmetry is preserved in the NCCF for short inter-station distances (<1 km) and gradually degrades as the inter-station distance increases (Fig. 12). An obvious lag shift is observed at 1.4 km (Fig. 12h), and the SNR deteriorates beyond 2.2 km (Fig. 12j). This suggests that the ambient noise field is homogeneous only within an inter-station distance of 1 km in this region.
Based on this observation, the dense array is divided into 16 groups (5 stations in each) with a maximum inter-station distance of <400 m to reduce the impact of localized inhomogeneous noise. The intra-group and inter-group station selection for the NCCF analysis is illustrated in Fig. 13. The NCCF with a period band of 0.5-1 s is computed from daily waveform data to examine the clock status without stacking, as the SNR is much better. Within each group, the NCCF is computed with respect to a reference station that remained stable throughout the acquisition period, to inspect and correct the clock drift at the other stations in the group. Later, the inter-group NCCF is performed using one overlapping station between adjacent groups to verify any uncertainties in the clock drifts due to inhomogeneous noise (Fig. 13).
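One way to build such groups from an ordered station list is sketched below: consecutive groups of five stations that share one station with their neighbour, so that an inter-group check can be chained along the array. The sharing rule is an interpretation of the "overlapping station" described above, and the 65-name list is a hypothetical input chosen only because it divides into 16 full groups; it is not the actual station list of the deployment.

```python
def overlapping_groups(stations, size=5, overlap=1):
    """Split an ordered station list into consecutive groups sharing `overlap` stations."""
    step = size - overlap
    return [
        stations[start:start + size]
        for start in range(0, len(stations) - overlap, step)
    ]

# Hypothetical ordered station names along the profile.
stations = [f"ML{i:02d}" for i in range(1, 66)]
groups = overlapping_groups(stations)

# Station shared by each pair of adjacent groups, usable for the inter-group check.
shared = [a[-1] for a, b in zip(groups, groups[1:]) if a[-1] == b[0]]
```

If the number of stations does not fit the pattern exactly, the last group simply comes out shorter; stations with known recording problems would be removed from the list beforehand.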
Meanwhile, we detect an unexpected change in NCCF symmetry for a single day on a few station pairs (e.g., ML03-05 and ML32-33) that could not be categorized as clock error due to an inconsistent symmetry pattern (Fig. 14a and b). To verify such abnormal behavior in the NCCF, we inspect the waveforms on these specific days before considering it a clock error. For pair ML03-05, we observe different data lengths on that particular day, whereas station ML33 did not record data for the whole day (Fig. 14c and d), so these cases are considered apparent drifts in the data rather than clock errors.
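A simple completeness check can catch such cases before an asymmetric NCCF is interpreted as clock drift. The sketch compares the number of recorded samples with the number expected for a full day at the 100 Hz sampling rate stated in the data description; the 95% acceptance threshold and the function names are hypothetical.

```python
def completeness(n_samples, sampling_rate_hz, window_hours=24.0):
    """Fraction of the expected samples that were actually recorded."""
    expected = sampling_rate_hz * window_hours * 3600
    return n_samples / expected

def pair_is_comparable(n_samples_a, n_samples_b, sampling_rate_hz, min_fraction=0.95):
    """True only if both stations of a pair recorded (almost) the full window."""
    return (
        completeness(n_samples_a, sampling_rate_hz) >= min_fraction
        and completeness(n_samples_b, sampling_rate_hz) >= min_fraction
    )

full_day = 100 * 86400  # samples in one day at 100 Hz
ok_pair = pair_is_comparable(full_day, full_day, 100.0)
gappy_pair = pair_is_comparable(full_day, full_day // 2, 100.0)
```

Days that fail the check can be excluded from the drift estimate rather than corrected.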
Out of 79 stations in the dense array, 13 stations experience clock problems during the acquisition period, which is 16% of the total dense array. In addition to linear clock drift, several patterns of clock drift are identified at the problematic stations, e.g., the clock starts to drift towards positive lag times and fluctuates back to negative lag times after a week, with a maximum drift of 2.3 s (Fig. 15a). Some stations work well in the first half of the acquisition but later start to drift linearly (Fig. 15d) or show a constant shift with an abrupt jump (Fig. 15g). All clock drifts are successfully corrected for all the stations. For further verification, we repeat the NCCF (Fig. 15b, e, h) and compare it with the teleseismic events.

Differential time analysis
Several methods, including delay time analysis using frequent earthquakes and repeating active sources, have been proposed and successfully implemented to address short-period clock drift in land and OBS datasets. In delay time analysis, short-period clock drifts are usually detected through teleseismic events with distinct arrival times. For example, Anchieta et al. (2011) and Zhu et al. (2019) compared the teleseismic P waves from collocated teleseismic events and observed the clock drifts of OBSs. This approach requires moderate to large earthquakes from nearly the same source location for precise monitoring of clock drifts. However, events from different sources or regions might cause uncertainties due to the Earth's heterogeneity. Meanwhile, frequent repeating or neighbouring earthquakes cannot be guaranteed, which eventually limits the temporal resolution of the clock drift inspection.
Since natural earthquakes are spatially confined, man-made sources are frequently used to image deep structures (Liu et al., 2021; Wan et al., 2019), and they can also help to overcome instrumental clock problems in short-term marine experiments. However, these sources are not suitable for long-term deployments (a few months to a year) because of logistical challenges and significant costs. The only feasible option is to use an active source (e.g., airgun shots) during the recovery stage of such short-term deployments (Lontsi et al., 2022). Since OBSs are deployed as a "free-fall" installation on the seafloor, their location uncertainties will still limit the achievable accuracy of the clock drift estimate. Moreover, this follow-up approach is restricted by the availability of active-source facilities on the vessel.
On the other hand, these sources are frequently adopted in the land environment to image fault zone structures and monitor their temporal variations, and they have proved to be an effective way to monitor the clock status (Chen et al., 2017; Jiang et al., 2019; Luan et al., 2022; Wang et al., 2020; Yang et al., 2018). However, in such active source experiments, seasonal variations and discontinuous measurements obstruct the continuous and accurate assessment of clock drifts. Meanwhile, real-time earthquake monitoring via 4G data transfer and artificial intelligence is also an effective development for synchronizing clocks and minimizing drift time (Li et al., 2021). However, it is not feasible for remote rural areas with no internet access, and especially not for OBS deployments and year-long deployment periods. Among all the established methods, arrival times from natural earthquakes remain the most viable and convenient way to inspect short-period clock drifts in any environment, with the limitation of the sampling gap.
The capability of differential time analysis improves significantly with the number of available teleseismic and local events, which reduces the sampling gap. Our results show that short-period clock drifts (up to 20 s) appear during 11 events in different time periods without any consistent pattern (Fig. 3). The data quality, including any potential minimal drift (Fig. 9), is comprehensively assessed using differential time analysis. Although this analysis cannot offer a high temporal resolution, it can be used as a verification standard to select the segment length and stacking window for reliable drift detection using NCCF in the next step.
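The differential time check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 1 s threshold, and the residual values are hypothetical, and removing the network median (to suppress errors shared by all stations, such as origin time or velocity model errors) is an added step that is not taken from this study.

```python
from statistics import median


def flag_clock_outliers(residuals_s, threshold_s=1.0):
    """Return stations whose arrival-time residual departs from the network median.

    residuals_s maps station name -> (observed - predicted) arrival time in
    seconds for a single event. The threshold is a hypothetical value.
    """
    network_median = median(residuals_s.values())
    flagged = {}
    for station, residual in residuals_s.items():
        offset = residual - network_median
        if abs(offset) > threshold_s:
            flagged[station] = offset
    return flagged


# Invented residuals for one event (illustrative values, not from the paper)
event_residuals = {"H33": 0.4, "H36": 16.2, "K20": 0.1}
flagged = flag_clock_outliers(event_residuals)
print(flagged)
```

With these invented numbers only H36 is flagged, which mirrors the kind of isolated, event-specific offset discussed in the text. Repeating the check for every available event gives the sparse time series that serves as the reference for the NCCF results.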

NCCF and clock drift detection performance
In general, compared to the above-mentioned methods, NCCF is an effective method that allows continuous monitoring and precise detection of clock drift.
The detection capability for short-period clock drifts depends on NCCF parameters, e.g., the applied filter, the data length, and the stacking window. The segment length of the daily waveform data is the most critical parameter; its choice depends on the expected clock drift in the data, the duration of the drift, and the resultant SNR of the NCCF. For this purpose, drift detections from differential time analysis serve as a reference to verify the clock drifts obtained from NCCF. Our results indicate that NCCF with daily waveform data is unable to effectively detect all the reference clock drifts (Fig. 10a-c), while computing NCCF with a short data segment improves the detection capability with an acceptable SNR (Fig. 10g-i) and shows good agreement with differential time analysis. In addition, the accuracy of the method is assessed by repeating the NCCF with the corrected dataset (Fig. 11). However, the potential source of these clock drift patterns remains difficult to determine. The proposed NCCF method using short segments successfully inspects and corrects the short-period clock drift in the data, which will significantly help to overcome the existing problem of short-period clock drifts in seismic data.
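A toy example may help to show why the segment length matters. The sketch below is not the processing used in this study: it assumes an idealized NCCF made of two Gaussian pulses at plus and minus the interstation travel time, estimates the clock error as the mean of the causal and acausal peak lags (one common symmetry-based estimator, valid only while the drift is smaller than the travel time), and approximates the daily NCCF as the sum of four 6-h segment NCCFs. All names and numbers are hypothetical, and the sign of the estimate depends on the order in which the two stations are correlated.

```python
import math


def synthetic_nccf(lags_s, travel_time_s, clock_error_s, width_s=0.5):
    """Two Gaussian pulses at +/- travel time, both shifted by the clock error."""
    return [
        math.exp(-(((t - travel_time_s - clock_error_s) / width_s) ** 2))
        + math.exp(-(((t + travel_time_s - clock_error_s) / width_s) ** 2))
        for t in lags_s
    ]


def drift_from_symmetry(lags_s, nccf):
    """Estimate the clock error as the mean of the causal and acausal peak lags.

    Assumes the drift is smaller than the travel time, so the two peaks stay
    on opposite sides of zero lag.
    """
    t_causal = max((a, t) for t, a in zip(lags_s, nccf) if t > 0)[1]
    t_acausal = max((a, t) for t, a in zip(lags_s, nccf) if t < 0)[1]
    return 0.5 * (t_causal + t_acausal)


lags = [i / 10 for i in range(-300, 301)]  # -30 s to +30 s in 0.1 s steps
segment_errors = [0.0, 0.0, 2.0, 0.0]      # four 6-h segments, one with a 2 s error
segments = [synthetic_nccf(lags, 10.0, e) for e in segment_errors]

# Drift measured on each 6-h segment separately
per_segment = [drift_from_symmetry(lags, s) for s in segments]

# Daily NCCF approximated as the stack (sum) of the four segments
daily_stack = [sum(values) for values in zip(*segments)]
daily = drift_from_symmetry(lags, daily_stack)

print(per_segment, daily)
```

In this idealized case the 2 s error is recovered on the affected 6-h segment, whereas the daily stack is dominated by the three unshifted segments and shows no drift. Real NCCFs are noisier, so shorter segments trade this gain in temporal resolution against a lower SNR.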
On the other hand, a homogeneous noise condition is one of the most critical assumptions for clock inspection using NCCF. Any change in the ambient noise contributes to asymmetry of the NCCF, which limits the accuracy of clock drift correction. In the case of the land dataset, NCCF symmetry is highly influenced by localized intense noise (Fig. 12). Our findings show that the influence of localized intense noise on NCCF is substantially reduced by careful selection of the interstation distance, which eventually leads to precise clock correction. For instance, NCCF symmetry is well preserved only for station pairs within a range of 1 km in the land dataset (Fig. 12). Since noise conditions vary between environments, it is of paramount importance to consider the local noise sources and choose a suitable interstation distance for reliable NCCF.
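The distance criterion can be applied before any cross-correlation is computed. The following sketch assumes a straight profile with hypothetical along-profile positions in kilometres; a real array would need geodetic interstation distances, and the 1 km limit is the value found for this land dataset rather than a general rule.

```python
from itertools import combinations


def select_pairs(positions_km, max_distance_km=1.0):
    """Return station pairs whose separation along the profile is within the limit."""
    pairs = []
    for (sta_a, x_a), (sta_b, x_b) in combinations(sorted(positions_km.items()), 2):
        if abs(x_a - x_b) <= max_distance_km:
            pairs.append((sta_a, sta_b))
    return pairs


# Hypothetical along-profile positions in km (not the real array geometry)
positions = {"ML01": 0.0, "ML02": 0.4, "ML03": 0.8, "ML04": 1.6}
pairs = select_pairs(positions)
print(pairs)
```

Only the pairs that pass this filter would then be used for drift measurement, so that each station is checked against nearby neighbours rather than against distant stations whose NCCF is distorted by localized noise.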

Apparent or actual clock drift?
(a) Missing Data Samples
In addition to linear and abrupt clock drifts, we detect apparent clock drifts on land stations with distinct symmetry patterns in NCCF (Fig. 14a and b). Unlike normal clock drifts (Fig. 11d), these features do not preserve the NCCF symmetry before and after the jumps. Instead, we infer that missing data samples are responsible for the apparent clock jumps. This hypothesis is verified by waveform inspection on the particular days (Fig. 14c and d). Therefore, it is of paramount importance to determine whether sharp changes in the NCCF results are caused by clock errors or by apparent drift.
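A simple completeness check on the daily traces can flag such days before the NCCF is interpreted. The sketch below is an assumption-laden illustration: traces are represented as hypothetical tuples of start time, number of samples, and sampling rate, and the 100 Hz rate, the 18-h record, and the 1 s tolerance are invented values.

```python
def check_daily_pair(trace_a, trace_b, expected_s=86400.0, tolerance_s=1.0):
    """Check two daily traces before cross-correlation.

    Each trace is a tuple (start_time_s, npts, sampling_rate_hz).
    Returns (both_complete, overlap_s): whether both traces span the expected
    duration, and the length of their common time window in seconds.
    """
    spans = []
    both_complete = True
    for start_s, npts, rate_hz in (trace_a, trace_b):
        duration_s = npts / rate_hz
        spans.append((start_s, start_s + duration_s))
        if abs(duration_s - expected_s) > tolerance_s:
            both_complete = False
    overlap_start = max(start for start, _ in spans)
    overlap_end = min(end for _, end in spans)
    overlap_s = max(0.0, overlap_end - overlap_start)
    return both_complete, overlap_s


# Hypothetical day: one station records 24 h, the other stops after 18 h
full_day = (0.0, 8_640_000, 100.0)
short_day = (0.0, 6_480_000, 100.0)
complete, overlap_s = check_daily_pair(short_day, full_day)
print(complete, overlap_s)
```

Days flagged as incomplete can be trimmed to the common window or excluded, so that a sharp change in the NCCF on those days is not mistaken for a clock jump.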
(b) Delayed Arrival Time and Low-Velocity Zone (LVZ)
When inspecting delay times across a dense temporary array, caution is needed regarding the low-velocity zones (LVZs) associated with fault zones. LVZs are usually hundreds of meters to several kilometers in width and can significantly reduce the seismic velocities relative to the host rock, which eventually delays the phase arrival times (Ben-Zion et al., 2003; Li et al., 2007; Share et al., 2017; Yang et al., 2011, 2014). Such delay times in an LVZ appear for all events outside the fault zone and look similar to a clock drift pattern (Yang et al., 2020). Therefore, it is critical to decide whether the delay times are due to clock error or to an LVZ. For instance, if multiple collocated stations continuously experience a consistent delay time, it is strongly suggested to inspect the local heterogeneity before considering it as a clock drift. In our dense array, we suspected possible zones of local heterogeneity near the Molin fault at the locations of the stations suffering from clock drifts. Our results show that the clock drift is not confined to a specific section of the dense array near the fault trace, which rules out the influence of any possible LVZ (Fig. 6, Fig. S2).

Observed clock drift pattern
Several patterns of clock drift appear in both temporary networks in addition to linear and short-period drift. We illustrate all the potential timing problems detected in the seismic data described above (Fig. 16). Based on the different behaviour of the clock drift, we divide them into several types for both the OBS and land data.
Linear drift: Clock drift is present in the data with a consistently increasing pattern, e.g., K20-H33 (Fig. 11g).
Linear clock drift with sudden jumps: Linear clock drift with sudden irregular jumps in the dataset, e.g., K20-H36 (Fig. 11d).
Fluctuated drift: Clock drift throughout the acquisition period without any consistent trend, i.e., ML32-31 (Fig. 15a).
Apparent clock drift due to missing data: The apparent clock drift is not consistent with phase symmetry. Several causes were identified, such as different start/end times of the daily waveforms during cross-correlation, or missing data samples on a specific day when the instrument was not working (Fig. 14).
Short-period clock drift: Such drift appears in the data for a short period with a constant or linear pattern (Fig. 15d). The GPS connection with the internal clock was lost at this specific station for a short time.
Constant time error: One station experienced a constant time shift of up to 2 s from the first week, which remained throughout the deployment period.
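The drift types listed above can be separated, to a first approximation, with a few summary statistics of the daily drift series. The sketch below is purely illustrative and is not the classification procedure of this study: the thresholds are hypothetical, apparent drift due to missing data cannot be recognized from the drift series alone, and a short-period drift is lumped together with sudden jumps.

```python
def classify_drift(drift_s, jump_s=1.0, slope_s_per_day=0.01, tol_s=0.1):
    """Assign a daily drift series (seconds, one value per day) to a simple type.

    All thresholds are hypothetical and would need tuning for real data.
    """
    n = len(drift_s)
    days = range(n)
    mean_t = sum(days) / n
    mean_d = sum(drift_s) / n
    # Least-squares straight-line fit to the drift series
    slope = sum((t - mean_t) * (d - mean_d) for t, d in zip(days, drift_s)) / sum(
        (t - mean_t) ** 2 for t in days
    )
    intercept = mean_d - slope * mean_t
    max_residual = max(abs(d - (intercept + slope * t)) for t, d in zip(days, drift_s))
    max_step = max(abs(b - a) for a, b in zip(drift_s, drift_s[1:]))

    if max_step >= jump_s:
        return "drift with sudden jumps"
    if max_residual > tol_s:
        return "fluctuated drift"
    if abs(slope) >= slope_s_per_day:
        return "linear drift"
    if abs(mean_d) > tol_s:
        return "constant time error"
    return "stable clock"


# Invented 20-day drift series for each type
linear = [0.05 * i for i in range(20)]
jumps = [0.05 * i + (3.0 if i >= 10 else 0.0) for i in range(20)]
fluctuated = [0.3 if i % 2 == 0 else -0.3 for i in range(20)]
constant = [2.0] * 20

for series in (linear, jumps, fluctuated, constant):
    print(classify_drift(series))
```

The jump test is applied first, so a series that is otherwise linear but contains one abrupt step is reported as a drift with sudden jumps, in line with the second type in the list.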

Limitations
Although our NCCF results using 6-h segments provide convincing detections and improve on the existing method, they may not capture clock drifts that persist for less than 6 h. It is quite possible that some clock drifts in the data last only for a noticeably short period, e.g., a few minutes to hours. For instance, a 16 s clock drift is recorded on OBS stations during the waveform inspection on July 26, 2019, but no drift is visible at this time in the NCCF. This emphasizes the need to verify the accuracy and performance of NCCF detection results with additional methods such as local and teleseismic P-wave arrival times.

Conclusion
In this study, we perform differential time analysis using waveform inspection, which provides key information about data quality and clock status. NCCF is then computed to monitor the clock stability continuously throughout the deployment time. The importance of waveform inspection is not negligible, as it acts as a verification standard for NCCF. The present study summarizes the impact of several factors that are potentially useful for identifying short-period clock drifts in waveform data via the NCCF technique. We show that NCCF using short segments of daily waveform data significantly improves the detection capability. In general, our approach efficiently detects clock drifts lasting 6 h or longer; however, shorter-term fluctuations may still remain in the data. In addition, careful selection of interstation distances substantially improves the NCCF results and reduces the influence of intense, localized noise. Overall, a preliminary analysis of the influencing factors is imperative prior to NCCF drift correction, and additional methods (e.g., teleseismic and local P-wave arrival times) are required to confirm the accuracy and efficacy of NCCF detection results for a reliable assessment of clock error.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. (a) The locations of deployed OBS stations along the Southern Mariana Subduction Zone. Red triangles denote the OBS stations deployed in 2016-2017, while yellow triangles denote the recent deployment (2018-2019) used in this work. WMR, West Mariana Ridge; SWMR, Southwest Mariana Rift; MGR, Malaguena-Gadao Ridges; DSZ, diffuse spreading zone. Inset: broader view of the study area (red frame). (b) The location of the dense array across the Molin Fault in the Sichuan Basin (yellow triangles). ML01, ML40, and ML80 denote the first, middle, and last stations, respectively. The black square represents Rongxian city, and the purple pentagons represent hydraulic fracking platforms. Inset: location of the study area (red frame).

Fig. 2. OBS dataset: bathymetry map of the OBS stations and selected events. The regular triangles represent the OBS stations (yellow), and the inverted black triangle represents the nearest inland station, GUMO. (a) The locations of teleseismic events (Mw > 6.5) are marked by red stars. (b) Illustration of the source locations and travel paths towards the OBS stations. (c) A zoom-in view of the OBS locations and the local events (Mw > 4.5) analyzed in this study (red stars).

Fig. 3. OBS dataset: Differential time between observed and predicted arrival times on OBS stations for both teleseismic and local events. (a) Differential time for teleseismic events. A sharp time drift of more than 15 s is observed on station H36 for two events. (b) Differential time for local events. A sharp time drift is observed on station H36 in nine different time periods. Circles in different colors represent the different stations, and the shaded areas highlight the earthquake events with sharp drift.

Fig. 4. OBS dataset: Waveform profiles of all OBS stations for the selected events. The red and blue bars indicate the observed and theoretical travel times, respectively. Predicted and observed arrival times are shown for (a-b) teleseismic events and (c-d) local events. Among all stations, H36 is influenced by sharp time drift.

Fig. 5. Land dataset: location of the study area and selected teleseismic events for event-based waveform cross-correlation. The black square and the regular triangle represent the location of the Sichuan Basin and the dense array, respectively. The locations of teleseismic events (Mw > 5.5) are marked by red stars.

Fig. 6. Land dataset: (a) Waveform profile of one event recorded on the dense array. (b) A zoom-in view of waveforms recorded on stations working under normal conditions and (c) on stations with clock errors. The red bars indicate the observed P-wave arrival times.

Fig. 7. Land dataset: lag time (clock drift) from event-based waveform cross-correlation using teleseismic events. Red and black circles represent the stations with and without clock drift, respectively. Gray and green shaded areas highlight the linear drift and the irregular clock drift, respectively.

Fig. 9. OBS dataset: Waveform profiles of two local events recorded on November 14, 2018. (a) Station H36 shows a clear time drift at 15:53 for an M4.5 event in South Mariana, whereas (b) the station works normally at 18:26 for an M5.2 event in the W. Caroline Islands.

Fig. 10. OBS dataset: Comparison of noise cross-correlation functions (NCCF) with different data lengths. The first column shows both the causal and acausal parts, the second column shows a zoom-in view of the first column, and the third column shows the time drift for each segment (blue dots). From left to right in each row: (a-c) NCCF with 24-h segments (daily NCCF); (d-f) 12-h segments; (g-i) 6-h segments; (j-l) 3-h segments.

Fig. 11. OBS dataset: NCCF results and time drift for different pairs of OBS stations. (a) Original NCCF results using 6-h segments of daily waveform data. The vertical black dotted line is the reference line for observing the lag time. (b) NCCF after clock drift correction. (c) Simplified illustration of time drifts for each station pair. Blue and green dots represent the clock drift before and after the correction, respectively.

Fig. 12. Land dataset: Comparison of NCCFs of station pairs with different interstation distances. (a-f) NCCF symmetry remains consistent within 1 km interstation distance, and (g-l) NCCF symmetry is gradually distorted with increasing interstation distance.

Fig. 13. Land dataset: Schematic strategy to overcome the local noise effect on NCCF for stations in the dense array. The triangle colors indicate the group to which each land station belongs. Inter-group station selection for NCCF is depicted by arrows. Arrow colors refer to the previous group color.

Fig. 14. Land dataset: Impact of missing data on NCCF. (a) Sharp change in the NCCF results of ML03-05 on the last day; (b) sharp change in the NCCF results of ML32-33 in the middle of data acquisition; (c) waveforms showing that ML03 has a shorter data length than ML05; and (d) no data recorded at ML33.

Fig. 15. Land dataset: NCCF results and time drift for different pairs of land stations. (a) Original NCCF results using daily waveform data. The vertical black dotted line is the reference line for observing the lag time. (b) NCCF after clock drift correction. (c) Simplified illustration of time drifts for each station pair. Blue and green dots represent the clock drift before and after the correction, respectively.

Fig. The potential timing problems detected in OBS and land-based seismic data.The red line represents the ideal stable clock, whereas the blue dots represent the different clock drifts from the stable clock for each day.The proposed names for different clock drifts are marked in the figure.