Assessing the potential to use repeated ambient noise seismic tomography to detect CO 2 leaks : Application to the Aquistore storage site

The Aquistore project in Saskatchewan, Canada provides carbon dioxide (CO2) storage for the world's first combined commercial power plant and carbon capture and storage (CCS) project. CO2 has been injected at a depth of 3.2 km since April 2015 and a permanent near surface geophone array provides passive seismic monitoring. The ability to identify any containment breach is a vital part of risk management and reduction for CO2 storage sites. We therefore investigate the potential to monitor seismic velocity changes following a hypothetical leak of CO2 from the reservoir using passive monitoring methods. We estimate the expected shearwave velocity change with CO2 saturation, and using data from the geophone array we investigate whether ambient noise interferometry (ANI) and a tomographic inversion for Rayleigh wave group-velocity maps could provide a suitable CO2 leakage detection tool. To assess the repeatability of the method, we conduct, for the first time, a time-lapse ambient noise tomography survey of a CO2 storage site to cover time periods preceding and following injection start-up. Sensitivity analysis results indicate that usable surface wave data derived from the current array configuration are sensitive to depths of ∼400m and shallower. We do not expect to observe any changes due to CO2 migration at such shallow depths and the estimated seismic velocities preand post-injection agree to within 60m s, which is on the order of double the predicted velocity change with CO2 saturation. Therefore, due to uncertainties in travel-time picks (5–15%) and variations in the obtained velocity structure between consecutive days (up to 20%), we would be unable to resolve the expected seismic velocity change with an influx of CO2 at 400m (∼3–4%). Additionally, the noise source variability does not allow stable velocity estimates to be made in the time-frame of currently-available data. Consequently, in the event of a CO2 leak at the Aquistore site, using the standard ambient noise analysis methods applied herein, Rayleigh wave tomography could be deployed to detect velocity changes due to CO2 saturation only if (a) a wider aperture surface array was in place to allow longer period surface waves to be used, providing sensitivity at greater depths, (b) arrival times of interferometrically-synthesised surface waves could be picked with increased accuracy, and (c) there is stability of the noise source distribution between repeated surveys. However, a map of three-dimensional near surface velocities, as obtained in this study, could nevertheless be useful for near surface static corrections when using active-source seismic reflection surveys to image and monitor the reservoir. More generally, further similar studies are required to assess the applicability of ANI for leak detection at other CO2 storage sites.


A B S T R A C T
The Aquistore project in Saskatchewan, Canada provides carbon dioxide (CO 2 ) storage for the world's first combined commercial power plant and carbon capture and storage (CCS) project. CO 2 has been injected at a depth of 3.2 km since April 2015 and a permanent near surface geophone array provides passive seismic monitoring. The ability to identify any containment breach is a vital part of risk management and reduction for CO 2 storage sites. We therefore investigate the potential to monitor seismic velocity changes following a hypothetical leak of CO 2 from the reservoir using passive monitoring methods. We estimate the expected shearwave velocity change with CO 2 saturation, and using data from the geophone array we investigate whether ambient noise interferometry (ANI) and a tomographic inversion for Rayleigh wave group-velocity maps could provide a suitable CO 2 leakage detection tool. To assess the repeatability of the method, we conduct, for the first time, a time-lapse ambient noise tomography survey of a CO 2 storage site to cover time periods preceding and following injection start-up. Sensitivity analysis results indicate that usable surface wave data derived from the current array configuration are sensitive to depths of ∼400 m and shallower. We do not expect to observe any changes due to CO 2 migration at such shallow depths and the estimated seismic velocities pre-and post-injection agree to within 60 m s −1 , which is on the order of double the predicted velocity change with CO 2 saturation. Therefore, due to uncertainties in travel-time picks (5-15%) and variations in the obtained velocity structure between consecutive days (up to 20%), we would be unable to resolve the expected seismic velocity change with an influx of CO 2 at 400 m (∼3-4%). Additionally, the noise source variability does not allow stable velocity estimates to be made in the time-frame of currently-available data. Consequently, in the event of a CO 2 leak at the Aquistore site, using the standard ambient noise analysis methods applied herein, Rayleigh wave tomography could be deployed to detect velocity changes due to CO 2 saturation only if (a) a wider aperture surface array was in place to allow longer period surface waves to be used, providing sensitivity at greater depths, (b) arrival times of interferometrically-synthesised surface waves could be picked with increased accuracy, and (c) there is stability of the noise source distribution between repeated surveys. However, a map of three-dimensional near surface velocities, as obtained in this study, could nevertheless be useful for near surface static corrections when using active-source seismic reflection surveys to image and monitor the reservoir. More generally, further similar studies are required to assess the applicability of ANI for leak detection at other CO 2 storage sites.

Introduction
Geological storage of carbon dioxide (CO 2 ) is one method proposed to reduce anthropogenic emission and release of greenhouse gases to the atmosphere to mitigate against climate change. Several successful commercial scale CO 2 sequestration projects have been conducted, for example, at the Sleipner (Verdon et al., 2013), Weyburn (Verdon et al., 2013), In Salah (Verdon et al., 2013) and Decatur (Couëslan et al., 2014) sites. Despite the success of these projects, safety concerns surrounding the technology persist, particularly relating to induced seismicity and the potential for CO 2 leakage.
The main aim of CO 2 storage projects is to prevent the emission of CO 2 into the atmosphere. A leak from the storage reservoir would negate the climate change mitigation effects, and could lead to groundwater acidification or result in an asphyxiation hazard in low-lying areas (Worth et al., 2014). In this study we investigate the potential for using a near surface deployment of geophones to verify lack of leakage of CO 2 from a storage site using data recorded at the Aquistore CO 2 storage site, Saskatchewan, Canada (Worth et al., 2014).
Passive seismic monitoring (PSM) is an attractive technology for monitoring CO 2 storage sites because it can provide near-real-time continuous data. Seismic monitoring for micro-earthquakes, termed microseismicity, has proven to be a useful tool in the study of the geomechanical response of sites to CO 2 injection (e.g., Verdon et al., 2011Verdon et al., , 2015Goertz-Allmann et al., 2014). The data can enhance the understanding of CO 2 migration patterns (Goertz-Allmann et al., 2014;Stork et al., 2015;Bauer et al., 2016), help calibrate geomechanical models (Shapiro et al., 2011;Verdon et al., 2015) and aid hazard assessment (White and Foxall, 2014;Kaven et al., 2015). However, a leak may also occur aseismically and, in this case, changes in seismic velocities determined using recordings of ambient noise could highlight the presence of CO 2 at shallow depths without the need for repeat activesource seismic surveys. Such changes could potentially be monitored using ambient noise seismic tomography, thus providing a relatively cheap and continuous check on storage integrity.
In this study, we first provide an overview of the Aquistore site and local geology. Subsequently we investigate whether ambient noise surface-wave tomography can provide sufficient resolution and accuracy for CO 2 leak detection. We conclude that this is not possible using the standard methods applied and datasets available in this study. We discuss the future research that is necessary to make this method applicable for leak detection at this and other CO 2 storage sites.

The Aquistore site
Aquistore is a CO 2 storage research project located in Estevan, Saskatchewan, Canada, and is managed by the Petroleum Technology Research Centre (PTRC) (Worth et al., 2014). CO 2 is captured at SaskPower's Boundary Dam coal-fired power plant, the world's first combined commercial power plant and CCS project (IEAGHG, 2015). Following capture, a portion of the CO 2 is sold for enhanced-oil-recovery (EOR) operations and the remainder is transported by pipeline and sequestered at the Aquistore site via the injection well ∼5 km to the west of the power station. The CO 2 is injected into a saline aquifer at a depth of 3150-3350 m and between April 2015 and June 2017 more than 105 kt of CO 2 was injected. The storage formations are the Black Island Member of the Winnipeg Formation and the Deadwood Formation.
The Aquistore site has one injection well and one observation well and is equipped with a wide range of monitoring technologies, including groundwater monitoring, downhole pressure and temperature measurements, GPS sites and interferometric synthetic aperture radar (InSAR) monuments. Time-lapse 3D seismic surveys have been conducted White et al., 2016) and the PSM technologies include a continuously recording near surface array of up to 64 geophones, three broadband seismometers, an intermittently recording array of geophones deployed for the 3D seismic surveys (a 2.5 km × 2.5 km array), a fibre-optic distributed acoustic system and a downhole geophone array. Here, we use data from continuously and intermittently recording geophones, a total of up to 304 instruments (Fig. 1). These are vertical-component 10 Hz geophones buried 20 m deep except for 25 geophones which are 3-component 10 Hz instruments buried 6 m below the surface. The area has previously been used for coal mining and, to enhance data quality, the instruments are buried so they are near the base of or below the surface layer which has been restored after mining. Burial of the receivers also enhances repeatability of studies as it reduces the effects of the seasonal changes in the near surface layer .

Overview of geology
The Aquistore project is located in the Williston Basin, an elliptically shaped sedimentary basin in central North America (Worth et al., 2014). At more than 3 km depth, the Winnipeg and Deadwood CO 2 storage formations are the deepest sedimentary units of the Williston Basin  and are brine-filled clastic strata below all oil producing and potash bearing formations (White et al., , 2016Samsonov et al., 2015). They are large volume, porous and permeable rocks, making them ideal storage formations for CO 2 . The overlying strata include the reservoir shale caprock (Icebox Member of the Winnipeg Formation) and evaporites of the secondary regional seal (Prarie Formation). These are effective barriers to CO 2 migration, providing good seal formations for the storage reservoir (White et al., 2016).
The shallowest bedrock unit at the Aquistore site is the Tertiary Ravenscrag Formation, consisting of interbedded sand, silt, clay and lignite and is 170-180 m thick (Klappstein and Rostron, 2014). This unit is overlain by thin Quaternary glacial tills and clays up to the surface, with thickness up to about 10 m. Below the Ravenscrag are the Late Cretaceous Bearpaw and Belly River formations, to depths of ∼250 m and ∼900 m, respectively .

Passive seismic techniques to detect CO 2 leakage
Injected CO 2 is more buoyant than the brine in the interstitial surroundings and so CO 2 could leak as a result of (a) vertical migration as a free phase through faults, fractures or permeable zones in the caprock; (b) diffusion as a dissolved phase through the caprock; (c) well integrity failure. If CO 2 migration is aseismic, one possibility is to monitor seismic velocity changes using active sources (e.g., Chadwick et al., 2010;Ikeda et al., 2016) or ambient seismic noise recordings (e.g., Boullenger et al., 2015;Cheraghi et al., 2017). Ambient noise is the diffuse wavefield of random background vibrations of the Earth due to atmospheric, oceanic, rock fracturing and anthropogenic activity (e.g., Snieder and Wapenaar, 2010). Ambient noise may also be correlated with a persistent, stationary noise source from a particular direction. The combination of the random and correlated signals is the measured seismic signal when no source is actively triggered. Ambient seismic noise travels through the same subsurface as wave energy from active sources and so should be similarly affected by earth structure. It should therefore be possible to extract some of the same information from ambient noise as from active surveys (Curtis et al., 2006). Ambient noise interferometry (ANI) is a method to estimate the Green's function between two seismic receivers without the need for an earthquake or active seismic source by using recordings of ambient noise energy at the two receivers. By measuring the wavefield at two points in space (at two receivers), cross-correlating the recordings and stacking over time, the response that would have been measured at the second receiver if the first was an impulsive source is obtained (e.g., Wapenaar et al., 2010). This response is termed a virtual seismogram. Effectively, this operation finds the travel-time difference of the waves which are recorded at the two receivers (Curtis et al., 2006). As ambient noise sources are mostly concentrated on or near the Earth's surface it is the surface wave part of the signal which is most easily extracted (Shapiro et al., 2005;Draganov et al., 2009), and it is this part of the signal on which this study concentrates. Surface waves are dispersive (meaning different periods travel at different speeds) because they are sensitive to seismic velocities in different subsurface depth ranges. Therefore, the Greens functions obtained from the cross-correlation of ambient noise can be used to obtain inter-receiver path-averaged surface wave dispersion (period versus speed) curves.
Surface wave travel-times between the two receivers can be estimated from the virtual seismograms at each period to enable a tomographic inversion of this data, producing an estimate of surface wave group velocity for the area spanned by the receiver array. This technique has been successfully and widely used to study the shallow earth structure on regional and continental scales (e.g., Shapiro et al., 2005;Lin et al., 2007;Galetti et al., 2017). The use of ambient noise crosscorrelations for the monitoring of storage sites ensures a high degree of repeatability when using a permanent array of receivers as the source-receiver geometry is unchanged between surveys. However, the repeatability also depends on the consistency of the noise source characteristics, as a change in the ambient noise field may result in a change in shape of the cross-correlated waveforms (Boullenger et al., 2015).
ANI has been applied at an exploration scale to obtain body-wave reflection images at Aquistore and other sites (e.g., Cheraghi et al., 2017;Draganov et al., 2009). It has also been used to produce surfacewave velocity models (e.g., Lin et al., 2008) and investigate seismic velocity changes due to industrial activities and seasonal effects. The 3D velocity models of the shallow sub-surface obtained from such tomographic inversions could be used, for example, to apply improved, timedependent static corrections in 3D reflection surveys. The feasibility of using ANI to detect velocity changes at CO 2 storage sites was investigated by Boullenger et al. (2015), with application to the Ketzin site in Germany. These authors produced zero-offset traces which closely matched those obtained from the active seismic survey. However, they were unable to compare pre-and post-CO 2 injection because the field-data recording began after CO 2 was observed in the storage formation directly beneath the permanent seismic array.
The Aquistore CO 2 storage project provides the ideal dataset for time-lapse monitoring of an industrial site because a permanent geophone array was installed at the site in 2012 and CO 2 injection began in April 2015. A subset of the geophone array has operated almost continuously since 2012 for seismic monitoring purposes . We assess the feasibility of using ambient noise to detect CO 2 leakage if it were to occur at shallow depths at Aquistore by estimating expected seismic velocity changes with an influx of CO 2 using Biot-Gassmann fluid substitution (Biot, 1941;Gassmann, 1951). We apply Gassmann's equation following the method outlined in Smith et al. (2003) to calculate the effect of replacing pore fluids with CO 2 . By applying the Voigt-Reuss-Hill formula we find the average bulk modulus for formations at different depths and use the in situ seismic velocities, densities and porosities derived from logging data taken in the Aquistore injection well. To calculate changes in S-wave velocities, dV s , we assume the pore fluid is brine before the influx of CO 2 . Fig. 2 shows that dV s at depths ≤1000 m would be > 1.5% for CO 2 saturations > 40%, which may be detectable with tomographic methods. At these shallow depths the CO 2 will be in the liquid/gas phase. On the other hand, at depths close to the 3.2 km deep injection point where the CO 2 is in the supercritical phase, the predicted variation in seismic velocities with the influx of CO 2 is small, < 0.5% (Fig. 2), as found by Roach et al. (2015). It is unlikely that such subtle changes are detectable by seismic tomographic methods.

Feasibility of using ANI as a CO 2 detection tool
A variety of ambient noise processing sequences have been applied by different authors in the past. Usually these are justified by arguments of improved stability or reliability of results in the presence of variable noise fields. We therefore first examine the Aquistore noise field itself and its stability over time, then a subset of the most commonly applied processing methods. We assess the feasibility of using surface-wave ANI and tomography to detect a CO 2 leak in the shallow subsurface at the Aquistore site. This is in contrast to the study by Cheraghi et al. (2017) who assess the ability of ambient noise reflection imaging to detect changes at the reservoir level.

Noise characteristics
Theoretically, ANI requires a diffuse and isotropic ambient noise field to correctly reproduce the impulse response at a receiver. For a time-lapse survey the effect of changes in the average ambient wavefield should also be small. The Aquistore site is surrounded by anthropogenic noise sources, including roads within 2 km of the northern, western and southern edges of the array, as well as the Boundary Dam power station and mining activities ∼2 km and 5-10 km, respectively, from the eastern extent of the array. However, Birnie et al. (2016) conducted an extensive study of the noise characteristics at Aquistore and conclude that the noise is not white, stationary, nor Gaussian. Due to the industrial activity in the area, there is a non-isotropic noise  (Smith et al., 2003) to formations at different depths.
A.L. Stork et al. International Journal of Greenhouse Gas Control 71 (2018) 20-35 source distribution with the most powerful sources of noise being to the east of the array. Fig. 3 shows the relative noise power histogrammed in bins of backazimuth and slowness; the more dominant the noise source, the larger the relative power shown and clearly none of these time intervals experienced isotropically distributed noise. Noise originating from the power station is particularly dominant during night time hours (between approximately 21:00 and 05:00 local time or between 03:00 and 11:00 UTC), as shown by the dark colours in Fig. 3(a) and (c). During the day (between approximately 09:00 and 17:00 local time) there is a more even distribution of noise sources, indicated by the smaller relative power values and a broader distribution of colour in Fig. 3(b) than 3(a) and (c). We therefore test if, and over what period, cross-correlations can be considered stationary.

Cross-correlations
To carry out a tomographic study using ambient noise data, noise recordings must be cross-correlated in receiver pairs to produce virtual seismograms. To calculate virtual seismograms for the Aquistore array of geophones, we limit any bias created by an uneven receiver, and hence path, distribution by restricting our analysis to geophones within a circle of 1.25 km radius of the injection well. By taking a circular, rather than rectangular, array of receivers we remove the possibility of errors caused by a higher density of paths on the diagonals. We also use only geophones that are operational during both time periods for which data was available for this study (13-18 April 2015 and 7-13 June 2015), giving a total of 208 geophones and 43,056 receiver pairs. CO 2 injection began at the site on 16 April 2015, and so the data allows us to determine if any detectable seismic velocity changes occurred at the site following the start of injection.
To prepare the data for cross-correlation, we follow the method of Bensen et al. (2007) which was tested and used by Nicolson et al. (2012Nicolson et al. ( , 2014 and Entwistle et al. (2015), amongst others. We use 60 s sections of data with one-bit time domain normalisation and spectral normalisation applied. The waveforms produced from each section were stacked over each time period to give a final average cross-correlation for each receiver pair. Some authors prefer other forms of normalisation (e.g., running-absolute-mean normalisation, Bensen et al., 2007) but several studies find that the one-bit normalisation is just as accurate and often more stable for different intervals of noise recordings or different receiver pairs (e.g., Nicolson, 2011;Galetti, 2015). We therefore use one-bit normalisation to maximise robustness and repeatability. We also test phase-weighted stacking (Schimmel et al., 2011) to ensure that this decision does not affect our conclusions. Fig. 4 shows an example cross-correlation using increasing stack length. After 24 h the main arrivals are stable, however the coda is still quite variable. The cross-correlation has greatest amplitude in the causal part of the signal, reflecting the non-isotropic noise source distribution seen in Fig. 3. It was found that although the main arrivals of the cross-correlations were stable when using 24 h of data in the stack, the travel time pick stability, and hence the estimated group velocity stability, did not necessarily improve with increasing data volume.

Volume of data for cross-correlation stacks
Using the 13 receiver pairs indicated in Fig. 5 we investigate the variability in estimated group travel-times with volume of stacked data. Up to 144 h of data are available from April and June 2015 and the straight ray group velocity estimated for a cross-correlation at a given period with a given number of hours of stacked data is compared to the estimated velocity if all available data is used. Figs. 6 and 7 show velocity difference for four periods (0.7 s, 0.9 s, 1.1 s and 1.3 s) for the 13 receiver pairs analysed. If the dispersion curves stabilise with increasing data volume, then the differences would decrease as the number of stacked hours increased.
The results show significant variability both between paths and over time, but the April and June results are very similar in terms of stability. Paths which have an east-west orientation have good stability over any time frame for all periods, but paths in other directions exhibit little stability. The maximum noise intensity (shown in Fig. 3) is in the east-west direction in both April and June, and so we see that the crosscorrelations are more stable when the path directions align with the noise intensity. For paths in directions other than east-west there does not appear to be a pattern of decreasing difference with increasing number of stacked hours, thus suggesting that even as we increase the   5. Subset of paths (blue lines) used to analyse the length of time required to obtain stable cross-correlations, the effect of using one-bit normalisation compared to phase weight stacking for the cross-correlations, and for three different group travel time picking methods. The receiver array used in the full study is shown by all the triangles, the injection well is indicated by the red star. The stations are numbered as indicated. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of the article.) A.L. Stork et al. International Journal of Greenhouse Gas Control 71 (2018) 20-35 number of hours to 144, our travel time picks do not become more reliable. The variability in estimated velocities may be due to the varying noise field on an approximately 12 h cycle, as highlighted in Fig. 3. We therefore use 24 h of data to estimate travel-times since the waveforms of the main arrivals are stable with this volume of data. If significantly greater data volume were available then it is possible the dispersion curves would stabilise, but many velocity estimates remain unstable when using all data available for this study.

Cross-correlation method: one-bit normalisation or phase-weight stacking
Single station processing of ambient noise includes temporal normalisation, in this study we chose to use one-bit normalisation in the cross-correlation method, a standard technique which has been used in many other studies (e.g., Campillo and Paul, 2003;Shapiro et al., 2005;Yao and Van Der Hilst, 2009;Nicolson et al., 2012Nicolson et al., , 2014Entwistle et al., 2015). However, the normalisation method may cause biases in the travel time picking so we test the effect of this choice on our travel time picks. For this test we used the subset of 13 paths shown in Fig. 5, 6 days of data for the cross-correlations in April and 7 days of data for the cross-correlations in June. We compare travel-times estimated using one-bit normalisation to those estimated when using phase weight stacking (Schimmel et al., 2011). Fig. 8 shows the comparisons for April and June of the travel time picks at all periods between 0.5 and 1.4 s (shown by the different colours) estimated from cross-correlations calculated using one bit normalisation and using phase weight stacking. The black lines indicate 10% bounds on the difference between estimates. The majority of points lie within the 10% bounds in April (82%) and June (97%). In April the outlying points result from the lowest periods (< 0.8 s, Fig. 8), these are waves that are sensitive to seismic structure close to the surface (depths < 200 m, see Fig. 9). It may be that at these short periods one of the two methods exhibits some instability. These results show that for June there is little difference between the two normalisation methods in terms of the picked travel times. As there is no indication of which is the better method to use, we chose to use one-bit normalisation, and to take into account the poor correlation between the results of the two methods at low periods when considering the reliability of the tomography results. Fig. 6. The difference in the (straight ray) group velocity estimated at periods of 0.7 s, 0.9 s, 1.1 s and 1.3 s in the April dataset when using different data volumes compared to the value found when using the maximum possible data volume for the 13 different paths. Paths are named by receiver number pair as shown in Fig. 5. The approximate orientation of each path is indicated on each plot. Fig. 7. The difference in the (straight ray) group velocity estimated at periods of 0.7 s, 0.9 s, 1.1 s and 1.3 s in the June dataset when using different data volumes compared to the value found when using the maximum possible data volume for the 13 different paths. Paths are named by receiver number pair as shown in Fig. 5. The approximate orientation of each path is indicated on each plot.

Group travel-time determination
The Multiple Filter Technique (MFT), as described by Dziewonski et al. (1969), was used to estimate group velocity dispersion curves for each virtual seismogram calculated above. In MFT the data are Fourier transformed to the frequency domain, bandpass-filtered using a Gaussian filter centred at the current period of analysis and then inverse Fourier transformed to return to the time domain. The time of maximum amplitude on the resulting trace is an estimate of the group travel time. However, due to the finite length of the signal, MFT incurs a systematic error and we apply the method described by Bhattacharya (1983) to correct this error.
Group travel-time picking is fully automated. Quality control criteria are applied, such as ensuring continuity of the dispersion curve over the period range considered, and removing unreasonably small or large velocities (where velocities are, at this stage in processing, based on straight ray paths between receiver pairs). An estimate of uncertainty in the travel time is produced based on the width of the peak of the filtered seismogram. Fig. 10 shows an example spectrogram from which a dispersion curve is picked by selecting the time of peak normalised amplitude at each period (each column of the spectrogram is the envelope of a filtered seismogram). The uncertainty is chosen based on the width of this peak at the normalised amplitude of 0.9. In this example, the area above this amplitude is narrow at low periods and there is an increase in the uncertainty of the travel time picks with increasing period. The dispersion curve which would be chosen from the example spectrogram is highlighted in white. These uncertainty estimates are input to the tomographic inversion, along with the estimated group travel times. Only travel-times for sources and receivers separated by more than three wavelengths are included to satisfy the far-field approximation implicit in ambient noise interferometry and to prevent interference of signals at positive and negative lags (Lin et al., 2008;Bensen et al., 2008).
To ensure good raypath coverage for the tomographic inversion, only periods with > 5000 accepted travel time picks (> 10% of the total number of cross-correlations) were used. Fig. 11 shows the number of accepted travel times for data on 14 April and 8 June 2015. There is a peak in the number of accepted picks between 0.6 s and 0.8 s period. The number of accepted picks for the 8 June dataset is greater; this is due to there being more available cross-correlations in June as many cross-correlations in the April dataset were rejected based on the quality criteria. The number of picks from each dataset restricts the accepted period range to 0.5-1.3 s and cross-correlations show clear Rayleigh waves in this range (e.g., Fig. 4). 5.6. Travel time picking method: stacking causal and acausal amplitudes versus using the side with the maximum amplitude Causal (t > 0) and acausal (t < 0) sides of the cross-correlation correspond to virtual seismograms for the case where one or other of each pair of receivers is converted to a virtual source and is recorded by the other receiver. Due to source-receiver reciprocity, in principle these two virtual seismograms should be identical (Wapenaar and Fokkema, 2006). However, in cases where noise distributions are non-isotropic, differences between causal and acausal virtual seismograms are observed.
The noise is highly directional at the Aquistore site (Fig. 3) and so most of the cross-correlations are asymmetrical (Fig. 4). Tests were carried out to determine whether the causal and acausal virtual seismograms should be stacked before travel time picking, or whether the side with the highest amplitude should be used. Fig. 12 shows the comparison of the travel time picks in these two situations, again at all periods (indicated by different colours) and showing the 10% bounds. The majority of the points are within the 10% bounds (97% in April and 98% in June), showing there to be little difference between the results when applying these two methods. Similarly to the results comparing cross-correlation methods, it is the lower periods which have the greater differences between the two methods. We conducted a   tomographic inversion using both methods and did not find any systematic difference between the two. However, it was found that there was better repeatability in the tomography results when stacking the causal and acausal sides of the cross-correlations (see Supplementary material for the tomography results using maximum amplitude picks). We therefore chose to add the acausal and causal sides of the crosscorrelations for the main study below.

Tomographic inversion
The group travel times found for a range of periods were then inverted to estimate tomographic maps of group velocity for the circular area with radius 1.25 km around the injection well. The inversion was carried out using the Fast Marching Surface Tomography code (FMST) (Rawlinson, 2005;Rawlinson and Sambridge, 2003), an iterative linearised method which uses the Fast Marching Method (FMM) for the forward prediction step and a subspace inversion scheme for the inversion step. We estimated appropriate ranges for smoothing and damping parameters, aiming to minimise the trade-off between model roughness and variance. The values chosen are shown in Fig. 13.
The initial model is a homogeneous model at 0.35 km s −1 , approximately the average velocity of all inter-receiver paths assuming straight rays. The velocity model is parameterised using a regular grid of cells. The tomography problem is linearised by estimating and fixing inter-receiver ray paths using FMM (which also estimates the traveltimes), then a subspace inversion scheme performs each inversion step; the rays are then re-traced through the new model found, followed by another inversion. Five iterations of ray path modelling and subspace inversion were performed to allow convergence of the solution to the results presented below.

Pre-injection Rayleigh wave group velocity model
Rayleigh wave group velocity models for periods between 0.50 s and 1.30 s were estimated for the Aquistore site and results are shown for 0.7 s, 0.9 s, 1.1 s and 1.3 s periods for 13 April data in Fig. 14(a), (d), (g) and (j), and for 14 April data in Fig. 15(a), (c), (e) and (g). For all periods studied the velocities lie between 0.25 and 0.45 km s −1 . These velocities are comparable to surface wave velocities found for similar periods in other studies (e.g., Lin et al., 2013). At periods < 1.0 s there is a northeast-southwest trend in the higher and lower velocity anomalies and there are higher velocities in the west. A northwest-southeast trend is visible at 1.1 s and 1.3 s with low velocities in the south and northeast. At 0.7 s and 0.9 s there is a low velocity zone in the east of the array which is continuous with period. The results at 1.1 s and 1.3 s are very similar in terms of structure, but the amplitudes of the anomalies are lower at the longer period.
The results from the two separate days in April have similar features, showing some stability in the tomography results over these two days. The maximum absolute difference between velocities at the periods shown in Fig. 15 is 34 m s −1 .

Post-injection Rayleigh wave group velocity model
The tomographic inversion was repeated for data collected two months after CO 2 injection commenced, in June 2015. Figs. 14(b), (e), (h) and (k) and 15(b), (e), (h) and (k) show the results for different days, 7 and 8 June, at the same periods as shown for the April dataset (0.7 s, 0.9 s, 1.1 s and 1.3 s). The velocity models have similar overall structure to the results for April and there is good correlation between the estimated velocities on different days (Fig. 17), although there are differences in absolute velocities up to 57 m s −1 . Again at periods < 1.0 s there is a southwest-northeast trend in the velocity structures. At periods > 1 s, there are low velocities in the south and northeast, and there is a low velocity anomaly in the east which is continuous between 0.9 s and 1.1 s. Again there are strong similarities between the results from these two consecutive days of ambient noise recordings.

Resolving the differences in pre-and post-injection models
Comparison of the velocity models for April and June 2015 indicates how seismic velocities have changed between the time injection began in April 2015 and the time of the second recording of ambient noise in June 2015. We do not expect significant differences between the two datasets at these periods because there are no known effects of the injection in the near surface during this time, including no evidence for any leakage of CO 2 during injection. Hence, we essentially perform a repeatability test: differences in results between the two surveys will indicate the level of overall uncertainty in practice, other than temporally consistent biases. The use of data from two days in April and two A.L. Stork et al. International Journal of Greenhouse Gas Control 71 (2018) 20-35 days in June also tests the repeatability, as the noise distribution might be more similar for consecutive days over which there should certainly be negligible change in the subsurface velocity structure. Fig. 14(c), (f), (i) and (l) shows the differences between the results for 13 April and 7 June, and Fig. 16(c), (f), (i) and (l) shows the differences between 14 April and 8 June. For periods between 0.7 s and 0.9 s the tomography results in April are similar (Fig. 14(a) and (d)), but there are greater differences in June (Fig. 14(b) and (e)). In addition, the differences between April and June results (Fig. 14(c) and (f)) are clearly significant at 0.9 s. Between 0.9 s and 1.1 s the tomography results in both April (Fig. 14(d) and (g)) and June (Fig. 14(e) and (h)) show large differences: at 0.9 s there is a SE-NW trending high velocity zone, while at 1.1 s there is a NE-SW trending high velocity zone. Although the tomography results show rather large differences, there is a similarity in the pattern of differences between the April and June results for both periods (Fig. 14(f) and (i)). For periods > 1.1 s the tomography results are similar (Fig. 14(g) and (j), and Fig. 14(h) and (k)) and the changes between the results for different days are less significant (Fig. 14(i) and (l)). Between 13 April and 7 June there is a zone of velocity decrease to the east of the injection well at 1.1 s with a velocity increase elsewhere within the array. At 1.3 s a similar pattern in differences is observed but changes are smaller in magnitude. In general, changes in velocity structure between two consecutive days in either April are smaller than the changes between April and June (compare Fig. 15 with Figs. 14 and 16). This shows stability in the results when we can be sure there are negligible detectable changes in subsurface velocity structure.
We have ruled out the possibility that stacking the causal and A.L. Stork et al. International Journal of Greenhouse Gas Control 71 (2018) 20-35 acausal sides of the correlations in the travel time picking is the cause of the large observed velocity differences by comparing results of tomographic inversions using travel time picks from only the maximum amplitude side of the correlations compared to travel time picks from stacks of the acausal and causal sides of the cross-correlation. It was found that estimating travel times from the side of the cross-correlation with the maximum amplitude caused the velocity differences between days to increase (see Supplementary material).
The changes we observe between models for different time periods must be due to some change either in the noise distribution or in the subsurface. Although we need not apply them in surface wave tomography, static correction time shifts are routinely applied to seismograms in seismic reflection surveys to compensate for irregular topography and low velocity zones in the shallow subsurface (a weatheringlayer correction). We do not expect such a weathering-layer related correction to vary significantly with time since the geophones at the Aquistore site are buried below the frost depth and water table, near the base of the coal-mining disturbed layer at 20-25 m depth . Also, Roach et al. (2015) report negligible changes in refraction statics between baseline 3D seismic surveys at the site. While it is true that these methods tend to be most sensitive to P-wave statics whereas our surface wave data are mainly sensitive to shear wave velocities, it nevertheless seems unlikely that changes in weathering-layer related corrections could explain the differences in velocity models between April and June 2015.
As stated by, for example, Wapenaar and Fokkema (2006), when using ambient noise interferometry by cross-correlation we assume that ambient noise sources span a closed boundary surrounding the receiver pair. However, when injection is active, as it is for the June dataset, noise is likely to be produced by the injection process. The Aquistore injection well is at the centre of the array and so there could be a local and comparatively strong source of energy within the array, which for A.L. Stork et al. International Journal of Greenhouse Gas Control 71 (2018) 20-35 some of the receiver pairs means there will be a source of energy between them. This violates the requirement for having energy sources on a surrounding boundary and is a likely cause of some of the differences in the tomography results. However, spectrograms of noise recordings from the centre of the array, close to the injection well, for times before and after injection start-up do not show any systematic differences in the amplitude and frequency content of the noise (Fig. 18). We suggest that we are unable to pick arrivals on the cross-correlation waveforms with sufficient accuracy to resolve velocity changes < 0.05 km s −1 . Uncertainties in the travel-time picks will contribute to the differences observed in velocity models and most uncertainties lie between 5% and 15% for both datasets. These uncertainties are similar in magnitude to the observed percentage velocity variations between days (Fig. 19). The percentage velocity variations between days are up to 20%too large to be able to resolve the expected velocity variations of up to 4% with an influx of CO 2 (Fig. 2). The differences between 14 April and 8 June (Fig. 19(c), (f), (i) and (l)) are larger than the differences between the other dates ( Fig. 19(a), (b), (d), (e), (g), (h), (j) and (k)) thus highlighting the variability in noise sources over time-scales of days. Some of the differences observed at 0.7 s could be explained by the results of previous analysis (Figs. 8 and 12) showing that travel time picks were less stable at lower periods, particularly below 0.8 s. Longer time periods of data are required to satisfy the assumption an isotropic ambient noise field.

Sensitivity and resolution tests
In order to determine to what depth ambient noise can be used to monitor the Aquistore CO 2 storage site, a sensitivity analysis was carried out. S-wave velocities derived from borehole logs and the A.L. Stork et al. International Journal of Greenhouse Gas Control 71 (2018) 20-35 dispersion curve modelling code DISPER80 (Saito, 1988) were used to estimate sensitivity kernels (derivatives) of Rayleigh wave group velocity relative to shear velocity at each depth. Fig. 9 shows sensitivity kernels for the period range 0.5-1.4 s. The kernels show that the maximum sensitivity for all periods analysed occurs at 100-200 m depth (where the CO 2 would be a gas) and the periods analysed are sensitive to depths of up to 400 m. As expected, longer periods show a greater sensitivity to greater depths. The maximum seismic velocity variations predicted by an influx of CO 2 are 3.5-4.5% at 200-400 m deep (Fig. 2). A leak to this depth range would affect Rayleigh wave group velocities to a lesser extent because the surface waves are most sensitive to shallower depths for the periods studied here (Fig. 9). Therefore, the tomographic velocity models derived from the current Aquistore array are not sufficiently precise to resolve this difference. With increased accuracy surface wave ANI could be used to resolve leaks to < 400 m depth and highlight any velocity changes caused by geomechanical deformation during injection. An increase in array aperture would allow longer period surface waves to be used, and improve the sensitivity to depths where the CO 2 remained in a supercritical state, giving additional warning time before CO 2 surface leakage. There is also the possibility of using body-waves to detect seismic velocity changes and the Biot-Gassmann fluid substitution equations predict P-wave velocity changes of up to tens of percent with an influx of supercritical CO 2 at depths of 1-2 km, much larger than the changes predicted for S-waves and surface waves. Cheraghi et al. (2017) use ambient noise to conduct common mid-point reflection imaging to retrieve virtual-shot gathers for the Aquistore site to assess the possibility of using this tool for time-lapse imaging. However, they find the quality of their ambient-noise images is limited by the directionality of the noise and a low body-to-surface wave ratio.
Checkerboard tests were also conducted to assess our ability to resolve lateral velocity variations at the site. Models with different sized anomalies were tested with a maximum velocity perturbation of 0.10 km s −1 relative to the reference velocity. Sets of synthetic traveltimes are computed through the checkerboard model for the same receiver pairs used to estimate the velocity models in Fig. 14 and 15, and then their ability to reproduce the checkerboard is tomographically tested. These tests indicate that, within the geophone array, we resolve  A.L. Stork et al. International Journal of Greenhouse Gas Control 71 (2018) 20-35 some resemblance to velocity checkerboard anomalies as small as 300 m for periods between 0.7 s and 1.3 s, and an example for 0.9 s period is shown in Fig. 20. There is blurring of the checkerboard structure which highlights the requirement for a higher density of ray paths if we hope to accurately recover the absolute velocity values and more detailed structure at the Aquistore site. While these tests do not provide information about genuine spatial resolution in (non-linear) tomography problems that is available using the fully non-linearised methods of Galetti et al. (2015), the latter methods are computationally prohibitive for the number of sensors used here. We therefore apply the checkerboard tests only to provide some indication of achievable resolution.
6.3. Potential for monitoring using ANI de Ridder and Biondi (2012) considered the possibility of using ANI for monitoring of CO 2 storage. Their results show sensitivity to similar depths as obtained in this study and suggest that there is potential to use ANI for monitoring when using 5 days of data. However, their field of study is an oil reservoir in the Norwegian North Sea, their data were recorded on the seabed, and their field is not a CO 2 storage facility. Different noise sources exist in oceans compared to on land and will produce data of different quality and repeatability. Much of the noise used in our study is anthropogenic, while the noise recorded in oceans will mostly be natural, with the addition of noise from shipping or from active seismic shooting if near to locations of interest for the hydrocarbon industry. This may indicate a limitation in terms of location for the use of ANI for monitoring CO 2 storage facilities.
Seismic monitoring of a CO 2 storage site has been shown to be possible in theory, for example using P-wave velocity tomography (JafarGandomi and Curtis, 2011;Spetzler et al., 2008) and controlledsource time lapse reflection seismology (Juhlin et al., 2007). These methods also consider the velocity change caused by the presence of CO 2 , and it is therefore shown to be possible using seismic methods to obtain the required level of repeatability to detect the extent of change in velocity which would be caused by the presence of CO 2 . However, the reliability of such methods is also dependent on the repeatability of surveys, and similar errors to those observed in our travel time picks have been observed. For example, a 3D seismic time-lapse survey by Ivanova et al. (2012) shows a normalised root-mean-square error on the order of 15-25%. For ANI to be reliably used for monitoring of CO 2 leaks the repeatability of surveys must also be improved.
The predicted velocity changes in the presence of CO 2 are small at the Aquistore site (< 4%) and very precise methods are required to detect these changes. Our analysis shows that although standard surface-wave ANI is not suitable for CO 2 leak detection at Aquistore with current data availability and processing methods, with increased data volumes and more precise travel time picking it might provide an early warning of leakage if time frames on the order of months were available.
At other CO 2 storage sites it may be possible to use ANI as noise sources will be different and may be more stable, and because the suitability of the technique also depends on the site geology, the array geometry and storage depth. A similar repeatability analysis to that performed in this study would be required to verify whether ANI is useful for leakage detection at each particular site. Independent of leakage detection capabilities at a given site, accurate 3D velocity models derived from ambient noise recordings could also be used to determine time-dependent static corrections for converted-wave 3D seismic processing where S-wave velocities are required. Overall, there is therefore a need to improve ANI methods in order to be able to monitor and image such sites in future.
Results from different normalisation and picking methods indicate that if a comparison is to be made between pre-and post-injection surveys then the same processing methods should be applied to the data. This means that any systematic errors due to the applied methods are constant over different times and any observed changes are due to actual differences rather than changes caused by methodological changes.
To use ANI for monitoring of CO 2 storage sites such as Aquistore, the array must be large enough to allow longer periods to be used, thus allowing studies to reach to greater depths. The approximate requirement of around three wavelengths of separation between receivers to obtain reliable travel time estimates means that if the velocity is higher, as generally occurs at greater depths, the array must be significantly larger than that at Aquistore to reach depths close to the reservoir. A three wavelength separation gives sensitivity to approximately one wavelength in depth. Assuming a surface wave velocity of 1 km s −1 the use of 10 s period waves (the minimum period required to be sensitive to a depth of approximately 3 km) would require an array with aperture of at least 30 km. However, this is of no consequence if the accuracy of the ANI measurements cannot be improved. Including more noise data is an obvious way to improve the cross-correlation estimates, but it may also be that an improved method of travel time picking may be required, one which produces less variability or more stringent quality control criteria. It is unrealistic to carry out the travel time picking manually if a large number of ray paths is to be used and so this travel time picking method must be automated.
It may be possible to improve the cross-correlation calculations by applying different normalisation and stacking methods. Two methods were tested in this study, but there are several options available (see Bensen et al. (2007) for more possible methods), and further testing with these other methods may improve the tomography estimates. Although we do not observe any significant change in the noise field with injection start-up, an internal source of energy may account for the instability of the cross-correlations during this time, hence it may be required to ensure that there are no large sources of seismic energy within the receiver array. It may be that at a different CO 2 storage site with more stable noise characteristics, ANI and ambient noise tomography may be a more viable method of monitoring CO 2 leakage.
In addition, the tomography method applied here is a linearised method, but tomography problems are intrinsically non-linear. Therefore, application of a non-linear inversion method may also improve the results. However, current non-linearised methods take a prohibitive length of compute power for the array sizes used here which would render the monitoring useless, so faster non-linearised methods must be developed before tomography estimates could be improved in this way.
The possibility also exists that the changes we observe in our tomography results are true changes in the subsurface. There are no reported effects of CO 2 injection at Aquistore, however we cannot rule out the possibility that the results show the effects of injection at the site. In order to test this, results from other methods should be compared for the two time periods.

Conclusion
To assess whether the passive seismic data recorded at the Aquistore CO 2 storage site in Saskatchewan, Canada, would be useful in leakage monitoring, we investigate the potential for ambient noise Fig. 20. A checkerboard resolution test using 0.9 s period June travel-time picks. Anomalies have a maximum pertubation of 0.10 km s −1 . Geophone locations are represented by the black triangles and the red stars are the injection and observation wells. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of the article.) interferometry to provide early warning of a CO 2 leak to the surface. This technique could provide continuous monitoring results for the remote detection of CO 2 movement. CO 2 is stored at depths in excess of 3 km at the site and any technology that can provide remote monitoring of the subsurface is a valuable method for the long-term safety of the CCS project.
Ambient noise interferometry, a linearised tomographic inversion, and a sensitivity analysis were applied to ambient seismic noise recordings from the 2D array of geophones present at the site during two periods of time to provide a time-lapse image of the site. The analysis shows that the ambient noise is sensitive to depths of 100-400 m for the surface wave periods considered here (0.5-1.4 s). The velocity models for April and June 2015 have similar overall patterns with velocities in the range 0.25-0.45 km s −1 , and some differences up to 0.05 km s −1 between the two time periods analysed. Travel-time picking uncertainties and changes in the noise source characteristics explain the differences in the tomography results between the two datasets and it is not thought the differences reflect a real change in surface wave velocities at the site. Differences between consecutive days are small which indicates that the results are relatively stable. However, predicted changes in seismic velocities with CO 2 saturation at 400 m are still too small to resolve with the current accuracy of velocity models.
Considering monitoring of CO 2 storage sites in general, ambient noise might be used to provide a cost-effective early warning system for leakage if significantly longer noise records could be obtained, or if in some other way travel-time picking uncertainties could be reduced. With a spatially larger array than that deployed at Aquistore it may be possible to obtain tomography estimates for longer periods, which are sensitive to greater depths. Repeated 3D seismic surveys are expensive and labour intensive to carry out and process. Therefore, ANI provides a more cost and time-effective imaging method if the lateral and depth extent of CO 2 can be determined with sufficient resolution.