Suitability analysis and revised strategies for marine environmental carbon capture and storage (CCS) monitoring

Environmental monitoring of offshore Carbon Capture and Storage (CCS) complexes requires robust methodologies and cost-effective tools to detect, attribute and quantify CO2 leakage in the unlikely event it occurs from a sub-seafloor reservoir. Various approaches can be utilised for environmental CCS monitoring, but their capabilities are often undemonstrated and more detailed monitoring strategies need to be developed. We tested and compared different approaches in an offshore setting using a CO2 release experiment conducted at 120 m water depth in the Central North Sea. Tests were carried out over a range of CO2 injection rates (6–143 kg d⁻¹) comparable to emission rates observed from abandoned wells. Here, we discuss the benefits and challenges of the tested approaches and compare their relative cost, temporal and spatial resolution, technology readiness level and sensitivity to leakage. The individual approaches demonstrate a high level of sensitivity and certainty and cover a wide range of operational requirements. Additionally, we refer to a set of generic requirements for site-specific baseline surveys that will aid in the interpretation of the results. Critically, we show that the capability of most techniques to detect and quantify leakage exceeds the currently existing legal requirements.


Introduction
Carbon dioxide (CO2) capture and storage (CCS) in subsurface geological formations has the capacity to provide up to 13 % of the emission reduction that is needed to keep the global temperature increase below 2 °C relative to pre-industrial levels and meet the terms of the Paris Agreement (IEA, 2015). Within Europe, the majority of the CCS capacity lies offshore, in deep geological storage sites such as depleted oil and gas reservoirs and saline aquifers (Energy, 2016; Vangkilde-Pederson, 2009). Although regarded as unlikely (IPCC, 2005), the potential leakage of CO2 from the storage reservoir into the marine environment could undermine the climate mitigation measures (Haugan and Joos, 2004). In addition, any increase in CO2 concentrations within the sediment and water column may also lead to local environmental damage, including a pH decrease exceeding the tolerance of calcifying organisms, release of potentially toxic substances into the environment and shifts in community structure (Hall-Spencer et al., 2008; Hassenrück et al., 2016; Kleypas and Langdon, 2006; Lichtschlag et al., 2015; Yanagawa et al., 2013).
Whilst international regulations for offshore CCS monitoring are currently not harmonised, the monitoring of a storage complex (i.e. the storage site and its surrounding geological domain) is part of the IPCC Guidelines (IPCC, 2006), the EU CCS Directive (EC, 2009) and the London (IMO, 2006) and OSPAR (OSPAR, 2007) Protocols. These regulations state, for example, that monitoring of a storage complex by the operator is mandatory for the purpose of detecting significant irregularities and migration of CO2, detecting leakage, and detecting potentially significant adverse effects on the surrounding environment (EC, 2009). Monitoring can also be beneficial for transparency and communication with stakeholders to alleviate reputational risk to the site operators (Waarum et al., 2017). Release of CO2 from a storage reservoir could occur through fractures and fracture networks, along sealed or active wells, or due to diffusion of dissolved CO2 across the reservoir seal (Busch et al., 2008; Gasda et al., 2004). In-well monitoring technologies are expected to provide early warning of loss of containment in the vicinity of the injection wells, where leakage risks are assessed to be highest. Leakage signals away from CO2 injection wells are likely to be stronger in the sedimentary overburden, but they might be very localized and hence harder to locate. By contrast, the area in which anomalies can be found in the water column is larger, but ocean currents and mixing will rapidly dilute any leakage signal (Dewar et al., 2013), and consequently it might be difficult to detect against the natural background CO2 levels. Core requirements for monitoring techniques and technologies are that they can measure and detect an anomaly, that they are cost-effective, that the equipment is easy to use with appropriate levels of training and experience, and that the processed results are available as quickly as possible.
Moreover, it is essential that the monitoring approach minimises the potential for returning false-positive results, i.e. the detection of high CO2 concentrations or other environmental anomalies that are not due to CO2 leakage from a CCS storage reservoir. Source attribution (i.e. determining the origin of the CO2) can help to achieve this (Dixon and Romanak, 2015).
Since 1996 CO2 has been injected into a sub-seafloor formation in the North Sea at the Sleipner gas field. Sleipner was the first offshore CO2 storage project and is one of the few currently operational offshore CCS storage sites worldwide (GCCSI, 2018). Because there has been no evidence for CO2 leakage at offshore industrial sites, detection and quantification of CO2 leakage in the marine environment have been tested with artificial injections of CO2 into the overburden sediments and the water column (Blackford et al., 2014; Taylor et al., 2015; Vielstädte et al., 2019). Based on these experiments, and experience from the oil and gas sector, various generic environmental offshore monitoring strategies have been proposed (e.g. Blackford et al., 2015; Shitashima et al., 2013; Wallmann et al., 2015) in addition to storage containment risk assessments that are designed for individual storage complexes (Dean and Tucker, 2017). Offshore monitoring is a fast-developing sector and novel approaches are often not yet included in the monitoring strategies. Environmental monitoring approaches might be especially needed in more remote offshore locations, where monitoring cannot be easily accomplished from the coast, e.g. the potential reservoirs in the North Sea that can be located hundreds of km offshore. Strategies for monitoring offshore CCS complexes also need to take into account financial, logistical and methodological challenges. Compared to onshore monitoring, the major challenges for offshore monitoring are the remoteness of many potential storage sites and the fact that the plume of dissolved CO2 can be limited to only a few meters above the seafloor, meaning that anomalies may only be present close to the seafloor or in the sedimentary overburden.
The majority of offshore monitoring approaches therefore currently still require the use of infrastructure such as ships, autonomous underwater vehicles (AUVs), remotely operated vehicles (ROVs) or seabed observatories. Due to the complexity of the marine environment and the monitoring requirements, it is crucial to test the capabilities of different monitoring approaches, such as survey patterns, sensor detection limits and sensitivity, and strengths and limitations in a real-world CO2 leakage scenario.
To evaluate the suitability of different monitoring approaches to detect, attribute and quantify leakage in the near-surface sediments and the water column in more remote offshore locations that cannot be monitored from land, a combination of novel and existing monitoring techniques and technologies was tested in the vicinity of a proposed offshore storage site during the 'Strategies for Environmental Monitoring of Marine Carbon Capture and Storage' (STEMM-CCS) project. A controlled CO2 release experiment was carried out as part of this project in the Central North Sea, close to the Goldeneye platform, which is approximately 100 km northeast of Peterhead (UK) at a water depth of 120 m (Flohr et al., 2021a). To mimic the leakage of CO2 from a CCS reservoir into the shallow overburden sediments and water column, CO2 was injected at a depth of 3 m below the sediment-water interface. The leakage rates used during the experiment were comparable to those observed from abandoned wells (Tao and Bryant, 2014; Vielstädte et al., 2015). This paper evaluates the relative suitability of the techniques and technologies tested during this experiment to perform specific monitoring tasks. For this we: 1) define the specific monitoring tasks and establish their requirements; 2) describe the tested monitoring approaches and their capabilities; 3) compare the benefits and limitations of the approaches relative to each other; and 4) share the lessons learned from the in-situ testing of these approaches. In doing so we demonstrate the potential of a broad range of techniques and technologies for offshore CCS monitoring, recognising that optimal strategies will vary according to specific storage complexes and sites, prevailing environmental conditions and/or project requirements, operational phase and commercial realities.

Monitoring tasks and challenges
Many existing legal frameworks for offshore CCS monitoring require investigation of: (1) leakage detection, (2) leakage attribution and (3) leakage quantification (Dixon and Romanak, 2015; IPCC, 2006; Shitashima et al., 2013). Various monitoring approaches that target these requirements (herein referred to as 'monitoring tasks') were tested as part of the STEMM-CCS controlled CO2 release experiment, building on the outcomes of previous projects (Dean et al., 2020). Additional information that could assist in the interpretation of the monitoring data, and that can be useful if acquired before, during and after the CO2 injection, is listed in Table 1. The majority of the approaches tested during the STEMM-CCS project fall into the category of monitoring of the near-surface, classified as regions located less than 10 m above and below the ground surface or seabed (IPCC, 2005). Monitoring also feeds into environmental site characterization and environmental impact assessment, but as these are discussed elsewhere, they are only considered where they overlap with the three main monitoring tasks described here.

Leakage detection
Leakage is defined as the release of any CO2 from a storage complex (EC, 2009) into the ocean and/or the atmosphere (IPCC, 2006). Under the EU CCS Directive, a monitoring strategy for CCS storage complexes has to assess whether any leakage of CO2 is occurring and if significant adverse effects for the surrounding environment can be detected (EC, 2009). Similarly, the London Protocol states that monitoring of the seafloor and overlying water column should be performed to detect leakage of CO2 or other substances dissolved in, or mobilized by, the CO2 (IMO, 2006). Providing assurance that no leakage is occurring (e.g. at abandoned wells, pockmarks or other higher risk locations) is an equally important task for leakage monitoring. To find an appropriate monitoring approach, leakage detection strategies need to accommodate the fact that the leakage of CO2 and other substances may occur from a single point source or as more diffuse discharge over a larger area. In the event that a CO2 leak is detected, further investigations may be needed to detect excess CO2 in surficial sediment at the leak site. In the sediments, the leaking CO2 may be present in gaseous or dissolved form and, if a strong overpressure leads to advection of fluids, CO2 can also be released from the sediment in dissolved form. Gas bubbles leaking from the sediments will quickly dissolve in the water column (Dewar et al., 2013; McGinnis et al., 2011) and CO2 will then prevail in its dissolved form as dissolved inorganic carbon (DIC).

Leakage attribution
Leakage attribution refers to the discrimination of CO2 leaking from a specific CCS storage reservoir from any other natural or human-made CO2 source. In the marine environment CO2 is produced during the oxidation of organic matter by microbes (Berner, 1980) and higher organisms in the sediments and the water column, and magmatically produced CO2 can be found in specific geological settings, such as hydrothermal zones. Up to now only a few marine CO2 seeps are known that are not of magmatic or hydrothermal origin (e.g. in the North Sea; McGinnis et al., 2011). Human-made sources from which the leaking CO2 might need to be discriminated could, for example, be other CCS reservoirs or enhanced oil recovery operations. Although attribution is mentioned in some of the guidelines (e.g. IPCC, 2006), it is currently not a legal monitoring requirement. Practical experience in monitoring of terrestrial CO2 storage sites, however, has shown that without thorough source attribution there is a significant risk of false positives (Romanak et al., 2014). Thus, it has been suggested that source attribution be included in future legal directives (Dixon and Romanak, 2015). As attribution may form part of a future CCS monitoring strategy (e.g. as part of any future carbon-credit schemes), attribution methods were tested during the STEMM-CCS experiment (Flohr et al., 2021b) and are included in our suitability analyses.

Leakage quantification
The EU CCS Directive states that if leakages, or significant irregularities that imply risk of leakage, are discovered, then an assessment of the scale of the problem is required (EC, 2009) and quantification will become necessary (EC, 2010). Similarly, the Clean Development Mechanism, which is a mechanism under the Kyoto Protocol for earning carbon credits/certified emission reduction credits for low-carbon projects in developing countries, states that it is mandatory to utilise techniques and methods that can estimate the flow rate and the total mass of carbon dioxide released from any leakage (UNFCCC, 2011). Consequently, in the (unlikely) event of a leakage, the rate and the spatial extent of CO2 leakage from a CCS storage complex need to be quantified, both to enable informed decisions on what remedial interventions are necessary and so that leaked carbon can be assessed against the total carbon stored.

Additional useful measurements to interpret the monitoring results
All of the described approaches require a certain amount of background information, sometimes referred to as a baseline or characterisation (Beaubien et al., 2015), in order to interpret the data and identify any anomalies (Table 1). Such information could include: knowledge of potential leakage pathways (to inform monitoring strategies); porewater and water column carbonate chemistry (DIC, pH, pCO2 etc., to detect anomalies (Blackford et al., 2017), to assist with attribution and quantification, and to place bounds on impacts); analyses of carbonate system co-variables (e.g. O2 or nutrients) to inform stoichiometric or process-based detection and attribution (Botnen et al., 2015); porewater analyses (to understand precursor impacts); knowledge of hydrodynamics (to assist detection); and knowledge of biota and ecosystems (to inform impact sensitivity). Given that all of these parameters are naturally variable over multiple spatio-temporal scales within the marine system, assembling comprehensive observational baselines can be an expensive and impracticable task within the timeframe of an industrial project. Whilst some prior observations are clearly useful or essential, others can be derived from pre-existing sources (e.g. oceanographic models) or may be derived from closely analogous sites contemporary to monitoring activities.

Monitoring techniques and tested approaches
The approaches for monitoring detection, attribution and quantification tested during the STEMM-CCS release experiment are listed in Table 2. These included commercially available instruments, approaches that were adapted from other fields, and more novel technologies with lower technology readiness levels (TRLs). The approaches were either direct measurements of defined parameters, using specific techniques and technologies, or indirect approaches that use modelling and calculations to aid monitoring, based on the obtained measurements (e.g. by defining anomaly criteria). Several of these approaches were able to address multiple monitoring tasks (Table 2).
Monitoring approaches were tested in the UK sector of the Central North Sea during an in-situ controlled release experiment. This experiment was designed to mimic leakage of CO2 from a CCS reservoir into near-surface sediments and the overlying water column. The STEMM-CCS experimental site was located ~800 m southeast of the Goldeneye platform, above a depleted gas reservoir that had previously been identified to be suitable for storing CO2 (Cotton et al., 2017; Dean and Tucker, 2017). The water depth was 120 m, with a tidal range of ≤ 2 m and dominant N-S currents. During the release experiment, CO2 gas was injected through a pipe into sediments at a depth of 3 m below the seabed. The CO2 flow rate was increased in a series of steps from 6 to 143 kg d⁻¹ over 11 days. Over the course of the experiment, streams of gas bubbles were observed seeping from the seabed, and increased levels of dissolved CO2 were recorded in the water column and in the sediment porewaters. Different instruments were deployed at the release site as explained in detail in Flohr et al. (2021a) and other related publications (Supplement Table 1). Environmental site characterization and collection of information that can assist in the interpretation of the results for the different monitoring tasks (Table 1) was achieved using previously published data for the region together with new data collected in the vicinity of the experimental site prior to the leakage experiment (cruises POS518 and POS527) or at sites not impacted by the CO2 release during the release experiment (cruises JC180 and POS534; Dale et al., 2021; Esposito et al., 2021).

Table 1
Information required to enable leakage detection, attribution and quantification. Information may be required either pre-injection (Pre), during operation (During) or post-injection (Post). *Indicates information likely to have been collected for CCS storage site selection.

Table 2
Approaches for monitoring leakage detection, attribution and quantification tested during the STEMM-CCS project, listed in alphabetical order. Green fields = best performance; amber fields = medium performance; red fields = worst performance. Details on the assessment parameters can be found in the text. This table compares the different tested approaches to highlight the tools and technologies that could be considered for certain requirements. Note that where methods were used for several monitoring tasks the scoring only refers to the task highlighted in grey; methods that have the potential to be used for other monitoring tasks in the future are signalled by brackets. The survey platform column reflects the deployment/operational requirements for each approach (i.e. a vessel with/without AUV or ROV) and are not considered in the method cost comparison. The use of an ROV is indicated if the approach requires precise positioning on the seabed; n.a. = not applicable.

Effectiveness, costs and benefits of tested monitoring techniques and technologies
To compare the effectiveness, advantages and disadvantages of the approaches tested during the STEMM-CCS field experiment, five assessment parameters were chosen: time; cost of the individual technique or technology; leakage phase; spatial coverage; and technology readiness level. The relative performance of each tested technique against these parameters is indicated in Table 2, with green reflecting a good performance (tested as suitable for offshore monitoring), amber indicative of an intermediate performance (worth considering as an offshore monitoring approach), and red indicative of a poorer performance relative to the other tested approaches (applicable, but further improvement may be desirable prior to commercial application in a real-world scenario). 'Time' reflects the amount of time needed for the results to become available after the field monitoring campaign has been completed, and includes any associated laboratory analysis and/or data processing (green = information available near-instantaneously; amber = information available within days to weeks; red = information available after >1 month). 'Cost' represents the relative cost to produce the data using an individual approach, separated into capital expenditures (CAPEX; the cost of purchase, upgrade and physical maintenance of the technique) and operating expenditures (OPEX; the running, analytical and operational personnel costs) (green = low; amber = medium; red = high). Ultimately, the largest cost for offshore monitoring is nearly always associated with the cost of the vessel needed to deploy the technology (with additional expenditure occurring if the deployment requires the use of a specialised platform such as an AUV or ROV). In this study these platform costs have not been included in the comparative assessment as all of the approaches considered were either deployed from a vessel or required the use of data directly generated from ship-deployed systems (Table 2).
The deployment platform costs are, however, considered separately in the discussion. 'Leakage phase' indicates whether the applied technique detected CO2 either in the form of gas bubbles (physical detection = amber) and/or dissolved CO2 (chemical detection = green). This differentiation was used as the chemical methods can measure dissolved CO2 in the absence of bubbles. No red category was applied for this parameter as all of the techniques were able to detect CO2 or related parameters. 'Spatial coverage' represents the survey area covered by each approach, and is mainly dependent on the platform needed to deploy the technology. Depending on the measurements being conducted, the spatial coverage was classified as either high (green = hundreds of square meters coverage), medium (amber = meters to tens of square meters coverage) or low (red = usually limited to a distance of a few meters away from a stationary instrument). It should also be noted that the spatial coverage of any technique measuring the dissolved phase will always depend on local hydrodynamics, thus all of the data reported in this study are specific to the hydrodynamics of the North Sea CO2 release site. Similarly, the rating for spatial coverage is relative and only compares the tested approaches. 'Technology readiness level' was classified as either commercially available (green = TRL 9), near market (amber = TRL 7-8), or in development (red = TRL <7). All of the scores presented here reflect the status of the technologies and approaches as of 2020.
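The colour scheme described above can be written down as a small rubric. The following is a minimal sketch only: the function names are hypothetical, and the numeric cut-offs for 'Time' (under one day for "near-instantaneous", 30 days for "one month") are assumptions made for illustration, since the text gives these classes qualitatively.

```python
def score_time(days_to_result: float) -> str:
    """'Time': delay between the end of the field campaign and usable results."""
    if days_to_result < 1:       # assumption: "near-instantaneous" read as under one day
        return "green"
    if days_to_result <= 30:     # assumption: "days to weeks" read as up to 30 days
        return "amber"
    return "red"                 # results available after more than one month


def score_trl(trl: int) -> str:
    """'Technology readiness level' as classified in the text."""
    if trl >= 9:
        return "green"           # commercially available (TRL 9)
    if trl >= 7:
        return "amber"           # near market (TRL 7-8)
    return "red"                 # in development (TRL < 7)


def score_leakage_phase(detects_dissolved_co2: bool) -> str:
    """'Leakage phase': chemical detection is green, bubble-only detection is amber.
    No red category exists for this parameter."""
    return "green" if detects_dissolved_co2 else "amber"


# Hypothetical example: a near-market chemical sensor with results after two weeks.
print(score_time(14), score_trl(8), score_leakage_phase(True))
```

'Cost' and 'Spatial coverage' are left out of the sketch because the text rates them relative to the other tested approaches rather than against fixed thresholds.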
A detailed description of the individual technologies and techniques that were developed and deployed, the parameters they measure, their strengths and limitations and, where available, the confidence of the measurements made by each technique are listed in Supplementary Table 1. Where an approach has the potential to be used for several monitoring tasks, the suitability analysis is shown for the task it performed best; details about its performance with respect to the other tasks are given in Supplement Table 1. A more compact version of this suitability analysis is publicly available online (http://www.stemm-ccs.eu/monitoring-tool/). Note that the online tool is periodically updated, and will likely evolve from this publication.

Proposed strategies for leakage detection, attribution and quantification based on suitability analysis
To ensure confidence in the long-term security of CO2 storage, comprehensive, risk-based measurement, monitoring and verification (MMV) programs are required. The natural variability of offshore conditions, including overburden characteristics, water depth, regional and local current conditions, presence of natural gas seeps and other environmental factors, means that MMV programs require site-specific strategies, and hence might differ for each individual storage complex. For example, the MMV assessment for the Goldeneye complex recommends monitoring the geosphere and wells, the marine biosphere, seabed and shallow sediment layers and water using multibeam bathymetry systems or sidescan sonars, and seabed and seawater sampling (Dean and Tucker, 2017). MMV programs will also contain a cost-benefit analysis that considers the existing technologies able to address the required monitoring task and their likelihood of success (Dean and Tucker, 2017).
For offshore operations, one of the largest costs associated with any MMV program is currently the survey vessel, and the specific monitoring cost will vary with the operational phase (pre-injection/operational/closure), the distance to the shore, and whether vessels or other viable survey platforms are available in the area when needed. Ship-time costs can vary greatly, and compared to approaches based on equipment that operates independently once deployed, the survey cost will increase further if the approach requires continuous ship-based measurements or employs an additional platform, such as an AUV or ROV. The need for a research vessel on site for some mobile platforms might change in the near future through the development of long-range AUVs, capable of operating from shore, with a range of up to 5000 km (Roper et al., 2017).
Two approaches for leakage detection and quantification were evaluated during the STEMM-CCS experiment: (i) survey-based approaches using techniques and technologies on mobile platforms such as ship- or AUV/ROV-mounted sensors, which can be used to monitor the whole area and (ii) fixed-installation approaches that place sensors on landers and other stationary platforms at an identified high-risk location on the seafloor with the use of a ship and an ROV. Here we summarize the effectiveness and applicability of the approaches tested in the Central North Sea, as well as lessons learned during their in-situ testing and how they might, in the future, be integrated into the CCS monitoring strategy of reservoirs in similar offshore settings.

Leakage detection
Whilst this paper details the methodological status of CCS monitoring techniques, optimal detectability further depends on devising suitable strategies for the deployment of fixed or mobile sensors (e.g. Cazenave et al., 2021; Greenwood et al., 2015; Hvidevold et al., 2016; Alendal, 2017; Oleynik et al., 2020). The initial challenge is to detect a signal which may be intermittent (e.g. due to suppressed bubble flow at high tide) and mobile (e.g. due to the tidal advection of CO2-rich plumes). Further, one of the principal difficulties associated with detecting CO2 leakage from offshore storage complexes is that the storage complex footprint can be up to several hundred square kilometres (when projected onto the overlying seabed), while, depending on the leakage rate, a leakage may only be detectable in an area in the order of tens to hundreds of square meters. For example, the suggested survey area of the Goldeneye storage complex is approximately 200 km² (Dean and Tucker, 2017). In all cases, the results reported here presume an optimal placement or routing of sensors, and thus reflect a "best case" detection scenario.

Mobile survey platforms
Traditional ship-based methods can be used for detecting leakage in the form of gas bubbles in large areas, such as above the Goldeneye storage complex, for example with active acoustics such as the water column recordings of multibeam echosounders. Multibeam bathymetry systems or sidescan sonars are designed to produce broad-swath beam patterns, and when mounted on a ship, can cover a large area within a limited period of time. The advantage of these acoustic methods is that results are available almost immediately, though more detailed geochemical measurements are likely to be required to determine the nature of any gas bubbles detected. Unlike other gases, CO2 dissolves quickly in seawater, thus any gaseous CO2 released during a leak is expected to dissolve within meters of the seafloor under most circumstances (e.g. Dewar et al., 2013; McGinnis et al., 2011; Vielstädte et al., 2015). During the STEMM-CCS release experiment bubbles rose to a height of 8 m during the lowest injection rate (6 kg d⁻¹; Flohr et al., 2021a). Hence, to detect leakage in the water column with sufficient detail a multibeam echosounder of sufficiently high frequency can be mounted on an AUV and operated at a maximum height of about 50-100 m above the seafloor to observe gas bubbles. In principle, covering 200 km² by AUV is possible in a week (assuming an average AUV speed of 5 km h⁻¹ and a swath width of 250 m at a height of ~50 m above the sediment). Of all available approaches, provided the AUV can operate independently from a ship at least for the duration of its mapping missions, this is currently the fastest approach to survey a storage complex.
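The one-week estimate can be reproduced from the stated speed and swath width. This is a minimal sketch under simplifying assumptions that are not from the source: adjacent swaths do not overlap, turns between lines take no time, and the AUV surveys around the clock. The function name is hypothetical.

```python
def swath_survey(area_km2: float, swath_km: float, speed_kmh: float,
                 hours_per_day: float = 24.0) -> tuple[float, float]:
    """Return (track length in km, duration in days) for a full-coverage swath survey.

    Assumes non-overlapping swaths, no turn time and continuous operation.
    """
    track_km = area_km2 / swath_km                 # total line length to fly
    days = track_km / speed_kmh / hours_per_day    # time at constant speed
    return track_km, days


# Values from the text: 200 km², 250 m swath, 5 km/h.
track_km, days = swath_survey(200.0, 0.25, 5.0)
print(f"{track_km:.0f} km of track, {days:.1f} days")
```

With these inputs the track is 800 km long and takes 160 h, or just under seven days, which matches the "possible in a week" statement. Swath overlap or battery changes would lengthen the survey accordingly.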
Bubble release rates can be related to tides, with gas bubble emissions from the sediments decreasing or ceasing at high tide as hydrostatic pressure increases (Blackford et al., 2014; Römer et al., 2016). Chemical anomalies within the water column will have a larger footprint than gas bubbles alone (Dewar et al., 2013; McGinnis et al., 2011) due to dissolution and dispersion by currents and mixing. Therefore, survey-based methods have been tested that are independent of the presence of gas bubbles, but instead are capable of measuring CO2 dissolution products. Most of the chemical detection methods tested during the STEMM-CCS experiment were able to detect dissolved CO2, and associated changes in seawater pH, even at low leakage rates (Table 2, Supplement Table 1). During the release experiment the chemically-detectable plume extent (i.e. pH change of >0.01 units) was estimated at ~3 m wide × 90 m long at 2 m altitude above the seafloor at the highest release rate (143 kg d⁻¹, Monk et al., 2021). This is consistent with models that predict that, if leakage rates are below 1000 kg d⁻¹, the excess DIC plume will not exceed 10 m height, will be predominantly limited to 2–3 m above the seafloor, and the area with a 0.01 and 0.001 pH change will have a horizontal footprint of below ~650 m² and ~12,000 m², respectively (Fig. 1). These modelling results are consistent with other in situ experiment results from the North Sea, reporting a plume height of not more than 2 m above the seabed at a leakage rate of <150 kg d⁻¹ (Vielstädte et al., 2019). With water column conditions similar to those observed in our experiment, larger leaks might be detected at greater distances from their source (e.g. 54,000 kg d⁻¹ would result in a 0.01 pH signal up to 1 km from a source; 36,000 kg d⁻¹ in a 0.001 pH signal up to 750 m away from the source; Fig. 1) and higher into the water column, while the formation of a multiphase buoyant plume could further increase the detection range (Oldenburg and Pan, 2020). However, as the magnitude of the leakage and hence the vertical extent of the anomaly in the water column will not be known in advance, any chemical sensing should ideally be performed as close as possible to the seabed.
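To give a feel for how small these footprints are, the modelled areas can be converted into an equivalent length scale. The sketch below assumes a circular footprint centred on the source, so that the length scale is the circle radius; both the centring and the function name are illustrative assumptions, and a real plume is elongated along the current.

```python
import math


def equivalent_radius_m(footprint_m2: float) -> float:
    """Radius of a circle with the given area (assumed plume shape, source at centre)."""
    return math.sqrt(footprint_m2 / math.pi)


# Modelled upper bounds on the footprint for leakage rates below 1000 kg/d (from the text).
for label, area_m2 in [("0.01 pH change", 650.0), ("0.001 pH change", 12_000.0)]:
    print(f"{label}: {area_m2:.0f} m² -> radius {equivalent_radius_m(area_m2):.1f} m")

# Plume observed at 143 kg/d: about 3 m wide and 90 m long at 2 m altitude.
observed_m2 = 3.0 * 90.0
print(f"observed footprint: {observed_m2:.0f} m²")
```

Under this assumption a sensor would need to pass within roughly 14 m of the source to register a 0.01 pH change and within roughly 62 m for a 0.001 pH change. The observed footprint of about 270 m² lies below the modelled 650 m² bound.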
Dashed lines show the impact on monitoring when the anomaly criterion is degraded to Δ0.1 pH units. Blue lines indicate the relationship between area impacted and leakage rate. Red lines show the detection length scale, or the maximum distance a sensor can be from a source of a given leakage rate, assuming the leakage plume approximates to a circular shape. Green lines illustrate the distance a mobile platform-mounted sensor may have to travel over an assumed area of interest of 15 km x 15 km to have a high probability of detecting a leak with no a priori information on leak location.

Table 3 Chemical sensors tested during the STEMM-CCS release experiment (on fixed installations and mobile platforms) in alphabetical order. Note that not all specifications are available for all sensors.

Chemical sensors mounted on mobile platforms such as AUVs were efficient (Table 2) in the detection of anomalies of dissolved species (e.g. pH, DIC, pCO2), due to the large spatial coverage of the AUV. In an area such as the Goldeneye storage complex, the combination of low power (<6 W) and long deployment distance would give the ability to detect a leakage of 1500 kg d−1 if the mounted sensors can detect a pH change of 0.01 pH unit. If only a pH change of 0.1 pH unit can be detected by the sensors, then leakage rates of at least 7000 kg d−1 would be detected with certainty. However, the spatial coverage will also depend on the sampling frequency and response time of the sensor (Table 3). Future launches of long-range AUVs from shore may enable an entire site such as Goldeneye to be surveyed without a vessel and with a single mission. For example, assuming a travel speed of ~1.8 km h−1 and an AUV range of 5000 km (Roper et al., 2017), a survey of up to 240 km2 could be conducted over 3 months over a site 200 km from shore (with a grid spacing of <64 m and an altitude of 2 m above seabed). If no autonomous underwater vehicles are available for surveying, towed CTDs (i.e. continuously monitoring CTDs deployed to a certain water depth) with water sampling and geochemical sensors offer an [...]. Assuming the same conditions as for the AUV survey above, this would lead to the sensor staying in the plume for ~3 s, and would cover a 200 km2 complex in less than a month. However, surveying a large area by towing a CTD with online video streaming at 1.5 m distance from the seafloor (2 × 2 m2 footprint; Schmidt et al., 2015) is usually conducted with a maximum vessel speed of 1.8 km h−1, which would limit the vessel to a maximum survey track of 43 km per day. It would therefore take 3 months to survey an area as big as Goldeneye, and although the OPEX cost for performing the survey would be relatively low, it would likely only be viable if no other means of monitoring are available. However, the instrument could also be used to assess the chemical nature of a leakage detected through acoustic means.
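The survey figures quoted above can be roughly cross-checked with simple track-length arithmetic. The following Python sketch is illustrative only: it assumes an idealised parallel-line ("lawnmower") pattern in which covered area equals track length times line spacing, ignores turns and overlap, and assumes that the AUV spends 200 km of its range on each of the outbound and return transits. None of these assumptions is stated in the text, and the function names are hypothetical.

```python
def track_km(speed_km_h: float, hours: float) -> float:
    """Distance travelled at constant speed."""
    return speed_km_h * hours


def swept_area_km2(track_length_km: float, line_spacing_m: float) -> float:
    """Area covered by parallel survey lines (turns and overlap ignored)."""
    return track_length_km * line_spacing_m / 1000.0


def implied_spacing_m(area_km2: float, track_length_km: float) -> float:
    """Line spacing at which a given track length covers a given area."""
    return area_km2 / track_length_km * 1000.0


# Towed video-CTD: 1.8 km/h sustained for 24 h.
towed_daily_track = track_km(1.8, 24.0)  # about 43 km per day

# Long-range AUV: 5000 km range, site 200 km from shore.
# Assumption of this sketch: one outbound and one return transit.
on_site_track = 5000.0 - 2 * 200.0  # 4600 km left for surveying
area_at_64_m = swept_area_km2(on_site_track, 64.0)
spacing_for_240_km2 = implied_spacing_m(240.0, on_site_track)

print(f"towed track per day: {towed_daily_track:.1f} km")
print(f"AUV area at 64 m spacing: {area_at_64_m:.0f} km2")
print(f"spacing implied by 240 km2: {spacing_for_240_km2:.0f} m")
```

Under these simplifications, the daily towed track matches the 43 km quoted above, and covering 240 km2 with the remaining AUV range implies a line spacing somewhat below 64 m, which is consistent with the quoted grid spacing of <64 m.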

Stationary platforms
If mobile platforms are not available, models have shown that, relative to single measurements, the potential to detect leakage is increased by using long-duration, stationary installations (Cazenave et al., 2021;Hvidevold et al., 2016). Such stationary platforms, equipped with acoustic or chemical sensors, can either be deployed at regular intervals above the storage complex, close to possible leakage pathways (i.e. in areas of higher risk of leakage identified by methods such as 3-D seismics), at plugged and abandoned wells or pockmarks, or following detection of potential breaches of the seal. Sensor-equipped platforms situated on the seabed can typically take many measurements over a long period of time (i.e. covering whole tidal cycles and changes from neap to spring tides), meaning the sampling frequency of a sensor is less important than that on mobile platforms. Chemical sensors will only detect a signal if they are positioned downstream of the leakage, thus knowledge of the dominant current direction in the area is beneficial (Table 1). A further advantage of stationary platforms is that changes of current direction with tide can enable dynamic baseline measurements when the installations are upstream of the source and leakage detection downstream of the source if the installation is within the width of any resulting plume (Flohr et al., 2021a). If no leakage is discovered by the stationary platform, this is also valuable information as the area without leakage can be estimated, decreasing the area that might need to be surveyed by mobile platforms. In addition, after initial purchase, the OPEX of stationary platforms will be smaller than that of mobile platforms (Table 2). Knowledge about the ideal location for deployment, power consumption, long-term stability and sensitivity/detection limit are currently the biggest challenges to detecting leakage with fixed installations. 
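The idea of using tidal current reversals to obtain both a dynamic baseline and a leakage signal from a single fixed installation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the processing used in the experiment: current direction is taken as the direction the water flows toward, the angular tolerance `half_width_deg` is a crude, hypothetical stand-in for the plume width, and the pH readings are invented.

```python
from statistics import mean


def angular_difference_deg(a: float, b: float) -> float:
    """Smallest absolute difference between two bearings (0 to 180 degrees)."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def classify_sample(flow_toward_deg: float, source_to_sensor_deg: float,
                    half_width_deg: float = 30.0) -> str:
    """Label a sample by where the sensor sits relative to the suspected source."""
    diff = angular_difference_deg(flow_toward_deg, source_to_sensor_deg)
    if diff <= half_width_deg:
        return "downstream"   # water moves from the source toward the sensor
    if diff >= 180.0 - half_width_deg:
        return "upstream"     # water reaches the sensor before the source
    return "cross-stream"


def split_ph_by_tide(samples, source_to_sensor_deg: float,
                     half_width_deg: float = 30.0) -> dict:
    """Group (flow direction, pH) pairs into downstream/upstream/cross-stream."""
    groups = {"downstream": [], "upstream": [], "cross-stream": []}
    for flow_toward_deg, ph in samples:
        label = classify_sample(flow_toward_deg, source_to_sensor_deg, half_width_deg)
        groups[label].append(ph)
    return groups


# Invented readings: (direction the current flows toward in degrees, pH).
samples = [(90.0, 7.95), (100.0, 7.96), (270.0, 8.05), (280.0, 8.04), (0.0, 8.03)]
groups = split_ph_by_tide(samples, source_to_sensor_deg=90.0)

baseline_ph = mean(groups["upstream"])                 # dynamic baseline
ph_drop = baseline_ph - mean(groups["downstream"])     # candidate leakage signal
print(f"baseline pH {baseline_ph:.3f}, downstream drop {ph_drop:.3f}")
```

In practice the tolerance would have to reflect the local plume width and current variability, and any apparent drop would still need to be compared against the natural variability of the site.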
The main downsides to stationary platforms are the logistics of data transfer, the risk of consumable consumption and cleaning. For the former, data can be transferred from benthic landers by means of a moored surface expression, by regularly-released buoyant communication pods, by seafloor cables (for near-shore applications), or by underwater data transfer via acoustic modems to a nearby ship or an autonomous vehicle. For the latter, the system must be designed so that any consumed reagents and battery power can last long enough for a maintenance cycle satisfactory to the end-user. Biofouling can be a problem on some sensors, although deploying sensors below the photic zone can mitigate the problem of phytoplankton-based biofouling, as can employing techniques such as electrochemical (McQuillan et al., 2017) or acoustic (McQuillan et al., 2016) anti-biofouling.
The chemical sensors tested on stationary platforms during the STEMM-CCS release experiment included commercial and custom-developed pH sensors, Lab on Chip sensors for pH, total alkalinity and nutrients, and a pH and oxygen eddy covariance system. In benthic chambers, DIC was determined on retrieved samples, but the chambers were also equipped with temperature and oxygen sensors, so that oxygen/DIC ratios could be compared to differentiate natural sediment oxygen demand/CO2 production from additional, leaked CO2, and information about the tidal oscillation could be inferred from temperature variations (Table 3). The tested pH sensors were able to detect a variation of 0.005 to 0.1 pH units with a power consumption of 0.001 to 4 W, and have to be serviced approximately every six months to a year (Table 3). Based on an abandoned well scenario in the Central North Sea with a leakage rate on the order of 100 kg d−1 (Fig. 1), a fixed-position sensor of intermediate precision (e.g. 0.05 pH unit) would need to be on the order of 5 m from the source to detect the leakage.
When located in appropriate positions, methods such as benthic chambers, pH eddy covariance and Lab on Chip gradient measurements (Table 2, Supplementary Table 1) were able to detect leakage across the sediment-water interface. At the lowest injection rate, at a distance of 2.6 m from the centre of the bubble streams, the pH eddy covariance flux signal was 20 times the flux signal of natural CO2 production at the seafloor. Lab on Chip measurements on the same instrument frame showed decreases in pH 2-3 times larger than the natural background variation. A limitation of pH eddy covariance measurements is that the resulting fluxes do not directly quantify the total CO2 emission. Instead, fluxes are highly dependent on proximity to the CO2 source. To quantify release, we relied on Lab on Chip measurements of pH at two heights in the plume of the bubble streams.
Significant branching of leakage paths, lateral migration of the gas within the overburden and dissolution of gas in the sediments (Flohr et al., 2021b) can distribute bubble streams and locally elevate excess DIC concentrations in sediments, making it difficult to determine the magnitude of CO2 retention in surficial sediments. CO2-generated anomalies were detected within sediments by pH optodes, microsensors, temperature sensors and geochemical analyses of porewater, but only up to a metre from CO2 bubble release locations (deBeer et al., 2021;Lichtschlag et al., 2021). In sediment porewaters, the geochemical relationships between total alkalinity (TA) and sulphate (SO4) and element/chloride ratios are potentially useful diagnostic indicators for leakage. During the release experiment, levels of TA and dissolved calcium (Ca) in the porewater were up to 10 times higher than the background concentration. In the overburden sediments in the vicinity of the Goldeneye platform, increased concentrations of ammonium (NH4) and phosphate (PO4) were also detected close to the seafloor during the CO2 release experiment, potentially due to the displacement of porewaters from depth within the sediment column. As this displacement might precede CO2 seepage, the detection of anomalies in the sediment could provide an early warning system. Hence, monitoring at and in the seabed allows detection of precursor fluids, seeping formation water, natural gas compounds, dissolved CO2 and CO2 gas bubbles. However, most of these methods can cover only a very small spatial area, in our case the size of the central leakage area that was imaged in the seismic data, and thus leakage detection in this manner would usually not be considered effective or economical for industrial-scale monitoring.

Leakage detection summary
A range of techniques and technologies are currently available that are capable of detecting CO 2 leakage covering an area from mm to km scale, in the form of gas bubbles, dissolved CO 2 or pH, and can be deployed dynamically or on stationary platforms for up to 12 months (Tables 3 and 4 give a rough estimate based on authors' experience of factors such as sensor drift, biofouling and sensor specifications). Based on the approaches tested during the STEMM-CCS CO 2 release experiment, a ship or an AUV equipped with sonar and chemical sensors was found to be likely one of the most efficient ways to detect emissions as this can cover a large area (Fig. 2). Multiple leakage detection approaches (i.e. both acoustic and chemical sensing), could be used at the same time on the ship-based, AUV and ROV platforms, optimising the chances of detection. Similarly, if an AUV or ROV is already being used for other reasons, autonomous sensors can be integrated onto the vehicle quite easily and can monitor for a plume, or map the magnitude and extents of a plume, while undertaking other work. However, if pressure loss and geophysical monitoring indicates low level leakage that may not be detectable in the water column, there are more specialized techniques and technologies available that can detect smaller leakages closer to, or within, the seabed, though location of this leakage would be difficult. For more remote offshore monitoring, such as during the STEMM-CCS experiment, all of the tested techniques and technologies required a transport vehicle (i.e. a ship) from which they are deployed or lowered to the seafloor on a mooring, or deployment by ROV or AUV, which will remain one of the biggest cost factors. However, technology development in the future may lead to shore-based launches of very long range AUVs (Roper et al., 2017), which is expected to substantially reduce the cost of mobile surveys.

Leakage verification and attribution
In the marine environment, CO 2 is naturally present and generated by biological or geological processes in the subsurface, hydrosphere and biosphere, thus methods for fingerprinting CO 2 from CCS reservoirs are needed to evaluate whether any detected CO 2 is natural or results from leakage from a particular storage reservoir. Although leakage attribution is currently not included in any CCS guidelines, it has been critical for onshore CCS projects, such as the Weyburn project to disprove leakage claims, and there are suggestions to include it in the legal requirements (Dixon and Romanak, 2015). Two approaches were tested during the STEMM-CCS experiment: (i) process-based approaches for verifying CO 2 leakage; and (ii) tracer-based approaches for attributing the leakage to a specific reservoir (Table 2, Supplement Table 1, Fig. 2). The process-based approaches use known stoichiometric relationships in natural processes (e.g. biological respiration, CO 2 dissolution etc.) to assess whether a CO 2 anomaly is likely to be of non-natural origin. The tracer-based approaches are based on the use of non-toxic marker species that are naturally present in the reservoir injection zone, in the injected CO 2 (inherent), or can be added to the injected CO 2 .

Process-based verification
An understanding of natural variability above a storage complex can be used to differentiate an anomaly, which may be a leakage signal, from the natural background variability by using one or more continuously measured parameters (e.g. DIC, oxygen or nutrient data; Table 2, Supplementary Table 1). The goal of this analysis is to attribute measured DIC to either injected or natural CO2. During the STEMM-CCS experiment, the Cseep method (Table 2) reliably differentiated released CO2 from the natural variability with a detection limit of 17-19 µmol DIC kg−1, which in this case was defined as twice the accumulated errors in order to minimize false positives. In the Goldeneye area, natural DIC variations resulted from mixing of water masses, production/remineralisation of organic matter and/or calcium carbonate, and the oceanic uptake of atmospheric CO2 (Omar et al., 2021), hence knowledge about the water column processes was critical for applying this method (Table 1). During STEMM-CCS, Cseep was tested using carbonate system parameters in combination with nutrient data (Omar et al., 2021), but in principle a range of sensors and measuring techniques can be deployed on a range of platforms. For example, the benthic chamber, CTD, eddy covariance and sediment porewater geochemistry approaches all have the potential to produce data that can help to differentiate between biotically and abiotically produced CO2 (Table 2, Supplement Table 1). The advantage of using these approaches is that leakage verification would be a by-product of their measurements, as their main task would be leakage detection or quantification. However, care has to be taken that enough baseline data and background knowledge are available, and the approach is currently only applicable in fully oxic waters (e.g. Uchimoto et al., 2021). In addition, process-based results are not necessarily reservoir-specific.
For example, false-positive results, caused by naturally high water column pCO 2 , led to the halt of injection at the Tomakomai offshore CCS demonstration project in Hokkaido, Japan, showing the need for anomaly attribution (Romanak and Dixon, 2021).
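The thresholding logic described above, in which a DIC excess is only treated as a leakage signal when it exceeds twice the accumulated errors, can be illustrated with a short Python sketch. It is not the Cseep implementation of Omar et al. (2021): the way the errors are combined (in quadrature, assuming independent terms), the function names and all numerical values are assumptions made for illustration, and the predicted natural DIC would in practice come from a site-specific baseline model.

```python
import math


def accumulated_error(error_terms) -> float:
    """Combine independent error terms in quadrature (assumption of this sketch)."""
    return math.sqrt(sum(e * e for e in error_terms))


def detection_limit(error_terms, factor: float = 2.0) -> float:
    """Detection limit defined as a multiple (here twice) of the accumulated error."""
    return factor * accumulated_error(error_terms)


def excess_dic(measured: float, predicted_natural: float) -> float:
    """DIC not explained by the natural baseline (umol/kg)."""
    return measured - predicted_natural


def is_anomalous(measured: float, predicted_natural: float, error_terms) -> bool:
    """Flag a sample only if its excess DIC exceeds the detection limit."""
    return excess_dic(measured, predicted_natural) > detection_limit(error_terms)


# Illustrative error terms in umol/kg (invented, not taken from the study).
errors = (3.0, 4.0)
limit = detection_limit(errors)

print(f"detection limit: {limit:.1f} umol/kg")
print(is_anomalous(2125.0, 2110.0, errors))  # excess of 15 umol/kg
print(is_anomalous(2118.0, 2110.0, errors))  # excess of 8 umol/kg
```

Setting the threshold at twice the accumulated error trades some sensitivity for fewer false positives, which is the rationale given in the text.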

Tracer-based attribution
During the STEMM-CCS release experiment, injected CO 2 was labelled with a series of natural and artificial tracers designed to track the transport of CO 2 through the sediment overburden and into the water column and to quantify dissolution and flux rates (Flohr et al., 2021b). Artificial tracers included sulphur hexafluoride (SF 6 ), octafluoropropane (C 3 F 8 ) and krypton (Kr) and natural tracers included CH 4 , δ 13 C CO2 and δ 18 O CO2 . Tracers will often be easy to detect as they can be measured either in the leaking gas or in dissolved form, even in low concentrations, down to parts per billion depending on the tracers (Flohr et al., 2021b). As attribution with tracers will only be done when a leakage has been reported and the location is known, the spatial resolution is the same as the sampling method that is used, e.g., that of water column sampling with a CTD. While the cost of adding artificial tracers to a reservoir might be high (Roberts et al., 2017; not considered in our CAPEX), the cost of measuring the added and inherent tracers will be comparatively low and the analyses can be done quickly.
In addition, for each storage complex there might be specific, natural tracers available in the reservoir injection zone and the geochemical composition of the formation fluids is often reservoir-specific (e.g. in the Sleipner area Li and B have been identified as potential tracers for displaced formation fluids; Lichtschlag et al., 2018). Knowledge of the chemical composition of reservoir fluids might already be available from formation water analysis. Alternatively, information about reservoir fluid and porewater composition would need to be established for each reservoir prior to CO 2 injection (Table 1) and background analyses could include: (1) anions (e.g. Cl, SO 4 ); (2) cations (e.g. Na, Ca, Mg, K, Sr, Li, Ba, B); (3) trace metals (e.g. Fe, Mn, Co, Zn); (4) other parameters (e.g. Si, pH) and (5) radiogenic strontium isotopes ( 87 Sr/ 86 Sr) (James, 2012). These data can then be compared to sediment porewaters collected as close to the site of the anomaly as possible.

Leakage verification and attribution summary
If leakage detection monitoring identifies CO2 leakage from the reservoir, then a more focussed study can be done to verify and attribute the emission to a natural source, leakage from the CCS site or even to a specific reservoir (Fig. 2). The use of tracers had not been demonstrated in an offshore setting prior to the STEMM-CCS experiment. Injecting tracers into a reservoir might be costly and future applications might have to take into account additional challenges including legal issues around the use of tracers (Roberts et al., 2017). However, tracers will likely give the best results as the attribution can be done for a specific reservoir, and the baseline concentrations are not subject to the natural variability that is typical for CO2 concentration in most ecosystems, limiting the problem of false positives. Process-based computing approaches might also be good indicators of the leakage source, and are often based on the approaches that are already being used to detect or quantify CO2 leakage.

Table 4 Summary of the capabilities of the different techniques and technologies from the STEMM-CCS field experiment.

Horizontal spatial extent (coverage)
Instruments: sediment microprofiler < pH optodes for sediments < benthic chamber < Lab on Chip gradients and pH eddy covariance < ship-based approaches (CTD, acoustics) < AUV-based approaches (acoustics, sensors) < simulations
Min: < m2
Max: >~25 km2 per day

Vertical spatial extent in and above seafloor
Instruments: sediment microprofiler < benthic chambers < pH eddy covariance < Lab on Chip gradients < ship-based approaches (CTD, acoustics) < AUV-based approaches (water column and sub-bottom acoustics, sensors) < simulations
Min: sub-mm
Max: entire water column

Quantifying a leakage
The main reasons for quantifying the extent of a leakage are legal requirements, economic reasons, verification of the climate mitigation measures, and environmental impact assessments. Although there is currently no legal threshold value for acceptable leakage from storage reservoirs, for some regulations it still is a legal requirement that, in case of a leak, the amount of CO2 leaving the seafloor needs to be quantified. For the environmental impact assessment, it is known that smaller leaks <150 kg d−1 would likely have only limited impact (Vielstädte et al., 2019), whereas bigger leaks would affect larger areas (e.g. 1000 t d−1 would subject a seafloor area of 50 km2 to a pH drop of 0.1 units; Blackford et al., 2020). From a global climate point of view, a CCS leakage rate of <0.01% per year would ensure that 90% of the injected CO2 would be efficiently sequestered after 1000 years (Hepple and Benson, 2005). However, other models suggest that CCS will only be an efficient climate mitigation strategy if the loss of CO2 from reservoirs is less than 0.001% per year (Haugan and Joos, 2004). For a 10 Mt reservoir, such as Goldeneye, this latter limit equates to a loss of 300 kg of CO2 per day. However, until recently there was a lack of reliable methods for CO2 leakage quantification. During the STEMM-CCS release experiment, new methods for enabling leakage quantification were successfully deployed that can detect much lower leakage rates, in the ranges that could be expected if leakage occurred from abandoned wells, which are estimated to range between less than 0.03 kg d−1 (Tao and Bryant, 2014) and up to 143 kg d−1 (Vielstädte et al., 2015). Estimated leakage flow rates into the water column from all tested methods agreed within a narrow range (i.e. 22-73% of the injected CO2 leaked across the seafloor; Li et al., 2020;Flohr et al., 2021b;Koopmans et al., 2021;Schaap et al., 2021;Gros et al., 2021); however, there are differences in costs and in the leakage anomalies that can be detected (gaseous or dissolved CO2), and not all methods are currently commercially available (Table 2). Water column acoustic techniques have been very effective at detecting and quantifying gas ebullition from the seabed at rates of 30 to 550 kg d−1 (e.g. Blackford et al., 2014;Li et al., 2020).
However, bubble-based methods will likely always underestimate flux rates, as they cannot quantify the dissolved CO2 component (Blackford et al., 2014;Li et al., 2021). An alternative approach is to quantify the CO2 emission geochemically, through its effect on pH. If there are only a few bubbles or the leakage is in dissolved form, a sensor-enabled ROV or AUV can perform a more focused survey pattern over a suspected leak and deploy additional sensors (e.g. nutrient and additional carbonate system sensors), or map the plume/gradients to enable quantification (Table 2). In the future, AUVs may be equipped to take samples for confirmatory analysis in the laboratory post mission. pH and/or pCO2 sensors mounted on towed platforms such as a video-CTD, or potentially on AUVs, can be used to estimate leak flow rate when combined with simulations driven by recorded water currents and observed seafloor leak location(s). In addition to mobile platforms, CO2 can also be quantified from fixed platforms that can be located at high-risk locations.

Fig. 2. Flow chart showing a potential monitoring strategy for CCS storage complexes based on the experience acquired during the STEMM-CCS release experiment and including detection, attribution and quantification. Grey boxes = monitoring approaches; blue boxes = capacity; Y = Yes; N = not possible/not available; CO2(g) = gaseous CO2 and gas bubbles; CO2(d) = dissolved CO2.
During the STEMM-CCS experiment, methods such as pH eddy covariance, benthic chambers and autonomous Lab on Chip sensors measuring chemical gradients gave accurate results for injection rates as low as 6 kg d − 1 (eddy covariance; Koopmans et al., 2021) or 14 kg d − 1 (Lab on Chip gradients; Schaap et al., 2021). The Lab on Chip approach benefits from highly stable and accurate sensors with a long duration in situ (months to a year). The ability to measure at two vertical positions near the seabed with a single instrument provided valuable spatial information about the plume without added technical complexity. The eddy covariance method is an incredibly sensitive technique which readily differentiated even the lowest release rate of CO 2 from the natural background benthic fluxes; however, its deployment duration is more limited at the moment. Although some of these methods have been used to quantify fluxes from sediments previously, their application to CCS is novel.
In summary, methods are now available that can reliably quantify leakage rates in the form of bubbles, and high and low dissolved CO 2 quantities and can confidently measure CO 2 release rates as low as 6 kg d − 1 (Fig. 2). However, with the exception of gas bubble detection on mobile platforms, the equipment for quantification will need precise placement (for example within 10 m of the leakage to detect the injection rate of 6 kg d − 1 in this study; Koopmans et al., 2021), although in this case the leakage location will be known from the detection of the leakage. According to many monitoring strategies (e.g. Blackford et al., 2015;Shitashima et al., 2013), quantification should be performed once a suspected emission has been detected and attributed. In reality many of the methods that can detect leakage can also quantify it (Table 2,  Supplement Table 1), and leakage detection, quantification and attribution could be achieved simultaneously.
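The climate-related leakage thresholds quoted earlier in this section can be checked with a short calculation. The Python sketch below assumes a constant fractional loss each year relative to the remaining inventory and a 365-day year; the cited models may use different formulations, and the function names are hypothetical.

```python
def retained_fraction(annual_loss_percent: float, years: int) -> float:
    """Fraction of stored CO2 remaining after compounding a constant annual loss."""
    return (1.0 - annual_loss_percent / 100.0) ** years


def loss_rate_kg_per_day(stored_mt: float, annual_loss_percent: float,
                         days_per_year: float = 365.0) -> float:
    """Average daily loss (kg) for a store of the given size in megatonnes."""
    stored_kg = stored_mt * 1e9
    return stored_kg * annual_loss_percent / 100.0 / days_per_year


# 0.01% per year sustained over 1000 years.
retained = retained_fraction(0.01, 1000)

# 0.001% per year applied to a 10 Mt store.
daily_loss = loss_rate_kg_per_day(10.0, 0.001)

print(f"retained after 1000 years: {retained:.1%}")
print(f"daily loss at 0.001% per year: {daily_loss:.0f} kg")
```

Under these assumptions, a loss of 0.01% per year leaves roughly 90% of the CO2 in place after 1000 years, and 0.001% per year of a 10 Mt store corresponds to a little under 300 kg per day, in line with the rounded values given in the text.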

Conclusions
Until recently, only a few methods were available for quantification of leakage from an offshore CCS reservoir, and methods for attribution had not been tested for the marine environment at larger water depths. The STEMM-CCS artificial CO2 release experiment demonstrated that the approaches described here have the potential to detect, attribute and/or quantify even small rates of CO2 leakage in offshore environments. Whilst each approach has its own specific capabilities and associated benefits/disadvantages, together they offer the potential to monitor identified areas at a higher risk of leakage, ranging from sub-mm to km in scale. For industry-scale operations, the probability of finding a leakage (which depends on, e.g., the survey footprint, sensitivity and natural variability) and the costs associated with reliably surveying an area are currently still among the biggest challenges due to the need for ships and remotely operated and autonomous vehicles. This might be overcome in the future by shore-based deployments. Many methods need expertise in either deployment or data interpretation and some approaches are not yet commercially available. Our analysis has shown the level of specific baseline knowledge that should be collected for optimal interpretation of the monitoring results and that this knowledge will be site-specific. For offshore monitoring strategies, the natural sequence of monitoring events should be detection, attribution and quantification; during the STEMM-CCS release experiment we have shown that many tested approaches were able to detect and quantify leakage at the same time and that the results can be used to verify and, in some cases, also attribute a leakage to a certain reservoir.

Funding
The STEMM-CCS project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 654462. Other work which contributed to this experiment has been funded by the ACTOM Act on Offshore Monitoring project (Accelerating CCS Technologies, Horizon2020 Project No. 294766) with financial contributions made from the UK Department for Business, Energy & Industrial Strategy (BEIS). Other work which contributed to this experiment has been funded by the UK's Natural Environment Research Council: the SPITFIRE project, grant number NE/L002531/1; the Climate Linked Atlantic Sector Science (CLASS) project, funded through the single centre national capability programme grant number NE/R015953/1; the Carbonate Chemistry Autonomous Sensor System (CarCASS) project, grant number NE/P02081X/1. Further funding was received from Bayesian Monitoring Design (Bay-MoDe), funded by the Research Council of Norway through the CLIMIT programme, project 254711; and the Max Planck Society, Germany.

Declaration of Competing Interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests

Acknowledgments
We would like to acknowledge the hard work, enthusiasm and professionalism of the captains, crews and operators of the RRS James Cook, the RV Poseidon, the ROV Isis and the Gavia AUV, who made the testing of the different monitoring approaches possible. In addition, the authors would like to thank the following colleagues who enabled the project and experiment to take place: Carla Sands, Rudolf Hanz, Samuel Monk