Detection and quantification of CO 2 seepage in seawater using the stoichiometric C seep method: Results from a recent subsea CO 2 release experiment in the North Sea

Carbon Capture and Storage (CCS) is a potential significant mitigation strategy to combat climate change and ocean acidification. The technology is well understood but its current implementation must be scaled up nearly by a hundredfold to become an effective tool that helps meet mitigation targets. Regulations require monitoring and verification at storage sites, and reliable monitoring strategies for detection and quanti fi cation of seepage of the stored carbon need to be developed. The C seep method was developed for reliable determination of CO 2 seepage signal in seawater by estimating and filtering out natural variations in dissolved inorganic carbon (C). In this work, we analysed data from the first-ever subsea CO 2 release experiment performed in the north-western North Sea by the EU STEMM (cid:0) CCS project. We successfully demonstrated the ability of the C seep method to (i) predict natural C variations around the Goldeneye site over seasonal to interannual time scales; (ii) establish a process-based baseline C concentration with minimal variability; (iii) determine CO 2 seepage detection threshold (DT) to reliably differentiate released (cid:0) CO 2 signal from natural variability and quantify released (cid:0) CO 2 dissolved in the sampled seawater. DT values were around 20 % of the natural C variations indicating high sensitivity of the method. Moreover, with the availability of DT value, the identification of released (cid:0) CO 2 required no pre- knowledge of seepage occurrence, but we used additional available information to assess the confidence of the results. Overall, the C seep method features high sensitivity, automation suitability, and represents a powerful future monitoring tool both for large and confined marine areas.


Introduction
Carbon Capture and Storage (CCS) is a potential significant mitigation strategy to combat climate change and ocean acidification in climate change scenarios (e.g., van Vuuren et al., 2011;IPCC, 2013;Fuss et al., 2014). Three out of four 1.5 • C-consistent emission pathways published by the IPCC include CCS as well as bioenergy CCS (BECCS) as mitigation options (IPCC, 2018). However, the scale of current CCS implementation is insufficient to impcat the global climate, and thousands of large-scale CCS facilities need to be deployed by 2040 to meet mitigation targets (Global CCS institute, 2019). Moreover, the technology is in early stage of research, development, and demonstration (e.g., Hammond, 2018) and involves uncertainties and risks (e.g., Damen et al., 2006) that might raise ethical and governance issues, as is the case for other so-called climate geoengineering techniques (Lawrence et al., 2018).
Many potential CCS storage reservoirs are located below the seafloor (Blackford et al., 2017), and a number of subsea storage demonstration projects are in operation (IEAGHG, 2008;Jenkins, 2020). However, before subsea CO 2 storage can be carried out at industrial scales, potential ecological consequences and adverse environmental impacts of any unintended CO 2 seepages need to be identified. Therefore, monitoring and verification is required at storage sites by national and international regulations (e.g., U.S. EPA, 2010;European Commission, 2011;Dixon and Romanak, 2015), and reliable monitoring strategies for detection and quantification of seepage need to be developed.
Primary monitoring of storage reservoirs is based on seismic techniques imaging CO 2 through the overburden (Jenkins, 2020). However, the detection threshold of such techniques may be of the order of 10 3 t CO 2 (Jenkins, 2020), and they are prone to interpretation uncertainty (Schaaf and Bond, 2019). Therefore, possible seepage at low levels may not be detected by seismic techniques, and monitoring for seepage at the seafloor provides an important secondary strategy. The modelling work of Blackford et al. (2020) reported that release events can be detected at the seafloor at thresholds well below levels that would compromise storage performance or significantly damage the environment.
When CO 2 seeping through the seafloor dissolves into seawater, it goes through a series of reactions that ultimately increase the concentration of dissolved inorganic carbon (DIC; hereafter denoted as C for simplicity). Thus, the excess C dissolved in the seawater (C seep ) should, in principle, be readily quantified from the resulting change in C (ΔC). In practice, however, this is complicated by the fact that the dissolution of seepage CO 2 may be occurring on top of simultaneous C changes that result from natural processes, such as the formation/remineralization of organic matter and/or calcium carbonate (CaCO 3 ), mixing between water masses, and uptake of atmospheric CO 2 . Thus, any natural parts of ΔC must be accounted for before the CO 2 seepage signal can be identified and C seep accurately determined.
The challenge of distinguishing subsea CO 2 seepage signal from natural variability can be overcome using different techniques. Uchimoto et al. (2017; proposed a seepage CO 2 detection method based on site-specific covariance between partial pressure of CO 2 in seawater (pCO 2 ) and dissolved oxygen (DO). Blackford et al. (2017) proposed site-specific anomaly detection criteria that recognises unnatural rates of change in seawater CO 2 concentrations. Gundersen et al. (2020) demonstrated the use of machine learning techniques for detecting signals from a seep using data from seep simulations. The above techniques are concentration-based, meaning that they rely on measured concentrations to identify seepage signals without analysing the processes that govern the natural variations. Romanak et al. (2012) discussed the limitations of concentration-based monitoring and recommended process-based techniques, which identify drivers of the natural variability and estimate their magnitude. Specifically, process-based methods are able to address the sensitivity of seepage signal detection criteria to changes in the natural processes due to, for instance, climatic and/or ecosystem variations.
The EU project STEMM− CCS (https://www.stemm-ccs.eu) carried out a first of its kind CO 2 release experiment designed to imitate CO 2 seepage from an offshore storage site (Fig. 1). The details of different aspects of this unique experiment have been described elsewhere (e.g., Connelly et al., 2019;Dean et al., 2020;Esposito et al., 2021;Flohr et al., 2021). The experiment was carried out at the Goldeneye site (58 • N, 0.4 • W) in the North Sea where a controlled mixture of CO 2 and tracer gases was released into the sediments at 3 m below the seafloor. The suitability of existing and new methods for the monitoring, detection and assessment of potential environmental impacts of the CO 2 release were demonstrate/evaluated. Before the release experiment, the STEMM− CCS project conducted a number of cruises around the Goldeneye site in order to establish environmental and ecological baselines. The project also developed/further-developed novel chemical sensors which have been utilized for data acquisition during both the baseline gathering and during the release experiment.
In this study, we use data from the STEMM− CCS release experiment to demonstrate the ability of the recently developed, process-based, C seep method (Botnen et al., 2015;Omar et al., 2018) to (i) distinguish CO 2 seepage signal from the natural variability, and (ii) quantify the excess seepage CO 2 dissolved in the seawater, i.e., C seep . The C seep method (section 2) estimates the magnitude of the natural variations (ΔC nat ) and filters them out to facilitate the detection of CO 2 seepage signal. We demonstrate the aptness of the method to establish a baseline C concentration (C b ) and a detection threshold (DT) (section 3). We discuss the benefits of detecting seepage signals through the C seep method (section 4), which is a promising marine monitoring tool: its implementation is flexible, it is suitable for automation, it can provide large area coverage, and it requires no pre-knowledge of CO 2 seepage occurrence (sections 5 and 6).

Variables of the seawater CO 2 system
The dissolution of CO 2 in seawater and subsequent equilibrium reactions form a complex chemical system (section 2), often referred to as the seawater CO 2 system, equilibrium properties of which are well characterised (e.g., Zeebe and Wolf-Gladrow, 2001). The seawater CO 2 system can be characterised by the following four measurable master variables (Dickson, 2010): Total alkalinity (TA), which is related to the charge balance in seawater: where brackets denote concentrations. The bicarbonate and carbonate ions ([HCO − 3 ] and [CO 2− 3 ]) contribute about 96 % of the seawater TA. Dissolved inorganic carbon (C), which is the sum of the concentrations of all forms of inorganic carbon dissolved in seawater: where [CO 2 ] represent the sum of the concentrations of CO 2 in aqueous solution. Partial pressure of CO 2 (pCO 2 ) in an air sample in equilibrium with the seawater sample: where xCO 2 is the mole fraction of CO 2 in the gas phase and p is the total pressure. pH T (henceforth pH for simplicity) which represents the total hydrogen ion concentration in seawater: The above master variables can be measured using a wide variety of techniques (Dickson, 2010) following standard operating procedures . TA and C are traditionally determined by acidimetric titration and, currently, discrete sampling and benchtop instrumentation give the highest accuracy.
By measuring any two of the above-mentioned four master variables (along with temperature, salinity, pressure and the knowledge of other non− CO 2 acid-base systems in seawater), it is possible to calculate the remaining variables e.g., by using the CO2SYS software (van Heuven et al., 2011). Additionally, by selecting the appropriate two parameters for measurements, the uncertainties in the computed parameters can be held at the same order of magnitude as their experimental errors (Millero, 2007). Sensors that can autonomously measure pCO 2 and pH in situ at high frequency are commercially available and show satisfactory precision and accuracy (e.g., Bushinsky et al., 2019). Paired measurements of pH and pCO 2 , however, are the least favoured since the computed TA and C values carry the highest uncertainty. Therefore, a major focus of the STEMM− CCS project was to develop new Lab-on-chip (LOC)-based sensor technologies to enable direct high-resolution in situ measurements of pH, TA and C. LOC-based sensors enable the automation of high performance wet-chemical analytical techniques offering high quality measurements usually only possible using benchtop laboratory instrumentation (Beaton et al., 2012;Clinton-Bailey et al., 2017;Rérolle et al., 2013).
Although measuring two of the four master variables gives the basis for a complete determination of the seawater CO 2 system, a processbased interpretation of the variability included in the data requires ancillary variables that are indicators for the governing processes. Examples of such variables include nutrients as indicators of biological activity, and salinity as an indicator of mixing between different water masses. Autonomous LOC sensors for the determination of nutrients (Beaton et al., 2012;Clinton-Bailey et al., 2017), and commercial off-the-shelf sensors (Sea-Bird Scientific) for temperature and salinity measurements were deployed on a number of platforms during the CO 2 release experiment to collect high-resolution in situ observations. Thus, the STEMM− CCS project demonstrated that high resolution data needed for the application of the C seep methodology can be acquired by using cost-effective in situ sensors (section 4.1, Table 3).

Conceptual description of the C seep method
When seepage CO 2 dissolves in seawater, it reacts with the water forming carbonic acid (H 2 CO 3 ), which rapidly dissociates into bicarbonate ions (Eq. 5), which in turn can also dissociate into carbonate ions (Eq. 6) according to the following equilibrium reactions: In terms of measurable variables, the net result of the above reactions is an increase in C and a decrease in pH. Thus, any seepage CO 2 dissolved in seawater should, in principle, be readily quantified from the resulting concentration changes. However, C in seawater is also influenced by natural processes, such as photosynthesis/respiration, biosynthesis/ dissolution of CaCO 3 and changes in temperature and salinity. Therefore, the approach of the C seep method is to quantify the natural variability and filter it out to facilitate the detection of CO 2 seepage dissolved in seawater. To achieve this, the method conveniently focuses on changes in C because this variable tracks concentration changes in all of the inorganic carbon species dissolved in seawater (Eq. 2) regardless of whether the changes are due to natural processes or to CO 2 seepage. Additionally, C mixes conservatively in the absence of biological activity.
The amount of C in seawater is set largely set by equilibrium reactions whilst natural processes produce deviations through various sinks and sources of seawater constituents. Therefore, the C seep method assumes that there is an essentially invariant theoretical baseline seawater C (C b ), which is dictated by the equilibrium history and physical properties of the water parcel, and over which fluctuations created by natural processes (ΔC nat ) and/or CO 2 seepage (C seep ) are superimposed so that: where C is the measured background concentration and C b is assumed to be invariant. The terms on the right hand-side of Eq. 7 are determined in two steps (see also Fig. 4): i.e., C seep = 0. Based on these data, set up a site-specific model to estimate ΔC nat and determine C b according to: where the superscript 'B' stands for background whereas the subscript 'b' stands for baseline.
2 Application of the model of step (1) to C measurements from the sitemonitoring to determine the sum of C b and C seep (if any seepage occurred) according to: where the superscript 'M' stands for site− monitoring. Note the important distinction between C B and C b . The former is fluctuating due changes in ΔC B nat and, thus, depends on time and location whereas the latter is theoretically invariant. In practice, however, the errors in calculating ΔC B nat affect the values of C b values so that they include some variability. This will be referred to as the "remnant variability". Similarly, errors in calculating ΔC M nat and C b affect C seep . In the following section we detail the determination of C b , and analyse its remnant variability and its effects on C seep determination.

Establishing C b for the Goldeneye site
As mentioned in the previous section, the main purpose of establishing C b is to facilitate reliable detection of seepage CO 2 signal and quantification of C seep by comparing baseline and monitoring data. However, baseline determination has its own value. For instance, by analysing C b values obtained from Eq. 8 we can check our understanding of the natural variability of C in the study area. A good understanding of the natural variability of C results in a successful characterization of ΔC B nat , which, in turn, results in confident C b values. Moreover, by determining C b over large spatial and temporal scales we can analyse the variability in C b in relation to sampling location/time. This, in turn, allows us to examine over what spatial and temporal scales we can expect an invariant C b . Furthermore, by computing the difference between C b values obtained for different stations/times we yield an estimate on the uncertainty in C seep , which is a measure of the detection threshold for seepage CO 2 signal. All these aspects are detailed in following subsections.

Baseline data
For the determination of C b (Eq. 8) we used water column measurements from the area at and around the Goldeneye site (Fig. 3). An overview of the datasets is shown in Table 1, and it comprises publicly available historical data as well as new data gathered within the STEMM− CCS project. The historical dataset was obtained during multiyear (2001-2002, 2005, 2008, and 2011) basin-wide cruises using discrete sampling and shipboard analyses (Clargo et al., 2015;Thomas, 2002). We used a 2-by-2.5-degree subset of this historical cruise dataset to characterize the large-scale spatiotemporal variability around the Goldeneye site. The new dataset (2017-2019) comprises both cruise data and high-frequency in situ sensors data. The cruise data were gathered onboard RV Poseidon using samples drawn from Niskin bottles mounted on a CDT-rosette ( Fig. 2A). These water samples were analysed onboard using benchtop instrumentations ( Fig. 2B and D) as described in Esposito et al. (2021). We used the data acquired within the geographical area of 0.35 • Lon x 0.1 • Lat (Fig. 3) in October 2017 (POS518) and August 2018 (POS527).
The high resolution data were obtained from sensors attached to seabed baseline lander (Fig. 2C) deployed by RV James Cook at ~58 • N, 0.3 • W, between April and May 2019, measuring at a two-hourly sampling frequency. These data will be henceforth referred to as the JC180BL dataset.
Our analyses focus on the deep samples (up to 10− 20 m above the seafloor) where the CO 2 seepage signals are expected to appear first.

Determination of C b through modelling the natural variability
The processes assumed to govern the natural variability of the seawater CO 2 system around the Goldeneye site are summarized in Fig. 4. They include (i) biological processes (ΔC bio ) that comprise photosynthesis/respiration (organic matter cycling; ΔC omc ) and formation/dissolution of CaCO 3 (CaCO 3 cycling; ΔC CaCO3 ), (ii) mixing of water masses (ΔC mix ), and (iii) air-sea gas exchange (ΔC ase ) including oceanic Table 1 Overview of the data used to establish the baseline C (C b ) around the Goldeneye site (Fig. 3 3 %
uptake of anthropogenic CO 2 . In the context of this study, the oceanic uptake of anthropogenic CO 2 from the atmosphere is considered a natural process, i.e., a process that is not influenced by the subsea seepage. Therefore, the ΔC nat term in Eq. 8 consists of contributions driven by biology (ΔC bio ), air-sea gas exchange (ΔC ase ), and mixing (ΔC mix ). Furthermore, as indicated in Fig. 4, the contributions of ΔC nat are parameterized through a combination of changes in measured variables relative to some arbitrary reference values (Table 2), stoichiometric ratios, and empirical constants. The details of how the contributions ΔC omc , ΔC CaCO3 , ΔC mix , and ΔC ase are computed from the model parameters are given in Appendix A. Once the above contributions are determined, we obtain an estimate of ΔC nat ( = ΔC omc + ΔC CaCO3 + ΔC mix + ΔC ase ), and then evaluate Eq. 8 to estimate C b .
The mean values and standard deviations for ΔC omc , ΔC CaCO3 , ΔC mix , and ΔC ase are shown in Table 2. In terms of mean values, the most influential contributions are, in decreasing order, ΔC omc , ΔC mix , ΔC ase , and ΔC CaCO3 . In terms of contributions to ΔC nat variability, the contribution order changes to ΔC omc , ΔC mix , ΔC CaCO3 , and ΔC ase in decreasing order. For details of the above contributions see Fig. A1 in Appendix A.
The variability in C B , ΔC B nat , and C b is compared in Fig. 5. C B varies on average between 2090 and 2180 μmol kg − 1 , with mean and standard deviation values of 2160 ± 17 μmol kg − 1 (Fig. 5A), confirming the highly dynamic CO 2 system parameters known for shelf regions (e.g., DeGrandpre et al., 1997). Changes observed at any given station arise from temporal changes occurring on the sub-seasonal to interannual time scales. The historical data captures the seasonal to interannual variability, whereas the sub-seasonal changes are captured by the JC180BL data, which were acquired over 22 days (29 Apr-21 May). Spatial changes, on the other hand, are observed between different stations and their magnitude is comparable to the temporal changes. Another noticeable feature is the lower C B values observed in the coastal stations ( Fig Fig. 5A). This is because the variations in individual boxes of C B and ΔC B nat partially cancel out. In summary, Fig. 5 demonstrates that the application of the C seep method substantially reduces the dependency of C b on time and location, although it does not completely eliminate the spatiotemporal variability. Furthermore, the spatial extent of the study area does not seem to explain the remnant variability in C b . This can be seen in Fig. 5C, where the range in C b values computed for station 56, JC180BL, POS518, and POS527 brackets the entire remnant variability despite these data being acquired in quite close locations (Fig. 3). Possible  (Table 1). (C) The seabed Baseline Lander with integrated biogeochemical LOC sensors attached (white arrows) being deployed from RV James Cook in May 2019. , which is about 2 km east of station 56 of the historical cruises (asterisk), and sampling locations of the STEMM− CCS baseline cruises (POS518, triangles; POS527, circles). Bottom inset: detailed map of the area around the CO 2 release experiment site and the sampling locations during monitoring the released− CO 2 by the POS534 cruise (dots = sampling through pumping; circles = sampling from Niskin bottles) and by the JC180BBL lander (blue star). The latter was deployed ~2.6 m south of the centre of the CO 2 bubble streams. Also shown is the position of the JC180BL baseline lander (green star), which was deployed 375 m southeast of the CO 2 release point.

Fig. 4.
Box diagram showing the required data and parameters for modelling the contributions of natural C variability (ΔC nat ). Also indicated are the outputs of the mathematical model for baseline and monitoring data, and that C seep is determined as the difference of these outputs. In the model parameters, subscripts r denotes reference values, and Δ-values are computed as the difference: measured minus reference. PA: is potential alkalinity i.e. TA from which the influence of organic matter is eliminated (Eq. A3, Appendix) dPA/dS: is change in PA resulting from salinity changes; PA 0 : is the PA concentration at S = 0, i.e., in freshwater; dC/dt: is the annual C increase due to oceanic uptake of CO 2 from the atmosphere. causes of remnant variability in C b are explored in the next subsection.

Evaluation of the uncertainty in C seep and detection threshold
The remnant variation in C b (Fig. 3C) is due to imperfection of the model for ΔC B nat (Eqs. A1-A8) and errors in the model input, i.e., errors in the measured variables (Table 1) and in the estimates of dC/dt (1.2 ± 0.3 μmol kg − 1 yr − 1 ), R CP (117 ± 16), dPA/dS (67.5 ± 7.4 μmol kg − 1 ), and PA 0 (-46 ± 262 μmol kg − 1 ). We used the historical data to estimate the total error introduced into C seep (E Cseep ) by the above uncertainties and due to the observed station-to-station differences in C b (Fig. 5C). To achieve this, we first arbitrarily considered station 57 as a baseline station and the remaining stations as monitoring stations. Since there was no seepage in the area when the historical data were acquired (2001-2011) every station should, in principle, have the same C b as station 57 so that C seep = 0. However, the aforementioned uncertainties and remnant variability in C b will introduce an error such that: where x is any of the station numbers shown in Fig. 3   For each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points not considered outliers, and outliers are individual crosses. Note that data from POS518, POS527, and JC180BL are shown next to station 56 because all these data were acquired from very close locations (Fig. 3). A comparison between panels (A) and (C) reveals how the implementation of the C seep methodology minimises natural variations in C b temporally (height of individual boxes) and spatially (difference between individual boxes).

Table 3
Overview of the released− CO 2 monitoring data used to compute C seep values at the Goldeneye site (Fig. 3  10 % pH 0.01 C † 0.4 %** Current (direction and speed) C = dissolved inorganic carbon; TA = total alkalinity; S = salinity; T = temperature; PO 4 =phosphate; and NO 3 =nitrate. † C computed from pH and mean sensor TA using the CO2SYS program (van Heuven et al., 2011) and dissociation constants of Millero et al (2006). ** Estimated according to Dickson and Rilley (1978).
A.M. Omar et al. evaluating the underlying model (Eqs. A1-A8; Fig. 4) 50,000 times, i.e., 1000 simulations for each of 50 unique model inputs. We chose MC simulations as an easier alternative for the traditional error propagation by differentiation and summations (Anderson, 1976) which would be rather tedious to program. During the simulations individual uncertainties (with magnitudes as given above and in Table 1) were randomly sampled from normal distributions and added to model inputs which were at their mean values. The distribution of E Cseep values driven by the individual uncertainties is well approximated by a normal distribution with a mean and standard deviation of 0.7 ± 7.6 μmol kg − 1 (Fig. 6; red curve and magenta bars). For comparison, we estimated the E Cseep values driven by the remnant variability in the historical C b values (Fig. 5C, for station numbers 48-50, 55-57, and 72-75). To achieve this, we evaluated the underlying model by randomly sampling model inputs from normal distributions with means equal to the observed data and uncertainty set to zero. The resulting E Cseep distribution is well approximated by a normal distribution with a mean and standard deviation of -1.0 ± 11.4 μmol kg -1 (Fig. 6; black curve and cyan bars).
From the above comparison we deduce that remnant variability in the historical C b values (Fig. 5C), which results from all kinds of model imperfections, produces E Cseep values with 1-STD of ±11.4 μmol kg − 1 of which 67 % (7.6/11.4) are due to individual uncertainties in the model inputs.
We also used the Monte Carlo simulation described above to assess the influence of the individual uncertainties in the distribution of C b . This time we simulated C b instead of E Cseep , by holding all model inputs at their mean values and adding to them randomly sampled individual uncertainties. The resulting C b values were well approximated by a normal distribution with a mean and standard deviation of 2142 ± 4.7 μmol kg − 1 . The remnant variability observed in the historical C b values ( Fig. 5C; for station numbers 48-50, 55-57, and 72-75) could be approximated by a normal distribution with mean and standard deviation of 2140 ± 10.7 μmol kg − 1 . The individual uncertainties in the model inputs, therefore, account for 44 % (4.7/10.7) of the remnant variability in historical C b value and the rest must be ascribed to processes/factors not resolved by our model for ΔC nat . Two possible candidates of such factors include inter-laboratory differences in C measurements, which can be up to 10 μmol kg − 1 (Bockmon and Dickson, 2015), and differences in sampling procedures. This may introduce systematic errors so that the uncertainty in C increases from 1 % ( Table 2) to maximum 5 % i. e. ~ 10 μmol kg − 1 . In fact, we were able to reproduce the whole remnant variability observed in C b (1-STD = ±11 μmol kg − 1 ; Fig. 5C) by repeating the Monte Carlo simulations with a C B uncertainty of 5 %. From this analysis, we conclude that the processes included in our underlying model (Fig. 4) are adequate to explain the natural C fluctuations in the study area, and the observed remnant variability in C b arises Fig. 7. (A) Benthic lander with the LOC sensors (white arrows) attached with inlets (red arrows) at 87 and 17 cm above the seafloor to monitor CO 2 plume in the benthic boundary layer. (B) A towed Video-CTD water sampler rosette being deployed for continuous water sampling using an attached underwater pump and 1-inch tubing/hose. A three-way tube connector was installed at the onboard part of the tubing in order to split the flow and allow for discrete seawater sample collection. mainly from uncertainties in model inputs.
Based on the above results, we define the detection threshold (DT), over which C seep values will be interpreted as CO 2 seepage signal, for two situations. If E Cseep is affected by only individual uncertainties in model inputs then a DT = 15.2 μmol kg − 1 (mean E Cseep plus two standard deviations from the red curve in Fig. 6) will be used. On the other hand, when E Cseep is expected to be influenced by both systematic errors and individual uncertainties in model inputs then DT = 24.8 μmol kg − 1 (mean E Cseep plus two standard deviations from the black curve in Fig. 6) will be used. The effect of changing DT to 1-STD or 3-STD is discussed in section 5.1.

Monitoring data of the CO 2 release
For the computation of C seep (Eq. 9), we used two C M datasets collected during the monitoring of the CO 2 release ( Table 1). The first is a high-frequency dataset acquired by LOC sensors mounted on a lander ( Fig. 7A; Flohr et al., 2021) situated on the seafloor about 2.6 m to the south of the centre of the CO 2 bubble plumes. The data obtained by these sensors, which were deployed by RV James Cook to monitor the CO 2 bubble plume in the benthic boundary layer (BBL), will be henceforth referred to as the JC180BBL dataset. The sensors on this lander had two inlets for seawater, one at 17 cm above the seafloor and one at 87 cm above the seafloor. The sensor alternated between these two inlets and each measurement took 10 min. Here we used data sampled from both the inlets.
The second dataset is data gathered during a cruise with RV Poseidon using towed video-CTD (Fig. 7B) with an attached water pump to continuously monitor parameters in the water column around and above the CO 2 release site (Fig. 3, lower inset, dots). This technique has been described in detail in Linke et al. (2015) and Schmidt et al. (2015). Briefly, the 1-inch inlet tubing connected to an underwater pump was attached to the lower part of the video-CTD frame and the water was continuously pumped at a flow rate of about 30 l min − 1 to sample from specific water depths. A three-way tube connector was installed to split the flow and allow for discrete water sample collection and later onboard analysis.
These observations were made during a full tidal cycle on May 13-14, 16-17, and 19-20, 2019 (Schmidt, 2019). Water column measurements sampled with Niskin bottles were also conducted during the RV Poseidon cruise (POS534; Fig. 3, lower inset, circles), both above the CO 2 release site (16)(17)(19)(20) and few kilometres away from the site (May 10-13). The cruise dataset will be referred to as POS534P or POS534B, where P and B denote sampling through pumping or through Niskin bottles, respectively. For this dataset, our analyses focus on the samples taken from 109 m or deeper, i.e., maximum ≈10 m above the seafloor where the strongest signal of the released− CO 2 is expected.
For both datasets, we include data acquired before the start of the CO 2 release to be able to assess if the application of the C seep method results false positives, i.e., identify false released− CO 2 signals.

C seep computed from high-frequency sensor data
C seep was determined according to Eq. 9, using the high-frequency CO 2 release monitoring dataset (JC180BBL), ΔC M nat computed for the same dataset, and an average C b value determined from the highfrequency baseline dataset (JC180BL). During the computations of ΔC M nat and C b some data quality and paucities were circumvented as detailed in Appendix B. Since JC180BBL and JC180BL datasets were acquired simultaneously (but at different locations; Fig. 3) using the same sensor technology, we applied DT sensor = 16.6 μmol kg − 1 as discussed in section 3.3. In other words, only C seep values larger than DT sensor were confidently interpreted as a signal of released− CO 2 .
The temporal development of C seep together with those of the released− CO 2 rate and water movement (velocity) are shown in Fig. 8. Notably, prior to the onset of the CO 2 release, e.g., May 8-9, the C seep values are close to zero (2.4 ± 2.3 μmol kg − 1 ) as expected since no released− CO 2 was dissolved in the seawater. C seep values that are consistently above DT sensor start to appear on May 15, when the released− CO 2 rate was ≥ 15 kg d − 1 . After that date, the CO 2 release rate was even higher and elevated/extreme C seep values ranging from 40 to 160 μmol kg − 1 regularly appeared and their peak values generally scaled with rate of the CO 2 release. Moreover, the elevated C seep values occurred whenever the current direction was southwards, i.e., aligned with the direction of the sensor location relative to the CO 2 bubble stream through the seafloor (Fig. 3, caption). On the basis of the above observations, we conclude that the C seep method is able to detect the released− CO 2 signal in addition to quantifying the concentration of excess carbon dissolved in the sampled seawater. Moreover, the method Fig. 8. Temporal development of: (A) flow rate of the released CO 2 measured at the tank; (B) C seep, i.e., the excess C resulting from the released− CO 2 dissolved into the seawater sampled at 17 cm (blue) and 87 cm (magenta) above seafloor, respectively; and (C) water current velocity (positive values to the north). Vertical shaded bars indicate the alignment of CO 2 release rate, C seep maxima, and current direction. The horizontal shaded bar indicates the threshold for confidently detecting the signal of CO 2 release in the seawater.
unequivocally detects the released− CO 2 for moderate-to-high CO 2 release rates > 15 kg d − 1 . For low CO 2 release rate ≤ 5 kg d − 1 , C seep values were essentially indistinguishable from zero. The detection of the released− CO 2 signals is based on the DT sensor criterion. However, the fact that the identified signals are in line with other parameters such as the onset of the CO 2 release, current direction, and the strength of the CO 2 release rate lends confidence to the detected signals.

C seep computed from the cruise data
To evaluate Eq. 9 for the cruise observations, we used the mean C b value obtained for the POS527 data (2157 μmol kg − 1 ), while ΔC M nat values were computed from the monitoring datasets POS534P and POS534B. The baseline and monitoring datasets were acquired on different cruises (Tables 1 and 3), but we assumed no systematic errors in the measurements and applied a DT cruise = 16.6 μmol kg − 1 (section 3.3). The temporal development of the resulting C seep values together with those of the CO 2 release rate and the distance between the sampling point and the CO 2 bubble stream at the seafloor are shown in Fig. 9.
For the POS534B data (sampled through Niskin bottles), no distinct C seep maxima were observed and all values were within the DT cruise . For POS534P (sampled through pumping), on the other hand, C seep values above the threshold were observed on the night shift of May 16-17 (Fig. 9B). There was also a distinct local maximum of C seep on the night shift of May 19-20, but this did not rise over the threshold. Based on the DT cruise criterion, the application of the C seep method on this dataset identified a CO 2 release signal only on May 16-17. However, RV Poseidon was investigating the released− CO 2 plume from late May 13 (Fig. 9C), and the local C seep maximum on May 19-20 was probably due to released− CO 2 . In any case, it is interesting to note that this local maximum occurred when the CO 2 release rate was three times higher than May 17. It seems, therefore, that the cruise data did not capture the CO 2 release signal as good as the sensor data ( Fig. 8) even for high CO 2 release rates. Consequently, false negatives might occur more often for the cruise data. Possible causes for the seemingly lower sensitivity of the cruise data are discussed in section 5.2.
There is also one C seep value (-18.8 μmol kg − 1 ) that exceeds the negative side of DT cruise . This is interpreted as an outlier and indicates that a DT cruise = 19 μmol kg -1 is more suitable for this data probably because data acquisition on different cruises introduced additional errors in the model inputs. Note, however, that a higher DT cruise value would increase the risk of false negatives even more.
Finally, the above identification of released− CO 2 signal was based solely on DT cruise , which depends on the remnant variability in C b . This means that with the availability of a DT value, the application of the C seep method does not require any pre-knowledge of whether there is a seepage occurring nor its eventual whereabouts. However, any additional information available about the seepage situation is very useful to assess the confidence of the results.

Implementation flexibility
The modular nature of the C seep method gives the possibility of implementing it at different complexities depending on the purpose of the computations. Some examples of this are provided below.
In this work, the aim was to detect CO 2 seepage signal in addition to quantifying the concentration of excess CO 2 dissolved in the sampled seawater. Thus, we implemented the method in its most rigorous form in four modules by (i) setting up a model that includes as many as possible processes known to influence C concentration in the study area (section 2); (ii) establishing a baseline C through minimising the natural variability (section 3.2); (iii) determining a detection threshold (section 3.3); and (iv) quantifying C seep by evaluating Eq. 9 using C b and ΔC M nat values computed from baseline and monitoring data, respectively (section 4). This implementation can be most relevant during the monitoring phase of CCS projects when both identification of anomalous concentration and quantification of the dissolved CO 2 are necessary.
If only determination of C b is needed, for instance, during site characterisation phase of CCS projects when baseline gathering and understanding/identification of governing processes are the focus, then only the first and second modules of the method need to be implemented. In other cases, the determination of the detection threshold in addition to C b might be necessary. This can be important for modelling purposes for example. In fact, the recent works of Hvidevold et al. (2015,   Fig. 9. Temporal development of: (A) mass flow rate of the CO 2 release, and (B) C seep , i.e. the excess C resulting from released− CO 2 dissolved in the sampled seawater. Data sampled by the pumping technique (section 2) are shown in blue, those sampled above the seafloor with Niskin bottles are shown in filled circles. (C) Lateral distance (accuracy ±2.6 m) between the sampling point and location of the initial CO 2 bubble stream at the seafloor. (2017) and Oleynik et al. (2020), which investigated optimal sensor placement during seepage CO 2 detection, used the C seep method to define a DT. In such cases, the first three modules should be implemented with DT determined by considering the total error E Cseep . Furthermore, as an additional flexibility, the threshold can be chosen by the user as μ_ ECseep +kσ_ ECseep, k = 1, 2 or 3 depending on how important it is to avoid false positives and false negatives. In this work, the threshold was set using k = 2 (section 3.3), increasing k to 3 would increase the occurrence of false negatives, whereas using k = 1 would increase false positives. Apart from determining DT, executing the first three modules determines the level of remnant variability in C b , analyses its sources, and quantifies its contribution to DT through σ_ ECseep . In this way, the user is informed about changes in C b not explained by the underlaying model and their effect on DT.

2016), Alendal
In situations where the purpose is to quickly check for 'unusual' CO 2 concentrations, the C seep method can be implemented as propertyproperty plots of parameters that are indicative of the main processes of interest. For instance, if the main processes governing C variability at the site are organic matter cycling and CO 2 seepage, then a plot of salinity normalized C (nC) versus nutrients may be used as a seepage CO 2 check. Organic matter cycling produces covariation between nC and nutrients with fairly well constrained slopes, whereas CO 2 seepage increases C without impacting nutrients. Thus, samples contaminated with significant seepage CO 2 will clearly deviate from the slope defined by the nC-nutrient stoichiometry as illustrated in Fig. 10. Other equivalent plots can also be used, such as nC versus apparent oxygen utilization (AOU). In this regard, Uchimoto et al. (2017; proposed a seepage CO 2 detection method based on purely statistical covariance between seawater pCO 2 and DO. The existence of such covariance, however, most probably stems from the stoichiometric relationship between C and AOU, although not explored in Uchimoto et al. (2017;. It is important to understand that the above mentioned implementation only offers a way to check for anomalous C concentration that do not stem from salinity changes (i.e., mixing/circulation) and/or organic matter cycling. It does not offer any further check of whether the anomalies occur (solely or partly) due to other natural processes, e.g., in-situ dissolution of CaCO 3 and/or air-sea gas exchange. Therefore a comprehensive modelling of all the important natural mechanisms influencing C, such as in the full implementation of the C seep method, is necessary.
It is also important to point out that the C seep method is designed for real marine monitoring situations, which can be more complicated and challenging than the current release experiment. During monitoring of CCS sites one might not know if seepage is occurring or not, whereas during the experiment we knew where and when to look for a seepage signal. Furthermore, during the experiment, the baseline and monitoring data were acquired simultaneously and at different (but close) locations. Therefore, the difference between the baseline and monitoring C data resulting from natural variability, i.e., ΔC nat , was small and mostly contributed to by ΔC omc . Consequently, as soon as a signal of high C concentration was observed one could intuitively assume (with some confidence) it stems from the CO 2 release. In real monitoring situations, however, spatial and temporal difference between the baseline and monitoring data can be much larger. This, in turn, will produce larger ΔC nat , which might be contributed to by several processes and even conceal the seepage CO 2 signal. Therefore, in real marine monitoring situations, it is essential to model all of the significant natural processes that influence C, such as in the complete implementation of the C seep method. Finally, we note the trade-off involving in the above-implied flexibility regarding the timing of baseline data collection relative to the monitoring data. In section 4, we suggested DT cruise larger than DT sensor was more suitable because baseline and monitoring were collected nearly simultaneously for the sensor data, while for the cruises data collection took place in different years. Furthermore, sensitivity calculations showed that if we were to calculate C seep from JC180BBL using the average C b value obtained from the historical cruise data collected in May 2002 then a DT sensor ≈ 30 μmol kg − 1 would be required. This indicates some of the long-term natural C changes were not captured by the current underlying model, and with such elevated DT value the released− CO 2 signal would be captured only during the periods with higher CO 2 release rate (> 80 kgC d − 1 , Fig. 8). This calls for improved estimate of the long-term C trend in order to assure reliable detection of signals from lower seepage rates as well. To minimise potential error in the long-term trend, baseline data should be up-to-date. This is not necessarily a cumbersome task since any monitoring data that do not reveal CO 2 seepage can be used to update the baseline.

Importance of distance to the seepage source, sampling frequency, and measurement accuracy
An interesting finding from the results presented in section 4 is that although the experimental cruise data measured with benchtop instrumentation (POS534) is more accurate than the in situ sensor data (JC180BBL) (Tables 1 and 3), the latter captured the CO 2 release signal more clearly. For instance, the C seep peak values from JC180BBL were more regular, in phase with the current velocity, and scaled with the CO 2 release rate. A comprehensive analysis of this issue is beyond the scope of this work, but we note two factors that seem to be important. The first is the relative locations of the seepage source and the sensor, with respect to the currents, which is a key factor to take into account. In particular, the sensor data showed a strong vertical gradient. The C seep values were based on samples taken through two inlets: at 17 cm and 87 cm above the seafloor (section 4.1) with an average depth of 119.5 m. At the lower inlet C seep values above DT sensor were between 18 and 170 μmol kg − 1 whereas for the upper inlet the corresponding C seep values were much lower, 17-74 μmol kg − 1 (Fig. 8B) This indicates that released− CO 2 signal was nearly double at the lower inlet compared to 70 cm higher. The average sampling depth of POS534P data that were acquired at the experimental site was 116.4 m (114.3-117.4 m) and, thus, around 3 m shallower than the JC180BBL data. This is probably the main reason for the lower C seep values computed for the former dataset (Fig. 9B versus Fig. 8B). Nevertheless, it is worth mentioning that C seep values for the POS534P data on 19-20 May were lower than those on 16-17 May (Fig. 9B). This is despite the facts that on the 19th-20th the CO 2 release rate was higher and the average sampling depth was a metre lower. On both days the hydrodynamic conditions and ship position were very Fig. 10. Concentration of salinity normalized C (nC) plotted against that of nitrate (NO 3 ) using data from baseline cruises (POS518, POS527, and historical cruises; grey) and the CO 2 release sensor data JC180BBL (blue). Samples contaminated by the released− CO 2 in the JC180BBL data clearly deviate from the slope of the C:N stoichiometry of the data from the baseline cruises.
similar. This suggests, at least for the cruise data, distance to the source alone does not explain the strength of the signal.
The other important factor is the sampling frequency and its effect can be appreciated from the observation that we could not identify any release CO 2 signal in the Niskin bottle data (POS534B) although its analytical method had the best resolution of all the datasets. Therefore, given similar hydrographic settings and distance to the CO 2 seepage source, high-frequency data suits better for capturing the quickly changing seepage signal.
Measurement precision and accuracy are also prerequisites for accurate C seep values as illustrated by the following final example. The contribution of CaCO 3 precipitation/dissolution to the natural C variations around the Goldeneye site is generally small (Table 2 and Fig. A1C). However, we cannot rule out the possibility that its importance became significant during the CO 2 release experiment due to potential CaCO 3 dissolution in the sediment. If such a dissolution took place and if reaction products entered into the water column in significant amounts, then TA would increase in the water column. This would, in turn, increase ΔC CaCO3 values in the monitoring data, which would make a positive contribution to C seep . The experimental TA data from the water column (POS534, JC180BBL) did not show any systematic TA increase in connection with the CO 2 release. However, the effect of CaCO 3 dissolution on TA might have not been captured due to noise in the sensor TA data (Appendix B, and Fig. B1B) and low resolution of the cruise data. Thus, TA sensor data with high precision/accuracy is needed to confidently monitor the importance of ΔC CaCO3 .
For up-to-date information on commercially available sensors that can deliver the data necessary for C seep computation, we refer the reader to the International Carbon Coordination Project (IOCCP) which promotes, among others, the development of measurement technology for marine biogeochemistry. On the IOCCP website (http://www.ioccp.org) there is a directory of commercially available hardware (sensors and instruments), references listing documents on standards, best practices and user guides, etc.

Future outlooks for reducing the C b variability
Several ways to improve modelling ΔC nat need to be tested in the future to reduce the remnant variability in C b and keep DT as low as possible. For instance, improvement of the uncertainties in dPA/dS and PA 0 maybe be sought by assuming several water mass end-members instead of the two used in the current analysis (section A.3). Also, using equilibrium C value computed from parameters not impacted by CO 2 seepage such as S, TA (obtained from S, as in Eq. A5), T, and atmospheric pCO 2 using the CO2SYS package (van Heuven et al., 2011) has the potential to improve the long-term trend. This also minimises the concern of potential contamination in C b by seepage CO 2 . Lastly, site-specific C b value may be taken from outputs of well-calibrated regional biogeochemical models. This is specially desirable in situations when it is difficult to acquire new CO 2 seepage-free baseline data.
Observational data from open databases, such as GLODAP v2 (Olsen et al., 2016) and SOCAT (https://www.socat.info), will be an asset to calibrate the models.

Summary and concluding remarks
In this work we analysed baseline data from historical and recent cruises as well as monitoring data from the first-ever CO 2 release experiment in the north-western North Sea, carried out as part of the EU project STEMM− CCS. We successfully demonstrated the ability of the C seep method to (i) adequately predict natural C variations around the Goldeneye site, over seasonal to interannual time scales; (ii) establish a process-based baseline concentration (C b ) with minimal variability; (iii) determine CO 2 seepage detection threshold and reliably differentiate released− CO 2 from natural variability; and (iv) quantify released− CO 2 dissolved in the sampled seawater (C seep ) with concentrations above a pre-defined threshold.
In addition to the above mentioned utilities, we also showed that the C seep method can be implemented in modules depending on the purpose, and with the availability of detection threshold it does not require prior knowledge of CO 2 seepage. Furthermore, the detection threshold used in this study ranged 16.6-19.0 μmol kg − 1 , which was 18-20 % of the observed natural variability in the study area (Fig. 5), illustrating the high sensitivity of the method. Therefore, the C seep method can be used to monitor large areas by employing pre-defined detection thresholds, C b value(s), and mobile platforms, e.g., autonomous underwater vehicles equipped with fast-response sensors.
Based on the above results, we conclude that the C seep method features high sensitivity, automation suitability, and with a pre-defined detection threshold, it requires no pre-knowledge of seepage occurrence. Thus, it represents a powerful future monitoring tool both for large and confined marine areas.
Current CCS monitoring requirements comprise of five steps (Dixon and Romanak, 2015): background (baseline) measurements, assessment of CO 2 storage performance in the reservoir, detection of leakage, and (if leakage is detected, suspected or alleged) quantification of leakage and impact assessment. A particular strength of the C seep method is that, if used as a water column monitoring tool, it can address several of the above requirements. Other methods were also tested during the STEMM− CCS project to detect, attribute, and quantify the CO 2 release. For instance, natural tracers that are inherent to the injected CO 2 (δ 13 C CO2 , δ 18 O CO2 ) and artificial tracers added to it (Kr, SF 6 , C 3 F 8 ) were tested to identify anomalies originating from the released-CO 2 rather than natural sources (Flohr et al., 2021). Studies on the presentations and comparisons of the performance of the different methods are underway and expected to enter the scientific literature soon.
National Oceanography Centre for the development of the in-situ pH sensors used during the release experiment. We would like to thank the Captains and the crew on RV Poseidon for their excellent support during the three expeditions. Thanks also to André Mutzberg, Dominik Jasinski and Maria Martinez-Cabanas at GEOMAR for their help with sampling and measuring nutrients and carbonate parameters. We are also grateful for two anonymous referees who's comments and suggestions improved the manuscript.

Appendix A. Contributions of natural C variability (ΔC nat )
In the following subsections, we describe the details on how the components of ΔC nat (ΔC omc , ΔC CaCO3 , ΔC mix , and ΔC ase ) are computed (Fig. 4). It must be noted that here we are not interested in quantifying ΔC nat integrated over a given period (e.g., one year). Rather, we are interested in quantifying how much (the contribution) each component of ΔC nat caused the measured C concentration to deviate from the theoretical, nominally constant value of C b that should exist for a given reference value of S, phosphate (PO 4 ), PA and ΔC ase (Table 2).

A.1.1 Contribution of organic matter cycling
The formation of organic matter through photosynthesis takes up C and nutrients from seawater, whereas the decay of organic matter through respiration releases C and nutrients back to seawater. Moreover, changes in C and nutrients during photosynthesis and respiration take place in stoichiometric ratios (e.g., Anderson and Sarmiento, 1994;Redfield, 1934;Redfield et al., 1963). Therefore, we quantified the change in C related to photosynthesis and respiration (ΔC omc ) using observed changes in phosphate (PO 4 ) relative to the reference value PO ref 4 (Table 2) and the stoichiometric ratio between C and PO 4 (R CP ) according to: . A1. Modelled contributions of the natural C changes (ΔC nat ) arising from the following individual processes: (A) organic matter cycling (ΔC omc < 0: production, ΔC omc > 0: respiration relative to the reference PO 4 in Table 2); (B) salinity changes, i.e. mixing between water masses (ΔC mix ); (C) calcium carbonate cycling (ΔC CaCO3 < 0: formation, ΔC CaCO3 > 0: dissolution relative to the reference PA in Table 2); and (D) ΔC nat , i.e., the sum of the above contributions plus ΔC ase. For each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points not considered outliers, and outliers are crosses.
We used R CP value of 117 ± 16, which brackets the different values reported in the literature (Redfield, 1934;Anderson and Sarmiento, 1994;Körtzinger et al., 2001).
The resulting values for ΔC omc are shown in Fig. A1 A and these range from -40 to +40 μmol kg − 1 . Generally, the shallow near-coast stations (49, 50, and 55; Fig. 3) show net organic matter production relative to the reference state (Table 2), while the rest of the stations show either net respiration or a balance between respiration and photosynthesis.

A.1.2 Contribution of calcium carbonate (CaCO 3 ) cycling
In situ formation (dissolution) of CaCO 3 structures remove (release) CO 3 2− from (to) seawater and therefore decrease (increase) TA and C according to Eqs. 1 and 2. The resulting TA and C changes occur in a stoichiometric ratio of 2:1 (Zeebe and Wolf-Gladrow, 2001). Consequently, TA data can be used to (i) monitor the in situ CaCO 3 changes, and (ii) quantify the resulting C change (ΔC CaCO3 ) using (e.g., Brewer, 1978): where PA is the potential TA computed from the measured TA and nitrate (NO 3 ) following Kanamori and Ikegami (1982) and Wolf-Gladrow et al. (2007): In Eq. A2, nPA is salinity normalized PA (Friis et al., 2003): where the slope dPA/dS = 67.5 ± 7.4 μmol kg − 1 and the intercept PA 0 =-46 ± 261 μmol kg − 1 . Note, the fact that our observational data do not include TA values for freshwater and/or brackish water results in the high uncertainty of the intercept. Eq. A5 and S ref = 35.1 were used to compute PA S ref (Table 2). Moreover, the predicted PA value at zero salinity (PA 0 ) was determined from the intercept of Eq. A5.

A.2 Air-sea gas exchange-driven variability
At the air-sea interface, CO 2 is exchanged between atmosphere and ocean. The direction of the flux is dictated by the CO 2 partial pressure difference between surface seawater and the overlaying air (ΔpCO 2 = pCO 2air -pCO 2sea ). The CO 2 flux is from (to) the atmosphere towards (from) the ocean if ΔpCO 2 > 0 (ΔpCO 2 < 0). Furthermore, the strength of the flux scales with wind speed (e.g., Sweeney et al., 2007). This process influences C in two ways. First, fluctuations in ΔpCO 2 and/or in wind speed cause fluctuations in C in the surface water and some of this signal could be transported to the deeper waters. However, given that the northern North Sea is seasonally stratified, this fluctuation has only a strong impact on the surface water and we thus neglect its effect in this work. Second, since we are using data covering a long period (18 years), accumulation of anthropogenic CO 2 (C ant ) is expected to steadily increase C over time. The rate of this increase could not be discerned from the low-resolution data we used here. However, Omar et al. (2019) analysed a decade of underway surface data in the northern North Sea and reported an annual C increase of 1.2 ± 0.3 μmol kg − 1 yr − 1 . The addition of C ant does not necessarily take place locally in the North Sea, but can be carried by advection of water masses into the region, e.g., with the inflow of North Atlantic Water and, therefore, should be expected to impact also the deep waters. Thus, we used the estimate of Omar et al. (2019) to adjust all C data to the year 2010 according to: ΔC ase = C ant = 1.2 * (yr − 2010) where yr is the year of observation.

A.3 Mixing-driven variability
Mixing of water masses changes the distribution of C and TA in the water column. Here, we consider the PA-S linear relationship in Eq. A5 to be the result of mixing between low-salinity Coastal Water and high-salinity North Atlantic Water. Other water mass end-members maybe contributing to the observed S relationship, but the effect of this is accounted for through the substantial uncertainties in the regression coefficients of Eq. A5.
Measured C is not a conservative parameter because of the sinks/sources driven by ΔC bio and ΔC ase . However, once the fluctuations arising from ΔC bio and ΔC ase are accounted for we assume that the resulting modified C mix conservatively just like PA (above). Therefore, we computed the contribution of salinity changes (ΔC mix ) using the difference between measured and reference salinity and the slope of Eq. A5: Finally, C b was calculated as A.M. Omar et al. C b = C − (ΔC bio + ΔC ase + ΔC mix ) = C − ΔC nat (A8) where the terms in parenthesis are by definition ΔC nat . As can be seen from Fig. A1B, values of ΔC mix range between -40 and 20 μmol kg − 1 and are lower for the coastal water stations. This means that higher C adjustments were necessary for the low-salinity coastal water to be brought to C levels corresponding to reference samples with S = 35.1.

Appendix B. Circumvention of data coverage and quality limitations
The monitoring and baseline datasets JC180BBL and JC180BL, respectively, included some differences in temporal coverage and data quality as depicted in Fig. B1. The following limitations are notable from the graphs: First, salinity was measured only at the JC180BL lander (Fig. B1A). We circumvented this limitation by using the same salinity values for both sites, i.e. we implicitly assumed that there was no significant systematic salinity difference between the two sites, which are < 0.5 km apart (Fig. 3). A comparison between the salinity of the deepest samples in the POS534P dataset (Fig. B1A, black dots) and those measured at the JC180BBL lander (Fig. B1A, navy dots), when RV Poseidon was up to 1 km away from the lander, showed practically indistinct salinity values supporting the above assumption. For example, the largest observed deviation of 0.05 would produce a 3.4 μmol kg − 1 difference in PA (according to Eq. A5) and 1.7 μmol kg − 1 difference in C b (according to Eq. A7). Both these changes are way lower than other uncertainties involved in the determination of the C seep (section 3.3).
Second, TA values from the JC180BBL dataset were highly noisy (Fig. B1B, navy dots). This could partly result from regular disturbance of sediment sampling activity conducted with a ROV moving around and inserting/removing sediment-based instruments quite close to the sensors for TA and PO4 (below). This could have caused pore waters with higher TA and PO4 to mix into the water column occasionally. In any case, we dealt with the noise by using the average value of TA (2333 μmol kg − 1 ) measured at JC180BL. This is a reasonable approximation given that (i) mean values of the TA values from the two landers are essentially indistinguishable (2309 ± 56 μmol kg -1 vs. 2333 ± 5 μmol kg − 1 ), and (ii) TA is mainly a function of S and we used the same S values for both sites (above).
Third, PO 4 values from the JC180BBL dataset are noisier than their JC180BL counterparts (Fig. B1D) probably due to the same reasons as in the TA data (above). Furthermore, from Fig. B2 we noticed that while all cruise data falls around a line defining the stoichiometric relationship known to exist between PO 4 and NO 3 (e.g., Redfield, 1934;Redfield et al., 1963;Anderson and Sarmiento, 1994), the sensor data deviates from this relationship, especially for high PO 4 values. We also noticed that, according to the cruise data, PO 4 values higher than 1 μmol kg − 1 are not typical for the study area. Thus, we considered any sensor PO 4 values higher than 0.8 μmol kg − 1 as suspicious. This also implies that some of PO 4 variability of the sensor data is noise. Thus, for the computation of C b (section 3.2) in the JC180BL data we used the average PO 4 (0.6 μmol kg − 1 ) based on values ≤0.8 μmol kg − 1 .
Fourth, we also noted that NO 3 values from the JC180BBL dataset are systematically lower than those from the JC180BL by around 1 μmol kg − 1 (Fig. B1C). On the other hand, PO 4 values from the JC180BBL are higher than those from JC180BL. We considered the the less noisy JC180BBL NO 3 measurements to be more trustworthy. Therefore, for the computation of ΔC M nat from JC180BBL data (section 4.2) we used a PO 4 value of 0.41 μmol kg -1 calculated from the average NO 3 concentration (3.74 μmol kg -1 ) using the relationship defined by the cruise data (Fig. B2).