Optimal sensors placement for detecting CO 2 discharges from unknown locations on the seafloor

Assurance monitoring of the marine environment is a required and intrinsic part of CO 2 storage project. To reduce the costs related to the monitoring effort, the monitoring program must be designed with optimal use of instrumentation. Here we use solution of a classical set cover problem to design placement of an array of fixed chemical sensors with the purpose of detecting a seep of CO 2 through the seafloor from an unknown location. The solution of the problem is not unique and different aspects, such as cost or existing infrastructure, can be added to define an optimal solution. We formulate an optimization problem and propose a method to generate footprints of potential seeps using an advection–diffusion model and a stoichiometric method for detection of small seepage CO 2 signals. We provide some numerical experiments to illustrate the concepts.


Introduction
Insulating the captured CO2 from the atmosphere, by injecting it into geological formations, is the final step, and the whole purpose, of the Carbon Capture, and Storage (CCS) technology. Many promising storage sites are offshore, and especially the North Sea is considered a promising region for large scale storage (Halland et al., 2013).
Even though the offshore geological storage complexes are chosen and storage operations are designed to assure long term confinement, there is a risk that some of the injected CO2, being buoyant for the initial decade after injection, can migrate toward the surface and seep into the water column (Metz et al., 2005). If a seep occurs, it would reduce the climate change mitigation efficacy (Haugan and Joos, 2004;Torvanger et al., 2012), have impact on the carbon trading framework (García and Torvanger, 2019), and might, at least in the vicinity of the discharge, damage the ecosystem (Jones et al., 2015). In addition, even if all the stored CO2 is successfully contained within the intended formation, a storage project could suffer from accusation of environmental impact (Boyd et al., 2013;Romanak et al., 2013).
This gives motivation for designing monitoring programs that would not only comply with regulations (Dixon et al., 2015), but also to rule out aforementioned unjustified accusations. A monitoring program can also be viewed as part of marine mapping and survey programs, e.g., the Mareano programme (Buhl-Mortensen et al., 2014). It can be part of Ocean Observatories Initiative (OOI) (Smith et al., 2018), helping in assessing the overall health of the marine environment (Halpern et al., 2012), and be a tool for Marine Spatial Planning (MSP) (Domínguez-Tejo et al., 2016). As such, the monitoring program will be useful for communicating risks and benefits from large scale storage underneath the ocean, subsequently assisting in gaining public acceptance (Mabon et al., 2014).
Since the storage site must be monitored for a long period after the injection, and the area in which migrating CO2 might reach the seafloor is large, the marine monitoring program will impose additional costs and challenges to the storage project (Oldenburg and Lewicki, 2006;Blackford et al., 2015Blackford et al., , 2017. In particular, access to offshore sites for monitoring purposes will be harder and more costly than for onshore sites, partly due to seawater being hostile to instrumentation. Thus, the monitoring program must be not only effective but also cost efficient. While geophysical monitoring technologies will be the backbone of the monitoring program for offshore CO2 storage projects (Jenkins et al., 2015;Vermeul et al., 2016), the detection threshold of such techniques is of the order of 10 3 t CO2 (Jenkins et al., 2015) and, thus, secondary monitoring strategies must be in place. Monitoring for CO2 seeps through the seafloor, e.g., monitoring changes in bottom fauna or in the pelagic ecosystem (Wegener et al., 2008;Blackford et al., 2010), detecting bubbles from ship sonars (Brewer et al., 2006;Noble et al., 2012), or elevated concentration of dissolved gases Drange et al., 2001;Botnen et al., 2015;Vielstädte et al., 2015;Uchimoto et al., 2018) could be used to detect possible leakage at low levels. Seafloor monitoring can reduce the probability and magnitude of adverse events to occur if small seeps are not detected. However, the high variability of the marine environment, both in current conditions (Alendal et al., 2005;Ali et al., 2016) and in biochemical activities (Artioli et al., 2012;Romanak et al., 2012;Botnen et al., 2015), poses new challenges compared to the classical environmental monitoring procedures developed during decades of offshore petroleum activities.
Here we focus on seafloor monitoring strategy based on measuring CO2 concentrations using fixed installations. Seawater CO2 is highly dynamic in both space and time. The high variability arises from natural processes, such as photosynthesis/respiration, biosynthesis/dissolution of calcium carbonate (CaCO3) and changes in salinity, which affect the complex seawater CO2 system. Therefore, in order to detect seeps from subsea reservoirs one needs to define an anomaly threshold able to distinguish the seepage signals from the natural variability of seawater CO2. Botnen et al. (2015) demonstrated a stoichiometric approach for detection of small CO2 signals that might arise when extra CO2 stemming from subseafloor seeps dissolve into the water column. This method, henceforth referred to as the C-seep method, will be briefly described in Section 3.1. For a more thorough description the reader is referred to Botnen et al. (2015) and Omar et al. (2018). The C-seep method is based on the so-called back-calculation method (e.g., Gruber et al., 1996) and lies in between the statistical power analysis for obtaining an environmental baseline (Yang et al., 2011) and the pure process based monitoring approach suggested by Romanak et al. (2012). One of the utilities of the C-seep method is that it lowers the concentration threshold for a signal to become statistically significant and here we use the method to define detection limits.
In addition to the environmental statistics, the design of monitoring programs relies on probable seep scenarios that can only be predicted from characterization of the local geology and through flow migration process models. As the marine waters are in constant motion and are characterized with high variability, the footprints of leaks are thus highly anisotropic and strongly depend on the local oceanic and atmospheric conditions (Alendal et al., 2005;Ali et al., 2016). The role of numerical modeling in this context is summarized in Blackford et al. (2018).
Studies on how to design monitoring programs for detecting a seep from an unknown location, incorporating the natural variability, have been performed for fixed installations (Hvidevold et al., 2015(Hvidevold et al., , 2016Greenwood et al., 2015;Ali et al., 2016) and for Autonomous Underwater Vehicles (AUV) (Maeda et al., 2015;Alendal, 2017). These studies did not take into account spatial heterogeneity of advective velocities and used very limited number of simulated leak scenarios.
In particular, Hvidevold et al. (2015Hvidevold et al. ( , 2016 optimized the layout of a fixed array of chemical sensors on the seafloor, using the probability of detecting a seep as metric. It was assumed that all possible leaks would present the same footprint and the layout of the sensors was solved in order to reach the highest probability of detecting the seep. Here we introduce two main improvements to the previous studies; Firstly, we formulate an optimization problem in terms of the classical set cover problem, solution to which could be approximated using well established approximation algorithms. This formulation also allows for different cost functions to be minimized, which could be, e.g., the number of sensors, the cost associated with maintaining sensors, the probability of a leak at specific locations. We give more details on the set cover problem associated to the seep localization in Section 2. Secondly, we use different and more realistic footprints based on a physical model, as described in Section 3. We give illustrations of the concepts throughout the manuscript using model simulations, and some constructed examples. In particular, we use an area in the southern North Sea as area of study to illustrate our results, see Section 4. Finally, we discuss possible extensions of the approach in Section 5.

Sensor placement algorithm
In this section we formulate the optimization problem and define possible cost functions. Let be a bounded subset of d , and U i , = … i N 1, , , be subsets of of a positive Lebesque measure which we denote as U | | i . We can think of sets U i as the footprints associated to a potential leak i, which may occur, that is being active, with some probability p i . We say that a set U i is detected at a point x if x U i . Obviously one point x may detect several sets. Here we would like to discuss how can we place a number of sensor points so that they detect all the sets U i and their placement is optimal with respect to a given cost function. For a given cost function : ( ) 0 we define optimality as the set of points … x x { , , } n 1 which lead to a minimal cost. Here we use cost functions that can be written as the sum of pointwise costs x ( ) incurred by each sensor point x , that is, for a pointwise cost function : 0 . The most straightforward criteria for optimality is the number of points n to be minimal. This corresponds to a pointwise cost function = x ( ) 1. We would like to emphasize that we do not distinguish between two different points x and y if they detect the same sets. That is, if x y U , i for i I and x y U , j for j I , then x and y are in the same equivalence class.
) and x U j , j I x ( ) in the examples below. The pointwise cost function can dependent on the area of the equivalence class, that is, Let N k be a number of sets detected by a point which we call the overdetection number, or simply overdetection, associated with … x x { , , } n 1 . In order to maximize the overdetection number while keeping the number of sensors low, we introduce the pointwise cost function is given by Eq. (1) with the pointwise cost function as in Eq. (4). Now let U i , = … i N 1, , , be associated with certain probabilities p i . If p i is a probability that leak i is active, and assuming that exactly one leak is active, the natural way to define a pointwise cost function would be In this case minimizing … x x ({ , , }) n 1 would correspond to placing sensors so that they detect leaks, associated with higher probability of being active, more times. This cost function could be view as a particular case of N when = p N 1/ i in Eq. (4). In a similar fashion, one can define the probability of overdetection as where is associated with the pointwise cost in Eq. (5). Observe that even within one approach there are multiple ways to choose the cost function, see Eq.
(2). This choice should be motivated A. Oleynik, et al. International Journal of Greenhouse Gas Control 95 (2020) 102951 by the application. The only requirement is that the pointwise cost function is non-negative. We summarize the problem as follows. Find x j , = … j n N 1, , in such that (i) all the sets are detected and (ii) the total cost of placing is minimal. This is a typical example of a set cover problem with the universe Dealing with a small number of sets or sets with rather simple intersections, is trivial. However, the complexity of the problem increases considerably when the number of intersecting sets gets large. In fact, the problem is computationally difficult to solve and different approximation algorithms have been developed to deal with problems of this kind. For the review of the problem and numerical methods see, e.g., Hochbaum (1997) and Schrijver (1986). Here we use a linear integer program formulation, which we present below.
In order to simplify the problem, we can remove repeated columns and zero columns in V and the corresponding elements in w, but keep the ones that corresponds to the minimum of w for the duplicated columns. By doing so we obtain a smaller matrix w , respectively. Each column of the matrix Ṽ corresponds to the sets i , i I x ( ) k that could be detected with the point x k . Finally, we formalize the problem as Each nonzero element z j in z {0, 1} M corresponds to a point x j that can be placed anywhere in the set X j defined as Obviously, the number of nonzero elements in z is at most N and the total cost is given as w z T . The matrix Ṽ and w in Eq. (7) could be further pruned by removing the columns corresponding to x j that do not have desired properties. For example, dealing with fine grids, one may remove the columns corresponding to x j that has too small area X | | j . In the next section we propose a method to generate the sets U i associated with leak footprint.

Determining detection limits using the C-seep method
The C-seep method isolates the effect of leakage CO2 on the Dissolved Inorganic Carbon (DIC) by comparing two measurements acquired at the reference station (ref) and the station being monitored m ( ). The method first minimizes the DIC differences between the two stations that arise from differences in natural processes (Botnen et al., 2015). This is achieved by correcting the DIC of the monitored station back to that of the reference station. It then assumes that the remaining natural variability of the two stations are identical and, thus, seepage of CO2 can be computed as the difference between the corrected DIC and the DIC at the reference station .
Below we present shortly the calculations. A description of the variables can be found in Table 1.
From Botnen et al. (2015) the corrected DIC is calculated as and r C P : is the carbon to PO4 Redfield ratio, which relates the DIC and PO4 produced/consumed during organic matter cycling (Redfield, 1934). The term A 0 stands for the estimated alkalinity when salinity = S 0, assuming that salinity and alkalinity obey the linear relationship see, e.g., Friis et al. (2003).
Assuming that C (ref) is not influenced by seeps we estimate the excess of CO2 at the monitoring station as In Table 1 we give examples of the measurements for three stations 5, 6 and 7, see Fig. 2(a). Here we used the publicly available data seawater CO2 measurements from a cruise in February 2002 (Olsen et al., 2016).
In reality, this is however not the case due to errors in measurements and estimated parameters. Including these errors in the model results in with error distribution . Given a threshold , we can then attribute the excess of CO2 to the seep if > C m ( ) seep . Choosing too high thresholds will not allow for seep detection, while too low thresholds may lead to false alarms.
In order to estimate we performed Monte Carlo (MC) simulations of the underlying model in Eqs. (8)-(11), assuming normal distribution for the measurements and model parameters errors. In particular, we assumed that salinity measurements have standard deviation = 0.003 S and the standard deviations of alkalinity and phosphate are chosen to be 0.1% and 0.05% of measured values, respectively. For the Redfield ratio we use = r 117 C P : µmol kg 1 with the standard deviation equal to 14 µmol kg 1 , based on Anderson and Sarmiento (1994). Finally, we assumed that A 0 in Eq. (10) has normal distribution with 1817.46 µmol kg 1 mean and the standard deviation 48.02 µmol kg 1 based on Omar et al. (2010).
We plot the histogram of the error from the MC simulations with Fig. 1(a) fitted with a normal density function. We chose the station labeled 6 for the reference station as in Omar et al. (2018). There is, however, no particular reason for this choice. In Fig. 1(b) we plot the estimated mean µ and standard deviation as a function of the sampling number N MC together with the 95% confidence intervals for this distribution. The confidence intervals were calculated as in Harding   Table 1 A list of variables used in calculation of C seep in Eq. (11) and the measurements for stations 5, 6, and 7 as in Fig. 2 In Fig. 2 we illustrate the outcome of 10 000 random draws of MC sampling as the box plot for = ref 6 and = … m 1, , 10. The shaded areas corresponds to detection threshold The measurements used here did not contain any seepage signal. Therefore, the box plot indicates the degree at which the assumption of identical background DIC in the reference and different monitoring stations is met. For stations that have similar background DIC as the reference station we require < C | | seep . Positive values above mean that monitored station has higher background DIC than the reference station, whereas negative values indicate that the monitored station has lower DIC levels compared to the reference station. Station 6 can be considered as a good reference for all stations except, perhaps stations 3 and 9, when choosing = 2 . By choosing = we risk to have too many false alarms, in particular from station 3 and 9, while by choosing = 3 we may miss a seep if it occurs. The above consideration of the C-seep is based on a limited data from one winter cruise. Including more data would provide us better knowledge of parameters and errors and result in improved estimates of C-seep.
Moreover, in order for a seep to be detected in a large area reference and monitoring stations must be placed in a such way that they capture the signal originating from most of the potential leak locations and, at the same time, can be used as reference stations for each other. In this paper we focus on the placement of measurement stations. In order to decide if they are representing the natural variability of the region, and thus can be used as the reference stations, the measurements must be collected from the identified locations. This is however outside of the scope of the present paper.

Simulating CO2 footprints
As we mentioned before, designing a monitoring program of subsea CCS reservoirs is challenging due to both the variability of the environment and ocean dynamics. Assuming that the measurements can be corrected for the natural variability, as in Section 3.1, we now focus on the ocean dynamics.
Transport of contaminants, such as CO2, in the ocean is typically modeled using General Circulation Models (GCMs) with additional transport equations for tracers. These models are computationally demanding and, hence, only allow to simulate a few leak scenarios. Under the assumption that the contaminant is a passive tracer, i.e., does not influence on the water density, the tracer transport equations can be integrated off-line. The GCMs can be used to produce characteristic spatial and temporal velocity fields, accounting for tides, storm events, and topographic steering of the currents. Such current statistics, preferably supported by in-situ current time series, together with an advection-diffusion model, being orders of magnitude less computationally demanding than the GCMs, can be used to simulate many more leak scenarios.
Let the transport of a contaminant be given by the advection-diffusion model x 0 Here c x t ( , ) is the concentration of a contaminant, is a bounded connected domain in d , = d 2, 3, W x t ( , ) d is a velocity field, and D x t ( , ) 0 is the diffusion coefficient. The source term f x t ( , ) is assumed to be in the form where is the d-dimensional delta function, > q 0 is the intensity, or the seepage rate, and z is the location of the source. For simplicity we assume that q is constant.
In applications, the point source z is substituted by a small region around z which amounts to replacing x z ( ) with functions of small support. Thus, x z ( ) can be viewed as a limiting case when the support is getting smaller and smaller. Here, we assume that the domain is large enough and the sources positioned far from the boundary so that the contaminant does not reach . The latter assure that spurious effects from the open boundary conditions enforced on the lateral boundaries do not affect the results.
In our examples we use W x t ( , ) obtained from a 800 m resolution regional Bergen Ocean Model (BOM) 1 set up for North Sea . In the vertical the model uses 41 sigma-layers, distributed with higher resolution (1 m) near the free surface and the sea floor. The resulting current is dominated by semi diurnal tidal signal with an average speed close to 10 cm/s, and an amplitude less that 10 cm/s, for details see Ali et al. (2016). For simplicity, we considered only the 1 m thick bottom layer. In our simulations is a × 72.8 74.4 km 2 rectangular area that, in the geographic coordinate system, corresponds to the region marked in Fig. 2(a). We use × 93 91 grid cells of × 800 800 m 2 . In Fig. 3(a) we plot the mean of the current speed at the bottom layer. To illustrate variability in speed and direction, we plot the wind rose diagram for the currents. In particular, in Fig. 3(b) we plot the diagram for the currents over the whole area and in Fig. 3(c) and (d) at particular locations, which are marked in Fig. 3(a) with the red circle and black square. As the horizontal diffusion is insignificant for this grid size, we set = D 0. When we would like to emphasize the parameter dependence of the solution c x t ( , ) we add the parameters of interest after semicolon, e.g., c x t t ( , ; ) 0 or c x t t q z ( , ; , , ) 0 .

Since the model above Eqs. (13)-(15) is linear, a multiple leak scenario solution, that is, when
, can be calculated as is the solution of Eqs. (13)-(15) with = q 1 and = z z j . Let q be the intensity, t 0 and T the seep starting time and its duration, respectively, and the detection threshold obtained as in Section 3.1. Then a leak footprint can be defined as, e.g., the maximal footprint 0, which is a useful property when generating footprints with different parameters q and . In addition, as the model is mass conserving, we can be guaranteed that all leaks with the flux rate larger than q and lasting longer than T will be detected, if the method is working for T and q. Indeed, let > 0 and t 0 be fixed. Then ,m ax 0 for q q and T T , see Fig. 4(b). Our goal is to detect leakages, which can go unnoticed for a rather long time. In particular, we use, what is referred in Blackford et al. (2008) to as a long term-diffuse seepage. That is, we assume a constant low-level seepage of CO2, spread homogeneously across the area of one model box (0.64 km 2 ). We use the same seepage rates as in Blackford et al. (2008), namely, a high seepage rate of 0.0953 kg s and a low seepage rate of 9.53 kg s, both recalculated for the considered model environment (model box of 0.64 km 2 ). That is, we have = × q 1.94 10 high 7 kg m 3 s 1 and = × q 1.94 10 low 7 kg m 3 s 1 , see Table 2.
Further on, we assume T and being fixed. For this reason, we omit T and in the notation and simply write U q ( ) Here we consider 20 leak locations … z z z { , , } 1 20 selected uniformly at random in 0 , see the locations marked red in Fig. 5. We assume that these are the only possible leak locations, which is a simplification. In order to take into account different footprint topologies at the same location, we vary the starting time t 0 . For illustrations we chose 10 different starting points with , q high , and q low as in Table 2. In Fig. 4 we give an example of + c x t T ( , ) 0 for one leak location at = t t 0 0,1 and the corresponding footprints. As expected, the footprint corresponding to the low rate is contained in the one corresponding to The whiskers extend to the extreme points, not considered outliers, and the outliers are marked as red crosses.
From Fig. 4(b), Fig. 5 and properties mentioned above, it is clear that a larger ratio q / implies the larger number of intersecting sets, and thus, potentially, smaller number of sensors needed. We give examples for both = q q high and = q q low in the next section.

Numerical experiments
In this section we illustrate the optimization problem, defined in Eq. (7), using different cost functions. The goal is to find the optimal sensors placement in the region to detect all potential leaks and compare the outcomes corresponding to different cost functions. To illustrate the method we used 200 maximal footprints, as described in Section 3, generated for two different seepage rates q and the detection threshold as in Table 2, see Fig. 5. In the notation of Section 2, these footprints are the sets U i , = … i N 1, , with = N 200. First we consider the sets generated with the high q and then with the low. There is no reason treating these two cases together since the sets corresponding to the low q are contained in the sets with the high q, see the previous section for details.
To solve Eq. (7) we used the Matlab inbuilt function intlinprog. For all the test examples the program found an optimal solution which was indicated by the zero relative dual-primal gap.
In Fig. 6 we plot solutions to the problem with different choice of cost functions. Black crosses correspond to sensor locations, the red dots to the leak locations. The numbers next to the crosses indicate the number of leaks (out of total number of possible leaks) that can be detected per sensor. The color-plot indicates the sets U i with the highest values corresponding to the place of maximum overlap. We specify the overdetection number and the approach name that corresponds to the choice of , see Table 3.
In particular, in Fig. 6(a) we plot a solution that gives the minimal number of sensor, which is equal to 7 in this case. In this case we solve Eq. (7) with = x ( ) 1, which we refer to the minimal number approach. The solutions in Fig. 6(b) and (c) are solutions of the optimization problem Eq. (7) with x ( ) chosen as Eqs. (4) and (5), respectively. The probabilities p i , were set as in Fig. 7(a) with the normalization factor × 815 10, so that = p 1 i . One can see that both solutions are the solution of the minimal number approach (unweighted cover problem), that is, when = x ( ) 1. As the unweighted cover problem could have many solutions, introducing relevant costs could not only give minimal number of sensors but also allows to make the detection more robust.
That is, the solution in Fig. 6(b) maximizes the number of overdetected leaks while minimizing the number of sensors. Here all the leaks have equal probability of being active. Then the cost function Eq. (4) can be viewed as a particular case of Eq. (5) We call the methods corresponding to these cost functions as maximal overdetection and maximal probability, respectively.
Finally, to demonstrate the ability to include cost in the monitoring design, we consider the pointwise cost function where x y ( , ) c c are the coordinates of the center of domain and > 0 is a scalar, see Fig. 7(b). This function aims to illustrate the operational cost which might be site dependent. The assumption is that there is some infrastructure in the center of the domain, e.g., a platform or the injection infrastructure, and that the cost of maintaining sensors increase with distance from this center point. We plot the solution to the optimization problem with this cost function in Fig. 6(d), and call the approach the minimal operational cost. The number of sensors in this case is equal to 12.
In order to compare the solutions … x x { , , } n 1 in Fig. 6 and motivate the choice of the method names, we calculate overdetection numbers, see Eqs.
In addition, we ran 10 000 leak simulations initiated uniformly at random within available time range for currents, that is, between 2 February 2012 02:17:08 and 30 May 00:00:00. The leak locations were randomly drawn from the 20 leak locations. The seepage rate was fixed to high, see Table 2. Next, we have checked how many of the random leaks would be missed by the sensors placed as in Fig. 6. We report the results in the last row of Table 3. All the cases indicate less than 3% failure rate.
Next, we apply the same methods to the leaks generated with the low seepage rate. As shown in Fig. 8 and Table 4 the number of sensors has increased more than twice compared to the high seepage rate case. Analogous to the previous example, we test the sensors locations on the 10 000 leaks. The failure rate has increased, but still remains below 5%, which we consider acceptable. In order to decrease the failure rate without over-fitting, one requires longer time series of the current simulations. We do not pursue this task here.
We would like to point out that, even though using the minimal operational cost increased the total number of sensors in both examples, it did not improve the detection results for the randomly generated leaks. A. Oleynik, et al. International Journal of Greenhouse Gas Control 95 (2020) 102951

Concluding remarks
We have demonstrated how solving a classical mathematical problem could be used to design marine monitoring programs. To perform a more comprehensive design, valid for an actual storage site, will require a geological survey identifying potential leak locations and their relative probability, as for example in Fig. 7(a). In addition, a comprehensive environmental baseline is needed for establishing better detection limits, from for instance the C-seep method. Process models play a significant role in establishing the necessary baseline statistics .
Transport models play important role in predicting the spatial and temporal signal of a tracer discharge to the water column and, thus, adequate current statistics will allow for better footprint predictions. Depending on the data available, it might be beneficial, however more computationally costly, to use three dimensional version of the advocation-diffusion model. In addition, the footprints could be produced accounting for seasonality, the measuring frequency, and other factors and events, e.g., storm passages and fresh water run-off. Data from insitu release experiments, e.g., QICS and STEMM-CCS (Blackford et al., 2014 are very useful for the required validation and quality assessment of these models.  Table 3.

Table 3
Comparison of the solutions with different cost functions, for the high seepage rate case. The value in boldface in rows 3-6 corresponds to the optimal solution in the sense of the cost function type indicated in the corresponding row. The last row reflects the percentage of the 10 000 random leaks missed by the sensors for each of the solutions.  A. Oleynik, et al. International Journal of Greenhouse Gas Control 95 (2020) 102951 Generally speaking, there might be several solutions to the optimization problem that corresponds to the minimal number of sensors, see, e.g., Fig. 6(a)-(c). Thus, it could be useful to design additional cost functions that would allow to select one solution that, at the same time, also optimizes this cost, e.g., see Fig. 6(b) and (c). This could be easily done within the same mathematical framework. Costs associated with sensors placement and maintenance can be issues entering the cost function and, hence, in designing the monitoring program. These costs should be balanced with the needed confirmation that a leak will be detected and the imposed expenses caused by false alarms, i.e., the   Table 4. A. Oleynik, et al. International Journal of Greenhouse Gas Control 95 (2020) 102951 probabilities of false positives and negatives . It is possible to include these considerations into the cost function for the optimization problem. When dealing with fine grids, the cost function as in Eq.
(2) could be of value, as it would secure the same detection outcome if a sensor is moving within the equivalence class.
Obviously, larger area of the footprints will lead to the smaller number of sensors needed, see Figs. 6 and 8 . Thus, adding a tracer to the injected CO2 may be cost efficient. In addition, footprints could be produced in a more sophisticated manner using, for example, machine learning methods .
It is possible to combine different instrumentations, e.g., acoustics, chemical and images, when designing footprints. The complexity of the problem however may increase drastically with the number of footprints and the grid size. Thus, the matrix Ṽ in Eq. (7) could be further simplified by removing some columns corresponding to sets that do not have desired properties, i.e., the sets with too small equivalence class.
Combining fixed measurements with moving platforms are not as straight forward and it would require some effort to establish routines for designing such monitoring platforms. One way would be to design the fixed platforms first, and then add moving platforms to increase our abilities to detect seeps in areas in which the fixed coverage is limited.
As mentioned initially, an important factor when designing the monitoring program is our ability to justify that a seep will be detected and to assist in communicating with stakeholders, governmental bodies and public at large. Collaborating with other offshore operators or activities through data sharing or collaborative surveys will be a win-win situation. The monitoring program can take advantage of existing infrastructure and, as a spin off, the storage project contribute to sustainable management of our oceans.

Table 4
Comparison of the solutions with different cost functions, for the low seepage rate case. The value in boldface in rows 3-6 corresponds to the optimal solution in the sense of the cost function type indicated in the corresponding row. The last row reflects the percentage of the 10 000 random leaks missed by the sensors for each of the solutions. A. Oleynik, et al. International Journal of Greenhouse Gas Control 95 (2020)