Resolution changes relationships: Optimizing sampling design using small scale zooplankton data

Marine research surveys are an integral tool in understanding the marine environment. Recent technological advances have allowed the development of automated or semi-automated methods for the collection of marine data. These devices are often easily implemented on existing surveys and can collect data at finer spatiotemporal resolutions than traditional devices. We used two automated instruments, the Plankton Imager and the FerryBox, to collect information on zooplankton, temperature, salinity and chlorophyll in the Celtic Sea. The resulting data were spatiotemporally aligned and merged to decreasing spatial resolutions to explore how distribution patterns and the relationships between variables change across spatial resolutions. Relative standard deviation was used to describe the variability of merged data within grid cells. All variables displayed large, area-wide spatial patterns, except copepod size, which remained consistent across the study area. Copepod biomass and abundance displayed high variation across small spatial scales. Decreasing the sampling resolution changed the description of the data: small spatial changes (those that occur over scales < 3 km) were lost and area-wide patterns were emphasized. Furthermore, we found that the choice of resolution can affect both the statistical strength and significance of relationships, with high variability at lower resolutions due to the mismatch between the scales of ecological processes and sampling. Determining the optimum sampling resolution to answer a specific question will depend upon several factors, mainly the variable measured, season, location and scale of process, which all drive variation. These considerations should be a key element of survey design, helping move towards an integrated approach for an improved understanding of ecosystem processes and a more holistic description of the marine environment.


Introduction
Research surveys are fundamental in furthering our understanding of the marine environment. Motivated by providing a holistic, ecosystem approach to monitoring (Kupschus et al., 2016) or mandated by policy (European Commission, 2008; Danovaro et al., 2016), technological developments are helping surveys move toward increasingly interdisciplinary approaches (ICES, 2015; Doray et al., 2018). Installing automated technologies, which allow for continuous data collection with little human input, is a straightforward step in achieving this goal. Devices such as the FerryBox, used here, collect both physical and biological variables continuously and report high frequency data throughout a survey (Petersen and Colijn, 2017). Due to their continuous, automated nature, these data are readily available for (near) real-time analysis or retrospective, post-collection analysis. These devices are often easy to implement and can reduce vessel costs (time, fuel or labor) when compared to traditional methods (deployment of nets) or towed imaging devices (e.g., deployment and recovery or reduced vessel speed while towing), and they allow for an increased number of variables to be collected at little or no extra cost. Their use can help surveys stay within financial limitations (Bean et al., 2017; Pitois et al., 2018) and allow for optimized survey design (Kupschus et al., 2016) by easily increasing sampling coverage and intensity (Owens, 2014; Doray et al., 2018). These devices do not require an onboard expert, freeing up vessel space and further reducing costs. They can typically sample in all weather conditions, allowing for data collection in hard-to-sample locations or reducing the time spent waiting for safe sampling conditions.
In recent decades, automated technologies have become commonplace in multiple marine disciplines. The FerryBox is one of many established options for continuous, automated sampling of physical parameters. Acoustic devices are used globally, commercially and scientifically, for fishing (Mann et al., 2008; Simmonds and MacLennan, 2008), marine mammal research (Johnson and Tyack, 2003; Johnson et al., 2009) and bathymetry (de Moustier, 1986). Other purpose-built devices sample a single component, for example, fish eggs (Checkley et al., 2000) or phytoplankton (Olson et al., 2018). The continuous sampling of zooplankton, globally important in carbon cycles (Steinberg et al., 2002; Steinberg and Landry, 2017), fisheries science (Beaugrand et al., 2003; Heath, 2005; Lauria et al., 2013) and used as climate change indicators (Taylor et al., 2002), provides a unique technological challenge arising from the difficulties with sampling the entire zooplankton component accurately. Zooplankton includes a wide range of sizes and behaviors and undergoes rapid temporal and spatial changes, known as plankton patchiness (Mackas et al., 1985; Abraham, 1998), making replicable sampling difficult. These fine-scale changes can be seen in traditional net haul data, which provide a 'snapshot' of the zooplankton but have very high replicate tow variability (Wiebe et al., 1968; Lee and McAlice, 1979; Skjoldal et al., 2013). Capture by netting, and analysis by microscopy, form the gold standard of zooplankton sampling and are principally responsible for our understanding of zooplankton ecology. Their continued use, both to maintain time series and as a reliable method, is essential to further understanding the zooplankton. Like all sampling devices, nets do suffer some limitations. Deploying plankton nets is time consuming and the collected sample is often preserved using hazardous chemicals for analysis on shore. These challenges have placed pressure on developing cost-effective methods (Danovaro et al., 2016) and, in response, technological developments have resulted in a variety of newer devices (Wiebe and Benfield, 2003; Lombard et al., 2019).
The Plankton Imager (PI) is a continuous, automated imaging device used to sample zooplankton. The PI takes images of all passing particles in seawater pumped onboard a ship (Culverhouse et al., 2016; Pitois et al., 2018). An initial study evaluating the first generation of the instrument (previously known as the Plankton Image Analyzer; Pitois et al., 2018) against traditional net sampling found good agreement in the spatial distribution of zooplankton abundances, although it noted that a portion of fragile organisms (e.g., Appendicularia) were likely to be damaged by the system pump and consequently under-sampled. The study also described the overall lower capture efficiency of the PI, with discrepancies mainly resulting from image quality, such as blurred images, which made accurate classification challenging. In response, hardware changes have resolved these issues, resulting in much improved image quality. The PI has since been used to describe temporal changes in the mesozooplankton community (Scott et al., 2021). That study found that fragile species (e.g., Appendicularia) are sampled in sufficient quantity to detect seasonal differences. More recently, the application of the PI to ecological indicators has been tested (Pitois et al., 2021). To date, all published studies have used the PI for point sampling, similar to a deployed ring net, as opposed to continuous sampling. Here we used a new data extraction method to take full advantage of the PI's continuous nature.
The PI has been used alongside the FerryBox routinely during UK fisheries surveys in the Celtic Sea. We use data collected in parallel from these devices to explore small-scale changes in the zooplankton in the context of physical parameters and the relationships therein. As automated devices and the ability to collect vast quantities of data become increasingly commonplace, a new challenge has emerged: the data collection rate has become faster than the processing rate, resulting in data bottlenecks. It is therefore important to focus collection efforts on gathering the correct type of information, at the required locations, times and scales to answer a particular question, balancing research needs with budget limitations. Here, we aim to explore how best to determine the optimal resolution for the target process or relationship, to avoid a mismatch between sampling resolution and ecological scales. The findings can be used to inform future survey design, leading to an increasingly holistic survey description.

Materials and methods
All data were collected in the Celtic Sea from the 3rd of October to the 7th of November 2020 aboard the RV Cefas Endeavour as part of the PELTIC survey (PELagic ecosystems in the Western English Channel and eastern celTIC Seas) (ICES, 2015) (Fig. 1). All in situ data were collected using the ship's continuous flow system sampling at 4 m below sea level. Zooplankton data were collected using the PI (Pitois et al., 2018). Temperature, salinity and fluorescence were collected using the FerryBox (4H-JENA, Germany). Zooplankton data were sampled at night for consistency and to reduce the effect of vertical migration (Lampert, 1989; Pitois et al., 2018).

Plankton Imager (PI)
The PI was connected to the ship's continuous flow pump, sampling at 22 L min⁻¹ with negligible downtime (Fig. 2). The inlet pipe and the ship's internal piping have various internal diameters larger than that of the flow cell, which has an internal depth of 12.8 mm, giving a field of view of 10 µm × 20.48 mm. As seawater passes through the flow cell, all passing particles are imaged by a Basler 2048-70kc line scan camera with a scanning rate of 70,000 lines per second. Lines are then stitched together, and regions of interest (ROI) are extracted and saved as images. GPS, time and particle size data (area, length and width) are saved in the metadata of each image. The PI worked continuously throughout the survey. The PI has adjustable minimum and maximum size parameters (min. 100 µm to max. 2 cm). When using this full range, the processing rate of the images could not keep up with their collection rate (i.e., the images are captured faster than they can be written to disk). For the survey, the size range was therefore set from 180 µm to 2 cm. This reduced the number of captured images and allowed for a more manageable dataset for archiving, processing and analysis. Even with this reduced size range, a 1 month survey typically collects 1 TB of data.
Over 70 million images were collected during the survey. In the absence of an accurate classifier for the PI, all images required manual classification. A series of subsetting steps was used to reduce the number of images classified to an achievable quantity. A 0.25° grid was overlaid on the study area. Each grid cell typically had multiple transects passing through it, with the specific number of transects varying based on vessel movements. Data were extracted from the shortest nighttime transect within each 0.25° cell (min = 20 min, mean = 136 min, max = 420 min). The transect time (and therefore the volume of water sampled) varied between grid cells depending on vessel activities (e.g., steaming between stations or fishing). This extraction resulted in 17 million images for classification. Finally, data were temporally subsampled, with 1 in 10 images extracted from each transect to further reduce the size of the dataset. This process is similar to random subsampling or 'splitting' of a physical sample and assumes the distribution of organisms within a subsample follows a Poisson distribution (Postel et al., 2000). The resultant 1.7 million images were manually classified as "copepod" or "other", with the copepod category including the adult and copepodite stages and the latter category comprising all non-copepod zooplankton and detritus. Sorting to only these categories greatly sped up the classification process. The final copepod count per grid cell was multiplied by 10 to account for subsampling.
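The 1-in-10 subsampling and scale-up described above can be sketched as follows. This is an illustration only, not the extraction code used for the survey: the function names and the example labels are hypothetical, and the 1/sqrt(n) counting error is the standard Poisson approximation rather than a value reported here.

```python
import math

SUBSAMPLE_FACTOR = 10  # 1 in 10 images kept, as described in the text


def subsample(images, factor=SUBSAMPLE_FACTOR):
    """Keep every `factor`-th image of a transect (temporal subsample)."""
    return images[::factor]


def scaled_count(classified_count, factor=SUBSAMPLE_FACTOR):
    """Scale a count made on the subsample back up to the full transect."""
    return classified_count * factor


def poisson_relative_error(classified_count):
    """Approximate relative counting error, 1/sqrt(n), under the Poisson assumption."""
    if classified_count <= 0:
        raise ValueError("need at least one counted individual")
    return 1.0 / math.sqrt(classified_count)


# Hypothetical transect of 1,000 images in which every third image is a copepod.
labels = ["copepod" if i % 3 == 0 else "other" for i in range(1000)]
kept = subsample(labels)
n_copepods = sum(1 for label in kept if label == "copepod")
estimate = scaled_count(n_copepods)  # estimate of copepods in the full transect
```

The sketch also shows why very small counts are a concern: the relative error grows as the number of classified individuals in a subsample falls.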
For statistical analysis, the selected transects were divided into 10 min bins, where each bin sampled 0.22 m³ of seawater. This totaled 853 bins (Fig. 1). The minimum bin size was determined as a compromise between obtaining the smallest possible spatial resolution and sampling a sufficient amount of water to allow for subsampling. Sampling a smaller amount of water (e.g., 1 min and 0.022 m³ of water) may have resulted in unrealistic values when scaling up for subsampling. Copepod density was reported as individuals per m³ (indv. m⁻³). Particle lengths were obtained from image metadata files and used as a proxy for copepod size or total length. Within each 10 min bin, the geometric mean (geomean) size of all individuals was calculated to take into account their non-normal distribution. This mean value was used to calculate mean copepod wet weight (i.e., individual biomass) following the equation from Pitois et al. (2021):
copepod wet weight = 0.299 × (total length)^2.8348
This was then upscaled with copepod density to calculate biomass across the bin, reported as mg m⁻³.
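A minimal sketch of the per-bin calculation follows, using the flow rate, bin length, subsampling factor and length-weight coefficients given in the text. The units of the length-weight equation are not stated above, so the sketch assumes total length in mm and wet weight in mg; the function names are hypothetical.

```python
from statistics import geometric_mean

FLOW_RATE_L_PER_MIN = 22.0  # pump rate stated in the text
BIN_MINUTES = 10.0          # bin duration stated in the text
SUBSAMPLE_FACTOR = 10       # 1 in 10 images were classified


def bin_volume_m3(minutes=BIN_MINUTES, flow=FLOW_RATE_L_PER_MIN):
    """Volume of seawater sampled in one bin (litres converted to m^3)."""
    return minutes * flow / 1000.0


def copepod_density(count_in_subsample, volume_m3, factor=SUBSAMPLE_FACTOR):
    """Copepod density (indv. m^-3), scaled up for subsampling."""
    return count_in_subsample * factor / volume_m3


def wet_weight_mg(total_length_mm):
    """Individual wet weight from total length.

    ASSUMPTION: length in mm and weight in mg; the text gives only the coefficients.
    """
    return 0.299 * total_length_mm ** 2.8348


def bin_biomass(lengths_um, volume_m3, factor=SUBSAMPLE_FACTOR):
    """Biomass (mg m^-3) from the geometric mean length of classified copepods."""
    geomean_mm = geometric_mean(lengths_um) / 1000.0  # µm -> mm
    density = copepod_density(len(lengths_um), volume_m3, factor)
    return wet_weight_mg(geomean_mm) * density


# Hypothetical bin with two classified copepods of 500 µm and 2000 µm.
volume = bin_volume_m3()
biomass = bin_biomass([500.0, 2000.0], volume)
```

With the stated flow rate, a 10 min bin corresponds to 0.22 m³ and a 1 min bin to 0.022 m³, matching the volumes quoted above.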

In situ chlorophyll measurements
The FerryBox consists of a water inlet connected to the ship's continuous flow (Fig. 2). It comprises a suite of sensors for measuring physical variables (e.g., temperature, salinity, turbidity, fluorescence and oxygen) and corresponding metadata (GPS, date and time). All data are automatically bin-averaged to 1 min on collection to save storage space. Only temperature (°C), salinity (psu) and fluorescence were used in this study. Discrete chlorophyll samples were taken from the continuous flow passing through the FerryBox at 22 locations within the study area (Fig. 1). The chlorophyll was extracted using 90 % acetone and measured with a Turner fluorometer (Strickland and Parsons, 1972). A linear model was used to convert the FerryBox chlorophyll fluorescence to chlorophyll. The fitted regression model was:
chlorophyll (mg m⁻³) = 0.72 × chlorophyll fluorescence + 0.0637
The regression was statistically significant (R² = 0.91, F = 206.1, p < 0.001). FerryBox data were spatiotemporally aligned to the 853 10-minute copepod bins described above. This was achieved by taking the mean of each of the three variables across the bin.
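The calibration and alignment steps can be illustrated with a short sketch. The slope and intercept are those of the fitted regression above; the data layout (minute offsets paired with values) and the function names are assumptions made for the example, not the format of the FerryBox output.

```python
from statistics import mean

SLOPE = 0.72        # from the fitted regression reported in the text
INTERCEPT = 0.0637  # from the fitted regression reported in the text


def fluorescence_to_chlorophyll(fluorescence):
    """Convert FerryBox chlorophyll fluorescence to chlorophyll (mg m^-3)."""
    return SLOPE * fluorescence + INTERCEPT


def align_to_bins(records, bin_starts, bin_minutes=10):
    """Average 1 min records that fall inside each copepod bin.

    records: list of (minute, value) pairs (hypothetical layout).
    bin_starts: start minute of each bin.
    Returns one mean per bin, or None for a bin with no records.
    """
    means = []
    for start in bin_starts:
        inside = [v for t, v in records if start <= t < start + bin_minutes]
        means.append(mean(inside) if inside else None)
    return means


# Hypothetical 1 min fluorescence record covering two 10 min bins.
fluorescence = [(t, 0.5 if t < 10 else 1.0) for t in range(20)]
chlorophyll = [(t, fluorescence_to_chlorophyll(f)) for t, f in fluorescence]
binned = align_to_bins(chlorophyll, bin_starts=[0, 10])
```

The same averaging would apply to temperature and salinity, which need no calibration step.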

Analysis
At the finest resolution (the 853 10 min bins), data were plotted to describe the broader spatial patterns and examine small-scale spatial changes in all variables. For copepod biomass, size and density, the change in value between a 10 min bin and the previous 10 min bin was examined to see if there was a relationship between the distance or time between bins and the change in value. For statistical analysis, and to investigate how changes in resolution affect the description of spatial patterns and whether small-scale changes are omitted or accentuated, the bins were merged to decreasing resolutions. Merging was achieved by taking the mean value of all merged bins for each variable. To describe variation within each cell at each resolution, Relative Standard Deviation (RSD) was used, as it expresses the variability of a data set as a percentage relative to its mean. RSD is calculated as: RSD = (sample standard deviation / sample mean) × 100.
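The bin-to-bin comparison can be sketched as below. The text does not say how the distance between bins was computed, so the haversine great-circle distance and the mean Earth radius used here are assumptions, as are the function names and example coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def successive_changes(bins):
    """Pair the distance to the previous bin with the absolute change in value.

    bins: time-ordered list of (lat, lon, value) tuples.
    Returns a list of (distance_km, absolute_change) pairs.
    """
    pairs = []
    for (lat1, lon1, v1), (lat2, lon2, v2) in zip(bins, bins[1:]):
        pairs.append((haversine_km(lat1, lon1, lat2, lon2), abs(v2 - v1)))
    return pairs


# Hypothetical bins: (latitude, longitude, copepod density in indv. m^-3).
example_bins = [(49.0, -7.0, 100.0), (49.0, -7.0, 400.0), (50.0, -7.0, 350.0)]
changes = successive_changes(example_bins)
```

Plotting the absolute change against the distance then shows whether larger jumps are associated with larger separations between bins.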
Four resolutions, 0.1°, 0.25°, 0.5° and 1°, were selected. The coarsest resolution was chosen based on the spatial extent of our study area. A resolution coarser than 1° would have resulted in too few cells or a single cell that contained the entire dataset. The selected resolutions were used to visually compare the changes in the description of spatial patterns associated with merging data to coarser resolutions. The relationship between RSD and decreasing resolution was also explored for all variables at resolutions between 0.01° and 0.9°, in 0.01° increments. Here the mean RSD value across all cells was used. For statistical analyses, the same resolution range (0.01° to 0.9°, in 0.01° increments) was used. Spearman's ρ coefficient was used to test for a significant relationship between the number of stations per cell and RSD, and to explore the relationship between copepod biomass and chlorophyll at these resolutions. Copepod biomass, size and density were log-transformed (log10(x + 1)) for Figs. 3, 4 and 5 to highlight variability.
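Merging bins to a grid and summarizing within-cell variability can be sketched as follows. The cell assignment by flooring latitude and longitude, the minimum of three points per cell for an RSD value, and all names are assumptions made for illustration; the text specifies only that merged values are cell means and that RSD is the sample standard deviation divided by the sample mean, times 100.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

MIN_POINTS = 3  # assumed minimum number of bins for a cell RSD


def rsd(values):
    """Relative standard deviation (%): sample SD / sample mean x 100."""
    return stdev(values) / mean(values) * 100.0


def merge_to_grid(points, resolution_deg):
    """Group (lat, lon, value) bins into square cells of the given size in degrees."""
    cells = defaultdict(list)
    for lat, lon, value in points:
        key = (math.floor(lat / resolution_deg), math.floor(lon / resolution_deg))
        cells[key].append(value)
    return cells


def summarise(points, resolution_deg, min_points=MIN_POINTS):
    """Return cell means, cell RSDs and the mean RSD across cells."""
    cells = merge_to_grid(points, resolution_deg)
    cell_means = {key: mean(vals) for key, vals in cells.items()}
    cell_rsd = {
        key: rsd(vals)
        for key, vals in cells.items()
        if len(vals) >= min_points and mean(vals) != 0
    }
    mean_rsd = mean(list(cell_rsd.values())) if cell_rsd else None
    return cell_means, cell_rsd, mean_rsd


# Hypothetical bins: three close together and one further away.
points = [
    (49.05, -7.05, 10.0),
    (49.06, -7.06, 20.0),
    (49.07, -7.07, 30.0),
    (49.55, -7.55, 100.0),
]
fine_means, fine_rsd, fine_mean_rsd = summarise(points, 0.25)
coarse_means, coarse_rsd, coarse_mean_rsd = summarise(points, 1.0)
```

In this toy example the distant high value falls in its own cell at 0.25° but is averaged into the same cell as the other bins at 1°, which raises the within-cell RSD. Repeating `summarise` over a sequence of resolutions gives the mean RSD per resolution.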

Spatial distribution
Spatial patterns across the study area for copepod density and biomass were closely aligned (Fig. 3A, 3B). Higher copepod densities (> 8000 indv. m⁻³) and biomass (> 150 mg m⁻³) were typically found in the middle of the study area, with lower values found toward the south (< 2000 indv. m⁻³ and < 50 mg m⁻³, respectively) (Fig. 3A, 3B). Density ranged from 45 to 8790 indv. m⁻³ and biomass ranged from < 1 to mg m⁻³. Copepod size had a more uniform distribution across the study area with no obvious spatial patterns, apart from some localized exceptions of larger copepods found in the northernmost extents of the study area (Fig. 3C). Size ranged from 199 to 2590 µm. Large fluctuations in each variable were seen over small spatial scales (between adjacent bins, < 5 km). These were less frequent for copepod size and most evident in the central study area for copepod biomass (Fig. 3A-C).
Small-scale changes were not present in chlorophyll concentration, temperature or salinity (Fig. 3D-F), with these variables displaying more gradual changes across the area. Temperature was higher toward the east of the study area (Fig. 3E) and salinity was higher toward the south (Fig. 3F). Chlorophyll concentration was consistently low (< 0.6 mg m⁻³) except for the most south-westerly extents of the study area, where the maximum value of 1.5 mg m⁻³ was seen (Fig. 3D).
The large variations across small spatial scales in all copepod variables are better highlighted by Fig. 4. The change in value between a 10-minute bin and the previous 10-minute bin was explored for density (Fig. 4A), size (Fig. 4B) and biomass (Fig. 4C). There was no clear relationship between the change in value and the distance from the previous station for any variable at small spatial scales (< 5 km). On the contrary, the highest changes in density and biomass tended to be within 3 km of the previous bin (Fig. 3A, B). This was seen most clearly in adjacent datapoints located in the middle of the study area (Fig. 3B, C).
Fig. 3. Overview of each variable at the finest spatial resolution (10 min bins, approx. 0.22 m³ seawater) as point data, where each point is the bin median latitude and longitude. Point data are highlighted by using Voronoi triangles (with a maximum radius around the point of 0.1°), which allows for a bigger point size while avoiding overlap, to better highlight small-scale changes in the variable.

Description at changing resolutions
Most resolutions captured the broad spatial patterns evident in the smallest resolution for copepod biomass, density and size (Fig. 5, Fig. 3A-C). For example, the regions of low biomass toward the north and higher biomass toward the southwest of the study area (Fig. 3B) were visible at all resolutions (Fig. 5). However, large changes in copepod variables over small spatial scales seen at the smallest resolution were partly lost at a 0.1° resolution and absent entirely at 1° (Fig. 5). This is true for all copepod variables, where a high level of detail was lost by only halving the resolution. For example, the area of low biomass (7°W, 49°N) seen at 0.5° resolution was lost when halving to 1° (Fig. 5, column 1). This loss of small-scale detail while capturing broad patterns with decreased resolution was mirrored by copepod density and size. RSD is shown spatially for the selected resolutions (Fig. 5) and in increasing 0.01° increments in a scatter plot (Fig. 6) for all copepod variables. Using Spearman's ρ, copepod mean density RSD and number of datapoints per cell were consistently significantly related in cells < 0.27° resolution (at 0.26°, Rs = 0.31, p < 0.05, n = 78). For mean biomass RSD, there was consistent significance for cells < 0.4° (at 0.39°, Rs = 0.29, p = 0.05, n = 44). For mean size RSD there was no consistent significant relationship at any resolution. For all three copepod variables, there were exceptions seen at lower resolutions, which may result from an insufficient sample size for the Spearman's ρ test. Copepod sizes were consistently low in RSD (< 30 %) between grid cells, both spatially and across resolutions (Fig. 5, column 3; Supplementary Fig. 1). There was an increase in mean RSD, from 9.96 % to 30.55 %, with decreasing resolution (Fig. 6; Supplementary Fig. 1), although this was marginal when compared to other variables. Biomass had the highest spatial variation in RSD at all resolutions (Fig. 5, column 1; Supplementary Fig. 1). There was a larger increase in mean cell biomass RSD, from 54.1 % to 140.57 %, with decreasing resolution (Fig. 6). Density RSD was more consistent and more closely aligned spatially with biomass than size, and had a reduced mean cell RSD (27.59 % to 79.9 %; Fig. 5, column 2; Supplementary Fig. 1).

Relationship between chlorophyll and copepod biomass at varying resolutions
At the smallest resolution (10 min bins, Fig. 3) there was a weak, significant relationship between chlorophyll and copepod biomass (Rs = 0.3, p < 0.001, n = 823). The relationship between chlorophyll and copepod biomass was tested at resolutions ranging from 0.05° to 0.9°, increasing in steps of 0.01°. The strength of the relationship (Spearman's ρ) and its significance are reported in Fig. 7. The relationship at the smallest gridded resolution (0.05° × 0.05°) was similar to that of the 10-minute bins (p < 0.001, n = 422). When decreasing resolution from 0.05° to 0.25°, there was little variation in the strength of the relationship and all relationships were significant. For lower resolutions, the strength and significance of the relationship between copepod biomass and chlorophyll became increasingly variable. For example, at a resolution of 0.83° the relationship was not significant and had a weak positive correlation (ρ = 0.38, n = 11), while at a resolution of 0.84° there was a strong positive correlation and the relationship was significant (ρ = 0.75, n = 11). For resolutions coarser than 0.9°, there were not enough data points (n < 10) to perform a Spearman's rank analysis (ideally n > 25).
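For reference, Spearman's ρ for a pair of gridded variables can be computed as the Pearson correlation of their ranks, as in the sketch below. It is written with the standard library only and does not compute a p-value; the function names and example values are hypothetical, and the cut-off of n < 10 mirrors the threshold mentioned above.

```python
import math

MIN_N = 10  # the text does not run the test when n < 10


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y, min_n=MIN_N):
    """Spearman's rho, or None when there are too few cells to test."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < min_n:
        return None
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical cell means at one resolution (11 cells).
chlorophyll = [0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9, 1.0, 1.2, 1.5]
biomass = [12.0, 30.0, 8.0, 45.0, 20.0, 60.0, 55.0, 90.0, 40.0, 150.0, 110.0]
rho = spearman_rho(chlorophyll, biomass)
```

With only around 11 cells, as at the coarsest resolutions, moving a single cell's rank can shift ρ considerably, which is consistent with the variability reported for resolutions coarser than 0.25°.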

Application to other variables
The spatial distribution of chlorophyll concentrations is presented in Fig. 8 to demonstrate merging of other variables to a decreasing spatial resolution. Chlorophyll concentrations had a broader spatial pattern, where changes occurred over larger distances, than all copepod variables (Fig. 3A-C). These patterns are well captured in all selected resolutions (Fig. 8; Supplementary Fig. 2). There are no small-scale changes in chlorophyll concentration (Fig. 5), which was reflected in a lower, consistent RSD both spatially and across resolutions (Fig. 8; Supplementary Fig. 2). The area with the highest chlorophyll concentration, toward the southwest of the study area, also had the highest variation between adjacent cells, which was in turn reflected by a higher RSD (Fig. 8; Supplementary Fig. 2). Temperature and salinity (Supplementary materials) displayed similar results due to the absence of small spatial changes and broad, slower changes across the study area (Fig. 3E, 3F).
Fig. 5. 10 min bins merged to decreasing resolution for (column 1) copepod biomass (mg m⁻³) (log(x + 1)), (column 2) copepod density (indv. m⁻³) (log(x + 1)) and (column 3) copepod size (µm) (log(x + 1)) for example resolutions (row 1) 0.1°, (row 2) 0.25°, (row 3) 0.5° and (row 4) 1°. Copepod color scales are the same as Fig. 3 for comparison. The cell border color indicates relative standard deviation (RSD, %) for the cell. Those cells without a border contain fewer than 3 data points. RSD color scale is the same for Figs. 5 and 8. For clarity, RSD is also shown alone for this figure in Supplementary Fig. 1.

Discussion
The use of continuous instruments allowed for data to be obtained at small spatial scales, which in turn captured both wider spatial patterns and small-scale changes in copepod size, abundance and biomass. The small-scale changes in copepod abundance, indicative of plankton patchiness (Mackas et al., 1985; Abraham, 1998), are not seen in the physical variables, where patterns are study-area wide. The physical oceanography of the Celtic Sea and Western Approaches, a seasonally stratified area, is well documented (Pingree et al., 1976; Pingree, 1980; Southward et al., 2004; Smyth et al., 2015). Stratification is known to influence plankton abundances (Fransz et al., 1984; Hure et al., 2022), but the absence of vertical data in this study does not allow for discussion of stratification or its influence on copepod abundance.
Neither surface temperature nor salinity appeared correlated with copepod variables. However, the absence of a correlation between zooplankton and physical variables is in line with our understanding that small-scale variations in the plankton are driven by a complex series of biological and physical interactions. These were reviewed by Atkinson et al. (2018), using a single point time series (L4 buoy) located in our study area. Average annual densities from the Continuous Plankton Recorder (CPR; Richardson et al., 2006), reported by Johns (2006), show the majority of copepod families in lower abundance off the north coast of Cornwall. Although our data only cover 1 month, we find a similar spatial distribution, suggesting that the structure of zooplankton communities within a specific area remains similar both in time and space. The area-wide patterns for copepod densities also match those of a previous study for the region using the PI (Pitois et al., 2021). Despite the lower taxonomic resolution obtained from image identification compared to microscope identification, another study using the PI found that the community structure described by the PI is broadly in line with the L4 and CPR (Scott et al., 2021).
As machine learning classifiers for plankton identification from images collected with automated instruments improve in accuracy (The Turing Centre, 2021), it will be possible to discern zooplankton to increasing taxonomic resolution automatically. Added to the PI, this feature will allow for the removal of the subsampling step that is necessary when manually processing the images. Thus, it will be possible to obtain zooplankton data at higher taxonomic and spatial resolutions, quickly, and at a much lower cost compared to traditional methods. This will be a clear advantage of such systems. Although using the PI has the potential to yield an unprecedented spatial resolution, it cannot replicate the temporal resolution associated with devices such as the CPR or longstanding time series such as L4. This is due to the PI's reliance on the vessel on which it is deployed, as it is unrealistic to expect a vessel to survey the same area repeatedly over long periods of time.

Optimizing survey design
Survey demands often result in ad hoc, last-minute changes, reducing assurances of sampling the same spot at a consistent temporal resolution. Thus, a multi-method approach would yield the most complete description of the zooplankton. On the one hand, deploying plankton nets on vessels can help understand the vertical distribution of the plankton, whilst time series are invaluable to understand seasonal and long-term changes (Pitois and Yebra, 2022). On the other hand, the small-scale fluctuations in the plankton, and what drives the high variation between neighboring water parcels, can be better understood using continuous data. Although data here were subsampled and manually classified, which limited the minimum achievable spatial resolution, the findings demonstrate the potential for these instruments to resolve the fine-scale interactions driving variation. Furthermore, they demonstrate how the choice of resolution can affect the perceived picture of the plankton as well as relationships between plankton and related variables. Decreasing resolution can result in patterns being emphasized (e.g., chlorophyll) or small-scale changes being lost (e.g., copepod biomass). This demonstrates the 'risk' of a decreased sampling resolution in misrepresenting or incorrectly capturing trends. The variability within cells when merged to a decreasing resolution is not seen by an increased RSD, suggesting RSD is not sensitive to extreme values if the remainder of the merged cells are consistent. We can expect the changes in the data representation with decreased resolution to be reflected in statistical relationships between variables. Here, we chose to look at copepod biomass and chlorophyll concentrations. Chlorophyll data are readily available as a remote sensing package (Aumont et al., 2015) and many large-scale models rely on these data, inferring prey or carbon from chlorophyll (Landry, 1976; Carlotti and Poggiale, 2010). The relationships between chlorophyll and zooplankton are complex and reported relationships are inconsistent in the literature (Casini et al., 2008; Llope et al., 2012; Schultes et al., 2013; Giering et al., 2019). This variation partly stems from the different types of data and spatiotemporal scales used between authors (Pitois et al., 2021). In our study, we find high variation in the strength and statistical significance of the relationship resulting only from changing the spatial sampling resolution. Although all correlations are positive, we find inconsistency in both the significance and strength of the correlation at lower resolutions. It is likely that even finer resolution data, achieved through removal of subsampling, will yield the most accurate description of these relationships.
Sampling to the finest possible resolution may not be necessary or relevant to the survey's aim; rather, the choice of resolution, whether in space or time, should match the process studied. A sampling resolution that is too fine could incur unnecessary costs (in data storage and processing) and may not be needed to accurately capture large-scale ecological patterns. For example, a coarser resolution than presented here (2° cells) has been used to successfully capture changes in copepod abundances over time as well as to investigate their relationship with various physical variables (Bedford et al., 2020). Conversely, too coarse a resolution may miss ecological processes that occur on scales finer than the selected sampling resolution. For example, collecting samples at a specific location once a year (temporal resolution) will not capture seasonal variability.
For our descriptive study we find a spatial resolution of 0.25° to be a good compromise between capturing small-scale changes and broader spatial patterns for copepod abundance and biomass. This resolution was at the coarser end of those resolutions that had consistency in the statistical relationship between copepod biomass and chlorophyll. Additionally, this resolution can be easily matched to existing products for modelling. For example, a remote sensing ecosystem model output has a spatial resolution of 0.25° × 0.25° (Aumont et al., 2015). However, each study will likely demand a different resolution dependent on the scale of the processes involved. Take, for example, a survey designed to study the timing and location of a specific fish spawning and the ecological processes affecting this (assuming both phyto- and zooplankton are variables measured). Prior knowledge will be used to select the overall location and timing of this survey as well as the parameters to measure. At this point, the resolution of the measured processes and variables should be taken into consideration. If the survey occurs during the winter months, when there is little activity in the plankton ecosystem, sampling these components at very fine temporal and/or spatial resolution is unlikely to be necessary. If, however, that survey occurs during the phytoplankton bloom, a time of fast change within the plankton ecosystem, then a finer resolution that matches the scale of these processes will need to be selected to accurately capture the changes. Similarly, sampling intensity can be adjusted during the course of the survey when and if changes are noticed.
Surveys tend to be designed to collect chosen parameters at preselected locations, usually as many as can be covered given the time and budget available. In future, as automated tools become commonplace, optimizing survey design will require combining different instruments that collect complementary information. For example, automated devices such as the PI to collect surface data, alongside plankton nets to collect vertical data, would allow a more comprehensive description of the ecosystem studied. Automated tools, and the resolution they yield, may also help to quantify the variability associated with replicate tows resulting from plankton patchiness and be used to better understand its drivers (Wiebe et al., 1968; Lee and McAlice, 1979; Skjoldal et al., 2013). In theory, this could be achieved with the data presented here by fitting a linear regression to the relationship between RSD and resolution (Fig. 6). Such a relationship would, however, be specific to this area and season, a 'survey snapshot' (Huret et al., 2018).
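The regression suggested above could be sketched as an ordinary least-squares fit of RSD against cell size. The example below is only a sketch under stated assumptions: `fit_line` and `predict_rsd` are hypothetical helper names, a straight-line relationship is assumed rather than demonstrated, and the numbers in the usage lines are arbitrary placeholders, not values read from Fig. 6.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept). Raises ValueError if the inputs differ
    in length, hold fewer than two points, or x has no spread.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_rsd(resolution_deg, slope, intercept):
    """RSD expected at a given cell size under the fitted line."""
    return intercept + slope * resolution_deg


# Placeholder inputs: cell sizes (degrees) and mean RSD (%) per cell size.
# These numbers are invented for illustration only.
resolutions = [1.0, 2.0, 3.0]
mean_rsd = [3.0, 5.0, 7.0]

slope, intercept = fit_line(resolutions, mean_rsd)
expected = predict_rsd(4.0, slope, intercept)
```

A fit of this kind describes only the survey it was derived from, so extrapolating beyond the range of sampled resolutions, or to another area or season, would need separate justification.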
The small-scale resolution presented here, and the future potential for even finer resolution of biological parameters achieved by reducing or eliminating subsampling, has only recently become possible with devices such as the PI. For this study, the time constraints associated with manually sorting images and the choice to subset the data to the study area meant that only a small portion of the entire survey data was used (~2.4 %). This constraint does not apply to other variables, such as physical parameters, where little or no sample processing is required and all data are instantly available. Recent improvements in classification algorithms (The Turing Centre, 2021) will bring the PI up to speed. Machine learning can eliminate the need for manual classification, and thus the PI will be capable of providing continuous data at very fine resolutions (meters and minutes) without the need for subsampling. Before these solutions can be implemented, however, several challenges must be overcome, mainly related to the inability of the PI (and other similar systems) to process data as fast as they can collect them. These devices entail a phenomenal data collection rate. For example, if we were to use the full size range the PI can image, we would collect up to 1 TB of data in < 10 min. This is currently not feasible, as the technology and protocols to write images this fast do not yet exist. While the survey data totaled 2 TB for this study, we have made provision for 10 TB of data for the same survey to take place in 2023. The costs of storage and computing are falling, but they must also form part of survey planning.
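The data volumes quoted above can be turned into a rough planning calculation. The sketch below converts "1 TB in 10 min" into a sustained write rate and scales it to a period of continuous imaging. It assumes decimal units (1 TB = 1000 GB), and the function names and the one-hour period are illustrative choices rather than survey specifications. Because the text gives "up to 1 TB" in "< 10 min", the figure describes peak operation over the full size range and is best read as an order of magnitude.

```python
def write_rate_gb_per_s(volume_tb, minutes):
    """Sustained write rate (GB/s) needed to store `volume_tb` terabytes
    in `minutes` minutes, assuming 1 TB = 1000 GB."""
    return volume_tb * 1000.0 / (minutes * 60.0)


def storage_tb(rate_gb_per_s, hours):
    """Storage (TB) filled by writing at `rate_gb_per_s` for `hours`."""
    return rate_gb_per_s * hours * 3600.0 / 1000.0


# 1 TB in 10 min corresponds to roughly 1.67 GB/s.
peak_rate = write_rate_gb_per_s(volume_tb=1.0, minutes=10.0)

# Hypothetical planning case: one hour of continuous full-range imaging.
one_hour_tb = storage_tb(peak_rate, hours=1.0)

print(f"peak rate: {peak_rate:.2f} GB/s")
print(f"storage for one hour: {one_hour_tb:.1f} TB")
```

At that rate, a single hour of full-range imaging would fill about 6 TB, which is already more than half of the 10 TB provisioned for the 2023 survey.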

Conclusion
We demonstrate the importance of sampling resolution in the context of pelagic studies, and how it affects both the relationships between selected measured parameters and the picture of the ecosystem that results. The increasing use of automated and semi-automated technologies allows us to sample at a much finer resolution than previously possible across much larger spatial scales. This is especially true for zooplankton, where the PI has the potential to provide unprecedentedly fine spatial data at moderate taxonomic resolution. Sampling resolution for each measured process will therefore need to be considered as part of optimum survey design. This is to ensure that sampling matches the resolution of the measured process at a specific place and time, and that only necessary data are collected, so as to remain within the survey's budgetary constraints. Integrating data collected from various instruments (both traditional and novel) will help to optimize sampling resolution for an improved understanding of ecosystem processes and, ultimately, a more holistic view of marine ecosystems in all dimensions.
responsible for the development of the PI technology and contributed to the writing of the manuscript. GM contributed to data interpretation and to writing and editing the manuscript.

Funding
through the EnvEast Doctoral Training Partnership (Grant No NE/L002582/1); co-funded by Cefas as part of the Cefas-UEA Strategic Alliance. The PELTIC survey is supported by the UK Government Department for Environment, Food and Rural Affairs (DEFRA) under the EU Data Collection Framework with contributions from grant BX013 ("Extension of the pelagic survey, Peltic, to map and quantify the pelagic fish resources in the SW of the UK"). DEFRA have not contributed to any of the science.

Declaration of Competing Interest
Authors Phil Culverhouse and Julian Tilbury are employed by Plankton Analytics Ltd, which manufactures the Plankton Imager. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Fig. 1. Celtic Sea study area and spatial extent of the collected data. Red filled symbols represent in situ discrete chlorophyll samples. Black open symbols represent PI and FerryBox (temperature, salinity, fluorescence) 10-minute bins. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 2. Schematic of the Plankton Imager (PI) and FerryBox setup aboard the RV Cefas Endeavour. Water is pumped onboard from 4 m below sea level (A). This supplies the PI (B) and FerryBox (C). Within the PI, water flows through a flow cell (D), where passing particles are imaged by a line scan camera (E). Within the FerryBox, water passes through a suite of sensors (F); here, temperature, salinity and fluorescence are used.