High-resolution analysis of a North Sea phytoplankton community structure based on in situ flow cytometry observations and potential implication for remote sensing

Phytoplankton observation in the ocean can be a challenge in oceanography. Accurate estimations of its biomass and dynamics will help to understand ocean ecosystems and refine global climate models. Relevant data sets of phytoplankton defined at a functional level and on a sub-meso-

properties of water masses by converting most of the inorganic matter into available organic matter (nitrogen, phosphate, silicate, sulfur, iron) and determining the structure of the trophic status of marine environments. Given this importance, it is insufficient to use a single proxy, such as chlorophyll a measurements, for quantifying and qualifying phytoplankton over large scales when attempting to understand its role in biogeochemical processes (Colin et al., 2004). Such a proxy does not reflect changes in community structure (Hirata et al., 2011) and does not yield robust biomass estimations (Kruskopf and Flynn, 2006). Yet this classical proxy is frequently used to study the spatial and temporal variability of phytoplankton from both remotely sensed and in situ measurements. Le Quéré et al. (2005) pointed out the importance of taking into account the functionality of phytoplankton species when considering the influence of phytoplankton community structure on biogeochemical processes. This functionality concept (i.e. phytoplankton functional types, PFTs) is described as a set of species sharing similar properties or responses in relation to the main biogeochemical processes such as the N, P, Si, C and S cycles (diazotrophs for the N cycle such as cyanobacteria, dimethylsulfoniopropionate producers for the S cycle such as Phaeocystis, silicifiers for the Si cycle such as diatoms, calcifiers for the C cycle such as coccolithophorids, size classes mainly used for the C cycle).
Representative data sets of phytoplankton functional types, size classes and specific chlorophyll a concentrations are the subject of active research using high-frequency, in situ dedicated analysis from automated devices such as spectral fluorometers, particle scattering and absorption spectra recording instruments, or automated and remotely controlled scanning flow cytometers (SFCs). Among the high-frequency in situ techniques used to quantify phytoplankton abundance, community structure and dynamics, SFC is the most advanced instrument, counting and recording cell optical properties at the single-cell level. This technology has recently been adapted for the analysis of almost all the phytoplankton size classes and focuses on the resolution of phytoplankton community structure dynamics (Dubelaar et al., 1999; Olson et al., 2003; Sosik et al., 2003; Thyssen et al., 2008a, b). In parallel, algorithms applied to remote sensing data have been developed which are dedicated to characterizing phytoplankton groups, PFTs or size classes (Sathyendranath et al., 2004; Ciotti et al., 2006; Nair et al., 2008; Aiken et al., 2008; Kostadinov et al., 2009; Uitz et al., 2010; Moisan et al., 2012). One of these algorithms, PHYSAT, has provided a description of the dominant phytoplankton functional types (Le Quéré et al., 2005) for open waters on a global scale, leading to various studies concerning the PFT variability (Alvain et al., 2005, 2013; Masotti et al., 2011; Demarcq et al., 2011; Navarro et al., 2014). PHYSAT relies on the identification of water-leaving radiance spectra anomalies, empirically associated with the presence of specific phytoplankton groups in the surface water. The anomalies were labelled by comparison with high-pressure liquid chromatography (HPLC) biomarker pigment matchups.
To date, six dominant phytoplankton functional groups in open waters (diatoms, nanoeukaryotes, Prochlorococcus, Synechococcus, Phaeocystis-like cells, coccolithophorids) have been found to be significantly related to specific water-leaving radiance anomalies from SeaWiFS (Sea-viewing Wide Field-of-view Sensor) sensor measurements at a resolution of 9 km (Alvain et al., 2008). These relationships have been verified by theoretical optical models (Alvain et al., 2012). This theoretical study also showed that additional groups or assemblages could be added in the future, once accurate in situ observations are available.
Describing the community structure on a regional scale will give better quantification and understanding of the phytoplankton responses to environmental change and, consequently, support the modification of theoretical considerations regarding energy fluxes across trophic levels. It is critical for understanding community structure interactions, particularly when it is necessary to take into account the mesoscale structure in a specific area (D'Ovidio et al., 2010), which is the case in areas under the influence of regional physical forcing such as the English Channel and the North Sea. Long-term changes detected in these regions have been shown to impact local ecosystem functioning by inducing, for instance, a shift in the timing of the spring bloom (Wiltshire and Manly, 2004; Sharples et al., 2009; Vargas et al., 2009) or specific migrations of regional (Gomez and Souissi, 2007) or dominant phytoplankton groups (Widdicombe et al., 2010). In addition, hydrodynamic conditions have been shown to play a strong role in the phytoplankton distribution on a regional scale (Gailhard et al., 2002; Leterme et al., 2008). It is therefore crucial to develop specific approaches to characterize the phytoplankton community structure (beyond global-scale dominance) and its high-frequency variation in time and space. In order to achieve this, large data sets of in situ analyses resolving PFTs are essential for specific calibration and validation of regional remote sensing algorithms such as PHYSAT. Flow-through analysis of surface water properties for remote sensing calibration optimizes the number of matchups (Werdell et al., 2013; Chase et al., 2013). For the purpose of collecting high-resolution in situ data describing phytoplankton community structure, automated SFC technology allows samples to be collected at high frequency, resolving hourly and kilometre scales with a completely automated system.
The instrument enables single-cell analysis of phytoplankton from 1 to 800 µm and several millimetres in length for chain-forming cells, and automated sampling allows large space and time domains to be covered at a high resolution (Thyssen et al., 2008b, 2009; Ribalet et al., 2010).
Based on this approach, a high-frequency study of the phytoplankton community structure in the North Sea was conducted. The in situ observations from SFC have been used for the first time, as a first trial, to label PHYSAT anomalies detected during the sampling period. Thus, the available data set makes it possible to distinguish between different water-leaving radiance anomaly signatures in which significantly distinct phytoplankton community structures can be described, rather than just the dominant communities, as was the case in previous studies. Our results are an improvement over conventional approaches as they allow the distribution of phytoplankton community structure to be characterized at a high resolution, from both in situ and day-to-day water-leaving radiance anomaly maps specific to the study area.

Materials and methods
Samples were collected during the PROTOOL/DYMAPHY project cruise on board the RV Cefas Endeavour from 8 to 12 May 2011 in the south-west region of the North Sea (Fig. 1). Automated coupled sampling using a Pocket-FerryBox (PFB) and a Cytosense scanning flow cytometer (SFC, Cytobuoy b.v.) started on 8 May at 09:00 UTC and ended on 12 May at 04:00 UTC. Water was continuously collected from a depth of 6 m and entered the PFB at a maximum pressure of 1 bar. Subsurface discrete samples were collected using Niskin bottles on a rosette and analysed using a second Cytosense SFC (stations 4, 6 and 13 were used in this paper, Fig. 1).

Phytoplankton community structure from automated SFC
Phytoplankton abundance and group description were determined by using two Cytosense SFCs (Cytobuoy b.v.): one was fixed close to the PFB and sampled the continuous flow of pumped sea water, and the second was used to collect pictures from discrete samples. These instruments are dedicated to phytoplankton single-cell recording, enabling cells from 1 to 800 µm and several millimetres in length to be analysed routinely in 1-10 cm³ of sea water. Each single cell or particle in suspension in the solution passes through the laser beam thanks to the principle of hydrodynamic focusing. The instrument then records the resulting optical pulse shapes and counts each single particle.

Automation of the continuous flow sampling
Automated measurements were run from the continuous flow of sea water passing through the PFB. Samples for SFC were automatically collected from a 450 cm³ sampling unit where water from the continuous flow was periodically stabilized. This sampling unit was designed to collect bypass water from the 1 bar PFB inlet. The sampling unit water was replaced within a minute. One of the Cytosense instruments was directly connected to the sampling unit and two successive analyses with two distinct protocols were scheduled automatically every 10 min.

Flow cytometry analysis
A calibrated peristaltic pump was used to estimate the analysed volumes and send the sample to the SFC optical unit. Suspended particles were then separated using a laminar flow and subsequently crossed a laser beam (Coherent Inc.; 488 nm, 20 mW). The instrument recorded the pulse shapes of forward scatter (FWS) and sideward scatter (SWS) signals as well as red, orange and yellow fluorescence (FLR, FLO, FLY respectively) signals for each chain or single cell. The Cytosense instrument was equipped with two sets of photomultiplier tubes (PMTs) (high- and low-sensitivity modes), resolving a wider range of optical signals from small (∼ < 10 µm) to large particles (∼ < 800 µm). Two trigger levels were applied on the high-sensitivity PMT to discriminate highly concentrated eukaryotic picophytoplankton and cyanobacteria (trigger level: FLR 10 mV; acquisition time: 180 s; sample flow rate: 4.5 mm³ s⁻¹) from less concentrated nano- and microphytoplankton (trigger level: FLR 25 mV; acquisition time: 400 s; sample flow rate: 9 mm³ s⁻¹). Setting the trigger on red fluorescence was preferred to the commonly used FWS or SWS triggering as a trade-off between representative phytoplankton data sets and non-fluorescing particles/noise recording, but this procedure affected the SWS and FWS pulse shapes to some extent. To ensure good control and calibration of the instrument settings, a set of spherical beads with different diameters was analysed daily. This allowed the definition of estimated-size calibration curves between total FWS (in arbitrary units, a.u.) and actual bead size. This set of beads included 1, 6, 20, 45 and 90 µm yellow-green fluorescence Polyscience Fluoresbrite microspheres; 10 µm orange fluorescence Invitrogen polystyrene FluoroSpheres; and 3 µm 488 nm Cyto-cal™ alignment standards.
To correct for the high refractive index of polystyrene beads, which leads to an underestimation of cell size, we defined a correcting factor by using 1.5 µm silica beads (Polyscience, silica microspheres; Foladori et al., 2008). The phytoplankton community was described using several two-dimensional cytograms built with the Cytoclus® software. For each autofluorescing phytoplankton cell analysed, the integrated value of the FLR pulse shape (total red fluorescence (TFLR), in a.u.) was calculated. For each phytoplankton cluster, the amount of TFLR is reported per unit volume (TFLR cm⁻³, a.u. cm⁻³). The TFLR cm⁻³ of each resolved phytoplankton cluster was summed (total TFLR cm⁻³) and was used as a proxy for chlorophyll a concentration. The TFLR signal was corrected for saturation of the high-sensitivity PMT in the case of highly fluorescing cells (> 4000 mV) thanks to the low-sensitivity PMTs, which behaved linearly with the high-sensitivity PMT, allowing the reconstruction of the high-sensitivity signal. Discrete samples were collected during the cruise and analysed using a second Cytosense SFC equipped with the image-in-flow system. The samples were analysed using settings similar to those of the Cytosense coupled to the PFB. The number of pictures was determined before each sample acquisition and pictures were randomly collected within the largest particles until the predetermined number of pictures was reached.
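The per-volume quantities described above can be sketched in a few lines. This is only an illustration: it assumes that the analysed volume can be approximated as the nominal flow rate multiplied by the acquisition time (the text states that the volume was estimated with a calibrated peristaltic pump), and the per-cell TFLR values and the function names `analysed_volume_cm3` and `tflr_per_cm3` are hypothetical.

```python
def analysed_volume_cm3(flow_rate_mm3_s, acquisition_time_s):
    """Nominal analysed volume: flow rate x acquisition time (1 cm3 = 1000 mm3)."""
    return flow_rate_mm3_s * acquisition_time_s / 1000.0


def tflr_per_cm3(cell_tflr_au, volume_cm3):
    """Summed integrated red fluorescence of a cluster's cells, per unit volume."""
    return sum(cell_tflr_au) / volume_cm3


# Protocol settings quoted in the text.
pico_volume = analysed_volume_cm3(4.5, 180)        # FLR trigger 10 mV protocol
nano_micro_volume = analysed_volume_cm3(9.0, 400)  # FLR trigger 25 mV protocol

# Hypothetical per-cell TFLR values (a.u.), for illustration only.
clusters = {
    "PicoRED": ([12.0, 15.0, 9.0], pico_volume),
    "Micro1": ([900.0, 1100.0], nano_micro_volume),
}

# TFLR cm-3 per cluster, then the total used as a chlorophyll a proxy.
cluster_tflr = {
    name: tflr_per_cm3(cells, volume) for name, (cells, volume) in clusters.items()
}
total_tflr = sum(cluster_tflr.values())

print(pico_volume, nano_micro_volume)
print(cluster_tflr, total_tflr)
```

With the settings quoted in the text, the nominal volumes are about 0.81 cm³ for the picophytoplankton protocol and 3.6 cm³ for the nano- and microphytoplankton protocol.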

Temperature and salinity
The PFB (4H-JENA © ) was fixed on the wet laboratory bench, close to the Cytosense, in order to share the same water inlet. This instrument recorded temperature and conductivity (from which salinity was computed) from the clean water supplied by the ship's seawater pumping system at a frequency of one sample every minute.
Within the PFB data set, only data related to automated SFC analyses were selected for plotting temperature-salinity diagrams.

Chlorophyll a
Samples for HPLC analyses and bench-top fluorometry (Turner® fluorometer) were collected randomly within 6 h periods before or after the expected overpass of the MODIS (Moderate Resolution Imaging Spectroradiometer) sensor on board Aqua (12:30 UTC) to fulfil classical requirements in terms of in situ and remotely sensed matchup criteria. Samples were collected from the outlet of the PFB, filtered onto GF/F filters and stored directly in a −80 °C freezer. The HPLC analyses were run on an Agilent Technologies 1200 series instrument. Pigments were extracted using 3 cm³ of ethanol containing vitamin E acetate as described by Claustre et al. (2004) and adapted by Van Heukelem and Thomas (2001). For bench-top fluorometry, the filters were subsequently extracted in 90 % acetone. Chlorophyll a (Chl a) concentration was evaluated by fluorometry using a Turner Designs model 10AU fluorometer (Yentsch and Menzel, 1963). The fluorescence was measured before and after acidification with HCl (Lorenzen, 1966). The fluorometer was calibrated using known concentrations of commercially purified Chl a (Sigma-Aldrich®).
The PFB was equipped with a multiple fixed-wavelength spectral fluorometer (AOA fluorometer, bbe © ) sampling once every minute to obtain Chl a values.
MODIS Chl a values corresponded to level-3 binned data consisting of the accumulated daily level-2 data with a 4.6 km resolution.

Mixed layer depth
Daily water column temperature mapping was obtained from the Forecasting Ocean Assimilation Model 7 km Atlantic Margin model (FOAM AMM7), available from the MyOcean database (http://www.myocean.eu.org/). Model output temperature depths were as follows: 0, 3, 10, 15, 20, 30, 50, 75, 100, 125 and 150 m. Average mixed layer depth (MLD) on the five sampling days was calculated from daily temperature data sets. MLD was defined as the depth associated with an observed temperature difference of more than 0.2 °C with respect to the surface (defined at 10 m; de Boyer Montégut et al., 2004).
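The threshold criterion above can be illustrated with a short sketch. It makes several assumptions that the text does not state: the temperature difference is taken as an absolute value, the MLD is reported as the first model level that exceeds the threshold (no interpolation between levels), and the temperature profile is invented for illustration. The function name `mixed_layer_depth` is hypothetical.

```python
def mixed_layer_depth(depths_m, temps_c, ref_depth_m=10.0, threshold_c=0.2):
    """Return the shallowest level below the reference depth where the absolute
    temperature difference from the reference exceeds the threshold.
    Returns None when no level exceeds it (column mixed down to the last level)."""
    profile = sorted(zip(depths_m, temps_c))
    ref_temp = next(t for z, t in profile if z == ref_depth_m)
    for z, t in profile:
        if z > ref_depth_m and abs(t - ref_temp) > threshold_c:
            return z
    return None


# Model output depths listed in the text (m).
MODEL_DEPTHS = [0, 3, 10, 15, 20, 30, 50, 75, 100, 125, 150]

# Hypothetical temperature profile (degrees C), for illustration only.
temps = [10.9, 10.9, 10.8, 10.7, 10.5, 9.9, 9.2, 8.8, 8.6, 8.5, 8.5]

mld = mixed_layer_depth(MODEL_DEPTHS, temps)
print(mld)
```

Because the model levels are coarse, a level-based estimate of this kind can only take the values of the listed output depths.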

Matching method between in situ and remotely sensed observations for phytoplankton community structure
The PHYSAT approach is based on the identification of specific signatures in the water-leaving radiance (nLw) spectra measured by an ocean colour sensor. It is described in detail by Alvain et al. (2005, 2008). Briefly, this empirical method was first established by using two kinds of simultaneous and coincident measurements: nLw measurements and in situ measurements of diagnostic phytoplankton pigments. The presence of a specific phytoplankton group was established based on pigment analysis. In a first step, this approach allowed the detection of four dominant phytoplankton groups (diatoms, nanoeukaryotes, Synechococcus and Prochlorococcus) identified within the available in situ data set from the pigment inventories, and only in cases where they were dominant. Note that "dominant" here is used following the definition by Alvain et al. (2005) as situations in which a given phytoplankton group is a major contributor to the total diagnostic pigments. This represented a limitation in using other potential phytoplankton in situ analysis. In a second step, coincident remotely sensed nLw spectra between 412 and 555 nm were transformed into specific normalized water-leaving radiance, or radiance anomaly (RA), spectra in order to evidence the second-order variability of the satellite signal. This was done by dividing the actual nLw by a mean nLw model (nLw ref), which depends only on the standard Chl a. Then, coincident nLw spectra and in situ analysis were used to show that every dominant phytoplankton group sampled during in situ sampling is associated with a specific RA spectrum in terms of shape and amplitude. Based on this, a set of criteria has been defined in order to characterize each group as a function of its RA spectrum, first by a minimum and maximum values approach and more recently using neural network classification tools (Ben Mustapha et al., 2014).
These criteria can be applied to global daily archives to get global maps of the most frequent group of dominant phytoplankton. When no group prevails over the month, the pixels are associated with an "unidentified" phytoplankton group.
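As a minimal sketch of the two operations described above, the anomaly can be computed band by band as the ratio of the measured nLw to the reference nLw, and then compared with per-band minimum and maximum criteria. All numbers below are invented for illustration; the reference spectrum would in practice come from the Chl a-dependent nLw model, which is not reproduced here, and the function names are hypothetical.

```python
def radiance_anomaly(nlw, nlw_ref):
    """RA at each band: measured nLw divided by the reference nLw for the same Chl a."""
    return [observed / reference for observed, reference in zip(nlw, nlw_ref)]


def matches_criteria(ra, ra_min, ra_max):
    """True when the RA lies within [min, max] at every band."""
    return all(low <= value <= high for value, low, high in zip(ra, ra_min, ra_max))


# Hypothetical five-band spectra (arbitrary radiance units), for illustration only.
nlw = [1.2, 1.0, 0.8, 0.5, 0.3]
nlw_ref = [1.0, 1.0, 1.0, 0.5, 0.25]

ra = radiance_anomaly(nlw, nlw_ref)

# Hypothetical per-band criteria for one anomaly type.
ra_min = [1.0, 0.9, 0.7, 0.9, 1.0]
ra_max = [1.4, 1.1, 0.9, 1.1, 1.3]

print(ra, matches_criteria(ra, ra_min, ra_max))
```

A pixel whose RA falls outside the bounds at even one band would not be assigned to that anomaly type under this minimum and maximum approach.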
In this study, remotely sensed observations were selected on the basis of quality criteria that ensured a high degree of confidence in PHYSAT as described in Alvain et al. (2005). Thus, pixels were only considered when clear-sky conditions were found and when the aerosol optical thickness, a proxy for the quality of the atmospheric correction steps, was lower than 0.15. The effects of sediments and/or coloured dissolved organic matter (CDOM) were minimized by focusing on phytoplankton-dominated waters as defined from the optical typology described in Vantrepotte et al. (2012). Waters classified as turbid were therefore excluded from the empirical relationship since the PHYSAT method is currently not available for such areas. Waters classified as non-turbid using the same criteria were selected and the PHYSAT algorithm applied to them. To link coincident in situ and remotely sensed observations, a matchup exercise was carried out. Matching points between in situ SFC samples (considered as in situ data) and 4.6 km resolution MODIS pixels (highest level-3 binned resolution) were selected by comparing their concomitant position day after day. When more than one in situ SFC sample was found in a MODIS pixel, the averaged value of TFLR (a.u. cm⁻³) for each phytoplankton group was calculated.
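The averaging step of the matchup can be sketched as follows. The sketch assumes a simple regular latitude/longitude grid with a hypothetical cell size of 0.04° as a stand-in for the 4.6 km level-3 bins (the actual binning scheme is not reproduced here), and the sample positions and TFLR values are invented.

```python
import math
from collections import defaultdict


def pixel_id(lat, lon, day, cell_deg=0.04):
    """Identify a (day, row, column) cell on a hypothetical regular grid."""
    return (day, math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def average_by_pixel(samples, cell_deg=0.04):
    """Average TFLR per phytoplankton group over samples sharing a pixel and day."""
    grouped = defaultdict(list)
    for sample in samples:
        key = pixel_id(sample["lat"], sample["lon"], sample["day"], cell_deg)
        grouped[key].append(sample["tflr"])
    averaged = {}
    for key, records in grouped.items():
        groups = records[0].keys()
        averaged[key] = {g: sum(r[g] for r in records) / len(records) for g in groups}
    return averaged


# Hypothetical in situ SFC samples (TFLR in a.u. cm-3), for illustration only.
samples = [
    {"day": "2011-05-09", "lat": 54.01, "lon": 1.01,
     "tflr": {"PicoRED": 100.0, "NanoRED1": 40.0}},
    {"day": "2011-05-09", "lat": 54.03, "lon": 1.02,
     "tflr": {"PicoRED": 300.0, "NanoRED1": 60.0}},
    {"day": "2011-05-09", "lat": 54.30, "lon": 1.50,
     "tflr": {"PicoRED": 50.0, "NanoRED1": 10.0}},
]

matchups = average_by_pixel(samples)
print(matchups)
```

The first two samples fall in the same cell on the same day and are averaged; the third stays on its own.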

Statistics
Statistics were run in R software (CRAN, http://cran.r-project.org/). Before running correlation and comparison tests on the different in situ sensors (for Chl a and total TFLR), the Shapiro-Wilk normality test was run. When data did not follow a normal distribution, a Wilcoxon signed rank test was applied. Correlations between data were defined using Spearman's rank correlation coefficient.
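For readers unfamiliar with Spearman's coefficient, the sketch below shows how it can be computed as the Pearson correlation of the ranks, with tied values receiving their average rank. The analyses in this study were run in R; this standard-library Python version and its input values are illustrative only.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements (e.g. two Chl a estimates), for illustration only.
method_a = [0.4, 0.9, 1.3, 2.1, 7.5]
method_b = [0.5, 0.8, 1.2, 2.3, 2.4]
print(spearman(method_a, method_b))
```

Because only ranks are used, a monotonic but non-linear relationship between two instruments still yields a coefficient of 1.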
As the PHYSAT approach is based on the link between specific RA spectra (in terms of shapes and amplitudes) and specific phytoplankton composition, the set of remotely sensed data was separated into distinct groups with similar RAs. The PHYSAT RAs found over the studied area and matching the in situ SFC samples were differentiated by applying a k-means clustering partitioning method (tested either around means (Everitt and Hothorn, 2006) or around medoids (Kaufman and Rousseeuw, 1990)). The appropriate number of clusters (distinct PHYSAT RAs) was decided with a plot of the within-groups sum of squares against the number of clusters extracted. A hierarchical clustering was computed to illustrate the k-means clustering method. Within each k-means cluster, the SFC-defined phytoplankton community was described and differences between TFLR cm⁻³ per phytoplankton group were compared within the different PHYSAT spectra clusters using the Wilcoxon signed rank test.
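The choice of the number of clusters from the within-groups sum of squares can be illustrated with a small k-means sketch. This is not the R procedure used in the study: it is a plain Lloyd-type k-means around means with a deterministic farthest-point initialization (an assumption made here so the example is reproducible), applied to invented five-band RA spectra.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points]


def kmeans(points, k, iterations=100):
    """Return (labels, centroids, within-groups sum of squares)."""
    # Deterministic start: first point, then repeatedly the point farthest
    # from the centroids already chosen.
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(tuple(far))
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    wss = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, wss


# Hypothetical RA spectra (five bands), for illustration only.
spectra = [
    (1.20, 1.15, 1.10, 1.05, 1.00),
    (1.22, 1.17, 1.12, 1.06, 1.01),
    (1.18, 1.14, 1.09, 1.04, 0.99),
    (0.90, 0.92, 0.95, 0.97, 1.00),
    (0.88, 0.91, 0.94, 0.96, 0.99),
    (0.91, 0.93, 0.96, 0.98, 1.01),
]

wss_by_k = {k: kmeans(spectra, k)[2] for k in (1, 2, 3)}
labels_k2 = kmeans(spectra, 2)[0]
print(wss_by_k, labels_k2)
```

In this toy data set the within-groups sum of squares drops sharply from one to two clusters and only marginally afterwards, which is the kind of elbow used to retain two anomaly types.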

Temperature, salinity and mixed layer depth
The sampling track crossed four North Sea marine zones: Humber (western and eastern parts), Tyne, Dogger and Thames (Fig. 1). The PFB-measured temperature associated with the SFC samples ranged between 8.83 and 12.39 °C with an average of 10.67 ± 0.72 °C. Minimal temperatures were found in the western Humber area (53-55° N and −1 to 1° E) and maximal temperatures were found in the Thames area (52-54° N, 2-4° E; Fig. 2a). Salinity from the PFB ranged between 34.02 and 35.07 with an average value of 34.6 ± 0.26. Highest salinity values were found in the Dogger area above 55° N and at the limit between the Humber and the Thames areas, at 53° N. Lowest salinity values were found in the Tyne area around 55° N, −1° E and in the Thames area (by the Thames plume; Fig. 2b).
The mixed layer depth calculated from the FOAM AMM7 was used to illustrate the physical environment of the traversed water masses. Different mixed layer depths characterized the sampled area, with a deeper MLD in the northern part (15 to 30 m) and a shallower MLD in the southern area (∼ 10 m, Fig. 1). A tongue of shallow MLD (∼ 10 m) surrounded by deeper MLD (∼ 20 m) crossed the sampling area at ∼ 55° N and ∼ 3° E.
Cell abundance, average cell size and TFLR cm⁻³ for each cluster are illustrated in Figs. 5, 6 and 7 respectively. Average abundance and sizes of each cluster are given in Table 1. PicoRED cells were, on average, the most abundant in the studied area (Fig. 5b and Table 1), followed by NanoRED2, PicoORG, NanoRED1 and Micro1 (Fig. 5f, a, c and g respectively, Table 1). The other clusters' abundances were below 1 × 10² cells cm⁻³ on average (Fig. 5d, e, h, i, j; Table 1). PicoORG cells were the smallest estimated (Fig. 6a, Table 1), while the largest estimated were MicroORG, MicroLowORG and Micro2 cells (Fig. 6h, i and j respectively, Table 1).
The western Humber zone (Fig. 1) was marked by the highest abundances of PicoRED, PicoORG, MicroORG, MicroLowORG and Micro1 (Fig. 5b, a, h, i and g). The eastern part of the Humber zone (Fig. 1) was marked by the highest abundances of NanoRED1 and Micro1 (as for the western part; Fig. 5c, g). High values of PicoRED were also observed in this part of the Humber zone. The Tyne zone (Fig. 1) had the highest abundance of NanoORG and Micro2 clusters (Fig. 5d, j), and the lowest abundance of PicoRED and NanoSWS. High abundance values of MicroORG were also observed (Fig. 5h). The sizes of the NanoSWS and the NanoRED2 were the greatest in this zone (Fig. 6e, f). The Dogger zone (Fig. 1) was dominated in terms of abundance by the PicoRED and the PicoORG, whose sizes were the smallest there (Fig. 6b and a), but it did not show the highest abundance values. The cell sizes of Micro1 were the greatest in this zone (Fig. 6g). Observations in the Thames zone (Fig. 1) produced the maximal abundance of NanoSWS and NanoRED2 (Fig. 5e, f). Sizes were the greatest for PicoORG, NanoRED1 and NanoSWS (together with the Tyne zone; Fig. 6a, c, e). TFLR follows similar trends to abundance (Fig. 7).

Comparison between scanning flow cytometry, total red fluorescence and chlorophyll a analysis
Several bench-top and in situ instruments, i.e. HPLC, Turner fluorometer and the PFB AOA fluorometer, were used to give exact and/or proxy values of Chl a. Similarly to temperature and salinity, the PFB AOA fluorometer samples were selected to match SFC samples. Overall values of Chl a originating from these instruments were superimposed on the total TFLR cm⁻³ (obtained by summing up the TFLR cm⁻³ values of the observed clusters) and the MODIS Chl a values matching the points in Fig. 8. HPLC values varied between 0.21 and 7.58 µg dm⁻³ with an average of 1.57 ± 2.01 µg dm⁻³.
Turner fluorometer values varied between 0.41 and 2.31 µg dm⁻³ with an average of 1.24 ± 0.7 µg dm⁻³. AOA fluorometer values varied between 0.73 and 28.53 µg dm⁻³ with an average of 4.44 ± 5.54 µg dm⁻³. The total TFLR cm⁻³ from SFC, normalized with 3 µm bead red fluorescence, varied between 5011 and 399 200 a.u. cm⁻³ with an average value of 64 394.5 ± 67 488.4 a.u. cm⁻³. The Shapiro-Wilk normality test showed non-normality for each of the variables, so a Wilcoxon test was run between techniques involving similar units. HPLC and Turner Chl a concentrations were not significantly different (n = 9, p = 0.65) and the correlation was significant (Spearman, r = 0.98, Table 2). The absolute values from both techniques were significantly different from the AOA fluorometer values (n = 9, p < 0.001 for both) but were significantly correlated (Spearman, r = 0.86 and r = 0.82 for the HPLC and Turner fluorometer respectively, Table 2). The SFC total TFLR (a.u. cm⁻³) from summing up the TFLR of all the phytoplankton groups was used for comparison with other Chl a determinations. Correlations with the AOA fluorometer, HPLC and Turner fluorometer results were all significant as shown in Table 2.

PHYSAT anomalies and SFC phytoplankton community composition, extrapolation to the non-turbid classified waters in the North Sea
Considering our database of coincident SFC in situ and MODIS remotely sensed observations, a total of 56 matching points were identified, of which only 38 points corresponded to non-turbid classified waters. Matching points between in situ sampling and remote sensing pixels for the purpose of the PHYSAT empirical calibration were selected in the daytime period 06:00-18:00. Additional samples col- 1 and 8). The Chl a values found in the matching points were lower than 0.5 µg dm⁻³ (Fig. 8).
PHYSAT RAs were calculated based on the method of Alvain et al. (2005) and the average signal was recalculated to fit the sampling area. The RAs were separated into two distinct anomalies using the within sum-of-squares minimization (Fig. 9a) and illustrated on a dendrogram (Fig. 9b).
These two distinct types of anomalies in terms of shape and amplitude are illustrated in Fig. 9c and d and the anomaly characteristics are summarized in Table 3. The first anomaly set (N1, Table 3) was composed of 5 spectra that had overall higher values than the second anomaly set (N2, Table 3), composed of the other 10 spectra. The corresponding SFC cluster proportions of TFLR cm⁻³ to the overall total TFLR cm⁻³ found within the two anomalies are illustrated in Fig. 10a and b. Similarly, the relative difference of each phytoplankton cluster's TFLR cm⁻³ within the two anomalies to its overall TFLR cm⁻³ median value
Table 2. Spearman's rank correlation coefficient between the different methods used for chlorophyll a estimates and with the total TFLR from the scanning flow cytometer per unit volume.
Considering the specificity of each set of RAs in terms of phytoplankton and environmental conditions, it is interesting to map their frequency of detection in our area of interest. A pixel is associated with an anomaly when the RA values at each wavelength fulfil the criteria of Table 3. The frequencies of occurrence, based on a composite overlapping the sampling period, are illustrated in Fig. 12a and b. Pixels corresponding to the N1 anomaly were mostly found in the 54-56° N area (Dogger and German, Fig. 1), following the edge between the shallow MLD tongue and the deepest MLD zones (Fig. 1), but also near the northern Scottish coast (Forth, Forties and Cromarty, Fig. 12a), where the MLD was shallow (Fig. 1). The N2 anomaly pixels were mostly found in the Forties, Fisher and German area, on much smaller surfaces (Fig. 12b).

Discussion
Water mass dynamics generates patchiness which modifies phytoplankton community structure and makes it difficult to follow a population over time and at a basin scale. In this context, the hourly observation of phytoplankton at the single-cell and community level and its daily spatial structure resolution from extrapolation using PFT remote sensing mapping can help to follow the spatial distribution of phytoplankton communities. The improvement of PFT mapping, i.e. from dominant groups to the community structure resolution, is one of the ideas generated in this paper. This paper shows for the first time that SFC data sets can be used for labelling PHYSAT anomalies at the daily scale. The SFC is a powerful automated system intended to be deployed on several vessels of opportunity and in monitoring programmes for future identification of PHYSAT anomalies at the daily scale and at the community structure level. A recent publication that enables the classification of a large range of anomaly spectra Thus, the knowledge and the tools are available, which augurs well for understanding phytoplankton heterogeneity and variability over high-resolution spatio-temporal scales. Indeed, resolving phytoplankton community structure over the sub-mesoscale and hourly scale is a good way to understand the influence of environmental short-scale events (Thyssen et al., 2008a; Lomas et al., 2009), seasonal (or not) succession schemes, resilience capacities of the community after environmental changes and impacts on the specific growth rates (Dugenne et al., 2014). Resolving the community structure and the causes of variations at several temporal and spatial scales is of great importance in further understanding the phytoplankton functional role in biogeochemical processes.
This scale information is currently lacking for the global integration of phytoplankton in biogeochemical models, mainly due to the lack of adequate technology needed to integrate the different levels of complexity linked to phytoplankton community structure.

Phytoplankton community description
Phytoplankton community structure from automated SFC is described through clusters of analysed particles sharing similar optical properties. Thus cluster identification at the species level is speculative and, as with any cytometric optical signature, it needs sorting and genetic or microscopic analysis to be resolved at the taxonomic level. This level of taxonomic resolution is not needed in studies of biogeochemical processes, in which functionality is preferred to taxonomy (Le Quéré et al., 2005). In this context, most of the optical clusters could be described at the plankton functional type level because of some singular similarities combining abundance, size, pigments and structure proxies obtained from optical SFC variables (Chisholm et al., 1988; Veldhuis and Kraay, 2000; Rutten et al., 2005; Zubkov and Burkill, 2006). The Cytobuoy instrument used in this study was developed to identify phytoplankton cells from picophytoplankton up to large microphytoplankton with complex shapes, even those forming chains. Indeed, the volume analysed was close to 3 cm³, giving accurate counts of clusters with abundances as low as 30 cells cm⁻³ (100 cells counted), under which the coefficient of variation exceeds 10 % (Thyssen et al., 2008a). Such low abundances were found for some of the clusters identified in this study (NanoORG, MicroORG and Micro2 clusters, for which the median abundance value was close to 30 cells cm⁻³), in agreement with concentrations observed in previous studies for the possibly related phytoplankton genera, as discussed below, i.e. cryptophytes (Buma et al., 1992), diatoms and dinoflagellates (Leterme et al., 2006). Previous comparisons between bench-top flow cytometry and remote sensing (Zubkov and Quartly, 2003) could technically not include the entire size range of nano- to microphytoplankton.
The Cytobuoy SFC resolves cells up to 800 µm in theory, but this depends on the number of cells counted in the volume sampled (which is approximately 10 times more than in classical flow cytometry). However, the largest part of phytoplankton production in the North Sea is driven by cells < 20 µm (Nielsen et al., 1993), and we can consider this size class to be correctly counted with the SFC. Furthermore, the significant correlation between the sum of each cluster's TFLR (total TFLR cm⁻³) and bulk chlorophyll measurements (Table 2 and Fig. 7) confirms the power of SFC for phytoplankton community resolution. PicoORG cells could be labelled Synechococcus (Waterbury et al., 1979; Li, 1994) based on their phycoerythrin pigment fluorescence (Fig. 3a); their size could be estimated between 0.8 and 1.2 µm (Fig. 6a) and their abundances around 10²-10⁴ cells cm⁻³ (Fig. 5a). PicoRED cells could be autotrophic eukaryotic picoplankton, as their cell size varied between 1 and 3 µm (Fig. 6b) and they contained Chl a as their main pigment. Thus, PicoORG and PicoRED clusters contained the smallest cells found above the so-called non-fluorescing/electronic noise background of this instrument (Fig. 3a and b). As Prochlorococcus is expected to be absent in these waters, we can conclude that the cytometer observed most of the phytoplankton size classes when sufficiently concentrated in the analysed volume. NanoRED1 cells exhibited abundance and sizes close to those of Phaeocystis haploid flagellate cells (3-6 µm, Fig. 6c; Rousseau et al., 2007, and references therein). Their presence, mostly in the Humber area (Fig. 5c), suggests that this area corresponded to a period between the inter-bloom (haploid stage, life stage persisting between two blooms of diploid colonial cells) and the start of the Phaeocystis bloom (Rousseau et al., 2007).
Similarly, NanoRED2 could be referred to as Phaeocystis diploid flagellates or free colonial cells, based on their size and abundance (4-8 µm and 0-50 × 10³ cells cm⁻³; Figs. 6f and 5f respectively; Rousseau et al., 2007). Their maximal abundance was found in the southern North Sea Thames area, which suggests that this area was at the colonial blooming stage of Phaeocystis (Guiselin, 2010).
MicroORG cells, whose abundance and size are close to those of some large cryptophyte cells, were found in the same areas as NanoORG cells (Fig. 5h and d respectively), which are related to smaller Cryptophyceae cells. MicroLowORG cells had sizes close to those of MicroORG cells and, although low in concentration, emitted orange fluorescence; they could represent cells with little phycoerythrin content. The NanoSWS cluster was composed of cells with high SWS scattering, consistent with the signature of Coccolithophorideae cells (van Bleijswijk et al., 1994; Burkill et al., 2002). The observed abundances fit the low Coccolithophorideae concentrations observed in the southern North Sea (Houghton, 1991). The Micro1 cluster could correspond to small nanoplanktonic diatom cells (∼ 10-30 µm, Fig. 6g). Given this size range, the cluster could represent several species. These cells were mainly found within the Humber area. The Micro2 cluster was mostly composed of large diatoms (Rhizosolenia, Chaetoceros) and dinoflagellates within the size range of 40-100 µm (Fig. 6j), as observed in the pictures (Fig. 4). The presence of these groups illustrates the boundary between the end of the diatom bloom and the development of a dinoflagellate bloom, which could be linked to the Dinophysis norvegica and Alexandrium early summer bloom observed in the Tyne region by Dodge (1977). This is in agreement with the stratification observed within the Thames zone (Fig. 1).

Phytoplankton community structure at the North Sea basin scale
Data sets that resolve phytoplankton community structure at kilometre and hourly scales, based on single-cell optical properties, are important for validating the methods that describe phytoplankton community structure from space. Ocean algorithms need specific information on water properties and phytoplankton structure and depend on validation from in situ observations, which are always complex to collect and limited by sky condition criteria. The PHYSAT method was built on an empirical relationship between dominant phytoplankton functional types from in situ HPLC analysis and RA. The method was thus limited to dominance cases only, as HPLC analysis cannot provide more information. The remote sensing synoptic extrapolation of phytoplankton community structure remains to be established and, in spite of a theoretical validation (Alvain et al., 2012), still depends on the collection of a large number of in situ data points in order to build robust empirical relationships. In this study, the combination of high-frequency phytoplankton analysis from an automated SFC with the PHYSAT method proved to be an excellent calibration approach, giving an unprecedented number of matching points for only two significant sampling days (number of analysed samples for non-turbid waters matching MODIS pixels: 38; number of samples used between 06:00 and 18:00: 15, corresponding to 39.5 % profitability), compared to the 14 % matching points from the GeP&CO data set (Alvain et al., 2005).
Red dashed lines correspond to the minima and maxima values of the spectra as described in Table 3.
Figure 10. (a, b) The clusters' proportional contribution to the total TFLR cm⁻³ within each PHYSAT anomaly (N1 and N2). (c, d) Within each anomaly, the clusters' TFLR cm⁻³ proportional difference to its median value calculated on the entire matching points data set. Wilcoxon rank test was run for each cluster between the two anomalies. ***p < 0.001; **p < 0.01; *p < 0.1.
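The caption of Figure 10 refers to a Wilcoxon rank test run between the two anomalies. As an illustration only, the sketch below implements the unpaired (rank-sum) form of the test with a normal approximation and without tie or continuity correction. That the authors used this exact variant is an assumption, the function names are ours, and the sample values are hypothetical; the significance thresholds are those given in the caption.

```python
import math


def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic of x, z score, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])            # rank sum of the first sample
    u = w - n1 * (n1 + 1) / 2      # Mann-Whitney U of the first sample
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, z, p


def significance_stars(p):
    """Star code using the thresholds stated in the figure captions."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.1:
        return "*"
    return ""


if __name__ == "__main__":
    # Hypothetical TFLR contributions of one cluster in N1 and N2 samples.
    n1_values = [0.10, 0.12, 0.15, 0.18, 0.20]
    n2_values = [0.25, 0.27, 0.30, 0.33, 0.35]
    u, z, p = rank_sum_test(n1_values, n2_values)
    print(f"U = {u}, z = {z:.2f}, p = {p:.4f} {significance_stars(p)}")
```

With samples as small as those in this example, an exact test would normally be preferred over the normal approximation; the approximation is used here only to keep the sketch short.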
The combination of SFC and PHYSAT has shown that a first set of specific anomalies (N1) can be associated with NanoRED1, NanoORG and MicroORG, which contributed more to the total TFLR cm⁻³ (a proxy of Chl a, Fig. 7, Table 2) than in the second set of anomalies (N2), in which PicoRED cells contributed significantly more to the total TFLR cm⁻³ and in which the Micro1 contribution to total TFLR cm⁻³ was above its overall median value observed along the matching points (Fig. 10d). Spatial successions between diatoms (as could be found in the NanoRED1 and Micro1 clusters) and cryptophytes (corresponding to the NanoORG and MicroORG specific signatures) revealed differences in stratification, lower salinity and shallower MLD (Moline et al., 2014; Mendes et al., 2013). Indeed, the N1 anomaly corresponds to areas of low MLD (Fig. 1) following the main North Sea current from the south-west to the north-east (Holligan et al., 1989), surrounding the Dogger Bank. This anomaly was also found in the north-western part of the northern North Sea, following the Scottish coastal water current with a shallow MLD (Figs. 1, 11a). The N2 anomaly was observed with the deeper MLD of the Forties, Fisher and German areas (Figs. 1 and 11b).
Figure 11. Box plots within each PHYSAT anomaly (N1, N2) of (a) temperature (°C), (b) salinity, (c) chlorophyll a (as estimated from MODIS level 3 binned) and (d) total TFLR (a.u. cm⁻³). Wilcoxon rank test was run for each parameter between the two anomalies. ***p < 0.001; **p < 0.01; *p < 0.1.
Figure 12. (a, b) Frequency of occurrence of the two distinct anomalies (N1 and N2) over the North Sea during the sampling period (08 to 12 May 2011). Yellow squares correspond to MODIS matching points for non-turbid waters selected between 06:00 and 18:00 and used to distinguish N1 and N2 PHYSAT anomalies.
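The two quantities compared between N1 and N2 above, namely each cluster's proportional contribution to the total TFLR and its proportional difference to the median over all matching points, can be sketched as follows. The formula used for the difference to the median, (value − median) / median, is our assumption, as are the function names; the TFLR values are hypothetical and only the cluster names come from the text.

```python
from statistics import median


def tflr_contributions(cluster_tflr):
    """Fraction of one sample's total TFLR contributed by each cluster."""
    total = sum(cluster_tflr.values())
    if total <= 0:
        raise ValueError("total TFLR must be positive")
    return {name: value / total for name, value in cluster_tflr.items()}


def difference_to_median(samples, cluster):
    """Proportional difference of a cluster's TFLR to its median over all samples.

    Assumed definition: (value - median) / median, computed per sample.
    """
    values = [sample[cluster] for sample in samples]
    med = median(values)
    if med == 0:
        raise ValueError("median TFLR is zero; proportional difference undefined")
    return [(value - med) / med for value in values]


if __name__ == "__main__":
    # Hypothetical TFLR (a.u. cm^-3) per cluster for three matching points.
    samples = [
        {"PicoRED": 20.0, "NanoORG": 60.0, "Micro1": 20.0},
        {"PicoRED": 40.0, "NanoORG": 30.0, "Micro1": 30.0},
        {"PicoRED": 60.0, "NanoORG": 15.0, "Micro1": 25.0},
    ]
    for sample in samples:
        print(tflr_contributions(sample))
    print(difference_to_median(samples, "PicoRED"))
```

A positive difference to the median for a given matching point indicates that the cluster contributes more TFLR there than it typically does over the whole matching data set, which is the sense in which the Micro1 contribution is described as above its overall median in N2.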
These N2 areas corresponded to a phytoplankton community still blooming, while the N1 anomaly areas might be at a stage of late blooming, in which conditions fit Cryptophyceae development and grazing (cells of Myrionecta rubra were observed when using the image-in-flow system, not shown). These organisms were found to be dominant in the areas surrounding the Dogger bank from observations and counts carried out by Nielsen et al. (1993) during the same period.
In conclusion, our study of the distribution of phytoplankton community structure, resolved at the sub-mesoscale, highlighted the importance of the North Sea hydrological context. Significant differences between the two sets of anomalies observed during the sampling period are mainly due to cryptophyte-like cells and to cells in the pico- to nanophytoplankton size classes. This daily-scale resolution, made possible by high-resolution techniques combining single-cell and remote sensing technologies, will help in understanding the role of circulation and of the hydrological properties of water masses in phytoplankton composition, succession patterns, spreading, and bloom triggering and collapse.