Daily metre-scale mapping of water turbidity using CubeSat imagery

The potential for mapping of turbidity in inland and coastal waters using imagery from the PlanetScope (PS) and RapidEye (RE) constellations is evaluated. With >120 PS and 5 RE satellites in orbit these constellations are able to provide metre scale imagery on a daily basis and could significantly enhance high spatial resolution monitoring of turbidity worldwide. The Dark Spectrum Fitting (DSF) atmospheric correction is adapted to the PS and RE imaging systems to retrieve surface reflectances. Due to the large amount of imagery and the limited band sets on these sensors, automated pixel classification is required. This is here performed using a neural network approach, which is able to classify water pixels for clear to moderately turbid waters. Due to the limited band set and sensor performance, some issues remain with classifying extremely turbid waters and cloud shadows based on a spectral approach. Surface reflectance data compares well with in situ measurements from the AERONET-OC network. Turbidity is estimated from the Red, RedEdge (RE only) and NIR bands and is compared with measurements from autonomous stations in the San Francisco Bay area and the coastal waters around the United Kingdom. Good performance is found for Red band derived turbidity from PS data, while the NIR band performance is mediocre, likely due to calibration issues. For RE, all three turbidity products give reasonable results. A high revisit density allows for the mapping of temporal variability in water turbidity using these satellite constellations. Thanks to the RedEdge band on RE, chlorophyll a absorption can be avoided, and perhaps even estimated. © 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
The monitoring of water turbidity and water clarity (euphotic depth, secchi depth) is a global concern. In particular, member states of the European Union are required to monitor turbidity in inland and near-shore waters by the Water Framework Directive (WFD, 2000/60/EC). Changes in water turbidity can be the result of sediment resuspension by winds and currents [1], riverine inputs [2], and could also be an indicator for changes in water quality e.g. due to phytoplankton blooms [3,4]. Long term and high frequency observations are typically required for change detection, and the rapid availability of measurements is important for monitoring or detecting current events e.g. during dredging or offshore construction operations. Automated in situ stations can provide such high frequency measurements of turbidity, both during day-and nighttime, but are limited to a single location, and can hence miss important spatial variability. Optical remote sensing satellites have been used for monitoring water turbidity in the coastal zone [5][6][7] and in rivers and river plumes [2,[8][9][10]. Satellite derived turbidity products are of importance to modelling communities, in terms of validation of and assimilation into sediment transport modelling [11][12][13] and assessing dredging operations [14,15], and as light forcing in coastal ecosystem modelling [16]. Turbidity can be reliably estimated using a single Red or near-infrared (NIR) band [17], especially if a robust atmospheric correction is available using at least another longer wavelength band [6,7,18]. Recently, a Dark Spectrum Fitting (DSF) approach without a priori definition of bands for the atmospheric correction showed good performance for retrieving turbidity in coastal and inland waters from metre- [19] and decametre-scale [20] satellite imagery. slightly varying relative spectral responses (RSR), based on the first two characters (see Fig. 1 and section 2.4). The RE constellation has 5 satellites of <1 m 3 with 5 spectral bands (Blue, Green, Red, Red-Edge, NIR) with an approximate ground resolution of 6.5 m. For RE, the products were the orthorectified TOA 5 band tiles resampled to a ground resolution of 5 m (with Planet Product Identifier REOrthoTile). A single RSR was used for RE ( Fig. 1). Imagery was obtained for a small region of interest (ROI) around autonomous turbidity measurement stations in the San Francisco Bay Area (ROI of 1.2x1.2 km), and the southern North Sea and Irish Sea (ROI of 3x3 km). Data from the AERONET-OC stations worldwide (ROI of 3x3 km) were also obtained. Adjacent scenes and tiles from the same satellite and orbit were cropped and merged to the ROI extent before processing.

In situ radiometry
In situ normalised water-leaving radiance (L wn ) measurements were obtained from all sites in the AERONET-OC network [31], from the AERONET website (http://aeronet.gsfc.nasa.gov) in three quality levels (lev10, lev15, and lev20). The AERONET-OC sites have a Cimel Electronique radiometer measuring the water-leaving radiance at 8 wavelengths, which may vary slightly across the sites. The Venise station has reported bands at 413, 440, 501, 530, 555, 674, 870, and 1019 nm. Due to the recent nature of the satellite data, there were too few matchups with lev20 data, and hence the lev15 cloud-screened data passing empirical quality control thresholds on the L w were used. Matchups with lev10 data were also made after additional quality control of the lev10 data. L wn was used without the chlorophyll a based f/Q correction for bidirectional effects because of the intended application of mapping turbidity [20]. Matchups were defined as satellite images which had bounding in situ measurements less than two hours before and after the overpass. AERONET data was linearly interpolated from the two closest bounding measurements to the satellite overpass time. The in situ L wn were then converted to water-leaving radiance reflectances, ρ w : where F0 is the extraterrestrial solar irradiance [32] for a narrow 10 nm square band centred on the in situ radiometer wavelengths. ρ w values were linearly interpolated to form a 'hyperspectral' dataset and resampled to the RSR of the satellite sensor similar to [19,20].

In situ turbidity
In situ turbidity was obtained from USGS Water Data (https://waterdata.usgs.gov/nwis) for seven autonomous sites in the San Francisco Bay Area in Northern California, and from four autonomous SmartBuoys [33] operated by CEFAS in the North Sea and the Irish Sea (https://www.cefas.co.uk/cefas-data-hub/smartbuoys/). Turbidity was measured as side scattering at a NIR wavelength using YSI EXO or ForestTec DTS-1 sensors in the Bay Area sites, and using a SeaPoint turbidimeter on the SmartBuoys. All sensors are reported to have a NIR wavelength (780-900 nm) and a 90 (±2.5) degree acceptance angle, and are mounted at approximately 1 m below the surface. In situ data was linearly interpolated to the satellite overpass time, and the standard deviation within a six hour window centred on the overpass time was computed. In situ turbidity ranges (5, 50 and 95 percentiles) for the period 2015-12-01 to 2018-03-15 are listed for each site in Table 1. Quality controlled post-recovery data was obtained for the SmartBuoys, which has been used by other studies for validation of satellite derived turbidity data [20,34,35]. A combination of 'approved' and 'provisional' data from the USGS sites was used due to the recent nature of the satellite data. One of the USGS sites (Dumbarton Bridge) is located in South San Francisco Bay, and the six others are in Suisun Bay and the Sacramento River Estuary. The turbidity at the Dumbarton Bridge site is highly variable due to tidal resuspension and bridge pier wakes, and was finally excluded from the analysis due to the difficulty in matching in situ with satellite data in this very variable environment.

Atmospheric correction
Imagery was provided as top-of-atmosphere radiances (L T OA ) and was converted into top-ofatmosphere reflectances, ρ T OA : where d is the sun-earth distance in astronomical units (AU), F0 is the extraterrestrial solar irradiance, band averaged from [32], and θ s the solar zenith angle at the centre of the image.
To remove the impact of variable concentration of atmospheric gases, ρ t is corrected for gas transmittance (t g ), here taken as the product of the band averaged ozone and water vapour transmittances (see [19] for details). For water pixels, the diffuse sky reflectance reflected at the air-water interface is computed analytically [36]. The Dark Spectrum Fitting (DSF) algorithm [19,20] was adapted to the PS and RE band sets. For RE a single RSR was used, while for PS the RSR for three families of sensors was used to convolve the lookup tables (LUTs): (1) 0c for satellite series 0c** and 0d**, (2) 0f for 0f** and 10**, and (3) 0e. The DSF uses multiple dark targets in the subscene to construct a representative dark spectrum, ρ dark . While truly dark pixels may not be present in each image, the DSF minimizes the impact of non-zero reflectance [20]. In the present study, ρ dark is constructed from the first percentile of the gas and sky reflectance corrected ρ t in each band. The atmospheric path reflectance (ρ path ), including both aerosol and molecular scattering, is determined for each band in the ρ dark using two aerosol models (Continental and Maritime) and for each model the band giving the lowest aerosol optical thickness (τ a ) is selected, to avoid negative surface reflectances (ρ s ) in the other bands of the ρ dark . The aerosol model giving the lowest τ a is selected due to the limited band set on the PS and RE sensors [19]. The DSF has no a priori defined black bands over water, and different targets can be used depending on the scene contents and aerosol type. In general, shadowed and offshore non-turbid water pixels are used to determine the ρ path .

Pixel classification
Pixel classification, and especially cloud, cloud shadow, and glint masking over turbid waters, is a challenge for optical remote sensing. In the present study, pixel classification is performed after the atmospheric correction, using a machine learning technique. Machine learning offers more flexibility than manually created spectral tests using band thresholds and ratios, and is well suited to classify a wide variety of scenes with clear to very turbid waters. Spectra corresponding to separate classes were manually selected from RGB composites for different scenes and study sites. The Keras [37] neural network library was used with TensorFlow [38] backend. A neural network with n input neurons, n+1 hidden neurons and m output neurons was trained using the selected spectra, where n is the number of spectral bands (5 and 4 respectively for RE and PS), and m the number of output classes. The standard rectified linear unit ('relu') function was used for the input and hidden layers, and the softmax function for the output layer. As pixel classification is the goal, a binary crossentropy loss function was used.
For both sensors, four classes (m=4) were used; 1: clear water, 2: non-water, 3: mixed, 4: turbid water. Two water classes were included to help the neural network distinguish between spectrally different turbid and clear waters. Pixels were selected manually on ρ s RGB composites with reflectances in each band scaled linearly from 0 to 0.15 to the 8-bit RGB channels. Cloud-free water pixels with no obvious surface effects (glint, wave facets, foam) were selected for classes 1 and 4. Turbidity ranges were determined a posteriori to be around 0-3 FNU for class 1 and 3-20 FNU for class 4. For class 2, non-water pixels were taken from clouds, ships and offshore structures. The 'mixed' pixels in class 3 were selected differently for RE and PS, largely to collect image artefact pixels. For RE, class 3 were pixels at the edges of high objects (e.g. clouds) where the different spectral bands separate spatially on the image due to the sensor's parallax effect for high objects. For PS mixed pixels were selected as surface-level non-water pixels where the VIS and NIR bands were not perfectly collocated, causing a shift between the VIS and NIR datasets. An initial training dataset was created from the collected spectra for both sensors, and the classification result was inspected visually for several cloudy and cloud-free scenes for all study sites. Additional training spectra were then collected from obviously misclassified scenes, and a final training dataset was obtained. RE training spectra were finally selected from seven images from five sites, and PS spectra were taken from fourteen images from eleven sites.

Image filtering
With the large data volumes generated by the PS and RE constellations an automated quality control is required. Scenes were filtered mainly based on the results of the neural network pixel classification. Pixels were classified as water if the classification confidence in either the clear or turbid water class was > 50% or when the sum of both water classes was > 80%. Additionally, the summed classification confidence for the non-water and mixed classes had to be < 20%. Pixels not passing these criteria were considered as non-water pixels, and a dilation operation with 10 iterations was finally applied to obtain a final non-water mask. Scenes were used in the matchup analyses if they covered >50% of the ROI, and were taken with θ s < 70°, avoiding low sun conditions with low signal and increasing errors in the atmospheric and sky reflectance corrections. An additional criterion of at least 25% ROI coverage by any water class was added for the AERONET-OC sites. For the turbidity monitoring sites this restriction was not used due to their close proximity to land -see example images in Appendix A.

Turbidity algorithm
Turbidity in Formazine Nephelometric Units (FNU) was retrieved from the Red, RedEdge (on RE), and NIR channels using the algorithm of [17]: where ρ w is the water reflectance, here taken as the surface reflectance output from the atmospheric correction procedure. The A (FNU) and C coefficients were taken from the tables provided by [17] for the closest wavelength to the band weighted wavelength ( Table 2). Resampling of the calibration dataset to the sensor bands was not viable due to the insufficient spectral coverage of the dataset (600-885 nm) compared to the RSR (Fig. 1).

Pixel classification
Turbid waters were classified correctly for all spectra in both the PS and RE training datasets. For clear waters the RE classification performance was perfect, whereas 25% of the PS spectra were misclassified. These 25% were classified as turbid waters, due to a slight peak in the green band. In this case, the machine learning technique may have actually detected human misclassification in the initial training dataset of these turbid water pixels as clear water pixels. For both PS and RE, all water pixels in the training dataset were however attributed to either the turbid or clear water class, the distinction of which is not of importance to the present application. Turbid and clear water classes have generally different spectral shapes and the separation during training improved the separation of water and non-water pixels. The high reflectance in the PS NIR band for some of the clear water spectra is likely caused by (1) adjacency effects or (2) poor sensor performance or calibration, as no obvious surface effects or haze was visible on the images. The sensitivity of CCD detectors (like the ones used in PS) typically drops off quickly in the NIR, which could give poor performance especially at the lower reflectance end (i.e. over waters). It is probable that the NIR band quality varies significantly over the hundreds of sensors. Cloud shadows can not be discriminated from (clear) water spectrally and were most often classified as clear waters. Most scenes affected by cloud shadows will be filtered out by the criterion on water coverage, if the cloud casting the shadow is within the scene. In some cases the cloud may be outside the scene edges, and hence the dataset can be contaminated with cloud shadows. These are evident in the scatterplots as underestimation of the water reflectance by the satellite compared to the in situ data (see Sections 3.2 and 3.3). Misclassification between the non-water and mixed classes is not considered to be an issue, as only the clear and turbid water pixels are of interest. For PS, about 8% of the mixed pixels of overall lower reflectance were misclassified as turbid waters, caused by the similar spectral shape and magnitude to turbid waters (caused by the spatial shift between the VIS and NIR bands). These training pixels were located at the edge of structures, where there may also be land/water mixed pixel effects. These pixels were largely removed from the matchup datasets by the mask dilation (see section 2.6).
During RE classification, thin haze, small objects (of size comparable to the sensor resolution) and the edges of objects were often classified as 'mixed pixels' although they were not explicitly included in the training dataset. These pixels tend to have a high VIS reflectance, with a lesser impact on the NIR band. Objects on PS scenes with shifted NIR bands tended to be classified as mixed for the pixels covered by the VIS bands, and as non-water for the NIR band -due to the high NIR reflectance. In case of very rough water surfaces, the wave facets -which can be spatially resolved -tended to be classified as clear/turbid waters on the dark/bright sides.
For PS, the neural net was not always able to distinguish between extremely turbid waters and the mixed or non-water classes, misclassifying the former as the latter, effectively filtering out the most turbid waters from the PS matchups (see Section 3.3). This is caused by spectral similarities between the extremely turbid waters and the mixed and non water classes for this spectrally limited bandset. In particular, the strict criterion on summed classification confidence (<20%) of non-water and mixed classes caused exclusion of extremely turbid waters where the spectral ambiguity was largest. An example of the classification is given in Appendix A, where some of the turbid waters around Zeebrugge were classified as non-water or mixed pixels, even though the confidence in the turbid water class was >50%. By applying the full masking tests as described in Section 2.5, the scene was excluded from the matchup analysis, mainly due to <50% water coverage in the final mask.

Water reflectance retrieval
The Dark Spectrum Fitting (DSF) algorithm retrieves surface reflectances over both water and land, and provides realistic spectral shapes over turbid and productive waters. An example RE scene over Oostende is shown in Fig. 2 with some spectra provided in Fig. 3. P1 and P2 show typical turbid and very turbid spectra, with P3 showing characteristics of Red band chlorophyll a absorption. This indicates that even though the Red and Red-Edge bands are wide (Fig. 1) the RE data could perhaps be used to derive chlorophyll a concentration in turbid waters. The reflectances over water were quantitatively evaluated using measurements from worldwide AERONET-OC sites. Satellite reflectances were extracted from a 11 x 11 pixel box centred on the reference location, i.e. the shifted coordinates provided by [20]. Matchups of PS and RE ρ s with lev15 AERONET-OC ρ w are shown in Figs. 4 and 5 as the mean and standard deviation (vertical error bars) from the extracted pixel boxes. The number of matchups (n), the Reduced Major Axis (RMA) regression line slope (m) and offset (b), the squared Pearson's correlation coefficient (r 2 ), the Root Mean Squared Difference (RMSD), and the Mean Average Relative Difference (MARD) between the in situ and satellite data are shown. These summary statistics are also provided in Table 3. No manual or statistical removal of outliers was performed; even though some outliers have known impacts of cloud shadowing, which was not detected automatically due to the spectral similarity of cloud shadows and clear waters. 0.79 with r 2 of 0.63 and 0.55. The Blue band performance is rather poor, especially for the most turbid points from Zeebrugge-MOW1, where the ρ w is significantly underestimated. This could potentially indicate an overestimation of the atmospheric path reflectance in the small ROI (3x3 km) over the most turbid waters, although for those points the Red and Green performance is still adequate. Fig. 5 shows the 90 matchups from all 5 RE satellites (RE1: 19, RE2: 13, RE3: 22, RE4: 13, RE5: 23), and 13 AERONET-OC sites worldwide. For all bands the RMSD is less than 0.01, indicating a good performance at higher reflectances -i.e. at higher turbidities, but large uncertainties at low reflectances, i.e. over clear waters. It should be noted that for lower reflectance ranges the RMSD would typically decrease, so in general for clearer water sites the RMSD will be less than the value reported here. RE performance is quite good in terms of linear correlation statistics, especially for the Green and Red bands, where the range in the data is largest (RMA slopes respectively 1.01 and 1.02 with r 2 of 0.86 and 0.79). For the Blue band, an underestimation is found for the low and high reflectance ranges. Even though the RedEdge band has no close bands on the in situ instrument (giving larger interpolation uncertainty), the results are reasonable with a RMA slope of 0.98 and r 2 of 0.62. The lev10 data gave additional matchups for both constellations with similar performance to the lev15 data: in total 145 for 15 AERONET-OC sites for RE and 85 for 14 sites for PS. Plots for lev10 data are omitted for brevity, but summary statistics are provided in Table 4. The resampling of the narrow band AERONET measurements to the broader satellite bands can introduce errors in the matchup analysis. Especially in the Red and NIR the spectral coverage by AERONET is sparse, with typically three bands at~670,~870 and~1020 nm. For the RE RedEdge band the resampling error is largest, as the in situ instruments have no band located within the satellite band. Overall, we find reasonable performance of the band shifting method. No hyperspectral in situ matchup data was available for any of the satellites. In general, PS underestimates turbid water reflectance and overestimates clearer water reflectance. This could be a direct result of the weighting of the multispectral AERONET-OC data by the sensor's RSR. RE performs better than PS in terms of estimating the spectral shape and magnitude of the various sites, with lower RMSD and better linear correlation statistics, likely due to its narrower and more square bands. In addition, the PS RSR functions (see Fig. 1) show significant out of band response, which may furthermore vary among the hundreds of systems. For RE, a number of negative reflectances were obtained, due to an overestimation of the path reflectance, potentially due to glint contamination -the vertical spread in the RE scatterplots gives some indication of the presence of glint. Negative reflectances could also be caused by the presence of cloud shadows. Cloud shadows not only shadow the surface, but also the atmosphere and hence in some cases shadowed pixels can have a ρ t less than the expected Rayleigh (molecular) scattering. This can cause an overcorrection of the shadowed pixels themselves, but may also lead to an overestimate of the ρ path , as the DSF can not estimate the aerosol optical thickness in the most optimal (i.e. the darkest and shadowed) bands. Negative reflectances of this kind are not present in the PS dataset, due to the more difficult pixel classification and the corresponding more restrictive image selection. The negative values could be filtered out automatically but were retained here to give an indication of typical performance and the spread across the 1:1 line. Since for most of the PS satellites in the analysis only one or two matchups were available it is impossible to establish performance estimate at the satellite level. This must instead be done at the constellation level. The highest reflectance points in the RE matchups were from a single AERONET-OC site (Zeebrugge-MOW1) but from different satellites, and hence it is also difficult to establish site level performance. On average, the RE and PS derived Green and Red band reflectances were adequate for turbidity mapping in brighter waters. For RE the performance is much better than for PS, and RE could provide better water reflectance spectra over a wider range of turbidities. Blue bands are typically the most challenging for atmospheric correction, due to the high atmospheric scattering. Nonetheless, the results presented here indicate that for some applications there may be useful information retrieved in the Blue band (especially for RE), such as the estimation of bathymetry in optically shallow waters or determination of the presence of absorbing substances. The NIR band performance is similarly challenging to validate, due to the usually very low water-leaving signal. For clear waters a significant NIR retrieval could indicate contamination by glint or adjacency effects, but for PS in the present study it can likely be attributed to poor sensor performance and calibration. The results for the PS NIR band indicate that the sensor quality at present is not sufficient for any of the points in the validation dataset. The fact that the PS NIR reflectances were too high is also manifested by the band selected in the DSF. In most cases longer wavelengths would be expected due to the absorption properties of water. Here for 73% of the PS matchups a visible band was used to determine path reflectance, compared to 45% for RE.  Fig. 4 but for RE. Note that these points come from 5 different satellite sensors over 13 different validation sites.

Turbidity retrieval
Example turbidity products with final masking applied are shown in Fig. 6. For each of the matchup sites (Table 1), water reflectances were extracted for an 11 x 11 pixel box centred on the site location, and turbidity was computed for each of the bands listed in Table 2. The Dumbarton site was excluded due to the depth of the turbidity measurement, the extreme variability of suspended sediment concentration around the bridge piers, and pixel identification issues around the bridge structure. In situ data was assumed to be quality controlled, and the automated quality control for the satellite data was used as described in Section 2.6. A high temporal variability is found in both the SmartBuoy and Bay Area estuarine sites, with a standard deviation of 5-15 FNU in the in situ data. RE matchups with the coastal SmartBuoy and USGS Water Data sites are given in Figs Fig. 11 for the Warp SmartBuoy and in Fig. 12 for the USGS Rio Vista site. Time-series plots for the other sites were omitted for brevity. Summary statistics are given in Tables 5 and 6  For RE, Red band derived turbidity performed well, with an RMSD of 5.4 and 5.6 FNU, a MARD of 51 and 31%, a RMA slope of 0.94 and 0.86 and r 2 of 0.78 and 0.81 for the coastal and inland sites (respectively SmartBuoy and USGS). The lower slope found for the inland sites could be indicative of higher Red band absorption e.g. by chlorophyll-a. The RedEdge band derived turbidity has a similar RMSD and r 2 to the Red band turbidity, but a lower slope (especially for the SmartBuoys). The NIR band turbidity performance is found to be quite different between the coastal and inland sites. For the coastal SmartBuoys an underestimation of turbidity is found, which is much noisier than the Red and RedEdge retrievals and has about double the RMSD. Several points with large underestimation by the satellite were visually identified as in cloud shadows, but were retained in the analysis due to a requirement on automated processing. The SmartBuoys are likely moving in the tidal ellipse compared to their quoted location, increasing the uncertainty of the matchups.
For PS, the Red band retrievals are also fairly good, with an RMSD of 3.0 and 6.6 FNU, MARD of 52 and 43 %, a RMA slope of 1.08 and 0.61 and r 2 of 0.45 and 0.41 for the coastal and inland sites. The exclusion of the more turbid PS scenes, due to the waters being indistinguishable from non-water and mixed pixels, cause a lower data range compared to RE, and hence a lower r 2 and lower RMSD. The retrieval of higher turbidities, up to the Red band reflectance saturation, is likely possible for PS, but improvements on pixel classification and image quality control are necessary to do this in an automated manner. The out of band response of the PS Red band and the use of a single wavelength calibration of the turbidity algorithm may also impact the regression slope. The PS NIR band retrievals seem to be meaningless for water applications, and similar to the AERONET-OC matchups, these results indicate a problem with the PS NIR band.

Evaluation of near-simultaneous PlanetScope images
A number of near-simultaneous overpasses (within 15 minutes) from different PS imagers were compared in order to assess the consistency across the different sensors. Available imagery covering the Dumbarton Bridge area in South San Francisco Bay, California, and the area around Oostende, Belgium was used in the analysis. Images were masked according to Section 3.1 and the water reflectances from each image pair were compared for pixels passing the quality control on both images as long as this amounted to >= 25% region coverage. Results were aggregated per image pair and paired per sensor family (0c, 0e, and 0f), and within family results were also included in the plots (Fig. 13 and 14). These plots show a good comparability in the VIS bands between the different PS satellites, with RMSD generally < 0.005. The MARD for the different sites (Dumbarton and Oostende) is 13-17% for the Blue, and 7-8% for the Green bands. The Red band MARD is 10% for both sites, indicating a good expected consistency for the Red reflectance derived turbidity products. Similar to the matchups with the in situ data, the 'lumpy' NIR scatterplots show that there is a large variability in NIR performance between the sensors (MARD of around 40%). Furthermore, the observed NIR reflectance range of 0 -0.08 is unrealistically high for these water targets.

Perspectives
The adaptation of the DSF for atmospheric correction for water applications of PS and RE imagery reveals that consistent results can be retrieved within each constellation. Even without the presence of longer wavelength bands in the SWIR, the DSF is able to retrieve reasonable path reflectances, even from tiny 1.2x1.2 km subscenes, without relying on external inputs of aerosol type or concentration. For varied sites around the globe, the surface reflectances and derived turbidity were in good agreement with in situ measurements of water-leaving radiance (14 coastal sites) and turbidity (6 estuarine and 4 coastal sites). For some of the more turbid sites, larger processing subscenes may need to be used to increase the presence of sufficiently dark pixels (e.g. including shadows/dense vegetation on land or clear waters in the subscene) to avoid negative water reflectances. For most automated processing negative reflectances should be masked automatically and their cause investigated. For example, further work is needed on the automated screening of cloud shadows and their impact on the DSF. Although good performance for rivers is found here, the boundaries of performance in narrow channels and small ponds should also be investigated. Further challenges include the processing of scenes near ice and deserts, with high adjacency effects and lack of shadow casting features, adjacency effects in general for smaller inland waters, and glint contamination for nadir view scenes at lower latitudes. The pixel classification presented here needs to be examined over more study areas and perhaps improved using multi-temporal imagery or external datasets (e.g. from Sentinel-2). With the current performance of the PS sensor, a better classification based just on spectral information may be difficult, although in large part the turbid water cut off could be caused by the additional rather strict image filtering described in Section 2.6. Water targets, autonomous in situ instruments, and a good atmospheric correction method may possibly aid the on-orbit calibration of these large constellations of satellites. Collection of sufficient in situ matchups for a single satellite will take a long time -perhaps even exceeding the satellite's lifetime, and the calibration can instead perhaps be performed with near-coincident data from Landsat and Sentinel-2. Dense time-series of high resolution imagery will allow for new applications to emerge and may give improved coverage for cloudy regions where revisit times are currently too low for any meaningful temporal analysis. The processing algorithms and software will be made open source in order to encourage other researchers to evaluate the presented method for their applications and study areas.

Conclusions
• This study confirms the potential of CubeSats for accurate high spatial (3 m) and moderate temporal (daily) resolution mapping of water turbidity during cloud-free periods, giving an order of magnitude improvement in spatial and/or temporal resolution over currently used data sources (Sentinel-2/MSI: 10 m every 5 days at equator for two satellites; Landsat 8/OLI: 30 m every 16 days, Sentinel-3/OLCI: 300 m nearly daily).
• The Dark Spectrum Fitting (DSF) aerosol correction is adapted here to the PS and RE constellations of satellites, and even over the very small subscenes (1.2x1.2 km and 3x3 km) used here, the atmospheric path reflectance can be estimated with sufficient accuracy. Reasonable reflectances were retrieved for the visible bands in the various turbidity measuring stations and the coastal sites in the AERONET-OC network, especially for the Green and Red bands with the largest observed reflectance ranges.
• The retrieval of Red band water reflectance from PS and RE allows for remote estimation of water turbidity, which corresponded well to autonomous turbidity measurements in inland waters in the San Francisco Bay Area and coastal waters of the North and Irish Seas. The overall RMSD was around 6 FNU across the two constellations, similar to the performance found for Sentinel-2 [20], indicating their applicability for absolute turbidity retrievals in moderately turbid (<80 FNU) waters. The RMSD does depend on the turbidity range, and is impacted by the inclusion of more turbid points. Due to the high observation density, temporal variability of turbidity can be tracked by PS and RE.
• PS NIR reflectances and derived turbidity products were not at all reliable, likely due to calibration issues and poor sensor performance over dark water targets, compounded by adjacency effects. At the time of writing the PS NIR band seems to be of no use for remote sensing of water turbidity. Perhaps the DSF method presented here could aid the NIR calibration over dark water targets. RE NIR reflectances seemed to be significantly affected by glint and adjacency effects, and further research is required to make these usable.
• Autonomous networks of in situ instruments measuring surface radiances and water turbidity are invaluable for validating satellite images, especially in the context of massive narrow-swath satellite swarms such as PS with hundreds of individual sensors. Collocation of above-water (hyperspectral) radiometers and in-water turbidimeters could potentially improve quality filtering, and give better understanding of matchup performance.
• A machine learning approach was used for pixel classification, and could separate pixels quite reliably in water and non-water classes. Cirrus and thin clouds, shadows, and floating objects are still very hard to detect using the limited band set on these sensors, and more efforts on automated pixel classification are needed. For PS distinguishing extremely turbid waters from transparent clouds and erroneous mixed pixels is especially difficult due to the lack of spectral bands, and especially due to the NIR band performance.
• Although the comparison presented here was far from exhaustive, a good consistency is found for several near-simultaneous image acquisitions from separate PS sensors. Similarly, the matchups with AERONET-OC data show reasonable performance across the constellations, even with nearly all PS points coming from different satellite sensors.