A robust threshold-based cloud mask for the HRV channel of MSG SEVIRI



Introduction
The effect of clouds on radiative fluxes depends on cloud type and can vary strongly in both space and time. Accurate information about the physical and radiative properties of clouds is necessary to determine the role of clouds in the climate system, including their response to anthropogenic forcings (e.g. Forster et al., 2007).
Geostationary satellite imagers such as METEOSAT SEVIRI (Spinning Enhanced Visible and Infrared Imager) are well suited to monitor the temporal development of clouds and to fully resolve their diurnal cycle over land and ocean (Roebeling and van Meijgaard, 2009). The spatial resolution of SEVIRI's narrowband channels (3 × 3 km²) lags behind that of polar orbiting imagers such as MODIS (1 × 1 down to 0.25 × 0.25 km²) and AVHRR (1.1 × 1.1 km²), which limits its ability to resolve small-scale structures. SEVIRI does, however, have a high resolution visible channel (HRV) with a nadir resolution of 1 × 1 km².
The HRV channel contains important information for studying the small-scale variability of clouds and the underlying surface (e.g. Klüser et al., 2008; Deneke and Roebeling, 2010). The study of Derrien et al. (2010) improves the detection of small-scale low clouds by use of the HRV channel. The HRV channel reflectance was also used by Carbajal Henken et al. (2011) for the detection of deep convective clouds. Nevertheless, few operational products based on the HRV channel are available, which is a significant hurdle for the use of its finer spatial resolution in scientific studies and applications.
The estimation of cloud and/or surface properties from multispectral satellite images requires the classification of pixels into cloud-free and cloudy classes as an initial step. Most cloud detection algorithms described in the literature rely on a combination of threshold tests applied to different spectral channels for this purpose. Rossow et al. (1989) present an overview of early methods chosen for cloud masking. These methods often exploit the fact that, relative to cloud-free surfaces, clouds generally appear brighter in solar channels due to reflection and colder in infrared channels. In addition, spatial coherence tests are commonly used, as clouds are often more variable than the underlying surface (see e.g. Saunders and Kriebel, 1988). It should be noted that spatial coherence tests also rely on thresholds for identifying regions with high variability.
Cloud masking is one particular case of object identification by thresholding, and can be described mathematically by considering the graylevel histograms based on specific channel radiances. Thresholds are selected to best separate the histograms for the cloud-free and cloudy pixels of a satellite image. Suitable thresholds for satellite channels are often selected by experts (e.g. Saunders and Kriebel, 1988). As an alternative, automatic statistical methods can be applied to select optimal thresholds. Here, methods which make use of a training dataset for threshold selection (supervised methods) need to be differentiated from those which select thresholds based on intrinsic properties of the dataset (unsupervised methods). Yang et al. (2007) have investigated several algorithms for unsupervised threshold selection, and have determined the most accurate ones for application to cloud masking for the Multi-angle Imaging SpectroRadiometer (MISR) over land. Regardless of method, independent reference data are needed to establish the accuracy of threshold-based classification methods.
The goal of the present paper is to develop a cloud mask based on the HRV channel which exploits its high spatial resolution and is suitable for studying small-scale features of clouds, including e.g. their horizontal dimensions. This precludes the use of spatial coherence tests due to their non-local nature. Instead, a differencing approach using clear-sky composite reflectances as background is adopted to improve the contrast between clear-sky and cloudy situations (e.g. Minnis and Harrison, 1984; Ipe et al., 2003). Nevertheless, a threshold test applied to a single visible channel cannot achieve the accuracy of other SEVIRI cloud masks which are based on multiple spectral channels. Instead of replicating other cloud mask algorithms, this HRV mask is designed as a complement to an existing cloud mask, which is used as reference for threshold selection and to estimate the mask's accuracy. The operational cloud mask product (CLM) by EUMETSAT's Meteorological Product Extraction Facility, which is based on the narrowband channels of SEVIRI (EUMETSAT, 2007), is a convenient choice for this purpose, as it is distributed through the EUMETCast system together with the level 1.5 SEVIRI images. It is used in our study.
This paper is structured as follows: Sect. 2 gives a brief overview of the datasets used in our study, including the characteristics of the SEVIRI instrument. This is followed by Sect. 3, which describes our proposed cloud masking method. Sect. 4 presents results and discussions, followed by conclusions and an outlook in Sect. 5.

Instrumental data
The current series of European geostationary satellites, Meteosat Second Generation (MSG), is operated by EUMETSAT. Its main payload is the Spinning Enhanced Visible and Infrared Imager (SEVIRI), an optical imaging radiometer. Three MSG satellites, Meteosat-8 to -10, have been launched and are positioned in geostationary orbit at an altitude of 36 000 km above the equator. Meteosat-9 observes the full disc of the earth as primary geostationary service with a repeat cycle of 15 min. Meteosat-8 is currently used as stand-by and operates the rapid-scan service covering Europe with a 5 min repeat cycle. Meteosat-10 was launched on 5 July 2012 and is currently in commissioning. A detailed description of MSG is given by Schmetz et al. (2002).
Only the 0.6 µm, 0.8 µm, 1.6 µm, 8.7 µm and HRV channels are considered in this study. The normalized spectral response functions of the narrowband solar and the broadband HRV channels are shown in Fig. 1.
The narrowband channels cover the full disc of the earth with 3712 × 3712 pixels. At a 3-fold higher resolution, this results in a nominal image size of 11 136 × 11 136 pixels for the HRV channel. However, the actual HRV channel coverage is only 5568 pixels in the East-West direction. An upper region of 3072 scan lines with a fixed position is centered on Europe. The lower region, consisting of 8064 scan lines, follows the daily illumination.
Only the upper region is considered in this study.

EUMETSAT cloud mask
The EUMETSAT cloud mask (CLM) is derived by the Meteorological Product Extraction Facility (MPEF) and utilizes a combination of several multi-spectral threshold tests, grouped into different categories, to distinguish between cloudy and cloud-free pixels (see EUMETSAT (2007) for a detailed description of the algorithm). The CLM is an operational SEVIRI product and is derived every 15 min for the full disc. In the final product, each pixel is labeled either as 0 (clear sky ocean), 1 (clear sky land), 2 (cloudy) or 3 (no data). The EUMETSAT threshold tests involve almost every SEVIRI channel. The exceptions are channel 8 (9.7 µm), which is mainly sensitive to tropospheric and stratospheric ozone and thus adds little information for cloud masking, and channel 12 (HRV), which is not available for the full disc.
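As a minimal sketch, the four CLM labels listed above can be turned into boolean masks as follows. The function name `split_clm` and the array layout are illustrative assumptions, not part of the EUMETSAT product specification; only the label values 0 to 3 are taken from the text.

```python
import numpy as np

# CLM pixel labels as listed in the text.
CLEAR_OCEAN, CLEAR_LAND, CLOUDY, NO_DATA = 0, 1, 2, 3

def split_clm(clm):
    """Return boolean (clear, cloudy, valid) masks for an integer CLM array.

    Hypothetical helper: the name and return convention are assumptions.
    """
    clm = np.asarray(clm)
    valid = clm != NO_DATA
    cloudy = clm == CLOUDY
    clear = (clm == CLEAR_OCEAN) | (clm == CLEAR_LAND)
    return clear, cloudy, valid

# Tiny synthetic example containing one pixel of each label.
clm = np.array([[0, 1], [2, 3]])
clear, cloudy, valid = split_clm(clm)
```

Keeping a separate `valid` mask allows no-data pixels to be excluded from any later statistics instead of being silently counted as clear.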

Study regions
To evaluate the threshold-based cloud mask algorithm and study its performance for different surface types and synoptic conditions, we have selected the following four regions in and around Europe: (1) Atlantic, (2) the Alps, (3) Upper Rhine Valley and (4) Spain (see Fig. 2). Each of the four regions comprises 192 × 192 HRV pixels (or 64 × 64 LRES pixels) to provide enough data points to calculate representative histograms. This size is a trade-off: a smaller region would offer the advantage of a smaller surface variability, but would contain fewer data points. To illustrate the improvements gained by applying thresholds relative to a clear sky composite, we have focused on regions with a relatively high spatial surface variability, such as the Alps and the Upper Rhine Valley.

Methods and algorithms
For applying a binary classification to separate cloudy and clear sky pixels, we rely on the following simplifying assumption. In general, clouds have a higher HRV reflectance compared to the clear sky surface and thus appear brighter. Counter-examples include snow-covered surfaces, enhanced clear-sky reflectances due to aerosols, and cloud shadows, but are neglected here. A perfect classification would allow an exact separation between clear sky and cloudy reflectances based on a single reflectance threshold.

The first step in this study is the separation of clear sky and cloudy HRV pixels based on the EUMETSAT cloud mask. The latter includes multiple solar and thermal threshold tests. As it is not based on a fixed reflectance threshold, clear sky pixels can have a higher HRV reflectance than cloudy pixels. Consequently, there is an overlap between the histograms of the clear sky and the cloudy reflectances (Fig. 3). The normalized frequency distributions of clear sky and cloudy HRV reflectances are shown for the regions Spain (yellow), Upper Rhine Valley (green), the Alps (red) and Atlantic (blue). The clear-sky histograms have been calculated as averages over a 16 day period (1-16 June 2011). The figure shows the lowest average clear sky reflectance and the lowest variability over the Atlantic region. The clear sky reflectance histogram over Spain reveals a high average value and simultaneously a high spatial variability. Note the overlap between the clear and cloudy histograms.
Several factors can cause the overlap in the HRV reflectance histograms. These include the spatial and temporal variability of the surface reflectance, e.g. due to changes in vegetation, and atmospheric aerosols. Additionally, undetected thin cirrus clouds with low visible reflectance can contaminate the HRV clear sky histogram. This overlap is the major source of uncertainty for our cloud mask. The challenge is thus to reduce this overlap, and to find an optimal threshold to obtain the best classification.

Cloudfree composites
To reduce the uncertainties caused by spatially varying surface reflectance, we apply the thresholds for cloud detection relative to a composite map of clear sky HRV reflectance. These maps are derived initially based on the EUMETSAT cloud mask. Due to the lower SEVIRI standard resolution, it is necessary to upsample the EUMETSAT cloud mask to the 3 times higher resolution of the HRV channel. This is done using nearest-neighbor interpolation. Each clear sky composite is calculated as the average of all clear sky reflectances observed during a 16 day period. The length of this period seems appropriate to ensure relatively constant surface conditions and a high likelihood of finding at least one cloud-free observation for each pixel. Nevertheless, pixels with no clear sky observation can occur due to persistent clouds. Such HRV pixels are always reported as cloudy.
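The two steps described above, nearest-neighbor upsampling of the LRES mask and per-pixel averaging of clear sky reflectances, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the `(time, y, x)` array layout and the use of NaN for pixels without any clear observation are all assumptions made here.

```python
import numpy as np

def upsample_nearest(lres, factor=3):
    """Nearest-neighbor upsampling: each LRES pixel becomes a factor x factor block."""
    return np.repeat(np.repeat(lres, factor, axis=0), factor, axis=1)

def clear_sky_composite(refl, clear):
    """Per-pixel mean of clear sky reflectances over time.

    refl  : float array (time, y, x), HRV reflectance
    clear : bool array (time, y, x), True where the upsampled mask is clear
    Returns the composite (NaN where no clear observation exists) and a
    boolean map of pixels that never had a clear observation.
    """
    n_clear = clear.sum(axis=0)
    total = np.where(clear, refl, 0.0).sum(axis=0)
    composite = np.full(refl.shape[1:], np.nan)
    np.divide(total, n_clear, out=composite, where=n_clear > 0)
    return composite, n_clear == 0

# Synthetic example: two time slots, one LRES row with a clear and a cloudy pixel.
lres_clear = np.array([[[True, False]], [[True, False]]])     # (time, 1, 2)
clear = np.stack([upsample_nearest(m) for m in lres_clear])   # (2, 3, 6)
refl = np.stack([np.full((3, 6), 0.10), np.full((3, 6), 0.20)])
composite, never_clear = clear_sky_composite(refl, clear)
```

In this example the left 3 × 3 block receives the mean of its two clear observations, while the right block has no clear observation and is flagged in `never_clear`, corresponding to the pixels that the text says are always reported as cloudy.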
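Instead of subtracting the clear sky composite from the observed reflectance, an anomaly map $r'(x, t)$ is created for each 192 × 192 pixel region using Eq. (1):

$$ r'(x, t) = r_{\mathrm{obs}}(x, t) - \left( \langle r_{\mathrm{cs}}(x) \rangle_{t} - \langle r_{\mathrm{cs}} \rangle_{t,x} \right) \tag{1} $$

Here, $r_{\mathrm{obs}}(x, t)$ is the observed HRV reflectance field at time $t$ during a specific period. The subtrahend $\langle r_{\mathrm{cs}}(x) \rangle_{t} - \langle r_{\mathrm{cs}} \rangle_{t,x}$ consists of two parts. The first term $\langle r_{\mathrm{cs}}(x) \rangle_{t}$ is the spatially resolved time average of the clear sky reflectance, while the second term $\langle r_{\mathrm{cs}} \rangle_{t,x}$ is the spatial and temporal average of the clear sky reflectance.

Figure 4 demonstrates the effects of reducing the spatial variability by applying the clear sky composite anomaly map. The solid line shows the histogram of the clear sky HRV reflectance over Spain observed by MSG SEVIRI on 15 July 2011 and derived by applying the EUMETSAT cloud mask. The dotted line represents the normalized clear sky histogram. The effect of this treatment is illustrated by the green arrows in Fig. 4. The distribution indeed becomes narrower, which indicates that our method is capable of compensating for the spatial variability of the underlying surface reflectance. This method therefore reduces the overlap between the cloudy and clear histograms and the associated uncertainty of the HRV cloud mask.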
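The anomaly computation of Eq. (1) can be sketched in a few lines. This is a hedged illustration: the function name is hypothetical, and the regional mean is approximated here by the spatial mean of the composite, which equals the spatial and temporal average only if every pixel contributes the same number of clear sky observations.

```python
import numpy as np

def anomaly_map(r_obs, composite):
    """Observed reflectance minus the local deviation of the clear sky composite.

    r_obs     : float array (y, x), observed reflectance at one time slot
    composite : float array (y, x), time-averaged clear sky reflectance
    """
    # Assumption: spatial mean of the composite stands in for the
    # spatial and temporal clear sky average of the region.
    regional_mean = np.nanmean(composite)
    return r_obs - (composite - regional_mean)

# A dark and a bright surface pixel; the bright one is additionally cloudy.
composite = np.array([[0.10, 0.30]])
r_obs = np.array([[0.10, 0.70]])
anom = anomaly_map(r_obs, composite)
```

Because only the deviation from the regional mean is removed, a clear pixel is mapped to the regional mean reflectance rather than to zero, so the resulting values stay on a scale comparable to the original reflectances.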

Optimal threshold
The threshold for detecting cloudy pixels should maximize the quality of our classification. It is thus necessary to compare it to reference data, and to define suitable quality criteria for assessing its accuracy. The four possible outcomes for comparing two binary classifications are listed in the contingency table (Table 1). In this study the EUMETSAT cloud mask is used as reference, and the predicted class of the HRV cloud mask depends on the selected reflectance threshold. Due to the use of prior information, this method corresponds to a supervised classification algorithm. For determining the optimal threshold, a suitable measure is sought that combines the frequencies of the four outcomes into one scalar quantity.
One measure which meets our requirements is the Matthews Correlation Coefficient (MCC, Matthews, 1975). Like the Pearson correlation coefficient for the continuous case, it quantifies the correlation between two binary variables in a range from −1 to 1, with 1 corresponding to perfect agreement.
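The MCC is defined as follows:

$$ \mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{2} $$

Here, $TP$, $TN$, $FP$ and $FN$ are the numbers of true positives, true negatives, false positives and false negatives, i.e. the four entries of the contingency table, so that the MCC can be calculated directly from the contingency table using Eq. (2). One advantage of the MCC is its insensitivity to the frequency of both classes. This ensures that our cloud mask performs well in regions and seasons with low, medium and high frequency of clouds. The threshold which corresponds to the maximum of the MCC is chosen as optimal.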
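The threshold selection by maximizing the MCC can be sketched as a simple search over candidate thresholds. This is a minimal sketch, not the authors' code: the function names, the candidate grid and the convention of returning 0 when the denominator vanishes are assumptions, and "cloudy" is treated as the positive class.

```python
import numpy as np

def mcc(pred, ref):
    """Matthews correlation coefficient for two boolean arrays (True = cloudy)."""
    pred = np.asarray(pred, dtype=bool).ravel()
    ref = np.asarray(ref, dtype=bool).ravel()
    tp = float(np.sum(pred & ref))
    tn = float(np.sum(~pred & ~ref))
    fp = float(np.sum(pred & ~ref))
    fn = float(np.sum(~pred & ref))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: MCC is set to 0 if any marginal sum is zero.
    return (tp * tn - fp * fn) / denom if denom > 0 else 0.0

def optimal_threshold(values, ref, candidates):
    """Return the candidate threshold with the highest MCC, and that MCC."""
    scores = [mcc(values > t, ref) for t in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

# Synthetic reflectances with a clean gap between clear and cloudy pixels.
values = np.array([0.05, 0.10, 0.15, 0.40, 0.50, 0.60])
ref = np.array([False, False, False, True, True, True])
t_best, score = optimal_threshold(values, ref, [0.0, 0.1, 0.2, 0.45, 0.7])
```

In this synthetic case the threshold 0.2 separates both classes perfectly and is selected. With real data the histograms overlap, so the maximum MCC stays below 1 and a finer candidate grid would be appropriate.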
The flow chart in Fig. 5 visualizes the HRV cloud mask algorithm. The first step is the calculation of the clear sky composite. Because the clear sky composite reduces the surface reflectance variability, and with it the overlap between the histograms, a relative threshold can be selected. This relative threshold is then applied to the normalized reflectance field. The result is the HRV cloud mask. This resulting cloud mask is used as new input for the whole procedure. The selection of the relative threshold becomes stable after the third iteration.

A final processing step, the thin cloud restoral, is introduced to account for thin clouds. The detection of thin cirrus clouds solely based on the broadband information from the HRV channel is difficult. As it is our aim to determine an HRV cloud mask which is complementary to and consistent with the EUMETSAT cloud mask, we redefine 3 × 3 clear sky HRV pixel blocks as cloudy if the corresponding pixel of the LRES EUMETSAT cloud mask detects clouds. This is done in recognition of the fact that the HRV mask is likely to miss some clouds which can be detected using the full range of spectral channels available to the EUMETSAT cloud mask.
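The thin cloud restoral lends itself to a compact array formulation. The following is a minimal sketch assuming that the HRV mask is exactly 3 times the size of the LRES mask and aligned with it; the function name and the boolean array conventions are illustrative assumptions.

```python
import numpy as np

def thin_cloud_restoral(hrv_cloudy, lres_cloudy, factor=3):
    """Set fully clear factor x factor HRV blocks to cloudy where the LRES mask is cloudy.

    hrv_cloudy  : bool array (factor*ny, factor*nx), thresholded HRV mask
    lres_cloudy : bool array (ny, nx), reference mask at LRES resolution
    """
    ny, nx = lres_cloudy.shape
    # View the HRV mask as (block row, row in block, block column, column in block).
    blocks = hrv_cloudy.reshape(ny, factor, nx, factor)
    block_clear = ~blocks.any(axis=(1, 3))   # no cloudy HRV pixel in the block
    restore = block_clear & lres_cloudy      # reference cloudy, HRV block fully clear
    restore_hrv = np.repeat(np.repeat(restore, factor, axis=0), factor, axis=1)
    return hrv_cloudy | restore_hrv

# Left block fully clear, right block with a single cloudy HRV pixel.
hrv = np.zeros((3, 6), dtype=bool)
hrv[1, 4] = True
lres = np.array([[True, True]])
restored = thin_cloud_restoral(hrv, lres)
```

Only the fully clear left block is restored to cloudy. The right block keeps its single cloudy pixel, since the restoral leaves partly cloudy blocks untouched.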

Results and discussion
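In order to assess the quality of the HRV cloud mask, some aspects related to its accuracy are investigated and discussed in this section. The EUMETSAT cloud mask is used as reference, both to support the consistency of the two cloud masks and due to the lack of other suitable reference data. For this analysis, the four regions shown in Fig. 2 are used as typical examples for different surface types.

First, the cloud detection frequency for pixels in the EUMETSAT cloud mask has been determined as a function of the number of cloudy pixels identified by the HRV cloud mask algorithm in the corresponding 3 × 3 HRV pixel blocks. Results have been aggregated for each region over the time period from 1 July 2011 until 16 August 2011. The result of this comparison is plotted in Fig. 6. For completely cloudy HRV pixel blocks, we find 100 % agreement with the corresponding EUMETSAT cloud mask classification. In contrast, 10 % of all completely clear HRV pixel blocks are actually identified as cloudy by the EUMETSAT cloud mask. Closer inspection of several corresponding scenes revealed that this deviation is mainly caused by optically thin cirrus clouds, which are not detected by the HRV channel due to their low reflectance. These cases are addressed by the thin cloud restoral and motivated its inclusion in the algorithm. Although the thin cloud restoral works well in general, some artifacts can occur under specific circumstances, as described below.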
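One way to quantify the consistency of the two masks is to tabulate, for each possible number of cloudy HRV pixels in a 3 × 3 block, how often the corresponding LRES reference pixel is cloudy (the kind of statistic plotted in Fig. 6). The sketch below is an illustration under assumptions: the function name and array conventions are hypothetical, and the HRV mask is taken before any thin cloud restoral.

```python
import numpy as np

def detection_frequency_by_block_count(hrv_cloudy, lres_cloudy, factor=3):
    """Fraction of LRES-cloudy pixels for each count (0..factor**2) of cloudy HRV pixels."""
    ny, nx = lres_cloudy.shape
    # Number of cloudy HRV pixels inside each factor x factor block.
    counts = hrv_cloudy.reshape(ny, factor, nx, factor).sum(axis=(1, 3)).ravel()
    ref = lres_cloudy.ravel().astype(float)
    n_bins = factor * factor + 1
    n_blocks = np.bincount(counts, minlength=n_bins)
    n_cloudy = np.bincount(counts, weights=ref, minlength=n_bins)
    freq = np.full(n_bins, np.nan)           # NaN where a count never occurs
    np.divide(n_cloudy, n_blocks, out=freq, where=n_blocks > 0)
    return freq

# One fully clear block (reference clear) and one fully cloudy block (reference cloudy).
hrv = np.zeros((3, 6), dtype=bool)
hrv[:, 3:] = True
lres = np.array([[False, True]])
freq = detection_frequency_by_block_count(hrv, lres)
```

In practice the counts would be accumulated over many time slots per region before dividing, so that each of the ten bins is populated.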
To compare our results with the EUMETSAT cloud mask, the latter mask has been upscaled to HRV resolution. Figure 7 shows the HRV reflectance, the 8.7 µm IR radiance and results from the EUMETSAT and HRV cloud masks for one particular case observed over Spain. This is an extreme example where the HRV reflectance misses a large number of cloudy pixels corresponding to thin cirrus clouds, and demonstrates that the capabilities of the HRV channel for detecting thin cirrus clouds are limited. When considering the 8.7 µm channel, the thin cirrus clouds can clearly be recognized in the north-western corner of the region.
The thin cloud restoral redefines a 3 × 3 pixel block as cloudy only if the entire block is detected as clear by the threshold algorithm. This approach is problematic for situations where small-scale low level clouds occur underneath a larger cirrus cloud. This effect is visible in Fig. 7. The EUMETSAT cloud mask and the 8.7 µm image indicate large cloud coverage due to cirrus. Some brighter pixels appear in the northern section of the HRV image, which are likely caused by small convective clouds. In the vicinity of these clouds, unrealistic gaps in cloud coverage occur.
The HRV channel is not used by the EUMETSAT cloud detection algorithm. To demonstrate that its broad spectral response is still suitable for an accurate threshold-based cloud detection, we have compared the results of our algorithm applied to the HRV channel to those obtained with the narrowband channels at 0.6 µm and 0.8 µm wavelength and at LRES spatial resolution. For this purpose, the HRV channel is simulated as a linear combination of the 0.6 and 0.8 µm reflectances as proposed by Cros et al. (2006), using the regression coefficients reported by Deneke and Roebeling (2010). The accuracy of the cloud mask applied to the simulated HRV channel lies between those achieved with the 0.6 and 0.8 µm channels. Over ocean, differences are small, and the best accuracy is found for the 0.8 µm channel, as it is slightly darker than the other channels. Over land, the best results are obtained with the 0.6 µm channel, but the accuracy of the simulated HRV signal is only slightly lower. Over vegetated surfaces, the 0.8 µm channel exhibits a significantly lower skill, while the relatively bright surface over Spain causes an overall degradation of detection accuracy.

Table 2 summarizes the results of our threshold algorithm and lists the final deviations versus the EUMETSAT cloud mask. Based on the EUMETSAT cloud mask, all regions but Spain have a high average cloud cover ranging from 75 to 85 %. The Atlantic (1) is a region with frequent passages of frontal systems. The Alps (2) and Upper Rhine Valley (3) are characterized by orographically induced convection. Note that the 12:00 UTC time slot is used for this study, which implies a high level of solar irradiance and thus a well-mixed convective boundary layer (Driedonks, 1982). In contrast, the cloud coverage over northern Spain is relatively low. The large observed differences in average cloud cover for the four regions illustrate the importance of choosing a threshold selection scheme which is insensitive to the relative occurrence frequencies of both classes (see Sect. 3.2).
The high standard deviation of the clear sky reflectance over the Alps and Spain underlines the high spatial variability of the surface over these regions. This finding correlates with a strong reduction of the threshold from $t_{\mathrm{abs}}$ to $t_{\mathrm{rel}}$. Here, $t_{\mathrm{abs}}$ is the absolute threshold determined without using the clear sky composite information, while $t_{\mathrm{rel}}$ is the threshold relative to the clear sky composite. Both the lower threshold $t_{\mathrm{rel}}$ compared to $t_{\mathrm{abs}}$, and the resulting lower deviation $\mathrm{Dev}_{\mathrm{rel}}$ compared to $\mathrm{Dev}_{\mathrm{abs}}$, confirm that our choice of using a clear sky composite improves the separability of clear and cloudy radiances, and thus results in an overall improved classification accuracy.
Even though the clear sky variability over the Atlantic region is very low with a standard deviation of only 0.007, there is still a significant spread of 0.013 between $t_{\mathrm{abs}}$ and $t_{\mathrm{rel}}$. This effect is caused by the initial clear sky composite, which includes some brighter pixels corresponding to small clouds missed by the EUMETSAT cloud mask. By applying our iteration scheme, these bright pixels are filtered out, as is reflected by a reduced standard deviation of the clear sky reflectance.
$\mathrm{Dev}_{\mathrm{fi}}$ is the final deviation after applying the thin cloud restoral. The difference between $\mathrm{Dev}_{\mathrm{fi}}$ and $\mathrm{Dev}_{\mathrm{rel}}$ is lowest over the Alps (1.7 %) and the Upper Rhine Valley (2.3 %). This is probably due to the high amount of clouds in general, and convective clouds in particular, over these regions, which limits the applicability of the thin cloud restoral. On the other hand, a strong impact of the thin cloud restoral on the deviation of the final HRV cloud mask is found for the Atlantic and Spain. Sample scenes such as Fig. 7 indicate a relatively high amount of thin clouds over these regions.
To identify the overall effect of including the HRV channel for cloud masking, Fig. 9 compares the cloud coverages obtained from the EUMETSAT and HRV cloud masks for the three land regions. In the majority of cases, the average HRV cloud coverage lies below the cloud coverage of the EUMETSAT cloud mask. This systematic difference is composed of two effects. First, thin cirrus clouds are missed by the HRV cloud mask, but this is at least partly corrected for by the thin cloud restoral. Second, SEVIRI LRES pixels which are counted as completely cloud filled by the EUMETSAT cloud mask are identified as broken by the HRV cloud mask. While it is impossible to separate both effects without independent reference data, the latter seems to dominate.
The blue error bars indicate the fraction of deviating classifications between both cloud masks found for each time slot. Generally, the fraction rises with increasing cloud amount until it reaches a value of about 80 %, where it starts to fall again. Thus, partly cloudy conditions cause the highest deviations. This result is expected, because the HRV cloud mask can gain additional information about partly cloud filled pixels and cloud edges due to its higher spatial resolution.

Summary and conclusions
In this study we have presented and evaluated a robust threshold-based HRV cloud mask which is based on the EUMETSAT cloud mask and extends it to a 3-fold higher spatial resolution while maintaining consistency with its results. The optimal threshold to differentiate between clear sky and cloudy radiances is chosen by maximizing the Matthews Correlation Coefficient (MCC), a quality measure for binary classifications which is not influenced by the ratio of cloudy to clear pixels, to ensure best agreement of our cloud mask with the EUMETSAT cloud mask. Clear sky anomaly maps are used to account for regions with high variability in surface reflectance. As a result, the overlap in the clear and cloudy histograms, and thus the uncertainty in the classification, is significantly reduced. An iterative approach is chosen to include the HRV cloud mask information in the calculation of the clear sky anomaly maps, with convergence generally achieved after three iterations.
A thin cloud restoral is applied to account for clouds, e.g. thin cirrus, that are not detected by the HRES visible channel, in order to ensure that the HRV cloud mask results are consistent with the EUMETSAT cloud mask. Completely clear 3 × 3 HRV pixel blocks are redefined as cloudy if the corresponding LRES pixel is reported as cloudy in the EUMETSAT cloud mask. Some artifacts remain after this cloud restoral, which are explained and illustrated in Fig. 7.
On average, 10 % of all 3 × 3 clear sky HRV pixel blocks are missed by our threshold test and restored to cloudy pixels, which occurs mainly for thin cirrus clouds. Our results indicate that the HRV cloud mask performs very reliably in cloudy conditions. The frequency of cloudy LRES pixels which are found to be broken in our data set is 16 %. The highest frequency, 24.3 %, occurs over the Alps and the lowest over the Atlantic (4.6 %). The fraction of broken pixels reaches 15.5 % over Spain and 19.4 % over the Upper Rhine Valley. The high values over the Upper Rhine Valley and the Alps are expected and underline the frequent occurrence of small-scale convective cumulus clouds over these regions. Deviations between the EUMETSAT cloud mask and the HRV cloud mask after thin cloud restoral occur for 5.8 % of the HRV pixels. This deviation results from an overestimate of the cloud fraction due to partially cloudy HRV pixel blocks which are reported as completely cloudy by the EUMETSAT cloud mask.
This HRV-based cloud mask is part of our wider effort to extend the Cloud Physical Properties retrieval (Roebeling et al., 2006, 2008) to the high spatial resolution of the HRV channel, including an estimate of cloud optical thickness (Carbajal Henken et al., 2011) and other cloud properties. It also offers the possibility to apply the cloud mask as a tool to study the geometric size of convective clouds, including their temporal evolution, in the future. Similar approaches will be essential to optimally utilize the data from future satellite missions such as METEOSAT Third Generation, whose imager has different spatial resolutions for the solar and infrared channels.
References
Ipe, A., Clerbaux, N., Bertrand, C., Dewitte, S., and Gonzalez, L.: Pixel-scale composite top-of-the-atmosphere clear-sky reflectances for Meteosat-7 visible data, J. Geophys. Res., 108, 4612, doi:10.1029/2002JD002771, 2003.
Klüser, L., Rosenfeld, D., Macke, A., and Holzer-Popp, T.: Observations of shallow convective clouds generated by solar heating of dark smoke plumes, Atmos. Chem. Phys., 8

Fig. 1. Normalized spectral response functions of the Meteosat-9 SEVIRI radiometer for the 0.6 µm (red), 0.8 µm (green) and HRV (black) channels. The central wavelength of each channel is marked by a thick colored line, and the spectral region covered by the channel width has been shaded. The solar spectrum is added as a dotted line.

Table 1. Contingency table for the binary classification into cloudy and clear.

Table 2. Results of the HRV cloud mask algorithm averaged over three 16-day periods starting 1 June 2011, 1 July 2011 and 1 August 2011. The four regions considered are listed in column 1. Cc is the average cloud cover, and $r_{\mathrm{cs}}$ is the average HRV clear sky reflectance, given together with its standard deviation std($r_{\mathrm{cs}}$). Columns 4 and 5 report the cloud detection thresholds above which a pixel is classified as cloudy: $t_{\mathrm{abs}}$ is the absolute threshold without use of the HRV clear sky reflectance composite, while $t_{\mathrm{rel}}$ is the threshold relative to the composite. The percentage deviations between the HRV and the EUMETSAT cloud mask are given in columns 6-8. Here, $\mathrm{Dev}_{\mathrm{abs}}$ and $\mathrm{Dev}_{\mathrm{rel}}$ are the deviations found using $t_{\mathrm{abs}}$ and $t_{\mathrm{rel}}$, respectively. Column 8 lists the final deviation $\mathrm{Dev}_{\mathrm{fi}}$ after applying the HRV clear sky composite and thin cloud restoral.