Automatic cloud classification of whole sky images

A. Heinle, A. Macke, and A. Srivastav
Excellence Cluster "The Future Ocean", Department of Computer Science, Kiel University, Kiel, Germany
Leibniz Institute of Marine Sciences at Kiel University (IFM-GEOMAR), Kiel, Germany
Received: 21 October 2009 – Accepted: 22 December 2009 – Published: 27 January 2010
Correspondence to: A. Heinle (ahe@informatik.uni-kiel.de)
Published by Copernicus Publications on behalf of the European Geosciences Union.


Introduction
Clouds are one of the most important drivers of the Earth's heat balance and hydrological cycle, and at the same time one of the least understood. It is well known that low clouds provide a negative feedback and high, thin clouds a positive feedback on the radiation budget. The net effect of clouds, however, is still unknown, and they cause large uncertainties in climate models and climate predictions (Solomon et al., 2007).
The effect of clouds on solar and terrestrial radiation is due to reflection and absorption by cloud particles and depends strongly on the volume, shape, thickness and composition of the clouds. Large-scale cloud information is available from several satellites, but such data is provided at low resolution and may contain errors. For example, small clouds are often overlooked due to the limited radiometer field of view, and low or thin clouds and the surface are frequently confused because of their similar brightness and temperature (Ricciardelli et al., 2008; Dybbroe et al., 2005). Additionally, the solar radiation reaching the ground cannot be related to the prevailing cloud type, even though this is essential for cloud-radiation studies.
Nowadays, ground-based imaging devices are commonly used to support satellite studies (Cazorla et al., 2008; Feister and Shields, 2005; Sakellariou et al., 1995). One of the best-known developers of such instruments is the Scripps Institution of Oceanography at the University of California San Diego. Their Whole Sky Imagers are constructed to measure sky radiance in diverse wavelength bands (visible spectrum and near infrared) across the whole hemisphere (Shields et al., 1998, 2003). Due to the high-quality components involved, these imagers are often too expensive for small research groups. Therefore, as a cost-effective alternative, a few research institutions in several countries have developed non-commercial sky cameras for their own requirements (Pagès et al., 2002; Seiz et al., 2002; Pfister et al., 2003; Souza-Echer et al., 2006; Kalisch and Macke, 2008). In most cases an upward-looking fisheye objective is used to image the whole sky with a field of view (FOV) of about 180°. Individual algorithms to automatically estimate cloud cover already exist for many of them (Pfister et al., 2003; Long et al., 2006; Kalisch and Macke, 2008). Automatic cloud type recognition, however, is still under development, and few papers have been published on that subject.
In one prior study, Singh and Glennen (2005) present an approach to cloud classification for common digital images (without 180° FOV) to be utilized in air traffic control. Numerous features were extracted and used to distinguish five different sky conditions, but the authors acknowledge their results as modest. Another recent paper (Calbó and Sabburg, 2008) introduces some possible criteria for classifying sky images into eight predefined sky conditions, including statistical features, features based on the Fourier transform, and features that require a prior distinction between clear and cloudy pixels. However, the classifier is based on a very simple classification method and achieves an accuracy of only 62%. Other publications handle simpler issues such as the estimation of cloud base height or the identification of thin and thick cloudiness (e.g. Seiz et al., 2002; Kassianov et al., 2005; Long et al., 2006; Cazorla et al., 2008; Parisi et al., 2008). Parisi et al. (2008) in particular report that they were not able to classify cloud type.
The objective of this study is the development of a fully automated algorithm classifying all-sky images in real time with high accuracy. The cloud camera and the associated imager data are introduced in the following section. In Sect. 3 we present the features used to classify cloud types as well as the algorithm, a k-nearest-neighbour (kNN) classifier that assigns each pre-processed image, according to its feature vector, to one of seven different sky conditions. The performance and results of the algorithm are discussed in Sect. 4, and Sect. 5 contains the summary and proposals for future research.

Camera
The images used to develop the algorithm were obtained by one of two cloud cameras constructed to enable cost-effective continuous sky observations for radiative transfer research at the Leibniz Institute of Marine Sciences at the University of Kiel (IFM-GEOMAR). These all-sky imagers are based on commercially available components and are designed to be location-independent and to run during adverse weather conditions, as one of them operates primarily onboard a research vessel. The basic component is a digital camera equipped with a fisheye lens providing a field of view larger than 180°, enclosed in a water- and weather-resistant box. In order to obtain a high temporal resolution, the cameras are programmed to acquire one image every 15 s, stored in 30-bit color JPEG format with a maximal resolution of 3648×2736 pixels. The images are therefore rectangular in shape, but the mapped sky is circular, with the zenith at the center and the horizon along the border (spherical projection, see Fig. 1). More details about the cameras and their usage can be found in Kalisch and Macke (2008).

Images
For the development of the cloud type classification algorithm, images with a resolution of 2272×1704 pixels, captured during a transit of the German research vessel "Polarstern" from Germany to South Africa in autumn 2007 (ANT XXIV/1), are used (Schiel, 2009). In the course of this expedition, different climate zones and seasons were crossed, and therefore the acquired data covers a wide range of possible sky conditions and solar zenith angles.
To create the image set required for feature search and later training of the cloud type classifier, we screened the complete data set and selected approximately 1500 all-sky images from the 75 000 obtained onboard in total. The selection procedure focused on temporal independence and uniqueness with respect to our predefined cloud classes (see next section). Furthermore, we ensured that the final image set includes a large variety of different cloud forms as well as images taken at different times of day and consequently at different solar zenith angles.
The training set generated in this fashion, called TRAIN, contains about 200 independent images per cloud class.

Algorithm
In this section, the individual cloud classes are presented, followed by an introduction to the methodology of the applied classifier, the kNN classifier. We then describe the pre-processing of the imager data and explain the integrated features as well as the feature selection method.

Cloud classes
In contrast to other publications handling automated cloud classification, we use phenomenological classes separated according to the International Cloud Classification System (ICCS) published in WMO (1987). Therein, ten genera are defined, which represent the basis of our classification.
Based on visual similarity we combined some genera (altostratus and stratus, cirrocumulus and altocumulus, cumulonimbus and nimbostratus) to avoid systematic misclassifications. Additionally, we merged the genera cirrus and cirrostratus due to the lack of available data showing the latter, as well as the difficulty of detecting very thin clouds, such as some kinds of cirrostratus. Note also that the class of clear sky includes not only images without clouds, but also images with cloudiness below 10%. Despite these generalizations, the resulting classes (see Table 1) represent a suitable partitioning of possible sky conditions and are especially useful for radiation studies. In order to simplify the application of the cloud classes, each is labeled with an individual identification number (see also Table 1).

Classifier
To classify the images described in Sect. 2, the k-nearest-neighbour (kNN) method is chosen, which belongs to the supervised, non-parametric classifiers (Duda and Hart, 2001). "Supervised" means that the separating classes are known and a training sample is used to train the classifier. Non-parametric classifiers in general do not assume an a-priori probability distribution. Compared with other classifiers, the kNN method is very simple (and therefore associated with only low computational costs) and at the same time quite powerful (Serpico et al., 1996; Vincent and Bengio, 2001; Duda and Hart, 2001). Even in the specific field of cloud type recognition, some results for comparison with linear classifiers and neural networks exist, underlining the high performance of kNN classifiers (Singh and Glennen, 2005; Christodoulou et al., 2003).

kNN classifier. The assignment of an image to a specific class using kNN classifiers is performed by majority vote. After pre-processing, several spectral and textural features are extracted from an image. In the next step, the computed and normalized feature vector x is compared with the known feature vectors x_i of each element in the training data by means of a distance measure, in our case the Manhattan distance

$$ d_1(x, x_i) = \sum_{l=1}^{d} \left| x^{(l)} - x_i^{(l)} \right|, $$

where d is the dimension of the feature vector. The class associated with the majority of the k closest matches determines the unknown class. In the case that this majority is not unique, the training datum with the absolute smallest distance to the unknown image specifies the target class. Therefore, the composition of the training sample and a meticulous selection of suitable images are of great importance.
Complexity. The kNN classifier is often criticized for slow runtime performance and large memory requirements (in other words, high time and space complexity, respectively).
The time complexity of an algorithm is a measure of how much computation time is needed to run the algorithm and thus depends on the number of calculation steps. In the case of image classification, this measure refers to the computational expense of classifying an unknown image. Using the kNN classifier, the distances between the feature vector of this image and each of the n members of the training sample are required. These distances depend on the dimension d of the feature vector, and we get a total complexity of O(nd) (here n = 1497 and d = 12). Since kNN methods store the set of prototypes in memory once, the space complexity of such an algorithm is O(nd) as well.
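To make the decision rule concrete, the following minimal Python sketch (the study's own implementation was written in IDL, see Sect. 4) classifies one normalized feature vector by majority vote over the k nearest training vectors under the Manhattan distance; the function name and array layout are our own, and the tie-breaking follows the rule described above.

```python
import numpy as np

def knn_classify(x, train_features, train_labels, k=3):
    """Assign x to a class by majority vote among the k training
    vectors with the smallest Manhattan (L1) distance to x."""
    # L1 distances to all n training vectors: O(n * d) operations
    dists = np.abs(train_features - x).sum(axis=1)
    nearest = np.argsort(dists)[:k]
    votes = train_labels[nearest]
    classes, counts = np.unique(votes, return_counts=True)
    winners = classes[counts == counts.max()]
    if len(winners) == 1:
        return winners[0]
    # Tie: the training element closest to x decides the class
    return train_labels[nearest[0]]
```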

Pre-processing
To obtain suitable features for separating the defined classes, it is necessary to eliminate some areas of the analysed raw images, as they are rectangular in shape while the interesting part, the mapped sky, is circular. Furthermore, our analyses showed that it is useful to segment the images into clear and cloudy areas before calculating features.
Therefore, an image mask, constructed by visually identifying image regions containing confounding information, is applied first. The mask assigns the detected sections, as well as completely white pixels (such as those displaying the sun), to the background by setting all corresponding pixel values to zero. Afterwards the remaining area is divided pixel by pixel into clear and cloudy regions, utilizing the red and blue pixel values.
In a clear atmosphere (without aerosols), more blue than red light is scattered by gas molecules, which is why clear sky appears blue to our eyes. In contrast, clouds (containing particles like aerosols, water droplets and/or ice crystals) scatter blue and red light to a similar extent, causing them to appear white or grey (Petty, 2006). Therefore, image regions with clear sky show relatively lower red pixel values than regions showing clouds, and the ratio R/B may be used to differentiate these areas. A separating threshold, whose exact value depends on both the camera used and the prevailing atmospheric conditions, has to be determined; suitable values are discussed in several papers handling cloud cover estimation (e.g. Pfister et al., 2003; Long et al., 2006). However, during the testing phase we noticed problems in detecting thick clouds and classifying circumsolar pixels at the same time. Therefore we modified the criterion and considered the difference R − B instead of the ratio R/B. Comparisons showed that segmentation using such a difference threshold still produces minor errors, but outperforms the ratio method. For our application the value R − B = 30 is optimal (see Fig. 2).
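A minimal sketch of this segmentation step, assuming 8-bit RGB input in which masked background pixels have already been set to zero; the function name, the validity test and the default threshold of 30 (taken from the text) are illustrative choices.

```python
import numpy as np

def segment_sky(rgb, threshold=30):
    """Split an all-sky image into cloudy and clear pixels with the
    R - B difference criterion (R - B >= threshold -> cloudy)."""
    r = rgb[..., 0].astype(np.int16)  # signed to avoid uint8 wrap-around
    b = rgb[..., 2].astype(np.int16)
    valid = rgb.sum(axis=-1) > 0      # ignore masked (zeroed) pixels
    cloudy = (r - b >= threshold) & valid
    clear = valid & ~cloudy
    return cloudy, clear
```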

Features used
Out of numerous features tested (for example, features describing edges or color, features considering the run-length of primitives, their quantity or frequency, or features describing the texture of an image), we selected 12 features for application (see below). The choice of these features is based on their Fisher distances F^x_ij, a selection criterion used in cloud classification work relating to satellite imagery (Pankiewicz, 1995). It is defined as

$$ F^x_{ij} = \frac{\left| \mu^x_i - \mu^x_j \right|}{\sqrt{(\sigma^x_i)^2 + (\sigma^x_j)^2}}, $$

where μ^x_i and μ^x_j are the means of feature x with respect to classes i and j, and σ^x_i and σ^x_j are the corresponding standard deviations. The features best suited to separate the defined classes are those with the largest Fisher distances F^x_ij. It should be noted, however, that the feature set chosen in this way has to guarantee the separation of all classes. That means that features with smaller Fisher distances have to be included in the final set as well, if they discriminate classes which are not separated by other features with higher distances.
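A short sketch of this selection criterion under the definition reconstructed above (the exact normalization of the original formula did not survive extraction, so a standard form is assumed); feat_i and feat_j are assumed to hold the values of one feature for all training images of classes i and j.

```python
import numpy as np

def fisher_distance(feat_i, feat_j):
    """Fisher distance of one feature between two classes: distance
    of the class means, normalized by the pooled spread."""
    mu_i, mu_j = feat_i.mean(), feat_j.mean()
    s_i, s_j = feat_i.std(), feat_j.std()
    return abs(mu_i - mu_j) / np.sqrt(s_i**2 + s_j**2)

# Selection idea: prefer features with large Fisher distances, but
# also keep features that separate class pairs no other feature does.
```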
Most of the features are based on grey-scale images. Since the original data is provided in color, a partition into the three components R, G and B has to be performed before the features can be calculated. A simple transformation provides the grey-scale images, each containing only the color information of one channel (R, G or B).
Spectral features. Spectral features describe the average color and tonal variation of an image. In cloud classification they are useful to distinguish between thick dark clouds, such as cumulonimbus, and brighter clouds, such as high cumuliform clouds, and to separate high, transparent cirrus clouds from others.
The spectral features implemented in the algorithm are the following:

- Mean (R and B)
- Standard deviation (B)
- Skewness (B)
In the brackets, R, G and B specify the color for which the individual feature is calculated. Due to the color of the sky and the different translucency of clouds, the color component B has the highest separation power. Thus most features are calculated for the grey-scale image containing the B color information.
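A minimal sketch of these statistics over the valid sky pixels; the helper name and the use of a boolean validity mask (from the segmentation step) are our own assumptions.

```python
import numpy as np
from scipy.stats import skew

def spectral_features(rgb, valid):
    """Mean, standard deviation and skewness of the listed colour
    channels, computed over the valid (non-masked) sky pixels."""
    r = rgb[..., 0][valid].astype(float)
    b = rgb[..., 2][valid].astype(float)
    return {
        "mean_R": r.mean(),
        "mean_B": b.mean(),
        "std_B": b.std(),
        "skew_B": skew(b),
    }
```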
Spectral features like the ones above support a division of cloud classes, but they alone are not sufficient: they do not provide information about the spatial distribution of color in an image. In most pattern recognition problems, and particularly in cloud type recognition, this distribution is equally significant. For example, images showing cumulus clouds and others showing altocumulus or stratocumulus clouds have similar mean color values and cannot be separated with those features. Their spatial distributions of color, on the other hand, are quite different, so other kinds of features can be added to separate those cases.
Textural features. To describe the texture of an image, statistical measures computed from Grey Level Co-occurrence Matrices (GLCM) may be used. A GLCM is a square matrix whose number of rows equals the number of grey levels in the considered image. Every matrix element represents the relative frequency P(a, b) with which two pixels occur, separated in a defined direction by a pixel distance δ = (Δx, Δy), one with grey value a and the other with grey value b. To avoid dependency on image orientation, an average matrix is often calculated from two or four matrices representing mutually perpendicular directions. Furthermore, because the computational cost of GLCMs increases strongly with the number of intensity levels G, it is advantageous to reduce the original number (G = 256) of grey levels.
The textural features used in this study are the following four of the 14 statistical measures proposed by Haralick et al. (1973), computed from an average GLCM with pixel distance δ = (1, 1):

- Energy (B): the energy reflects the homogeneity of grey level differences.
- Entropy (B): the entropy is a measure of the randomness of grey level differences.
- Contrast (B): the contrast is a measure of the local variation of grey level differences.
- Homogeneity (B): the homogeneity reflects the similarity of adjacent grey levels.
- Cloud cover (CC): the fraction of cloudy pixels among all valid sky pixels, obtained from the clear/cloudy segmentation during pre-processing. CC is a measure of the average cloudiness, and stratiform clouds, for example, can be well distinguished from other sky conditions using this feature.
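The following sketch computes the four GLCM measures from a single co-occurrence matrix for the offset δ = (1, 1); for brevity it omits the averaging over perpendicular directions mentioned above, and the reduction to 16 grey levels is an illustrative choice.

```python
import numpy as np

def glcm_features(gray, levels=16, dx=1, dy=1):
    """Energy, entropy, contrast and homogeneity from a normalized
    grey level co-occurrence matrix with pixel offset (dx, dy)."""
    q = (gray.astype(np.uint16) * levels // 256).astype(np.intp)
    a = q[:-dy, :-dx].ravel()          # reference pixels
    b = q[dy:, dx:].ravel()            # neighbours at offset (dx, dy)
    p = np.zeros((levels, levels))
    np.add.at(p, (a, b), 1)            # count co-occurrences
    p /= p.sum()                       # relative frequencies P(a, b)
    i, j = np.indices(p.shape)
    nz = p > 0                         # avoid log(0) in the entropy
    return {
        "energy": (p**2).sum(),
        "entropy": -(p[nz] * np.log2(p[nz])).sum(),
        "contrast": ((i - j)**2 * p).sum(),
        "homogeneity": (p / (1.0 + np.abs(i - j))).sum(),
    }
```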
For each pre-classified image in the training sample TRAIN we computed the features presented above and stored them with their assigned cloud class. Since the kNN classifier chooses the target class of an unknown image based on its distance in feature space to the training images, and the features differ in their value ranges, we normalized the features to the interval [0, 100]. This ensures that all features are equally weighted in the decision process.
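A small sketch of this scaling, assuming one feature per column and min-max normalization; applying the training minima and maxima to unknown images is our own (standard) choice, as the text does not spell this out.

```python
import numpy as np

def normalize_features(train, x=None):
    """Min-max scale each feature column of the training matrix to
    [0, 100]; scale an unknown feature vector x the same way."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard constant columns
    train_scaled = 100.0 * (train - lo) / span
    if x is None:
        return train_scaled
    return train_scaled, 100.0 * (x - lo) / span
```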

Results and discussion
In this section we describe the methodology used to estimate the performance of the algorithm and to optimize its parameters, and we discuss the respective results. Afterwards, an additional test sample of random images is presented to assess the performance of the algorithm on more ambiguous images.
The algorithm was implemented in IDL and tested on an Intel Celeron 530 with 1.73 GHz and 512 MB RAM. For one image it took about 1.3 s to return the classification result.

Methodology of performance estimation
To estimate the performance of the selected features and of the created algorithm, we applied Leave-One-Out Cross-Validation (LOOCV). Cross-validation methods in general have the advantage that they reuse the known training sample to estimate the capability of an algorithm while remaining unbiased, instead of requiring an additional test sample (Ripley, 2005). Therefore, they are often used for validation or feature selection in the area of pattern recognition. In cloud type recognition, LOOCV has been applied by e.g. Tag et al. (2000) and Bankert and Wade (2007).
LOOCV. From the training sample T, one single element t is removed and the algorithm is trained with the remaining data (T − t). Then the excluded element, which is independent of the data used for training, is classified. This is repeated n times, where n is the number of elements in T, such that each element of the training sample is used for validation exactly once. The fraction of correctly classified elements is finally used as the measure of performance.
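A compact sketch of this procedure, reusing the knn_classify function from the classifier sketch in Sect. 3; features is assumed to be the normalized n × d feature matrix of TRAIN, with labels holding the pre-assigned classes.

```python
import numpy as np

def loocv_accuracy(features, labels, k=3):
    """Leave-one-out cross-validation: classify each element with a
    classifier trained on all remaining elements."""
    n = len(labels)
    hits = 0
    for t in range(n):
        rest = np.arange(n) != t      # the training sample T - t
        pred = knn_classify(features[t], features[rest],
                            labels[rest], k=k)
        hits += int(pred == labels[t])
    return hits / n                   # fraction correctly classified
```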
First results. The results of the first LOOCV performed are given in Table 2. All features were equally involved in the classification process, and the parameter k, the number of considered neighbours (see Sect. 3.2), was set to 3 as a first guess. We see an overall accuracy of about 96%, with the class clear showing the best classification results at 98.8%. Confusions of this class exist primarily with cirrus clouds and, rarely, with cumulus clouds in cases of low cloudiness. This is caused by thin and transparent parts of cirrus clouds which cannot be detected by the algorithm; consequently, such images are classified as clear sky. Moreover, the so-called "whitening effect" causes misclassification of cloud-free pixels near the solar disk. Such pixels are often whiter and brighter than the rest of the hemisphere due to forward scattering by aerosols and haze (see Fig. 3, left) and are therefore interpreted as thin clouds by the algorithm (see also Long et al., 2006).
Most of the remaining cloud classes show accuracies of about 96% or 97%, except for the cumulus class and the class of high cumulus. Both have slightly lower hit ratios due to mutual confusion, which is caused by the difficulty of distinguishing these two classes: they differ only in the size of the individual cloudlets, for which no clear boundary exists, so that a discrimination can be extremely difficult.
The next notable errors occur between stratocumulus, stratus and the class of thick clouds. Some cases of stratocumulus are classified as stratus, some images showing stratus are assigned to the class of thick clouds and, in turn, images with thick clouds are sometimes classified as stratocumulus. These confusions, however, are well understood. All three classes frequently occur as transitional forms of one another, and the automatic classification of such images may differ from the manual pre-classification.
Also, misclassifications of some images displaying stratus and thick clouds appear to be caused by raindrops on the protective dome of the camera (see Fig. 3, right). The drops are naturally mapped onto the images as well and lead to texture feature values similar to those representing patchy altocumulus and cirrocumulus.
Apart from these errors, the first results, based on the initial choice of k = 3 nearest neighbours, are quite good. However, we wanted to see whether the performance of the algorithm could be improved by using another value of k or by weighting the individual features.
Improved results. For the LOOCV discussed above, all features were equally weighted. To assess whether improvements can be achieved by varying the impact of the individual features, we added a weight vector and ran the LOOCV for different configurations of this vector. Furthermore, because k, the number of neighbours considered, is a variable parameter, the LOOCV was also carried out for different values of k.
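A sketch of such a search, combining the loocv_accuracy and knn_classify sketches above; scaling the feature columns by a weight vector w is equivalent to using the weighted Manhattan distance Σ_l w_l |x^(l) − x_i^(l)|. The grids themselves are left open, since the text does not specify which configurations were tested.

```python
import numpy as np
from itertools import product

def search_weights_and_k(features, labels, weight_grid, k_values):
    """Rerun the LOOCV for every weight vector / k combination and
    return the best-scoring configuration."""
    best_w, best_k, best_acc = None, None, -1.0
    for w, k in product(weight_grid, k_values):
        acc = loocv_accuracy(features * w, labels, k=k)
        if acc > best_acc:
            best_w, best_k, best_acc = w, k, acc
    return best_w, best_k, best_acc
```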
The remaining classes are recognized quite well. Some further confusions exist between cumulus and high cumulus due to their similarity in color and the smooth transition in their definitions. Confusions also occur among the last three classes, stratocumulus, stratus and thick clouds. The reason is the frequent transition from one class into another, a natural phenomenon that will always lead to some misclassifications of these classes by automatic methods.
One further problem, not visible in the results of the LOOCV but occurring in the analysis of random data, is incorrect class assignment due to the simultaneous appearance of more than one predefined cloud class. In nature, the sky often exhibits a wide spectrum of different cloud types at the same time; e.g. cirrostratus and stratocumulus or cirrus and cumulus frequently occur together. In order to avoid misclassifications due to this phenomenon, we suggest an initial partitioning of the images into smaller subimages and their separate classification. However, it is important to check whether these subimages still include enough information to assign the image parts to a cloud class.
We are convinced that the algorithm can be improved by implementing the suggestions elucidated above and thus eliminating errors caused by questionable images. Moreover, other features not mentioned here may also increase the algorithm's performance. However, the algorithm presented here is already quite powerful and suitable for research purposes. For example, the implemented algorithm is currently in use at the Meteorological Institute of IFM-GEOMAR in Kiel and is available to interested users.

Fig. 2. Segmentation for optically thick clouds (top) and clear sky (bottom) using a threshold of R/B = 0.8 (middle) and a threshold of R − B = 30 (right).

Fig. 3. Misclassification of clear sky caused by the "whitening effect" (left) and misclassification of stratus due to raindrops (right).

Table 1. Classes to be distinguished.

Table 2. Confusion matrix of the LOOCV with equally weighted features, in %.