SPATIAL-TEMPORAL CONDITIONAL RANDOM FIELDS CROP CLASSIFICATION FROM TERRASAR-X IMAGES

The rapid increase in population in the world has propelled pressure on arable land. Consequently, the food basket has continuously declined while global demand for food has grown twofold. There is need to monitor and update agriculture land-cover to support food security measures. This study develops a spatial-temporal approach using conditional random fields (CRF) to classify co-registered images acquired in two epochs. We adopt random forest (RF) as CRF association potential and introduce a temporal potential for mutual crop phenology information exchange between spatially corresponding sites in two epochs. An important component of temporal potential is a transitional matrix that bears intraand inter-class changes between considered epochs. Conventionally, one matrix has been used in the entire image thereby enforcing stationary transition probabilities in all sites. We introduce a site dependent transition matrix to incorporate phenology information from images. In our study, images are acquired within a vegetation season, thus perceived spectral changes are due to crop phenology. To exploit this phenomena, we develop a novel approach to determine site-wise transition matrix using conditional probabilities computed from two corresponding temporal sites. Conditional probability determines transitions between classes in different epochs and thus we used it to propagate crop phenology information. Classification results show that our approach improved crop discrimination in all epochs compared to state-of-the-art mono-temporal approaches (RF and CRF monotemporal) and existing multi-temporal markov random fields approach by Liu et al. (2008).


Introduction
To monitor and estimate food production, up-to-date precise crop spatial information is required.Earth observing satellites have undergone improved spatial, spectral, and temporal resolutions.Changes in a scene can be monitored regularly and on demand.Such trend favours development of novel image classification methods that can handle temporal data (Jianya et al., 2008).This is especially true for radar sensors which overcome limitations of optical sensors: their signals can penetrate clouds and are independent of daylight (Gomez-Chova et al., 2006;Tupin, 2010).Incorporating crop growing degree day information into multitemporal radar images from TerraSAR-X is likely to improve crop classification.Thus, Synthetic Aperture Radar (SAR) temporal data can be used to benefit crop classification given novel spatial-temporal context methods.
Use of context (spatial and temporal) in image segmentation and classification has recently gained popularity. Spatial context accounts for similarities among pixels in regard to distance from each other.It determines probability of a pixel or a group of pixels occurring at a given location based on nature of its neighbourhood (Tso and Mather, 2009).Goodchild (1992) defines spatial context as "the propensity for nearby locations to influence each other and to possess similar attributes."In contrast, temporal context defines spectral similarities of pixels with respect to different acquisition times.Use of images acquired at different times has shown significant improvement in classification e.g. in classification of crops and vegetation (Lu and Weng, 2007).However, complexity accompanying multi-temporal data requires approaches that can effectively integrate spatial and temporal data.More also, continuous increase in high temporal resolution satellites has led to a "Tsunami" of data in archives.Therefore, spatial-temporal automated classification methods are nec-Markov Random Fields (MRF) (Geman and Geman, 1984) have widely been used to integrate spatial-temporal context in image classification.Introduction of Bayesian concept by Swain (1978) to classification of multi-temporal images motivated several MRF temporal studies.Examples include: a MRF approach unidirectionally passing temporal information from a classified image at a given date to a subsequent image of the same area at a later date by Jeon and Landgrebe (1992); Solberg et al. (1996) later extended in (Melgani and Serpico, 2003) to allow bidirectional exchange of temporal information.Liu et al. (2006) uses temporal correlation and temporal exclusion to control certain changes in forest disease spread monitoring.In (Moser and Serpico, 2011) MRF is applied for multi-scale multi-temporal high resolution image classification with transitional matrix determined using Expectation Maximization (EM) algorithm.These studies, except (Moser and Serpico, 2011), used one generalized class transition matrix in MRF determined heuristically.Leite et al. (2011) used a combination of expert knowledge and training data.In this approach, the matrix globally assumes stationary class transitions over all pixels neglecting changes that may exist in the image (Liu et al., 2008).In addition, MRF's assumption of conditional independence in observed data adopted for computational tractability neglect spatial context inherence in images (Lafferty et al., 2001;Kumar, 2006;Zhong and Wang, 2007a;Parikh and Batra, 2008).Remotely sensed images exhibit a coherent scene because neighbouring sites are spatially correlated.This concept is modelled by Conditional Random Fields (CRF) (Lafferty et al., 2001) introduced for one-dimensional text classification and extended to two-dimensional image classification (Kumar, 2006).The Framework provided by CRF integrates spatial context both in class labels and data.bels and data has triggered studies in several applications.So far CRF has been used for: classification of settlements and urban areas (Zhong and Wang, 2007a,b;Hoberg and Rottensteiner, 2010;Niemeyer et al., 2011Niemeyer et al., , 2013;;Kenduiywo et al., 2014), estimation of ground heights from LiDAR data (Lu et al., 2009), building extraction (He et al., 2008;Wegner et al., 2011b,a) and interpretation of terrestrial images (He et al., 2004;Korc and Förstner, 2008;Gould et al., 2008).However, these are mono-temporal CRF studies.Despite the benefits of multi-temporal images, a few CRF studies exist: crop type classification using RapidEye images by (Hoberg and Müller, 2011), land-cover classification from IKONOS and RapidEye images (Hoberg et al., 2010) and multi-scale multi-temporal study using IKONOS and Landsat images (Hoberg et al., 2011).The studies incorporate temporal context by passing temporal information through empirically determined global transition probability matrix.However, the transition matrix does not optimally represent all site changes between epochs.Hoberg et al. (2015) notes that incorrect determination of transition matrix leads to erroneous transfer of information into other epochs subsequently reducing classification accuracy.Developing an approach of determining transition probabilities for each site can minimize such errors.More also, according to our knowledge, no studies have used multi-temporal radar data for CRF classification of crops.We develop a spatial-temporal CRF classification approach exploiting site-wise crop phenological transitions from multi-temporal TerraSAR-X images.
Multi-temporal SAR images of a given season will improve crop classification.Crops show varied backscatter radar signal in time (at different epochs1 ).We intend to maximize feature separation by exploiting this temporal crop phenology.For instance, crops that may not be resolved in one epoch spectrally can be resolved in another.Notably crops undergo phenological changes between different epochs resulting in varied spectral properties (Bargiel et al., 2010).Integration of spatial-temporal crop phenology information from different epochs of a vegetation season using sitewise transition matrix between a pair of epochs in temporal potential of CRF is the contribution of this study.In our study, Terra-SAR images are acquired at different phenological stages of a crop season.Therefore, we determine the site-wise transition matrix using Bayes' theorem of conditional probability.The conditional probability -representing class transitions in the matrix -are computed from a pair of class probability vectors estimated by random forest classifier at corresponding sites in different epochs.We derive this approach from a MRF forest change detection study using optical images in (Liu et al., 2008) and extend it to this study.So far no studies have considered incorporating phenology information into CRF classification using SAR images in this manner.
The rest of the paper is organized as follows.Section 2 introduces the approach we adopted and discusses how CRF is designed using site-wise conditional probabilities for bi-directional temporal information flow.Section 3.4 describes classes (crop types), data and features used, details of experiments conducted by our approach and other state-of-the-art methods and results obtained.In Section 4 a discussion of results from our approach compared to state-of-the-art is made leading to conclusions in Section 5.

Conditional random fields
A supervised image classification assigns class labels to image sites given user defined examples known as training sites.Train-ing sites, represented by a vector of features (numeric attributes computed from user defined image sites), and corresponding class labels serve as an input to an algorithm that infers labels of all other image sites.Mono-temporal CRF classification aims to estimate an optimal label configuration ĉ of a vector of class labels c = (c1, c2, . . ., cm) T where m is number of image sites, from image data x, i.e. x = (x1, x2, . . ., xm) T , by maximizing posterior probability P (c|x) thus ĉ = arg max c P (c|x).A monotemporal CRF models P (c|x) with a graph structure in which nodes are linked to image sites and edges bear relationships between a pair of adjacent sites.Thus, P (c|x) is modelled as CRF as: where A denotes association potential, I is spatial interaction potential, S is a set of all image sites, i is a site in the image, j is a neighbour of site i, N is a set of neighbours of i and Z(x) is a normalizing constant called partition function.

Spatial-temporal CRF: Problem formulation
In spatial-temporal contextual image classification, image sites are dependent random variables that form a random field (region) with correlated spatial and temporal neighbours.For computational tractability the spatial and temporal random fields are factorized into neighbourhood systems (Li, 2009).
For spatial-temporal classification, we consider p co-registered images with the same spatial resolution and extent acquired respectively at epoch t ∈ T such that T = {t0, t1 . . ., tp}.Assume c t0 = c t0 1 , . . ., c t0 m and c t1 = c t1 1 , . . ., c t1 n where m and n represent number of image sites, are a set of random labels over a set of multispectral image features x t0 and x t1 corresponding to epochs t0 and t1 respectively.The set of classes in any epoch can differ in both composition and number.For instance, in multi-scale classification different number of landcover classes can be defined subject to image resolution.Normally the image of high resolution may contain sub-classes of a class in the lower resolution image.In such a case the transition matrix would be rectangular.In our case, the image resolution and number of classes is the same within a season and thus the transition matrix is square.To estimate P (c|x) in spatialtemporal classification we extend Equation (1) as: where T P is temporal interaction potential, K is a set of image sites in epoch t1 that are temporal neighbours of site i, i.e. i is a spatial neighbour to j and a temporal neighbour to k in epoch t1 (i and k correspond to the same ground location).Parameters λ1 and λ2 represents weights used to regulate spatial and temporal information respectively.Equation (2) provides a general spatial-temporal CRF classification framework (Hoberg et al., 2015).However, our T P is developed on site-wise transition matrix computed using initial class probability estimates at each site from training sites and data as opposed to an empirical c t0 x t0 c t1 x t1 t 0 to t This property enables use of domain-specific discriminative classifiers in structured data rather than restricting the potentials to a certain form (Zhong and Wang, 2007a,b).We omit weight notations λ1 and λ2 in subsequent sections for clarity in CRF potential equations.

Association Potential
It determines how likely an image site i takes a label c t0 i in epoch t0 given the data x t0 : A(c t0 i , x t0 ) = P (c t0 i |fi(x) t0 ), fi(x t0 ) is a site-wise feature vector (Kumar, 2006).We adopt random forest (RF) (Breiman, 2001) to determine A by independent classification of different epochs assuming class conditional independence in them.A RF conducts classification by casting votes from a number of decision trees DT generated during training.If the number of votes cast for a given class c by RF is Vc, then our A is P (c t0 i = c|fi(x t0 ) = Vc D T .We set DT = 200 because at that value RF stabilizes (Hastie et al., 2011) and set tree depth as 25.

Interaction Potential
It measures the influence of data and neighbouring labels on site i in epoch t0.It ensures that site i, as initially determined by association potential, is labelled to its corresponding "true class" given data evidence x t0 and neighbourhood dependency N where j ∈ N .This study models I using contrast sensitive Potts model designed based on Euclidean distance dij of adjacent node features fi and fj: where R is the number of features.Then, model of I is: where β is a spatial interaction parameter that regulates smoothness, parameter η regulates the contrast-sensitive term and h(c t0 i , c t0 j ) is a histogram matrix count bearing co-occurrence of labels of neighbouring sites i and j.We normalize the histogram by row in order to minimize bias of dominant classes in training data.Therefore, the model is different from contrast sensitive Potts model because transitions of classes is now governed by their frequency in training data (Kosov et al., 2013).

Temporal interaction potential
It models interactions between data x t0 and x t1 and labels c t0 i , c t1 k of site i and k: We consider crop classification within one season.Consequently, spectral changes in classes at a given epoch are a result of phenology rather than transitions to other classes in another epoch.This is because a particular crop type is observed in the entire season from planting to harvest.We exploit the fact that crop phenology varies temporally and also spectrally to enhance crop discrimination.Spectral changes due to crop phenology is expressed by joint probability in Equation ( 5).
Transitional probabilities between similar and different classes can be represented in a transitional matrix T ik .The matrix can be determined by expert opinion, empirically from existing data sources or computed (Liu et al., 2008).We compute T ik for each site in the image using conditional probability computed from class membership probability vectors for node i and k estimated by RF for each site based on spectral observation in training sites.
The matrix is then used to introduce a temporal directed edge between node i and k as illustrated in Figure 1.
where prior probabilities P (c t0 i = α) and P (c t1 k = ω) are marginal distributions in Table 1.From rules of probability, sum rule and product rule, Equation ( 6) can be transformed into: Similarly following Bayes' theory transitions in the reverse direction, i.e. t1 ⇒ t0, are computed as: Therefore, from Equations ( 9) and ( 10) we compute site-wise transition probability matrix by dividing the a × b joint probability matrix of joint distributions in

Training, Inference and parameter estimation
Solution to Equation ( 2) is obtained by maximizing probabilities, spectral (A) , spatial (I) and temporal (T P ) using Bayes' Maximum A Posterior (MAP) estimate.This requires an inference algorithm and we employ sum-product Loopy Belief Propagation (LBP).The association potential probabilities used in both I and T P are trained using RF implemented in OpenCV (OpenCV, 2014).We determine I parameters β and η with help of Powell's search method (Kramer, 2010) and set β = 5 and η = 1 in all experiments.All potentials (A, I, and T P ) were given an equal weight, i.e. λ1 = λ2 = 1.

Study site and data
The study was conducted in northern Germany (52.26 • N, 9.84 • E) (Fig. 6).The region is characterized by intensive agriculture, with large field sizes.The average annual precipitation in the area is 656 mm and the average annual temperature is 8.9 Ground reference data campaign was conducted concurrently with image acquisition.We divided ground reference polygons for separate use in training and validation as shown in Table 2. Separation of the ground reference polygons was done using stratified random sampling in sampling design tool in ArcGIS 10.0 (Buja and Menza., 2013).
We adopt the three main classes for classification using our approach in Equation ( 2).
The listed crops go through different phases at different times a fact that can enhance discrimination.A crop phenology calendar is shown in Figure 3. Four phases, preparation, seeding, growing, harvesting and post harvest, are considered.Preparation phase involves ploughing and soil grooming processes before seeding.
In seeding phase, crop seeds are placed in the soil.Growing phase includes the period between crop germination to ripening.After ripening, harvesting starts where mature crops are gathered from the fields using relevant methods.The last stage is post harvest phase where the field could be fallow or with some remaining ripe crops.
Figure 3. Phenology stages of crops considered for classification.

Features
Classification using Equation (2) requires definition of site-wise feature vectors fi(x t0 ) used in both A and I.We compute eight texture features (mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation) from Gray Level Co-occurrence Matrix (GLCM) (Haralick et al., 1973) using a 3 × 3 window with 0 • direction.The features were computed from the dual polarized TerraSAR-X amplitude values in each epoch.Thus fi(x t0 ) used in RF (association potential) is a 16 dimensional feature vector.We use the same features in I. Thus, the distance function, dij, in I is a 16 dimensional feature vector, i.e.R = 16 in Equation (3).All the features were normalized between 0 and 1 to minimize undue influence by features with high values.

Experiments
To evaluate our approach we conducted experiments using the test site and data described in Section 3.1.The experiments were done using four approaches: 3.4.1 Approach 1 In this approach we use RF in Section 2.3 for mono-temporal classification of crops.Approach 1 also forms A in CRF.
3.4.2Approach 2 experiments are based on mono-temporal CRF in Section 2.1.Basically, it considers spatial context only with no temporal interactions hence "mono-temporal".
3.4.3Approach 3 is an existing state-of-the-art spatialtemporal MRF approach in (Liu et al., 2008).We implement this approach for comparison with spatial-temporal CRF developed in the study.In the approach, site-wise transitional matrix were determined using conditional probabilities described in Section 2.5.1.To implement spatial-temporal MRF we set η = 0 in Equation ( 2) but, other parameters remain as described in Section 2.6.This eliminates data interaction in I reducing it to a MRF Ising model: where I: 3.4.4Approach 4 is the spatial-temporal CRF approach presented in this paper, where the site-wise transition matrix is computed from conditional probabilities as shown in Section 2.5.1.The approach is an extension of MRF forest change detection method in (Liu et al., 2008) into CRF crop type classification using crop phenology informattion.

Results
In conducted experiments, we applied cross-validation based on reference data in Table 2 to evaluate our approach (approach 4) vis-à-vis approaches 1 to 3 in Section 3.4.Overall accuracy (OA), kappa statistic, producer (Prod) and user (User) accuracy measures were computed from error matrices generated by comparing classified pixels against validation set.These accuracy measures are illustrated in Tables 3 to 10.
Tables 3 and 4   Introduction of spatial interaction by approach 2 using CRF improves discrimination of crops as shown in Tables 5 and 6.The OA improves by 3.2% and 11.7% in March and June respectively.Results of addition of temporal potential using approach 3 in (Liu et al., 2008) are shown in Tables 7 and 8. Approach 3 improved classification OA by 48% and 12.8% in march and june when compared to approach 1 in Tables 5 and 6.In comparison to approach 2 OA improved by 45.1% and 1.1% in march and june respectively.This depicts importance of temporal information.

Discussion
The aim of this study was to design a spatial-temporal approach that exploits phenological changes within a vegetation season to improve discrimination of crops.Random forest, approach 1, used in A of CRF considers pixel-wise labelling by voting using decision trees.Results of this approach are characterized by mislabeled pixels in the form of "salt and pepper" as shown in Figure 4 which impacts classification accuracy.This illustrates significance of context in classification.We used RF as a baseline to incorporate spatial and temporal context using CRF.
As demonstrated in Tables 5 and 6, spatial interaction introduced by CRF significantly improved crop discrimination.The CRF interaction potential models site dependencies in a statistically sound manner in order to allocate a label to a given site.This minimizes "salt and pepper" effect common in per pixel approaches by ensuring that sites are labelled in regard to neighbouring labels and data.
Addition of temporal term further enhanced crop discrimination by exploiting crop phenology.Phenology enhanced class discrimination because different crops vary spectrally in time.For instance, the unploughed group of crops grow throughout the year, Figure 3, and has higher accuracy in june when other crops are in growing stage.Its spectral discrimination reduces later in march when most crops are in preparation stage because they are covered by similar land-cover.We incorporated this knowledge into T P for classification using approaches 3 and 4.
Results from approach 3 and 4 indicate an increase in classification accuracy.The two approaches use site-wise conditional probabilities to compute temporal transition matrix.Conditional probabilities determine probability of inter-and intra-class changes.During a vegetation season, observed spectral changes are only within a class.Therefore, transition from class A to another class B are minimal assuming no natural or artificial crop interference.We use this fact to enhance discrimination of classes that can not be separated easily, for instance in march epoch.However, approach 3 is based on MRF which ignores spatial interactions in data a fact that makes it perform lower than CRF.We therefore implemented the approach in CRF, i.e. approach 4.
Observations from approach 3 led us to design a robust method (approach 4) to incorporate phenology information in all epochs.We use site-wise conditional probability to compute temporal transition matrix.So far our approach demonstrates stable accuracy in considered epochs.This is because conditional probability facilitate bi-directional exchange of temporal information in form of probabilities thereby enhancing accuracy of a class with low discrimination in either epoch.The approach is suitable for seasonal crop classification since uncertainty in a class at a given epoch is resolved in another.For instance, low accuracy in grain and broad leaved classes in March, see Table 3, is improved by our approach as shown in Table 9.Moreover, our temporal term (T P ) is data dependent unlike in (Hoberg et al., 2015) where it was determined by empirically.

Conclusion and outlook
Our study has demonstrated significance of spatial-temporal information for seasonal crop classification.Spatial interaction significantly enhances spatial dependency minimizing "salt and pepper" effect common in per pixel classifiers.The introduced temporal potential resolved classes with low accuracy in March 2009 using information from June 2009 based on crop phenological changes in time.Moreover, our approach outperforms existing MRF state-of-the art spatial-temporal approach.Follow up studies will consider classification of sub-categories of crops and using images acquired at more than two epochs.

2. 5
.1 Site-wise transition probability matrix Conditional probability expresses the likelihood of a class label to take up a site i in epoch t0 given class label information from a site k in epoch t1.It represents intra-and inter-class transitions.Consider a set of classes a and b such that α ∈ α1, α2, . . ., αa and ω ∈ ω1, ω2, . . ., ω b in epoch t0 and t1 respectively, using Bayes' theory class transitions from t0 to t1, i.e. t0 ⇒ t1, can be expressed as:

)
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-3/W4, 2015 PIA15+HRIGI15 -Joint ISPRS conference 2015, 25-27 March 2015, Munich, Germany This contribution has been peer-reviewed.The double-blind peer-review was conducted on the basis of the full paper.doi:10.5194/isprsannals-II-3-W4-79-2015 Location of the study site.We use two dual polarized (HH and VV) TerraSAR-X High Resolution Spotlight images acquired on 11th March 2009 and 18th June 2009 at an incidence angle of 34.75 • with a ground resolution of 2.1 m in ground range direction and 2.4 m in azimuth direction.The images were delivered as ground range products (MGD) with equidistant pixel spacing.All images are coregistered to an extent of 7.1 × 11.8 km 2 and projected to WGS 1984 UTM Zone 32N.Experiment site used covers an extent of 2.3 × 2 km 2 .

Table 1 .
Table 1 by row sum and column sum (marginal distributions) for crop phenology transitions in t0 ⇒ t1 and t1 ⇒ t0 respectively.Computation of site-wise conditional probability matrix.

Table 2 .
Data used for validation and training in hectares (ha).

Table 4 .
illustrate results of approach 1, RF, in both epochs (June and March).Approach 1 considers no spatial-temporal information and thus accuracy values are low.Classification accuracy is particularly low in March with many pixels mislabeled as shown in Figure4compared to June.This is because of different phenological states of crops as demonstrated in Figure3.Approach 1, June

Table 8 .
Approach 3, JuneResults of the new spatial-temporal method, approach 4, are illustrated in Tables9 and 10.The use of conditional probabilities to determine the temporal transition matrix improved classification accuracy significantly.Impact of this approach is evident ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-3/W4, 2015 PIA15+HRIGI15 -Joint ISPRS conference 2015, 25-27 March 2015, Munich, Germany compared to CRF mono-temporal classification in Tables5 and 6considering classification accuracy improved by 47.3% and 2.9% in March and June respectively.In comparison with approach 3, MRF spatial-temporal classification, approach still improved OA by 2.5% and 1.8% in march and june respectively.