Rubber Tree Distribution Mapping in Northeast Thailand

In many parts of mainland Southeast Asia rubber plantations are expanding rapidly in areas where the crop was not historically found. Monitoring and mapping the distribution of rubber trees in the region is necessary for developing a better understanding of the consequences of land-cover and land-use change on carbon and water cycles. In this study, we conducted rubber tree growth mapping in Northeast Thailand using Landsat 5 TM data. A Mahalanobis typicality method was used to identify different age rubber trees. Landsat 5 TM 30 m non-thermal reflective bands, NDVI and tasseled cap transformation components were selected as the model input metrics. The validation was carried out using provincial level agricultural statistical data on the rubber tree growth area. At regional (Northeast Thailand) and provincial scales, the estimates of mature and middle-age rubber stands produced from 30 m Landsat 5 TM data compared well (high statistical significance) with the provincial rubber tree growth statistical data.


Background
Around the world regional and global markets are driving the conversion of traditional agriculture and occupied non-agricultural lands to more permanent cash crops.In many parts of mainland Southeast Asia rubber plantations are expanding rapidly in areas where the crop was not historically found [1].Over the last several decades more than 1,000,000 hectares of land have been converted to rubber plantations in non-traditional rubber growing areas of China, Laos, Thailand, Vietnam, Cambodia and Myanmar [2,3].Like many mainland Southeast Asian countries, Thailand has experienced dramatic social and environmental change over the past decade [4].Northeast Thailand is a semi-arid [5] and chronically undeveloped area, and is a non-traditional rubber growing region of Thailand (Figure 1).Rubber tree development in this region has been booming for the past two decades.To encourage farmers from leaving home to find jobs in other parts of the country, this region is being targeted by the Thai government to grow more rubber trees in the next decade.The expansion of rubber trees has altered the ecosystem by influencing local energy, water and carbon fluxes, especially when rubber trees replaced eco-logically important secondary forests and traditionally managed swidden fields [6][7][8][9].Timely monitoring and mapping rubber tree growth distribution in the region is critical for documenting its expansion and understanding its implications for water and carbon dynamics.

Introduction
Remotely sensed imagery classification for deriving land-use/cover information is well documented and plays an important role in global-change studies, natural resource management and environmental applications.A number of studies of the distribution of rubber tree growth have been conducted in Southeast Asia, such as in Yunnan, China [7,8,10], Indonesia [11] and Laos [12].Most spatial analysis of rubber trees has been limited to suitability analyses in Thailand (e.g.[13]).Land-cover mapping over large areas with limited training samples is challenging due to the capability of classifiers to generalize patterns in unsampled areas.Analysts attempting to map rubber tree growth face two significant challenges.First, mature rubber trees are easily confused with tropical evergreen vegetation due to similar multi-spectral reflectance characteristics.Area of mature rubber trees is often overestimated by misclassifying secondary forests as rubber trees.Second, young rubber trees are usually dominated by bare ground and mixed scrub, or intercropped with short-term economic crops such as cassava and pineapple.Even after 3 -4 years of growth, rubber tree canopies comprise a small fraction of total planted area.These conditions make it difficult to map rubber trees.
Recently machine learning techniques [14] such as artificial neural networks and decision tress [15] have been widely used in remotely sensed imagery classification because they show many advantages over conventional classifiers [16][17][18].However, the substantial computation time and heuristic training process of machine learning classifiers make rubber tree growth mapping over large areas inefficient.Experiments conducted by Li and Fox [19] suggest that using approaches of neural networks and decision trees with spectral information and vegetation indices overestimated the number of rubber tree pixels.Additionally, these classifiers require training sites to contain sufficient both "presence" and "absence" information.In other words, the analyst must acquire such information from the training samples.In reality, it is difficult to collect sufficient training samples in the field to cover the numerous patterns of rubber tree stages that show up in an analysis.Given this fact, a presence-dataonly model looks increasingly promising in dealing with species distribution mapping, especially when knowledge about available land-cover types is limited.Sangermano and Eastman [20] and Hernandez et al. [21] conducted experiments using a presence-data-only model -Mahalanobis typicality approach to model species distribution, as Mahalanobis typicalities provide information about how typical instances being analyzed are compared to those used as a reference [20].
Selection of satellite images is another critical step for mapping of land-cover.Given the trade-off between spatial and temporal resolutions, currently selection of remotely sensed data for land-cover mapping at a global or a regional scale tends to use either low spatial but high temporal resolution imagery, such as Moderate Resolution Imaging Spectro-radiometer (MODIS) (e.g.[22]), or low temporal but high spatial resolution imagery such as Landsat Thematic Mapper (TM)/Enhanced Thematic Mapper (ETM+), etc. [23][24][25][26][27]. Li and Fox [28] improved rubber tree growth mapping using ASTER data by integrating Mahalanobis typicalities with a neural network model.Another successful application using Mahalanobis typicality approach was the mapping of rubber trees across the mainland Southeast Asia using the Mahalanobis typicality method with MODIS time-series NDVI and statistical data [19].In this study we examined the potential of Mahalanobis typicalities for rubber tree growth distribution mapping using Landsat 5 TM imagery.

Study Area and Data
The study area encompasses nineteen provinces of Northeast Thailand as shown in Table 2 and Figure 1.The Landsat 5 TM imagery used in this study was acquired from Land Processes Distributed Active Archive Center (LPDAAC).Eleven TM scenes (WRS Path/Row: 126-129/47-50) were used to cover the entire study area.The acquisition dates of the images vary from 2004 to 2009, depending upon the availability of cloud free or acceptable cloud contamination images.To develop training sites for calibrating the classifiers, we used NASA's Landsat GeoCover products (http://www.geocover.com/gc_lc/data_products/)to identify land-use and land-cover types.Rubber tree identification was primarily based on GPS ground truthing samples collected in the field in January and March 2009 and high resolution QuickBird / IKONOS images from Google Earth.Since the main purpose of this experiment was to map rubber tree growth distribution, only six broad categories were identified for the study region, i.e., rubber trees, forest, water, bare soil, paddy rice and others.Three sub-categories were defined for the rubber tree class: 1) Rubber 1, mature rubber trees older than 4 years old; 2) Rubber 2, middle-age rubber trees between 2 to 4 years old; and 3) Rubber 3, young rubber trees less than 2 years old dominated by bare ground and mixed scrub, or intercropped with other crops.Figure 2 shows photographs of different age rubber stands under various situations.Training sites comprising a total of 85,697 rubber tree samples were developed, accounting for 0.046% of the total study area.Among those samples, 67,839 were of mature rubber trees, 12,770 were of middle-age rubber trees, and 5088 were of young rubber trees, accounting for 0.037%, 0.007% and 0.003% of the total study area respectively (Table 1).
Rubber tree growth statistical data provided by the Thai Rubber Association (http://www.thainr.com/en/index.php?detail=stat-thai&page=1#) estimated rubber tree growth area and production between 2005 and 2007 at the provincial scale.Provincial and international boundaries were extracted from a Thailand vector GIS layers for

Image Pre-Processing
Atmospheric correction using Chavez's Cost model was applied to the TM images to reduce atmospheric scattering effects [29,30].The TM images were geo-metrically corrected using the Landsat GeoCover data set and the nearest neighborhood resampling method [31].All images were co-registered to the UTM system (zone 48 N).Clouds and shadows were masked out from the images.

Development of Input Metrics
The Normalized Difference Vegetation Index (NDVI), a frequently used measure of photosynthetic activity to estimate productivity [32], is sensitive to canopy structure and chemical content [33].In this study, Landsat TM NDVI was derived from TM band 4 and band 3, to capture general patterns of different vegetation types.The NDVI formula we used was: The Kauth-Thomas Transformation (KTT or tasseled cap transformation) [34,35] has been found to be sensitive to structural characteristics of forest environments [35,36].To highlight spectral difference among stands of different age rubber trees, e.g., mature, middle and young, as well as that among rubber trees and deciduous forest, shrubs and bare soil etc., we employed KTT on the six reflective TM bands to produce soil brightness, vegetation greenness, and soil/vegetation wetness components.The vegetation greenness component is well correlated with tree canopy cover, leaf area index and live biomass above ground [37]; therefore it was expected to be able to capture difference among stands of rubber trees of diverse ages due to differences in canopy densities.The soil brightness component expresses differences in soil properties, such as particle size and organic matter content; and the soil/vegetation wetness component is sensitive to soil and plant moisture [37].These two components together were expected to capture difference between young rubber trees (dominated by bare soil or shrubs) and pure bare soil fallow fields.The KTT equations [41]

Mahalanobis Typicality
In statistics Typicality Probability can be expressed by the relative distance of a particular class to the class mean, i.e., Mahalanobis distance [38][39][40]: where x is the input variable vector (from the context of remote sensed imagery, x is the data vector for the pixels in all wavebands), μ i is the mean vector for class i over all pixels, V i is the variance/covariance matrix for class i, and T is the transpose of the matrix .Ranging from 0 to 1, the Mahalanobis typicality measures the absolute strength of class membership [40] and determines the similarity of an unknown sample to a known group of samples.A typicality value of 0 indicates the instance being analyzed is atypical of the reference samples, while a value of 1 suggests the instance is identical to the known samples.Therefore, typicality values express how reasonable to assume a case really belongs to a particular class.In the context of remotely sensed imagery classification, the outputs of Mahalanobis typicalies are a set of probability images (one per class) that express the typicality of each pixel relative to the training samples.Because a Mahalanobis typicality calculates an intra-class similarity, it can use presenceonly data and is not affected by absence-data.For more detailed description of Mahalaanobis Typicality, see Eastman [41] (2009) and Sangermano and Eastman [20].

Classification and Image Post-Processing
Since TM scenes acquired in succession on the same track (with the same path numbers) are of more consistency in terms of the radiometry, we mosaiced the TM scenes with the same path numbers to spatially produce larger images.Separate training sites were developed for each of the mosaiced images.Four mosaiced images were made to cover the whole study area.Each larger mosaiced image was processed individually for landcover classification.
Six TM reflective bands (1-5 and 6), NDVI, and the greenness, brightness and the moistness from the KTT were developed for each mosaiced image and used as input variables to the Mahalanobis distance classifier.
Unlike traditional hard classifiers, the output from the Mahalanobis distance classifier is not a single classified land-cover map, but rather, a set of images (one per class) that expresses typicality of pixel reflectances relative to those described by the training sites.In our case study, eight Mahalanobis typicality maps were generated representing the typicality probabilities for each of the eight categories (Table 1).Since we are interested in the distribution of rubber tree growth, only the rubber tree classes, e.g, mature, middle-age, and young were analyzed and the rest of five classes were ignored.To reduce commission errors of rubber trees (i.e., the probability that a sample from a land-cover map is misclassified against what it is from the reference data) only pixels that were most typical of the rubber tree classes described by the training sites were retained.To do this, we set typicality thresholds of 0.94 -0.97 for both mature and middle-age rubber trees, and 0.99 for young rubber trees.These thresholds were determined based on the best calibration of the classifier using our training sites.The threshold set for the young rubber trees (0.99) is more restricted than those for mature and middle-age rubber trees because young rubber trees are highly likely to be confused with bare soil or fallow fields after plowing.A discrete map of rubber tree growth (Boolean map) was extracted by thresholding the typicality values for each type of rubber trees.However, the rubber tree growth map created this way scattered rubber trees into small patches because the high typicality thresholds filtered out pixels with lower typicalities even when they neighbor "most typical" pixels and actually do belong to a rubber tree category.To overcome this drawback, we used 11 × 11 sized mean filters to retrieve neighboring pixels excluded by the Mahalanobis typicality.Again, thresholds were set adaptively with the filtered images for each of the three types of rubber trees based on the best calibration using the same training site information.The final rubber tree growth map was generated by overlaying the three types of rubber tree maps.The class membership of ambiguous pixels such as those classified as rubber trees in more than one of the three types of rubber tree growth maps were determined by assigning each pixel to the class where it had its highest typicality statistic.The selected filters were able to generalize the image and exclude isolated misclassified rubber tree pixels, thus the filtered images retain both map accuracy and the spatial connectivity of the rubber tree classes.

Validation
The output rubber tree growth map for Northeast Thailand created from the TM images estimates rubber tree growth area by pixel.The total area of rubber trees was then extracted to the provincial level.The results were compared with provincial level statistical data from Thailand collected between 2005 and 2007.The relative error (RE) was used to evaluate the accuracy of the estimated rubber tree area for each province in Northeast Thailand, i.e., RE = (Estimated -Statistics)/Statistics × 100 (4) Linear regressions were employed between the statistical data and the estimates (30 m Landsat TM data) for validation [22].The root mean square error (RMSE) was calculated between the statistical values and estimates for the study area using the following equation: where n is the number of provinces in Northeast Thailand.
Areas of different age rubber trees were compared between the estimated and statistical data.The following area indices were calculated from the statistical data: Due to different age groups defined in the rubber tree growth classification (i.e., <2, 2 -4 and >4 years old), the estimated areas of the classified rubber trees from the satellite imagery were not directly comparable with those from the statistical data (i.e., <2, 2 -7 and >7 years old).Conversions were needed for the estimated results to make them consistent with the statistical data.To do this, the area of the mapped rubber tree growth that is equivalent to 2 -7 years old as defined in the statistical data (A m 2006 ) was approximated using the following equation: (7) where A Rubber1 and A Rubber2 are areas of Rubber1 and Rubber 2, which were defined as more than 4 years old and between 2 and 4 years old, respectively.

Results and Discussion
Figure 3 is a rubber tree growth map generated from the TM imagery showing mature, middle-age, and young rubber trees for Northeast Thailand.Tables 2 and 3 compare the estimated rubber tree area versus the statistical data.
Rubber tree growth estimates in most of the nineteen provinces in Northeast Thailand were based on TM images acquired in 2006, but data from Ubon Ratchathani province were from a 2009 image, because it was the best cloud-free image available covering this region.Sakon Nakhon and Nakhon Phanom provinces were based on 2007 images, and Nong Khai and Kalasin were covered by multiple images ranging from 2004 to 2007 (Table 2).The estimated rubber tree area for each province from the TM imagery actually reflected total area of rubber trees planted in a particular province as of the acquisition date(s) of the image(s).For example, Nong Khai province is covered by three TM scenes acquired from three different dates of 2004, 2006 and 2007.To reasonably validate the estimated rubber tree area, we used adjusted statistical data instead of the statistical data from each year, e.g., for Nong Khai, the validation was based on the average of rubber tree growth areas in 2004, 2006 and 2007.Similarly adjusted statistical data were used for the other provinces.For Ubon Ratchathani province, although time is asynchronous between the statistical data (2007) and Landsat 5 TM image (2009) (Table 2), it is reasonable to see that the area estimated from the latter is higher than the actual area from the former, which may indicate an increase of rubber planting area during the two years.

Total Rubber Tree Growth Area
The relative errors (RE) shown in Table 2 provides information about the accuracy of rubber tree growth area estimates for each province.Results indicate that the estimated area of rubber trees for most provinces is con-  2.
sistent with the adjusted statistical data except in four provinces, Maha Sarakham, Nakhon Ratchasima, Sakon Nakhon and Khon Kaen.Estimates for ten of the nineteen provinces performed well with REs below one third, and five of which performed very well with REs below 16%, including Nong Khai, Kalasin, Udon Thani, Ubon Ratchathani and Mukdahn.The ratio of RMSE to the statistical data for all the nineteen provinces was only 2.89%.Figure 4(a) illustrates per province rubber tree growth area comparison between the Landsat estimate and the adjusted statistical data.The regression of rubber tree growth area at the provincial level was statistically significant (y = 1.0008x + 702.4;R 2 = 0.774; p ≤ 0.001).When the three provinces with the highest REs (i.e., Maha Sarakham, Nakhon Ratchasima, Sakon Nakhon) were excluded from Table 2, the ratio of RMSE to the statistical data of the estimated rubber tree growth area for the sixteen provinces decreased to 2.08%, and R 2 for the regression line increased to 0.9106 (y = 1.0471x + 2006.1;R 2 = 0.91; p ≤ 0.001) (Figure 4(b)).The total area of rubber tree growth for the sixteen provinces was underestimated by 21,800 ha, which was less than 10%.The following analysis and validation is based on data from these sixteen provinces.

Mature Rubber Trees
The Landsat classification map defined mature rubber  trees as rubber trees more than four years old; this class covered a wider range than that represented by tapped rubber trees from the statistical data, as latex is generally collected from rubber trees that are more than 7 years old.
It is thus expected that the area of mapped mature rubber trees should be greater than that of Tapped rubber trees.
Figure 5 shows that the area of estimated mature rubber trees (rubber trees more than 4 years old) is significantly correlated (R 2 = 0.7766; p ≤ 0.001) with the area of harvested rubber trees at the provincial level (16 provinces), and the estimated area of mature rubber trees is approximately 2 -3 times more than tapped rubber trees.

Middle-Age Rubber Trees
Figure 6 illustrates that the area of rubber trees between two to seven years old for fifteen provinces (Ubon Ratchathani not included) approximated by the estimated area from Landsat using Equation ( 7) is consistent with that estimated from the statistical data.The two sets of data are significantly correlated (R 2 = 0.7911; p ≤ 0.001) and the estimated area (79,926 ha) and statistics area (77,863 ha) are equivalent (see Table 3).This suggests that the area of middle-age rubber trees can be reasona-bly reflected through the satellite imagery although the mapped rubber tree growth has different age structure with the statistical data.

Young Rubber Trees
Mapping of young rubber tree growth has been difficult for researchers as young rubber trees are often dominated by diverse ground conditions.The estimate of the area of young rubber trees was not as satisfactory as those for mature and middle-age rubber trees.Table 3 shows the area of newly planted rubber trees in 2006 for the sixteen provinces versus the estimated area from the Landsat data.Results indicate that young rubber trees were only closely estimated for Ubon Ratchathani (6016 ha versus 7281 ha from the statistical data), and remotely sensed based estimates of rubber tree growth area underestimated young rubber trees in all other provinces.This is because when using the Mahalanobis typicality method, only the pixels that are most typical of young rubber tree samples encountered in the training sites are retained.Those pixels that are less "typical" of training samples are screened out by the thresholds determined by the amount of rubber trees in each province.shows no correlation (R 2 = 0.034; p = 0.497) between the estimated and statistical data sets.However, when only looking at thirteen out of sixteen provinces (i.e., removing Loei, Amnat Charoen and Nakhon Ratchasima), the estimated area of young rubber trees shows the same trend with that from the statistical data; there was a significant correlation (R 2 = 0.9106; p ≤ 0.001) between these data sets (Figure 8(b)).This suggests that the mapping accuracy would be expected to improve if more young rubber tree training samples were included to represent a wider variability of this class (the available training samples used for young rubber trees only ac-  counted for 0.003% of the total study area).Figure 7 illustrates the total estimated area of tapped and young rubber trees versus that calculated from the statistical data.The two data sets are highly correlated (R 2 = 0.8251; p ≤ 0.001) but with a slope of 0.6479.This suggests that the underestimate was mainly due to underestimating the area of young rubber trees.

Multi-Regression Analysis
Multi-regression analysis of data from the sixteen prov- inces was conducted on the dependent variable, total area of rubber tree growth for each province from the adjusted statistical data, against the three independent variables, estimated areas of mature rubber trees (A R1 ), middle-age rubber trees (A R2 ) and young rubber trees (A R3 ).The regression equation is listed as follow: Rubber_Area_statistics = 0.844 × A R1 + 1.19 × A R2 -0.58 × A R3 + 3057.626The regression is significant (p ≤ 0.001) with R 2 = 0.913 and adjusted R 2 = 0.892.This result indicates that the total area of rubber tree growth from the statistical data was well estimated by the Landsat TM data.

Summary and Conclusions
Rubber tree growth mapping using 30 m Landsat TM data was conducted for Northeast Thailand with a very limited number of training samples.The Mahalanobis typicality method was used to identify different age rubber stands.Sixteen out of nineteen provinces were selected for final statistical analysis and validation.The validation was carried out using provincial level statistical data.Several conclusions can be drawn from this study.First, the Mahalanobis distance based typicality model successfully classified rubber stands that were typical of known samples from the training sites and overcoming the drawback of overestimation.Second, NDVI and tasseled cap transformation components together with Landsat TM 30 m reflective bands were useful in differentiating rubber trees from other vegetation and crops.Third, at regional and province scales, the estimates of mature and middle-age rubber tree growth areas using 30 m Landsat TM data were significantly close to the provincial statistical data.Therefore the typicality approach is a prominent and a robust means of mapping species distribution with limited presence information.Improvements can be made to map young rubber trees more accurately with more high-quality training information for model calibration.Future work will explore the usefulness of phenological factors for enhancing rubber tree growth discrimination.

Figure 8 Figure 4 .
Figure 4.Estimated total area of rubber trees versus that from the adjusted statistical data for (a) 19 provinces (p ≤ 0.001) and (b) 16 provinces (p ≤ 0.001) of Northeast Thailand in 2006.

Figure 5 .
Figure 5. Relationship between the estimated area of mature rubber trees (more than 4-years old) from Landsat TM data and the area of tapped rubber trees (usually more than 7-years) from the adjusted statistical data for Northeast Thailand (16 provinces) (p ≤ 0.001).

Figure 6 .
Figure 6.Estimated area rubber trees between 2 -7 years old from Landsat TM data versus that calculated from statistical data for Northeast Thailand (15 provinces) (p ≤ 0.001) in 2006.

Figure 7 .
Figure 7.Estimated area of rubber trees more than 7 and less than 2 years old versus the total area of rubber trees of the same ages calculated from the statistical data for Northeast Thailand (15 provinces) in 2006 (p ≤ 0.001)

Figure 8 .
Figure 8.Estimated area of young rubber trees (less than 2 years old) from Landsat TM data versus the area of new rubber trees planted in 2006 from the statistical data for Northeast Thailand, (a): for 16 provinces (p = 0.497); and (b): for 13 provinces (p ≤ 0.001).

Table 2 . Provincial level rubber tree growth areas for Northeast Thailand from 2006 statistical data and 30 m Landsat TM estimates.
*Maha Sarakham, Nakhon Ratchasima, Sakon Nakhon were not included in the analysis.

Table 3 . Provincial area of different age rubber trees for Northeast Thailand from 2006 statistics and 30 m Landsat TM estimates (Thai Rubber
Association, http://www.thainr.com/en/index.php?detail=stat-thai&page=1#).Maha Sarakham, Nakhon Ratchasima were not included in the analysis; ** multiple-year statistical data were adjusted.