Introduction

There are over 2600 described termite species (Kambhampati and Eggleton, 2000). They are found naturally in a wide range of terrestrial environments, particularly in the tropics where they are, in many cases, the dominant invertebrate decomposers (Eggleton, 2000). A few species are subtropical, but of limited ecological importance. However, in urban environments, they can be highly destructive and thus have a high economic importance.

Reticulitermes has a Holartic distribution (Eggleton, 2000) and is the principal termite genus present in southern Europe. In mainland Portugal, populations of the genus occur naturally over the entire country, and are a well-recognised pest of wood in service (Nobre and Nunes, 2003). All known infestations in Portugal involve termites from the species complex hitherto referred to as Reticulitermes lucifugus (Rossi), which is thought to be indigenous. The taxonomy of this genus is currently under review, as the use of molecular markers provides new phylogenetic insights (eg Clément et al, 2001; Jenkins et al, 2001; Marini and Mantovani, 2001; Austin et al, 2002). In the Mediterranean area overall, as many as seven species have been proposed (Clément et al, 2001), based on raising some morphotypes or genotypes to the species level.

Traditionally, the diagnosis of Reticulitermes species is based on the morphological characters of the soldiers, imagos and/or alates. In termites, morphological differences are great between genera, but not between species (Krishna and Weesner, 1970), and more often than not, the critical castes (usually soldiers) are not numerous in the majority of specimen sets. Cuticular hydrocarbon (CHC) profiles have been used for taxonomic purposes (eg Clément et al, 1988; Bagneres et al, 1990; Haverty et al, 1996; Collins et al, 1997; Haverty and Nelson, 1997). However, there is also evidence indicating that changes in CHC composition can be induced by environmental conditions (eg Collins et al, 1997; Woodrow et al, 2000) or by the cohabitation of different species (Vauchot et al, 1995, 1998). If changes in CHC can be induced by external factors, this may compromise the use of the character in phylogenetic reconstruction. More recently, molecular genetic methods have been introduced to gain insight into termite phylogeny (eg Jenkins et al, 2001; Marini and Mantovani, 2001; Austin et al, 2002).

Previous work on the European representatives of the genus Reticulitermes, based on a biometric analysis, suggested the presence of north–south variation within Iberian (including Portuguese) termite populations (Clément et al, 1982). The existence of significant morphological variability within the Iberian termite populations within Iberian Peninsula was also suggested by Parton et al (1981) and Clément et al (1988).

Based on Clément (1981, 1986), it was first believed that a single Reticulitermes species inhabited Portugal (R. banyulensis), together with a zone of hybrid colonies between R. banyulensis and R. grassei, the latter being found in France and Spain. Subsequent work applying morphological, chemical and molecular techniques together with previously collected preserved specimens, Clément et al (2001) suggested the additional presence of R. grassei over the majority of the country, without hybrid zones, pointing out that R. banyulensis was also in fact present in Spain and southwestern France. The mapped natural geographic distribution of Reticulitermes species presented by Clément et al (2001) included a sympatric zone containing both R. grassei and R. banyulensis within Iberian Peninsula, stretching from northwestern Spain through Portugal to southeastern Spain.

A criticism of the existing taxonomy is that termites in Portugal have been undersampled compared to other southern European countries (eg Clément et al, 2001; Marini and Mantovani, 2001; Austin et al, 2002; Kutnik et al, 2004; Luchetti et al, 2004). We therefore carried out an intensive survey with the aim of understanding the taxonomic diversity in Portugal, looking for the hypothesised sympatric zone as well as the variability within populations referred to in previous work (Parton et al, 1981; Clément et al, 1982, 1988). We attempted to resolve the species and population diversity of Reticulitermes in Portugal using a molecular marker that has been proven to discriminate variation accurately and consistently within this genus (Jenkins et al, 2001; Austin et al, 2002).

Methods

Specimen collection

A north–south transect of 640 km was established, spanning the country using the 50 × 50 km Universal Transverse Mercator grid system (Figure 1). Each square was subdivided to allocate one sampling site in each of the 25 × 25 km quadrants. Four different colonies were therefore sampled per cell, giving a total of 44 colonies over the whole transect. In each cell quadrant, a systematic search for termites was conducted for no longer than 30 min and at a maximum of two localities. The selection of localities was designed to incorporate most of the ambient climatic variations (such as temperature, humidity, rainfall, hours of sunshine) found in Portugal. The choice of sampling sites also covered the most commonly encountered microhabitats for termites.

Figure 1
figure 1

Location of sampling sites.

Specimens were collected in all the squares sampled (a total of 44), preserved in 90% alcohol and frozen at −20°C for subsequent molecular analysis. Two additional sets of specimens were collected from property infestations to test the possibility of a different species in urban habitats. Voucher specimens preserved in 90% ethanol are maintained at the Museu de História Natural, Department of Entomology, University of Lisboa.

Genetic data collection

Specimens were allowed to dry on filter paper and DNA was extracted from individual heads using a modified version of the method described by Sunnucks and Hales (1996). Extracted DNA was resuspended in 100 μl of Tris (10 mM) and stored at −20°C. The polymerase chain reaction (PCR) was used to amplify a ca. 530 bp fragment of the COII gene. The primer pairs used for PCR amplification and to prime forward and reverse sequencing reactions (5′-TAA GAC TAC AAG ACG GAG CA-3′ and 5′-AAG TGA TTT GCC GAT ACT CT-3′) were designed exploiting the mismatches of existing known sequences of R. banyulensis and R. grassei available at GenBank, and therefore would be specific to amplify a COII fragment of the two species. PCR reactions were conducted using 1 μl of the extracted DNA, in a standard 25 μl reaction, with 0.4 pmol/μl of each primer, 1.5 mM MgCl2, 0.4 mM dNTPs and 0.04 U/μl Taq DNA polymerase. Amplification was achieved in a Biometra® Personal Cycler, including a precycle denaturation at 94°C for 3 min, 30 cycles of a standard three-step PCR (51°C annealing) and a postcycle extension at 72°C for 10 min. Samples were subsequently processed by Barts and The London School of Medicine and Dentistry Genome Centre (Queen Mary, University of London), where they were post-PCR purified and sequenced in both directions using an ABI PRISM® 3700 DNA Analyzer.

Data analysis

The nucleotide sequences of the samples were edited with BIOEDIT 5.0.6 (Hall, 1999) and unambiguously aligned without gap insertions using ClustalX (Thompson et al, 1997). Haplotype distributions, number of haplotypes, haplotypes diversity (h) and nucleotide diversity (pi) were calculated using DNAsp v4.00 (Rozas et al, 2003).

A BLAST search was performed to select Reticulitermes sequences from previously nominated species for inclusion in the analysis. GenBank accession numbers of the sequences used are given in Table 1. In order to identify the most appropriate evolutionary model by the Akaike Information Criterion, the program Modeltest version 3.5 (Posada and Crandall, 1998) was used, and the most appropriate model was used to calculate pairwise genetic distances among haplotypes. Using PAUP* (Swofford, 2000), a maximum parsimony method was applied to a heuristic search with the TBR branch-swapping algorithm, and random stepwise sequence addition with 10 replicates, to construct a phylogenetic tree. The stability of the nodes was tested by 1000 bootstrap pseudoreplicates.

Table 1 Sequences from GenBank used in the cluster analysis

The genetic distances among individual haplotypes calculated were used to test for spatial differentiation (association between genetic and geographic distances) within the transect. Geographical distances were taken as the minimum linear distance between sampling sites and the significance of correlation between genetic and geographical distances was assigned by a Mantel test (1000 permutations; Sokal and Rohlf, 1995).

Using the split decomposition method, a parsimonious split tree was calculated as set out in SPLITS TREE version 4.0 (Huson and Bryant, 2005). The split-decomposition method is based on the fact that any data set can be partitioned into sets of sequences or ‘splits’ (therefore, a split is the division of haplotypes into two exclusive sets), and from combining those splits successively, a split-graph can be built (Bandelt and Dress, 1992; Posada and Crandall, 2001). This type of network accommodates the formation of contradictory groupings (incompatible splits), introducing a loop into the split-graph (Posada and Crandall, 2001), which clarifies the possible presence of alternative groupings. The most statistical parsimonious unrooted haplotype network was also constructed using TCS version 1.18 (Clement et al, 2000), a more traditional method that emphasises what is shared among haplotypes that differ minimally (Posada and Crandall, 2001).

Results

Sequencing of part of the mtDNA COII locus PCR-amplified product revealed a clear amplicon with an average size of 527 bp (GenBank accession numbers AY954404–AY954449). The sequenced portion falls between base pairs 89 and 616 of the complete COII region of Drosophila yakuba (X03240 used for alignment). A total of 371 positions were constant and 95 of the variable positions were parsimony informative. Twenty haplotypes with a total of 28 polymorphic nucleotide sites were observed, group E being the most common one with five representatives (Table 2). Haplotype diversity (h) was 0.956. The nucleotide diversity (pi) was 0.008, and the Tajima's D-test resulted in a nonsignificant P-value (D=−1,212; P>0.1): the null hypothesis of neutrality for the mtDNA COII gene was therefore not rejected.

Table 2 Sequence variation of the 20 haplotypes observed in Reticulitermes specimens

The samples from pest infestations were included in the analysis but did not result in new haplotypes. The GTR+G model, assuming rate matrix parameters of 0.2238, 14.5788, 0.7971, 1.7060 and 14.5788, and base frequencies of 0.4173 for A, 0.2686 for C, 0.11128 for G and 0.2013 for T, proved to be the most appropriate evolutionary model for the COII data set.

The maximum parsimony analysis (Figure 2), undertaken to examine relationships between the specimens sampled and established Reticulitermes species, showed a well-supported clade in which all the sampled haplotypes were grouped with the majority of sequences (from GenBank) as being of R. grassei and were, therefore, designated as such. The only exceptions were two specimens from Spain (Rg9 – AY510596; Rg10 – AY510601), which clustered with R. banyulensis.

Figure 2
figure 2

Single most parsimonious tree based on the mtDNA COII gene during a branch and bound search with PAUP*. Bootstrap values for 1000 pseudoreplicates are listed at the branches supported at ≥50%. Rooted with Coptotermes formosanus (Cf). Codes according to Tables 1 and 2.

Genetic variation among the R. grassei haplotypes ranged from 0.1 to 1.9%. The network resulting from the splits decomposition method is shown in Figure 3, together with the distribution of the genetically identical groups in the transect. The fit of approximation (measured by the sum of all approximated distances divided by the sum of all original distances) was 98.9%, an indication of how well the split-graph represents the input distance matrix. The haplotypes L, M and N formed a clearly distinct clade, both on the phylogenetic tree and on the network approach (Figures 2 and 3). The statistically parsimonious method highlighted the haplotype with the highest outgroup probability as haplotype E (displayed as a square: Figure 3). The E and I haplotypes differ from each other only at one nucleotide site (Figure 3) and are linked to some of the other haplotypes through a loop, indicating that the splitting defined contradictory groupings.

Figure 3
figure 3

(a) Distribution of the 20 haplotypes, based on partial mitochondrial COII sequencing (527 bp), of R.grassei within the sampled quadrates. (b) Network parsimonious split decomposition tree. The loops indicate contradictory groupings. The bootstrap values are shown (1000 bootstraps). (c) Statistical parsimonious unrooted haplotype network. Each line represents one mutational step connection between two haplotypes and all are supported as being parsimonious at the 95% level. The size of the square or oval corresponds to the haplotype frequency. Small empty circles represent missing intermediate haplotype states.

The Mantel test was significant (r=0.292; P<0.01), thus showing the existence of an association between the genetic and geographic distances.

Discussion

The molecular analysis showed that R. grassei was the only species encountered within the transect. This species therefore seems to occupy all habitats within the range of environmental conditions found in Portugal. R. banyulensis was not encountered, contrary to the previous suggestion of Clément (1981, 1986). However, the possibility of DNA leaking across species borders should be considered. A well-supported clade (Figure 2) joined all the sequences from R. banyulensis and two Spanish haplotypes considered to belong to R. grassei. The genetic similarities between R. grassei and R. banyulensis can be explained by recent speciation, as suggested by Clément et al (2001). This group may have diverged as recently as 8000 years ago, during the Atlantic period. Therefore, the morphological variation and enzymatic polymorphism referred to by Clément (1981), Parton et al (1981) and Clément et al (1982, 2001) might be too recent to be reflected in the mtDNA region sequenced.

The hypothesised presence of a north–south variation within Portuguese termite populations (proposed by Clément et al, 1982) was incoporated in the design and placement of the transect. An association between genetic and geographical distances was demonstrated, leading to the conclusion that the termite populations sampled have differentiated in situ, with geographically closer populations being more closely related genetically. Therefore, it can be assumed that the present distribution of Reticulitermes populations in Portugal is mainly the result of the natural population dynamics without an overall anthropogenic effect. As an example of the opposite situation, Austin et al (2004) found, in Oklahoma, an important anthropogenic role in the distribution pattern of populations as indicated by the absence of geographic association of the haplotypes.

Haplotype E was found at five locations, within 90 km of each other. The network analysis accounts for the possibility that, as predicted by coalescence theory, ancestral haplotypes will be the most frequent sequences sampled in a population level study (Crandall and Templeton, 1993; Posada and Crandall, 2001). If predictions from coalescence theory are to be applied to intraspecific phylogeny reconstruction, haplotype E may represent the ancestral form.

We also observed adjacent locations sharing the same haplotype, for example, haplotypes A, Q or S. Lack of haplotype heterogeneity at the locality level may indicate a small-scale anthropogenic effect, but could also be explained by a lack of nestmate agonism (Bulmer and Traniello, 2002) or by mixing between adjacent colonies (Clément, 1986).

As colonies of subterranean termites grow and expand their foraging range, it is possible for groups of foragers to become cut off from the rest of the colony and to then form nesting buds, which may produce new neotenics from among the workers present, leading to independent colonies. Therefore, one may expect to observe several adjacent colonies sharing the same genetic background. Given time and appropriate isolation mechanisms (behavioural, chemical or temporal), these colonies may evolve to become differentiated from the common ancestor, as may have been the case for the haplotypes found at only one location. The discovery of unique haplotypes suggests limited current levels of genetic exchange.

Haplotypes T and G were found at a few locations within the transect and at the urban sites sampled outside the transect. Since information on the adjacent populations is lacking, the mechanism of dispersal of these two haplotypes remains unclear.

The geographical component of the genetic polymorphism should be the subject of future research, incorporating several spatial scales, as the actual population structure reflects colonisation and differentiation traits that have occurred in the past. Understanding the means by which termites spread through a habitat is of great importance, especially in the context of introductions taking place accidentally by humans. According to Husseneder et al (2003), the key to understanding the establishment and proliferation of termite colonies lies in molecular genetic research on termite populations, both native and introduced. The extent of genetic variation and subsequent gene flow in Reticulitermes species in mainland Portugal has not yet been investigated.