Sequence Analysis of the Cytochrome C Oxidase Subunit I Gene of Pseudagrion pilidorsum (Odonata: Coenagrionidae)

Pseudagrion pilidorsum is one of more than 140 species of Pseudagrion (family Coenagrionidae), the largest genus of damselflies. This species exhibits dimorphism in the body coloration of males and females, which makes it difficult to distinguish from congeneric species. This study analyzed the cytochrome C oxidase subunit I (COI) gene sequence of P. pilidorsum found in Bogani Nani Wartabone National Park (North Sulawesi) and compared it with other sequences of P. pilidorsum from distinct geographical locations in Asia. The COI gene for the Sulawesi specimen was amplified using the universal primer pair LCO1490 and HCO2198. A sequence homology search was conducted through BLAST. Multiple sequence alignment was executed using CLUSTAL O (1.2.1). A phylogenetic tree was constructed using the Neighbor-Joining method, and genetic distance was calculated using the Kimura 2-parameter model. The COI gene sequence of the Sulawesi specimen has an identity in the range of 83.99-89.10% with other P. pilidorsum sequences deposited in GenBank, namely KF369526 (Sarawak specimen), AB708543, AB708544, and AB708545 (Japan specimens). The genetic distance falls in the range of 0.146-0.149 between the Sarawak specimen and the Japan specimens; 0.122-0.125 between the Sulawesi and Japan specimens; and 0.185 between the Sulawesi specimen and the Sarawak specimen. It can thus be inferred that the Sarawak and Japan specimens may not belong to the same species; the Sulawesi and Japan specimens may not belong to the same species; and the Sarawak and Sulawesi specimens might be placed in different genera.


Introduction
There are more than 5,875 species of dragonflies worldwide; 2,739 are from the suborder Zygoptera (damselflies), and 2,941 are from the suborder Anisoptera (dragonflies) [1]. Eight hundred and seventy species are in the Australasian region, which includes Sulawesi, the Moluccas, Papua, and Australia. Approximately 175-250 species in this region have not been described [2]. An estimated 1,500 species of dragonflies still await description around the world, so the number of dragonfly species that exist today could approach 7,000 [2]. One damselfly species found in North Sulawesi is Pseudagrion pilidorsum (Brauer 1868) from the family Coenagrionidae. This damselfly is difficult to distinguish from others in the same genus because of its variable coloration and the similarity of morphological characteristics among congeners. The existing methods for identifying unknown insects rely on morphological characteristics, which presents a problem if organisms alter themselves physiologically and morphologically in response to unfavorable environmental conditions. This situation may often lead to the incorrect identification of species. A tool that enables both fast and accurate identification based on DNA sequence is regarded as one taxonomic method to analyze species diversity [3]. Therefore, the DNA barcode was developed for easy identification [4].
Some studies have used DNA barcodes to calculate a genetic distance threshold, e.g., for butterflies [5], crustaceans [6], corals [7], and fishes [8], but such a threshold cannot be generally applied to species identification because intra- and interspecific diversity is inconsistent and highly dependent on the organism observed [9]. The latest advances in the fields of statistics and geographical genetics have enabled investigation into how environmental factors affect the geographical organization and population structure of molecular genetic diversity within species [10]. The DNA barcode recommended for animal species identification is the cytochrome C oxidase subunit I (COI) gene, the most important protein-coding gene in mitochondrial DNA (mtDNA). This gene has been widely used in molecular evolution research and for evaluating inter- and intraspecific diversity in ciliates [11] and insects [12]. It may also be used in conjunction with traditional morphology-based identification to more accurately identify and classify species.
Factors that influence the distribution of the diversity of dragonflies are either historical (geological) or ecological factors. Both define the current species diversity, but the compositions of the family and genus level are mainly determined by geological factors [13]. Local biodiversity is strongly influenced by speciation and extinction caused by ecological and non-ecological factors. Many studies have shown that climate patterns also have an impact on radiation and the diversification of species [14]. Some studies have investigated the habitat, distribution, abundance, and morphology of the damselfly in Bogani Nani Wartabone National Park (BNWB-NP) [15], but there has been no research on their genetic aspects to date, especially in P. pilidorsum.
In this study, we used the COI gene sequence data of the damselfly P. pilidorsum obtained from Bogani Nani Wartabone National Park. The data were compared with other available COI gene data of other P. pilidorsum obtained from GenBank. This study aimed to analyze variation in the COI gene sequences of P. pilidorsum from various places in Asia to provide information on the intraspecific variation of COI sequences based on different geographic locations.

Materials and Methods
Study Area. P. pilidorsum specimens were obtained in March 2014 on river banks in the primary forest of Bogani Nani Wartabone National Park in North Sulawesi at the coordinates 00°34'33.87"N and 123°53'58.02"E. The study area is situated at an altitude of about 241-289 meters above sea level with a temperature of 26-30 °C, 65-74% humidity, and 60-85% vegetation cover.
Procedure. The damselflies were captured using swing nets. Identification of their morphological characteristics was carried out by observing wing venation, color patterns, and genitalia using identification keys [13]. The coxae were removed carefully from the body of the male damselfly along with the flesh. The coxae and flesh were kept in 90% ethanol until use. DNA was extracted from the flesh using a universal innuPrep DNA Micro Kit (Analytik Jena) according to the protocol provided by the manufacturer. The DNA was amplified using 5x Firepol PCR Master Mix Ready-to-Load (Solis BioDyne) with the universal primer set LCO1490 (5'-GGTCAACAAATCATAAAGATATTGG-3') and HCO2198 (5'-TAAACTTCAGGGTGACCAAAAAATCA-3') [38]. Amplification was carried out according to the method used by Kairupan et al. [5] as follows: initial denaturation at 94 °C for 5 minutes, followed by 35 cycles of denaturation at 94 °C for 30 seconds, annealing at 54 °C for 45 seconds, and elongation at 72 °C for 1 minute and 30 seconds, with a final extension at 72 °C for 5 minutes. The amplicons were sent to 1st BASE Malaysia for sequencing using the primer set LCO1490 and HCO2198.
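The cycling conditions above can be written down as a small data structure, which makes the program easy to check and to total. The following is only a sketch: the step names and layout are illustrative assumptions, and the total counts programmed hold times only, ignoring the ramping between temperatures that a real thermal cycler adds.

```python
# Thermal-cycling program as stated in the Procedure paragraph.
# Names and layout are illustrative, not taken from the authors' protocol files.
INITIAL_DENATURATION_S = 5 * 60      # 94 °C, 5 minutes
CYCLES = 35
CYCLE_STEPS_S = {
    "denaturation_94C": 30,          # 30 seconds
    "annealing_54C": 45,             # 45 seconds
    "elongation_72C": 90,            # 1 minute 30 seconds
}
FINAL_EXTENSION_S = 5 * 60           # 72 °C, 5 minutes


def total_hold_time_s() -> int:
    """Sum of programmed hold times in seconds (ramp times are not included)."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    seconds = total_hold_time_s()
    print(f"{seconds} s = {seconds / 60:.2f} min of programmed hold time")
```

Under these assumptions the program holds for a little over 106 minutes in total.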
Data analysis. The chromatograms were processed using Geneious v5.6 according to the method described by Tallei and Kolondam [16]. Briefly, the sequences were pairwise aligned using global alignment with free end gaps to identify regions of 93% similarity. Approximately 30 nucleotides at the beginning of the DNA sequences were trimmed, and erroneous nucleotide readings were corrected. Other COI gene sequences of P. pilidorsum and other members of Coenagrionidae were obtained from GenBank. All sequences were subsequently trimmed to avoid ambiguous alignment and to reach the core area. The sequences were aligned using Clustal O (1.2.1) multiple sequence alignment (www.ebi.ac.uk/Tools/msa/clustalo). The phylogenetic tree was inferred using the Neighbor-Joining (NJ) method [41]. The evolutionary distances were computed using the Kimura 2-parameter approach [40]. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches [43]. Evolutionary analyses were conducted in MEGA 6.0 [42].
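For readers unfamiliar with the Kimura 2-parameter model, the distance between two aligned sequences can be sketched as below. This is a minimal illustration, not the MEGA implementation: the function name `k2p_distance` is hypothetical, and skipping sites with gaps or ambiguous bases (pairwise deletion) is an assumption, since the gap treatment used in the analysis is not stated here. With P the proportion of transitions and Q the proportion of transversions, the distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent for the model
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy sequences made up for illustration: one transition in ten sites
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

The correction matters more as sequences diverge: for small differences the value stays close to the raw proportion of differing sites, while for larger ones it grows faster because multiple substitutions at the same site are accounted for.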

Results and Discussion
Specimens of the damselfly P. pilidorsum obtained from BNWB-NP are displayed in Figures 1A and 1B. Sequence identity is presented in Table 2, while the genetic distance of COI based on the Kimura 2-parameter model is presented in Table 3. The COI gene sequence of P. pilidorsum has been deposited in GenBank with accession number KX447495 (Sulawesi specimen). It has an identity in the range of 83.99-89.10% with other P. pilidorsum sequences obtained from GenBank, namely KF369526, AB708543, AB708544, and AB708545. KF369526 is a specimen from Sarawak, Malaysia (voucher RMNH.INS.228961), while AB708543, AB708544, and AB708545 are specimens from Japan.
Based on its identity, the Sulawesi specimen is closer to the Japan specimens (88.86-89.10%) than to the Sarawak specimen (83.99%) (Table 2). This conclusion was supported by the genetic distance between the Sulawesi and Japan specimens, which ranged from 0.122-0.124 (Table 3). The identity of all P. pilidorsum ranged from 84.92-100.0%. The genetic distances of all P. pilidorsum specimens varied from 0-0.185. The genetic distance between AB708543 and AB708544, which are both Japan specimens, was zero. This similarity is in contrast with the high genetic distance between the Sulawesi specimen and the Sarawak specimen, which was 0.185. The genetic distance between the Sarawak specimen and the Japan specimens is 0.145-0.148. The Sarawak and Sulawesi specimens have a distance (0.185) that is greater than that between P. ignifer (FJ812818) and all P. pilidorsum specimens (0.151-0.173). The distance between Pseudagrion and Ceriagrion (KU220870) is 0.187-0.222. However, the distance between P. microcephalum and other Pseudagrion is 0.205-0.235. This shows that the intergeneric distance between Pseudagrion and Ceriagrion overlaps with the interspecific distance among Pseudagrion.
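The overlap argument in the preceding paragraph amounts to comparing closed intervals of distances. A minimal sketch follows; the helper name `ranges_overlap` and the variable names are hypothetical, and the numbers are the ranges quoted in the text.

```python
def ranges_overlap(a: tuple, b: tuple) -> bool:
    """True if closed intervals a = (low, high) and b = (low, high) share a value."""
    return max(a[0], b[0]) <= min(a[1], b[1])


# Kimura 2-parameter distance ranges quoted in the text
pseudagrion_vs_ceriagrion = (0.187, 0.222)      # intergeneric
microcephalum_vs_pseudagrion = (0.205, 0.235)   # interspecific within Pseudagrion
within_pilidorsum = (0.0, 0.185)                # all P. pilidorsum specimens
ignifer_vs_pilidorsum = (0.151, 0.173)          # P. ignifer vs P. pilidorsum

if __name__ == "__main__":
    # Intergeneric and interspecific ranges share the interval 0.205-0.222
    print(ranges_overlap(pseudagrion_vs_ceriagrion, microcephalum_vs_pseudagrion))
    # The range within P. pilidorsum also covers the distances to P. ignifer
    print(ranges_overlap(within_pilidorsum, ignifer_vs_pilidorsum))
```

Both comparisons report an overlap, which is why a single distance threshold cannot separate the taxonomic levels in this data set.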
The number of base substitutions per site between sequences is shown in Table 3. Standard error estimates are displayed above the diagonal. The sequence alignment reveals a high intraspecific variation within P. pilidorsum (Figure 2), although only a single amino acid substitution is produced, where Ser becomes Ala (Figure 3). With a length of 431 bp, the COI sequences of these specimens reveal a GC content of 42.9% for KX447495, 39.7% for KF369526, 38.5% for AB708543 and AB708544, and 38.7% for AB708545. Figure 4 shows the evolutionary history inferred using the NJ method [41]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The bar at the bottom represents a genetic change of 0.02 nucleotide substitutions per site calculated using the Kimura 2-parameter method. This is the number of substitutions or changes divided by the length of the sequence. The tree supports the result that the Sulawesi specimen is more closely related to the Japan specimens.
P. pilidorsum has a total body length of 4.7 cm with a wingspan of 1.8 cm. The difference between males and females is largely based on the body color of the head and thorax. The male is generally dark red and orange (Figure 1A), while the female is a dull orange to yellow color (Figure 1B). The abdomens of both males and females are yellowish. This species is found along riverbanks in primary and secondary forests as well as on farmland with a moderate to light intensity [15].
A complete understanding of the diversity of morphology, ecology, behavior, and distribution of dragonflies and damselflies must involve their evolutionary historical factors; future study of the evolution of these traits must be based on a phylogenetic framework [17]. Monitoring the biodiversity in an area that is ecologically significant for conservation will depend on reliable species identification. Molecular methods, such as DNA barcodes, provide consistent results that can be used in conjunction with traditional morphological identification as a taxonomic tool for evaluating species identification and diversity [18,19]. Mitochondrial DNA is also a reasonably accurate genetic marker because it is matrilineally derived [20].
With its 431-bp length, the COI sequence has a GC content in each P. pilidorsum specimen as follows: 42.9% for KX447495, 39.7% for KF369526, 38.5% for AB708543, 38.5% for AB708544, and 38.7% for AB708545. This indicates that the COI gene sequences used in this research are from mtDNA and not nuclear mitochondrial pseudogenes (numts), such as those found in tyrant flycatchers (birds of the family Tyrannidae) in North America. Numts are nonfunctional copies of mitochondrial genes that have been translocated into the nuclear genome [23]. In animals, the GC content for all mtDNA ranges between 13-54%, while the sequences used for the DNA barcode range from 22-53% [24]. Numts have higher GC contents compared to orthologous mtDNA [25]. DNA with a high GC content is assumed to have a heat-resistant helix structure, thus providing a selective advantage for animals with high metabolic regulation induced by the environment (e.g., light, temperature, salinity, oxygen, and pH) [26].
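GC content as reported above is the share of G and C among the bases of the trimmed sequence. A minimal sketch of the calculation is given below; the function name `gc_content` is hypothetical, the example sequence is made up for illustration, and ignoring gaps and ambiguous bases is an assumption rather than the documented behavior of the software used in the study.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # assumption: skip gaps and N
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Made-up fragment, not taken from any GenBank record
    print(f"{gc_content('GGTCAACAAATCATAAAG'):.1f}%")
```

Applied to a 431-bp fragment, a reported value of 42.9% corresponds to roughly 185 G or C bases.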
Sequence divergences are also present in the mitochondrial COI gene of insects. In the four-spotted skimmer Libellula quadrimaculata, the sequence divergence is <1% (0.01) in Europe, 0.002-0.023 in North America, and >0.008 in Japan. It is estimated that the substitution rate for insect mtDNA sequences, including the COI region, ranges between 0.015-0.013 per million years [27]. For COI, the intraspecific p-distance for the dragonfly Orthetrum melania varies between 0.00-0.03999, while the interspecific p-distance is from 0.033-0.1729 [28]. The inter-subspecific (intraspecific) Kimura 2-parameter genetic distance based on COI genes amongst the broad-winged damselflies Psolodesmus mandarinus kuroiwae, P. m. mandarinus, and P. m. dorothea ranged from 0-0.076. The genetic divergence between Psolodesmus mandarinus kuroiwae and P. m. mandarinus (0.067-0.076) is far higher than that within P. m. kuroiwae (0-0.011) or P. m. mandarinus (0-0.022). Based on this evidence, P. m. kuroiwae was proposed to be classified as a distinct species [39].
Avise (2000, cited in Hebert et al. [44]) stated that intraspecific divergences of COI genes are rarely greater than 2%; in fact, most are less than 1%. The average sequence divergence of congeneric organisms in Arthropoda is 10.1% ± 4.9 [44]. Intraspecific Kimura 2-parameter distances range from 0.122-0.124 (12.2-12.4%) for the Sulawesi (KX447495) and Japan specimens, 0.185 (18.5%) for the Sulawesi and Sarawak specimens, 0.145-0.148 (14.5-14.8%) for the Sarawak and Japan specimens, and 0-0.002 (0-0.2%) amongst the Japan specimens themselves. The related p-distance is the proportion (p) of nucleotide sites at which the two sequences being compared differ; it is obtained by dividing the number of nucleotide differences by the total number of nucleotides being compared [29], and the Kimura 2-parameter model corrects this proportion for multiple substitutions at the same site. According to Yong et al. [28], this result indicates that the Sarawak and Japan specimens (0.145-0.148), as well as the Japan and Sulawesi specimens (0.122-0.124), are different species or even possibly different genera because the Kimura 2-parameter distances are very high. The distance between the Sarawak and Sulawesi specimens is 0.185, which is greater than the distance between P. ignifer (FJ812818) and the rest of P. pilidorsum. Therefore, the Sarawak specimen needs to be revisited. The Sarawak and Sulawesi specimens can be considered for reclassification as different genera with a distance greater than 17.29% if referring to Yong et al. [28] and Avise [45]. Interestingly, the distance between P. ignifer and all P. pilidorsum (0.151-0.173) was lower than that found between the Sulawesi and Sarawak specimens (0.185). The distance between all Pseudagrion and Ceriagrion (KU220870) is 0.187-0.222, and this distance has some overlap with the distance between P. microcephalum and other Pseudagrion (0.205-0.235).
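The proportion of differing sites described in the paragraph above (the uncorrected p-distance) is simple to compute, and percent identity follows from it as 100 × (1 - p). The sketch below is illustrative only: the function names are hypothetical, the sequences are made up, and skipping gapped or ambiguous sites is an assumption. Note that the Kimura 2-parameter values in Table 3 are corrected distances and are therefore slightly larger than the corresponding p-distances.

```python
def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def percent_identity(seq1: str, seq2: str) -> float:
    """Percent identity over the compared sites, i.e. 100 * (1 - p)."""
    return 100.0 * (1.0 - p_distance(seq1, seq2))


if __name__ == "__main__":
    # Made-up ten-site alignment with a single difference
    a, b = "ACGTACGTAC", "ACGTACGTAA"
    print(p_distance(a, b), percent_identity(a, b))
```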
The highest intraspecific distance for the damselfly according to Lin et al. [39] is 0.076 (7.6%). As a comparison, the Asian Pseudagrion species studied using the COI gene are more closely related to Asian Archibasis than to two African Pseudagrion, so the damselfly taxa in Asia need to be reclassified [30].
Our findings show overlapping intra- and interspecific variations in Pseudagrion. For Diptera, for example, there is an overlapping intra- and interspecific variation in COI gene sequences, namely between 0-15.5%, so this result could represent an error in identifying the species [31]. The freshwater crayfish Cherax preissii showed the highest divergence value with a maximum genetic distance of 4.45% (patristic distance: 0.5) between the two main populations in North and South Australia [26].
Using the COI gene, Lin et al. [39] showed that the genetic divergence between Psolodesmus mandarinus kuroiwae and P. m. mandarinus is 7.0-7.6%, which was higher than that within P. m. kuroiwae (0-1.1%) or P. m. mandarinus (0-2.2%). Based on this evidence, P. m. kuroiwae should be classified as a distinct species. The greater the genetic distance between populations or individuals, the more isolated they are from one another because less interbreeding takes place between them. The degree of isolation and demographic history of populations are key factors that affect the potential of a population or individual to adapt to different environmental conditions. Changes in biogeography may have an impact on genetic variation both within and between populations [32]. The COI gene is widely used as a tool in the study of phylogeography and history at the intra- and interspecific levels because of its low or absent recombination, uniparental (maternal) mode of inheritance, conserved structure, and relatively high rate of evolution; this gene has been used extensively to study the evolution of insects [12,20,32].
Nucleotide diversity is the average number of nucleotide differences per nucleotide site being compared between DNA sequences [34]. High nucleotide variations in the data presented indicate geographical isolation, which suggests that gene flow between populations is limited in these three locations. Thus, the populations can be regarded as different conservation units because they have substantial differences at the DNA level [33]. Therefore, these three groups of the population have evolved independently over a given period, so they may have unique adaptations that ought to be preserved, although the types of adaptations are not yet clear. A population with a low level of genetic diversity is less able to adapt to new selection pressures because the limited set of genes (gene pool) in a particular group will decrease the tendency for the presence of adaptive alleles in a group or population [33]. A population or group will not only differ in adaptive loci as a direct result of natural selection but also will contain variations at neutral loci due to genetic drift, which indirectly results from selection through a reproductive barrier. This pattern was identified in a semi-natural experiment that produced the population or group Anthoxanthum odoratum, which was previously thought to be different as a result of adaptation due to the addition of different nutrients to the diet. This is an example that illustrates how local adaptation can shape the genetic diversity and differences between populations [33].
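Nucleotide diversity as defined at the start of the preceding paragraph can be sketched as the mean per-site difference over all pairs of sequences in a sample. The code below is a simplified illustration: the function name `nucleotide_diversity` is hypothetical, the toy sequences are made up, and it assumes equal-length aligned sequences without gaps, with each sampled sequence counted once.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list) -> float:
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pair_values = []
    for s1, s2 in combinations(seqs, 2):
        differences = sum(1 for a, b in zip(s1.upper(), s2.upper()) if a != b)
        pair_values.append(differences / length)
    return sum(pair_values) / len(pair_values)


if __name__ == "__main__":
    # Toy alignment of three made-up four-site sequences
    print(round(nucleotide_diversity(["AAAA", "AAAT", "AATT"]), 4))
```

In the toy alignment the three pairs differ at 1, 2, and 1 of 4 sites, so the mean is one third.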
The NJ phylogenetic tree (Figure 4) indicates that P. pilidorsum and P. ignifer formed a monophyletic group. The position of the Sulawesi specimen is close to the Japan specimens, and this trend is supported by the Kimura 2-parameter genetic distance calculation. The phylogenetic relationship represents kinship among the study groups. The horizontal dimension is the number of genetic changes indicating the evolutionary lineage that changes over time. The longer the branches in the horizontal dimension, the greater the number of changes that have occurred. By assuming one generation per year, the value of pairwise sequence divergence is 2.3% per million years for COI, so that the neutral mutation rate per nucleotide site per generation (α) is 1.15 × 10⁻⁸ [20].
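The mutation rate quoted above can be checked with a short calculation. The sketch assumes that the 2.3% pairwise divergence accrues per million years, which is the reading under which the stated value of 1.15 × 10⁻⁸ follows: pairwise divergence accumulates along two lineages, so the per-lineage rate is half of it. The function name and parameters are hypothetical.

```python
def per_generation_rate(pairwise_divergence_per_myr: float,
                        generations_per_year: float = 1.0) -> float:
    """Per-site, per-generation rate for one lineage.

    pairwise_divergence_per_myr: divergence between two lineages per million
    years, as a proportion (0.023 for 2.3%).
    """
    per_lineage_per_myr = pairwise_divergence_per_myr / 2   # two diverging lineages
    per_lineage_per_year = per_lineage_per_myr / 1_000_000  # million years -> years
    return per_lineage_per_year / generations_per_year      # years -> generations


if __name__ == "__main__":
    # 2.3% pairwise divergence per million years, one generation per year
    print(f"{per_generation_rate(0.023):.3e}")
```

With more than one generation per year the per-generation rate would be proportionally lower, which is why the one-generation assumption is stated explicitly in the text.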
The combination of fragments of COI and ND1 performs better than either fragment alone [35]. Several studies that have used more than one fragment for the DNA barcode have provided information on the level of difference and variation patterns for phylogenetic studies by examining the divergence in DNA sequences [36,37]. A character-based approach provides a higher resolution than the distance-based method in Odonata, especially for closely related taxonomic entities. Character-based DNA barcodes can be used to characterize species through a unique combination of diagnostic characteristics that are not genetic distance-based. In this way, the species barriers may be determined using a series of diagnostic evaluations about the characters that can be improved at any stage of resolution using multiple genes [18].

Conclusions
Our results indicate that the identity of the COI gene in the Sulawesi specimen and in other P. pilidorsum from GenBank (Sarawak and Japan specimens) is low, ranging from 83.99-89.10%. Moreover, the genetic distances are high enough to suggest that the Sulawesi specimen differs from the Japan specimen and the Sarawak specimen also differs from the Japan specimen. This finding also indicates that the Sarawak and Sulawesi specimens are of different genera. However, more specimens from Sulawesi are needed for analysis to clarify their taxonomic identity. Although this finding suggests the need to reclassify P. pilidorsum, in-depth study involving more than one marker for the DNA barcode as well as a character-based DNA barcode must first be undertaken.