Mitochondrial Genome Supports Sibling Species of Angiostrongylus costaricensis (Nematoda: Angiostrongylidae)

Angiostrongylus costaricensis is a zoonotic parasitic nematode that causes abdominal or intestinal angiostrongyliasis in humans. It is endemic to the Americas. Although the mitochondrial genome of the Brazil taxon has been published, there is no available mitochondrial genome data on the Costa Rica taxon. We report here the complete mitochondrial genome of the Costa Rica taxon and its genetic differentiation from the Brazil taxon. The whole mitochondrial genome was obtained from next-generation sequencing of genomic DNA. It had a total length of 13,652 bp, comprising 36 genes (12 protein-coding genes—PCGs, 2 rRNA and 22 tRNA genes) and a control region (A + T rich non-coding region). It is longer than that of the Brazil taxon (13,585 bp). The larger mitogenome size of the Costa Rica taxon is due to the size of the control region as the Brazil taxon has a shorter length (265 bp) than the Costa Rica taxon (318 bp). The size of 6 PCGs and the start codon for ATP6, CYTB and NAD5 genes are different between the Costa Rica and Brazil taxa. Additionally, the two taxa differ in the stop codon of 6 PCGs. Molecular phylogeny based on 12 PCGs was concordant with two rRNA, 22 tRNA and 36 mitochondrial genes. The two taxa have a genetic distance of p = 16.2% based on 12 PCGs, p = 15.3% based on 36 mitochondrial genes, p = 13.1% based on 2 rRNA genes and p = 10.7% based on 22 tRNA genes, indicating status of sibling species. The Costa Rica and Brazil taxa of A. costaricensis are proposed to be accorded specific status as members of a species complex.


Introduction
Angiostrongylus costaricensis Morera and Céspedes, 1971 is a zoonotic parasitic nematode of the Angiostrongylidae family (superfamily Metastrongyloidea) [1]. It resides in the mesenteric arteries of its definitive hosts and its eggs embryonate in the intestinal wall [1]. There are some 16 definitive hosts, including 10 species of rodents, 1 species of opossum, 2 species of carnivores and 3 species of primates [2]. Dogs have been suggested to play a role as a reservoir host [3].
This parasite was first discovered in humans in Costa Rica [4]. It is endemic to the Americas (from southern USA to northern Argentina in South America) and in humans causes abdominal or intestinal angiostrongyliasis, which mimics appendicitis with eosinophilia [5,6]. The first reported outbreak of abdominal angiostrongyliasis occurred in Guatemala [7]. Databases from 1996 to April 2012 revealed 27 case descriptions of abdominal angiostrongyliasis and 1 case series of 194 patients [8]. The main countries of origin were Costa Rica (89.6%), Brazil (2.7%) and the United States (1.8%). The life cycle of the Brazil taxon of A. costaricensis in its vertebrate hosts (Sigmodon hispidus) and in a mouse model [9,10] is much more complex than that of the Costa Rica taxon [11].
Molecular phylogeny based on COI nucleotide sequences [12] and ITS-2 gene [13] indicated possible occurrence of cryptic species for the A. costaricensis taxa from Costa Rica and Brazil. A multi-gene phylogeny will help to resolve the taxonomic status.
To date the mitochondrial genome of the Brazil taxon of A. costaricensis based on long-PCR method has been published [14]. We report here the complete mitochondrial genome (mitogenome) of the Costa Rica taxon of A. costaricensis by next-generation sequencing and comparison with the mitogenome of Brazil taxon to determine their taxonomic status as sibling species.

Ethics statement
A. costaricensis is a parasitic nematode. It is not endangered or protected by law. No permits are required to study this lungworm.

Specimen and genomic DNA extraction
A. costaricensis adult specimen preserved in RNAlater (RNA stabilization solution) was a gift from the Department of Parasitology, University of Costa Rica. Whole genomic DNA was extracted using G-spin Total DNA Extraction Mini Kit (iNtRON Biotechnology, Inc, Korea) following the manufacturer's instructions with minor modification.

Sample and library preparation
The purified genomic DNA was quantified using Qubit dsDNA HS Assay Kit (Life Technologies, USA) and normalized to 50 ng. Library was prepared using Nextera DNA Sample Preparation Kit (Illumina, USA) following the manufacturer's protocols. Size estimation of the library was performed on a 2100 Bioanalyzer using High Sensitivity DNA Analysis Kit (Agilent Technologies) and quantified with Qubit 2.0 Fluorometer (Life Technologies, USA).

Genome Sequencing
The library was normalized to 12 pM and sequenced using the Illumina MiSeq Desktop Sequencer (2 x 150 bp paired-end reads) (Illumina, USA).

Sequence and genome analysis
Raw sequences were extracted from the Illumina MiSeq system in FASTQ format and the quality of sequences was evaluated using the FastQC software [15]. All the ambiguous nucleotides and reads with an average quality value lower than Q20 were excluded from further analysis.
De novo assembly was performed using the CLC Genomic Workbench version 7.0.4 (Qiagen, Germany) and contigs greater than 13 kbp were subjected to BLAST [16] alignment against the nucleotide database at National Center for Biotechnology Information (NCBI). Contigs with hits to mitochondrial genes or genomes were identified and extracted from CLC Genomic Workbench.

Mitogenome identification, annotation and visualization
A contig identified as mitogenome was manually examined for repeats at the beginning and end of the sequence to establish a circular mtDNA. It was then annotated with MITOS [17] followed by manual validation of the coding regions using the NCBI ORF Finder (http://www. ncbi.nlm.nih.gov/gorf/gorf.html). The sequin file generated from MITOS was edited and submitted to NCBI according to ORF Finder result (NCBI GenBank accession number KR827449). The circular mitogenome of A. costaricensis was visualized with Blast Ring Image Generator (BRIG) [18].

Mitogenomes from GenBank
The mitogenome of A. costaricensis available from GenBank (NC_013067) was based on Brazil taxon. Mitogenomes of A. cantonensis (NC_013065) and A. vasorum (NC_018602) were used for comparison with Aelurostrongylus abstrusus (NC_019571) as an outgroup.

Phylogenetic analysis
The total length of the aligned sequences of 36 mt-genes, 12 protein-coding genes (PCGs), 2 rRNA and 22 tRNA genes as well as the selected models used for maximum likelihood (ML) and Bayesian Inference (BI) analyses are summarized in Table 1.
The 12 PCG sequences were separately aligned using ClustalX [19] program and were subsequently edited and trimmed using BioEdit v.7.0.5.3 [20]. The sequences of rrnS, rrnL and 22 mt-tRNA genes were aligned by MAFFT version 7 [21]. Kakusan v.3 [22] was used to determine the best-fit nucleotide substitution models for maximum likelihood (ML) and Bayesian (BI) analyses selected using the corrected Akaike Information Criterion [23] and the Bayesian Information Criterion [24], respectively. Phylograms of 12 concatenated PCGs, 36 mt-genes, 2 rRNA genes and 22 mt-tRNA genes were constructed using TreeFinder [25]. Bootstrap values (BP) were generated via 1,000 ML bootstrap replicates. Bayesian analyses were conducted using the Markov chain Monte Carlo (MCMC) method via Mr. Bayes v.3.1.2 [26], with two independent runs of 2x10 6 generations with four chains, and with trees sampled every 200 th generation. Likelihood values for all post-analysis trees and parameters were evaluated for convergence and burn-in using the "sump" command in MrBayes and the computer program Tracer v.1.5 (http://tree.bio.ed.ac.uk/software/tracer/). The first 200 trees from each run were discarded as burn-in (where the likelihood values were stabilized prior to the burn-in), and the remaining trees were used for the construction of a 50% majority-rule consensus tree. Phylogenetic trees were viewed and edited by FigTree v.1.4 [27].

Mitogenome analysis and features
Next-generation sequencing on Illumina MiSeq Desktop Sequencer generated approximately 4,302,276 reads from A. costaricensis (Costa Rica taxon) library. Removal of low quality sequence (< Q20) and sequences shorter than 50 nucleotides resulted in 4,190,993 reads. De novo assembly with these reads resulted in 440 contigs with maximum length of 13,652 bp and N50 of 452. The total GC content was 25.1%, with base composition of 26.7% A, 48.2% T, 18.8% G, and 6.3% C. The mitogenome of the Costa Rica taxon of A. costaricensis was 13,652 bp long, comprising 36 genes (12 protein-coding genes-PCGs, 2 rRNA genes, and 22 tRNA genes) and a non-coding region (A + T-rich control region) (Fig 1, Table 2). All the 36 genes and the control region were located on the H-strand. Spacing sequences were present in 17 regions, ranging from 1 to 64 bp with the largest between NAD4 and COX1 genes. The sequence with 64 bases had clear stem-loop structures and a long poly-T stretch of 11 bp (S1 Fig). Three regions had overlaps with 1 bp each ( Table 2). The control region (318 bp) was flanked by trnA(tgc) and trnP(tgg) genes (Fig 1).
The commonest start codon was TTG (in 7 PCGs-cox2, cox3, cytb, nad1, nad2, nad3, nad4), followed by two each for ATG (nad5, nad6) and ATT (cox1, nad4l), and one for ATA (atp6). Four PCGs had TAG stop codon (atp6, cox2, nad4, nad6) and three had TAA (cox1, cytb, nad1) while the remaining five genes (cox3, nad2, nad3, nad4l, nad5) had incomplete T stop codon (Table 2). Table 3 summarizes the nucleotide composition of the mitochondrial whole genome, protein-coding genes, rRNA genes and control region. All were A+T rich. The A + T content for PCGs ranged from 69.5% (cox1) to 79.3% (nad2). Only one PCG (cox1) had A + T content of less than 70%. The A + T content of the non-coding control region was 87.1%. The GC skewness values for the whole genome, PCGs, rRNA genes and control region were positive (0.339 to 0.800) indicating bias toward the use of Gs over Cs. Although the AT skewness value was negative for the whole genome, it was variable for individual genes.
Of the tRNAs, tryptophan (W) had UCA anticodon (opal suppressor) instead of the canonical Trp tRNA CCA anticodon (Fig 2, Table 2). The cloverleaf structure for 18 tRNAs lacked the TCC-arm; lysine (K) and methionine (F) lacked the DHU-stem while serine S1 (AGN) and S2 (UCN) had TCC-arm but lacked the DHU-arm (Fig 2). The number of base pairs in the DHU-stem ranged from 3 to 4 except the absence of DHU-arm in serine S1 and serine S2 (S2 Fig; S1 Table). The TCC-stem of serine S1 had 4 base pairs and serine S2 had 5 base pairs. The number of bases in the D-loop and TCC-loop when present was variable (Fig 2).

Discussion
Mitochondrial genomes of nematodes have been studied with respect to genetics, epidemiology, systematics, phylogeny and evolution [28][29][30][31][32][33][34]. To date, of the 13 species of Angiostrongylus genus [35] the mitogenomes of three species (A. cantonensis, A. costaricensis Brazil taxon, and A. vasorum) have been documented [14,31,36]. Additionally the mitogenome of Aelurostrongylus abstrusus, another member of the Angiostrongylidae has been published [37]. All these mitogenomes were sequenced by the long-PCR method. The present mitogenome of A. costaricensis (Costa Rica taxon) was obtained by next-generation sequencing. Based on COI nucleotide sequences the Costa Rica taxon of A. costaricensis was quite different from the Brazil taxon, with an uncorrected p-distance of 11.4% [12]. Similar differentiation has been reported for the nuclear ITS-2 gene [13]. The COI and ITS-2 results indicated the possible occurrence of sibling species for the A. costaricensis taxa.
The mitogenome size of A. costaricensis (Costa Rica taxon) (13,652 bp) is larger than those of the Brazil taxon (13,585 bp) and the congeners A. cantonensis (13,491-13,502 bp) as well as A. vasorum (13,422 bp) but is smaller than that of Ae. abstrusus (13,913 bp). The larger mitogenome size of the Costa Rica taxon (A. costaricensis) than the Brazil taxon is due to the size of the control region as the Brazil taxon has a shorter length (265 bp) than the Costa Rica taxon (318 bp). The long intergenic sequence between nad4 and cox1 genes has 64 bp with a 11-bp   Table).
With the exception of COX1-3, NAD1, NAD4L and NAD5 genes, the sizes of the other 6 PCGs are different between the Costa Rica and Brazil taxa of A. costaricensis ( Table 5). The start codon for ATP6, CYTB and NAD5 genes in the Costa Rica taxon differs from those of the Brazil taxon. Additionally, the two taxa differ in the stop codon of 6 PCGs (cox2, cytb, nad1, nad2, nad3 and nad4) ( Table 5). The Costa Rica taxon has 5 incomplete T stop codons (cox3, nad2, nad3, nad4l, nad5) compared to 3 in the Brazil taxon (cox3, nad4l, nad5). The incomplete T stop codons can be converted to TAA by post-translational polyadenylation [38].
Of the 22 tRNAs, 17 have different length between the Costa Rica and Brazil taxa ( Table 6). The total length of the tRNAs for the Costa Rica taxon (1247 bp) is longer than that for the Brazil taxon (1230 bp). The cloverleaf structure for the tRNAs is similar for both taxa.  Table 4. Percentage of uncorrected "p" distance matrix between Costa Rica and Brazil taxa of Angiostrongylus costaricensis and related taxa based on (a) 36 mt-genes, (b) 12 protein-coding genes, (c) two rRNA genes, and (d) 22 tRNA genes. The A+T content of the whole mitogenome in the Costa Rica taxon of A. costaricensis (74.9%) is higher than that of the Brazil taxon (73.2%) ( Table 3). The control region, rRNA genes and all the PCGs (excepting nad6) in the Costa Rica taxon have higher A + T content (Table 3). In the present study based on 36 mt-genes, 12 PCGs, 2 rRNA and 22 tRNA genes the Costa Rica taxon of A. costaricensis is distinctly different from the Brazil taxon, with uncorrected genetic distance of p = 15.3%, 16.2%, 13.1% and 10.7%, respectively ( Table 4). The genetic distances are comparable to those of other nematode cryptic species. Based on COI sequences, the genetic distance bewteen the sibling species A. cantonensis and A. malaysiensis is p = 11.1-11.7% [8]. The interspecific divergence bewteen A. cantonensis and A. malaysiensis is p = 1.7-4.1% based on 66 kDa protein gene [39] and p = 8.3-9.2% based on CYTB sequences [36]. In contrast, based on CYTB gene the intraspecific divergence for A. cantonensis is p = 0-2.9% and for A. malaysiensis p = 0-0.1% [40]. In the Rhabditis (Pellioditis) marina species complex, the three cryptic species have average interspecific divergence (based on COI gene) of 4.6-11.7% [41]. The present findings of genetic distance of p = 10.7-16.2% (Table 4) between the Costa Rica and Brazil taxa of A. costaricensis provide strong molecular evidence for them to be treated as members of a species complex, as earlier indicated by COI sequences [12]. In that case, in accordance with the type locality the Costa Rica taxon belongs to the nominal species. The Brazil taxon therefore warrants a new specific name.
In summary, we have successfully sequenced the whole mitochondrial genome of the Costa Rica taxon of A. costaricensis by next-generation sequencing. The genome features are similar to other Angiostrongylidae taxa. The phylogenetic tree based on 36 mt-genes is concordant with those based on 12 PCGs, 2 rRNA and 22 tRNA genes. Based on distinct genetic difference, the Costa Rica and Brazil taxa of A. costaricensis are proposed to be accorded specific status as members of a species complex.