The mitochondrial genome of the chimpanzee louse, Pediculus schaeffi: insights into the process of mitochondrial genome fragmentation in the blood-sucking lice of great apes

Blood-sucking lice in the genera Pediculus and Pthirus are obligate ectoparasites of great apes. Unlike most bilateral animals, which have 37 mitochondrial (mt) genes on a single circular chromosome, the sucking lice of humans have extensively fragmented mt genomes. The head louse, Pediculus capitis, and the body louse, Pe. humanus, have their 37 mt genes on 20 minichromosomes. The pubic louse, Pthirus pubis, has its 34 mt genes known on 14 minichromosomes. To understand the process of mt genome fragmentation in the sucking lice of great apes, we sequenced the mt genome of the chimpanzee louse, Pe. schaeffi, and compared it with the three human lice. We identified all of the 37 mt genes typical of bilateral animals in the chimpanzee louse; these genes are on 18 types of minichromosomes. Seventeen of the 18 minichromosomes of the chimpanzee louse have the same gene content and gene arrangement as their counterparts in the human head louse and the human body louse. However, five genes, cob, trnS1, trnN, trnE and trnM, which are on three minichromosomes in the human head louse and the human body louse, are together on one minichromosome in the chimpanzee louse. Using the human pubic louse, Pt. pubis, as an outgroup for comparison, we infer that a single minichromosome has fragmented into three in the lineage leading to the human head louse and the human body louse since this lineage diverged from the chimpanzee louse ~6 million years ago. Our results provide insights into the process of mt genome fragmentation in the sucking lice in a relatively fine evolutionary scale.


Background
The closest extant relatives of humans are chimpanzees (bonobos and common chimpanzees) and gorillas. Humans share a common ancestor with gorillas, Gorilla gorilla and G. beringei,~7 million years ago (MYA) and share a common ancestor with chimpanzees, Pan troglodytes and Pan paniscus,~6 MYA ( [1,2], but see [3][4][5] for variation of the divergence time estimates). Due to the close evolutionary relationship between humans and these great apes, their parasites are closely related too [6]. All great apes, with the exception of orangutans, Pongo pygmaeus, are parasitised by sucking lice (suborder Anoplura) of two genera, Pediculus and Pthirus [7,8].
Humans are parasitised by three species of lice. Pediculus capitis, the head louse, lives on human head hair and is still prevalent in many countries, particularly in school-aged children. Pe. humanus, the human body louse, lives on human clothes and is thought to have evolved to exploit this new niche when humans began to wear clothing 100,000 to 200,000 years ago [9][10][11]. Although not as prevalent as the head lice, body lice are healthily more important as they are vectors for three pathogens that cause trench fever, recurrent fever and epidemic typhus in humans [12]. Humans also have the pubic louse, Pthirus pubis, which lives on the pubic hair and sometimes eyelashes [13]. The human head louse and the body louse share a genus exclusively with the chimpanzee louse, Pe. schaeffi, whereas the human pubic louse shares a genus exclusively with the gorilla louse, Pt. gorilla [7,8,14,15].
The mitochondrial (mt) genomes of the three human lice have fragmented extensively. The typical mt genomes of bilateral animals have 37 genes on a single circular chromosome [16,17]. The human head louse and the body louse, however, have their 37 mt genes on 20 types of minichromosomes [18,19]. The 34 mt genes known of the human pubic louse are on 14 minichromosomes [19]. To understand the process of mt genome fragmentation in the sucking lice of great apes, we sequenced the mt genome of the chimpanzee louse, Pe. schaeffi, and compared it with those of the human lice. We found that the 37 mt genes of the chimpanzee louse are on 18 types of minichromosomes. We show that a single minichromosome has fragmented into three in the lineage leading to the human body louse and head louse since this lineage diverged from the chimpanzee louse~6 million years ago. Our results provide insights into the process of mt genome fragmentation in the sucking lice in a relatively fine evolutionary scale.

Methods
Collection of lice, DNA extraction, mitochondrial genome amplification and sequencing Chimpanzee lice, Pe. schaeffi, were collected from rescued wild chimpanzees by staff of the Tacugama Chimpanzee Sanctuary in Sierra Leone (sample No. B2148) and were preserved in 100 % ethanol. No animal ethical approval was required for research on parasitic lice including chimpanzee lice and human lice in Australia. Total cellular DNA was extracted from individual lice using DNeasy Blood and Tissue kit (QIAGEN). A 423-bp fragment of mt cox1 gene and a 503-bp fragment of mt rrnL gene of the chimpanzee lice were amplified initially by PCR with primer pairs mtd7-mtd9a and mtd32m-mtd34Ph (Additional file 1). These primers target conserved sequence motifs in the mt genome of the chimpanzee lice. The cox1 and rrnL fragments were sequenced directly with AB3730xl 96-capillary sequencers at the Australian Genome Research Facilities (AGRF).
Two pairs of chimpanzee louse specific primers, Pscox1Frev-Pscox1Rrev and P:sh16SF-16SrevP:schaeffi, were designed from cox1 and rrnL (Additional file 1). PCRs with these chimpanzee louse specific primers amplified the complete cox1 minichromosome (~3.3 kb, Additional file 2A) and L 2(/1) -rrnL minichromosome (~3.0 kb, Additional file 2B) except a 203-bp and a 231bp gap between the primers, respectively. PCR amplicons from these two minichromosomes were sequenced using a primer-walking strategy with AB3730xl 96-capillary platform at the AGRF (see Additional file 1 for primers used). Sequences of the non-coding regions of the cox1 minichromosome and L 2(/1) -rrnL minichromosome were obtained and aligned with ClustalX [20]. A forward primer (PsF) and a reverse primer (PsR) were designed from highly conserved non-coding sequences adjacent to the 5′-end and the 3′-end, respectively, of the coding regions (Additional files 1 and 3). The PCR with PsF and PsR produced a mixture of amplicons ranging from 200 to 1800 bp from the coding regions of the mt minichromosomes of the chimpanzee louse (Additional file 2C). Amplicons generated with PsF − PsR were sequenced with a Roche GS FLX platform at the AGRF. To obtain non-coding region sequences, fulllength trnK-nad4 minichromosome and trnL-rrnS-trnC minichromosome were also amplified by PCR with chimpanzee louse specific primers (Additional file 1), and were sequenced together with cox1-minichromosome amplicons using an Illumina Hiseq platform at the Beijing Genomics Institute (BGI).
Takara LA Taq was used in PCR following the manufacturer's protocol. Each PCR (25 μL)  Thermal cycling conditions were 94°C for 1 min; then 38 cycles of 94°C for 30 s; 50-58°C (depending on primers) for 30 s; 68°C for 1-4 min (depending on amplicon size); and finally 68°C for 2-8 min (depending on amplicon size). Negative controls were run with each PCR experiment to detect false positive amplicons and DNA contamination. PCR products were checked by agarose gel eletrophoresis (1 %); the sizes of PCR products were estimated by comparing with molecular markers. PCR products were purified using the Wizard® SV Gel and PCR Clean-up System (Promega).

Assembly of Roche and Illumina sequence reads and mitochondrial genome annotation
Raw sequence-reads were assembled de novo using Geneious [21] with the parameters: 1) minimum overlap 150 bp (for Roche reads) and 70 bp (for Illumina reads); 2) minimum overlap identity 95 %; 3) maximum 10 % gaps per read; and 4) maximum gap size 10 bp. Proteincoding and rRNA genes were identified with BLAST searches (Basic Local Alignment Search Tool) of NCBI database [22] and verified by sequence alignment with their homologous mt genes of the human lice [18,19]. tRNA genes were identified with tRNAscan-SE [23], ARWEN [24] and MITOS [25]. Shared identical sequences between genes were identified with Wordmatch [26]. Sequence alignment was made with Clustal X [20].

Results and discussion
The mitochondrial genome of the chimpanzee louse comprises 18 types of minichromosomes We sequenced the amplicons generated with the primer pair, PsF-PsR, from the chimpanzee louse, Pe. schaeffi, and obtained 24,230 sequence-reads, which range from 150 to 602 bp long. We assembled these sequence-reads into contigs and identified all of the 37 mt genes typical of bilateral animals in the chimpanzee louse; these genes are on 18 types of minichromosomes ( Fig. 1; Table 1). Each minichromosome has 1-5 genes and is 3-4 kb in size, with a coding region, 145 (for trnW-trnS 2 minichromosome) to 1617 bp (for nad5 minichromosome) in size, and a non-coding region (NCR). Seventeen of the 18 mt minichromosomes of the chimpanzee louse have their counterparts with the same gene content and gene arrangement in the human head louse, Pe. capitis, and the human body louse, Pe. humanus [18,19]. Only cob-trnS 1 -trnN-trnE-trnM minichromosome of the chimpanzee louse is not seen in the human head louse and the body louse; the five genes on this minichromosome are on three minichromosomes in the two human lice.
We identified 21 of the 22 mt tRNA genes of the chimpanzee louse with tRNAscan-SE [23], ARWEN [24] and MITOS [25] on the basis of their inferred secondary structures and anticodon sequences (Additional file 4). We could not identify trnD with these programs. Instead, we used the trnD sequences of the human head louse and the body louse [19] to locate this gene in the chimpanzee louse. trnD of the chimpanzee louse is on a minichromosome with trnT and trnH and has 72 % sequence similarity to trnD of the human head louse and body louse (Additional file 4). Furthermore, the location of trnD of the chimpanzee louse relative to their neighbour genes is the same as that in the human head louse and the body louse ( Fig. 1) [18,19].
As in the human lice [19], trnL 1 (tag) and trnL 2 (taa) of the chimpanzee louse differ by only one nucleotide at the third anticodon position (Additional file 4). Our Roche deep sequencing revealed that both trnL 1 (tag) and trnL 2 (taa) were present before rrnS and rrnL in the chimpanzee louse: for rrnS the vast majority were trnL 1 (94.5 %) whereas for rrnL the vast majority were trnL 2 (97.3 %) (Fig. 1). The relative abundance between the Fig. 1 The mitochondrial minichromosomes of the chimpanzee louse, Pediculus schaeffi. Each minichromosome has a coding region with gene names, transcription orientation, and length indicated, and a NCR in black. The minichromosomes are in alphabetical order according to the names of their protein-coding and rRNA genes, followed by those with tRNA genes only. Protein-coding genes are abbreviated as atp6 and atp8 (for ATP synthase subunits 6 and 8), cox1-3 (for cytochrome c oxidase subunits 1 to 3), cob (for cytochrome b), and nad1-6 and 4 L (for NADH dehydrogenase subunits 1 to 6 and 4 L). rrnL and rrnS are for large and small rRNA subunits. tRNA genes are shown with the single-letter abbreviations of their corresponding amino acids. Chimpanzee image: courtesy of the Tacugama Chimpanzee Sanctuary two trnL genes in the four types of minichromosomes that contain rrnS and rrnL genes of the chimpanzee louse was consistent with that of the human head louse and the body louse we investigated (Additional file 5).
We sequenced the full-length NCR of cox1 minichromosome with an Illumina Hiseq platform; we were unable, however, to sequence a gap downstream the GC-rich motif in the NCR of trnK-nad4 minichromosome, nor trnL-rrnS-trnC minichromosome, although we used the same sequencing strategy. The NCR of cox1 minichromosome is 1770 bp long with a GC-rich motif and an ATrich motif at two ends that are highly conserved in other minichromosomes. There are two repeat units in the NCR of cox1 minichromosome, 112 and 114 bp long respectively; these two units are 406 bp apart from one another and have 94 % similarity (Additional file 3).
Recombination hot-spots revealed by shared identical sequences between mitochondrial genes in the chimpanzee louse Seven stretches of identical nucleotide sequences, 31 to 133 bp long, were found between five pairs of mt genes in the chimpanzee louse ( Table 2). As in the three human lice [19], trnL 1 and trnL 2 of the chimpanzee louse differ by only one nucleotide at the third anticodon position, and thus share two stretches of identical sequences, 32 and 34 bp respectively ( Fig. 2a; Additional file 4). nad5 and rrnL share 101-bp identical sequence in the chimpanzee louse; these two genes share 99-bp identical sequence in the human head louse and the human body louse, but not in the human pubic louse [19], nor in other blood-sucking lice [27]. cox1 and nad3 share 49 bp of identical sequence in the chimpanzee louse; these two genes, however, do not share longer-thanexpected identical sequences in the human lice, nor in other sucking lice. atp8 and cob share 26 bp of identical sequence in the human head and body lice but share 54 bp of identical sequence in the chimpanzee louse. nad4 and nad5 share two stretches of identical sequences, 31 and 133 bp long, in the chimpanzee louse; in the human head louse and body louse, these two genes share 30 and 127 bp identical sequences.  Both trnL 1 (tag) (94.5 %) and trnL 2 (taa) (5.5 %) were found before rrnS; c both trnL 1 (2.7 %) and trnL 2 (97.3 %) were found before rrnL (Additional file 5) The 133-bp identical sequences shared between nad4 and nad5 in the chimp lice are at the 3′ ends of these two genes. The question arisen was whether or not these sequences were indeed part of nad4 and nad5. We annotate these sequences as part of nad4 and nad5 on the basis that: 1) each of them is part of an open reading frame (ORF); 2) the two ORFs with these sequences included are in the usual, expected length of nad4 and nad5 genes of insects; 3) these sequences are only present in nad4 and nad5, not in any other genes nor noncoding regions in the chimpanzee louse; and 4) these sequences have high similarity with their corresponding regions of nad4 and nad5 in the human lice (Fig. 3).
The longer-than-expected identical sequences shared between mt genes in the chimpanzee louse provide further evidence for recombination between mt minichromosomes in the blood-sucking lice [18,19,[27][28][29][30]. Furthermore, with the only exception of the 49-bp identical sequence shared between cox1 and nad3, which was found only in the chimpanzee louse ( Table 2), all of the six other identical sequences shared by different genes in the chimpanzee louse were in similar length, had high similarity and were at the same gene locations as their counterparts in the human head louse and the body louse (Fig. 2a, Fig. 3). Given the chimpanzee louse had a common ancestor with the human head louse and the body louse~6 MYA (see below), the conserved gene locations for the six shared identical sequences are clearly hot spots for homologous recombination between mt genes in the chimpanzee louse, the human head louse and the body louse (Fig. 3).
When two different genes, non-homologous to each other, share a stretch of identical sequence, it begs the question: from which of the two genes did the shared sequence originate? In the human head louse and the body louse, there is insufficient evidence for us to answer this question for any of the nine shared identical sequences [18,19]. This is also the case for all of the shared identical sequences observed in the chimpanzee louse except for the 54-bp identical sequence shared between atp8 and cob ( Table 2). BLAST searches showed that this 54-bp sequence was from a domain of cob conserved among insects and thus had a cob origin; intriguingly, this 54-bp sequence used the second open reading frame in cob but used the first frame in atp8 (Fig. 2b).

Mitochondrial genome fragmentation in the bloodsucking lice of great apes
All great apes, with the exception of orangutans are parasitised by sucking lice of the genera Pediculus and Pthrius [7,8]. The human head louse, Pe. capitis, and the human body louse, Pe. humanus, share a genus exclusively with the chimpanzee louse, Pe. schaeffi, whereas the human pubic louse, Pt. pubis, shares a genus exclusively with the gorilla louse, Pt. gorilla. Two hypotheses have been raised for the distribution of the sucking lice in the genera Pediculus and Pthirus among humans, chimpanzees and gorillas. The first hypothesis (a) (b) Fig. 2 Recombination hot-spots in the mitochondrial genomes of the chimpanzee louse, Pediculus schaeffi, indicated by shared identical sequences between non-homologous genes. a: locations of shared identical sequences in genes. Shared identical sequences are highlighted in color with their length in bp. Genes are indicated with boxes from 5′ end to 3′ end. b: the 56-bp identical sequence shared between atp8 and cob used different open reading frames postulates a host-switch event of Pthirus lice from gorillas to humans, whereas the second hypothesis postulates a loss of Pediculus lice in gorillas and a loss of Pthirus lice in chimpanzees [14]. The first hypothesis is more parsimonious and assumes the coevolution of Pediculus lice with their hosts. Thus, the divergence between the human head louse and the human body louse on one hand, and the chimpanzee louse on the other hand, occurred at the time when their hosts, humans and chimpanzees, diverged~6 MYA ( [1,2], but see [3][4][5] for variation of the divergence time estimates).
Most bilateral animals have 37 mt genes on a single circular chromosome, 15-20 kb in size [16,17]. The human head louse and the human body louse, however, have their 37 mt genes on 20 types of minichromosomes [18,19]. For the human pubic louse, the 34 mt genes identified are on 14 minichromosomes [19]. We show in the present study that the chimpanzee louse has two less mt minichromosomes than the head louse and the body louse, although it is closely related to the human lice and is in the same genus Pediculus (Fig. 1).
The data available to date indicated that mt genomes already became fragmented in the most recent common ancestor (MRCA) of all blood-sucking lice (suborder Apoplura), as all of the species from both major clades of the sucking lice that have been properly examined have fragmented mt genomes [18,19,[27][28][29][30]. Minicircles with mt genes were reported in a Damalinia species (Trichodectidae, Ischnocera), indicating that fragmented mt genomes may also be present in chewing lice [31]. The extensive fragmentation of mt genomes in the human head louse and the body louse [18,19] and in the chimpanzee louse, however, does not appear to be an ancestral condition for sucking lice but a derived condition for the human lice and the chimp louse in the genus Pediculus since all other sucking lice studied to date have less or much less fragmented mt genomes than the human lice and the chimpanzee louse [18,19,[27][28][29][30].

Phh_trnL1-trnL2_32bp GCTAAATTGGCAGATAAAGTGCGTTGGATTTA
in the lineage leading to the human head louse and the body louse after this lineage split from that leading to the chimpanzee louse. Furthermore, it is likely that the cob-trnS 1 -trnN-trnE-trnM minichromosome fragmented into two minichromosomes first: one with cob and the other with trnS 1 -trnN-trnE-trnM. The latter then fragmented further into another two, generating the three minichromosomes we see in the human head louse and the human body louse today ( Figs. 1 and 4). If this was the case, it would indicate that the fragmentation of one minichromosome into two took approximately 3 MY.

Conclusions
In conclusion, we sequenced the mt genome of the chimpanzee louse: the 37 mt genes typical of bilateral animals are on 18 minichromosomes in this louse. Seventeen of the 18 minichromosomes of the chimpanzee louse have the same gene content and gene arrangement as their counterparts in the human head louse and the body louse. Five genes, cob, trnS 1 , trnN, trnE and trnM, which are on three minichromosomes in the human head louse and the body louse, are on one minichromosome in the chimpanzee louse. Using the human pubic louse as an outgoup, we infer that a single minichromosome has fragmented into three in the lineage leading to the human head louse and the body louse since this lineage diverged from that leading to the chimpanzee louse~6 MYA when their hosts diverged. Our results provided insights into the process of mt genome fragmentation in the sucking lice in a relatively fine evolutionary scale.

Availability of supporting data
The nucleotide sequences of the mt genome of the chimpanzee louse supporting the results of this article have been deposited in GenBank (accession numbers KC241882-KC241897 and KR706168-KR706169). The mt genomes of the human body louse, head louse and pubic louse have been published previously (GenBank accession numbers: EU219983-EU219995, FJ499473-FJ499490, FJ514591-FJ514599, HM241895-HM241898, and JX080388-JX080407) [18,19].  Fig. 4 Fragmentation of cob-trnS 1 -trnN-trnE-trnM minichromosome in the lineage leading to the human head louse, Pediculus capitis, and the human body louse, Pe. humanus. The split time,~6 MYA, for the chimpanzee louse from the human head louse and the body louse was after Goodman (1999) [1] and Glazko and Nei (2003) [2] for the chimpanzee-human split time. The split time,~7 MYA, for Pthirus pubis from the Pediculus species was also after Goodman (1999) [1] and Glazko and Nei (2003) [2] for the gorilla-(chimpanzee + human) split time