Partial mtDNA sequencing data of vulnerable Cephalopachus bancanus from the Malaysian Borneo

Tarsiers are endangered nocturnal primates in the family Tarsiidae and are endemic to the Sundaic islands of the Philippines (Carlito syrichta), Sulawesi (Tarsius tarsier complex) and Borneo (Cephalopachus bancanus). Recent records indicate that most molecular studies have been done on the Eastern Tarsier, with little information available for the other tarsier groups. Here, we present a partial cytochrome b data set of C. bancanus in Sarawak, Malaysian Borneo. Standard mist nets were deployed at strategic locations in various habitat types. A total of 18 individuals were caught, measured and weighed. Tissue samples of approximately 2 × 2 mm were taken and preserved in molecular-grade alcohol. Of the 18 individuals, only 11 samples were screened for partial mtDNA (cytochrome b), and the DNA sequences were registered in GenBank (accession numbers: KY794797-KY794807). Phylogenetic trees were constructed with 20 additional mtDNA sequences downloaded from GenBank. The data are valuable for management authorities in determining the type of management units for the metapopulation, so as to sustain the population genetic integrity of tarsiers in the range countries across the Sunda Shelf.


© 2019 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Data
Tarsiers are a vulnerable primate group [1] in the family Tarsiidae found on Southeast Asian islands: the Sundaic islands of the Philippines (Carlito syrichta), Sulawesi and surrounding islands (Tarsius tarsier complex) and Borneo (Cephalopachus bancanus) [2]. The Western Tarsier, Cephalopachus bancanus bancanus, occurs in Malaysian Borneo and is listed as a protected species under Malaysia's Wildlife Conservation Act (WCA) 2010 and as a totally protected species under Sarawak's Wildlife Protection Ordinance (WLPO) 1998. Molecular research interest in this endemic species stems from the recent availability of information on the taxonomy and evolutionary relationships of tarsiers since the expansion of fauna and prehistoric humans into Southeast Asia [2,3].
This dataset contains phylogenetic information on C. bancanus from Malaysian Borneo. Table 1 lists the field sampling conducted in Sarawak, Borneo. The field number, standard morphological measurements, weight and sex of each individual are recorded in Table 2. The partial cytochrome b primers, the DNA master mixture profile and the PCR profile are given in Table 3 and Supplementary Tables 1 and 2, respectively [4]. An additional 20 mtDNA sequences derived from GenBank [5–15] were used and are listed in Table 4. The sequence variations, haplotype frequency distribution and pairwise distances of the tarsiers are given in Tables 5 and 6.

Value of the Data
The data are valuable for management authorities in determining the type of management units for the metapopulations, in order to maintain the integrity of population genetics in their ranges across the Sunda Shelf.
The data can be used as baseline information for future studies on genetics and molecular ecology, and as a flagship model to test the "Out of Sunda" theory and to elucidate the history of prehistoric human and primate migration waves in Southeast Asia.
The data allow other researchers focusing on this population to begin genome-wide analyses.

Table 3. Primer used for PCR amplification [4].

Sample Collection
Field sampling was conducted in the southern part of Sarawak at Kubah National Park, Matang Wildlife Centre, Universiti Malaysia Sarawak (UNIMAS), Maludam National Park, Cermat Ceria Forest, Kampung Barieng and Durafarm Plantation (Table 1). Sampling was assisted by field assistants from the Institute of Biodiversity and Environmental Conservation (IBEC), UNIMAS. A total of ten mist nets were deployed at strategic locations with high vegetation, trees of small trunk diameter, and proximity to streams or other water bodies [16–20]. A total of 18 individuals were captured, identified, sexed, measured and weighed (Table 2) [18–21]. Each was tranquilised using Zoletil 100 mg solution. Tissue samples of approximately 2 × 2 mm were taken and preserved in molecular-grade alcohol.

Fig. 1. The evolutionary history was inferred using the Neighbor-Joining method. The optimal tree with the sum of branch length = 1.16079630 is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 replicates; more than 50%) is shown above the branch. The institutional codes are listed in Tables 2 and 4.

DNA extraction, amplification, purification and sequencing
The DNA samples were extracted using the cetyltrimethylammonium bromide (CTAB) protocol [22] and amplified by polymerase chain reaction (PCR) using a set of partial cytochrome b primers [4]. The amplified products were purified using the Promega Wizard SV Gel and PCR Clean-Up System (Promega Co.) and subjected to cycle sequencing at First Base Laboratories, Malaysia. The C. bancanus sequences were registered in GenBank (accession numbers: KY794797-KY794807) (Table 2).
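For readers who want to retrieve the deposited sequences, the accession range above can be expanded into individual accession numbers. The following is a minimal sketch; the helper name `expand_accession_range` is hypothetical and not part of any GenBank tool, and it assumes both ends of the range share the same letter prefix and digit width.

```python
def expand_accession_range(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range such as KY794797-KY794807.

    Hypothetical helper for illustration only; assumes a letter prefix
    followed by digits, with the same prefix at both ends of the range.
    """
    # Split each accession into its letter prefix and its numeric suffix.
    prefix_first = first.rstrip("0123456789")
    prefix_last = last.rstrip("0123456789")
    if prefix_first != prefix_last:
        raise ValueError("accessions must share the same letter prefix")

    digits_first = first[len(prefix_first):]
    digits_last = last[len(prefix_last):]
    if not digits_first or not digits_last:
        raise ValueError("accessions must end in digits")

    start, stop = int(digits_first), int(digits_last)
    if stop < start:
        raise ValueError("range end precedes range start")

    # Preserve the zero-padded width of the numeric part.
    width = len(digits_first)
    return [f"{prefix_first}{n:0{width}d}" for n in range(start, stop + 1)]


accessions = expand_accession_range("KY794797", "KY794807")
print(len(accessions), accessions[0], accessions[-1])
```

The expanded list contains one accession per sequenced individual and can be passed to whichever sequence retrieval tool the reader prefers.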

Sequence analysis
The nucleotide sequences were visualized and read using Sequencher 5.4 (https://genecodes.com). The sequences were matched and aligned with 20 additional mtDNA sequences (Table 4). Nucleotide composition and haplotype frequency analyses were performed in Molecular Evolutionary Genetics Analysis (MEGA) 7 [23] and DnaSP [24]. The evolutionary divergence between sequences (Supplementary Table 3) was estimated in MEGA 7 using the p-distance model, with all positions containing gaps and missing data eliminated. The Kimura 2-parameter method was used to compute the Neighbor-Joining tree (Fig. 1). The Maximum Parsimony tree is shown in Fig. 2; it was obtained using the Subtree-Pruning-Regrafting (SPR) algorithm, in which the initial trees were obtained by the random addition of sequences. The Maximum Likelihood analysis was performed using the Hasegawa-Kishino-Yano (HKY + G + I) model (Fig. 3). The best model was chosen based on the Akaike Information Criterion (AIC; 4776.487) value and the lowest Bayesian Information Criterion (BIC; 5254.204) score.

Fig. 2. The evolutionary history was inferred using the Maximum Parsimony method. The tree was obtained using the Subtree-Pruning-Regrafting (SPR) algorithm, in which the initial trees were obtained by the random addition of sequences. The consistency index is 0.819864 and the composite index is 0.478254 for all sites and parsimony-informative sites. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 replicates; more than 50%) is shown above the branch. The institutional codes are listed in Tables 2 and 4.

Fig. 3. The evolutionary history was inferred using the Maximum Likelihood method based on the Hasegawa-Kishino-Yano (HKY + G + I) model, and the tree with the highest log likelihood (−2336.6352) is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 replicates; more than 50%) is shown above the branch. The institutional codes are listed in Tables 2 and 4.

Acknowledgments
We thank geneticist Dr Faisal Ali Anwarali Khan, science officer Mr Wahap Marni, and the postgraduate and undergraduate students of UNIMAS, batch 2013/2014, of the Department of Zoology, Faculty of Resource Science and Technology (FRST), for their assistance during field sampling. This study was supported by the Sarawak Forest Department (SFD) research permit (NCCD/907/4/4/Jld9-39) and park permit (81/2013). This study was made possible under grants RAG/S(4)/916/2012 (17) awarded to Mr Mohd Ridwan Abd Rahman and colleagues. Funding for this publication from Universiti Malaysia Terengganu through the Research and Innovation Management Centre (RIMC) publication grant is gratefully acknowledged.
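As a supplementary illustration of the distance measures named under Sequence analysis, the sketch below computes the p-distance and the Kimura 2-parameter distance for a pair of aligned sequences after removing every column that contains a gap or missing data. It is not MEGA's implementation, only a minimal rendering of the standard formulas; the function names and the three short sequences are invented placeholders, not data from this study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID_BASES = PURINES | PYRIMIDINES


def complete_deletion(seqs):
    """Drop every alignment column where any sequence has a gap or ambiguous base."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    keep = [i for i in range(length)
            if all(s[i].upper() in VALID_BASES for s in seqs)]
    return ["".join(s[i].upper() for i in keep) for s in seqs]


def count_differences(a, b):
    """Return (transitions, transversions, sites) for two cleaned, aligned sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = transversions = 0
    for x, y in zip(a, b):
        if x == y:
            continue
        pair = {x, y}
        # A <-> G or C <-> T is a transition; anything else is a transversion.
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions, len(a)


def p_distance(a, b):
    """Proportion of sites at which the two sequences differ."""
    ts, tv, n = count_differences(a, b)
    return (ts + tv) / n


def k2p_distance(a, b):
    """Kimura 2-parameter distance: -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)."""
    ts, tv, n = count_differences(a, b)
    p, q = ts / n, tv / n
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Invented toy alignment (placeholder data, NOT sequences from this study).
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACG-ACGTAA"]
cleaned = complete_deletion(toy_alignment)
print(p_distance(cleaned[0], cleaned[1]), k2p_distance(cleaned[0], cleaned[1]))
```

In the toy alignment the gapped column is removed from all three sequences before any pair is compared, which mirrors the elimination of gap and missing-data positions described above. The Kimura 2-parameter value is slightly larger than the p-distance because it corrects for multiple substitutions at the same site.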