Transcriptomic analysis identifies genes and pathways related to myrmecophagy in the Malayan pangolin (Manis javanica)

The Malayan pangolin (Manis javanica) is an unusual, scale-covered, toothless mammal that specializes in myrmecophagy. Due to their threatened status and continuing decline in the wild, concerted efforts have been made to conserve and rescue this species in captivity in China. Maintaining this species in captivity is a significant challenge, partly because little is known of the molecular mechanisms of its digestive system. Here, the first large-scale sequencing analyses of the salivary gland, liver and small intestine transcriptomes of an adult M. javanica genome were performed, and the results were compared with published liver transcriptome profiles for a pregnant M. javanica female. A total of 24,452 transcripts were obtained, among which 22,538 were annotated on the basis of seven databases. In addition, 3,373 new genes were predicted, of which 1,459 were annotated. Several pathways were found to be involved in myrmecophagy, including olfactory transduction, amino sugar and nucleotide sugar metabolism, lipid metabolism, and terpenoid and polyketide metabolism pathways. Many of the annotated transcripts were involved in digestive functions: 997 transcripts were related to sensory perception, 129 were related to digestive enzyme gene families, and 199 were related to molecular transporters. One transcript for an acidic mammalian chitinase was found in the annotated data, and this might be closely related to the unique digestive function of pangolins. These pathways and transcripts are involved in specialization processes related to myrmecophagy (a form of insectivory) and carbohydrate, protein and lipid digestive pathways, probably reflecting adaptations to myrmecophagy. Our study is the first to investigate the molecular mechanisms underlying myrmecophagy in M. javanica, and we hope that our results may play a role in the conservation of this species.

114 terms of dietary adaptation, we wanted to investigate whether specific genes exist for digestive 115 function. We selected the liver, small intestine, and salivary glands for transcriptome sequencing 116 to analyze the genetic selection of potential candidate genes involved in myrmecophagy. Salivary 117 glands are important amylase secretory organs although liver secretions also contain amylase and 118 lipase, and the small intestine is involved in the absorption of nutrients. We aimed to analyze the 119 unique feeding behavior of pangolins in the hope that our results may provide a new approach to 120 the protection of this species. 127 Biological sample 128 Briefly, an adult female M. javanica was provided by the Dongguan Institute of 129 Qingfengyuan Animal Medicine (Dongguan, Guangdong, China). The specimen was a wild 130 individual that died in the process of rescue; the animal was 4.08 kg in weight before its natural 131 death. The animal was dissected immediately after its natural death. During the dissection, ants 132 were found in its stomach. The salivary glands, liver, and small intestine were collected as soon as 133 possible, frozen in liquid nitrogen, and stored at -80°C until RNA extraction.
134 RNA isolation, cDNA library construction and Illumina sequencing 135 Total RNA was extracted from the three tissues using RNAiso reagent (Takara, Otsu, Japan) 136 and was treated with DNase I (Takara). RNA purity was checked using a NanoPhotometer 137 spectrophotometer (IMPLEN, CA, USA). RNA concentration was measured using the Qubit RNA 138 Assay Kit and a Qubit 2.0 Flurometer (Life Technologies, CA, USA), and RNA integrity was 139 assessed using the RNA Nano 6000 Assay Kit and the Agilent Bioanalyzer 2100 system (Agilent 140 Technologies, CA, USA). RNA was frozen at -80°C until cDNA library construction.

141
RNA samples were then mixed with fragmentation buffer (Ambion) and fragmented into 142 short fragments; the average insert size was 200 bp. mRNA was purified from total RNA using 143 poly-T oligo-attached magnetic beads. DNA contaminants were further removed through DNase 144 enzyme digestion followed by rRNA removal. cDNA synthesis was then followed by PCR 145 amplification to generate a complete cDNA library, which was sent for sequencing in a flowcell 148 Data assembly and annotation 149 Four groups of sequencing data were used for assembly and annotation and the data were 150 annotated using a published paper housed at the National Center for Biotechnology Information 151 (NCBI) Sequence Read Archive (SRA) accession No.SRR2561213 (Yusoff et al. 2016). The 152 primary sequencing data were cleaned by a) removing reads with adaptors; b) removing reads 153 where the proportion of unknown bases (N) exceeded 5% (-n = 0.05); and c) removing low-quality 154 reads (reads in which the proportion of bases with quality < 10 exceeded 20%, i.e., -l = 10 and -q 155 > 0.2). Clean reads were aligned to reference sequences using a spliced read mapper for RNA-Seq-156 TopHat2 (http://ccb.jhu.edu/software/tophat/index.shtml), which is based on Bowtie2 157 (http://bowtie.cbcb.umd.edu/). The parameter values of RNA-Seq-TopHat2 were as follows; --158 read-mismatches: 2, --read-edit-dist: 2, --max-intron-length: 5000000, --library-type: fr-159 unstranded, --GTF: genome.gtf, --mate-inner-dist: 40, --solexa1.3-quals, and other parameters 160 were set to the default values. There were three steps: a) align the reads to the reference 161 transcriptomes; b) align the reads to the reference genome without the alignment described in a); 162 and c) align the reads segmentally to the reference genome without the alignment described in b). 163 Following this, the alignment data were used to calculate the distribution and coverage of the reads 164 on the reference genes.  Distributions of potential transcripts related to feeding among the three tissue libraries are 332 shown in Table 3, Table S3 and File S7, including the 997 transcripts related to sensory perception.  Figure S8).

370
Several transcripts were specifically expressed in a single sample, including 21 transcripts in 371 the small intestine that were involved in 19 pathways, five transcripts in the liver that were 372 involved in 11 pathways, 36 transcripts in the referred liver that were involved in 27 pathways, 373 and six transcripts in the salivary glands that were involved in 10 pathways (Table S4). The highest 374 number of specific transcripts was eight, and these transcripts were involved in the arachidonic 375 acid metabolism in the liver reported in the previous study (Yusoff et al. 2016); six transcripts 376 were involved in glycerophospholipid metabolism, in either lipid metabolism or arachidonic acid 377 metabolism of the small intestine (Fig. 5).  The numbers of transcripts related to metabolic pathways that are specifically expressed in the salivary glands, liver, and small intestine of adult female Malayan pangolin (Manis javanica).
The x-axis shows the KEGG function classes, and the y-axis shows the number of transcripts with the corresponding KEGG function class.