The Piranha Genome Provides Molecular Insight Associated to Its Unique Feeding Behavior

Abstract The piranha enjoys notoriety due to its infamous predatory behavior but much is still not understood about its evolutionary origins and the underlying molecular mechanisms for its unusual feeding biology. We sequenced and assembled the red-bellied piranha (Pygocentrus nattereri) genome to aid future phenotypic and genetic investigations. The assembled draft genome is similar to other related fishes in repeat composition and gene count. Our evaluation of genes under positive selection suggests candidates for adaptations of piranhas’ feeding behavior in neural functions, behavior, and regulation of energy metabolism. In the fasted brain, we find genes differentially expressed that are involved in lipid metabolism and appetite regulation as well as genes that may control the aggression/boldness behavior of hungry piranhas. Our first analysis of the piranha genome offers new insight and resources for the study of piranha biology and for feeding motivation and starvation in other organisms.


Introduction
Piranhas are well-known South-American fishes and paradigmatic representatives of Characids. Their widespread reputation comes from their predatory habits and remarkable feeding adaptations, which include large, sharp teeth, and the strongest bite force determined in any fish to date (Grubich et al. 2012).
Red-bellied piranha (Pygocentrus nattereri) ( fig. 1) inhabits neotropical freshwater rivers of northeastern Brazil, and the Paraguay and Parana basins. Previous research on this species has primarily focused on attributes related to diet/ feeding habits (Pauly 1994) and social and feeding behavior (Foxx 1972;Sazima and Machado 1990). Although it is mostly a carnivorous piscivore, it can be opportunistic and feed on plants, insects, worms, and crustaceans. Furthermore, it preys upon sick and injured fishes and scavenges on cadavers of fishes and other vertebrates (Pauly 1994). The popular man-eating reputation of piranhas is likely due to the necrophagous habits of these fish (Sazima and de Andrade Guimarães 1987), as only rarely have "feeding frenzy" attacks on large live prey have been observed. In the wild, red piranha exhibit social behavior and swim in schools, usually of 20-30 fish that feed together (Bellamy 1968;Sazima and Machado 1990). They engage in acoustic communication, which is sometimes exhibited along with aggressive behaviors (Millot et al. 2011). Red piranha mostly ambush fish from within the aquatic vegetation and dash after passing preys, and scan and pick the sediment while foraging on vegetation or invertebrates (Foxx 1972;Sazima and Machado 1990). They can endure long periods of prey shortage and starvation, in particular, when water levels drop and prey becomes scarce at the end of the summer.
Most fish in the wild are faced with short-to long-term fasting periods during their life, which can have profound behavioral and biochemical effects (Navarro and Guti errez 1995), and this phenomenon is more pronounced in carnivorous fish than in herbivorous or omnivorous fish (Navarro and Guti errez 1995). Carnivorous fish might be faced with the lack of adequate prey species in their environment more frequently, and need to invest more time and energy for successful capture, which presents a particular challenge after long periods of fasting.
Due to its unique biology and behaviors, the piranha represents a potentially valuable experimental model in feeding research that can provide important information on the effects of food restriction and feast/famine on feeding behavior. However, very few studies have examined how these animals can cope with starvation. Early metabolic studies show that, in red piranha, liver glycogen and lipids are mobilized rapidly during short-term starvation and low blood glucose levels induces food intake (Bellamy 1968). Studies have shown that fasting affects gene expression of a number of appetite-regulating hormones in the brain (e.g., orexin, CART) and in the intestine (e.g., PYY, leptin, ghrelin) of piranha (Volkoff 2014(Volkoff , 2015, suggesting that endocrine mechanisms might be in part responsible for the resistance of these fish to low food availability. Although these studies provide initial clues to the response of piranha to fasting, they only assess changes in expression of specific targeted genes. The availability of the P. nattereri genome allows us to investigate the genome-wide expression changes under conditions of fasting and to compare them to a satiated feeding regime, thus providing more broad context for understanding the physiological response of the fish to food restriction. We have assembled the piranha genome and used this resource to explore gene regulation in the brain following dietary restriction using high-throughput sequencing.

Genome Assembly and Annotation
The red piranha DNA used for sequencing was derived from a single female (Collection ID: Pna-1) among a maintained laboratory population. Using the predicted 1.6 Gb genome size estimate of the white piranha (Carvalho et al. 2002), total raw sequence coverage of Illumina reads was 76Â (short overlapping reads, 3 kb, 8 kb, and 40 kb matepaired libraries). The overlapping sequence reads (250 bp) were assembled using DiscoVar de novo (Weisenfeld et al. 2014) with default parameters. These reads are derived from a 400 bp library, thus making both pairs overlap in a region of $50 bp, a feature that is exploited by the assembler. This initial assembly of contigs was iteratively scaffolded with long paired reads (3 and 8 kb) using the program SSPACE (Boetzer et al. 2011) and gap filled with a version of Image (Tsai et al. 2010) modified for large genomes. All contigs and scaffolds were cleaned of contaminating sequences by performing a MegaBLAST (Zhang et al. 2000) of the contigs against adapter, bacterial and other vertebrate databases. The final assembly P. nattereri 1.0.2 was masked and then annotated for gene content using the NCBI pipeline described at http://www.ncbi.nlm.nih.gov/books/NBK169439/ .
Repeats and transposable elements (TEs) were identified using RepeatModeler (version 1.0.11, http://www.repeatmasker.org/RepeatModeler/, default parameters). To screen the piranha genome for TEs, the resulting TE library was used as input for RepeatMasker (version 4.0, http://repeatmasker.org/). Characteristics of the repeat landscape (including element age and diversity) were plotted using the RepeatMasker perl script "createRepeatLandscape."

Positive Selection
To estimate genes under positive selection in piranha the protein and cDNA fasta files for several well-annotated species of fish representing the whole fish tree of life were downloaded from NCBI (supplementary table 1, Supplementary Material online). Orthologous proteins of all fish were identified using inparanoid (O'Brien et al. 2005)with default settings. For each gene with a protein ortholog across all species (comparison 1: n ¼ 11,501, comparison 2: n ¼ 3,949), the corresponding protein and cDNA sequences were aligned and converted into a codon alignment using pal2nal (version v14) (Suyama et al. 2006). Resulting sequences were aligned by MUSCLE (Edgar 2004) (option: -fastaout) and nonconserved blocks were removed using Gblocks (version 0.91 b) (Castresana 2000) (options: -b4 10 -b5 n -b3 5 -t ¼ c). The Gblocks output was converted to paml format using an in-house script. Trees were built using Phylip (version 3.696, http://evolution. genetics.washington.edu/phylip.html) with Danio rerio (comparison 1) or Latimeria chalumnae (comparison 2) as outgroup (supplementary table 1, Supplementary Material online). For the generation of the phylogenetic tree, sequences from all orthologous genes of all fish (comparison 1: n ¼ 11,501, comparison 2: n ¼ 3,949) resulting from Gblocks were concatenated and aligned using MUSCLE. Distances were calculated using Phylip/dnadist (version 3.696, http://evolution. genetics.washington.edu/phylip.html) with D. rerio (comparison 1) or L. chalumnae (comparison 2) as outgroup (supplementary table 1, Supplementary Material online) using default settings. The tree was built by Phylip/neighbor using default settings. The phylogenetic tree was drawn from comparison 2 by the FigTree tool (http://tree.bio.ed.ac.uk/software/figtree/) with L. chalumnae as outgroup. For the phylogenetic analyses by maximum likelihood the "Environment for Tree Exploration" (ETE3) toolkit (Huerta-Cepas et al. 2016), which automates CodeML and Slr analyses by using preconfigured evolutionary models, was used. For the detection of genes under positive selection in piranha, we compared the branch-specific model bsA1 (neutral) with the model bsA (positive selection) using a likelihood ratio test (FDR 0.05). FDR was calculated using the "p.adjust" from the R package "stats." To detect sites under positive selection, Naive Empirical Bayes (NEB) probabilities for all 4 classes were calculated for each site. Sites with a probability > 0.95 for either site class 2a (positive selection in marked branch and conserved in rest) or site class 2 b (positive selection in marked branch and relaxed in rest) were considered.

Experimental Animals
Red-bellied piranhas (average weight 8.1 6 0.1 g, average total length 6.8 6 0.3 cm) were purchased from ABCee's Aquatic Imports (Lasalle, QC, Canada) and kept in 65 L (61.0Â33.0Â33.0 cm 3 ) glass aquaria under a simulated photoperiod of 16 h light:8 h dark, with constantly aerated and filtered water at 28 C (two to five fish per tank). For starvation experiments, fish were placed in groups of 2-4 of sizematched fish in 5 (2 fed, 3 fasted) tanks to avoid cannibalism/ predation on smaller fish. Fish were fed to satiety once a day (10:00 am), with flakes (41% protein, 12% fat, 2% fiber, 8.5% moisture, 8% ash, Omega Sea, Sitka, AK). Fish were acclimated under these standard conditions for 2 weeks before the start of the experiment. All fish were immature males or females ($8 cm TL, piranhas mature at $16 cm TL, see Pauly 1994). For the fasting experiment, following the acclimation period, controls continued to be fed once a day and experimental fish were not fed for 10 days. On the sampling day, fed fish were fed at their regular feeding time and fed and fasted fish were sampled 30 minutes after feeding time.
Following experiments, fish were killed by immersion in 0.05% tricaine methanesulfonate (MS 222) (Syndel Laboratories, Vancouver, BC, Canada) followed by spinal section, and whole brain and intestines sampled. All fed fish had food in their gastrointestinal tracks (GIT), whereas fasted fish had empty GITs. Tissues were dissected and placed on ice in RNAlater (Qiagen, Mississauga, ON, Canada) and stored at À20 C until RNA extractions were performed. All experiments were carried out in accordance with the principles published in the Canadian Council on Animal Care's guide to the care and use of experimental animals (Memorial University protocol number 16-08-HV).

Transcriptome Analysis
Total RNA was isolated with the RNAeasy kit (Qiagen) according to the company's protocol. Total RNA was extracted from eyes, gills, spleen, kidney, ovary, liver, skin, heart, muscle, and brain to aid in gene annotation. For feeding trials, total RNA was isolated as before from the whole brains of three starved and four fed fish. We checked total RNA for quality on the Agilent Fragment Analyzer, then enriched for poly(A)þ RNA using the MicroPolyA Purist kit (Ambion, Carlsbad, CA). We used ScripSeq (Epicentre, Madison, WI) to generate strandspecific cDNA that was sequenced on the Illumina Hiseq2000 platform as 100 base paired-end reads (insert size of 400 bp). For transcriptome statistics, see supplementary table 2, Supplementary Material online. Sequences have been deposited at NCBI under BioProject ID PRJNA533530.

Differential Gene Expression
All cDNA sequences from fed and fasted piranha were aligned to the P. nattereri 1.0.2 reference genome and read counts were calculated using the STAR aligner (Dobin et al. 2013) with default settings. Differentially expressed genes were detected using the Bioconductor package DESeq2 (Love et al. 2014). A gene was considered to be differentially expressed, if FDR 0.05 and baseMean !10. For the comparison of the intestine samples, a fold change > 4 in intestine was used since this is only preliminary data with one sample per treatment and no P value can be calculated. To discover enriched functional categories, human orthologs were determined by Ensembl biomart (http://www.ensembl.org/biomart/martview/0e0d4ed19703bfe45f84a3b9b452157a) and functionally clustered using the Database for Annotation, Visualization and Integrated Discovery (DAVID, https://david. ncifcrf.gov/home.jsp).

Assembly and Gene Annotation
We generated a draft assembly for a single female piranha ( fig. 1) to use in studies of trait evolution unique to piranha. With a total sequenced genome input coverage of 76Â (39Â fragments, 26Â 3 kb, and 11Â 8 kb), we assembled a total of 1.25 Gb. The assembly is comprised of 283,518 scaffolds (including single contig scaffolds) with an N50 contig and scaffold length of 57 kb and 1.4 Mb, respectively. Despite a substantial scaffold number, remaining gap sequence size is estimated to be 33 Mb and overall assembly contiguity metrics are similar to fish genomes assembled with short-read approaches (supplementary table 3, Supplementary Material online). This suggests overall many assembly gaps are of small size and with future long-read approaches could be readily closed.
When masked with WindowMasker (Morgulis et al. 2006), we estimate 33.8% of the piranha genome contains interspersed highly repetitive elements, suggesting a shared overall genome structure when compared with the observed range of total repeats among other closely related fishes (supplementary table 4, Supplementary Material online). Using the NCBI annotation pipeline and RNA-seq transcript evidence from diverse tissue types, a total of 25,861 proteincoding genes, 7,838 noncoding, and 482 pseudogenes were predicted (supplementary table 4, Supplementary Material online). The Actinopterygii known proteins data set shows 85% average coverage (supplementary table 5, Supplementary Material online). The Actinopterygii set of proteins was used to make comparisons across species. Evaluation of the genome for completeness based on BUSCO (Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs) identified 95.3% complete genes from the 4,584 Actinopterygii data set (supplementary table 6, Supplementary Material online). In summary, our various measures of gene representation suggest we have assembled most of the protein-coding genes in the red piranha genome. A full report of the gene annotation can be found at https://www.ncbi.nlm.nih.gov/genome/annotation_euk/ Pygocentrus_nattereri/100/ The RepeatModeller output for the piranha genome masked $44% (supplementary table 7, Supplementary Material online), which is in the range of most sequenced bony fish genomes (Chalopin et al. 2015). TEs constitute the majority of the known repeat content (40%), which is in the same range as the blind cave fish, another representative of the Characidae with a similar landscape of TEs (supplementary table 8, Supplementary Material online and fig. 2). The majority of the classified TE's are LINEs (4.3%) and DNA elements (6.7%), 28.3% of the genome being unclassified TEs.
The transposon history analysis of classified TE's revealed mainly a recent burst ( fig. 2), mostly of LINE elements, while another more ancient expansion of the mobilome seen in many other teleost genomes (e.g., platyfish, tilapia; Brawand 2014; Chalopin et al. 2015) is not evident in the piranha genome. In comparison to the cavefish, which also lacks the ancient expansion signature, the more recent wave of TE propagation is even younger.

Gene and Genome Evolution
The availability of a whole-genome sequence offers the opportunity to search for genetic factors that are involved in the evolution of specific traits and adaptation of piranhas. We analyzed the predicted gene set for evidence of positive selection in piranha. Only sequences that aligned unambiguously in all of the selected species were considered for selection analysis. We produced two comparisons, one that includes only the closest related species from which high quality genomes are available, the cavefish Astyanax mexicanus and the channel catfish, Ictalurus punctatus, with the zebrafish as outgroup (comparison 1). Comparison 2 was made between nine actinopterygian fish species with the coelacanth genome as outgroup ( fig. 3).
This analysis identified 44 genes in comparison 1 and 164 genes in comparison 2 (supplementary table 9, Supplementary Material online) with evidence of piranhaspecific positive selection. A GO-term analysis (supplementary fig. 1, Supplementary Material online) revealed that positively selected genes were enriched for GO terms from metabolism and mitochondrial processes. In addition, epidermal growth factor signaling related GO terms show enrichment. The list of genes under positive selection in the piranha contains many proteins involved in structure and function of the nervous system, for example, glutamate and purinergic receptors, neurexophilin-related protein, slit guidance ligand 1, NDGR4, trace amine-associated receptor 1, cerebellar degeneration-related protein 2, and glial fibrillary acidic protein. Transient receptor potential cation channel M5 plays a role in the perception of taste (Yoshida et al. 2007) and glutamate exchanger xCT is known to mediate resilience to stress (Nasca et al. 2017). This all may be related to the well-developed sensorimotor system of piranha as a fast swimmer and prey detector.
Given the ability of piranha to cope with long periods of starvation several other genes under positive selection are noteworthy. As important regulator of fasting metabolism, adenylate kinase 4 (ak4) plays a key role in cellular energy homeostasis and neuropeptide FF receptor is related to growth and body weight regulation (Peng et al. 2016).

Comparative Expression Profile Analysis of Starved and Fed Piranhas
Piranhas are adapted to long periods of prey shortage, in particular, when water levels get low and prey becomes scarce at the end of the summer. Changes in metabolism occur during food deprivation. During fasting, the mobilization of energy reserves occurs in several tissues, and it tends to be sequential, with carbohydrates utilized first, followed by fat and then protein (Woo and Fung 1981;Barcellos et al. 2010;Bar and Volkoff 2012). In some species, lipid storage primarily occurs in the liver, whereas in other species, they may predominantly be stored in muscles and in mesentery, as seen in most piranhas (Hiane et al. 2002;Ferreira 2010). A mutation in the insulin receptor, which leads to dysregulation of blood glucose levels, has allowed the blind cavefish, A. mexicanus, also like the piranhas a member of the Characiformes, to adapt to prolonged periods of starvation (Riddle et al. 2018).
With the availability of a high-quality reference genome and the prediction of the P. nattereri protein coding genes, it is now possible to perform genome wide transcriptomic studies that were lacking so far. We could for the first time estimate genome-wide expression changes in the whole brain under extreme conditions of fasting and compare to conditions of sufficient feeding regime. To control for the efficiency of our experimental conditions, RNA-seq on one intestine from each group was conducted. Changes in digestive enzymes and "classical" gut hormones, and gene regulations that are indicative of structural reorganization of the gut under fasting conditions (see supplementary table 10, Supplementary Material online) confirmed that the fish under starvation conditions were indeed under nourishment stress.
In brain of food-deprived fish, the metabolic consequences of starvation have been studied only in few species so far, mostly by measuring metabolites and glucose sensors in salmonids (Soengas and Aldegunde 2002). Two studies using targeted qPCRs (Rimoldi et al. 2016) or zebrafish microarrays (Drew et al. 2008) have also been conducted. However, comparable information from piranha is missing so far (Polakof et al. 2012). The availability of a high quality piranha reference genome allowed us to perform the first RNA-seq analysis of brains from starved fish. In the piranha whole brain, starvation had significant effects on the regulated transcriptome. We found 557 transcripts to be downregulated and 394 transcripts to be upregulated in starved fish (logFC ranging from À4.8 to 6.3; supplementary table 10 and figs. 2 and 3, Supplementary Material online). It is likely that even more striking effects may be noticeable after longer periods of fasting. However, we avoided increasing the fasting periods as this would be expected to increase in-tank cannibalism. The differentially expressed genes were enriched for several functional categories, for example, lipid metabolism, metabolic signaling, reorganization of extracellular matrix and proliferation and growth (supplementary fig. 4, Supplementary Material online). By parsing the differentially expressed genes with known role in metabolism, we find genic starvation responses in lipid transport (e.g., nrf6), including many genes of the fatty acid (FA) metabolism, genes (e.g., sterol regulatory element binding transcription factor 1, fatty acid synthase, desaturase 2, elongase 5). Up to 20% of the total brain's energy is provided by mitochondrial oxidation of FAs in mammals (Panov et al. 2014) supporting our observation that in the fasted piranha brain a shift toward transcriptome programs involved in brain energy regulation is observed. FA availability in the brain appears to be very important to the regulation of energy balance, as infusions of FAs into the brain inhibit food intake, although paradoxically, plasma FAs are highest during fasting (Wang and Eckel 2012). Fastinginduced adipose factor/angiopoietin-related protein 4 (angptl4) was strongly upregulated in the brain of starved piranhas. Overexpression of this gene has been shown in mice to be a mediator or starvation response acting mainly on lipid metabolism (Mandard et al. 2006).
Glucose and its derivative lactate, as well as ketone bodies (following a long fast) are the main energy substrate for the brain of mammals (Panov et al. 2014) and fish (Soengas and Aldegunde 2002). Genes involved in the metabolism of carbohydrates and lactate (e.g., pyruvate dehyogenase kinase, facilitated glucose transporter 1) and genes involved in ketone body formation (e.g., acetyl CoA-acyltransferase 2) were significantly affected by fasting. Brain oxidation of ketone bodies is known to increase during food deprivation in some fish (Soengas and Aldegunde 2002).
Changes in metabolic parameters are often accompanied by changes in feeding-regulating hormones produced by both brain and peripheral tissues. These hormones affect feeding centers in the brain to either stimulate (orexigenic) or inhibit (anorexigenic) feeding (Volkoff 2011(Volkoff , 2016Ronnestad et al. 2017) Short-term fasting usually increases the expression of orexigenic factors and decreases that of anorexigenic factors. This has been shown in piranha (Volkoff 2014(Volkoff , 2015 and other characiform fish such as the large fruit-eating pacu, a close relative to the piranhas (Volkoff et al. 2017), dourado (Volkoff 2016), and cavefish (Wall and Volkoff 2013). Our data uncovered only small and nonsignificant changes in the expression of endocrine factors regulating appetite, such as CART and leptin B.
As a consequence of starvation several genes of neurotransmitter metabolisms (e.g., tryptophane 5-hydroxylase, monoxygenase DBH-like, GABA transporter 2, serotonin carrier slc7a2 aka cat2) and neurogenesis (e.g., neural cell adhesion 1, neurabin 2, synaptotagmin 3, glial fibrillary acidic protein) were regulated. An impact was also noted on socalled "immediate-early" genes that are known to play a role in neural plasticity (Mukherjee et al. 2018) including early growth response protein2 and members of the fos transcription factor family, which all were downregulated. In addition, delta opioid receptor Oprd1a, a component of the reward system (Pradhan et al. 2011) was downregulated, probably because satiety as initiator of a reward response is not reached any more under starvation conditions.
In search for prey, hungry piranhas need to exhibit aggression-boldness behavior. A susceptibility gene for anxiety, egr2 (Mcgregor 2013), is downregulated in the brains of starved piranhas, which may contribute to increased boldness to aid the acquisition of food. We also noted significant upregulation under food deprivation of components of FGF signaling (fgf 9, fgf 8, fgf 1) a major pathway known to regulate aggression-boldness in zebrafish (Norton et al. 2011).

Conclusion
Piranhas are known for a unique combination of morphological, physiological and behavioral traits, which makes them top freshwater predators in their habitats. The analyses of our whole-genome assembly and annotation revealed signatures of positive selection in several genes related to neurotransmission, behavior, and metabolism regulation that are candidates for being involved in adaptations to the specific feeding lifestyle of these fishes. Most data on fasting-induced metabolic changes in fish pertain to liver and muscle tissue (e.g., carp; Wang et al. 2006;Gong et al. 2017;rainbow  trout; Salem et al. 2007;and zebrafish;Drew et al. 2008). The brain is also a major organ involved in the regulation of feeding (Ronnestad et al. 2017). This transcriptomic study is the first of its kind in piranha and provides new information on changes in the genome under caloric restriction. In particular, it provides evidence for the upregulation of genes involved in metabolism, suggesting an increased utilization of storage fuels and a brain energy sparring outcome in the piranha is consistent with that seen in other fish species. In summary, we find gene candidates for the complex process of feeding and starvation regulation, and to the more aggressive behavior of starved piranha. Future studies will also reveal how the teleost specific genome duplication and the presence of two genes (as opposed to a single gene in tetrapods) involved in processes of metabolism regulation and behavior might have impacts on the regulation of starvation and metabolism in general. The availability of a reference genome for the whole species group of piranhas will provide the basis for a better understanding of the biology and evolution of these signature fish of the large South American river basins.

Supplementary Material
Supplementary data are available at Genome Biology and Evolution online.