Global Transcriptome Profiling Reveals Molecular Mechanisms of Metal Tolerance in a Chronically Exposed Wild Population of Brown Trout

Worldwide, a number of viable populations of fish are found in environments heavily contaminated with metals, including brown trout (Salmo trutta) inhabiting the River Hayle in the South-West of England. This population is chronically exposed to a water-borne mixture of metals, including copper and zinc, at concentrations lethal to naïve fish. We aimed to investigate the molecular mechanisms employed by the River Hayle brown trout to tolerate high metal concentrations. To achieve this, we combined tissue metal analysis with whole-transcriptome profiling using RNA-seq on an Illumina platform. Metal concentrations in the Hayle trout, compared to fish from a relatively unimpacted river, were significantly increased in the gills, liver and kidney (63-, 34- and 19-fold respectively), but not the gut. This confirms that these fish can tolerate considerable metal accumulation, highlighting the importance of these tissues in metal uptake (gill), storage and detoxification (liver, kidney). We sequenced, assembled and annotated the brown trout transcriptome using a de novo approach. Subsequent gene expression analysis identified 998 differentially expressed transcripts and functional analysis revealed that metal- and ion-homeostasis pathways are likely to be the most important mechanisms contributing to the metal tolerance exhibited by this population.


Supplemental Experimental Section
Sample collection
Eggs and sperm were stripped from five female and two male brown trout obtained from a trout farm and mixed to facilitate fertilisation. Fertilised eggs were incubated at 8±1 °C on gravel beds in flow-through de-chlorinated tap water. Embryos were collected at 10 developmental stages identified according to [16], as follows: unfertilised eggs (0 days post fertilisation (dpf)), blastula (2 dpf), gastrula (6 dpf), early somitogenesis (10 dpf), late somitogenesis (14 dpf), early organogenesis (21 dpf), mid organogenesis (31 dpf), late organogenesis (41 dpf), hatched alevins (51 dpf) and swim-up fry just prior to commencement of feeding (70 dpf). All embryos were snap frozen in liquid nitrogen then stored at -80 °C prior to RNA extraction.
For collection of adult tissues, five brown trout from the River Hayle at Relubbus in Cornwall (N 50° 8.476774', W 5° 24.661446') and 10 brown trout from the control site, the relatively unimpacted River Teign at Gidleigh Park in Devon (N 50° 40.568816', W 3° 52.407188'), were caught by electric fishing on 19th September 2010 and 11th October 2010 respectively. The fish were humanely killed with a lethal dose of benzocaine (0.5 g L⁻¹; Sigma-Aldrich) and individual tissues (gill, liver, heart, spleen, stomach, intestine, gonad, head kidney, trunk kidney, eye, brain, pituitary, muscle, skin and caudal fin) were dissected and transported on dry ice to the University of Exeter, where they were stored at -80 °C prior to RNA extraction or analysis of metal content.

RNA extraction, cDNA library preparation and sequencing
Total RNA was extracted from all individual wild fish tissues and from individual embryos using TRI reagent (Sigma-Aldrich) according to the manufacturer's instructions. During the embryo extractions, the isopropanol precipitation step was modified by addition of a high salt solution (0.8 M sodium citrate, 1.2 M NaCl) to remove proteoglycan and polysaccharide contamination [1]. The concentration and purity of the resulting RNA were assessed using absorbance measurements at 260 nm and by monitoring the 230/260 and 260/280 nm absorbance ratios, using a NanoDrop ND-1000 Spectrophotometer (NanoDrop Technologies, Wilmington, USA). The integrity of the RNA was further assessed by gel electrophoresis (1% agarose). Equal amounts of total RNA from five embryos were pooled for each developmental stage, before these were combined into a single embryonic sample for sequencing. For the adult fish, equal amounts of total RNA from individual fish tissues were pooled into 12 samples for sequencing, forming the following pools: gill, trunk kidney, liver and gut (consisting of stomach and intestine) from both Hayle and Teign fish; ovary and testis from Teign fish (from mature and maturing fish only); and mixed remaining tissues from the Hayle and from the Teign trout (Table S2). This strategy was adopted to allow for comparisons of transcript abundance between the Hayle and Teign fish for tissues hypothesised to be involved in metal tolerance (gill, gut, kidney and liver), and to maximise the likelihood of sequencing genes specific for each tissue. All RNA samples were treated with DNase and cleaned up on Qiagen RNeasy MinElute columns, then quality and concentration were determined using an Agilent 2100 Bioanalyzer (Agilent Technologies, Inc., USA). All RNA input to library construction was of high quality with a RIN > 8. cDNA libraries were prepared from each RNA sample using the Illumina TruSeq RNA Sample Preparation kit, according to the manufacturer's instructions.
The single embryonic cDNA library was sequenced in one lane of the Illumina GAIIx Genome Analyzer generating 100 bp paired-end reads. All cDNA libraries constructed from the wild fish were multiplexed 12x and sequenced in another single lane, generating 76 bp paired-end reads. The average insert size of the multiplexed libraries was 153 bp, and of the embryonic library was 142 bp.

Bioinformatics
The FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit) was used to clip remaining Illumina adapter sequences from the sequence reads and to trim the first 12 bp at the 5' end to remove bias caused by random hexamer priming [2]. The 3' end of each read was then quality trimmed using a sliding window, cutting at the first base with a Phred quality score of < 20 (http://wiki.bioinformatics.ucdavis.edu/index.php/Trim.slidingWindow.pl), and reads shorter than 30 bp were discarded from the dataset. Paired reads were separated from orphan reads for each of the adult tissue and embryonic libraries, using the script from https://github.com/lexnederbragt/denovo-assemblytutorial/blob/master/scripts/pair_up_reads.py. All 'forward' reads (read 1) and 'reverse' reads (read 2) of the adult tissue libraries were pooled into two separate fastq files and interleaved using the shuffleSequences_fastq.pl script provided by the Velvet package in preparation for assembly. Similarly, interleaved fastq files were created for the embryonic tissue library.
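The per-read filtering logic described above can be illustrated with a minimal Python sketch. This is not the FASTX-Toolkit or the Trim.slidingWindow.pl script; the function names, the single-base (rather than windowed) quality cut and the Phred+33 quality encoding are assumptions made for illustration only.

```python
def decode_phred(quality_string, offset=33):
    """Convert a FASTQ quality string to integer Phred scores.

    The offset of 33 is an assumption; older Illumina pipelines used 64.
    """
    return [ord(char) - offset for char in quality_string]


def trim_read(seq, quals, head_clip=12, min_phred=20, min_len=30):
    """Return the trimmed (seq, quals) pair, or None if the read is discarded.

    seq   -- read bases as a string
    quals -- list of integer Phred scores, one per base
    """
    # Step 1: remove the first `head_clip` bases at the 5' end
    # (bias from random hexamer priming).
    seq, quals = seq[head_clip:], quals[head_clip:]

    # Step 2: cut the 3' end at the first base scoring below the threshold.
    # Simplification: a window of one base stands in for the sliding window.
    for index, score in enumerate(quals):
        if score < min_phred:
            seq, quals = seq[:index], quals[:index]
            break

    # Step 3: discard reads that end up shorter than the minimum length.
    if len(seq) < min_len:
        return None
    return seq, quals
```

A read whose mate is discarded at step 3 would become an orphan, which is why paired and orphan reads are separated afterwards.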
The interleaved paired and orphan sequences for adult tissues and embryos were assembled de novo using Velvet (version 1.2.08; [3]) and Oases (version 0.2.08; [4]). An initial assembly was created using a k-mer of 73 and the following parameters for Oases: -ins_length 50 -ins_length_sd 200. Subsequently, assemblies were created using k-mers ranging from 65 to 41 (with steps of 8), such that the transcripts generated by the previous assembly were used as a long input for the next assembly. The resulting transcripts of the final assembly (the brown trout transcriptome) were then annotated using Blast against all available Ensembl cDNA sequences for zebrafish (Danio rerio), medaka (Oryzias latipes), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), human and mouse (Release 69; October 2012), (non-human) vertebrate RefSeq RNA and protein sequences and EST sequences (Database of 2012-11-09). In addition, transcripts were annotated using the Blast service at the Bioportal, University of Oslo, using the non-redundant nucleotides (nt) and proteins (nr) databases [5]. The resulting Blast outputs were parsed using the blast2table.pl script from ftp://ftp.genome.ou.edu/pub/programs/Blast2table, keeping only the top hits with an e-value cut-off of < 1e-15. Annotations were assigned in the following preferential order: zebrafish, medaka, Nile tilapia, stickleback, human, mouse (Ensembl cDNA), RefSeq vertebrates RNA, nt, RefSeq vertebrates proteins, and nr. When no annotation could be found, the transcript ID was given.
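The preferential assignment of annotations can be sketched in Python as follows. This is a hedged illustration of the rule stated above, not the pipeline actually used: the database labels, the `assign_annotation` function and the dictionary layout for the parsed top hits are hypothetical.

```python
# Databases in the stated order of preference (labels are hypothetical).
PREFERENCE = [
    "zebrafish", "medaka", "nile_tilapia", "stickleback", "human", "mouse",
    "refseq_rna", "nt", "refseq_protein", "nr",
]
EVALUE_CUTOFF = 1e-15


def assign_annotation(transcript_id, top_hits):
    """Pick one annotation for a transcript.

    top_hits maps a database label to a (description, e_value) tuple holding
    the top Blast hit for that database; databases with no hit are absent.
    Returns (database, annotation); falls back to the transcript ID.
    """
    for database in PREFERENCE:
        hit = top_hits.get(database)
        # Accept the first database, in order of preference, whose top hit
        # passes the e-value cut-off.
        if hit is not None and hit[1] < EVALUE_CUTOFF:
            return database, hit[0]
    # No acceptable hit in any database: keep the transcript ID.
    return None, transcript_id
```

With invented hits for illustration, `assign_annotation("t1", {"zebrafish": ("mt2", 1e-5), "medaka": ("mt2", 1e-40)})` skips the zebrafish hit, which fails the cut-off, and returns the medaka annotation.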

[Figure panels: Gill, Gut, Kidney, Liver]
Table S4 - Summary statistics of raw sequencing reads, numbers of reads retained after adaptor removal and quality filtering, and numbers retained for input into transcriptome assembly as either paired reads or orphans. 1 and 2 refer to the forward and reverse reads in each paired-end sequence read.