A transcriptomic analysis of diploid and triploid Atlantic salmon lenses with and without cataracts

To avoid negative environmental impacts of escapees and potential inter-breeding with wild populations, the Atlantic salmon farming industry has tested, and continues to test extensively, sterile triploid fish. However, triploids often show differences in performance, physiology, behavior and morphology compared to diploid fish, with increased prevalence of vertebral deformities and ocular cataracts as two of the most severe disorders. Here, we investigated the mechanisms behind the higher prevalence of cataracts in triploid salmon by comparing the transcriptional patterns in lenses of diploid and triploid Atlantic salmon, with and without cataracts. We assembled and characterized the Atlantic salmon lens transcriptome and used RNA-seq to search for the molecular basis for cataract development in triploid fish. Transcriptional screening showed only modest differences in lens mRNA levels in diploid and triploid fish, with few uniquely expressed genes. In total, there were 165 differentially expressed genes (DEGs) between the cataractous diploid and triploid lens. Of these, most were expressed at lower levels in triploid fish. Differential expression was observed for genes encoding proteins with known function in the retina (phototransduction) and proteins associated with repair and compensation mechanisms. The results suggest a higher susceptibility to oxidative stress in triploid lenses, and that mechanisms connected to the ability to handle damaged proteins are differentially affected in cataractous lenses from diploid and triploid salmon.


Introduction
Recent years have seen renewed efforts to establish commercial farming of triploid Atlantic salmon (Salmo salar), hereafter referred to as salmon (Benfey, 2016; Stien et al., 2019). Farming triploid salmon has two major advantages. The first is that triploid females do not mature sexually and can divert energy into somatic growth (Piferrer et al., 2009). The second is that escapes of domesticated salmon and their genetic interactions with wild conspecifics represent one of the most significant environmental challenges to salmon aquaculture. Rearing sterile triploid fish reduces this threat and is an effective way to mitigate further genetic interactions.
Although production of triploid salmon has potential benefits, the global Atlantic salmon aquaculture industry is still primarily based upon rearing diploid fish. There are several reasons for this, one of which is that triploid salmon often show poor performance. For example, in comparison with diploid salmon, they display differences in physiology, behavior and morphology, with increased prevalence of vertebral deformity and ocular cataracts as two of the most severe disorders (Wall and Richards, 1992; Piferrer et al., 2009; Taranger et al., 2010; Taylor et al., 2015; Sambraus et al., 2017). Cataracts are defined as the loss of transparency of the lens and can appear both as reversible osmotic cataracts and permanent cataracts, which can have multiple causes (Hejtmancik, 2008). In farmed salmon, cataract formation has been linked to genetic predispositions and several nutritional and environmental factors (reviewed by Bjerkås et al., 2006). Cataract has been observed in both freshwater and seawater; however, farmed salmon are particularly prone to cataract development during the smolt transition from fresh to saltwater (Waagbø et al., 1998; Breck and Sveier, 2001; Breck et al., 2005a; Remø et al., 2014) and during periods of rapid growth (Breck and Sveier, 2001; Waagbø et al., 1996, 2010; Remø et al., 2014, 2017). Increased prevalence of cataracts in triploid fish is not well understood but may partly rely on altered metabolism due to differences in cellular morphology (Benfey, 1999).
A sub-optimal level of dietary histidine is currently considered the most important causative factor for cataract development in farmed Atlantic salmon (Breck et al., 2003, 2005b; Remø et al., 2014, 2017; Waagbø et al., 2010). Taylor et al. (2015) investigated the preventive effects of dietary histidine supplementation in triploid Atlantic salmon during seawater grow-out. Although the severity was higher in triploids compared to diploids irrespective of diet, applying a high histidine diet mitigated further cataract development in triploids. Similarly, dietary histidine supplementation reduced the severity of cataracts in diploid and triploid yearling smolt, but also with a higher severity in triploids compared to diploids at the highest dietary level (Sambraus et al., 2017). The cataract preventative effect of dietary histidine has been attributed to the functional roles of histidine and the derivative N-acetyl-histidine (NAH) as a buffer component (Breck et al., 2005a), osmolyte (Rhodes et al., 2010) and possibly antioxidant (Remø et al., 2014), which make them important for maintaining cell integrity and water balance. The lens concentration of NAH was lower in triploids compared to diploids given the same high histidine diet (Sambraus et al., 2017), suggesting that the triploid lens may be more vulnerable to cataract development, possibly due to lower protection of the triploid lens through lower ability to synthesize NAH, or a higher requirement to maintain water balance in the lens. The latter might be linked to larger cell size in triploids (Wu et al., 2010). Thus, differences in susceptibility to cataracts, as well as the apparent higher requirement of histidine to mitigate (but not eliminate) cataract development in triploids, may be hypothesized to be due to alterations or weakness in the lens of triploids.
Thus far, no attempts have been made to evaluate the mechanisms behind increased prevalence of cataracts in triploid fish at the molecular level. Relatively few genome-wide examinations of the molecular mechanisms behind cataract formation have been performed on healthy and cataractous lenses in vertebrates (Sousounis and Tsonis, 2012), possibly due to the biased lens transcriptome, where the expression of structural genes, such as crystallins, predominates over genes that regulate cell function and phenotype (Wistow, 2006; Manthey et al., 2014). Global transcriptional examinations of the mammalian cataractous lens have revealed differential regulation of numerous types of genes, including crystallins and heat shock proteins, cytochrome oxidases, growth factors, metalloproteinases and collagen, as well as various transcription factors (Wistow et al., 2002; Wride et al., 2003; Hawse et al., 2003; Mansergh et al., 2004; Medvedovic et al., 2006; Hejtmancik, 2008; Shiels et al., 2010; Hejtmancik, 2017, 2019). In Atlantic salmon, Tröße et al. (2009) used a 16K salmonid microarray to screen for transcriptional responses to histidine-related cataracts in lenses and reported differences in genes encoding proteins linked to lipid metabolism, carbohydrate metabolism, and protein degradation. Among the significantly differentially regulated genes were gamma crystallin M2 (homolog to mammalian crygb), lens fiber membrane lim2, secreted protein, acidic, cysteine-rich (sparc), metallothionein B (mt-b), heat-shock cognate 70 (hsc70a), calpain (capns1), Na/K ATPase alpha subunit isoform 1c (atpa1c) and fatty acid binding protein 2 (fabp2), several of which have been linked to cataracts before.
In the present study, we used transcriptomics (RNA-seq) to examine why triploid fish are more prone to cataract development than diploid fish. To do so, we compared the transcriptional patterns in the lens of diploid and triploid Atlantic salmon originating from both a domesticated strain and a wild population, with and without mature cataract, as assessed by a slit-lamp biomicroscope.

Experimental animals and set-up
The salmon used in this experiment originated from females (f) and males (m) from the domesticated Mowi strain (M) crossed with females and males from a wild population in the river Figgjo (F) in November 2011. Eight groups were produced, with diploid and triploid offspring from each of four crosses between the farmed and wild strains: mM × fM, mM × fF, mF × fF and mF × fM. The offspring groups were start-fed with a commercial feed (Skretting, Stavanger, Norway) on March 26th, 2012 and were held at 12 °C water temperature from start feeding to mid-summer. Thereafter, the groups were reared at ambient temperature. Fish were reared under continuous light from start feeding to October 1st, followed by a simulated natural photoperiod to initiate parr-smolt transformation. Experimental groups were held separately in eight tanks until November 27th, 2012, when they were individually passive integrated transponder-tagged (PIT-tags, Electronic I, Inc., Dallas, TX, USA) and the groups distributed equally into three replicate tanks. Fish were transferred to seawater on May 10th, 2013. In sea, the fish were fed Skretting Spirit 75-50A. The experiment was terminated on October 16th, 2013.
Fish were sampled as post smolts in seawater at a mean body weight of 143 ± 8 g (n = 46). Upon sampling, the fish were inspected for cataracts, and weight and length measured. From each fish, two lenses and heart tissues were dissected and immediately frozen on liquid nitrogen. The left lens was used for transcriptome de novo assembly and transcriptomics.

Cataract determination
Cataract assessment was performed on anaesthetized fish by use of a Kowa SL-15 slit-lamp biomicroscope (Kowa, Tokyo, Japan). The type, position and severity of the observed cataractous changes were determined according to Wall and Richards (1992), but with a maximum severity extended from 3 to 4 per eye to match the amplitude of the macroscopic scale (microscopic cataract score 0: absent, 1: slight, 2: moderate, 3: severe, 4: total cataract).

Histidine determination
Heart tissue from the sampled fish was used as status organ for histidine and NAH (Remø et al., 2014). NAH and free histidine in the heart tissue were analyzed by reverse phase HPLC and UV detection at 210 nm, with modifications according to Breck et al. (2005b).

RNA isolation
Lens tissue was thoroughly homogenized before RNA extraction using a Precellys 24 homogenizer and ceramic beads CK28 (Bertin Technologies, Montigny-le-Bretonneux, France). Total RNA was extracted using the BioRobot EZ1 and RNA Tissue Mini Kit (Qiagen, Hilden, Germany), treated with DNase according to the manufacturer's instructions and eluted in 50 μL RNase-free MilliQ H2O. RNA quality and integrity were assessed with the NanoDrop ND-1000 UV-Vis Spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) and the Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). The RNA 6000 Nano LabChip kit (Agilent Technologies, Palo Alto, CA, USA) was used to evaluate the RNA integrity of the lens samples. The RNA integrity numbers (RIN) of RNA extracted for transcriptome assembly (2N: n = 10, 3N: n = 10) and RNA-seq (2N: n = 12, 3N: n = 14) were 8.5 ± 0.0 (n = 20) and 8.2 ± 0.1 (n = 26), respectively (mean ± SEM).

Transcriptome de novo assembly
Transcriptome de novo assembly had to be conducted using Illumina paired-end reads before the RNA-seq analyses. RNA extracted from 10 diploid and 10 triploid lenses (n = 20) was mixed and used to generate the assembly. Transcriptome de novo assembly was conducted using the short-read assembly software Trinity as described by Grabherr et al. (2011). Trinity combines three independent software modules, Inchworm, Chrysalis, and Butterfly, to process the RNA-seq reads into unigenes. The output sequences were aligned to the databases of NR, NT, SwissProt, KEGG, COG and GO using Blastx, and the best aligning result was used to decide sequence direction. Sequence direction of unigenes not aligned to any of the above-mentioned databases was determined with the ESTScan software (Iseli et al., 1999).

RNA-seq analysis
Direct RNA sequencing (RNA-seq) was used to screen for differentially expressed genes (DEGs) in lenses of both diploid and triploid individuals. As the two strains used here are known to display divergent transcription patterns (Bicskei et al., 2014, 2016), fish from both groups were randomly mixed and pooled prior to any analysis. Individual left lenses from 26 salmon were used for RNA-seq examination (2N-: n = 6, 2N+: n = 6, 3N-: n = 6, 3N+: n = 8). Poly(A) mRNA was isolated from total RNA obtained from the lens samples using magnetic beads with oligo(dT). Fragmentation buffer was added to shear the mRNA into short fragments. Using these short fragments (about 200 bp) as templates, random hexamer primers were applied to synthesize first-strand cDNA. Second-strand cDNA was synthesized using buffer, dNTPs, RNaseH, and DNA polymerase I. The QiaQuick PCR extraction kit (Qiagen) was used to purify short double-stranded cDNA fragments according to the manufacturer's instructions. These fragments were then resolved in EB buffer for end repair and poly(A) addition, and ligated to the sequencing adapters. After agarose gel electrophoresis, suitable fragments were selected as templates for PCR amplification. Finally, the libraries were sequenced using Illumina HiSeq™ 2000 (San Diego, CA, USA).
The de novo lens transcriptome described above was thereafter used as a reference for alignment of the RNA-seq data. Unigenes were annotated with Blastx alignment using an E-value cut-off of 10⁻⁵ between unigenes and the databases of NR, NT, SwissProt, KEGG, COG and GO. The NOISeq software package (Tarazona et al., 2011) was used to screen for differentially expressed genes (DEGs). NOISeq is a non-parametric method for the identification of DEGs, which performs well when compared to other differential expression methods, such as Fisher's Exact Test, edgeR, DESeq and baySeq. All RNA-seq work was performed by staff at the Beijing Genomics Institute (BGI, Hong Kong).

Ploidy verification
Fish from each ploidy were sampled and measured for erythrocyte diameter to verify their ploidy status (Benfey et al., 1984). Blood smears were used to measure the relative diameters of 10 erythrocytes per fish (Image-Pro Plus, version 4.0, Media Cybernetics, Silver Spring). The triploid fish had significantly (22%) larger blood cells than diploid fish.

Statistics
To calculate differential expression, NOISeq default settings were used (Tarazona et al., 2011). NOISeq empirically models the noise distribution of count changes by contrasting fold-change differences (M) and absolute expression differences (D) for all the features in samples within the same condition. This reference distribution is used to assess whether the M-D values computed between two conditions for a given gene are likely to be part of the noise or to represent true differential expression. Instead of using a false discovery rate (FDR) or a q-value cut-off, the NOISeq method calculates a differential expression probability value. A gene is declared as differentially expressed if this probability is higher than q. The threshold q is set to 0.8 by default, since this value is equivalent to odds of 4:1 (the gene is 4 times more likely to be differentially expressed than not). In this work we used a log2 M-value cut-off of ≥1 (fold-change ≥2). For genes not expressed in some samples, a gene expression value of 0.001 was used. Functional pathway analyses, including prediction of activation and inhibition of upstream transcription factors and downstream effects, were generated through the use of QIAGEN's Ingenuity Pathway Analysis (IPA, QIAGEN Redwood City, www.qiagen.com/ingenuity). Since IPA can only map mammalian homolog entries, identifiers were obtained with Blast alignment against the RefSeq databases (E-value cut-off 10⁻⁵), assuming that orthologous genes have the same function. For this reason, a limited number of fish-specific genes with no mammalian homologs were not included in the IPA pathway analysis. This may have skewed the interpretation of the transcriptomic data.
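The arithmetic behind these thresholds can be illustrated with a short sketch. This is not the NOISeq implementation (NOISeq is an R/Bioconductor package); the function names are hypothetical, and the sketch assumes that M is the log2 ratio of two expression values, that D is their absolute difference, and that 0.001 is substituted for zero expression as described above.

```python
import math
from typing import Tuple

ZERO_REPLACEMENT = 0.001  # value used for genes not expressed in a sample (from the text)


def odds_from_probability(q: float) -> float:
    """Convert a differential expression probability into odds, q : (1 - q)."""
    return q / (1.0 - q)


def m_d_values(expr_a: float, expr_b: float) -> Tuple[float, float]:
    """Return (M, D): the log2 ratio and the absolute difference of two expression values."""
    a = expr_a if expr_a > 0 else ZERO_REPLACEMENT
    b = expr_b if expr_b > 0 else ZERO_REPLACEMENT
    return math.log2(a / b), abs(a - b)


def passes_cutoffs(prob: float, m: float, q: float = 0.8,
                   min_fold_change: float = 2.0) -> bool:
    """Hypothetical filter: probability >= q and |M| >= log2(minimum fold change)."""
    return prob >= q and abs(m) >= math.log2(min_fold_change)


if __name__ == "__main__":
    print(f"odds at q = 0.8: {odds_from_probability(0.8):.1f} to 1")
    m, d = m_d_values(8.0, 2.0)
    print(f"M = {m:.2f}, D = {d:.2f}, passes: {passes_cutoffs(0.85, m)}")
```

A two-fold change corresponds to a log2 ratio of 1, which is why the fold-change cut-off is converted with `math.log2` before it is compared with M.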

Data availability
The RNA-seq dataset discussed in this publication has been deposited in NCBI's Gene Expression Omnibus and is accessible through GEO Series accession number GSE153933 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE153933).

Growth and heart histidine levels
Growth of the farmed and wild stocks is reported in detail by Harvey et al. (2017), and is therefore not reported here. At the present sampling, there was no significant difference in weight between the diploid and triploid fish. Fig. 1 shows the concentrations of L-histidine and N-acetyl-L-histidine (NAH) analyzed in salmon heart as a measure of the ambient histidine status. Two-way ANOVA showed that there was a significant effect of cataract on L-histidine (Fig. 1A, p = 0.016), while there were no significant effects of either cataract or ploidy on NAH (Fig. 1B). Post hoc tests showed reduced levels of L-histidine (p = 0.019) and increased levels of N-acetyl-L-histidine (p = 0.046) in diploid fish with cataract.

Cataract status
The mean cataract scores of the left lenses used to generate the lens transcriptome (2N+ had score 2.5 and 3N+ had score 1.5, while both 2N- and 3N- had score 0) and the right lenses used for the RNA-seq analysis (both 2N+ and 3N+ had score 3, while both 2N- and 3N- had score 0) are shown in Fig. 2A and B.

Lens de novo transcriptome assembly
Total RNA extracted from lenses of 20 domesticated and wild salmon, 10 of which were diploid and 10 triploid, was mixed (Fig. 2A), sequenced and used to assemble the lens transcriptome. Using the Illumina HiSeq 2000 platform, a total of 68,403,252 raw reads were generated, of which 63,098,790 were clean reads. The clean reads were assembled into 78,306 contigs with a mean length of 284 nucleotides (nts). Of these, 29,177 contigs with a mean length of 659 nts mapped to UniGene entries. There were 7391 distinct clusters, each containing highly similar UniGene entries (more than 70% similarity) that may come from the same gene or homologous genes, and 21,786 distinct singletons, each representing a single UniGene. Using blastx and blastn with an E-value cut-off of 10⁻⁵, 15,711, 21,084, 14,445, 11,680, 5231 and 11,327 UniGene entries were functionally annotated to the NR, NT, Swiss-Prot, KEGG, COG and GO databases, respectively, which were searched in that priority order. UniGene entries aligned to a higher-priority database were not aligned to a lower-priority database. In total, 22,160 out of the 29,177 UniGene entries were given a functional annotation in these databases.
For protein coding region prediction analysis, the number of coding DNA sequences (CDS) that mapped to the protein database was 15,428. A further 701 CDS were predicted by ESTScan from Unigene entries that could not be aligned to any database. The total number of CDS was 16,129. According to the microsatellite analysis conducted with the MicroSAtellite (MISA) software and using Unigenes as reference, there were 7274 simple sequence repeats (SSRs) in the transcriptome. Heterozygous analysis using SOAPsnp, a member of the Short Oligonucleotide Analysis Package (SOAP), revealed 23,759 single-nucleotide polymorphisms (SNPs) in the transcriptome.
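As a quick consistency check, the assembly figures quoted above can be tallied in a few lines. This is only an illustrative sketch; the variable names are made up here, and the derived percentages are computed from the reported counts rather than taken from the study.

```python
# Counts quoted in the assembly description above
raw_reads = 68_403_252
clean_reads = 63_098_790
unigenes = 29_177
clusters = 7_391
singletons = 21_786
annotated = 22_160
cds_mapped = 15_428    # CDS mapped to the protein database
cds_estscan = 701      # CDS predicted by ESTScan

# Derived quantities
clean_fraction = clean_reads / raw_reads
annotated_fraction = annotated / unigenes
total_cds = cds_mapped + cds_estscan

# Clusters and singletons should add up to the number of UniGene entries
assert clusters + singletons == unigenes

if __name__ == "__main__":
    print(f"clean reads: {clean_fraction:.1%} of raw reads")
    print(f"annotated UniGene entries: {annotated_fraction:.1%}")
    print(f"total CDS: {total_cds}")
```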
A summary of the NR annotation is shown in Fig. S1 (Supplementary File 1). Fig. S1A shows E-value distribution, while similarity distribution is shown in Fig. S1B. Fig. S1C shows the species distribution of UniGene entries annotated to the NR database. Most UniGene entries mapped to Atlantic salmon, followed by hits against Nile tilapia (Oreochromis niloticus), zebrafish (Danio rerio), Japanese medaka (Oryzias latipes) and other fish species. Gene ontology (GO) annotation of the NR unigenes was obtained with the Blast2GO and WEGO software (Conesa et al., 2005; Ye et al., 2006). Fig. S2 shows the major GOs from the salmon lens transcriptome, divided into the three ontologies biological process, molecular function and cellular component. Among the more specialized molecular functions, antioxidant activity is worth mentioning, while the biological process annotations indicate that the lens cells are relatively metabolically active.

Differentially expressed genes (DEGs)
To search for differentially expressed genes (DEGs) in diploid and triploid salmon with and without cataracts (Fig. 2B), the left lenses from 26 individual fish were selected for RNA-seq analysis. The selection was based on cataract score (score 0 vs 3) and ploidy (2N vs 3N). A total of 624,610,512 single-end reads were sequenced with the Illumina HiSeq™ 2000 system. On average, 24,023,481 ± 199,810 single-end reads were sequenced per sample (n = 26, mean ± SEM). Average total reads mapped to the in-house lens transcriptome were 17,882,297 ± 161,561 (n = 26, mean ± SEM), representing 74.4% of the total reads. As expected for fish, some contigs had redundant annotations.
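The mapping rate follows directly from the per-sample means, and the SEM can be converted to a standard deviation when a measure of between-sample spread is wanted. The sketch below uses only the figures reported above; the variable names are illustrative and the standard deviation is a derived value, not one given in the study.

```python
import math

n_samples = 26
mean_reads = 24_023_481    # mean single-end reads per sample
sem_reads = 199_810        # SEM of reads per sample
mean_mapped = 17_882_297   # mean reads mapped to the lens transcriptome

# Fraction of reads mapped to the reference transcriptome
mapping_rate = mean_mapped / mean_reads

# SEM = SD / sqrt(n), so SD = SEM * sqrt(n)
sd_reads = sem_reads * math.sqrt(n_samples)

if __name__ == "__main__":
    print(f"mapping rate: {mapping_rate:.1%}")
    print(f"approximate SD of reads per sample: {sd_reads:,.0f}")
```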
Using the default NOISeq setting for calculation of differential expression (q ≥ 0.8 and log2 ratio ≥ 1), the comparison between diploid fish without and with cataracts (2N- vs 2N+) showed that 182 DEGs were more highly expressed in 2N- lenses and 25 DEGs were more highly expressed in 2N+ lenses (Fig. 2C). For the comparison between triploid fish without and with cataracts (3N- vs 3N+), 74 DEGs were more highly expressed in 3N- and 78 DEGs were more highly expressed in 3N+. Comparison of healthy diploid lenses vs healthy triploid lenses (2N- vs 3N-) yielded 107 DEGs, with 93 genes more highly expressed in 2N- and 14 more highly expressed in 3N-. Comparison of cataractous diploid lenses vs cataractous triploid lenses (2N+ vs 3N+) yielded 165 significant DEGs, with 9 genes more highly expressed in 2N+ and 156 genes more highly expressed in 3N+. All significant DEGs in the four comparisons, including fold changes, significance levels and best annotation, which were used in downstream functional analyses, are shown in Supplementary File 2. Annotations were given to about 52% of the DEGs.
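The DEG counts for the four comparisons can be collected in a small table to make the totals and the direction of change easier to compare. The sketch below uses only the counts given above; the dictionary layout and function name are illustrative choices.

```python
# (more highly expressed in first group, more highly expressed in second group)
deg_counts = {
    "2N- vs 2N+": (182, 25),
    "3N- vs 3N+": (74, 78),
    "2N- vs 3N-": (93, 14),
    "2N+ vs 3N+": (9, 156),
}


def summarize(counts):
    """Return (comparison, total DEGs, share higher in the first group) per comparison."""
    rows = []
    for comparison, (first, second) in counts.items():
        total = first + second
        rows.append((comparison, total, first / total))
    return rows


if __name__ == "__main__":
    for comparison, total, share_first in summarize(deg_counts):
        print(f"{comparison}: {total} DEGs, {share_first:.0%} higher in the first group")
```

The totals for the two ploidy comparisons (107 and 165) match the numbers stated in the text; the totals for the two cataract comparisons are derived sums.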
Very few DEGs with unique expression were found in the lenses from the four treatment groups. Fig. 2D shows a Venn diagram of the number of unique and shared DEGs determined with a four-way comparison. There were 4 unique DEGs in the 2N- group, 17 in the 2N+ group, 15 in the 3N- group and 245 in the 3N+ group. In total, 98.7% of the DEGs were shared between all treatment groups. Most of the unique DEGs were expressed in only one or a few of the lenses from their respective group. Annotations of unique DEGs are shown in Supplementary File 3.

Functional analysis
Two pathway analysis methods, KEGG and IPA pathway analysis, were employed for functional analysis of DEGs in cataractous lenses from diploid and triploid fish. KEGG pathway enrichment analysis identifies significantly enriched metabolic pathways or signal transduction pathways in DEGs by comparison to the whole genome. Table 1 shows the most significant KEGG pathways from the four comparisons based on a q-value cut-off of 0.05. The top three pathways in both diploid and triploid fish with cataracts were "Phototransduction", "Carbohydrate digestion and absorption" and "Proximal tubule bicarbonate reclamation". Interestingly, the significant DEGs linked to the phototransduction pathway (KEGG pathway ko04744), 13 in the diploid fish and 17 in the triploid fish (gnb1, arrb1 and arr3 were found only in triploid fish), were all upregulated in the diploid cataractous lens (Fig. 3A) and downregulated in the triploid cataractous lens (Fig. 3B). As expected, direct comparisons between diploid and triploid lenses from fish without and with cataract gave similar patterns. The "ECM-receptor interaction" and "PPAR signaling" pathways were two of the most significantly affected KEGG entries in the direct comparison of DEGs in diploid and triploid salmon with cataracts that were not listed in the other comparisons.
IPA Core Analysis and the IPA Compare function were used for evaluation of biological processes, pathways and networks. In order to use IPA, all identifiers must be recognized as mammalian homologs. Some fish-specific genes cannot be given human ortholog names recognized by IPA, and were thus omitted from the IPA Core Analysis. About 52% of the DEGs from the four gene lists were given automatic annotation as described above (Supplementary File 2). In addition, all unknown DEGs were manually aligned against the core nucleotide and EST databases, and given annotation based on hits against NCBI Unigene entries (Blastn E-value cut-off 10⁻⁵). In this way, 64.4% of the DEGs used for the functional analysis had IPA identifiers. Table 2 shows annotated salmon genes with human identifiers used in these functional analyses that were significantly differentially expressed according to the four comparisons (2N+ vs 2N-, 3N+ vs 3N-, 3N+ vs 2N+ and 3N- vs 2N-). Highlighted in the table are cataract-linked genes that are differentially regulated in various mouse knockout models; these data were obtained from the iSyTE (integrated Systems Tool for Eye gene discovery) database (URL: http://research.bioinformatics.udel.edu/iSyTE).

Impact of cataracts
To get an idea of the mechanistic basis for cataract development in the salmon lens and the impact of ploidy, we used IPA Core Analysis with the predicted upstream regulators function and the categorical annotations of disease or function to search for differences in the four comparisons described above. By sorting with an activation z-score >2 and p-value of overlap <10⁻⁵, IPA Core Analysis predicted six upstream regulators that may explain the observed DEGs in lenses of diploid salmon with cataracts. These were CRX, GTF2IRD1, HIF1A, EDN1, hexachlorobenzene and EPO (Supplementary File 4). The dataset for the most significant transcriptional regulator, CRX with a z-score of 2.43 and a p-value of overlap of 8.35E-19, was made up of the DEGs arr3, gnat1, gnat2, opn1lw, opn1sw, pdc, pde6g, prph2, rcvrn and rho. For GTF2IRD1, which had a z-score of 2.10 and p-value of overlap of 2.89E-10, the dataset consisted of arr3, gnat1, gnat2, opn1lw, opn1sw, pdc and rho. For the disease or function annotation, the analysis predicted eight categories with a z-score above 2 and p-value <10⁻⁵. These were "Cellular Movement, Immune Cell Trafficking-leukocyte migration", "Cellular Movement-cell movement", "Cellular Movement, Hematological System Development and Function, Immune Cell Trafficking-cell movement of leukocytes", "Cellular Movement-migration of cells", "Cell Death and Survival-cell viability", "Cell Death and Survival-cell survival", "Tissue Morphology-quantity of cells" and "Cellular Movement-migration of brain cancer cell lines".
In the lenses of triploid salmon with cataracts, five upstream regulators had a predicted activation state based on the same cut-off as described above. These were CRX, GTF2IRD1, beta-estradiol, trichostatin A and decitabine (Supplementary File 5). CRX, the most significant regulator with a z-score of −2.73 and a p-value of overlap of 8.90E-20, was predicted to be affected based on the same DEGs as in diploid fish, i.e. arr3, gnat1, gnat2, opn1lw, opn1sw, pdc, pde6g, prph2, rcvrn and rho. The predicted activation state for GTF2IRD1 (z-score: −2.10, p-value of overlap: 6.33E-11) in lenses of triploid salmon was based on the same DEGs as in diploid fish. Using the same cut-off, no disease or function categories had a predicted activation state in the lenses of triploid salmon. By comparing the transcriptional patterns in cataractous lenses of diploid and triploid salmon indirectly (IPA Compare Analysis of 2N- versus 2N+ and 3N- versus 3N+), the most pronounced differences were seen for the "IL8" and "Production of nitric oxide and reactive oxygen species in macrophage" canonical pathways (data not shown). These pathways had higher activation z-scores in lenses of the diploid fish compared to the triploid fish. A predicted regulator network generated from the comparison of DEGs in cataractous lenses of diploid and triploid salmon is shown in Fig. 4. This network, which had a consistency score of 13.87 and was based on target DEGs apoe, clu, gal, igfbp2, junb, krt18, lep, mmp2, plaur, rbp4 and snca, and on upstream regulators AGT, CREB1, ERK, HIF1A and P38 MAPK, predicted that synthesis of nitric oxide and chemotaxis of cells might be different in cataractous lenses of diploid and triploid salmon. Based on analysis of predicted upstream regulators and hierarchical clustering, the most pronounced difference in triploid fish was seen for CRX, GTF2IRD1, SRC and RHO (Table 3). Interestingly, these upstream regulators were predicted to be activated in diploid fish (positive z-score) and inhibited in triploid fish (negative z-score) with cataracts. Fig. 5 shows the molecules in these four networks. Except for krt18 and ckm in the SRC network, all genes in these networks were upregulated by cataracts in diploid fish and downregulated in triploid fish.
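The opposite signs of the z-scores for CRX and GTF2IRD1 can be made explicit with a small filter. The sketch below is not part of IPA; the class and function names are hypothetical, the cut-offs are assumed to be an absolute z-score above 2 and a p-value of overlap below 10⁻⁵, and only the four regulator values reported in the text are included.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    regulator: str
    ploidy: str       # "2N" or "3N", cataractous vs non-cataractous lenses
    z_score: float    # activation z-score
    p_overlap: float  # p-value of overlap


# Values quoted in the text for the two top transcription regulators
PREDICTIONS = [
    Prediction("CRX", "2N", 2.43, 8.35e-19),
    Prediction("GTF2IRD1", "2N", 2.10, 2.89e-10),
    Prediction("CRX", "3N", -2.73, 8.90e-20),
    Prediction("GTF2IRD1", "3N", -2.10, 6.33e-11),
]


def activation_state(z_score: float, z_cutoff: float = 2.0) -> str:
    """Classify a z-score as activated, inhibited, or without a prediction."""
    if z_score > z_cutoff:
        return "activated"
    if z_score < -z_cutoff:
        return "inhibited"
    return "no prediction"


def significant(preds: List[Prediction], z_cutoff: float = 2.0,
                p_cutoff: float = 1e-5) -> List[Prediction]:
    """Keep predictions passing both the z-score and the p-value cut-off (assumed values)."""
    return [p for p in preds if abs(p.z_score) > z_cutoff and p.p_overlap < p_cutoff]


if __name__ == "__main__":
    for p in significant(PREDICTIONS):
        print(f"{p.regulator} ({p.ploidy}): {activation_state(p.z_score)}")
```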

Impact of ploidy
Comparison of the transcriptional patterns in lenses from diploid and triploid salmon without cataracts revealed two upstream regulators with predicted activation scores above 2 and p-values of overlap <10⁻⁴ (Supplementary File 6). According to the IPA Core Analysis, both the transcription regulator CRX and the chemical drug trichostatin A were predicted to be activated, with z-scores of 2.55 and 2.40, respectively. Targeted DEGs for CRX were gnat1, gnat2, opn1lw, opn1sw, pdc, pde6g, prph2, rcvrn and rho, while cdh1, hba1/hba2, hbb, hbz, ndrg1 and slc1a2 made up the dataset for the trichostatin A prediction.
A comparison of transcriptional patterns in lenses from diploid and triploid fish without cataracts showed that two disease or function annotations had prediction scores above 2 and a p-value of overlap <10⁻⁵ (Supplementary File 6). "Organ degeneration" (z-score 2.19) and "Degeneration of cells" (z-score 2.19) showed significant differential prediction scores in lenses of triploid and diploid salmon without cataracts. The "Organ degeneration" z-score was based on differential transcription of prph2, rho, crb1, slc1a2, pde6g, gngt1, rpe65, rcvrn, ca1, gnat1, opn1mw (includes others) and ca2, whereas the "Degeneration of cells" z-score was based on differential transcription of rho, slc1a2, pde6g, gngt1, rpe65, gnat1 and prph2. All these genes were more highly expressed in lenses of diploid fish compared to triploid fish. This could reflect differential transcription in non-cataractous lenses from diploid and triploid salmon, or indicate that mechanisms leading to cataracts
The direct comparison of transcriptional patterns in lenses of diploid and triploid salmon with cataracts yielded five predicted upstream regulators with activation z-score >2 and p-value of overlap <10⁻⁵ (Supplementary File 7). These were CRX, GTF2IRD1, beta-estradiol, trichostatin A and decitabine. Targeted DEGs for the most significant regulator (transcription regulator CRX, p-value of overlap 8.90E-20) were arr3, gnat1, gnat2, opn1lw, opn1sw, pdc, pde6g, prph2, rcvrn and rho. A significant result for the transcription regulator GTF2IRD1 was based on the DEGs arr3, gnat1, gnat2, opn1lw, opn1sw, pdc and rho. Five categories with disease or functional annotation had a predicted activation state based on z-score >2 and p-value of overlap <10⁻⁵ (Supplementary File 7). These were "DNA Replication, Recombination, and Repair, Nucleic Acid Metabolism, Small Molecule Biochemistry-hydrolysis of nucleotide", "Behavior-", "Cellular Movement-migration of blood cells", "Cellular Movement-cell movement" and "Organismal Development-size of body".

Discussion
This is the first study to investigate the transcriptomics of salmon lenses in diploid and triploid salmon with and without cataracts. Functional analysis showed that retina-associated genes were differentially affected in diploid and triploid fish. Predicted differential effects of NO-induced oxidative stress, modified cytoskeleton stability and lipid metabolism, possibly affecting cellular metabolism, indicate that the triploid lens might be more vulnerable to cataract due to altered protein degradation and turnover.
Overall, this study indicates that the transcriptional patterns in the lenses of diploid and triploid Atlantic salmon are very similar. This is consistent with the results from a recent study, which showed that the vast majority of genes in liver tissue had similar expression levels between diploid and triploid coho salmon (Oncorhynchus kisutch) (Christensen et al., 2019). Similar results have been shown for other fish species (Chatchaiphan et al., 2017). At the protein level there also appear to be only small differences in expression between diploid and triploid salmon (Nuez-Ortin et al., 2017). Relatively few significant DEGs were found in the current dataset. Most of the significant DEGs in the cataractous triploid lenses were expressed at lower levels than in the cataractous diploid lenses (156 vs 9). In healthy lenses the pattern was opposite, with more of the significant DEGs being expressed at lower levels in the diploid lenses (93 vs 14). According to the functional analysis, the most distinct differences in transcript levels between diploid and triploid cataractous lenses were seen for genes encoding proteins involved in the phototransduction pathway. Whether this reflects a direct effect of ploidy on the transcription of these genes is unknown.
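The direction counts quoted above can be expressed as proportions. The following minimal Python sketch uses only the counts given in the text (156 vs 9 for cataractous lenses, 93 vs 14 for healthy lenses); the function name is illustrative and no statistical test is implied.

```python
def direction_summary(majority: int, minority: int) -> tuple[int, float]:
    """Return the total DEG count and the percentage in the majority direction."""
    total = majority + minority
    return total, round(100 * majority / total, 1)

# Counts taken from the text.
cataract_total, cataract_pct = direction_summary(156, 9)
healthy_total, healthy_pct = direction_summary(93, 14)

print(f"Cataractous lenses: {cataract_total} DEGs, {cataract_pct}% in the majority direction")
print(f"Healthy lenses: {healthy_total} DEGs, {healthy_pct}% in the majority direction")
```

The cataractous total of 165 matches the number of DEGs reported for the direct comparison of cataractous diploid and triploid lenses.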
The N3+ vs N2+ comparison list contained a gene associated with heat shock protein (HSP) activity, hsp47/serpinh1. Furthermore, two heat shock protein genes, annotated to hspa8 and hspa8b, were upregulated in diploid cataractous lenses but not in diploid non-cataractous lenses. These findings suggest a different ability to handle damaged proteins and protein turnover. Crystallins, the water-soluble structural proteins of the lens and cornea that account for their transparency, are relatively similar to HSPs and have similar chaperone activity (Wang and Spector, 1995;Slingsby et al., 2013). "Protein digestion and absorption" (KEGG pathway ko04974) was the second most significantly affected pathway, after phototransduction, in triploid cataractous lenses compared to diploid cataractous lenses, according to the functional analysis. In humans, ROS-generated protein oxidation may lead to cataract formation in the aged lens (Taylor and Davies, 1987). In addition to oxidative stress and the inflammatory response, an unfolded protein response is known to be activated in age-related ocular disorders such as cataracts (Lenox et al., 2015). Histidine has been shown to stimulate the proteasome and thereby protein degradation and turnover (Hamel et al., 2003;Breck et al., 2005a). With diminished antioxidant capacity and decreased proteolytic capabilities, the triploid lens may be less efficient in clearing damaged proteins. Taken together, the results from the current study indicate that the higher susceptibility to cataract development in triploid vs diploid salmon may in part depend on how well the cells handle damaged proteins.
The heart histidine and NAH levels observed in the present study represent normal values obtained from a commercial salmon smolt feed (Remø et al., 2014). According to the factorial analysis, the histidine concentration was related to cataract status and not to ploidy, while NAH status did not differ with either cataract status or ploidy. The result confirms the higher sensitivity of histidine relative to NAH status in heart tissue, but more importantly, a lens histidine status corresponding to that of the diploid cataract group and both triploid groups indicates suboptimal conditions for salmon smolts (Remø et al., 2014). The present groups of salmon would therefore be prone to cataract development.
Table 2 Annotated salmon genes with human orthologs used in the IPA functional analyses. Genes common for all four comparisons were all up-regulated in diploid cataractous lenses and down-regulated in triploid cataractous lenses. Genes in bold are differentially regulated in mouse cataract mutants (>2.0 fold) according to the iSyTE database (Kakrana et al., 2018). Underlined are genes that are differentially regulated by more than one mutant type.
At the molecular level, functional analysis predicted that the upstream regulators cone-rod homeobox (CRX), GTF2I repeat domain-
triploid lenses, which may be hypothesized to be an underlying factor for the higher prevalence of cataracts compared to diploids, as well as the lower lens NAH status observed when reared under similar conditions in the studies by Taylor et al. (2015) and Sambraus et al. (2017). Likewise, cataract and chemotactic activity have been extensively studied over the years (Rosenbaum et al., 1987;Schneider et al., 2012). Several of the predicted upstream regulators for this network, angiotensinogen (AGT), cAMP responsive element binding protein 1 (CREB1), extracellular signal-regulated kinase (ERK), HIF1A and P38 mitogen-activated protein kinase (P38 MAPK), have been linked to cataract formation (AGT: Taube et al., 2012;Ji et al., 2015;CREB1: Weng et al., 2008;ERK: Iyengar et al., 2007;HIF1A: Chen et al., 2014;P38 MAPK: Bai et al., 2015). Follow-up studies should examine how NO induces ROS and oxidative stress in the triploid fish lens, as well as the involvement of chemotaxis in the development of cataract in triploid salmon.
Two pathways, "ECM-receptor interaction" and "PPAR signaling pathway", were among the most significantly affected KEGG entries based on a direct comparison of DEGs in diploid and triploid salmon lenses with cataracts. These were not listed in the other comparisons. The ECM-receptor interaction pathway, including collagen, type II, alpha 1 (col2a1) and secreted protein, acidic, cysteine-rich (osteonectin) (sparc), has been linked to disorders of the eye characterized by early onset cataract (Bradshaw, 2009). SPARC is a key lens- and cataract-associated protein (Shiels et al., 2010;Sousounis and Tsonis, 2012;Terrell et al., 2015). SPARC is important for normal cellular proliferation and differentiation and is involved in maintaining lens transparency, as shown for mice (Gilmour et al., 1998) and humans (Yan et al., 2000). SPARC is also one of at least 13 proteins harboring mutations that have been associated with a lens or cataract phenotype in mice but not yet in humans (Shiels et al., 2010). In Atlantic salmon, SPARC was suggested to be an "early" up-regulated marker for cataract development (Tröße et al., 2009). Lower expression of sparc suggests that the cataractous triploid lenses might have impaired circulation of fluids, ions, and small molecules, possibly resulting in depolarized membrane resting voltage as shown in mice (Greiling et al., 2009). Cartilage extracellular matrix (ECM) is composed of type II collagen, fibrous proteins and proteoglycans, hyaluronic acid and chondroitin sulfate (Gao et al., 2014). The finding indicates a differentially regulated mechanism linked to cytoskeleton disruption and NO-induced oxidative stress (Gao et al., 2014). Differential regulation of PPARs, which are transcription factors in control of many cellular processes, indicates an effect on lipid metabolism in the lens. 
An effect on lipid/cholesterol transport, previously reported in age-related cataract in humans (Utheim et al., 2008), is suggested by differential expression of apolipoprotein E (apoe). APOE is a major apoprotein that is essential for the normal catabolism of triglyceride-rich lipoprotein constituents (Genecards database), indicating a differential effect on lipoprotein metabolism. Apoe, together with sparc, was among the differentially regulated genes in cataractous lenses of Atlantic salmon fed a low-histidine diet compared to a high-histidine diet (Tröße et al., 2009) and had a lower expression level in the lens of Atlantic salmon fed plant oils compared to fish oils (Remø et al., 2011).
Only five genes from the Cat-Map gene list, an online chromosome map and reference database for cataract in humans and mice (Shiels et al., 2010), showed overlap with the current gene list of cataractous diploid and triploid lenses from salmon (direct comparison). These were, in addition to sparc and apoe, col2a1, gap junction protein alpha 1, 43 kDa (gja1) and retinal pigment epithelium-specific protein 65 kDa (rpe65). Gap junction proteins, also called connexins, are constituents of gap junctions, channels specialized in cell-cell contacts that provide direct intercellular communication. They allow passive diffusion of molecules up to 1 kDa, including nutrients, metabolites (glucose), ions and second messengers (Genecards database). They are especially important for nutrition and intercellular communication in the avascular lens (Hejtmancik, 2008). Mutations in gap junction proteins such as GJA1, present in the lens epithelium, have been linked to human cataracts (Hejtmancik, 2008). RPE65 is a protein located in the retinal pigment epithelium and involved in the production of 11-cis retinal and in visual pigment regeneration (Genecards database). RPE65 has also been associated with Leber congenital amaurosis (LCA), a severe dystrophy of the retina (Weleber et al., 1993). No genes associated with Mendelian (inherited) cataracts or cataracts caused by mutations in transcription factors or metabolic enzymes in humans (Shiels and Hejtmancik, 2019) were on the significant lists in this study. Several of the genes that were differentially expressed in cataractous lenses of triploid salmon have previously been documented to be affected by mutations in the mouse lens (Table 2). By comparing our significant genes with the responses of mammalian orthologs with lens defects or cataract as listed in the iSyTE database (Kakrana et al., 2018), it appears that several may be potential candidate markers for follow-up studies in salmon. 
Apart from the CBP:p300 E9.5 mutation, which seems to down-regulate many of these genes in mice (iSyTE database), several gene knockout mutation types impact the expression of genes from our lists.
Hsp47, also called serpinh1, was one of the genes expressed at lower levels in cataractous triploid lenses than in cataractous diploid lenses. HSP47, localized in the endoplasmic reticulum, plays a role in collagen biosynthesis as a collagen-specific molecular chaperone (Genecards database). Heat shock proteins, found throughout the various tissues of the eye, protect and maintain cell viability under stressful conditions such as those occurring during thermal and oxidative challenges, chiefly by refolding and stabilizing proteins (Urbak and Vorum, 2010). In the human eye, HSP47 has been suggested to aid the control of pro-collagen under stressful conditions and is induced by corneal structure damage (Urbak and Vorum, 2010). In the salmon lens, increased expression of hsp70 has been shown after short-term handling stress (30 min), indicating that HSPs are transcriptionally controlled and act to protect the cells after stress-induced protein misfolding (Tröße et al., 2010). Lower expression of hsp47 in triploid lenses suggests a poorer ability to facilitate proper folding of proteins. It may be speculated that this is linked to the synthesis, accumulation, repair or breakdown of crystallins or other structural proteins responsible for lens transparency (Hejtmancik, 2008). Crystallins make up about 80-90% of the soluble proteins in the lens (Hejtmancik, 2008). Mutations in crystallin genes are one of the major causes of human cataract, and an impaired ability of chaperones to correct misfolding or protein damage may render the triploid salmon lens more vulnerable to the imbalances responsible for cataract formation.
When studying the lens transcriptome, it is important to note that the eye lens mostly consists of fiber cells without nuclei and organelles (Bassnett, 2002). With transcription restricted to metabolically active lens epithelial cells and young fiber cells (Hejtmancik et al., 2015), transcriptional differences between diploid and triploid cataractous fish lenses may generally be small. Furthermore, triploid salmon in general differ from diploids by containing fewer and larger cells in most organs (Swarup, 1959;Small and Benfey, 1987), possibly impacting transcriptional differences.
With 74.4% of the total reads mapped to the novel in-house lens transcriptome, the mapping rate was similar to that obtained when using a fully sequenced genome as reference. In total, however, only 52% of the significant DEGs were annotated using the described pipeline. With manual annotation of all unknowns, about 64% of the DEGs were assigned annotation for IPA functional evaluation. The reason for this relatively poor annotation level is unknown. A good mapping score combined with a poor annotation level might suggest that the lens transcriptome contains a relatively high number of novel transcripts. Among the most strongly differentially regulated genes in both diploid and triploid salmon with cataracts was the CXC chemokine cxcf1a. This is a fish-specific chemokine with no mammalian ortholog (Chen et al., 2013), so its function was not included in the IPA functional analysis. This illustrates one of the limitations of studying the mode of action of cataract in non-model fish species.
In conclusion, this study shows only moderate differences in lens mRNA levels between diploid and triploid Atlantic salmon with score-3 cataract, and very few DEGs with unique expression. Several retina-related genes were differentially expressed in the diploid and triploid lenses. The study indicates that the triploid lens may be more vulnerable to cataract than the diploid lens due to predicted effects on protein degradation and turnover, NO-induced oxidative stress, modified cytoskeleton stability and lipid metabolism, possibly linked to repair and compensation mechanisms. Overall, this study suggests that cataract formation is associated with modest changes in gene expression levels, and that transcriptional controls to a large degree regulate gene expression levels independently of chromosome number in salmon.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.