Data-driven learning of narcosis mode of action identifies a CNS transcriptional signature shared between whole organism Caenorhabditis elegans and a fish gill cell line

an integrative approach for

• Developed an integrative approach for identifying membrane proteins potentially linked to narcosis mode of action (MoA).• Exposure to Narcotics affects a set of genes with a function in CNS in C elegans and in a Gill cell line.• Exposure to Narcotics protects from the effects of drugs/chemicals that induce hyperactivation of Acetylcholine signalling • Gene/metabolite statistical models discriminate narcosis-based from targetspecific inhibition of acetylcholine signaling.

Introduction
It has been estimated that over 350,000 chemicals have been registered for production and use of which many (up to 120,000) remain publicly unknown because they are claimed as confidential or ambiguously described (Wang et al., 2020).Providing views on both human and environmental safety on this number of chemicals poses significant challenges from ethical, financial and capacity perspectives given that the current regulatory frameworks are centred predominantly around in-vivo testing.In this context, it is becoming increasingly evident that the application of in silico and in vitro experimental approaches are key to enabling a reduction of in vivo testing and increasing the throughput to assess this large number of chemicals.This necessary shift in testing provides opportunities to incorporate more mechanistic understanding into risk assessment (Krewski et al., 2010).Key to facilitating this are improvements in classification of both mode of action (MoA) and mechanism of action (MechoA) (Edwards et al., 2016;Meek et al., 2014) where MoA defines a functional change at the cellular level in contrast to the MechoA that defines a specific MIE (Kienzler et al., 2017).Nearly 60 % of all industrial chemicals are predicted to act only as baseline toxicants (Perkins et al., 2015;Verhaar et al., 1992).Narcosis is currently defined as occurring when no other specific modes of action are present and reversible and non-specific toxicity is induced when a threshold concentration in cell membranes is reached during acute exposures.Since narcosis is a function of membrane and tissue concentrations, the toxicity of these chemicals exhibits a strong correlation with lipophilicity.For compounds which induce narcosis, existing approaches such as quantitative structure activity relationships (QSARs) often can be applied with some confidence to subsequently help predict adverse biological impacts on environmentally relevant species as part of risk assessment approaches for assessing chemical safety (European Commission, 2003).However, currently available tools to predict MoA such as the Verhaar scheme (Verhaar et al., 1992), EPA ASTER QSAR application (Russom et al., 1991), and the EPA MOAtox database (Barron et al., 2015) are limited in their ability to provide consensus on predicted MoA (Kienzler et al., 2017).Here we look to investigate the application of the adverse outcome pathway (AOP) framework to help support improved MoA and MechoA discovery (Ankley et al., 2010;Brockmeier et al., 2017;Garcia-Reyero and Murphy, 2018;Willett et al., 2018).The AOP concept enables knowledge to be assembled and combined from all biological levels, both from existing and newly generated datasets, and to be translated in a way which provides both confidence in the mechanistic information and enables it to be more easily applied to risk assessment.More specially, an AOP represents a series of connections which link an initial interaction between a chemical and its biological target, otherwise known as the molecular initiating event (MIE), to an observed apical event (or adverse outcome) of regulatory interest such as mortality and reproduction.These links define the propagation of the MIE through perturbations at different biological levels, i.e. key events (KEs).Whilst both MoA and AOP approaches are important in rationalising risk assessment, they have a key limitation in that the scientific data that is available to robustly support the use AOPs and MoA are limited and cover only a relatively small amount of chemical space.Previously, we have pioneered a computational strategy that integrates the physical-chemical features (PCFs) with transcriptional response and toxicity following a sublethal exposure (Antczak et al., 2015).This approach revealed for the first time that the toxicity of narcotics is linked to the inhibition of calcium ATPase pumps in the crustacean Daphnia magna and that an abnormal increase in intracellular calcium may be responsible for a wide range of effects, including alteration in energy metabolism and decreased heart rate.This work provided an understanding of how the effect of narcotics could be traced back to the function of specific proteins located in the membrane, at least in D. magna.
In this paper, we build on this pioneering work by applying a computational strategy for the systematic identification of membrane proteins potentially linked to narcosis MoA, from the observation of a biological system transcriptional response to exposure.We designed this strategy to identify molecular MIEs of narcosis and develop biomarkers that can discriminate between target-based toxicity and narcosis.Because of its relevance in biomedical research, we choose to focus on the model organism Caenorhabditis elegans (Brenner, 1974;Leung et al., 2008;Xiong et al., 2017) and translate our findings on a fish gill cell line, currently used as a screening tool in ecotoxicology.
Remarkably our results show that the primary biological effect of baseline toxicants on C. elegans occurs in the nervous system with acetylcholine signalling being one of the main affected functions.The molecular biomarkers discovered through this analysis, which included both transcriptomics and metabolomics-based datasets, were able to discriminate narcosisbased from target-specific inhibition of acetylcholine signalling.Interestingly, we demonstrate that narcotics may impact on a similar set of proteins in an in vitro model of fish gill.

Chemical selection
A panel of 30 narcotic chemicals was chosen to represent classic examples of 15 polar and 15 non-polar narcotics.Chemicals were selected on the basis of having well characterised MoA, classified as Class 1 (nonpolar) and Class 2 (polar) narcotics as defined using the Verhaar (modified) decision tree (Verhaar et al., 1992) implemented in ToxTree ver 2.6.13 (Toxicology, 2018), as well as a broad range of physical-chemical features and n-octanol-water partition coefficients (log K ow ).The log K ow values range from −1.47 to 5.03 (Table 1) and were calculated using KOWWIN as part of the USEPA EPISUITE program (Hansch et al., 1977).The exception to this was for the surfactant, dodecylbenzenesulfonic acid sodium salt, for which it has been shown that a modified approach of Hansch et al. (Hansch et al., 1977), which forms the basis of the ClogP method (Biobyte, 2011) incorporating a method modification, provides the optimised calculated log K ow value (Roberts, 1991).An additional panel of 12 acetylcholine esterase (AChE) inhibitors, classified as class 4 according to the Verhaar scheme, were selected to compare gene expression patterns and EC 50 responses between basal toxicants and AChE inhibitors which also represent a broad range of log K ow values (Table 1).All chemicals were purchased from Sigma-Aldrich (Dorset, UK) and CAS numbers along with additional information are provided in Table 1 and Supplementary Data 1.

EC 50 testing
C. elegans wild-type Bristol N2 strain (referred to below as "N2s") were obtained from the Caenorhabditis Genetics Center (Minneapolis, MN, USA).For all in vivo experiments, stage-matched N2s were obtained by bleaching three-35 cm plates of mixed population N2s to obtain eggs and subsequent L4 stage worm (Kashyap et al., 2014;Stiernagle, 2006).EC 50 tests were conducted by calculating the percent of C. elegans dead (i.e., immobilized) after exposure to the test chemical in K-media (31.68 mM KCl and 51.37 mM NaCl) using a modification of the WormScan procedure (Mathew et al., 2012).In brief, L4 worms hatched from bleached eggs were transferred via glass pipette to a 24-well Costar plate with 700 mL of K-media and 1 % v/v of the test chemical in DMSO.Exposures were conducted at 25 °C for 24 h with 7 doses per chemical and 4 replicate wells per dose.EC 50 values were calculated as the percentage of worms alive measured by aligned scans to determine worm movement using ImageJ.The validation of the light-induced EC 50 method versus the traditional touch test was confirmed using aldicarb (Fig. S1).EC 50 values were calculated using the drc package in R. When compared to previous EC 50 test results for C. elegans (Li et al., 2013;Ristau et al., 2015), we find a unusually high amount of deviation from the regression line (Fig. S2).We reasoned that this could be due to an incorrect estimate of the true dose due to reduced solubility or increased volatility of the narcotic chemicals.To address this issue, we modelled each chemical's free concentrations using the Top Line model (Armitage et al., 2014) and found that the modelled aqueous concentrations greatly improved the accuracy of the basal toxicity model (Fig. 1A).Free and internal concentrations are reported in Supplementary Data 1 and were determined using the TopLine model (Armitage et al., 2014) using the parameters indicated in Fig. S3.

C. elegans exposure, chemical analysis and RNA sample preparation
Exposures for microarray analysis and metabolomics data generation were conducted on L4 worms at 1/10 of the calculated EC 50 for a 24 h exposure.This represents a low dose at which overt toxicity via narcosis was not expected to occur.Chemical analysis was conducted on a parallel exposure and samples collected at 0 h and 24 h to determine chemical loss in the exposure system.Chemical analysis was conducted using an Agilent 1260 Infinity High Performance Liquid Chromatography (HPLC) or a Perkin Elmer Clarus® 500 Gas Chromatography Mass Spectrometer (Additional Information).Details of the analytical methods used for each chemical and the corresponding measured concentrations at the start of the experiment and 24 h post exposure are summarized in Supplementary Data 1.To account for the impacts of chemical solubility and volatility on the expected delivered concentrations, we calculated the free and internal concentrations from measured total concentrations for each chemical using the TopLine model (Armitage et al., 2014).This model determines the dissolved (i.e.bioavailable) concentration of chemical in an aquatic system and whilst not developed for C. elegans assays specifically it has enabled the visualisation of C. elegans EC 50 results with reduced variability.Fig. S3 includes a summary of the TopLine model parameters.All calculated nominal, free, and internal EC 50 values are provided in Supplementary Data 1.
Exposures were conducted in 24-well plates at 25 °C at 1 % v/v of the chemical in DMSO.Three biological replicates were generated for each chemical.Replicates with vehicle controls (1 % DMSO) were included in each experimental batch for both microarray and metabolomics sample generation to control for batch effects between different experimental dates.C. elegans samples were flash-frozen in liquid nitrogen and stored at −80 °C until further analysis.
For RNA extraction, a modified TRIzol protocol was used.In brief, frozen C. elegans samples were thawed and flash frozen in 400 μL of TRIzol, with the freeze-thaw step repeated five times.200 μL of additional TRIzol was added to each sample and 140 μL of chloroform used to separate RNA from proteins by high-speed centrifugation (15 min at 12,000 RPM, 4 °C).RNA was precipitated using an equal volume 70 % ethanol and samples were purified using the RNeasy kit (Qiagen, Hilden, Germany) following the manufacturer's protocol.RNA quality was evaluated using the NanoDrop and the triplicate RNA samples from each chemical treatment were pooled into a single sample per treatment; control samples were not pooled.

Immortalised trout gill cell line exposure and RNA sample preparation
RTgill-W1 cells (ATCC CRL-2523) were cultured without antibiotics in Leibovitz's L-15 medium (Gibco, Thermo Scientific, UK), supplemented with 10 % FBS (Gibco) at 20 °C in air.For dosing, RTgill-W1 were plated at 6 × 10 4 cells per well of a 12-well plate (Nunc, Thermo Scientific).Cells were maintained for 24 h before treatment with chemicals (Supplementary Data 2).Cells were washed once with PBS before chemicals or DMSO controls were added in triplicate in separate wells in 1 mL of L-15 + 5 % FBS and incubated for 24 h before RNA was extracted.RNA extractions were performed using the RNeasy kit from QIAGEN (Manchester, UK).Cells were washed with PBS before cells were lysed and triplicates pooled in 350 μL of Buffer RLT.Lysates were then passed ≥5 times through a 20-guage needle to thoroughly homogenise.An equal volume of 70 % ethanol was added to each sample before proceeding with the RNeasy protocol as per manufacturer's instructions.RNA purity and yield were quantified using a QIAxpert (QIAGEN) spectrophotometer and analyzed using the RNeasy settings.RNA quality was assessed using a Tapestation 2200 (Agilent, Manchester, UK).All samples had RIN values of >8.0.
(CK14 ceramic beads), dried in a speedvac concentrator, and reconstituted for LC-MS analysis by an UltiMate 3000 ultrahigh performance LC (UHPLC) and Q Exactive Focus mass spectrometer (MS; Thermo Fisher Scientific).
Samples were analyzed on a reverse phase aqueous C18 method (Hypersil GOLD aQ, 100 × 2.1 mm, 1.9 μm LC column; Thermo Fisher Scientific) with 0.1 % formic acid in water and 0.1 % formic acid in acetonitrile as solvents (column temperature of 45 °C, flow rate of 300 μL/min).UHPLC-MS data were collected in both positive and negative ion modes at a resolution of 70,000.
For metabolite data pre-processing, UHPLC-MS raw data profiles were converted into a NetCDF format.Each NetCDF based three-dimensional data matrix (intensity x m/z × timeone per sample) was deconvolved into a vector of peak responses, where a peak response is defined as the sum of intensities over a window of specified mass and time range.The quality of the metabolomics data was assessed by calculating the relative standard deviation (RSD) of each m/z feature across a series of intrastudy quality control (QC) samples; all metabolite features with an RSD >20 % (in QC samples) were removed from the entire dataset due to insufficient technical reproducibility.The data for each sample was normalized as a percentage to the total peak area for all metabolites in the sample.Metabolite annotations were performed applying the PUTMEDID_LCMS workflow as previously described (Dunn and Winder, 2011).

Microarray analysis
mRNA was used as input for cRNA synthesis using the LowInput QuickAmp Labelling Kit (Agilent, Santa Clara, USA).Cy3-labeled cRNA was generated using the manufacturer's protocol and purified using RNeasy columns (Qiagen, Hilden, Germany).The labeled cRNA was then hybridized onto custom 8 × 15 k C. elegans slides (Design ID 071829) or to 8 × 15 k SurePrint HD trout custom gene expression microarrays (Design ID 076584).Slides were washed and immediately scanned using the Agilent G2565CA Microarray Scanner.

Differential gene expression analysis
Batch effects linked to scanning date were first removed by processing raw data with the ComBat R package (Chen et al., 2011) (Fig. S4).The data were then normalized using quantile normalization.
Genes differentially expressed in each treatment respect to its control were identified using a two-tailed error model as described in Stekel et al. (2005) (Fig. S5).
Comparison between different sample classes was performed using Significance Analysis of Microarrays (SAM) (Tusher et al., 2001).Genes were considered significant when FDR10%.
Functional enrichment analysis of gene lists was performed using gene set enrichment analysis (GSEA) (Subramanian et al., 2005), with a cut-off of 10 % FDR for selection of significant Gene Ontology terms (Consortium, 2015) and Kyoto Encyclopedia of Genes and genomes (KEGG) pathways (Kanehisa, 2002).

Identification of membrane proteins potentially linked to narcosis MoA
Fig. 2 provides a graphic representation of the membrane-related gene expression analysis procedure.In brief, gene expression data from all chemicals was used in a membrane gene seeded Spearman rank correlation analysis, constructing a network of all genes correlating with a membrane gene FDR = 0 and FDR <1 % for the C. elegans and the Trout gill cell line, respectively, with an absolute correlation ≥0.7.From these significant relationships, highly connected membrane hub genes were defined into gene sets, using positively correlating immediate neighbors.The resulting gene sets, representing potential narcotics molecular MIEs, were used in a leading edge GSEA, to test whether they were significantly linked to narcosis type and/or lipophilicity.Significant gene sets were defined by having <1/10 4 false positives.Groups of significant membrane genes were further summarized for functional enrichment using DAVID 6.7 (Huang et al., 2009a(Huang et al., , 2009b)).

AChE and baseline toxicant co-exposures in C. elegans
To determine the effects of pre-exposure of acetylcholine signalling inhibitors on the effect of narcotic compounds, a time-to-paralysis assay was conducted (Mahoney et al., 2006).A random subset of narcotics from the panel of 30 chemicals was chosen and doses were selected from a range finding study at concentrations where ~50 % of the exposed C. elegans were paralyzed after 120 min.C. elegans were exposed to the narcotic for 20 min in K-media followed by transfer to agar plates with either 0.25 mM or 0.5 mM levamisole (an AChE antagonist that affects ach release and generally indicative of effects on pre-synaptic signalling) or aldicarb (an AChR agonist generally reflective of post-synaptic effects) respectively.A total of 20C.elegans were included in each assay and each assay was repeated five times, with a control experiment (1 % v/v DMSO) was included during each experiment. .elegans exposed to a panel of 30 narcotic chemicals.Exposures were conducted for 24 h and % lethality per concentration was measured using a modification of the WormScan/touch test.EC 50 values were calculated over a range of 7 concentrations and were plotted against the measured log K ow value as A) calculated free concentrations and B) calculated internal concentrations.Chemicals are indicated by their ID as well as whether they were classified as nonpolar (red) or polar (blue).

Chemical class prediction
In order to explore whether molecular response to exposure can be used to discriminate different classes of chemicals, we used a genetic algorithm (GA) based methodology coupled with a KNN classification method as implemented in the R package GALGO (Trevino and Falciani, 2006).The modelling procedure was run using with a model size of 3 and setting the fitness value for model selection to overall accuracy larger than 90 %.Model accuracy was estimated by 0.632+ bootstrap method which has been shown to outperform cross-validation (Efron and Tibshirani, 1997).
A total of 5000 statistical models were selected by the GA procedure and a forward selection strategy was employed to develop a representative model with the highest accuracy and the smallest number of variables.

Overview of the study and analysis strategy
The overarching goal of this study was to identify membrane proteins potentially linked to narcosis MoA and develop biomarkers that can discriminate between target-based toxicity and narcosis forming the basis of a biological rationale for chemical grouping.An integrated approach including data acquisition, computational analysis, hypothesis generation and experimental validation was developed.This strategy is summarized in five interconnected steps (Fig. 3).
Step 1-Experimental characterization of the effect of narcotics on C. elegans: We used C. elegans as a model to characterise the transcriptional and metabolic response of an organism to polar and non-polar narcotics.Compound toxicity, expressed as a EC 50 value, was computed for each compound using an exposure model that allowed us to estimate the effective exposure dose.
Step 2-Identification of membrane proteins potentially linked to narcosis MoA in C. elegans: Using our computational target identification strategy (detailed in Fig. 2), we identified membrane protein encoding genes whose activity is potentially perturbed in a lipophilicity dependent manner.This analysis showed that neuronal functions are affected following exposure to narcotics chemicals.
Step 3-Experimental validation: We identified acetylcholine signalling as one of the most important functions that is potentially affected by narcosis.We validated this hypothesis by showing that narcotics antagonise drugs that over-activate acetylcholine signalling.
Step 4-MoA classification: Having identified acetylcholine signalling as being potentially modulated by narcotics, we showed that transcriptomics and metabolomics profiling is able to discriminate AChE inhibitors from polar and non-polar narcotics.
Step 5-Assessing the potential of an in vitro testing system: We applied the whole procedure to a trout gill cell line with the purpose of testing whether an in vitro system representing a taxonomically divergent organism could be used to infer membrane proteins potentially linked to narcosis MoA in C. elegans.Our analysis showed that indeed this is the case.

Toxicity response profiles of selected chemicals are consistent with baseline toxicants
The first step in this study was to select 30 chemicals, representing an equal proportion of polar and non-polar narcotics chemicals and characterise the molecular response of C. elegans following exposure (Table 1).Since the Verhaar (modified) decision tree (Verhaar et al., 1992) that we used for chemicals selection is primarily based on a fish data, we set to validate experimentally that the relationship between K ow and EC 50 was what we expect from bona fide narcotics.Once the data were modelled to represent the chemicals' free concentration (topLine model) we found that compound aqueous concentration fit comparatively well a basal toxicity model (Fig. 1A).Narcosis occurs when a critical concentration is achieved in membranes and baseline toxicants are defined as being equipotent when normalized against internal concentrations (Escher and Hermens, 2002); in our dataset we also saw consistent levels of internal modelled concentrations, confirming that these chemicals were indeed acting as baseline toxicants in our experimental system of choice (Fig. 1B).

Molecular response to narcotics exposure correlates with compound lipophilicity and narcosis type
Comparison of gene expression between exposed and unexposed organisms revealed a wide range of responses across individual chemicals (Supplementary Data 3).At the functional level, using GSEA, we could identify 74 KEGG pathways which were differentially perturbed by exposure to individual narcotics (Fig. 4A).Pathways which were most frequently represented included general toxicity response (metabolism of xenobiotics by cytochrome P450), several metabolic pathways (carbon metabolism, fatty metabolism, pyruvate metabolism, citrate cycle, propanoate metabolism and arginine and proline metabolism) and, interestingly neuroactive ligand-receptor interaction.A relatively simple unsupervised cluster analysis performed using the functional enrichment profile of each chemical response partially separated polar and non-polar compounds (Fig. 4A).More specifically, the cluster enriched with non-polar compounds has 13 chemicals (9 are non-polar and 4 are polar), whereas the cluster enriched with polar compounds has 12 chemicals (8 are polar and 4 are non-polar).This separation (p = 0.02) was consistent with our initial hypothesis that the molecular response to a sub-lethal concentration of narcotics could be diagnostic of narcosis type.
Having shown that individual chemicals elicited a detectable transcriptional response we then looked to identify genes, differentially expressed between polar and non-polar narcotics and/or correlated to a measure of compound lipophilicity (K ow ).
The direct comparison between polar and non-polar compounds revealed 85 differentially expressed genes, all of which were up-regulated in response to polar chemicals.
The correlation analysis linking gene expression profiles to K ow only revealed 12 significantly correlated genes (Table S1, FDR ≤10 %).Among these, four were functionally annotated and included a member of the srx chemoreceptor gene family (srx-6), a gene involved in trachea development (hlh-12) and two of the CO 2 /O 2 neuroglobin receptors (glb-9 and gbl-7).Interestingly, all of the negatively correlated genes were of neuronal function.
However, visual inspection of the heatmaps of the top 100 most correlated genes (Fig. 4B) showed that, similarly to what we previously reported in Daphnia magna (Antczak et al., 2015), exposure to narcotics triggered a sharp transcriptional switch, which in this organism appeared between K ow = 2.18 and K ow = 2.37 (Fig. 4B).The sharp change in gene expression makes a direct comparison between low-and high-K ow compounds a more suitable methodology.In fact, direct comparison between those two groups of chemicals revealed 219 genes differentially expressed at the same FDR threshold, all down-regulated in response to highly lipophilic compounds (K ow ≥2.37).
The functional analysis of the differentially expressed genes showed a similar functional profile between genes associated with K ow and narcosis type (Fig. 4C and D respectively).In both cases the most represented functional term was the gene ontology cellular component term Membrane.Additional terms for the K ow related gene signature suggested neuronal functions (neuropeptide signalling pathway, neuronal cell body and neuroactive ligand-receptor interactions) (Fig. 4C).For the narcosis type gene signature, we identified Heme binding, and oxidoreductase activity (Fig. 4D).Interestingly, six genes out of the 85 differentially regulated genes were associated with neuronal functions.These Fig. 3. Schema of the study and analysis strategy.
were the neuronal localised genes NAPE-1 and SRD-23, the neuronal development gene EGL-20, the acetylcholine receptor subunit gene DEG-3, the gustatory receptor GUR-5, and the potassium channel TWK-7.We then asked whether the metabolomics dataset could reveal differences linked to K ow and narcosis type.While we could detect significant metabolite changes in response to every single chemical exposure  (Table S2), we could not identify metabolites that were significantly correlated with K ow or at different concentration when comparing polar and non-polar chemicals.

An in-silico approach shows that narcosis predominantly impacts neuronal functions in C. elegans
Having demonstrated that the transcriptional responses to narcotic exposure were linked to both log K ow and narcosis type, we then set to identify membrane proteins potentially linked to narcosis MoA in C. elegans.We reasoned that these are likely to be membrane proteins whose activity is affected by the indirect interaction with highly lipophilic chemicals.Therefore, we developed a relatively simple computational approach to identify such membrane proteins (Fig. 2).We first identified the co-expression neighbourhood of each one of the genes that encode for a membrane protein in C. elegans (Fig. 2, Step 1).We then used GSEA to test whether genes co-expressed with a given membrane gene were also correlated with K ow or narcosis type (Fig. 2, Step 2).Finally, membrane proteins that were linked to either narcosis type or lipophilicity were mapped on the KEGG pathway database to facilitate biological interpretation (Fig. 2, Step 3).This analysis revealed 215 membrane genes linked to K ow (Fig. 5A) and 74 membrane genes encoding for membrane proteins linked to narcosis type (Fig. 5B).Putative proteins associated with K ow were partially overlapping with those associated with narcosis type (17 overlapping -genes) but this overlap was not statistically significant (Fig. 5E-F).
Interestingly, pathway-level analysis of K ow and narcosis type-linked membrane protein genes revealed highly overlapping neuronal functions, specifically ion transport and Acetylcholine signalling (Fig. 5C-D).These more specifically included a set of ion channels (potassium, sodium and chloride channels) and neuropeptide receptors (Fig. 5G).We also detected a smaller number of gene specifically linked to K ow and involved in reproduction, lipid transport and mitochondria energy metabolism, suggesting that exposure to narcotics may affect a broader range of functions than the CNS.

Narcotic chemicals protect from AChE inhibitor exposure
Our analysis identified acetylcholine binding and acetylcholine signalling as enriched functional pathways in both K ow and narcosis type analysis.Genes encoding for different classes of acetylcholine receptors are among the top scoring putative MIEs.More specifically, GAD-3, which encodes for the muscarinic acetylcholine receptor is the highest scoring gene in the K ow analysis, followed by DEG-3, which encodes for a subunit of the nicotinic acetylcholine receptor.Similarly, the nicotinic acetylcholine receptor encoding gene ACR-17 was the highest scoring protein for the narcosis type analysis, followed by DEG3.These observations support the hypothesize that in C. elegans, acetylcholine signalling is one of the key functions impacted by narcotics but importantly this is irrespective of narcosis type.
In order to assess whether this hypothesis was correct, we set up a coexposure experiment using either a post-synaptic acetylcholine receptor activator (levamisole) or the pre-synaptic AChE antagonist (aldicarb) in combination with narcotics.Levamisole and aldicarb induce paralysis by serving as AChR agonists and AChE antagonists, respectively.We hypothesized that if narcotics inhibit acetylcholine signalling, they may have a protective effect during exposure to levamisole and aldicarb.Indeed, our hypothesis proved to be correct.Pre-exposure with all narcotic chemicals tested could induce a significant protective effect in the presence of levamisole whereas only one of the polar chemicals (aniline) had a significant protective effect during co-exposures with aldicarb (Fig. 6A-D).
3.6.Molecular response to exposure can be used to predict MoA in C. elegans The results described above are consistent with the hypothesis that narcotics affect the function of acetylcholine signalling.We therefore wondered whether it may be possible to identify biomarkers that can distinguish between narcotics, which we assume act via an indirect MoA, and chemicals that act by specific AChE inhibition.We tested this hypothesis using a machine learning algorithm to optimise a classifier that could distinguish between polar and non-polar narcotics.The application of this search algorithm coupled with a KNN classifier could accurately distinguish between polar and non-polar narcotics regardless of the input data (Fig. 7).We searched for optimised models from the whole transcriptional dataset as well as from the subset of the data representing the hypothetical membrane proteins linked to narcosis MoA and also using metabolomics data obtained in the same conditions using a mass spectrometry approach.More specifically, the model developed using the whole transcriptome as an input identified a model based on 14 genes (90 % accuracy).Four of these were functionally characterised genes (Fig. 7A and Table S3).The model developed using putative membrane proteins linked to narcosis MoA was based on 23 genes (90 % overall accuracy), 19 of which were functionally annotated (Fig. 7B and Table S4).Consistent with the functional composition of the membrane proteins linked to narcosis MoA, several genes in the model were linked to neuronal functions and included the acetylcholine receptor component DEG3, which appeared to be expressed at higher levels in samples exposed to polar compounds.Interestingly, despite the fact that we did not find any difference in the abundance of individual metabolites, we could develop a predictive model based on metabolomics data.This was based on 49 m/z features (93 % overall accuracy) (Fig. 7C and Supplementary Data 4).This is consistent with the expectation that multi-variate models may be more effective discriminators than univariate approaches.
Since we could show that the MoA of narcotic chemicals on a C. elegans whole organism essay involves an interaction with Acetylcholine signalling, we set to ask whether the transcriptional profile of exposed organisms could be used to discriminate between chemicals that act on a non-specific (narcotics) and specific MoA.We therefore selected 12 acetylcholinesterase inhibitors and performed an expression profiling analysis (Table 1).We then applied the same modelling approach to predict whether a chemical would be a polar or non-polar narcotic or an acetylcholinesterase inhibitor.The model we developed could predict narcosis type with great accuracy (88 % overall accuracy) (Fig. 7D and Table S5).

An in vitro system based on a trout gill cell line identify narcosis biomarkers that are functionally equivalent to organism-level biomarkers
Our work in C. elegans is consistent with the hypothesis that sub-lethal concentrations of narcotic chemicals impact proteins associated with CNS function including acetylcholine signalling and several ion channels (Fig. 5).We have also shown that both gene expression profiling and metabolomics can be used effectively to identify biomarkers that discriminate between polar and non-polar compounds as well as chemicals that affect acetylcholine signalling via a target-specific mechanism.While this is certainly encouraging, scaling up an organism-based test for chemical MoA assessment and risk assessment is a non-trivial task.We reasoned that if we could develop a similar expression profiling-based essay in a cell culture system, it would be more widely applicable, and we wondered whether such in vitro essay could reveal molecular MIEs that were indicative of a whole organism biological effect.
To address these important questions, we set to reproduce the analysis of narcotic related genes in a trout gill cell line, using the same 30 narcotics chemicals used for the C. elegans analysis.
While we could not detect differentially regulated genes between polar and non-polar chemicals, we could find evidence of a correlation between gene expression and K ow by using GSEA.More precisely, we could identify four functional terms linked to compound lipophilicity.These were the three GO terms, glutamate receptor binding, ionotropic glutamate receptor binding and skeletal muscle contraction and the KEGG pathway metabolism of xenobiotics by cytochrome P450.
Moreover, following the procedure outlined in Fig. 4, we could identify several membrane proteins encoding gene sets linked to K ow (Fig. 8A, 218 positively and 383 negatively correlated) and to compound class (polar versus non-polar) (Fig. 8B, 184 positively and 132 negatively correlated).
Despite the phylogenetic distance between C. elegans and rainbow trout and the difference between whole organism and a gill cell line, we could show that the putative molecular MIEs were including a similar set of CNS receptors and channels.When the list of putative molecular MIEs was examined in detail, we could almost mirror the set of receptor/signalling complexes identified in the C. elegans analysis, including Acetylcholine signalling (Fig. 5G and Fig. 8G).Similarly, to the C. elegans analysis we could develop predictive models discriminating between polar and non-polar compounds with relatively high (75 %) accuracy (Fig. 8H and Table S6 and Fig. S6 and Table S7).

Discussion
We have shown that by using a combination of omics technologies and computational analysis it is possible to identify the MoA of a class of chemicals that has been elusive for several decades.Importantly, we also show that is possible to develop a computational model that can rapidly predict the MoA, differentiating between specific and nonspecific acting compounds.Finally, we have shown that membrane proteins linked to narcosis MoA may be conserved between C. elegans whole organism and an in vitro cultured gill cell line.All together this work ultimately contributes to the overarching goal of building a functional genomics based chemical classification/ grouping and risk assessment toolset that can be informative of a compound MoA and its potential biological and toxicity effects.

A mechanism for baseline toxicity
Current understanding of narcosis MoA focusses around the destabilising effects of lipophilic compounds on biological membranes.This has led to the dogma that narcotics act on a general toxicity mechanism that affect a wide range of biological functions.The work of our group and others have recently brought to the attention that exposure to basal toxicants may favour the expression of particular genes more than others which results in adverse effects that account for the toxicity observed in a wide range of organisms.Our work in the crustacean D. magna for example suggests that membrane perturbation may cause alterations in intracellular calcium concentrations with downstream effects on hart rate as well as alterations in the efficiency of mitochondria energy metabolism.
Here we show that in C. elegans both polar and non-polar chemicals may act mainly on neuronal functions, including acetylcholine signalling.The gene Ric-3 is positively enriched in the log K ow correlation dataset.Ric-3 is a gene which is known to confer resistance to inhibitors of cholinesterase.This gene is required for the maturation of several AChE receptors, including deg-3/des-2, and is expressed in pharyngeal, body wall muscles, and the majority of C. elegans neurons (Millar, 2008).Deg-3 was also negatively enriched in the log K ow correlation dataset, which works in operation with des-2 to function as an AChE receptor.Deg-3 is nematode-specific and is expressed in sensory neurons and its enrichment in sensory ending helps the gene play a role in sensory transduction (Yassin et al., 2001).Other genes found to be negatively enriched include lev-8; mutants for this gene exhibit mild levamisole resistance and nearly normal locomotion in the presence of acetylcholine esterase inhibitors (Lewis et al., 1980).Gar-3 is also negatively enriched and is the most similar protein to vertebrate muscarinic AChE receptors; in C. elegans it is involved in the control of the pharynx and its muscle membrane potential (Steger and Avery, 2004).Based on the results of the co-exposure experiments and the presence of these genes in the dataset, we hypothesize that nonselective modification to acetylcholine signalling membrane proteins are responsible for the biological effects of baseline toxicity in C. elegans.
Previous work in C. elegans also demonstrated the importance of key sensory and acetylcholine signalling proteins on the behavioural effects of ethanol, a non-polar narcotic.Ethanol exposure causes significant changes to nematode movement, behaviours such as locomotion are governed by motor neurons that can be modified by sensory information received from the pharynx.This interaction is non-specific but does strongly impact neurons and proteins that are located in the pharynx of C. elegans, which is the first structure to come in contact with a chemical during exposure.Mutations in certain proteins alter the physiological or phenotypic effects of ethanol, including proteins involved in exocytosis (e.g.munc-18 and rab-3) as well as neurotransmitter receptors including GABA, glutamate, and serotonin (Johnson et al., 2013).Other proteins connected to the physiological effects of ethanol exposure include vesicle trafficking proteins, soluble NSF attachment protein receptors (SNAREs) (Barclay et al., 2010), connexins, vesicle priming factors, members of the Rab family, and synaptic chaperones (e.g.HSC70 and CSP) (Johnson et al., 2013).Results seen in this study also demonstrate effects in the CNS after exposure to ethanol, with specific pathways affected by ethanol including neuroactive ligand-receptor interaction (CEL04080), endocytosis (CEL04144), and calcium signalling (CEL04020).
While the bioinformatics identification of polar and non-polar chemical molecular MIEs suggests that both classes of chemicals act on the same receptors and ion channels, the co-exposure experiments with the pre-Fig.7. Optimised machine learning models predictive of narcosis type and MoA.The figure shows heatmaps of gene expression and metabolite levels for genes and metabolites that have been selected to predict narcosis type and MoA.Cluster analysis shows that the selected genes are capable of partially separating the compound classes, even outside of the predicting algorithm.Only the functionally annotated genes are shown at the side of the heatmap.Panels A-D show models developed to predict polar versus non-polar compounds.Panel A shows a model selected from the entire transcriptome while panel B shows a similar model selected from the computationally inferred genes.Panel C shows the model developed with metabolomics data.Panel D shows the model developed to separate polar-to non-polar and acetylcholine esterase inhibitors.Gene names are reported on the right side of the heatmap in the event that these were annotated with a potential known protein.Full list of genes and m/z features are available in the Supplementary material (Tables S3-S5 and Supplementary data 4). .Panel E and F shows the overlap of positively and negatively enriched gene sets for the K ow and the narcosis type analysis, respectively.Panel G represents in a cartoon format the main genes associated to acetylcholine signalling and ion transport.Panel H shows the heatmap of genes expression based on the model selected from the entire transcriptome.Cluster analysis shows that the selected genes are capable of partially separating the compound classes, even outside of the predicting algorithm.Full list of genes are available in the Supplementary material (Tables S6-S7).and post-synaptic drugs aldicarb and levamisole suggest that the largest interaction effects is for polar chemicals at the post-synaptic level.The observation that the receptor components for acetylcholine LEV-8 and DEG-3) are negatively correlated with compound lipophilicity (K ow ) and downregulated in the comparison between polar-and non-polar chemicals is indeed consistent with the results of the co-exposure experiment and with the hypothesis that narcotics would inhibit the activity of postsynaptic acetylcholine receptors and that polar compounds may be more effective in doing so.4.2.Why a fish gill cell line can be diagnostic of whole organism neuronal MoAs?
The observation that neuronal proteins are also modulated by exposure to narcotics in the in vitro trout fish cell line model and is perhaps the most unexpected and intriguing finding in this project.However, a closer look at the biology of fish gills and recent observations on the role of acetylcholine in hyperventilatory response to acute hypoxia in fish gills may explain our finding.For example, several of the Na, K and Cl ion channels that have neuronal functions also have a function in maintaining osmotic balance in the fish gill and their function can, therefore, be affected by narcotics.Zachar et al. have recently identified cholinergic cells in the gill filament epithelium in zebrafish (Danio rerio) (Zachar et al., 2017).The authors conclude that cholinergic cells in the zebrafish gills were present in the primary epithelium of gill filaments and formed contacts with nerve fibres.

Relevance of our work in risk assessment
Current approaches for screening chemicals for baseline toxicity include the use of the narcosis QSAR, included in the EpiSuite software and described under TSCA (U.S. EPA, 2016), as well as the OECD Toolbox application within REACH (Perkins et al., 2015).This current approach for assigning MoA is based on acute toxicity data for aquatic species and is unable to account for sublethal, chronic, or additive effects (Escher and Hermens, 2002).Using a data-driven learning approach, we were able to classify whether a chemical was a polar, non-polar, or an AChE inhibitor (Fig. 7).Applying this type of statistical modelling approach in future studies can further enable researchers to utilize molecular-level readouts to predict chemical class and conduct appropriate screening approaches before a chemical is required to undergo thorough evaluation and complete risk assessment procedures.
Our study also reveals the importance of species sensitivity differences considerations while utilizing the AOP framework.Results and pathways found to be significant in baseline toxicity in D. magna (Antczak et al., 2015) were not completely aligned with those found in our study, likely due to exposure route and physiological differences.These results highlight the importance of considering species physiology and exposure routes when devising AOPs and screening tools for chemical classification.In C. elegans, exposure to a high K ow chemical, even at a dose below the EC 50 , causes impacts on acetylcholine signalling genes, whereas for D. magna the primary effect was mitochondrial toxicity.Both species were exposed to the same class of chemicals which elicited effects on biological membranes; however, the downstream results to the organism's physiological functions were different.
Our results with the fish gill cell line show that an in vitro system can not only be effective to distinguish between polar and non-polar compounds but also it can be effective in providing useful indication on the type of molecular functions that may be modulated hence opening interesting possibilities in terms of species extrapolation.

Fig. 1 .
Fig.1.Comparison of log EC 50 responses in C. elegans exposed to a panel of 30 narcotic chemicals.Exposures were conducted for 24 h and % lethality per concentration was measured using a modification of the WormScan/touch test.EC 50 values were calculated over a range of 7 concentrations and were plotted against the measured log K ow value as A) calculated free concentrations and B) calculated internal concentrations.Chemicals are indicated by their ID as well as whether they were classified as nonpolar (red) or polar (blue).

Fig. 2 .
Fig. 2. Overview of the methodology to identify putative membrane molecular MIEs of narcosis.

Fig. 4 .
Fig. 4. Effects of polar and non-polar narcotics on C. elegans.Panel A shows a two-dimensional clustering of compounds and the activity of pathways enriched of genes differentially expressed between organisms exposed to each compound and their respective controls.Red and green shows significant up and downregulation, respectively.Panel B shows the heatmap of the 100 most highly correlated genes to compound K ow .Columns are compounds ranked by increasing K ow .The yellow line shows the separation between low-K ow compounds and high K ow compounds defining a transcriptional switch.Panel C represent a bar chart with the KEGG pathways enriched in the gene list resulting from a comparison between low and high K ow compounds.Panel D represent a bar chart with the KEGG pathways enriched in the gene list resulting from a comparison between non-polar and polar compounds.

Fig. 5 .
Fig. 5. Computationally inferred molecular MIEs of basal toxicity in C. elegans.Panels A and B plots the enrichment score (ES) for each membrane protein co-expression gene set (x axis) in relation to K ow (panel A) or narcosis type (panel B), ranked by descending order (y axis).Numbers in yellow specify how many gene sets are enriched of negatively correlated genes or positively correlated genes to K ow or of up-or downregulated genes in polar respect to non-polar compounds.Panels C and D represent bar charts with the KEGG pathways enriched in the gene list resulting from K ow linked genes (panel C) or narcosis type linked genes (panel D).Panel E and F shows the overlap of positively and negatively enriched gene sets for the K ow and the narcosis type analysis, respectively.Panel G represents in a cartoon format the main genes associated to acetylcholine signalling and ion transport.

Fig. 8 .
Fig.8.Computationally inferred genes linked to basal toxicity in the in vitro trout gill cell line.Panels A and B plots the enrichment score (ES) for each membrane protein co-expression gene set (x axis) in relation to K ow (panel A) or narcosis type (panel B), ranked by descending order (y axis).Numbers in yellow specify how many gene sets are enriched of negatively correlated genes or positively correlated genes to K ow or of up-or downregulated genes in polar respect to non-polar compounds.Panels C and D represent bar charts with the GO terms and KEGG pathways enriched in the gene list resulting from K ow linked genes (panel C) or narcosis type linked genes (panel D).Panel E and F shows the overlap of positively and negatively enriched gene sets for the K ow and the narcosis type analysis, respectively.Panel G represents in a cartoon format the main genes associated to acetylcholine signalling and ion transport.Panel H shows the heatmap of genes expression based on the model selected from the entire transcriptome.Cluster analysis shows that the selected genes are capable of partially separating the compound classes, even outside of the predicting algorithm.Full list of genes are available in the Supplementary material (TablesS6-S7).

Table 1
List of the full panel of Class 1 (non-polar narcotic), Class 2 (polar narcotic) and class 4 (AChE inhibitor) test compounds along with MoA classification and log K ow .