Natural genetic variation in the pheromone production of C. elegans

Significance
Pheromones are signaling molecules used for chemical communication in social interactions. Although diverse pheromone classes from diverse species have been identified, little is known about how chemical languages evolve. To study the genetic basis underlying natural diversity in animal chemical language, we analyzed the secreted pheromones of 95 whole-genome sequenced wild Caenorhabditis elegans strains. We characterized the genetic architectures underlying natural differences in the production of 44 ascarosides, which represent the nematode’s primary pheromone class. Our study uncovered “hot spot loci” that broadly impact ascaroside production, as well as inverse correlations between two major classes of ascarosides. Our findings provide insights into how metabolism and chemical communication are coupled in evolution.

Fig. 1. (A) Simple ascarosides consist of the dideoxy sugar ascarylose (brown) and an FA-like side chain of varying length (blue). Modular ascarosides incorporate additional building blocks from other primary metabolic pathways, e.g., indole-3-carboxylic acid (icas#, green) derived from tryptophan, the neurotransmitter octopamine (osas#, green), and a variety of C-terminal modified ascarosides, including folate derivatives (ascr#8, ascr#81) and several glucosides (black) (12). (B) Chain shortening of ascarosides by enzymes in the peroxisomal β-oxidation pathway. Ascarosides of the two classes ("ascr" and "oscr") are derived from very long-chain precursors. (C) Schematic of the ascaroside profiling experiments, in which 95 C. elegans strains were grown in liquid cultures, and the conditioned media were extracted and analyzed by HPLC-HRMS, then correlated with genomic data to identify QTL underlying natural differences in pheromone production. (D) A 29,011 bp natural deletion in the JU1400 strain encompassing daf-22 and seven additional genes. (E) Extracted ion chromatograms (EICs) corresponding to several short- and medium-chain ascarosides, as indicated, in exo-metabolome extracts from synchronized adults of the N2 strain (wild-type, WT), daf-22(ok693), and the JU1400 strain. The y axes are scaled as indicated to clearly show lower-intensity metabolites. (F) EICs for m/z 429.3222 and 443.3378, corresponding to precursor ascarosides with C18 and C19 side chains, respectively, in exo-metabolome extracts from synchronized adults of the N2 strain (WT), daf-22(ok693), and the JU1400 strain.
natural variation in C. elegans pheromone production. We analyzed the secreted metabolites from 95 wild C. elegans strains and profiled their pheromone bouquets by measuring relative abundances of 44 different ascarosides. Our quantitative genetic analysis of heritable variation in ascaroside production revealed diverse links between natural differences in metabolism and chemical communication of the species.

Results
A Peroxisomal β-Oxidation Gene Is Deleted in a Pheromone-Less Wild C. elegans Strain. To investigate the intraspecific variation in C. elegans pheromone production, we analyzed the exo-metabolomes of 95 wild strains using high-performance liquid chromatography coupled to high-resolution mass spectrometry (HPLC-HRMS) (Fig. 1C and Methods). Because ascaroside biosynthesis is affected by diverse factors including sex, developmental timing, and nutrition (29), we chose to analyze the exo-metabolomes of synchronized hermaphrodites at the young adult stage. Among thousands of detected metabolites, we identified and quantified 44 ascarosides (Methods and SI Appendix, Fig. S1). Unexpectedly, we found that short- and medium-chain ascarosides were almost completely absent in a single wild strain (JU1400), suggesting that key steps in ascaroside biosynthesis could be impaired in this strain. Previously, mutations in peroxisomal β-oxidation genes (e.g., daf-22, dhs-28, maoc-1) (Fig. 1B) were shown to abolish the production of short- and medium-chain ascarosides (16). To investigate whether the JU1400 strain has an impaired peroxisomal β-oxidation pathway, we performed a de novo assembly of this genomic region and identified a large deletion (29 kb) in the daf-22 locus, which completely removes daf-22 and seven neighboring genes (Fig. 1D). Consistent with this deletion, the metabolic phenotype of the JU1400 strain closely resembled that of daf-22(ok693) loss-of-function mutants, which lack short- and medium-chain ascarosides and instead accumulate large amounts of long-chain precursor ascarosides (Fig. 1 E and F). Intriguingly, the ratio of very long-chain (ω-1)- to ω-ascarosides in daf-22(ok693) and JU1400 was similar (Fig. 1F), suggesting that the ratio of these precursors is controlled independently of peroxisomal β-oxidation.
We extended our analysis of the daf-22 locus to 538 wild genomes (30) but found neither the same deletion nor any nonsense mutation in any other wild strain, suggesting that the loss of a β-oxidation gene that leads to severe impairment of ascaroside production is rare in the natural C. elegans population.
The Pheromone Bouquet Varies among Wild C. elegans Strains.
C. elegans produces and releases a diverse collection of ascarosides with different lengths of FA side chains as well as other types of modifications (Fig. 1A and SI Appendix, Table S1). Based on the HPLC-HRMS data, we compared the composition and abundances of pheromones among 94 wild C. elegans strains (Fig. 2 A and B and Dataset S1), excluding the JU1400 strain that lacks the majority of short- and medium-chain ascarosides.
For each strain, we calculated the intensity of each ascaroside relative to the sum of the 44 measured ascarosides (henceforth referred to as relative abundance). On average, we found that two major ascarosides, ascr#5 (derived from the ω pathway) and ascr#3 (derived from the ω-1 pathway), comprised 51.2% and 21.4% of measured ascarosides, respectively (Fig. 2C). Each of the remaining ascarosides comprised 0.004 to 7.4%, and the relative abundance of each ascaroside varied from strain to strain, though to different extents (Fig. 2 B and C and SI Appendix, Fig. S2). For example, icas#9, an indole-3-carboxylic acid ascaroside that promotes aggregation at picomolar concentrations (8,31), was not detected in the ECA36 strain but comprised 3.3% of total ascarosides in the CB4856 strain (Fig. 2D). Notably, ECA36 possesses a nonsense mutation in cest-3, which encodes the enzyme required for 4′ attachment of indole-3-carboxylic acid to the ascarylose core (32). By contrast, ascr#11 was much less variable than icas#9 across the 94 wild strains, as its relative abundance ranged from 2.1 to 5% (Fig. 2E).
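The relative-abundance normalization described above can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the function name `relative_abundance` and the intensity values are hypothetical, and only three ascarosides are shown instead of 44.

```python
def relative_abundance(intensities):
    """Return each ascaroside's share of the summed intensity for one strain.

    `intensities` maps ascaroside name -> measured peak intensity
    (arbitrary units). The name and input layout are assumptions made
    for illustration only.
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("summed intensity must be positive")
    return {name: value / total for name, value in intensities.items()}


# Hypothetical intensities for a single strain (not measured data).
example = {"ascr#5": 512.0, "ascr#3": 214.0, "icas#9": 33.0}
shares = relative_abundance(example)

# A ratio of two ascarosides is unchanged by the normalization,
# because the shared denominator cancels.
ratio_3_to_5 = shares["ascr#3"] / shares["ascr#5"]
```

Because every share is divided by the same per-strain total, relative abundances within a strain sum to 1, so an increase in one ascaroside necessarily lowers the shares of the others.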
We investigated the QTL hot spot spanning 1.49 Mb on the right arm of chromosome II (ChrIIR-QTL), which explained the largest fraction of variance for the ascr#3:ascr#5 ratio trait and was also mapped for the largest number of relative abundance traits. To identify a quantitative trait gene underlying this hot spot, we performed fine mapping for the ascr#3:ascr#5 ratio trait (Fig. 5A). Among 9,907 genetic variants in 290 genes across the QTL (Dataset S2), we prioritized genetic variants that were predicted to disrupt protein function (e.g., missense, frameshift). Four single-nucleotide variants (SNVs) were found to be equally and significantly associated with the phenotypic variation (−log10 p = 12.469). Among the four genes (moe-3, W09H1.1, W09H1.3, and mecr-1) impacted by these SNVs, we focused on mecr-1 because of its predicted involvement in fatty acid metabolism, which is upstream of ascaroside biosynthesis and therefore potentially related to the ascr#3:ascr#5 ratio trait. The gene mecr-1 encodes a mitochondrial trans-2-enoyl-CoA reductase, a key enzyme in mitochondrial FA synthesis (mtFAS) (33), whose potential interactions with ascaroside biosynthesis have not been described previously. We found that four of the five wild strains with high ascr#3:ascr#5 ratios (ED3052, JU258, LKC34, and NIC256) harbored the G159V missense variant in mecr-1 (Fig. 5B), suggesting that this allele could cause increased production of ascr#3, a reduction of ascr#5, or both.
To test this hypothesis, we performed fine mapping for the ChrIIL-QTL, which has a much smaller CI (2.6 Mb) than that of the ChrIV-QTL (6.5 Mb), which spans almost half of chromosome IV (Fig. 6A). In total, we analyzed the association between 28,230 genetic variants in the ChrIIL-QTL and the ascr#3:ascr#5 ratio trait. The five most significantly associated variants were identified in the genes bath-4, nstp-4, pod-2, T04B8.5, and math-7 (Dataset S3). Intriguingly, the most significantly associated variant predicted to impact gene function was in pod-2, an ortholog of human ACACA (acetyl-CoA carboxylase alpha) (−log10 p = 15.300, Fig. 6B). POD-2 is predicted to act upstream of MECR-1 in the mtFAS pathway. Given the low probability of randomly mapping two highly significant coding variants in two distinct genes within the same biochemical pathway from two separate QTL, we prioritized the pod-2 variant for further analysis. The alternative POD-2(1516Y) allele is associated with a higher ascr#3:ascr#5 ratio than the reference POD-2(1516H) allele (Fig. 6C). Furthermore, the association patterns of pairwise ratio traits are similar to those of the MECR-1(G159V) variant (SI Appendix, Fig. S5). Most importantly, three wild strains that exhibit extremely high ascr#3:ascr#5 ratios carry alternative alleles for both genes (Fig. 6C).
To analyze the effects of the POD-2(H1516Y) variant and its genetic interaction with the MECR-1(G159V) variant, we generated allele-substituted strains in both the N2 and the N2 MECR-1(159V) genetic backgrounds. We found that neither the single edit of POD-2(1516H>Y) nor the double edit of both MECR-1(159G>V) and POD-2(1516H>Y) changed the ascr#3:ascr#5 ratio (Fig. 6D). Although these two variants are highly associated with phenotypic variation and affect components of the same mtFAS pathway, this result shows that neither MECR-1(G159V) alone, POD-2(H1516Y) alone, nor the two variants together are sufficient to change the ratio of ω-ascarosides to (ω-1)-ascarosides in the N2 background.
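For readers less familiar with the −log10 p scale used in the fine-mapping results, the following minimal Python sketch converts the two reported association scores (12.469 for the mecr-1 locus and 15.300 for pod-2) back to p-values. The helper names are hypothetical, and no genome-wide significance threshold is assumed here because none is stated in this section.

```python
import math


def p_from_neg_log10(neg_log10_p):
    """Convert a -log10(p) association score back to a p-value."""
    return 10.0 ** (-neg_log10_p)


def neg_log10(p):
    """Convert a p-value to the -log10(p) scale used in the text."""
    return -math.log10(p)


# Scores reported in the text for the two prioritized variants.
p_mecr1 = p_from_neg_log10(12.469)
p_pod2 = p_from_neg_log10(15.300)

# Difference in orders of magnitude between the two associations.
orders_of_magnitude = 15.300 - 12.469
```

On this scale, each unit corresponds to a tenfold change in p, so the pod-2 variant is associated with the trait at a p-value roughly three orders of magnitude smaller than that of the mecr-1 variants.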

Discussion
We explored natural variation in ascaroside pheromone production of C. elegans and its genetic basis. By profiling excreted metabolites across 95 wild C. elegans strains, we found that ascaroside pheromone bouquets differ between strains in several different ways. In the most extreme case, we found complete absence of ascaroside pheromones in the JU1400 strain, likely caused by a deletion encompassing the peroxisomal β-oxidation gene daf-22. Similar to daf-22(ok693) laboratory mutants, this wild strain has lost the ability to produce short- and medium-chain ascarosides and instead accumulates long-chain precursors (Fig. 1 E and F). The natural loss of the daf-22 gene was surprising because ascaroside pheromones are known to play key roles in the survival and reproduction of the species (4). Notably, we scanned over 500 wild C. elegans genomes and identified this loss only in the genome of the JU1400 strain. This strain was sampled from an urban garden in the city center of Seville, Spain, suggesting that this rare daf-22 deletion has been maintained in a human-associated environment. Similarly, a nonsense mutation in the carboxylesterase cest-3 in the ECA36 strain was correlated with the lack of indole ascarosides, which regulate dwelling and aggregation behaviors (8). We also found this cest-3 variant in the genomes of 28 other wild C. elegans strains that were sampled across Pacific regions (30), suggesting that this variant can be maintained in natural populations.
In addition, our analysis revealed a negative correlation between the relative abundances of ω- and (ω-1)-ascarosides across many different natural strains, highlighted by their most abundant representatives, ascr#5 (ω) and ascr#3 (ω-1), which together account for more than 70% of measured ascarosides. The structural difference between ω- and (ω-1)-ascarosides, which also regulate different traits, likely arises from differences in the metabolism of their long-chain fatty acid precursors, which presumably get hydroxylated in specific positions of the chain followed by attachment of the ascarylose, producing long-chain precursors of either ω- or (ω-1)-ascarosides. Because the origin of long-chain ascaroside precursors remains unknown, we speculate that the ratio of (ω-1)- to ω-ascarosides is determined by the hydroxylation of long-chain alkyl precursors (i.e., hydroxylation at either the ω or ω-1 carbon), resulting in fatty acid attachment to ascarylose at the corresponding carbon. This difference might be caused by different metabolic inputs to the ascaroside pathway or by the expression of different tailoring enzymes (e.g., cytochrome P450 oxidases), which could preferentially hydroxylate the ω or ω-1 position (34). Interestingly, we found that the five strains with the highest ascr#3:ascr#5 ratios were isolated from Europe and Africa but not from the Pacific region where C. elegans likely originated (35,36). By contrast, the five strains with the lowest ascr#3:ascr#5 ratios include two strains from the Pacific region (SI Appendix, Fig. S7). This result suggests that the relative abundances of ω- and (ω-1)-ascarosides were reversed in some populations during the out-of-Pacific expansion of C. elegans, which is hypothesized to have been facilitated by human activity (37).
Our GWA mapping analysis uncovered hot spot genomic loci that underlie relative abundances of various ascarosides as well as the ascr#3:ascr#5 ratio. Two hot spot QTL on chromosome II were mapped, respectively, to loci that harbor coding variants in mtFAS pathway genes (mecr-1 and pod-2) that are highly associated with the (ω-1)-to-ω-ascaroside ratio. We hypothesize that the mtFAS pathway underlies the balance between the two parallel ascaroside biosynthetic pathways, and genetic variants in the mtFAS pathway contribute to the natural differences in the usage of the two pathways. However, our allele replacement experiments in the N2 strain failed to demonstrate the causal effects of these two variants, which could be interpreted in several ways. First, the mtFAS pathway might not be involved in ascaroside biosynthesis, or at least may not be responsible for the natural variation in the (ω-1)-to-ω-ascaroside ratio. Second, the two variants that we edited [MECR-1(G159V) and POD-2(H1516Y)] may be neutral but linked to uncharacterized causal variants in other genes. Finally, complex genetic interactions could mask the effect of these alleles. Recently, incompatible versions of a galactose metabolic pathway were characterized in Saccharomyces cerevisiae (38), in which the incompatible combination of alleles of metabolic genes is not found in nature. We failed to introduce the MECR-1(159G>V) edit in two wild strains but successfully generated the MECR-1(159V>G) edit in the N2 background, implying that incompatible alleles of metabolic genes might be present across wild genomes of C. elegans. Notably, we found strong LD among three ascr#3:ascr#5 QTL (the mecr-1 locus, the pod-2 locus, and the ChrIV-QTL). Therefore, although even the double-edited [MECR-1(159G>V) and POD-2(1516H>Y)] strains did not display effects on the mapped trait (ascr#3:ascr#5), this result could be explained by allele(s) in unidentified genes that segregate together and exert nonadditive phenotypic effects.
Although we focused on the (ω-1)-ascaroside-to-ω-ascaroside ratio among many observed traits in this study, we also discovered natural variation in the production of individual ascaroside pheromones and their QTL. Our dataset will provide a valuable resource for future studies to characterize genes involved in pheromone production and to explore the molecular mechanisms of how genetic changes lead to the evolution of a chemical language.

Fig. 6. (A) Fine mapping of the ChrIIL-QTL for the ascr#3:ascr#5 trait is shown. Each dot represents an SNV that is present in at least 5% of the 94 wild strains. The association between the SNV and the ascr#3:ascr#5 trait value is shown on the y axis, and the genomic position of the SNV is shown on the x axis. SNVs with high or moderate impact inferred from SnpEff are colored purple. (B) Schematic of the mitochondrial FA synthesis (mtFAS) pathway. Two enzymes (POD-2 and MECR-1) that harbor missense variants associated with ascaroside production variation are shown. (C) A bar plot of the ascr#3:ascr#5 trait value of 94 wild C. elegans strains. The reference N2 strain is colored orange, and other wild strains are colored by their genotype at two sites [MECR-1(G159V) and POD-2(H1516Y)]. (D) Phenotypes of POD-2 allele-replaced strains are compared with the N2 reference parental strain (159G, 1516H) and the ECA2818 MECR-1-edited strain (159V, 1516H). Two independent POD-2 allele-replacement strains for each background (ECA3130 and ECA3131 for N2; ECA3128 and ECA3129 for ECA2818) were tested. On the y axis, values of the ascr#3:ascr#5 ratio trait are shown.
Methods
C. elegans Strains and Growth. N2 (Bristol) and wild nematode strains were maintained at 20 °C, reared on Escherichia coli OP50, and grown on modified nematode growth medium containing 1% agar and 0.7% agarose (NGMA) to prevent animals from burrowing (39). Wild strains were obtained from the CeNDR (30).
HPLC-HRMS. Liquid chromatography was performed on a Vanquish HPLC system controlled by Chromeleon software (ThermoFisher Scientific) and coupled to an Orbitrap Q-Exactive High-Field mass spectrometer controlled by Xcalibur software (ThermoFisher Scientific). Methanolic extracts prepared as described above were separated on a Thermo Hypersil Gold C18 column (150 mm × 2.1 mm, particle size 1.9 μm, part no. 25002-152130) maintained at 40 °C with a flow rate of 0.5 mL/min. Solvent A was 0.1% formic acid (Fisher Chemical Optima LC/MS grade; A11750) in water (Fisher Chemical Optima LC/MS grade; W6-4); solvent B was 0.1% formic acid in acetonitrile (Fisher Chemical Optima LC/MS grade; A955-4). The A/B gradient started at 1% B for 3 min after injection and increased linearly to 98% B at 20 min, followed by 5 min at 98% B, then back to 1% B over 0.1 min, and finally held at 1% B for the remaining 2.9 min to reequilibrate the column (28 min total method time). Mass spectrometer parameters were spray voltage, −3.0 kV/+3.5 kV; capillary temperature, 380 °C; probe heater temperature, 400 °C; sheath, auxiliary, and sweep gas, 60, 20, and 2 AU, respectively; S-Lens RF level, 50; resolution, 120,000 at m/z 200; and AGC target, 3E6. Each sample was analyzed in negative (ESI−) and positive (ESI+) electrospray ionization modes with m/z range 100 to 1,000.
Metabolite Nomenclature. Ascarosides were named using Small Molecule Identifiers (SMIDs), a search-compatible nomenclature for metabolites identified from C. elegans and other nematodes. The SMID database (www.smid-db.org) is an electronic resource maintained in collaboration with WormBase (www.wormbase.org); a complete list of SMIDs can be found at www.smid-db.org/browse (12).
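The solvent gradient described in the HPLC-HRMS method can be written out as a piecewise-linear program, which makes it easy to check that the segments add up to the stated 28 min. The Python sketch below is an illustration only: the breakpoint table is transcribed from the text under the assumption that "increased linearly to 98% B at 20 min" means 98% B is reached 20 min after injection, and the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time after injection in min, % solvent B) breakpoints, transcribed from
# the method description. Assumption: 98% B is reached at t = 20 min.
GRADIENT = [
    (0.0, 1.0),    # start at 1% B
    (3.0, 1.0),    # hold 1% B for 3 min
    (20.0, 98.0),  # linear ramp to 98% B
    (25.0, 98.0),  # hold 98% B for 5 min
    (25.1, 1.0),   # return to 1% B over 0.1 min
    (28.0, 1.0),   # reequilibrate for 2.9 min
]


def percent_b(t_min):
    """Return % solvent B at time t_min by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0 to 28 min method window")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole window")


total_method_time = GRADIENT[-1][0]
midpoint_b = percent_b(11.5)  # halfway through the 3 to 20 min ramp
```

Solvent A makes up the remainder at every time point, so % A is simply 100 minus the returned value.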