Sex-linked gene traffic underlies the acquisition of sexually dimorphic UV color vision in Heliconius butterflies

Significance How differences between the sexes arise and evolve is a central question in evolutionary biology. However, identifying the genetic origins of new behavioral traits has proven elusive due to the complexity of the neural circuitry underlying most behaviors and the difficulty in reconstructing complete chromosomes from whole-genome sequence data. Here, we identify one such genetic mechanism responsible for sexual dimorphism in UV (ultraviolet) color vision in the butterfly genus Heliconius—an autosomal-to-sex chromosome translocation of an opsin gene. We find that the origins of this sexually dimorphic behavior are not well explained by existing models. This represents the first known example of sex-limited UV color vision in animals due to the movement of a single gene to a sex chromosome.

Blast results, sorted by ascending E-value and using the default E-value cutoff, were visually inspected. Top hits were imported, trimmed, and aligned in MEGA X (1). H. charithonia genes were numbered according to their closest H. melpomene homologs following phylogenetic analysis (see below). Alternate blast hits subjectively determined to be of significant length and similarity were also added to the alignments to explore potential expansions of these gene families. All annotated transcripts were deposited in GenBank under accession nos: OQ236489-OQ236524, OQ064249-OQ064288, and OQ629369-OQ629425.

Phylogenetic Analysis
Curated OBP, CSP, and OR protein sequences from H. melpomene and H. charithonia were aligned in MEGA X using the MUSCLE algorithm. These alignments were visually inspected and manually adjusted as needed. Maximum likelihood trees were estimated in PhyML (2) from the nucleotides using 500 bootstrap replicates and the best-fit substitution models as identified by SMS (3). The Akaike Information Criterion was used as the selection criterion. Tree images were generated using the iTOL web server (4).

Genome Analysis
We also searched the H. charithonia genome for the same CSPs, OBPs, and ORs (some of which were extracted directly from the genome). For those genes where we found an H. charithonia ortholog transcript, we used the ortholog's nucleotide sequence as a query for a BLASTn search in the genome. Where there was no ortholog found, we searched using tBLASTn with the H. melpomene protein sequence. Blast results were visually inspected, and relevant hits were determined in the same way as described for the transcriptome searches.

Chemosensory proteins (CSPs)
Out of the 33 H. melpomene CSPs reported in Heliconius Genome Consortium (5), we identified 28 orthologs in H. charithonia (Hc). 23 of these orthologs are full-length, containing a start and stop codon. We also identified 8 lineage-specific CSPs, 7 of which are full-length. This makes 36 Hc CSPs identified in total. In the Hc genome, we found genes corresponding to individual mRNAs for all of the 28 ortholog transcripts, with 20 of these genes being full-length. We also located in the genome all of the 8 new Hc CSPs, with 7 of these being full-length.

Fig. S1. Maximum likelihood tree of manually annotated chemosensory proteins (CSPs).
Genes from Heliconius melpomene and H. charithonia are shown in red and blue, respectively. Amino acid sequences were aligned, then backtranslated to nucleotides to build the tree. Grey circles on branches indicate bootstrap values >= 80% from 500 bootstrap replicates. Branches highlighted by red or blue arcs indicate lineage-specific CSP expansions. The model selected for phylogenetic analysis was GTR+G. Hm, Heliconius melpomene; Hc, Heliconius charithonia.

Odorant-binding proteins
While 43 putative H. melpomene OBPs were reported in Heliconius Genome Consortium (5), mRNA nucleotide transcripts were only available for 41 of them. Using these 41 known OBPs as reference (5), we identified 35 H. charithonia OBP orthologs. 26 of these orthologs were full-length, containing a start and stop codon.
We also identified 4 lineage specific H. charithonia OBPs, all of which were full-length. In total, this makes 39 H. charithonia OBPs. In the Hch genome, we found genes corresponding to individual mRNAs for all 35 ortholog transcripts, and for each of the 4 new OBP transcripts. Genes from Heliconius melpomene and H. charithonia are shown in red and blue, respectively. Amino acid sequences were aligned, then backtranslated to nucleotides to build the tree. Grey circles on branches indicate bootstrap values >= 80% from 500 bootstrap replicates. The model selected for phylogenetic analysis was GTR+G+I. Hm, Heliconius melpomene; Hc, Heliconius charithonia.

Olfactory receptors
While 70 putative H. melpomene OR genes were reported in Heliconius Genome Consortium (5), mRNA transcripts were only available for 52 of them. We identified orthologs in H. charithonia for 50 of these 52 ORs. Of the 50 orthologs, 11 were full-length, containing a start and stop codon. We also found 5 lineage-specific ORs, 1 of which was full-length. Altogether this makes 55 H. charithonia ORs based on available mRNA transcripts. In the Hc genome, we found genes corresponding to the individual mRNAs for all 55 ORs expressed in our transcriptome. One previously identified OR in H. melpomene, HmOR65, is actually identical to HmOR51. A BLASTp against GenBank for this sequence yielded only HmOR51. For that reason, we omitted HmOR65 from our analysis.

Fig. S3. Maximum likelihood tree of manually annotated olfactory receptor proteins (ORs).
Olfactory receptors from Heliconius melpomene and H. charithonia are shown in red and blue, respectively. Amino acid sequences were aligned, then backtranslated to nucleotides to build the tree. Grey circles on branches indicate bootstrap values >= 80% from 500 bootstrap replicates. The model selected for phylogenetic analysis was GTR+G. Hm, Heliconius melpomene; Hc, Heliconius charithonia.

Fig. S4. Identification of Z contig.
A) The contig showing >2 fold male-to-female coverage ratio was assigned as the candidate Z contig. B) Alignment dot plot between the H. charithonia Z chromosome candidate contig and H. erato demophoon Z chromosome scaffold (6). Mapping of the Z chromosome candidate to the H. erato Z chromosome suggests that the coverage-based sex-chromosome assignment identified sex-linked chromosomes correctly.    Table S1 and/ or IHC of adult eye tissue using anti-UVRh1 and anti-UVRh2 antibodies shown in Fig. S9 (see also McCulloch et al. 2017). Species where the average UVRh1 reads per kilobase per million (RPKM) for males was <1, were scored as absent. For species in which UVRh1 RPKM >1, then UVRh1 mRNA was scored as present in males.