Expanded male sex-determining region conserved during the evolution of homothallism in the green alga Volvox

SUMMARY
Male and female genotypes in heterothallic (self-incompatible) species of haploid organisms, such as algae and bryophytes, are generally determined by male and female sex-determining regions (SDRs) in the sex chromosomes. To resolve the molecular genetic basis for the evolution of homothallic (bisexual and self-compatible) species from a heterothallic ancestor, we compared whole-genome data from Thai and Japanese genotypes within the homothallic green alga Volvox africanus. The Thai and Japanese algae harbored expanded ancestral male and female SDRs, respectively, of ∼1 Mbp each, representing a direct heterothallic ancestor. Therefore, the expanded male and female ancestral SDRs may originate from the ancient (∼75 mya) heterothallic ancestor, and either might have been conserved during the evolution of each homothallic genotype. An expanded SDR-like region seems essential for homothallic sexual reproduction in V. africanus, irrespective of male or female origin. Our findings should stimulate future studies to elucidate the biological significance of such expanded genomic regions.


INTRODUCTION
Evolutionary transitions between self-incompatible and self-fertile mating systems have been an exciting topic of evolutionary biology since Charles Darwin. 1 Extensive studies of the transitions between species with separate sexes and hermaphrodite species in which individuals have both sex functions have been conducted in diploid organisms, such as invertebrates and seed plants. 1,2 Recently, how such transitions occurred in the sex-determining regions (SDRs) found in haploid sex chromosomes 3 of algae and bryophytes was demonstrated by studying whole genomes of Japanese culture strains of two closely related species of the green algal genus Volvox 4: heterothallic (self-incompatible with genetically different male and female) Volvox reticuliferus and homothallic (bisexual, with the ability to fertilize within a clone) V. africanus. The genome data demonstrated that heterothallic V. reticuliferus has an expanded (∼1 Mbp) SDR as found in another heterothallic species Volvox carteri, which suggests an ancient (∼75 mya) origin of the expanded SDR within the genus Volvox. 4 Furthermore, homothallic V. africanus originating from Japan (V. africanus JP) harbors an expanded (∼1 Mbp) sex-determining-like region (SDLR) originating from a female SDR of the ancestral heterothallic species. 4 However, the culture strain of V. africanus JP represents only one of the three mating systems of homothallic V. africanus 4,5 (homothallic, male-bisexual type; Table S1). Thus, further information on the SDLR (if present) in the other two mating systems of V. africanus (Table S1) would be very valuable for understanding the origin of homothallism and the biological importance of the female-derived expanded SDLR in homothallic V. africanus. 4 However, extended genome studies using these two homothallic mating systems seemed impossible because of the unavailability of V. africanus culture strains with these two systems.
6 heterothallic species during the evolution of the homothallic mating systems in V. africanus originating from Thailand, based on de novo whole-genome sequencing.

RESULTS AND DISCUSSION
Whole-genome assembly
A de novo nuclear genome of V. africanus strain 1101-NZ-11 originating from Thailand (V. africanus TH) 7 was constructed by assembling a combination of long and short sequencing reads (see STAR Methods) and comprised 129 contigs. The 129 contigs were gap free (N50 = 3.95 Mbp) and yielded a nuclear genome assembly of 141.0 Mbp, similar to that of V. carteri and other volvocine species (Tables S2 and S3). Protein-coding genes were predicted with the assistance of transcriptome data, yielding an estimated 13,455 expressed genes in the genome (Table S2). Assembly quality of the genome was high based on the presence of the vast majority of benchmarking universal single-copy orthologs (BUSCO) reference genes 8 (98.1% complete genes) (Table S2). In addition, the genome size estimated based on k-mer frequencies (Figure S1) using GenomeScope 9 was consistent with the total genome assembly size (Table S2), suggesting high coverage rates of the assembled contigs for the whole genome.
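The N50 statistic quoted above can be illustrated with a minimal sketch. The contig lengths below are hypothetical toy values, not the lengths of the 129 contigs in this assembly, and the function name is chosen here for illustration only.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the largest length L such that contigs of
    length >= L together contain at least half of the assembled bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical toy lengths (arbitrary units), not data from this study.
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

With the toy values, the two longest contigs (50 + 40) already exceed half of the total (75 of 150), so the sketch reports 40.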

An expanded SDLR resolved
Based on a genome search using the two contigs harboring male and female SDRs in V. reticuliferus and the contig harboring SDLR (JP-SDLR) in V. africanus JP, 4 we found a homologous V. africanus TH contig (contig0022), which had two separate sequences corresponding to the two separate sequences flanking JP-SDLR, as well as to the two pseudoautosomal regions flanking the SDR of V. reticuliferus 4 (Figure S2). A long ca. 1 Mbp sequence (TH-SDLR) was found between the two separate sequences in contig0022 of V. africanus TH (Figure 2A). TH-SDLR had no dot-plot similarity to the SDRs of V. reticuliferus or JP-SDLR (Figures S2-S4), suggesting rapid changes in the positions of the genes (gametologs in SDR 10,11 and their homologs in SDLR 4) during evolution. However, like the SDRs of heterothallic volvocine species and JP-SDLR, it was repeat rich and had a lower GC content than the whole-genome sequence (Table S3).
TH-SDLR harbored homologs of all three male-specific genes in V. reticuliferus (MID, MTD1, and VRM-001) and 31 other protein-coding genes (Figure 2A); 21 of the 31 were homologs of V. reticuliferus gametologs 4 (Figures 2B and S5-S7 and Table S4), whereas seven of the other 10 were homologs of V. carteri gametologs 10 (Figure S8). Interestingly, 30 of the 31 genes were homologous between TH-SDLR and JP-SDLR, and their divergences were similar to those of gametolog pairs in heterothallic volvocine species (Figure 3); the one remaining gene was a homolog of MME6, which is recognized in the SDR of V. reticuliferus as a gametolog but is positioned in an autosome-like region (outside JP-SDLR) in V. africanus JP 4 (Figures 2A and S7, and Table S4). Of the 30 genes shared between TH-SDLR and JP-SDLR, 20 were homologs of V. reticuliferus gametologs (Figures 2A, S5 and S6, and Table S4). However, TH-SDLR differed significantly from JP-SDLR in the origins of the homologs of the V. reticuliferus gametologs. Although 18 of the 20 homologs of V. reticuliferus gametologs in JP-SDLR originated from the female SDR of the ancestral heterothallic species and no male-related homologs were found in JP-SDLR, 4 18 of the 20 shared homologs and MME6 in TH-SDLR were shown to have originated from the ancestral male SDR based on a phylogenetic analysis, while the remaining two (WDR57 and VAMT001) had no resolution or female origin (Figures 2B and S5-S7). Therefore, TH-SDLR might have evolved from the male SDR of the ancestral heterothallic species.
Putative ancestral heterothallic species of homothallic V. africanus
As discussed above, this study demonstrated that TH-SDLR could have originated from the male SDR of the ancestral heterothallic species. By contrast, JP-SDLR might be derived from the female SDR. 4 Since TH-SDLR and JP-SDLR shared 30 homologous genes that exhibit divergence and different sex origins similar to those of the gametolog pairs in the heterothallic volvocine species (Figures 3, S5, S6, S8, and S9), these two SDLRs may have maintained the gene compositions of the male and female SDRs of the heterothallic ancestral species of V. africanus. The ancestral male and female SDRs might have harbored three male-specific genes (MID, MTD1, and VRM-001) and one female-specific gene (FUS1), respectively, and they might have had 30 gametologs (20 directly related to V. reticuliferus gametologs) (Figure S9). Therefore, the genomic features of the ancestral male and female SDRs (gene compositions, low GC content, and high repeat rate) might have been retained in TH-SDLR and JP-SDLR, respectively, in two different homothallic mating systems within homothallic V. africanus (Figures 1 and S9).
Essential genes for homothallic mating system in V. africanus
A homothallic mating system is based on the presence of both male and female attributes of sexual reproduction in a single haploid genotype. Our current and previous studies demonstrated that both homothallic organisms (V. africanus TH and JP) have 30 possible ancestral gametologs in the SDLR, while three male-specific genes (MID, MTD1, and VRM-001) are located in TH-SDLR (Figure 2), or MID and MTD1 are outside JP-SDLR (Figure S9). 4 Therefore, maleness in the homothallic species is based on the presence of homologs of the conserved male-specific genes MID and MTD1. 4,10,11 In comparison, the majority of the 30 possible ancestral gametologs in TH-SDLR and JP-SDLR are male or female related, respectively. Thus, irrespective of male or female SDR origin, the presence of the 30 ancestral gametologs in both SDLRs may be essential for sexual reproduction, especially for the female-related attributes in homothallic V. africanus, because the most important male-related attribute, the production of sperm packets, is determined by the male-specific gene MID. 12 Retaining the ancestral genomic features of the SDR (low GC content and repeat richness; Tables S2 and S3) in both TH-SDLR and JP-SDLR may represent a very early event in the transition from heterothallism to homothallism in V. africanus. However, the continuous expanded (ca. 1 Mbp long) SDLRs and their low-GC and repeat-rich genomic features may be important for sexual reproduction of V. africanus, because such expanded genomic regions might have been retained in heterothallic species for at least 75 million years since the divergence between V. carteri and V. reticuliferus. 4
Possible evolutionary history of ancestral male and female SDRs during the evolution of the homothallism in V. africanus
The direct ancestral heterothallic species of V. africanus is thought to have had male and female SDRs that were very similar to TH-SDLR and JP-SDLR, respectively (Figures 3 and S9). Given that the transition to homothallism occurred only once in V. africanus, as suggested by the single origin of all three homothallic mating systems within V. africanus, 6 evolution from the heterothallic ancestor to the homothallic ancestor might have been initiated by the acquisition of both male and female SDRs of the ancestral heterothallic species (Figure S9). Subsequently, in the direct ancestor of V. africanus JP, the male SDR became degenerate, but the important male genes (MID and MTD1) and the female SDR were retained to express male and female functions in sexual reproduction. By contrast, the female SDR might have disappeared completely in the direct ancestor of V. africanus TH. Experimental transition from heterothallism to homothallism was demonstrated using partial suppression of the MID gene in the male strain of V. carteri. 12 However, the important male attribute "sperm packet formation" can be induced in the female strain of V. carteri by transformation with the male-specific gene MID. 12 Thus, evolutionary transition from heterothallism to homothallism may be possible by using only the genes of the male genotype in the genus Volvox. Based on the present genome comparison, each of the expanded male and female ancestral SDRs may have been conserved during the evolution of the homothallic species V. africanus, and such an ancestral SDR or SDLR appears to be essential for homothallic sexual reproduction in V. africanus, regardless of male or female origin. The expanded SDLRs have low-GC and repeat-rich genomic features (Table S2), which appear to be widely conserved in SDRs of heterothallic species of Volvox related to V. carteri and V. reticuliferus. 4 In the marine unicellular "prasinophytes," SDRs are recognized based on their low-GC and repeat-rich contiguous genomic features even when sexuality is unknown. 13 However, the molecular genetic relationships between these genomic features in the SDR (or SDLR) and sexuality are unknown. Thus, further studies are needed to elucidate the biological significance of such expanded genomic regions. There are also many interesting open questions about the expanded SDR or SDLR in Volvox. Apart from the identified sex-linked genes, what are the possible important genes in determining the mating system and/or phenotype of a sexual spheroid (Table S1)? Is the genome structure important in projecting further genome arrangements in SDR/SDLR? Is the epigenome involved in SDR/SDLR?

STAR★METHODS
Detailed methods are provided in the online version of this paper and include the following:

Data and code availability
New genome and RNA-seq data have been deposited at DNA DataBank of Japan (DDBJ)/European Nucleotide Archive (ENA)/GenBank and are publicly available as of the date of publication. Accession numbers are listed in the key resources table. All other study data are included in the article and/or Supplemental information. This paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

METHOD DETAILS
Whole-genome sequencing and de novo assembly
Cultures of V. africanus TH were grown in Petri dishes (100 mm/non-treated Dish, IWAKI AGC Techno Glass, Shizuoka, Japan) containing 30 mL VTAC medium at 25°C on a 14:10 h light:dark (L:D) schedule, under cool-white fluorescent lamps at an intensity of 80-130 μmol m⁻² s⁻¹. Approximately 200 mL of cultured sample was subjected to genomic DNA extraction. Genomic DNA was prepared using NucleoBond HMW DNA (Macherey-Nagel, Düren, Germany) according to the manufacturer's protocol. Whole-genome sequencing of V. africanus TH was performed using PacBio and Illumina technologies. For PacBio sequencing, genomic DNA was sheared using a DNA shearing tube, g-TUBE (Covaris). Two sequencing libraries (20 kbp and 30 kbp) were constructed, and each library was sequenced on one single molecule, real-time (SMRT) cell using the PacBio Sequel system (Pacific Biosciences, Menlo Park, CA, United States). These reactions generated 1.48 M subreads (total bases: 27.7 Gbp). Sequencing coverage was about 197x based on the estimated genome size. The PacBio reads were assembled de novo with Canu v2.2. 14 For Illumina sequencing, genomic DNA was fragmented with a DNA Shearing System, S2 Focused-ultrasonicator (Covaris LLC, Woburn, MA, United States). An Illumina paired-end library (average insert size 520 bp) was constructed with a TruSeq DNA PCR-Free Sample Prep Kit (Illumina, San Diego, CA, United States) according to the manufacturer's instructions and was size-selected on an agarose gel using a Zymoclean Large Fragment DNA Recovery Kit (Zymo Research, Irvine, CA, United States). The final library was sequenced on the Illumina NovaSeq 6000 sequencer (156 M reads with 150 bp read length). Total bases and sequence coverage were 23.5 Gbp and 166-fold, respectively. The Illumina data were then mapped against the PacBio assembly sequences, and assemblies were corrected using Pilon v1.23. 15 Contigs with ≲2% or ≳98% GC, with ≳10% matching with bacterial or organelle DNA sequences, or with ≲10x mean coverage of Illumina reads were excluded from the set of nuclear genome sequences. The completeness of the genome assembly was verified using BUSCO v5.1.2 8 with 1,519 single-copy orthologs from the chlorophyta_odb10 dataset, which indicated that 98.1% of the genes in the reference dataset were present as complete genes in the current genome assembly (Table S2).
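The coverage arithmetic and the contig exclusion rules described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the class and function names are hypothetical, the toy contigs are invented, the treatment of values exactly at a threshold is an assumption, and the 141.0 Mbp assembly size stands in for the k-mer-based genome size estimate, whose exact value is not given in this section.

```python
from dataclasses import dataclass


def fold_coverage(total_bases: float, genome_size_bp: float) -> float:
    """Mean sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size_bp


@dataclass
class ContigStats:
    name: str                    # hypothetical contig label
    gc_fraction: float           # GC content, 0.0 to 1.0
    contaminant_fraction: float  # share matching bacterial/organelle DNA
    illumina_depth: float        # mean Illumina read depth (x)


def is_nuclear(c: ContigStats) -> bool:
    """Apply the three exclusion rules; boundary handling is an assumption."""
    if c.gc_fraction <= 0.02 or c.gc_fraction >= 0.98:
        return False  # extreme GC content
    if c.contaminant_fraction >= 0.10:
        return False  # likely bacterial or organelle sequence
    if c.illumina_depth <= 10.0:
        return False  # too little short-read support
    return True


# Assembly size (141.0 Mbp) is used as a stand-in for the estimated size.
pacbio_depth = fold_coverage(27.7e9, 141.0e6)
illumina_depth = fold_coverage(23.5e9, 141.0e6)

# Invented toy contigs, not data from this study.
toy = [
    ContigStats("ctg_nuclear", 0.55, 0.00, 160.0),
    ContigStats("ctg_bacterial", 0.60, 0.45, 150.0),
    ContigStats("ctg_low_depth", 0.50, 0.00, 3.0),
]
kept = [c.name for c in toy if is_nuclear(c)]
```

Because the stand-in genome size differs slightly from the estimate used in the text, the computed depths only approximate the reported 197x and 166-fold values.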

SDLR identification
A candidate contig (contig022) for the entire SDLR was screened by TBLASTN (National Center for Biotechnology Information) against the de novo genome assembly of V. africanus TH, using all proteins in the V. reticuliferus male SDR and female SDR as queries; it was identified as the major significant matching subject, with more than three nonoverlapping protein hits (cutoff maximum E-value: 1 × 10⁻¹⁰). Dot-plot analyses were then performed between contig022 of V. africanus TH and the V. reticuliferus male SDR using YASS (https://bioinfo.lifl.fr/yass/index.php) 16 to detect the SDLR. By using sequences of the V. reticuliferus SDR and the SDLR of the Japanese homothallic strain of V. africanus (JP-SDLR), 4 a long SDLR (TH-SDLR) was determined in contig022 of V. africanus TH.
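The screening criterion above (more than three nonoverlapping protein hits at or below the E-value cutoff) can be sketched as a small filter over hit records. This is a hedged illustration only: the Hit record, the function names, and the toy hits are hypothetical; how the hits were actually parsed and how overlaps were resolved in the study are not stated, so a greedy earliest-end selection is assumed here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str     # query protein (hypothetical identifiers below)
    contig: str    # subject contig
    start: int     # subject start; may exceed end on the reverse strand
    end: int       # subject end
    evalue: float


def count_nonoverlapping(hits: List[Hit]) -> int:
    """Greedy count of the maximum number of mutually nonoverlapping hits."""
    intervals = sorted(
        ((min(h.start, h.end), max(h.start, h.end)) for h in hits),
        key=lambda iv: iv[1],
    )
    count, last_end = 0, None
    for begin, end in intervals:
        if last_end is None or begin > last_end:
            count += 1
            last_end = end
    return count


def candidate_contigs(hits: Iterable[Hit], max_evalue: float = 1e-10,
                      more_than: int = 3) -> Dict[str, int]:
    """Contigs with more than `more_than` nonoverlapping significant hits."""
    by_contig: Dict[str, List[Hit]] = defaultdict(list)
    for h in hits:
        if h.evalue <= max_evalue:
            by_contig[h.contig].append(h)
    result = {}
    for contig, contig_hits in by_contig.items():
        n = count_nonoverlapping(contig_hits)
        if n > more_than:
            result[contig] = n
    return result


# Invented toy hits, not results from this study.
toy_hits = [
    Hit("protA", "ctgA", 100, 400, 1e-50),
    Hit("protB", "ctgA", 350, 450, 1e-20),    # overlaps protA's hit
    Hit("protC", "ctgA", 900, 500, 1e-30),    # reverse-strand hit
    Hit("protD", "ctgA", 1000, 1300, 1e-15),
    Hit("protE", "ctgA", 1500, 1800, 1e-12),
    Hit("protF", "ctgA", 2000, 2300, 1e-5),   # fails the E-value cutoff
    Hit("protA", "ctgB", 10, 200, 1e-40),
    Hit("protB", "ctgB", 300, 600, 1e-40),
]
candidates = candidate_contigs(toy_hits)
```

In the toy data, only ctgA passes, with four nonoverlapping significant hits; ctgB has just two and is dropped.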