Maturity2, a novel regulator of flowering time in Sorghum bicolor, increases expression of SbPRR37 and SbCO in long days delaying flowering

Sorghum bicolor is a drought-resilient facultative short-day C4 grass that is grown for grain, forage, and biomass. Adaptation of sorghum for grain production in temperate regions resulted in the selection of mutations in Maturity loci (Ma1–Ma6) that reduced photoperiod sensitivity and resulted in earlier flowering in long days. Prior studies identified the genes associated with Ma1 (PRR37), Ma3 (PHYB), Ma5 (PHYC) and Ma6 (GHD7) and characterized their role in the flowering time regulatory pathway. The current study focused on understanding the function and identity of Ma2. Ma2 delayed flowering in long days by selectively enhancing the expression of SbPRR37 (Ma1) and SbCO, genes that co-repress the expression of SbCN12, a source of florigen. Genetic analysis identified epistatic interactions between Ma2 and Ma4 and located QTL corresponding to Ma2 on SBI02 and Ma4 on SBI10. Positional cloning and whole genome sequencing identified a candidate gene for Ma2, Sobic.002G302700, which encodes a SET and MYND (SMYD) domain lysine methyltransferase. Eight sorghum genotypes previously identified as recessive for Ma2 contained the mutated version of Sobic.002G302700 present in 80M (ma2) and one additional putative recessive ma2 allele was identified in diverse sorghum accessions.


Introduction
Sorghum bicolor is a drought resilient, short-day C4 grass that is grown globally for grain, forage and biomass [1–4]. Precise control of flowering time is critical to achieve optimal yields of sorghum crops in specific target production locations/environments. Sorghum genotypes that have delayed flowering in long days due to high photoperiod sensitivity are high-yielding sources of biomass for production of biofuels and specialty bio-products [3,5]. In contrast, grain sorghum was adapted for production in temperate regions by selecting genotypes that have reduced photoperiod sensitivity resulting in earlier flowering and reduced risk of
PLOS ONE | https://doi.org/10.1371/journal.pone.0212154 April 10, 2019
The genes corresponding to four of the six Maturity loci have been identified. Ma1, the locus with the greatest influence on flowering time photoperiod sensitivity, encodes SbPRR37, a pseudo-response regulator that inhibits flowering in LD [21]. Ma3 encodes phytochrome B (phyB) [36], Ma5 encodes phytochrome C (phyC) [23], and Ma6 encodes Ghd7, a repressor of flowering in long days [26]. The genes corresponding to Ma2 and Ma4 have not been identified, but recessive alleles at either locus result in early flowering in long days in sorghum lines that are photoperiod sensitive and have Ma1 genotypes [28]. Prior studies also noted that plants recessive for Ma2 flower later when the genotype is photoperiod insensitive and recessive for Ma1 and Ma6 [28].
In this study, the impact of Ma2 alleles on the expression of genes in the sorghum flowering time pathway was characterized. A QTL corresponding to Ma2 was mapped and a candidate gene for Ma2 identified by fine mapping and genome sequencing. The results show that Ma2 enhances SbPRR37 (Ma1) and SbCO expression, consistent with the impact of Ma2 alleles on flowering time in genotypes that vary in Ma1 alleles.

Plant growing conditions and populations
Seeds for all genotypes used in this study were obtained from the Sorghum Breeding Lab at Texas A&M University in College Station, TX. 100M (Ma1 Ma2 Ma3 Ma4 Ma5 ma6) and 80M (Ma1 ma2 Ma3 Ma4 Ma5 ma6) are sorghum maturity standards with defined maturity/flowering genotypes [1]. The maturity genotypes were selected from a cross between Early White Milo (ma1 Ma2 Ma3 Ma4 Ma5 ma6) and Dwarf Yellow Milo (Ma1 ma2 ma3 Ma4 Ma5 ma6). 100M and 80M are nearly isogenic and differ at Ma2.
The cross of 100M and 80M was carried out by the Sorghum Breeding Lab at Texas A&M University in College Station, TX. F1 plants were grown in the field in Puerto Rico and self-pollinated to generate the F2 population used in this study. The 100M/80M F2 population was planted in the spring of 2008 at the Texas A&M Agrilife Research Farm in Burleson County, Texas (near College Station, TX).
The cross of Hegari and 80M was made in the greenhouse at Texas A&M University in College Station, TX. F1 plants were confirmed and self-pollinated to generate the F2 population used in this study. The Hegari/80M F2 population (n = 432) was planted in the spring of 2011 in the greenhouse in 18 L nursery pots in a 2:1 mixture of Coarse Vermiculite (SunGro Horticulture, Bellevue, WA) to brown pasture soil (American Stone and Turf, College Station, TX). All subsequent generations of Hegari/80M for fine mapping were grown in similar conditions. Greenhouse-grown plants were watered as needed and fertilized every two weeks using Peters general purpose 20-20-20 (Scotts Professional).
For circadian gene expression experiments, 100M and 80M genotypes were planted in Metro-Mix 900 (Sungro Agriculture) in 6 L pots, and thinned to 3 plants/pot after 2 weeks. Plants were grown in the greenhouse under 14 h days until 30 days after planting (DAP). After 30 days, the plants were moved into growth chambers and allowed to acclimate for 3 days. The growth chamber was set to 30˚C and 14/10h Light/Dark (L/D) for the 3 days of entrainment and the first 24 h of tissue collection. The lights were changed to constant light for the second 24 h of tissue collection.

QTL mapping and multiple-QTL analysis
DNA was extracted from leaf tissue of all individuals described above following the FastDNA Spin Kit manual (MP Biomedicals). All individuals in each mapping or heterogeneous inbred family (HIF) population were genotyped by Digital Genotyping using the restriction enzyme FseI as described in Morishige et al. [37]. DNA fragments were sequenced using the Illumina GAII platform and the reads were mapped back to the sorghum reference genome (v1.0, Phytozome v6). Genetic maps were created using MapMaker 3.0B with the Kosambi function [38]. QTL were mapped using WinQTLCartographer (v2.5.010) using composite interval mapping with a 1.0 cM walk speed and forward and backward model selection [39]. The threshold was set using 1000 permutations and α = 0.05. Upon release of v3.1 of the sorghum reference genome, the QTL coordinates were updated [40].
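As an illustration of the Kosambi mapping function mentioned above, the following minimal sketch converts a recombination fraction into a map distance in cM and back. It uses the standard textbook form of the Kosambi function; it is not MapMaker's internal code, and the function names are hypothetical.

```python
import math


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # d (Morgans) = 1/4 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM.
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse Kosambi function: recombination fraction for a distance in cM."""
    # r = 1/2 * tanh(2d), with d expressed in Morgans.
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)
```

For short intervals such as the 2 cM Ma2 region, the recombination fraction returned by `kosambi_r` is very close to the map distance divided by 100, which is why small intervals can be treated as roughly additive.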
To look for possible gene interactions, multiple-QTL analysis was used in the Hegari/80M F2 population. A single QTL analysis using the EM algorithm initially identified two primary additive QTL which were used to seed model selection. The method of Manichaikul et al. [41] was employed for model selection as implemented in R/qtl for multiple-QTL analysis [42]. Computational resources on the WSGI cluster at Texas A&M were used to calculate the penalties for main effects, heavy interactions, and light interactions. These penalties were calculated from 24,000 permutations for flowering time to find a significance level of 5% in the context of a two-dimensional, two-genome scan.
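The role of the three penalties in model selection can be sketched as a penalized LOD score, in which a model's LOD is reduced by one penalty per main effect and per interaction. The sketch below is a simplified illustration and not the R/qtl implementation: the penalty values and model LOD scores are hypothetical, and the rules in [41] for deciding which interactions receive the heavy or the light penalty are not reproduced, so the counts are supplied by the caller.

```python
def penalized_lod(lod, n_main, n_heavy, n_light, penalties):
    """Penalized LOD: model LOD minus a penalty for each term in the model.

    penalties is a (main, heavy_interaction, light_interaction) tuple.
    """
    t_main, t_heavy, t_light = penalties
    return lod - t_main * n_main - t_heavy * n_heavy - t_light * n_light


# Hypothetical penalties and candidate models, for illustration only.
PENALTIES = (3.0, 5.0, 1.5)
candidates = {
    "two additive QTL": penalized_lod(16.0, 2, 0, 0, PENALTIES),
    "two QTL + interaction": penalized_lod(20.0, 2, 1, 0, PENALTIES),
}
# The model with the highest penalized LOD is preferred.
best_model = max(candidates, key=candidates.get)
```

With these made-up numbers the interaction does not raise the LOD enough to offset its penalty, so the additive model is retained; a larger LOD gain would reverse the choice.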

Fine mapping of the Ma2 QTL
All fine mapping populations for the Ma2 QTL were derived from F2 individuals from the Hegari/80M population. The genetic distance spanning the Ma2 locus is 2 cM corresponding to a physical distance of ~1.8 Mbp, so 1000 progeny would be required to obtain 20 recombinants within the Ma2 QTL region. Six individuals that were heterozygous across the Ma2 QTL were self-pollinated to generate six HIFs totaling 1000 F3 individuals. These individuals were grown out in the greenhouse, and flowering time was recorded. They were genotyped by Digital Genotyping as described above [37]. Two F3 individuals that had useful breakpoints with a heterozygous genotype on one side of the breakpoint were grown and self-pollinated to generate an additional round of HIFs (F4, n = 150) that were planted in the spring of 2013 and analyzed as described above. No new breakpoints were identified in the F4 generation, so this process was repeated again to generate F5 plants in the spring of 2014.
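The population-size estimate above follows from treating 1 cM as roughly 1% recombinant progeny. The short sketch below reproduces that arithmetic; it is a first-order approximation that mirrors the calculation in the text and does not model the two meioses contributing to each selfed plant. The function names are hypothetical.

```python
import math


def expected_recombinants(n_progeny: int, interval_cm: float) -> float:
    """Expected number of recombinants, treating 1 cM as 1% recombinant progeny."""
    return n_progeny * interval_cm / 100.0


def progeny_needed(target_recombinants: int, interval_cm: float) -> int:
    """Progeny required to expect a target number of recombinants in an interval."""
    return math.ceil(target_recombinants * 100.0 / interval_cm)


# Values from the text: a 2 cM interval spanning ~1.8 Mbp.
recombinants = expected_recombinants(1000, 2.0)
progeny = progeny_needed(20, 2.0)
kb_per_cm = 1800.0 / 2.0  # ~900 kb of sequence per cM in this region
```

The resulting ~900 kb per cM indicates that each recombination event resolves the locus only coarsely, which is consistent with the need for several rounds of HIFs.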

Circadian gene expression analysis
For the circadian gene expression analysis, 30-day-old plants were placed in a growth chamber set to 14h/10h L/D for the first 24 h and constant light for the second 24 h at 30˚C. Plants were entrained for 3 d under these growth chamber conditions before beginning tissue collection. Leaf tissue was collected and pooled from 3 plants every 3 h for 48 h. The first sample was taken at lights-on on the first day of sample collection. The experiment was repeated three times for a total of three biological replicates. RNA was extracted from each sample using the Direct-Zol RNA Miniprep Kit (Zymo Research) according to the kit instructions. cDNA was synthesized using SuperScript III kit for qRT-PCR (Invitrogen) according to the kit instructions. Primers for sorghum flowering pathway genes were developed previously, and primer sequences are available in Murphy et al. [21]. Primer sequences for Ma2 are available in S1 Table. Relative expression was determined using the comparative cycle threshold (Ct) method. Raw Ct values for each sample were normalized to Ct values for the reference gene SbUBC (Sobic.001G526600). Reference gene stability was determined previously [43]. ΔΔCt values were calculated relative to the sample with the highest expression (lowest Ct value). Relative expression values were calculated with the 2^-ΔΔCt method [44]. Primer specificity was tested by dissociation curve analysis and gel electrophoresis of qRT-PCR products.
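The comparative Ct calculation described above can be sketched as follows. This is a minimal illustration, assuming one target-gene Ct and one reference-gene (SbUBC) Ct per sample; the calibrator is taken to be the sample with the lowest normalized ΔCt, and the Ct values and sample labels are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression by the 2^-ddCt method.

    ct_target and ct_reference map sample name -> raw Ct value.
    The calibrator is the sample with the highest expression (lowest dCt).
    """
    # dCt: target Ct normalized to the reference gene in the same sample.
    delta_ct = {s: ct_target[s] - ct_reference[s] for s in ct_target}
    calibrator = min(delta_ct.values())
    # ddCt = dCt(sample) - dCt(calibrator); expression = 2^-ddCt.
    return {s: 2.0 ** -(d - calibrator) for s, d in delta_ct.items()}


# Hypothetical Ct values for three time points (not data from this study).
target = {"ZT0": 24.0, "ZT3": 26.0, "ZT6": 25.0}
reference = {"ZT0": 20.0, "ZT3": 20.0, "ZT6": 21.0}
expression = relative_expression(target, reference)
```

Because the calibrator is the most highly expressed sample, all values fall between 0 and 1, with the peak time point scaled to 1.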

Ma2 phylogenetic analysis
Protein sequences of the closest homologs of Ma2 were identified using BLAST analysis. Protein sequences were aligned using MUSCLE [45] and visualized using Jalview [46]. Evolutionary trees were inferred using the Neighbor-Joining method [47] in MEGA7 [48]. All positions containing gaps and missing data were eliminated.

Ma2 DNA sequencing and whole genome sequence analysis
Whole genome sequence reads of 52 sorghum genotypes including 100M and 80M were obtained from Phytozome v12. Base quality score recalibration, INDEL realignment, duplicate removal, joint variant calling, and variant quality score recalibration were performed using GATK v3.3 with the RIG workflow [49]. Sobic.002G302700 was sequenced via Sanger sequencing in the genotypes in Table 1 according to the BigDye Terminator Kit (Applied Biosystems). Primers for template amplification and sequencing are provided in S1 Table.

Effects of Ma2 alleles on flowering pathway gene expression
The recessive ma2 allele in 80M (Ma1 ma2 Ma3 Ma4 Ma5 ma6) was previously reported to cause 80M to flower earlier than 100M (Ma1 Ma2 Ma3 Ma4 Ma5 ma6) in long days [28]. To help elucidate how Ma2 modifies flowering time, we investigated the impact of Ma2 alleles on the expression of genes in sorghum's flowering time pathway. Gene expression was analyzed by qRT-PCR using RNA isolated from 100M (Ma2) and 80M (ma2) leaves collected every 3 hours for one 14h light/10h dark cycle and a second 24-hour period of constant light.
SbPRR37 is a central regulator of photoperiod sensitive flowering in sorghum that acts by repressing the expression of SbCN (FT-like) genes in LD [21]. SbPRR37 expression in 100M and 80M grown in long days peaked in the morning and again in the evening as previously observed [21] (Fig 1). The amplitude of both peaks of SbPRR37 expression was reduced in 80M (ma2) compared to 100M (Ma2) (Fig 1A). SbCO also shows peaks of expression in the morning (dawn) and in the evening (~14h) in plants grown in LD [21] (Fig 1B). Analysis of SbCO expression in 100M and 80M showed that both peaks of SbCO expression were reduced in 80M compared to 100M (Fig 1B).
SbCN8, SbCN12, and SbCN15 are homologs of AtFT that encode florigens in sorghum [22]. Expression of SbCN8 and SbCN12 increases when sorghum plants are shifted from LD to SD, whereas SbCN15 is expressed at lower levels and shows minimal response to day length [21,26]. SbPRR37 and SbCO are co-repressors of the expression of SbCN8 and SbCN12 in long days; therefore, the influence of Ma2 alleles on SbCN8/12/15 expression was investigated [21,27]. When plants were grown in long days, expression of SbCN12 was ~5-fold higher in 80M compared to 100M, consistent with earlier flowering in 80M (Fig 2). Previous studies showed that SbGHD7 represses SbEHD1 expression and that alleles of SbGHD7 differentially affect SbCN8 expression (>SbCN12) [26]. Analysis of SbEHD1 and SbGHD7 expression in 100M and 80M showed that Ma2 alleles have a limited influence on the expression of these genes (S1 Fig). The timing of the two daily peaks of SbPRR37 and SbCO expression in sorghum is regulated by the circadian clock [21,26]. Therefore, it was possible that Ma2 modifies SbPRR37/SbCO expression by altering clock gene expression. However, expression of the clock genes TOC1 and LHY was similar in 100M and 80M (S1 Fig). Taken together, these results show that Ma2 is an activator of SbPRR37 and SbCO expression in long days. Prior studies showed that coexpression of SbPRR37 and SbCO in long days inhibits expression of SbCN12 and floral initiation [27]. Later flowering in sorghum genotypes that are Ma1 Ma2 vs. Ma1 ma2 in long days is consistent with lower SbCN12 expression in Ma1 Ma2 genotypes.

Genetic analysis of Ma2 and Ma4
An F2 population derived from a cross of 100M (Ma2) and 80M (ma2) was generated to map the Ma2 locus. Because 100M and 80M are nearly isogenic lines that differ at Ma2, only Ma2 alleles were expected to affect flowering time in this population [28]. The F2 population (n = 1100) segregated for flowering time in a 3:1 ratio as expected. The parental lines and F2 individuals were genotyped by Digital Genotyping (DG) which identifies single nucleotide polymorphism (SNP) markers in thousands of sequenced sites that distinguish the parents of a population [37]. The near isogenic nature of the parental lines resulted in a very sparse genetic map that lacked coverage of large regions of the sorghum genome including all of the long arm of SBI02. In retrospect, no Ma2 QTL for flowering time was identified using this genetic map because the gene is located on the long arm of SBI02 (see below).
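A 3:1 segregation ratio of this kind is commonly checked with a chi-square goodness-of-fit test. The sketch below shows the calculation for a single dominant locus; the class counts are hypothetical (the text reports only the population size and the ratio), and 3.841 is the standard critical value for one degree of freedom at α = 0.05.

```python
CRITICAL_DF1_ALPHA_05 = 3.841  # chi-square critical value, df = 1, alpha = 0.05


def chi_square_3_to_1(n_late: int, n_early: int) -> float:
    """Chi-square statistic for a 3:1 (late:early) segregation hypothesis."""
    total = n_late + n_early
    expected = (0.75 * total, 0.25 * total)
    observed = (n_late, n_early)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts summing to n = 1100 (not data from this study).
statistic = chi_square_3_to_1(800, 300)
fits_3_to_1 = statistic < CRITICAL_DF1_ALPHA_05
```

A statistic below the critical value means the observed counts do not differ significantly from the 3:1 expectation for a single segregating locus.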
To overcome the lack of DNA markers associated with the 80M/100M population, a second mapping population was created to identify the genetic locus associated with Ma2. An F2 population (n = 215) that would segregate for Ma2 and Ma4 was constructed by crossing Hegari (Ma1 Ma2 Ma3 ma4 Ma5 ma6) and 80M (Ma1 ma2 Ma3 Ma4 Ma5 ma6) [30,50]. The population was grown in a greenhouse under long day conditions and phenotyped for days to flowering. QTL for flowering time were identified on SBI02 and SBI10 (Fig 3). Recessive alleles of Ma2 and Ma4 result in earlier flowering when plants are grown in long days. The Hegari haplotype across the QTL on SBI10 was associated with early flowering; therefore, this QTL corresponds to Ma4 (S2 Fig). The 80M haplotype across the QTL on SBI02 was associated with early flowering; therefore, the QTL on SBI02 was assigned to Ma2.

Epistatic interactions between Ma2 and Ma4
Previous studies indicated that an epistatic interaction exists between Ma2 and Ma4 [28]. Therefore, Multiple QTL Mapping (MQM) analysis [51] was employed, using data from the Hegari/80M F2 population, to identify additional flowering time QTL and interactions amongst the QTL as previously described [52]. MQM analysis identified the QTL for flowering time on SBI02 and SBI10 (Fig 4). The interaction between Ma2 and Ma4 identified by MQM analysis is consistent with previous observations that in a recessive ma4 background flowering is early regardless of allelic variation in Ma2 [28].

Ma2 candidate gene identification
The Hegari/80M F2 population located Ma2 on SBI02 between 67.3 Mbp and 69.1 Mbp (Fig 5). To further delimit the Ma2 locus, six lines from the Hegari/80M population that were heterozygous across the Ma2 QTL but fixed across the Ma4 locus (Ma4 Ma4) were selfed to create heterogeneous inbred families (HIFs) (n = 1000 F3 plants) [53]. Analysis of these HIFs narrowed the region encoding Ma2 to ~600 kb (67.72 Mb-68.33 Mb) (Fig 5). Genotypes that were still heterozygous across the delimited locus were selfed and 100 F4 plants were evaluated for differences in flowering time. This process narrowed the Ma2 locus to a region spanning ~500 kb containing 76 genes (67.72 Mb-68.22 Mb) (Fig 5, S2 Table).
The low rate of recombination across the Ma2 locus led us to utilize whole genome sequencing in conjunction with fine mapping to identify a candidate gene for Ma2. Since 100M and 80M are near isogenic lines that have very few sequence differences along the long arm of SBI02 where the Ma2 QTL is located, whole genome sequences (WGS) of 100M and 80M were generated in collaboration with JGI (sequences available at https://phytozome.jgi.doe.gov). The genome sequences were scanned for polymorphisms within the 500 kb locus spanning Ma2. Only one T → A single nucleotide polymorphism (SNP), located in Sobic.002G302700, was identified that distinguished 100M and 80M within the region spanning the Ma2 locus. The T → A mutation causes a Lys141* (premature stop) change in the third exon, resulting in a truncated protein.
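The scan for polymorphisms within the fine-mapped interval amounts to filtering called variants by position and keeping those at which the two parents differ. The sketch below illustrates that filter on a small in-memory list; the chromosome label, variant positions, and genotype calls are hypothetical placeholders and not the actual calls from the 100M and 80M genomes.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    genotypes: Dict[str, str]  # sample name -> allele carried


def distinguishing_variants(variants: List[Variant], chrom: str, start: int,
                            end: int, sample_a: str, sample_b: str) -> List[Variant]:
    """Variants inside [start, end] on chrom at which the two samples differ."""
    return [
        v for v in variants
        if v.chrom == chrom
        and start <= v.pos <= end
        and v.genotypes[sample_a] != v.genotypes[sample_b]
    ]


# Hypothetical variant calls; positions are illustrative only.
calls = [
    Variant("Chr02", 67_500_000, "G", "C", {"100M": "G", "80M": "C"}),  # outside interval
    Variant("Chr02", 67_900_000, "C", "T", {"100M": "T", "80M": "T"}),  # shared by both
    Variant("Chr02", 68_000_000, "T", "A", {"100M": "T", "80M": "A"}),  # differs in interval
]
# Fine-mapped interval from the text: 67.72 Mb to 68.22 Mb on SBI02.
hits = distinguishing_variants(calls, "Chr02", 67_720_000, 68_220_000, "100M", "80M")
```

In near isogenic parents most variants in the interval are shared, so a filter of this kind is expected to return very few candidates.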
A 500 bp DNA sequence spanning the T → A polymorphism in Sobic.002G302700 was sequenced from 80M and 100M to confirm the SNP identified by comparison of the whole genome sequences (Table 1). The T → A point mutation was present in 80M (ma2) whereas 100M (Ma2) encoded a functional version of Sobic.002G302700 that encodes a full length protein. Since this mutation was the only sequence variant between 100M and 80M in the fine-mapped locus, Sobic.002G302700 was identified as the best candidate gene for Ma2.
Sobic.002G302700 is annotated as a SET (Suppressor of variegation, Enhancer of Zeste, Trithorax) and MYND (Myeloid-Nervy-DEAF1) (SMYD) domain-containing protein. SMYD domain family proteins in humans have been found to methylate histone lysines and non-histone targets and have roles in regulating chromatin state, transcription, signal transduction, and cell cycling [54,55]. The SET domain in SMYD-containing proteins is composed of two sub-domains that are divided by the MYND zinc-finger domain. The SET domain includes conserved sequences involved in methyltransferase activity including nine cysteine residues that are present in the protein encoded by Sobic.002G302700 (Fig 6) [56]. The MYND domain is involved in binding DNA and is enriched in cysteine and histidine residues [57]. Protein sequence alignment of Sobic.002G302700 homologs revealed that the SMYD protein candidate for Ma2 is highly conserved across flowering plants (Fig 6).
To learn more about Ma2 regulation, the expression of Sobic.002G302700 in 100M and 80M was characterized during a 48 h L:D/L:L cycle. Ma2 showed a small increase in expression from morning to evening and somewhat higher expression in 100M compared to 80M during the evening (S3 Fig).

Distribution of Ma2 alleles in the sorghum germplasm
Recessive ma2 was originally found in the Milo background and used to construct Double Dwarf Yellow Milo (Ma1 ma2 ma3 Ma4 Ma5 ma6) [28]. Double Dwarf Yellow Milo was crossed to Early White Milo (ma1 Ma2 Ma3 Ma4 Ma5 ma6) and the progeny selected to create 100M, 80M and the other Milo maturity standards [1,28,58]. Several of the Milo maturity standards were recorded as recessive for Ma2 (80M, 60M, SM80, SM60, 44M, 38M) and others as Ma2 dominant (100M, 90M, SM100, SM90, 52M). In order to confirm the Ma2 genotype of the maturity standards, the 500 bp sequence spanning the Lys141* mutation in Sobic.002G302700 was obtained from most of these genotypes (Table 1). Kalo was also identified as carrying a recessive allele of Ma2. Kalo was derived from a cross of Dwarf Yellow Milo (ma2), Pink Kafir (Ma2), and CI432 (Ma2); therefore, it was concluded that DYM is the likely source of recessive ma2 [28]. Sequence analysis showed that the genotypes previously identified as ma2 including Kalo, 80M, SM80, 60M, 44M, and 38M carry the recessive mutation in Sobic.002G302700 identified in 80M. 100M, SM100, and Hegari, which were identified as Ma2, did not contain the mutated version of Sobic.002G302700 (Table 1). Additionally, sequences of Ma2 from 52 sorghum genotypes with publicly available genome sequences were compared [40]. Sobic.002G302700 was predicted to encode functional proteins in all except one of these sorghum genotypes. A possible second recessive Ma2 allele was found in IS3614-2 corresponding to an M83T missense mutation that was predicted to be deleterious by PROVEAN [59].

Discussion
In photoperiod sensitive sorghum genotypes, following the vegetative juvenile phase, day length has the greatest impact on flowering time under normal growing conditions. Molecular identification of the genes corresponding to Ma1, Ma3, Ma5 and Ma6 and other genes in the sorghum flowering time pathway (i.e., SbCO, SbEHD1, SbCN8/12) and an understanding of their regulation by photoperiod and the circadian clock led to the model of the flowering time pathway shown in Fig 7 [60]. The current study showed that Ma2 represses flowering in long days by increasing the expression of SbPRR37 (Ma1) and SbCO. The study also located QTL for Ma2 and Ma4, confirmed an epistatic interaction between Ma2 and Ma4, and identified a candidate gene for Ma2.
In the current study, two near isogenic Milo maturity genotypes, 100M (Ma2) and 80M (ma2), were used to characterize how allelic variation in Ma2 affects the expression of genes in the sorghum photoperiod regulated flowering time pathway. This analysis showed that mutation of Ma2 significantly reduced the amplitude of the morning and evening peaks of SbPRR37 and SbCO expression without altering the timing of their expression. In parallel, the expression of SbCN12 (FT-like) increased 8-fold in leaves of 80M compared to 100M, consistent with prior studies showing that 80M (ma2) flowers earlier than 100M (Ma2) in long days [28]. In contrast, expression of clock genes (TOC1, LHY) and other genes (i.e., GHD7, EHD1) in the photoperiod regulated flowering time pathway was modified to only a small extent by allelic variation in Ma2. Based on these results, we tentatively place Ma2 in the flowering time pathway downstream of the light sensing phytochromes and circadian clock and identify Ma2 as a factor that enhances SbPRR37 and SbCO expression (Fig 7).
The differential increase in SbCN12 expression in 80M (vs. 100M) is consistent with inhibition of SbCN12 expression in long days by the concerted action of SbPRR37 and SbCO [27]. Genetic studies showed that floral repression mediated by SbPRR37 requires SbCO as a corepressor [27]. Therefore, enhanced expression of both SbPRR37 (Ma1) and SbCO by Ma2 in Ma1 Ma2 genotypes in long days is consistent with delayed flowering under these conditions relative to genotypes such as 80M that are Ma1 ma2. Molecular genetic studies also showed that SbCO is an activator of SbCN12 expression and flowering in ma1 genetic backgrounds [27]. This is consistent with the observation that ma1 Ma2 genotypes flower earlier than ma1 ma2 genotypes when grown in long days [28].

Interactions between Ma2 and Ma4
Multiple QTL (MQM) analysis of results from the population derived from Hegari/80M identified an interaction between Ma2 and Ma4 as well as one additional flowering QTL on SBI09. Flowering time QTL on SBI09 have been identified in other mapping populations, but the gene(s) involved have not been identified [33,34]. The interaction between Ma2 and Ma4 confirmed previous observations that recessive ma4 causes accelerated flowering in long days in Ma1 Ma2 genotypes [28]. Interestingly, the influence of Ma2 and Ma4 alleles on flowering time is affected by temperature [28,61]. The influence of temperature on flowering time pathway gene expression in 80M and 100M in the current study was minimized by growing plants at constant 30˚C. Further analysis of the temperature dependence of Ma2 and Ma4 on flowering time may help elucidate interactions between photoperiod and flowering time that have been previously documented [28,62]. Positional cloning of Ma4 is underway to better understand the molecular basis of Ma2 and Ma4 interaction and their impact on flowering time.

Identification of a candidate gene for Ma2
A mapping population derived from Hegari/80M that segregated for Ma2 and Ma4 enabled localization of the corresponding flowering time QTL in the sorghum genome (SBI02, Ma2; SBI10, Ma4). The Ma2 QTL on SBI02 was fine-mapped using heterogeneous inbred families (HIFs) from Hegari/80M. Identification of a candidate gene for Ma2 was subsequently aided by comparison of genome sequences from the closely related 80M and 100M genotypes [28]. A scan of the whole genome sequences of 100M and 80M identified only a single T to A mutation in the 500 kb region spanning the fine-mapped Ma2 locus that caused a Lys141* change in the third exon of Sobic.002G302700 resulting in protein truncation. Based on this information Sobic.002G302700 was tentatively identified as the best candidate gene for Ma2.
Sobic.002G302700 encodes a SET (Suppressor of variegation, Enhancer of Zeste, Trithorax) and MYND (Myeloid-Nervy-DEAF1) (SMYD) domain containing protein. In humans, SMYD proteins act as lysine methyltransferases, and the SET domain is critical to this activity. Therefore, Ma2 could be altering the expression of SbPRR37 and SbCO by modifying histones associated with these genes. The involvement of this SMYD family protein in flowering in sorghum, together with the identification of highly conserved homologs in other plant species, suggests that Ma2 may correspond to a novel regulator of sorghum flowering. While a role for SMYD proteins (lysine methyltransferases) as regulators of flowering time has not been previously reported, genes encoding histone lysine demethylases (i.e., JMJ30/32) have been found to regulate temperature modulated flowering time in Arabidopsis [63].
J.R. Quinby [50] identified only one recessive allele of Ma2 among the sorghum genotypes used in the Texas sorghum breeding program. The maturity standard lines including 80M that are recessive for Ma2 and the genotype Kalo were reported to be derived from the same recessive ma2 Milo genotype [28]. To confirm this, Ma2 alleles in the relevant maturity standards and Kalo were sequenced, confirming that all of these ma2 genotypes carried the same mutation identified in 80M (Table 1). Among the 52 sorghum genotypes with available whole genome sequences, only 80M carried the mutation in Ma2 [40]. One possible additional allele of ma2 was identified in IS3614-2, which contained an M83T missense mutation that was predicted to be deleterious to protein function by PROVEAN [59].
In conclusion, we have shown that Ma2 represses flowering in long days by promoting the expression of the long day floral co-repressors SbPRR37 and SbCO (Fig 7). Sobic.002G302700 was identified as the best candidate for the sorghum Maturity locus Ma2. Further validation such as targeted mutation of Sobic.002G302700 in a Ma1 Ma2 sorghum genotype or complementation of Ma1 ma2 genotypes will be required to confirm this gene assignment. The identification of this gene and its interaction with Ma4 help elucidate an additional module of the photoperiod flowering regulation pathway in sorghum.