Natural haplotypes of FLM non-coding sequences fine-tune flowering time in ambient spring temperatures in Arabidopsis

Cool ambient temperatures are major cues determining flowering time in spring. The mechanisms promoting or delaying flowering in response to ambient temperature changes are only beginning to be understood. In Arabidopsis thaliana, FLOWERING LOCUS M (FLM) regulates flowering in the ambient temperature range and FLM is transcribed and alternatively spliced in a temperature-dependent manner. We identify polymorphic promoter and intronic sequences required for FLM expression and splicing. In transgenic experiments covering 69% of the available sequence variation in two distinct sites, we show that variation in the abundance of the FLM-ß splice form strictly correlates (R2 = 0.94) with flowering time over an extended vegetative period. The FLM polymorphisms lead to changes in FLM expression (PRO2+) but may also affect FLM intron 1 splicing (INT6+). This information could serve to buffer the anticipated negative effects on agricultural systems and flowering that may occur during climate change. DOI: http://dx.doi.org/10.7554/eLife.22114.001


Introduction
Plants are sessile organisms that have adapted to their habitats to optimize flowering time and thereby guarantee reproductive success and survival. Temperature is one major cue controlling flowering time, particularly before the winter but also during spring when ambient cool temperatures generally delay and warm temperatures promote flowering. Different molecular pathways controlling flowering in different temperature ranges and environments have been genetically dissected (Capovilla et al., 2015;Verhage et al., 2014).
In many plant species and in many accessions of the plant model species Arabidopsis thaliana (Arabidopsis), the well-studied vernalization pathway prevents premature flowering before the long cold periods of the winter. In Arabidopsis, flowering of winter-annual accessions without vernalization is strongly delayed by the MADS-box transcription factor FLOWERING LOCUS C (FLC) (Michaels and Amasino, 1999, 2001;Johanson et al., 2000). FLC forms a repressor complex through interactions with the MADS-box transcription factor SVP (SHORT VEGETATIVE PHASE) to repress the transcription of the flowering promoting genes FLOWERING LOCUS T (FT) and SUPPRESSOR OF OVEREXPRESSION OF CO1 (SOC1) (Li et al., 2008;Lee et al., 2007). As a result of the prolonged exposure to cold temperatures during winter, FLC abundance is gradually reduced, predominantly through epigenetic mechanisms, and flowering repression is gradually relieved. Substantial natural variation in the vernalization-dependent expression of FLC has already been described and characterized in many Arabidopsis accessions (Coustham et al., 2012;Li et al., 2014).
During spring, ambient temperature is an important climatic factor. Temperature changes of only a few degrees Celsius (˚C) during cold or warm spring periods shift flowering time in many plant species. Since temperature changes associated with global warming could lead to similar flowering changes and threaten agricultural production systems, elucidating the ambient temperature pathway has recently received increased attention (Moore and Lobell, 2015;Wheeler and von Braun, 2013;Jagadish et al., 2016).
The complexities of the ambient temperature flowering pathway are just beginning to be understood (Capovilla et al., 2015;Verhage et al., 2014). After vernalization or in vernalization-insensitive accessions, flowering in ambient temperatures is largely under control of the FLC-related FLOWERING LOCUS M (FLM) and residual FLC retains only a minor role (Gu et al., 2013;Blázquez et al., 2003;Lee et al., 2013). FLM controls flowering in the range between 5˚C and 23˚C and FLM can form, just like FLC, a flowering repressive complex with SVP (Lee et al., 2013;Posé et al., 2013). Besides FLM, the dominant regulator, ambient temperature flowering is also regulated by the FLM-homologs MAF2-MAF4 (Li et al., 2008;Gu et al., 2013;Lee et al., 2013;Ratcliffe et al., 2003;Airoldi et al., 2015). Conversely, loss-of-function mutations of SVP lead to early and temperature-insensitive flowering in the range between 5˚C and 27˚C (Lee et al., 2007, 2013). Within the ambient temperature range, FLM is differentially spliced in a temperature-dependent manner through the alternative use of the exons 2 (FLM-ß) and 3 (FLM-d). Based on transgenic experiments, a model was proposed according to which FLM-ß and FLM-d function antagonistically with SVP. Accordingly, FLM-ß engages in flowering repressive interactions with SVP in cooler temperatures. Conversely, FLM-d would outcompete FLM-ß in warmer temperatures and form heterodimers with SVP unable to bind DNA and repress flowering (Lee et al., 2013;Posé et al., 2013). Recent data suggest that this attractive model, based solely on transgenic experiments, may not be valid in natural contexts (Sureshkumar et al., 2016;Lutz et al., 2015).
As yet, it has remained largely unknown whether genetic variation of the FLM gene locus plays a role in flowering time regulation across Arabidopsis accessions and if so, which variations determine basal and temperature-dependent FLM expression and splicing. Although two FLM deletion alleles conferring temperature-insensitive early flowering were identified in the Arabidopsis accessions Niederzenz-1 (Nd-1) and Eifel-6 (Ei-6), their limited demographic and genetic spread indicated that FLM deletions may be disadvantageous (Werner et al., 2005). A first FLM expression variant was determined from the early flowering accession Killian-0 (Kil-0). In Kil-0, a LINE retrotransposon insertion in FLM intron 1 caused premature transcription termination and aberrant FLM splicing, and consequently reduced FLM expression and earlier flowering. This phenotype was especially prominent at a temperature of 15˚C, which is closer to the average temperature in the native range of the species than the commonly used 21˚C (Lutz et al., 2015;Hoffmann, 2002;Weigel, 2012). The subsequent identification of nine further accessions with an identical LINE insertion was suggestive of a recent adaptive selective sweep (Lutz et al., 2015). It was thus concluded that FLM expression-modulating alleles are advantageous for flowering time adaptation in the ambient temperature range, particularly since there are no other pleiotropic growth changes observed in FLM mutant alleles.
Previous work had shown that an intronless FLM cDNA expressed from a FLM promoter fragment was unable to rescue the flm mutant phenotype. Since this suggested that important information for FLM expression may reside in FLM non-coding sequences, we examined non-coding sequence polymorphisms with a potential role in controlling FLM expression and splicing. Using phylogenetic footprinting, we found conserved FLM promoter and intron 1 regions essential for FLM expression. By association analysis using polymorphism data from more than 800 Arabidopsis accessions, we identified a small polymorphic region in the FLM promoter and a highly polymorphic nucleotide triplet in FLM intron 6 controlling basal and temperature-dependent FLM expression. Small changes in the relative abundance of the FLM-ß splice variant dynamically modulated flowering at 15˚C over a range of 15 leaves. When tested in a homogenous genetic background, FLM abundance correlated almost perfectly with flowering time (R2 = 0.94) and contributed (R2 = 0.21) to flowering time variation in heterogeneous natural Arabidopsis populations. Our data suggest that FLM-ß is an important determinant of flowering during cool or warm spring periods.

Results
Intronic sequences are required for basal and temperature-sensitive FLM expression
Temperature-dependent changes in FLM abundance and alternative splicing are critical for flowering time control in ambient temperatures in Arabidopsis. How FLM expression is regulated and whether Arabidopsis accessions have employed differential FLM expression and splicing to adapt to different temperature climates remained to be shown. We previously found that Arabidopsis transgenic lines expressing an intronless FLM from a functional FLM promoter fragment cannot express FLM to detectable levels and fail to rescue the flowering time phenotype of a flm-3 loss-of-function mutant (Lutz et al., 2015). We subsequently compared FLM expression in transgenic lines expressing a genomic FLM fragment, obtained from the Columbia-0 (Col-0) wild type, containing all six introns (pFLM::gFLM; FLM Col-0 ) with lines expressing the FLM-ß (pFLM::FLM-ß) or FLM-d (pFLM::FLM-d) splice variants retaining intron 1 but lacking all other introns (Figure 1A). The presence of intron 1 was sufficient to restore the basal expression of FLM at the ambient temperatures 15˚C, 21˚C, and 27˚C, but introns 2-6 were required for the temperature-sensitive regulation of the splice variants FLM-ß and FLM-d (Figure 1B). We concluded that intron 1 may contain critical information for FLM expression and introns 2-6 may contribute to temperature-dependent FLM expression.

Phylogenetic footprinting pinpoints essential regions for basal expression in the promoter and the first intron
To identify non-coding regions important for FLM expression, we performed multiple sequence alignments of Arabidopsis FLM with its closest sequence homologue MAF3 and FLM homologues from five other Brassicaceae species. These alignments identified one promoter region of 250 bp and two intron 1 regions of 373 and 101 bp with increased sequence conservation (>60%) (Figure 2A and Figure 2-figure supplement 1). We then generated PRO D225bp , INT1 D373bp , and INT1 D101bp transgenic lines expressing pFLM::gFLM variants (FLM Col-0 ) with deletions of the three regions in the FLM deletion accession Nd-1 to measure the effects on FLM-ß and FLM-d expression at 15˚C and 23˚C (Figure 2B and Figure 2-figure supplement 2) (Werner et al., 2005). To normalize for variability between the transgenic lines, we examined pools of independent T2 segregating lines (n = 21-34). We validated this pooling strategy by demonstrating that the established behaviour of FLM in the Col-0 and Kil-0 accessions could be faithfully recapitulated when performing equivalent analyses with T2 lines expressing gFLM Col-0 and gFLM Col-0 bearing the Kil-0 LINE insertion (Lutz et al., 2015).

Non-coding sequence variation of major FLM haplotypes influences FLM expression
To find further non-coding determinants for FLM expression, we analysed FLM nucleotide variation in 776 sequenced Arabidopsis accessions (The 1001 Genomes Consortium, 2016). We identified 45 promoter and intronic SNPs with a minor allele frequency (MAF) ≥5% and used these to define ten major haplotypes (H1-H10) representing 379 (49%) accessions (Figure 3-figure supplement 1A,B; Supplementary file 1). We defined an initial set with 41 accessions by selecting five to twelve accessions from six of the ten haplotype groups (Figure 3A). Since introns 2-6 seemed important for temperature-sensitive FLM expression, we added 11 accessions with varying intron 2-6 haplotypes (HI2-6) but identical intron 1 haplotype (HI1; Figure 3A and Figure 3-figure supplement 1D-G). Finally, we added Col-0 (H3) and Kil-0 (H1) due to their well characterized FLM regulation to obtain a representative FLM haplotype set with ultimately 54 accessions (Supplementary file 2). We confirmed, by analytical PCR, that none of these accessions, except Kil-0, carried the previously described intron 1 LINE insertion (Lutz et al., 2015).
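The selection of common variants and their grouping into haplotypes can be illustrated with a minimal Python sketch. It assumes that the minor allele at a site is the second most common call and that a haplotype is simply the string of alleles across the retained sites; the function names and the toy genotypes are hypothetical and do not come from the study.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele at one site.

    `alleles` holds one call per accession; missing calls (None) are ignored.
    """
    called = [a for a in alleles if a is not None]
    counts = Counter(called).most_common()
    if len(counts) < 2:
        return 0.0  # monomorphic site
    return counts[1][1] / len(called)


def filter_sites(genotypes, threshold=0.05):
    """Names of sites whose minor allele frequency is at least `threshold`."""
    return [site for site, alleles in genotypes.items()
            if minor_allele_frequency(alleles) >= threshold]


def haplotypes(genotypes, sites, n_accessions):
    """Group accession indices by their allele string across `sites`."""
    groups = {}
    for i in range(n_accessions):
        key = "".join(genotypes[s][i] or "N" for s in sites)
        groups.setdefault(key, []).append(i)
    return groups


# Hypothetical toy data: 3 sites x 20 accessions (not from the study).
toy = {
    "site_a": ["A"] * 15 + ["C"] * 5,   # MAF 0.25, retained
    "site_b": ["T"] * 20,               # monomorphic, dropped
    "site_c": ["G"] * 10 + ["T"] * 10,  # MAF 0.50, retained
}
kept = filter_sites(toy)
groups = haplotypes(toy, kept, 20)
```

With real data, each site would carry one call per sequenced accession, and the largest groups would correspond to the major haplotypes.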
To find polymorphisms regulating FLM expression and splicing, we performed a genotype-phenotype association analysis. Since the vernalization pathway strongly delayed flowering in 24 of the 54 accessions and consequently suppressed FLM effects, we used FLM expression as a phenotype for the association analysis (Figure 3-figure supplement 2A). To ascertain that FLM expression was not affected by FLC, we examined FLM in non-vernalized wild type Col-0 and flc-3 mutants as well as in Col-0 carrying a functional vernalization module (Michaels and Amasino, 1999). Consistent with previous reports, we did not detect an influence of FLC on FLM transcript abundance in our conditions (Figure 3-figure supplement 2B) (Scortecci et al., 2001;Ratcliffe et al., 2001). We then obtained FLM expression data from the FLM haplotype set and measured total FLM transcript levels as well as FLM-ß and FLM-d levels at 15˚C or 23˚C (Supplementary file 3). In line with the reported behaviour of FLM in Col-0, we observed that FLM-ß expression decreased (on average 0.6 fold) and FLM-d increased (on average 1.4 fold) with increasing temperature in most accessions (Figure 3B and Supplementary file 3) (Lee et al., 2013;Posé et al., 2013;Lutz et al., 2015). At the same time, we also observed substantial variation in FLM-ß and FLM-d expression between the accessions of the FLM haplotype set, inviting the conclusion that non-coding sequence variation modulates FLM expression (Figure 3B and Supplementary file 3). Since total FLM transcript levels strongly correlated with FLM-ß but not with FLM-d abundance at 15˚C and at 23˚C, it could be suggested that FLM-ß represents the major FLM form among the FLM transcripts.

Association analysis identifies polymorphic sites with potential FLM regulatory functions
Using the FLM haplotype set, we next performed association tests between FLM expression at 15˚C and 23˚C or expression ratios derived from these values and an extended set of 119 polymorphisms.
We decided to investigate the potential role of a single base pair deletion (PRO1 T/-; bp −215) and a genetically slightly linked SNP (PRO2 A/C ; bp −93), because they were both located in the proximal part of the FLM promoter. Further, we investigated three genetically unlinked nucleotides because they were positioned as a highly diverse nucleotide triplet in intron 6 (INT6 A/C-A/C-A/T/C ; bp +3975 to +3977) in an otherwise conserved sequence context. Strikingly, the PRO2 and INT6 sites were directly flanked by or in close proximity to a total of four PolyA motifs of variable length ([A] 7-11 ), one of which represented a so-called CArG-box, a potential binding site for MADS-box transcription factors (Zhang et al., 2016). Two more PolyA motifs resided in introns 3 and 5 (Figure 4D and Figure 4-figure supplement 1A). Since PolyA motifs had been reported to be important for gene expression, we reasoned that these motifs could be relevant for FLM regulation, in isolation or in combination with the highly significant non-coding variations (O'Malley et al., 2016;Horton et al., 2012).
The INT6+ CAA polymorphism confers temperature-insensitive FLM expression and flowering
Introns 2-6 were required for temperature-sensitive FLM expression (Figure 1). When we statistically tested the interaction between genotype and temperature with a multiple linear model, we found that temperature-sensitive FLM-ß regulation was significantly reduced in INT6+ CAA between 15˚C and 23˚C when compared to the FLM Col-0 control variant (p=0.012) (Figures 4E and 6A,B). In line with the prediction, flowering of this variant was indeed less sensitive to temperature changes when tested at 15˚C and 23˚C in homozygous T3 progeny plants and compared to the FLM Col-0 reference (Figure 6C,D). Thus, temperature-independent FLM-ß expression changes correlate with temperature-insensitive flowering in the selected temperature range.
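The genotype-by-temperature interaction tested above can be illustrated with a small Python sketch. For a design with two genotypes and two temperatures, the interaction coefficient of a linear model with both main effects and their interaction equals the difference between the two temperature responses. The sketch computes only this effect size, not the p-value, and all expression values are hypothetical rather than measured data.

```python
from statistics import mean


def temperature_response(expr_15, expr_23):
    """Mean change in expression when moving from 15 C to 23 C."""
    return mean(expr_23) - mean(expr_15)


def interaction_effect(ref_15, ref_23, var_15, var_23):
    """Genotype x temperature interaction for a 2 x 2 design.

    Equals the interaction coefficient of the saturated linear model
    expression ~ genotype + temperature + genotype:temperature.
    A value near zero means both genotypes respond alike to temperature.
    """
    return (temperature_response(var_15, var_23)
            - temperature_response(ref_15, ref_23))


# Hypothetical relative FLM-ß levels (not measured data).
ref_15 = [1.00, 1.10, 0.90]   # reference variant at 15 C
ref_23 = [0.60, 0.65, 0.55]   # reference variant at 23 C
var_15 = [0.50, 0.55, 0.45]   # test variant at 15 C
var_23 = [0.48, 0.52, 0.44]   # test variant at 23 C

effect = interaction_effect(ref_15, ref_23, var_15, var_23)
```

In this toy example the reference drops by 0.40 between 15˚C and 23˚C, whereas the test variant drops by only 0.02, so the interaction effect is positive: the variant is less temperature-sensitive than the reference.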
Differential abundance of FLM splice forms can correlate with changes in FLM transcription or alternative splicing
To understand whether the same or different molecular mechanisms are the basis of altered FLM expression in FLM variants, we estimated FLM transcription in variants with strongly altered abundance of processed FLM transcript by measuring levels of unprocessed FLM pre-mRNA from plants grown at 15˚C. When compared to the levels of processed FLM mRNA, the PRO D225bp and INT1 D373bp variants showed similarly strong reductions of unprocessed pre-mRNA, suggesting that the respective deletion polymorphisms directly affect FLM transcription (Figures 4E,F and 7A). In turn, the INT6+ CAA and PolyA_6xD lines had reduced FLM mRNA levels but, when compared to FLM Col-0 , did not show substantial changes in unprocessed pre-mRNA levels (Figures 4E,F and 7B). Since this indicated that post-transcriptional events may be affected in these variants, we tested for the abundance of differentially polyadenylated splice variants after semi-quantitative 3'-RACE-PCR and sequencing of the cloned PCR products (Figure 7-figure supplement 1A). There, we detected a relative reduction of FLM-ß transcripts in INT6+ CAA and PolyA_6xD that was accompanied by increases in the abundance of two polyadenylated transcripts containing exon 1 and intron 1 (E1I1p) that had already been noted in an earlier publication (Figure 7C,D) (Lutz et al., 2015). Importantly, we did not identify a single FLM-d clone among the 163 sequenced cDNAs. The relative increase in E1I1p transcripts could also be independently confirmed by E1I1p-specific qRT-PCRs and suggested, in summary, that splice site choice at the exon 1-intron 1 junction is changed in the INT6+ CAA and PolyA_6xD alleles (Figure 7-figure supplement 1B,C). In relation to all exon 1-containing transcripts, the overall abundance of these intron 1-containing transcripts was comparatively low (Figure 7-figure supplement 1C).

PRO2+ and INT6+ polymorphisms contribute to global variation of FLM levels
The nine PRO2+ and INT6+ haplotypes tested in transgenic experiments were present in 579 (69%) of all 840 accessions with available genome sequence information (Figure 8-figure supplement 1A). To examine whether these haplotypes explain natural variation of FLM-ß levels in natural accessions, we randomly selected an experimental population of 94 accessions (2 to 14 accessions per haplotype). We found that the PRO2+/INT6+ haplotype significantly affected FLM-ß transcript levels (Figure 8-figure supplement 1D). When we integrated the average values from these natural accessions with the respective values from the transgenic analysis, we detected a positive but not significant correlation (FLM-ß, R2 = 0.13), likely a consequence of the small number of datapoints (Figure 8-figure supplement 1E).
To examine the correlation between FLM-ß expression and flowering time, we determined flowering time of accessions at 15˚C. To avoid strong interference from the vernalization pathway, we selected 27 genetically diverse summer-annual accessions, which initiate flowering without the need of vernalization (Figure 8-figure supplement 2A, Supplementary file 6B). We measured FLM-ß transcript levels and flowering time at 15˚C. As residual FLC transcript in these summer-annual accessions may still affect flowering time, we also determined FLC transcript levels (Coustham et al., 2012;Li et al., 2014;Duncan et al., 2015). Using a multiple linear regression approach, we found that FLM-ß and FLC together explained 32.9% (R2 = 0.329, p=0.0083) of flowering time variation, with FLM-ß significantly explaining a subfraction of 21.0% (R2 = 0.210, p=0.011) and FLC explaining only 11.9%, although not significantly (R2 = 0.119, p>0.05) (Figure 8B and Figure 8-figure supplement 2B). Further, by integration of expression data and flowering time data into a multiple linear model and comparison of the slopes, we found that flowering time responded more strongly to FLM-ß levels in the accessions than in the transgenic variants (p=0.0321) (Figure 5B and Figure 8). Taken together, we concluded that variations in FLM-ß levels account for flowering time in cool ambient spring temperatures in a diverse population of summer-annual Arabidopsis accessions (Figure 9).
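To make the R2 values reported above more concrete, the following Python sketch fits a single-predictor least-squares line and computes the fraction of variance it explains. It is a simplified stand-in for the multiple regression models, which were calculated in R; the function name and the example values are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared), where r_squared is the
    fraction of the variance in y explained by the fitted line.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical data: relative FLM-ß level vs. rosette leaf number.
flm_beta = [0.4, 0.7, 1.0, 1.3, 1.8]
leaves = [14, 19, 21, 27, 30]
slope, intercept, r_squared = linear_fit(flm_beta, leaves)
```

An R2 close to 1 corresponds to the tight relationship seen in the homogeneous transgenic background, whereas lower values, as in the natural accessions, indicate that other factors also contribute to flowering time.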

Discussion
Ambient temperature during spring is a major cue determining flowering time. Cool temperatures generally delay and warm temperatures promote flowering time of Arabidopsis. The FLM locus explains flowering time variation in different ambient temperatures but the underlying genetic bases of FLM-dependent flowering remained largely unclear (Salomé et al., 2011;el-Lithy et al., 2006;O'Neill et al., 2008).
We found that, besides the FLM promoter, intron 1 sequences were also essential for FLM basal expression, and by phylogenetic footprinting we subsequently identified a conserved 373 bp intron 1 region essential for FLM basal expression (Figure 2). Further, through association analyses using genomic sequence information, we uncovered FLM regulatory regions (PRO2+ and INT6+) that control temperature-dependent FLM expression in a haplotype-specific manner (Figures 4 and 6). In all our experiments, FLM-ß highly correlated (R2 = 0.94) with flowering time over a broad vegetative range (15-30 rosette leaves) when determined at 15˚C and tested in a homozygous background, regardless of the type of variant (Figure 5). Our finding that FLM-ß had a stronger effect in the control of flowering time than FLM-d indicates that previously proposed functional models, in which FLM-d had an antagonistic activity to FLM-ß, need to be corrected (Sureshkumar et al., 2016). This is further supported by the fact that our results indicate that FLM-d levels are overall very low and that we did not identify a single FLM-d clone in a directed sequencing approach that identified 53 FLM-ß clones. Our findings thus support more recent studies suggesting that the FLM-d splice variant may be biologically irrelevant (Sureshkumar et al., 2016;Lutz et al., 2015). One of these studies also identified a large number of biologically irrelevant transcripts that could be identified with the primer combination used here for the detection of FLM-d (Sureshkumar et al., 2016). Our data thus support the conclusion that none of these amplification products has a biologically important function (Sureshkumar et al., 2016). Whereas the PRO2+ and INT6+ polymorphisms tested affected FLM transcript abundance, the molecular causes of these changes varied among the different polymorphisms (Figures 4 and 7).
Deletion of a 225 bp promoter region including the PRO2+ polymorphic site was associated with tenfold reduced levels of FLM pre-mRNA, indicating that this promoter region was essential for FLM expression (Figure 7). A twofold relative difference of FLM-ß levels was found when comparing the PRO2+ GATAC and PRO2+ AAACC variants with the lowest and highest FLM-ß expression, respectively, and our linear model would predict a flowering time difference of 14.4 leaves at 15˚C (Figures 4 and 5). The PRO2+ region harbours a predicted MADS-box transcription factor binding site and several instances of auto- or cross-regulation of MADS-box factors have been described (de Folter et al., 2005;Kaufmann et al., 2009;Smaczniak et al., 2012). It can therefore be envisioned that FLM expression is regulated by MADS-box transcription factors and that PRO2+ polymorphisms modulate the efficiency or specificity of these binding events and thereby modulate basal FLM expression.
Similarly, the deletion of a 373 bp fragment in FLM intron 1 resulted in an elevenfold reduction in FLM expression and earlier flowering by 11.5 leaves as experimentally determined. This effect size is much smaller than predicted by the linear model. However, as proposed earlier, this is likely due to a critical lower threshold for FLM-ß to become effective (Figures 4 and 5). Intronic cis-regulatory transcription factor binding sites have been identified in other MADS-box transcription factors and interactions of enhancer and silencer elements that reside in the promoter sequence or the 3'-end of the first intron were reported (Hong et al., 2003;Schauer et al., 2009). Thus, similar mechanisms may govern the expression of FLM at intron 1 sites that may act in isolation or together with binding events at the FLM promoter. Interestingly, natural polymorphisms in intron 6 (INT6+) led to about fourfold differences in the abundance of FLM-ß, which could, as predicted by the linear model, relate to a flowering time delay of 28.8 leaves, suggesting that INT6+ harbours extensive potential to fine-tune flowering (Figures 4 and 5). Importantly, this molecular effect appears to be mediated, in the case of INT6+ CAA , by effects on the splicing efficiency and specificity at the distal intron 1. There, INT6+ CAA promotes the formation of short intron 1-containing transcripts, at the expense of FLM-ß, that are likely subject to degradation by nonsense-mediated decay (Figure 7) (Lutz et al., 2015).
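The two model predictions quoted in this discussion (14.4 leaves for a twofold and 28.8 leaves for a fourfold difference in FLM-ß) are mutually consistent with a constant delay per doubling of FLM-ß. The Python sketch below works through that arithmetic. The slope of 14.4 leaves per doubling and the log2 scaling are assumptions inferred from these two numbers only; they are not stated parameters of the linear model.

```python
import math

# Assumption inferred from the two quoted predictions, not a stated
# model parameter: one doubling of FLM-ß delays flowering by 14.4 leaves.
LEAVES_PER_DOUBLING = 14.4


def predicted_leaf_difference(fold_change,
                              leaves_per_doubling=LEAVES_PER_DOUBLING):
    """Predicted flowering time difference (rosette leaves) for a given
    fold difference in FLM-ß, assuming a log2-linear relationship."""
    return leaves_per_doubling * math.log2(fold_change)


twofold = predicted_leaf_difference(2)    # PRO2+ comparison
fourfold = predicted_leaf_difference(4)   # INT6+ comparison
elevenfold = predicted_leaf_difference(11)  # intron 1 deletion
```

Under the same assumption, the elevenfold reduction caused by the intron 1 deletion would correspond to roughly 50 leaves, far more than the 11.5 leaves observed, which fits the proposed lower threshold for FLM-ß.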
The INT6+ site is directly flanked by a short PolyA motif and such sites represent potential recognition sites of hnRNP (heterogeneous nuclear ribonucleoprotein) splicing factors. When predicted by web-based algorithms, the hnRNP binding pattern at INT6+ indeed depended on the INT6+ haplotype, and the binding preference and activity of these factors, in concert with other splicing or transcriptional regulators, may ultimately be the basis of the splicing changes observed here (Piva et al., 2012;Carrillo Oesterreich et al., 2011;Reddy et al., 2013).
The co-occurrence of PolyA motifs ([A] 7-11 ) with the PRO2+ and the INT6+ sites had attracted our attention (Figure 4-figure supplement 1). Combinatorial deletions of all six FLM PolyA motifs led to gradual decreases in FLM-ß abundance, which could be explained by altered intron 1 splicing (Figure 7). This ultimately resulted in FLM-ß levels below a lower effective threshold in the PolyA_6xD lines and consequently very early flowering (Figure 4-figure supplement 1). This suggested that the PolyA motifs may modulate FLM expression by altering FLM splicing. In support of this conclusion, we found that an extended PolyA motif, as it is present in the INT6+ AAA variant when compared to the Col-0 reference variant INT6+ AAT , correlated with strong increases in FLM-ß but not in FLM-d abundance (Figure 4). PolyA motifs are known to prevent nucleosome binding and changes of chromatin architecture may influence splicing (Reddy et al., 2013;Suter et al., 2000;Wijnker et al., 2013). The knowledge about the underlying molecular mechanisms and the identity and specificity of the splicing regulators in plants is still very limited. We noted with interest, however, that several hnRNPs have a reported role in flowering time regulation in Arabidopsis and wheat (Kippes et al., 2015;Fusaro et al., 2007;Streitner et al., 2012;Xiao et al., 2015).
Figure 9. Model of the proposed role of PRO2+ and INT6+ haplotypes and temperature on FLM-ß abundance and flowering. The abundance of the flowering repressor FLM-ß decreases in response to higher temperature and flowering is consequently accelerated (Figures 1B and 4E) (Lee et al., 2013;Posé et al., 2013;Lutz et al., 2015). Note that previous studies showed an especially prominent effect of FLM in a range from 9˚C to 21˚C (long-day photoperiod) (Lee et al., 2013;Posé et al., 2013;Lutz et al., 2015). At the genetic level, FLM-ß abundance is triggered by the PRO2+ (purple) and/or the INT6+ (pink) haplotype and flowering time correlates with FLM-ß abundance (Figure 4E). Among the PRO2+/INT6+ combinations tested, the Col-0 (grey) reference allele (PRO2+ AAAAC /INT6+ AAT ) showed intermediate FLM-ß levels (Figure 4E). We suggest that changes in flowering time due to changing ambient temperature can be precisely buffered by modifying the PRO2+ and INT6+ regions, as illustrated by similar plant symbols. DOI: 10.7554/eLife.22114.024

The nine PRO2+/INT6+ haplotype combinations included in our transgenic experiments represent 69% of the world-wide PRO2+/INT6+ variation (Figure 8-figure supplement 1A,B). We found that FLM-ß explained around 21% of flowering time variation in a genetically heterogeneous population of summer-annual Arabidopsis accessions (Figure 8). Hence, PRO2+ and INT6+ haplotypes regulate FLM-ß abundance and, in turn, contribute to flowering time regulation in Arabidopsis accessions (Figures 8 and 9). We consider it not surprising that the correlation of flowering time with FLM-ß levels (21% versus 94%) as well as the responsiveness of flowering time to FLM-ß levels differ between a genetically heterogeneous natural population and the homogenous transgenic population. First, we showed that residual FLC transcript may also slightly contribute to variation of flowering time of summer-annual accessions, possibly interfering with the effects of FLM-ß (Li et al., 2006).
Further, additional sequence variation in FLM in natural accessions may contribute to relevant expression changes, e.g. in the part of intron 1 identified as essential for FLM expression but not analysed in further detail here (Figure 2). In addition, it is possible that these accessions harbour variation in the FLM-related MAF2-4, and this variation may differentially affect flowering time, in isolation or in combination with variation of their common interaction partner SVP (Ratcliffe et al., 2003;Airoldi et al., 2015;Scortecci et al., 2003). Bearing these possible genetic and environmental interferences in mind, we regard the detected effect of FLM-ß on flowering (21%) in cool (15˚C) temperatures as considerable and suggest that it is, if anything, an underestimate.
We found a uniform distribution of nine PRO2+/INT6+ haplotype combinations among the genetic clusters that have recently been established based on the analysis of genomic sequences from 1135 Arabidopsis accessions (The 1001 Genomes Consortium, 2016). Interestingly, the PRO2+ GGAAC /INT6+ AAT haplotype was overrepresented among the relict accessions and the PRO2+ AAAAC /INT6+ AAA haplotype was overrepresented among the Asian group (The 1001 Genomes Consortium, 2016) (Figure 8-figure supplement 1). In our experiments, both of these haplotypes were associated with increased FLM-ß transcript levels and consequently late flowering (Figure 4). Since the relict and Asian accessions have been proposed to originate from glacial refugia from where central Europe was recolonized after the last ice age, these late-flowering PRO2+/INT6+ haplotypes may represent original haplotypes (Sharbel et al., 2000;Schmid et al., 2006). The late flowering phenotype associated with these haplotypes would be in line with the hypothesis that late flowering alleles are more ancient and that early flowering alleles were derived only during the more recent Arabidopsis evolution (Toomajian et al., 2006). Further, genetic linkage among the PRO2+ and INT6+ polymorphic sites was overall low (R2 = 0.11-0.55) and the sequences surrounding PRO2+ and INT6+ were comparatively conserved. Taken together, this suggests that mutations in the PRO2+ and INT6+ sites independently arose multiple times and that these sites were preferentially selected during Arabidopsis flowering adaptation. All PRO2+/INT6+ alleles displayed a broad geographic distribution, except PRO2+ GGAAC /INT6+ AAT , which was overrepresented in Spain (Figure 8-figure supplement 1). 
Likely, microclimates at specific geographic locations rather than general climate conditions at broad geographic regions may be important to drive adaptation of flowering to changing ambient temperature, as it was previously suggested for broadly distributed non-coding haplotypes of FLC that explain variation in vernalization (Li et al., 2014;Weigel, 2012;Shindo et al., 2006).
Mutations in cis-regulatory regions are important for adaptation and phenotypic evolution and have a low probability to generate negative pleiotropic effects (Swinnen et al., 2016;Meyer and Purugganan, 2013;Wray, 2007). FLM may be an ideal candidate for flowering time adaptation through non-coding sequence variation at PRO2+ and INT6+ sites since changes in FLM-ß abundance precisely modulate flowering while maintaining phenotypic plasticity and without generating negative pleiotropic effects (Figure 9). Changes at the PRO2+ and INT6+ sites should allow adapting flowering time in response to altered geographic distribution and consequently climate conditions as well as during changing global environments (Figure 9). The role of FLM orthologues in other Brassicaceae, including a number of agronomically important species, is as yet not understood. It is thus at present not possible to predict the role of Arabidopsis FLM polymorphisms for flowering time adaptation and plant breeding in this plant family. However, in view of the availability of new genome editing methodologies, the knowledge about important non-coding regions may be useful to fine-tune flowering time (Shan et al., 2013). This may allow buffering the anticipated negative effects on agricultural systems that occur as a consequence of small temperature changes during climate change (Moore and Lobell, 2015;Wheeler and von Braun, 2013).

Physiological experiments
For flowering time analyses, plants were grown under long-day conditions with 16 hr white light (110-130 µmol m⁻² s⁻¹)/8 hr dark in MLR-351 SANYO growth chambers (Ewald, Bad Nenndorf, Germany). Plants were randomly arranged in trays and trays were rearranged every day. Water was supplied by subirrigation. Flowering time was quantified by counting rosette leaf numbers (RLN). Consistent with previous reports, we observed a strong correlation between days to bolting and rosette leaf number of Arabidopsis accessions (Figure 8-figure supplement 2C) (Atwell et al., 2010). Student's t-tests (normally distributed values), Wilcoxon rank tests (not normally distributed values), and multiple regression models were calculated with R (http://www.r-project.org/).

Cloning procedure
A previously described construct with the Col-0 genomic FLM fragment pFLM::gFLM template (pDP34) was recombined into a pDONR201 destination vector using the Gateway system (Life Technologies, Carlsbad, CA). This vector was used as a template (FLM Col-0 ) to generate mutations using either a single phosphorylated primer or a combination of forward and reverse primers (Sawano and Miyawaki, 2000;Hansson et al., 2008). For variants with multiple modifications, individual mutations were introduced one at a time. The mutated inserts were recombined into the pFAST-R07 expression vector using the Gateway system (Life Technologies, Carlsbad, CA) (Shimada et al., 2010). All expression constructs were verified by sequencing and transformed into Agrobacterium tumefaciens strain GV3101. Nd-1 plants were transformed using the floral-dip method (Clough and Bent, 1998) and transgenic plants were identified based on seed fluorescence (Shimada et al., 2010). Segregation of T2 lines was examined and lines with single insertion events were selected for further analysis based on segregation ratios (Shimada et al., 2010). Primers and expression constructs are listed in Supplementary file 7.

Quantitative real-time PCR
For qRT-PCR analyses of homozygous lines, total RNA was isolated from three biological replicates using the NucleoSpin RNA kit (Macherey-Nagel, Düren, Germany). DNA was removed by an on-column treatment with rDNase (Macherey-Nagel, Düren, Germany). 2-3 µg total RNA were reverse transcribed with M-MuLV Reverse Transcriptase (Thermo Fisher Scientific, Waltham, USA) using an oligo(dT) primer. The cDNA equivalent of 30-50 ng total RNA was used in a 12 µl PCR reaction with SsoAdvanced Universal SYBR Green Supermix (BioRad, München, Germany) in a CFX96 Real-Time System Cycler (BioRad, München, Germany). The relative quantification was calculated with the ΔΔCt method using ACT8 (AT1G49240) as a standard (Pfaffl, 2001). For the analysis of the pooled independent T2 transgenic lines, a quarter of all available independent lines per construct (18 to 45) was sampled, resulting in four replicate pools comprising between four and eleven lines. Around 1000 seeds were used for pooling per line. One RNA sample was extracted per replicate pool and processed as described above.
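The relative quantification step can be illustrated with a minimal sketch. It is not the authors' analysis script: the function name, the Ct values and the default amplification efficiency of 2.0 (perfect doubling per cycle) are assumptions made for illustration. With both efficiencies set to 2.0, the efficiency-corrected ratio reduces to the familiar 2^(-ΔΔCt).

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calib, ct_ref_calib,
                        eff_target=2.0, eff_ref=2.0):
    """Efficiency-corrected expression ratio of a target gene.

    Ct values come from a sample and from a calibrator (control) sample,
    each measured for the target gene and for a reference gene such as
    ACT8. The efficiencies are assumed, not measured: 2.0 means the
    amplicon doubles in every cycle.
    """
    # Change of the target gene between calibrator and sample.
    target_ratio = eff_target ** (ct_target_calib - ct_target_sample)
    # Change of the reference gene, used for normalisation.
    ref_ratio = eff_ref ** (ct_ref_calib - ct_ref_sample)
    return target_ratio / ref_ratio


# Hypothetical Ct values: the target crosses the threshold two cycles
# earlier in the sample, while the reference gene is unchanged.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)
```

Passing measured primer efficiencies instead of the defaults would account for target and reference amplicons that do not amplify equally well.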
The large-scale expression experiments were performed using a previously described 96-well format (Figure 8 and Supplementary file 3) (Box et al., 2011). DNA was digested with DNaseI (Thermo Fisher Scientific, Waltham, USA) and reverse transcription and qPCR reactions were performed as described above using a CFX384 Real-Time System Cycler (BioRad, München, Germany). ACT8 (AT1G49240) and BETA-TUBULIN-4 (AT5G44340) were used as reference genes. Student's t-tests were calculated with Excel (Microsoft). qRT-PCR primers are listed in Supplementary file 7.

Data retrieval from the 1001 Arabidopsis thaliana genome project
Sequence datasets of 776 and 840 accessions, respectively, were available and were used in this study. The genomic sequences of a comprehensive set of 1135 Arabidopsis accessions were released only recently and were used for the concluding analyses (The 1001 Genomes Consortium, 2016). Genomic FLM sequences (7 kb; Chr1: 28953637-28960296; including 2 kb upstream and 0.2 kb downstream sequence) were extracted for 776 accessions from the 1001 Genomes portal (http://1001genomes.org/datacenter/) using the Wang dataset (343 accessions), the GMI dataset (180 accessions), the Salk dataset (171 accessions) and the MPI dataset (80 accessions) (The 1001 Genomes Consortium, 2016; Schmitz et al., 2013;Long et al., 2013;Cao et al., 2011). 45 SNPs with a MAF > 5% were extracted and used for haplotype analysis of the FLM locus (Figure 3A and Figure 3-figure supplement 1). 31 SNPs were polymorphic among the 54 selected accessions represented in the FLM haplotype set (Supplementary file 1).
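The minor allele frequency (MAF) filter described above can be sketched as follows. This is an illustrative sketch only: the function names, the site positions and the allele calls are hypothetical, and two points are assumed rather than taken from the study, namely that any call other than A, C, G or T counts as missing data and that the second most common allele is treated as the minor allele.

```python
from collections import Counter

VALID_BASES = {"A", "C", "G", "T"}


def minor_allele_frequency(calls):
    """Frequency of the second most common allele among called bases."""
    called = [c for c in calls if c in VALID_BASES]
    counts = Counter(called).most_common()
    if len(counts) < 2:
        return 0.0  # monomorphic, or no usable calls
    return counts[1][1] / len(called)


def filter_sites(sites, threshold=0.05):
    """Return positions whose MAF exceeds the threshold (default 5%)."""
    return [pos for pos, calls in sites.items()
            if minor_allele_frequency(calls) > threshold]


# Hypothetical allele calls, one per accession, keyed by position.
example_sites = {
    100: ["A"] * 90 + ["G"] * 10,   # MAF 0.10, kept
    200: ["C"] * 97 + ["T"] * 3,    # MAF 0.03, dropped
    300: ["T"] * 50,                # monomorphic, dropped
}
print(filter_sites(example_sites))
```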
To obtain an extended set of sequences that included not only SNPs but also information about insertions and deletions, we extracted FLM genomic sequences (including 2 kb upstream and 0.38 kb downstream sequence) from the GEBrowser 3.0 resource (http://signal.salk.edu/atg1001/3.0/gebrowser.php). A set of 850 sequences was manually curated and aligned using MEGA7.0.14 (Tamura et al., 2011). Some sequences were excluded since they possessed a high number of ambiguous bases. The core sequence set consisted of 840 sequences. This sequence set was used for association analysis (

Haplotype analysis
Haplotype analyses were performed with DnaSP 5.10 (Rozas and Rozas, 1995) using the SNP dataset of the above-described set of 776 sequences. Invariable sites were removed and networks were generated using FLUXUS network software (Bandelt et al., 1999).

LD analysis
Linkage (R2) between the polymorphic sites that were used as input for the simple single locus association test was calculated using the LD function in GGT v2.0 (van Berloo, 2008). Only diallelic SNPs (109 of 119) were considered.
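For orientation, the standard pairwise R2 statistic for two diallelic sites can be computed as sketched below. This is not the GGT implementation. It assumes that each accession contributes one haplotype (reasonable for largely homozygous inbred accessions, but an assumption here) and that calls other than A, C, G or T are missing; the function name and the example calls are hypothetical.

```python
VALID_BASES = {"A", "C", "G", "T"}


def ld_r2(site_a, site_b):
    """Squared allele correlation between two diallelic sites.

    site_a and site_b hold one allele call per accession, in the same
    accession order. Accessions with a missing call at either site are
    skipped.
    """
    pairs = [(a, b) for a, b in zip(site_a, site_b)
             if a in VALID_BASES and b in VALID_BASES]
    alleles_a = sorted({a for a, _ in pairs})
    alleles_b = sorted({b for _, b in pairs})
    if len(alleles_a) != 2 or len(alleles_b) != 2:
        raise ValueError("both sites must be diallelic")

    n = len(pairs)
    ref_a, ref_b = alleles_a[0], alleles_b[0]
    p_a = sum(a == ref_a for a, _ in pairs) / n
    p_b = sum(b == ref_b for _, b in pairs) / n
    p_ab = sum(a == ref_a and b == ref_b for a, b in pairs) / n

    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two sites were independent.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical calls for four accessions.
print(ld_r2(["A", "A", "G", "G"], ["C", "C", "T", "T"]))  # fully linked
print(ld_r2(["A", "A", "G", "G"], ["C", "T", "C", "T"]))  # independent
```

Raising an error for sites that are not diallelic mirrors the restriction to diallelic SNPs stated above.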

Kruskal-Wallis test
To detect variants with significant effects on FLM transcript levels (simple single locus association test), 119 variants of a 7 kb FLM locus were extracted from a set of 840 sequences retrieved from the GEBrowser 3.0, as described above. To examine whether a variant showed a significant effect on FLM expression, the qRT-PCR expression values of the 54 accessions from the FLM haplotype set were used as input data (Supplementary file 4) to run Kruskal-Wallis tests using R (http://www.r-project.org/). p-values from all comparisons were corrected following the Benjamini-Hochberg multiple testing correction and resulting values were -log10 transformed and plotted along the gene model (
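
The multiple testing correction and transformation steps can be sketched as follows. The study performed these calculations in R; the Python functions below are an illustrative re-implementation of the Benjamini-Hochberg step-up adjustment, and the input p-values are hypothetical rather than results from the study.

```python
import math


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone: each one is capped by the value of the next larger rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def neg_log10(values):
    """-log10 transform for plotting; values must be greater than zero."""
    return [-math.log10(v) for v in values]


# Hypothetical raw p-values, one per tested variant.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
scores = neg_log10(adjusted)
print(adjusted)
print(scores)
```

Variants with larger -log10 scores are those with smaller adjusted p-values, so they stand out as peaks when plotted by position.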

FLM molecular analyses
qRT-PCRs for FLM pre-mRNA abundance and 3' RACE PCR were performed as previously described (Lutz et al., 2015). Primer sequences are listed in Supplementary file 7.