The transcriptomic response to a short day to long day shift in leaves of the reference legume Medicago truncatula

Photoperiodic flowering aligns plant reproduction to favourable seasons of the year to maximise successful production of seeds and grains. However understanding of this process in the temperate legumes of the Fabaceae family, which are important both agriculturally and ecologically, is incomplete. Previous work in the reference legume Medicago truncatula has shown that the FT-like gene MtFTa1 is a potent floral activator. While MtFTa1 is upregulated by long-day photoperiods (LD) and vernalisation, the molecular basis of this is unknown as functional homologues of key regulatory genes present in other species, notably CONSTANS in A. thaliana, have not been identified. In LD MtFTa1 maintains a near constant diurnal pattern of expression unlike its homologue FT in A. thaliana, which has a notable peak in expression at dusk. This suggests a different manner of regulation. Furthermore, M. truncatula possesses other FT-like genes such as two LD induced MtFTb genes which may also act in the regulation of flowering time. MtFTb genes have a diurnal pattern of expression with peaks at both four and sixteen hours after dawn. This study utilises RNA-Seq to analyse the transcriptome of M. truncatula leaves to identify genes which may regulate or be co-expressed with these FT-like genes following a shift from short-day photoperiods to inductive long-days. Specifically this study focuses on the first four hours of the day in the young leaves, which coincides with the first diurnal peak of the FTb genes. Following differential expression analysis at each timepoint, genes which alter their pattern of expression are distinguished from those which just alter their magnitude of expression (and those that do neither). It goes on to categorise these genes into groups with similar patterns of expression using c-means clustering and identifies a number of potential candidate photoperiod flowering time genes for future studies to consider.


INTRODUCTION
The regulation of flowering controls the important developmental shift between the vegetative and reproductive growth phases of the plant, aligning plant sexual reproduction with favourable seasonal environmental variation. This facilitates successful pollination and the maximizing of crop productivity and yield. In many temperate climate species, such as the winter annual varieties of the well-studied Brassicaceae species Arabidopsis thaliana (L.) Heynh., the primary determinants of flowering time are long-day (LD) photoperiod (daylength) conditions and vernalisation (prolonged exposure to cold temperatures) for which the molecular pathways underlying these responses are well understood (Andrés & Coupland, 2012).
In A. thaliana, LD conditions are perceived in the leaves and culminates in the activation of the floral integrator gene FLOWERING LOCUS T (FT ) (Turck, Fornara & Coupland, 2008). Specifically, FT is activated by circadian and light signals aligning which facilitates the formation of the GIGANTEA-FLAVIN-BINDING KELCH REPEAT, F-BOX 1 (GI-FKF1) complex. This complex degrades CYCLING DOF FACTOR (CDF) transcription factors which otherwise form a complex with a TOPLESS (TPL) protein to repress the expression of the transcription factor CONSTANS (CO). This gene encodes a B-box class protein with a CCT domain (Putterill et al., 1995;Goralogia et al., 2017). The stabilisation of CO protein in the late afternoon of LD results in it acting as a subunit of a NUCLEAR FACTOR-Y (NF-Y) pioneer transcription factor complex which directly activates FT (Andrés & Coupland, 2012;Gnesutta et al., 2017). CDF proteins are also able to repress FT directly (Song et al., 2015). FT protein is the principal mobile floral signal (florigen) which is transported to the shoot apical meristem via the phloem to induce flowering via activation of a second floral integrator gene SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) and the floral meristem identity gene APETALA1. This results in a state of floral commitment and the development of flowers (Turck, Fornara & Coupland, 2008).
Beyond A. thaliana, the presence of FT orthologues integrating environmental signals and regulating flowering appears to be widely conserved (Ballerini & Kramer, 2011;Wickland & Hanzawa, 2015;Putterill & Varkonyi-Gasic, 2016). However the conservation of other elements of the pathway is not as strong. Recent work by Simon et al. (2015) suggest that the central role CO plays in the photoperiod pathway in A. thaliana evolved within the Brassicaceae following a gene duplication. Thus CO-like (COL) genes regulating FT homologues is not universal and their activity in some other species (e.g., Oryza sativa L.; rice; Hayama et al., 2003) are the result of convergent evolution. Examples of species in which COL genes do not appear to regulate flowering time include Japanese morning glory (Ipomoea nil (L.) Roth) and temperate Fabaceae family (legume) species such as pea (Pisum sativum L.) and Medicago truncatula Gaertn. Notably, a recent survey of COL genes in M. truncatula found no evidence of them acting as a photoperiodic switch (Wong et al., 2014) and the gene expression of the closest COL homologue in pea, was not altered in photoperiodic flowering time mutants. These include the recessive late bloomer1 (late1) mutant which disrupts the orthologue of GI (Hecht et al., 2007;Liew et al., 2009) and a dominant late flowering mutant late2 which has been mapped to a CDF gene PsCDFc (Ridge et al., 2016). This Pscdfc mutant encodes a protein which has lost the ability to interact with the FKF1 and exhibits reduced expression of FT-like genes (Ridge et al., 2016). Whether the regulation of the PsFT-like genes by PsCDFc is direct or not remains unknown.
Nevertheless, in many species the regulation of FT orthologues exhibit significant commonalities such as genes containing CCT and/or B-box domains integrating the photoperiod and vernalisation signals. For instance, in the Pooid grasses such as wheat (Triticum spp.) prior to vernalisation the repression of the FT-like gene is maintained by a pair of ZCCT proteins, which contain both CCT and B-box domains, in complex with NF-Y subunits (Song et al., 2015;Li et al., 2011). Then in inductive LD conditions upregulation of the FT orthologue requires the PHOTOPERIOD 1 (PPD1) gene, which encodes a CCT domain (Shaw et al., 2013;Pearce et al., 2017). In addition genes encoding CCT and B-Box domains are important in the cultivated varieties of the LD responsive sugar beet (Beta vulgaris L.) (Pin et al., 2012;Dally et al., 2014).
The lack of a direct upstream regulator of FT-like genes in temperate legumes means that, despite good progress, the understanding of how flowering time in this family is incomplete (Putterill et al., 2013;Weller & Ortega, 2015). Legumes are an ecologically diverse plant family (The Legume Phylogeny Working Group, 2013) and include a number of dietary staple crops. In an agricultural context these crops reduce the need for fertilizer use via nitrogen fixation (Vance, 2001). Like A. thaliana, many temperate legume species, including M. truncatula, accelerate their flowering in response to vernalisation and LD conditions (Highkin, 1956;Summerfield et al., 1985;Roberts, Hadley & Summerfield, 1985;Laurie et al., 2011;Weller & Ortega, 2015;Ridge et al., 2017). Classically pea has been most intensively studied to analyse photoperiodic flowering in temperate legumes but has recently been complemented by the study of other species, such as M. truncatula for which considerable genetic and genomic resources exist (Tadege et al., 2008;Young et al., 2011;Tang et al., 2014).
Analysis of flowering time mutants in pea has demonstrated that some members of the photoperiod pathway, such as the photoreceptors and components of the circadian clock, are largely conserved with A. thaliana (Weller & Ortega, 2015). In addition, FT-like homologues have been characterised in several legume species with most having multiple copies which fall into three sub-clades (Laurie et al., 2011;Hecht et al., 2011). In pea and M. truncatula the FT-like gene FTa1 is a potent floral inducer whose expression is elevated in LD (Laurie et al., 2011;Hecht et al., 2011). However unlike FT in A. thaliana, MtFTa1 (Medtr7g084970) does not possess a diurnal pattern and instead exhibits a near constant level of expression once induced in LD (Laurie et al., 2011) suggesting that the mechanisms of regulation between FT and FTa1 likely differ significantly. Moreover grafting experiments in pea between flowering time mutants suggest that additional floral stimuli exist (Hecht et al., 2011).
Good candidates for secondary floral stimuli are the FTb genes that, like FTa1, are upregulated in LD and capable of complementing an A. thaliana ft-1 mutant (just MtFTb1 in M. truncatula, Laurie et al., 2011;Hecht et al., 2011). Distinctively, MtFTb genes have a diurnal pattern of expression in LD and peak twice, at ZT4 and ZT16 (ZT is zeitgeber time where ZT0 is subjective dawn at lights on). This pattern of expression is similar to that of FT in A. thaliana under ''natural'' LD conditions (Song et al., 2018). Another legume FT-like gene which may play a role in flowering time is FTa2 which in M. truncatula it is mostly expressed in short-day (SD) photoperiod conditions consistent with a floral repressor (Laurie et al., 2011). However whether any FT-like genes other than FTa1 regulate flowering time in either pea or M. truncatula remains to be demonstrated. Downstream of FT-like genes the regulation of flowering in temperate legumes is similar overall to that of A. thaliana, albeit complicated by several genes being present in multiple copies with potential functional redundancies. For example, three MtSOC1-like genes depend on MtFTa1 for the extent and timing of their expression, although Mtsoc1a mutants do show delayed flowering (Fudge et al., 2018;Jaudal et al., 2018). The genes involved in inflorescence development are similar to that of A. thaliana (Cheng et al., 2018).
Overall, while some components of the A. thaliana photoperiodic flowering model appear to be conserved in temperate legumes, other aspects differ. Specifically, what factors act immediately upstream of FT-like genes in the photoperiodic flowering of temperate legumes remain unknown. In light of the gap in understanding, we take a transcriptomic approach to identify additional candidate regulators of photoperiodic flowering in M. truncatula. In two experiments we target genes expressed in a similar, or opposite, manner to LD induced FT-like genes with the aim of identifying candidate regulators. Plants were shifted from SD (8 h light/16 h dark) to LD (16 h light/8 h dark) conditions and gene expression changes analysed in the first four hours of the diurnal cycle; at dawn (ZT0), two hours after dawn (ZT2) and four hours after dawn (ZT4) during which time the LD induced FT-like genes are expressed. These timepoints capture the constant induction of MtFTa1 in LD and target the first diurnal peak of MtFTb1 and MtFTb2 at ZT4 also in LD (Laurie et al., 2011). M. truncatula cv 'Jester' (Hill, 2000) seeds were scarified by softly scraping them between two pieces of sand paper (grade P160). The seeds were then germinated at 15 • C in gently shaking tubes of water and dark conditions for 24 h. Germinated seeds were then vernalised by being transferred to damp petri dishes and incubated at 4 • C for a further 25 days. The seedlings were subsequently planted in seed raising mix (Daltons Ltd., NZ) in individual cell pots and grown in growth cabinets at 22 • C under 240 µMm −2 sec −1 cool white fluorescent light. This was in accordance with Institutional Biological Safety Committee approval GMO08-UA006. Soil was kept moist with a complete liquid nutrient media (without Na 2 SiO 3 ; Gibeaut et al., 1997). In the two experiments plants were grown in SD conditions (8 h light/16 h dark) until they were 10 days old and were then shifted at ZT8 into LD conditions (16 h light/8 h dark) for 3 days in experiment 1 (harvest at ZT0 and ZT2) or 5 days in experiment 2 (harvests at ZT4). Three biological replicates were taken each consisting of two pooled trifoliate leaves from different non-adjacent plants giving a total of 18 samples. Only the first trifoliate leaf to unfurl was sampled from a given plant. Samples were immediately frozen in liquid nitrogen.

RNA extraction and sequencing
The 18 frozen leaf samples were ground using five 3 mm sterilised ball bearings per sample and a Geno/Grinder R 2010 (SPEX R SamplePrep, USA). Total RNA was then extracted from the ground samples using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) following the manufacturer's instructions and the quantities and qualities of the extracted RNA were then measured using a Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA). Samples were then sent to the Otago Genomics Facility (www.otago.ac.nz/genomics/index.html) and RNA-Seq libraries were prepared.

Complementary DNA synthesis and RT-qPCR analysis
Following RNA extraction, 8 µg of RNA was treated with the TURBO DNA-free Kit (Invitrogen, Foster City, CA, USA) . First-strand complementary DNA (cDNA) was then synthesised using 1 µg of DNase treated RNA using SuperScript IV Reverse Transcriptase (Invitrogen, USA) using an oligo dT primer following the manufacturers instructions. At this point a control reaction where the reverse transcriptase is omitted was run, one reaction per set of replicate samples. When tested alongside synthesised cDNA these reactions control for the presence of genomic DNA contamination. Measurement of relative abundances of cDNA, as a measure of gene expression, was done using Real time quantitative PCR (RT-qPCR). In this assay 10 µl reactions using Power SYBR Green PCR Master Mix and 2 µl of diluted cDNA (cDNA was diluted 20x prior to RT-qPCR). Control reactions using water, as well as the cDNA reactions which lacked reverse transcriptase, were run simultaneously. RT-qPCR experiments were assembled on 384-well plates and run on an Applied Biosystems 7900HT Sequence Detection System.
Analysis of the RT-qPCR data was done using the 2 − Ct algorithm (Livak & Schmittgen, 2001). This utilised either Medtr7g089120 a TUBULIN BETA-1 CHAIN gene or Medtr6g084690 a SERINE/THREONINE PROTEIN PHOSPHATASE 2A REGULATORY SUBUNIT (PP2A; previously known as PROTODERMAL FACTOR 2) as reference genes (Kakar et al., 2008). Primers used are available in Table S2.

RNA-Seq analysis
Read trimming of the the FASTQ files was then performed using BBDuk tool in the BBTools suite (v37.54;Bushnell, 2018). This removed the sequencing adapters and low quality sequence (Phred = 20) and retained only reads which were at least 36 bases in length. Furthermore read pairs lacking one of the pair were discarded. Transcripts were then quantified using Salmon (v0.8.2;Patro et al., 2017) which uses quasi-mappings to map the reads to the annotated genes of the Mt4.0v2 transcriptome (Young et al., 2011;Tang et al., 2014). The resulting count tables were then imported into R (R Core Team, 2018) using the tximport package (v1.4.0; Soneson, Love & Robinson, 2015) and principal component analysis (PCA) and DE analysis at the gene level was performed using DESeq2 (v1.16.1;Love, Huber & Anders, 2014). DESeq2 normalizes the data by fitting a negative binomial GLM with a gene-specific dispersion parameter. Clustering was performed using the Mfuzz package (v2.36;Kumar & Futschik, 2007) using minimum centroid distances as a heuristic measure of appropriate cluster number. Briefly this consisted of compromising between cluster size and number by ascertaining when additional clusters only marginally increased the resolution.

RESULTS
It has previously been demonstrated that shifting vernalised M. truncatula plants from SD to LD induces flowering and is accompanied by the induction of the expression of FT-like genes MtFTa1, MtFTb1 and MtFTb2. It was found that three days in LD was sufficient to promote the transition to flowering in M. truncatula (but one was insufficient; Laurie et al., 2011). We thus utilised at least three days in LD in our experiments. Here two similar experiments are analysed where plants were grown in SD conditions until they were 10 days old and were then shifted at ZT8 into LD conditions to describe the transcriptomic changes which occur in the M. truncatula leaves following such shifts, alongside the FT-like genes.
In the first experiment sampling of leaves occurred at ZT0 (subjective dawn) and ZT2 when the plants had experienced three days of LD conditions. In the second experiment the plants had experienced five days of LD conditions and sampling occurred at ZT4. With the aim of identifying candidate regulators of FT-like genes these samples capture the constant LD induction of MtFTa1 and both precede and include the first diurnal peak of MtFTb1 and MtFTb2 at ZT4 in LD (Laurie et al., 2011). These samples (in triplicate) were used to construct RNA-Seq libraries which were subsequently sequenced.
The sequenced RNA-Seq libraries all generated 40-50 million reads with mean quality scores greater than 35. The quantification of gene abundances reported an average mapping rate of 89.86% to the Mt4.0v2 transcriptome (see Table S1 for full table of abundances in Transcripts per Million; TPM). Thus the data generated are of a high quality and indicates that the Mt4.0v2 transcriptome is reasonably complete.

Differential expression at each time point
Analysis of this data initially considered pairwise comparisons of gene expression between LD and SD at each timepoint. The significance of any differences observed was assessed using Wald significance tests and since there are three sets of tests, the significance levels of these were adjusted for together using the false discovery rate method.
It was observed that 28,151-29,011 genes had read counts >1 (out of the 50,444 in the Mt4.0v2 transcriptome). Of these 6,824 genes at ZT0 (24% of expressed genes) had statistically different expression (α = 0.05). There were 5,523 genes (19%) at ZT2 and 7,743 genes (25%) at ZT4 (Table S3). When these lists were filtered for those which had >2-fold differences and had >10 mean normalised reads then 2,436, 1,309 and 2,661 genes were judged to be DE at ZT0, ZT2 and ZT4 respectively. These numbers are similar to what was observed in an A. thaliana microarray experiment after a SD to LD shift where 2000 genes were DE (Wigge et al., 2005). The transcript abundances of the LD induced FT-like genes and five other candidate photoperiod pathway genes were assessed (Fig. 1). While absent in SD, large increases in MtFTa1 transcript abundance were observed in LD at ZT0, ZT2 and ZT4. In addition, appreciable expression of MtFTb1 and MtFTb2 was only seen in LD at ZT4, with minimal to no expression at ZT0 and ZT2 in either SD or LD.
The differences in expression exhibited by the FT-like genes in Figs. 1A to 1C qualitatively agree with previously reported RT-qPCR time course data where in LD MtFTa1 has an approximately constant level of expression across the day (Laurie et al., 2011). MtFTb1 and MtFTb2 have been observed to diurnally peak at ZT4 and ZT16 (Laurie et al., 2011), consistent with the DE observed here at ZT4 (Figs. 1B and 1C). This indicates that despite originating from different experiments, these datasets could be analysed together as a ZT0-to-ZT4 time series to observe the pattern of gene expression change following a SD to LD shift.
To further demonstrate that these datasets could be analysed together we considered the diurnal expression of four other genes measured by RT-qPCR in an independent ZT0-ZT20 time series experiment and compared them to our RNA-Seq data (Fig. 2). Here it was observed that in MtFKF1 (Medtr8g105590) and MtCDF1 (Medtr2g059540) from a low point at ZT0 in SD gene expression increases and peaks at ZT8 (Figs. 2A and 2B). This is also seen in the gene abundances of the RNA-Seq datasets where a large increase in SD at ZT4 is observed. Similar minimal LD expression at these timepoints between the two datasets is also in evidence (Figs. 2E and 2F). Conversely, in the RT-qPCR time course MtCDF2 (Medtr5g041420) and MtCDF4 (Medtr6g012450) have their greatest expression at ZT0 and which in SD sharply decreases at ZT4 (Figs. 2C and 2D) which is also observed in the RNA-Seq transcript abundances (Figs. 2G and 2H).

Overview of the time series analysis
The similarity between the patterns of expression observed in this data with previously reported RT-qPCR results for three genes (Figs. 1A to 1C), as well as independently collected RT-qPCR results for an additional four genes ( Fig. 2) indicates that there is no significant batch effect which would bias the interpretation of this data as a time series. We then analysed the data in such a manner. Across the three timepoints 31,363 genes (out of the 50,444) had a read count >1 at at least one timepoint (see Fig. S1 for dispersion plot and MA-plot) and were included in the analysis. To take an overarching view of the variation between samples, principal component analysis (PCA) was employed with the first two principal components plotted in Fig. 3. Biological replicates clustered together and 64% of the observed variation is This plot gives an overarching perspective on the variation within the dataset with the first two components explaining 64% of the variation. PC1 strongly aligns with the time of sampling and PC2 the photoperiod condition. The plot was constructed using log 2 normalised counts of genes with non-zero read counts.
Full-size DOI: 10.7717/peerj.6626/ fig-3 explained by the first two principal components which strongly align with the time of sampling (PC1) and the photoperiod condition (PC2). Typically in a pairwise comparison, differences in transcript abundances can be filtered based on fold-change and levels of expression to focus on genes more likely to be biologically consequential. This is difficult in a time series as it is unknown in this instance which timepoint is most relevant to the regulation of FT-like genes. Reflecting on the gene abundances plotted in Fig. 1 it was observed that the genes with differing expression between LD and SD could be broadly grouped into two classes, those which altered their pattern of expression in response to the photoperiodic shift and those which altered the magnitude of their expression. Specifically, in Fig. 1, the majority of genes (six) change their pattern of expression over time between conditions (MtFTb1, MtFTb2, MtPHYA, MtGI, MtELF3a and MtSOC1a), while two genes maintain a similar pattern, but vary in magnitude of expression between conditions (MtFTa1 and the NF-YC-like gene). It was into these two classes that we decided to segment the time series data followed by clustering.
To classify the genes into these two classes we fit two models to the data. The first model included an interaction term between growth condition (SD or LD) and time of sampling (ZT0, ZT2 and ZT4). This was to identify genes which respond in a condition-specific manner over time such that any changes in the pattern of gene expression brought about by the photoperiodic shift are determined (e.g., MtFTb1 in Fig. 1B). Secondly, we repeated the analysis with a model which lacked this interaction term to test for just the effect of condition (LD vs SD) on the magnitude of gene expression (e.g., the NF-YC-like gene in Fig. 1G). This dual approach facilitated the identification of genes which alter their pattern of expression or only the magnitude of their expression respectively.
Alongside this analysis, an a priori list of 146 candidate genes was also assembled consisting of genes shown to, or are suspected of, participating in flowering time regulation via the photoperiod pathway (Tables 1 and 2). In addition to the FT-like genes, the list incorporates photoreceptors, components of the circadian clock and classes of transcription factor which could potentially regulate the FT-like genes chosen based on their role in A. thaliana. These include candidates from the CDF, TPL and NF-Y families as well as additional transcription factors from families with members known to bind the promoter of FT in A. thaliana compiled by Ridge et al. (2016). These include the genes containing CCT-domain and B-box domains as well as the CIB/BEE-like (CBL), CRYPTOCHROME-INTERACTING BASIC-helix-loop-helix (CIB), PHYTOCHROME INTERACTING FACTOR (PIF ), and APETALA2 (AP2) families.

Photoperiod induced changes in the pattern of gene expression
To identify changes in the pattern of gene expression over time, a generalised linear model was fit using the growth condition and time of sampling as predictors along with an interaction term. A likelihood ratio test was then used to test for genes where this interaction term is significant relative to a reduced model lacking the interaction term. A significant result indicates that the expression of the gene responds in a condition-specific manner over time (i.e., a low p-value indicates a change in expression pattern not solely magnitude). This is illustrated in Fig. 1 where MtPHYA, MtELF3a, MtGI, MtFTb1 and MtSOC1a have distinctly different patterns of expression over time in LD compared to SD. On the other hand MtFTa1 and the NF-YC-like gene Medtr1g082660 show a similar pattern in both LD and SD but at differing magnitudes so in this context are not considered to alter their pattern of expression. While MtFTb2 also appears to alter it's pattern over time, it was not significant in the likelihood ratio test.
This approach resulted in 9,516 genes with altered expression (α = 0.05; full results in Table S4) or 30.34% of those tested with >1 read. To aid interpretation of these changes and quickly identify the timepoint(s) at which individual genes differed in LD relative to SD, Wald significance tests were used to contrast the difference in expression between the two conditions at each of the three timepoints. The significance levels of these genes were adjusted for all three contrasts together using the false discovery rate method. This MtCDFh DOF-type zinc finger DNA-binding family protein      Medtr7g113680 NF-YC like nuclear transcription factor Y protein

Notes.
The genes listed in this table are loci known or hypothesised to participate in the photoperiod pathway in legumes along with homologues of the core components of the pathway in A. thaliana. They include potential FT promoter binding genes compiled by Ridge et al. (2016) from which the naming of MtTOE1a to MtSPT derives. Table depicts the the adjusted p-value for the interaction between time and condition. Note that if the adjusted p-value is significant each contrast between conditions at timepoints ZT0, ZT2 and ZT4 is also given to facilitate identifying where the patterns of expression diverge. Included in these results is the mean normalised read counts for the gene at this timepoint. If the interaction term adjusted p-value is not significant contrasts are omitted. In all cases it is the expression in LD relative to SD which is tested. In addition the cluster assignment and membership value are listed. Differentially expressed results are in bold using an α = 0.05. resulted in 9,427 of the 9,516 genes having differing expression between LD and SD at a minimum of one timepoint with 6,437, 4,511 and 6,159 for ZT0, ZT2 and ZT4 respectively (Fig. S2A). We consider only genes with >2-fold differences with >10 mean normalised reads as DE and there were 3,192 genes meeting this criteria at at least one timepoint. This corresponds to 2131, 1062 and 2168 DE genes at ZT0, ZT2 and ZT4 respectively. Full results are listed in Table S5. In these filtered lists of genes 4.9% of these genes differed in the three timepoints and 32% differed at two or more timepoints demonstrating how most genes have a single peak, predominantly at ZT0 or ZT4 (Fig. 4A). Notably, there were fewer genes DE at ZT2 than the other timepoints and those that did mostly differed at one or both of the other timpeoints too. Only 27% of the 1,062 genes DE at ZT2 are unique to the timepoint (compared to 58.4% for ZT0 and 51.6% for ZT4). The interaction term was found to be significant in the majority (91/146; 62%) of the candidate genes associated with photoperiodic regulated flowering including 19/24 (79%) of the circadian clock and photoreceptor candidate genes and a striking 8/8 (100%) of the selected AP2 class of genes (Table 1). Of this set of 91 genes, 90 were statistically different (α = 0.05) at one or more timepoints and for 61 genes the difference at at least one timepoint was greater than >2-fold difference with >10 mean normalised reads. For example, a gene encoding a predicted core component of the core circadian clock, MtLHY, is expressed at ∼4-fold higher levels at ZT0 in SD than LD but this situation is reversed at ZT4 when it is ∼7.5-fold higher in LD than SD.
To get a broader view of the results, the mean abundances of all 9,516 genes which altered their pattern of expression were taken and log 2 transformed, before being standardised to have a mean of zero and standard deviation of one. The standardised mean abundances were then clustered into 18 clusters using c-means clustering (Table S6; see Fig. S3A for cluster number optimisation) and visualised in Fig. 4B. This algorithm assigns a membership score (between 0 and 1) for each gene to each cluster describing the degree to which an individual observation belongs to a given cluster (see Fig. S4 for distribution of membership scores). A gene is then assigned to the cluster for which it has the highest membership. Cluster 3, which has 464 genes in it, was of particular interest as MtFTb1 is present. Thus these genes have patterns of expression similar to MtFTb1 and may be involved in regulating MtFTb1 or involved in similar processes. Clusters 9, 11 and 17 (648, 598 and 538 genes respectively) are also of interest as they contain genes with an opposite expression pattern to cluster 3, some of which may thus be negative regulators of MtFTb1 (e.g., MtFTa2 is in cluster 11). Specifically, these three clusters have peaks of expression at ZT4 in SD which are not present in LDs (Fig. 4C).
With regard to candidate photoperiodic flowering time genes, alongside MtFTb1 in cluster 3 there were only two genes from Table 1. These were MtSOC1a (Medtr7g075870) and a BHLH transcription factor gene called MtCBL3 (Medtr7g053410). MtSOC1a is a downstream target of MtFTa1 (Medtr7g084970) and demonstrated to affect flowering time (Jaudal et al., 2018). Another cluster 3 candidate flowering related gene Medtr3g101520 encodes a B3 domain, as does E1, the most important locus in the photoperiod pathway of the tropical legume soybean (Glycine max (L.) Merr.; Xia et al., 2012). On the other hand, the predicted circadian clock-like genes PRR37a, b and PRR59a-c (Matsushika et

Notes.
The genes listed in this table are loci known or hypothesised to participate in the photoperiod pathway in legumes along with homologues of the core components of the pathway in A. thaliana. They include potential FT promoter binding genes compiled by Ridge et al. (2016) from which the naming of MtCOLi to MtSPT derives. Table depicts the the adjusted p-value for the interaction between time and condition. Note that if the adjusted p-value is significant each contrast between conditions at timepoints ZT0, ZT2 and ZT4 is also given to facilitate identifying where the patterns of expression diverge. Included in these results is the mean normalised read counts for the gene at this timepoint. If the interaction term adjusted p-value is not significant contrasts are omitted. In all cases it is the expression in LD relative to SD which is tested. In addition the cluster assignment and membership value are listed. Differentially expressed results are in bold using an α = 0.05.  , 2000;Nakamichi et al., 2005) all have profiles opposite to that of MtFTb1 with three included in cluster 9 (Table 1). Additionally Medtr1g033620, a SHAQKYF class MYB-like DNA-binding domain gene, is present in cluster 11 which is interesting as proteins of this class have recently been implicated in the regulation of FT in A. thaliana (e.g., EFM and FE; Yan et al., 2014;Abe et al., 2015). Next, genes in each cluster were then ranked by their fold-change at ZT4 because MtFTb1 has its greatest difference in expression between the two conditions at this timepoint (962fold up in LD; Fig. 1B). We focused on transcription factor genes and observed significant fold changes in transcript levels of genes encoding a number of zinc finger proteins. For instance, in cluster 3 at ZT4, the zinc finger (Ran-binding) family gene Medtr4g113840 was 45-fold elevated in LD compared to SD and the DOF-type zinc finger DNA-binding family gene Medtr3g091820 was increased 2-fold. In contrast, in cluster 9 the B-box type zinc finger protein Medtr2g073370 was 29-fold more abundant in SD than in LD at ZT4 and in cluster 11, Medtr2g059540, also a DOF domain zinc finger gene (denoted MtCDF1 in Table 1) was ∼300-fold higher in SD than in LD. Furthermore in cluster 17, the zinc finger (Ran-binding) family gene Medtr6g069400 is 7-fold elevated in SD compared to LD at ZT4.

Changes in the magnitude of gene expression
There is a class of genes which change their level of expression in response to the shift from SD to LD conditions, but this does not alter the relative changes which occur at differing timepoints. Thus it is only the magnitude, not the pattern of expression which changes (e.g., MtFTa1 or the NF-YC-like gene Medtr1g082660 in Figs. 1A and 1G). To identify these genes a simpler model was fit the data which lacked the interaction term between growth condition and time of sampling.
It was observed that in this model 8,695 genes differed in the magnitude of their expression between conditions (α = 0.05; Table S7), but this was reduced to 4,694 when those that also altered the pattern of their expression were omitted. Therefore 14.96% of genes with >1 read (4,694/31,363 genes) alter just the magnitude of their expression in response to the shift in photoperiod conditions. To provide timepoint level resolution of these changes Wald significance tests were again used to contrast the 4,694 genes (Table  S8) with the significance levels of these genes adjusted for all three contrasts together using the false discovery rate method. This resulted in 4,161 of the 4,694 genes differing in magnitude at least one timepoint with 2,715, 2,268 and 2,666 for ZT0, ZT2 and ZT4 respectively (Fig. S2B). When considering genes with >2 fold differences and >10 mean normalised reads as DE, only 819 genes differed in magnitude at at least one timepoint and at ZT0, ZT2 and ZT4 this corresponded to 457, 353 and 454 genes respectively. Here it was observed that only 65 genes (7.9%) of this set of genes are consistently higher in LDs than SDs while 54 genes (9.99%) are consistently lower (Fig. 5A). The numbers in these classes go up to 187 genes (22.8%) and 139 genes (17%) respectively when genes differing at two or more timepoints are included. Like the class of genes which altered their pattern of expression there are fewer DE genes at ZT2.  Table S8. (B) The standardised and clustered abundances (such that the average expression value is zero and the standard deviation is one) of all 4,694 genes which were classed as just altering the magnitude of their gene expression in response to the photoperiod shift clustered into 18 clusters using c-means clustering. The number of genes in each cluster is listed alongside. (C) Mean standardised abundance for clusters examined in more detail are plotted with cluster 1 selected for containing MtFTa1 which is expressed in LD at all timepoints but not SD and clusters 9 and 10 were selected as genes in these clusters are consistently higher in SD (blue and dotted) than in LD (orange and line). Mean standardised abundances for all clusters are plotted in Fig. S6.
Full-size DOI: 10.7717/peerj.6626/ fig-5 Results for the candidate photoperiod loci are summarised in Table 2. Specifically these are the candidate photoperiodic flowering time genes which were not classed as changing their pattern of expression between conditions. Here 15/55 (27%) of the genes are classed as altering just their magnitude in response to the photoperiodic shift. However, only 9/15 (60%) have statistically different levels of expression at two or more timepoints and looking down the list MtFTa1 is the only gene to consistently differ >2-fold with >10 mean normalised reads. Thus, none of these candidate photoperiod genes are expressed similarly to MtFTa1 which shows >30-fold higher levels in LD than in SD at all timepoints.
We then assessed the sets of genes whose expression is either consistently higher or lower in LDs than SDs (65 and 54 genes respectively) for other candidate genes which could potentially play a role in the regulation of flowering time. These include Medtr8g091720 a NF-Y-like gene, which exhibits consistently greater expression (2.5-5.6-fold) in LD than in SD, like that of MtFTa1. Genes which consistently show reduced expression in LD compared to SD include Medtr2g014200 which encodes a squamosa promoter-binding-like protein as well as genes associated with sugar transport. For instance, Medtr3g074180 encoding a trehalose-6-phosphate phosphatase is 2-3 fold lower accross the three timepoints and Medtr0204s0040 a sugar porter family MFS transporter which not expressed in LD at all. Sugar transport is a process linked to flowering time and the regulation of FT in A. thaliana (Wahl et al., 2013).
In addition, while not previously linked to flowering time, these lists also contain a number of genes which likely have regulatory functions such as Medtr5g079220 encoding a R2R3-MYB transcription factor, Medtr3g107940 which produces a FBD protein and Medtr8g012655 which is the gene for an ethylene response factor. These genes all have greater expression in LD compared to SD. Conversely, genes which have reduced expression in LD compared to SD include Medtr7g012790 encoding a circadian clock coupling factor ZGT, Medtr7g105780 encoding an ovate transcriptional repressor, Medtr3g031220 which produces a WRKY transcription factor and Medtr8g026960 which is the gene for a homeobox associated leucine zipper protein.
Relaxing the criteria for differential expression slightly and reconsidering the candidate photoperiod loci also suggests that MtCDFf (Medtr6g027450) could be investigated further as it is higher in LD in all three timepoints (8.57-fold, 3.48-fold and 12.99-fold respectively) however its expression is overall quite low with an average of only 2.7 normalised reads at ZT4. Other potential candidates on the list include the ELF4-like gene Medtr8g020200, consistently 1.8-2.3-fold higher in SD compared to LD, as is MtCMF17 (Medtr1g044785) although, like MtCDFf, the expression in these datasets is low. This may reflect that these genes are cell-type specific and so only expressed in a fraction of those sampled.
All 4,694 genes which differed in the magnitude of their expression between conditions were then clustered into 18 clusters (Fig. 5B, Fig. S3B and Table S9). MtFTa1 was present in cluster 1 which had 331 genes. However, none of the selected photoperiodic candidate genes clustered with MtFTa1. Clusters 9 (259 genes) and 10 (239 genes) have patterns opposite to that of cluster 1 (Fig. 5C). A TPL-like gene from Table 2 is in cluster 9 and a second TPL-like gene is in cluster 10. In addition, in clusters 9 and 10, a number of other flowering time candidates are present including Medtr0020s0120 in cluster 9, which is similar to the FT antagonist TERMINAL FLOWER 1 in A. thaliana (Jaeger et al., 2013;Wickland & Hanzawa, 2015). Also present in cluster 9 are a trio of genes encoding B3 binding domain proteins Medtr1g021410, Medtr1g021435 and Medtr1g021500 and pair of genes which encode SHAQKYF class MYB transcription factors Medtr0036s0260 and Medtr5g027550. These genes are notable for genes containing these domains have been associated with flowering time, as has the jumonji domain protein encoding gene Medtr2g011630 in cluster 10 (Xia et al., 2012;Abe et al., 2015;Yan et al., 2014).
A total of 35/65 of the consistently differentially expressed genes which are higher in LDs than SDs (Fig. 5A) are present in cluster 1. These include Medtr1g099440 which encodes a membrane-associated kinase regulator-like protein, Medtr6g086805 a heat shock transcription factor gene and Medtr4g009110 which encodes a helix loop helix DNAbinding domain protein. Similarly, clusters 9 and 10 contain 19/54 of the consistently differentially expressed genes which are lower in LDs than SDs. These genes include the ethylene response factor gene Medtr5g016750 and Medtr4g119422 encoding a cullin-like protein.
We then ranked the clustered genes by their fold-change between LD and SD at ZT0. Strikingly, in cluster 1, within the top 13 genes with the largest fold increase in LD compared to SD at ZT0, MtFTa1 is ranked 8th, while 10 of the other genes were homologues of IRON MAN (IMA)/FE-UPTAKE-INDUCING PEPTIDE 1 (FEP1) genes. These encode mobile signalling peptides integral for the uptake of of iron from the soil and octuple ima/fep1 mutants in A. thaliana result in severe chlorosis (Grillet et al., 2018;Hirayama et al., 2018). They were also identified in a recent reannotation of the M. truncatula genome to identify small, secreted peptides (De Bang et al., 2017). In addition cluster 3, which is similar to cluster 1, has the remaining annotated IMA/FEP1 genes as 4 of the top 5 genes with the largest fold-change differences at ZT0.

DISCUSSION
This study presents a thorough overview of the changes in the M. truncatula leaf tissue transcriptome following a shift of vernalised plants from SD to LD conditions between ZT0 and ZT4. Our data are of very high quality with an average mapping rate of 89.86%. This suggests that while the existing Mt4.0v2 transcriptome captures the majority of the signal in the data there is nevertheless space to improve it. To date, a majority of transcriptomic datasets in M. truncatula have been generated from root tissue, not leaves (e.g., experiments in the MtGEA database are >50% root tissue (Benedito et al., 2008)). Thus this dataset broadens the understanding of gene expression in the aerial tissue of M. truncatula, in particular in differing photoperiodic conditions. Consequently it could be incorporated into future cross-species comparisons with datasets like that of Wu et al. (2014) who performed a similar shift experiment in the SD-responsive soybean.
While the data presented here consist of two composite datasets, and so interpretation requires caution, the results of this analysis are nevertheless biologically plausible. Initial pairwise comparisons of gene expression between LD and SD at each timepoint individually qualitatively agreed with previously published expression profiles of MtFTa1, MtFTb1 and MtFTb2 (Figs. 1A-1C;Laurie et al., 2011). Furthermore, independent RT-qPCR timecourses of MtFKF1, MtCDF1, MtCDF2 and MtCDF4 are similar to the transcript abundances seen in these RNA-Seq datasets (Fig. 2). From these results it was concluded that there is no significant batch effect between datasets and that it is appropriate to interpret this data as a time series.
The dual approaches taken to analyse the data as a time series first considered the interaction between condition and time and then just the effect of the condition. This successfully identified the genes which alter the pattern or just the magnitude of their expression respectively. This approach could serve as a template for similar datasets in other plants which lack a CO-like regulator. Given the significant role the circadian clock plays in the regulation of the A. thaliana transcriptome (Covington et al., 2008;Michael et al., 2008) and the manner in which the photoperiod regulates the circadian clock (Nohales & Kay, 2016), it is unsurprising to see that a greater number of genes were classed as altering their pattern of expression (9,516/31,363 of detectable genes; 30.34%) than only the magnitude of their expression (4,694/31,363; 14.96%). This is especially true of our selected candidate photoperiodic genes (Table 1) where 62% had a significant change in pattern.
Clustering was employed to subset the two classes of DE genes further based on their normalised abundances across the three timepoints. The low membership scores reveal the small degree of separation between clusters. This may be a feature of gene expression data, but the strength of c-means clustering is that it allows the certainty of cluster assignment to be assessed (Tables S6, S9). An alternative approach to subset the classes of genes would be to group them based on functional gene set descriptors such as Gene Ontology (GO) terms and cluster until individual clusters become enriched for single GO terms. However since in M. truncatula only 37% of genes have annotated GO terms (Tang et al., 2014), their use is currently of limited utility. This study has identified additional candidate photoperiodic flowering time genes for future characterising and reverse genetics screens. In terms of identifying genes co-expressed with the LD-induced M. truncatula FT-like genes, candidate genes such as the zinc finger gene Medtr4g113840 and the B3 domain transcription factor gene Medtr3g101520 (both in the same cluster as MtFTb1) present as future avenues of inquiry as potential regulators of photoperiodic flowering. Conversely, it would be interesting to investigate the CCT containing B-box type zinc finger gene Medtr2g073370 which has the opposite pattern of expression to MtFTb1 consistent with a repressive role. Finally, it is notable that none of the list of candidate photoperiod genes responded in a similar way to the SD to LD photoperiodic shift as the potent floral activator MtFTa1 (Table 2). However a number of other potential regulators were identified, such as the NF-Y-like Medtr8g091720 which has consistently higher expression in LD or the pair of TPL-like genes (Medtr4g114980 and Medtr7g112460) and ethylene response factor gene Medtr5g016750 which all have a pattern of expression opposite to MtFTa1.
The experiments analysed in this study focused on the first four hours of the diurnal cycle where, in LD, MtFTa1 is induced and which precede and include the first peak in expression of LD induced MtFTb1 and MtFTb2 at ZT4 (Laurie et al., 2011). It should be noted that the clustering presented in this study is limited to the timepoints sampled and cannot be considered predictive of the pattern in which genes are expressed later in the day. Genes in the same cluster may have divergent patterns later in the day. Given the diurnal pattern of expression of MtFTb1 and MtFTb2 (Laurie et al., 2011), our ability to identify candidate regulators which share patterns of expression would be enhanced by the inclusion of additional samples from later timepoints. Notably at ZT8 to capture the trough and ZT16 to capture the second peak in expression of MtFTb1 and MtFTb2 in LD. This is because samples from later timepoints would facilitate greater discrimination between expressed genes and thus result in smaller clusters. However, it is also possible that the regulator of M. truncatula FT-like gene expression is post-translationally regulated because light/protein dependent mechanisms are common in photoperiodic and circadian regulatory networks. In this case, it might not be possible to identify the regulators using a co-expression approach.

CONCLUSIONS
This study further elucidates the photoperiodic acceleration of flowering in the reference legume species M. truncatula which interestingly appears to lack a CO-like regulator. We found that the photoperiodic shift from SD to LD conditions had a large effect on the leaf transcriptome with 14,210 genes altering their pattern or magnitude of expression. Candidate regulators that were co-expressed with the LD-induced FT-like genes were identified by clustering. It was notable that none of the list of candidate photoperiod genes responded to the photoperiodic shift in a similar manner as that of the potent floral activator MtFTa1, and few were similar to that of the MtFTb genes. Thus this analysis further supports the idea that FT-like genes in M. truncatula are uncoupled from the photoperiodic transcriptional networks seen in other species and that flowering time in M. truncatula is induced in a novel manner. Future work will focus on molecular-genetic analysis of the function of the candidate regulators identified in this study in M. truncatula photoperiodic flowering.