Evolutionary history of Hemerocallis in Japan inferred from chloroplast and nuclear phylogenies and levels of interspecific gene flow

: The perennial herb genus Hemerocallis (Asphodelaceae) shows four flowering types: diurnal half-day, diurnal one-day, nocturnal half-day, and nocturnal one-day flowering. These flowering types are corresponding to their main pollinators, and probably act as a primary mechanism of reproductive isolation. To examine how the four flowering types diverged, we reconstructed the phylogeny of the Japanese species of Hemerocallis using 1615 loci of nuclear genome-wide SNPs and 2078 bp sequences of four cpDNA regions. We also examined interspecific gene flows among taxa by an Isolation-with-Migration model and a population structure analysis. Our study revealed an inconsistency between chloroplast and nuclear genome phylogenies, which may have resulted from chloroplast capture. Each of the following five clusters is monophyletic and clearly separated on the nuclear genome-wide phylogenetic tree: (I) two nocturnal flowering species with lemon-yellow flowers, H. citrina (half-day flowering) and H. lilioasphodelus (one-day flowering); (II) a diurnal one-day flowering species with yellow-orange flowers, H. middendorﬀii; (III) a variety of a diurnal half-day flowering species with reddish orange flowers, H. fulva var. disticha; (IV) another variety of a diurnal half-day flowering species with reddish orange flowers, H. fulva var. aurantiaca, and a diurnal one-day flowering species with yellow-orange flowers, H. major; (V) a diurnal half-day flowering species with yellow-orange flowers, H. hakuunensis. The five clusters are consistent with traditional phenotype-based taxonomy (cluster I, cluster II, and clusters III-V correspond to Hemerocallis sect. Hemerocallis, Capitatae, and Fulvae, respectively). These findings could indicate that three flowering types (nocturnal flowering, diurnal one-day flowering, and diurnal half-day flowering) diverged in early evolutionary stages of Hemerocallis and subsequently a change from diurnal half-day flowering to diurnal one-day flowering occurred in a lineage of H. major. While genetic differentiation among the five clusters was well maintained, significant gene flow was detected between most pairs of taxa, suggesting that repeated hybridization played a role in the evolution of those taxa. The perennial herb genus Hemerocallis (Asphodelaceae) shows four flowering types: diurnal half-day, diurnal one-day, nocturnal half-day, and nocturnal one-day flowering. These flowering types are corresponding to their main pollinators, and probably act as a primary mechanism of reproductive isolation. To examine how the four flowering types diverged, we reconstructed the phylogeny of the Japanese species of Hemerocallis using 1615 loci of nuclear genome-wide SNPs and 2078 bp sequences of four cpDNA regions. We also examined interspecific gene flows among taxa by an Isolation-with-Migration model and a population structure analysis. Our study revealed an inconsistency between chloroplast and nuclear genome phylogenies, which may have resulted from chloroplast capture. Each of the following five clusters is monophyletic and clearly separated on the nuclear genome-wide phylogenetic tree: (I) two nocturnal flowering species with lemon-yellow flowers, H. citrina (half- day flowering) and H. lilioasphodelus (one-day flowering); (II) a diurnal one-day flowering species with yellow-orange flowers, H. middendorffii ; (III) a variety of a diurnal half-day flowering species with reddish orange flowers, H. fulva var. disticha ; (IV) another variety of a diurnal half-day flowering species with reddish orange flowers, H. fulva var. aurantiaca, and a diurnal one-day flowering species with yellow-orange flowers, H. major ; (V) a diurnal half-day flowering species with yellow-orange flowers, H. hakuunensis . The five clusters are consistent with traditional phenotype-based taxonomy (cluster I, cluster II, and clusters III-V correspond to Hemerocallis sect. Hemerocallis , Capitatae, and Fulvae , respectively). These findings could indicate that three flowering types (nocturnal flowering, diurnal one-day flowering, and diurnal half-day flowering) diverged in early evolutionary stages of Hemerocallis and subsequently a change from diurnal half-day flowering to diurnal one-day flowering occurred in a lineage of H. major . While genetic differentiation among the five clusters was well maintained, significant gene flow was detected between most pairs of taxa, suggesting that repeated hybridization played a role in the evolution of those taxa. Harvester 2012). Graphical obtained using Packager 2015) 2017).


Introduction
Since the study of Darwin (1862), a wealth of evidence has been accumulated that adaptation to different pollinator groups promotes the divergence of floral traits and leads to speciation (reviewed in Schiestl and Johnson, 2013). In the process of adaptive divergence and speciation mediated by pollinators, floral displays attracting animal pollinators co-vary with the preferences and perceptual abilities of pollinators (Dyer et al., 2012). Under this co-variation, a change in a floral display trait can cause pollinator shifts and promote premating reproductive isolation between nascent species (Bradshaw and Schemske, 2003). Floral anthocyanin is a well-studied example of such a trait and repeated anthocyanin gain/loss, which changed flower color drastically, promoted pollinator-mediated divergence in Mimulus (Cooley et al., 2011), Petunia (Gübitz et al., 2009), Aquilegia (Hodges and Derieg, 2009), and Ipomoea (Streisfeld and Rausher, 2009). Flowering time can also facilitate rapid speciation by affecting both fitness and assortative mating pleiotropically as a magic trait (Matsumoto et al., 2015). Contrastively, postmating reproductive isolation is sometimes weakly developed even between sister species adapted to distinct pollinators, and interspecific hybridization has been found in contact zones, called hybrid zones (Hodges and Arnold, 1994;Wolf et al., 2001). Interspecific hybridization often leads to adaptive introgression (Suarez-Gonzalez et al., 2018) and contributes to new combinations of floral traits that can promote hybrid speciation (Rieseberg, 1997;Rieseberg et al., 1995). Such introgression between sister species adapted to distinct pollinators is a complicated process. A better understanding of this process requires studies that combine, highly-resolved phylogenies, detailed observations on pollinator visits, and reliable estimation of gene flow between the species (van der Niet and Johnson, 2012).
The genus Hemerocallis L. (Hemerocallidoideae, Asphodelaceae) is a suitable model to study pollinator-mediated speciation (Rodriguez-Enriquez and Grant-Downton, 2013). Hemerocallis is a group of perennial herbs, consisting of about 14 species diverged in East Asia (Chen and Noguchi, 2000;Hotta, 2016;Hu, 1969;Hwang and Kim, 2012). In Japan, six species are distributed (Hotta, 2016). Hemerocallis shows four flowering types: diurnal half-day, diurnal one-day, nocturnal half-day, and nocturnal one-day flowering (Chen and Noguchi, 2000;Matsuoka and Hotta, 1966). These flowering types correspond to flower color and their main pollinators. Diurnal half-day flowering species, H. fulva (L.) L. and H. hakuunensis Nakai, start to flower early in the morning and close flowers one or two hours after sunset. While Hemerocallis fulva has reddish orange flowers and is mainly pollinated by swallowtail butterflies (Hirota et al., 2012), H. hakuunensis has yellow-orange flowers, and is pollinated by bees (Kang and Chung, 1997a). Diurnal one-day flowering species, H. middendorffii Trautv. & C.A.Mey. and H. major (Baker) M.Hotta, start to flower early in the morning and close flowers early in the next morning. Both Hemerocallis middendorffii and H. major have yellow-orange flowers. Hemerocallis middendorffii is mainly pollinated by bumblebees (Lelej and Kupianskaya, 2000). The nocturnal half-day flowering species H. citrina Baroni and the nocturnal one-day flowering species H. lilioasphodelus L. start to flower at dusk and close flowers early in the next morning (H. citrina) or in the next evening (H. lilioasphodelus). Both H. citrina and H. lilioasphodelus have lemonyellow flowers with sweet scent that is lacking in diurnally flowering species. Hemerocallis citrina is mainly pollinated by nocturnal hawkmoths (Hirota et al., 2012). In contrast to the strong differentiation among species in floral traits, postmating reproductive isolation is weakly developed in Hemerocallis. Different species of Hemerocallis can be easily crossed by hand-pollination and F1 hybrids are highly fertile (Kawano and Noguchi, 1973;Matsuoka and Hotta, 1966;Yahara, 2006, 2008). Also, phenotypic intergradation has been recorded in natural hybrid zones of some pairs of species (H. fulva var. aurantiaca and H. citrina var. vespertina, Hasegawa et al., 2006;H. middendorffii and H. yezoensis (=H. lilioasphodelus var. lilioasphodelus), Kawano, 1961;H. fulva var. disticha and H. citrina var. vespertina, Kawano and Noguchi, 1973). Thus, premating isolation by flowering time and pollinator preferences may function as a primary mechanism of reproductive isolation in Hemerocallis.
While Hemerocallis has been studied as a model of pollinatormediated speciation (Hirota et al., 2012(Hirota et al., , 2013(Hirota et al., , 2019, phylogenetic relationships among species of Hemerocallis still remain unclear. Based on cpDNA phylogenetic analysis, Noguchi and De-yuan (2004) claimed that the nocturnal half-day flowering species, H. citrina, derived three times independently from nocturnal one-day flowering ancestors and  suggested that H. citrina, H. flava (=H. lilioasphodelus), and H. middendorffii were either polyphyletic or paraphyletic. In both studies, however, only a few cpDNA markers were used and the bootstrap values for the cpDNA phylogenetic trees were low. On the other hand, a recently constructed nuclear genome-based phylogeny of Hemerocallis revealed that samples of H. citrina and H. lilioasphodelus and samples of H. middendorffii form two different monophyletic clusters (Murakami et al., 2020). It is well-known that a maternally inherited chloroplast gene-based phylogeny does not always correspond to a nuclear gene-based phylogeny due to chloroplast capture (Rieseberg and Soltis, 1991) or incomplete lineage sorting (Takahashi et al., 2001). The former occurs under a directional gene flow of cpDNA, and the latter is a stochastic process which potentially occurs in a group that evolved through rapid adaptive radiation (Fior et al., 2013). Thus, the inconsistency between the chloroplast phylogeny (Noguchi and De-yuan, 2004; and the nuclear phylogeny (Murakami et al., 2020) may be either of these cases. For phylogenetic reconstruction, Noguchi and De-yuan (2004) and  used chloroplast markers only but Murakami et al. (2020) used genomewide SNPs only so that it still remains uncertain whether their different conclusions are due to inconsistencies of chloroplast and nuclear phylogenies or due to the use of different sample sets. To better understand the evolutionary history of Hemerocallis and to resolve conflicting hypotheses about phylogenetic tree topology (monophyletic vs. polyphyletic/paraphyletic), it is necessary to analyze chloroplast and nuclear genomes using the same dataset.
In addition to phylogenetic information, analyses of population structure and interspecific gene flow are required to understand the influence of hybridization on the evolutionary history of Hemerocallis. Substantial evidence shows phenotypic intergradation in mixed natural populations of Hemerocallis (Hasegawa et al., 2006;Kawano, 1961;Kawano and Noguchi, 1973), but no previous study has been conducted on the genetic population structure and interspecific gene flow in Hemerocallis.
Here, we examine the evolutionary history of all diploid Japanese species of Hemerocallis by phylogenetic and population genetic approaches. First, we reconstruct two phylogenetic trees separately using chloroplast DNA sequences and nuclear genome-wide SNPs. Second, we estimate genetic diversity indices and the population structure based on nuclear genome-wide SNPs. Third, we estimate the degree and direction of interspecific gene flow using nuclear genome-wide SNPs. Based on those data, we answer to the following questions: (1) Does the chloroplast phylogeny correspond to the nuclear genome phylogeny? (2) If there is any inconsistency between chloroplast and nuclear genome phylogenies, which of chloroplast capture and incomplete lineage sorting is more likely to cause the inconsistency? (3) Is there any significant gene flow between taxa of different flowering types? (4) What is the most plausible process for the adaptive divergence among the species of Hemerocallis with different flowering types?

Plant materials
The genus Hemerocallis (Asphodelaceae) is divided into three sections: Hemerocallis, Fulvae Nakai, and Capitatae Nakai, distinguished by flowering time, flower color, scents and other morphological characteristics (Hotta, 2016). Hemerocallis sect. Hemerocallis, including H. citrina and H. lilioasphodelus, is a group of nocturnally flowering species characterized by lemon-yellow flower color and sweet scent. Both species are probably pollinated by nocturnal hawkmoths (H. citrina: Hirota et al., 2012). Hemerocallis sect. Capitatae includes H. middendorffii, a diurnal one-day flowering species, which has yelloworange flowers with faint scent and is mainly pollinated by bumblebees (Lelej and Kupianskaya, 2000). Hemerocallis sect. Fulvae includes the diurnal half-day flowering species H. fulva, which has scentless reddish orange flowers mainly pollinated by swallowtail butterflies (Hirota et al., 2012). Hemerocallis major and H. hakuunensis are placed in H. sect. Fulvae (Hotta, 2016) or sect. Hemerocallis (Murakami et al., 2020). Hemerocallis major shows diurnal one-day flowering and H. hakuunensis shows diurnal half-day flowering; both have yellow-orange flowers.
Hemerocallis hakuunensis is mainly pollinated by bees (Kang and Chung, 1997a). The pollinators of H. major have not been reported. On Danjo Islands, which is the only natural habitat of H. major, diurnal and nocturnal moths were abundant but large butterflies and bees were rare (Kato et al., 1967;Miyata, 1973;. Thus, H. major may be pollinated by diurnal and nocturnal moths.
were dried with silica gel and stored at room temperature. Voucher specimens were deposited in the Herbarium of Kyushu University (FU). Total DNA was extracted from fresh or silica gel-dried leaves using either a DNeasy Plant Mini Kit (Qiagen) or the CTAB method (Doyle and Doyle, 1990). Extracted DNA of Xanthorrhoea preissii Endl. (Xanthorrhoeoideae, Asphodelaceae) was provided from the Kew DNA Bank (reference No. 36236) as an additional outgroup for the MIG-seq analysis.

PCR and sequencing of chloroplast DNA regions
Sequences of chloroplast DNA (cpDNA) regions were determined for 38 individuals of Hemerocallis and one individual of D. ensifolia. Corresponding sequences of X. preissii were obtained from the chloroplast whole genome sequence at GenBank: accession no. KX822774.1. For DNA extraction, we selected one individual from each of the 38 representative populations (Table S1) because the sequences of cpDNA regions of Hemerocallis have low genetic diversity (Noguchi and De-yuan, 2004;. Four non-coding cpDNA regions, trnL intron, trnL-trnF, rbcL-atpB and psbA-trnH, were amplified using primers listed in Table S2. New primers for rbcL-atpB were designed for the outgroup species D. ensifolia because the universal primers often amplified a non-target region instead of the rbcL-atpB region. The primers were designed based on the sequences of the rbcL and atpB genes of Dianella, Hemerocallis, and Xanthorrhoea from GenBank. The PCR reaction was mainly performed with TaKaRa Ex Taq (Takara Bio) with the PCR cycle consisting of 2 min at 95 • C, followed by 30 cycles of 1 min at 95 • C, 1 min at 55 • C, 1 min at 72 • C, and a final extension of 7 min at 72 • C. The PCR products were purified with an ExoSAP (USB). Except for psbA-trnH, the purified PCR products of each cpDNA region were sequenced using the same primers as in the PCR reaction with a BigDye Terminator v 3.1 Cycle Sequencing Kit and an ABI Prism 3100 automated sequencer (Applied Biosystems). For the psbA-trnH region, one informative site was very close to the forward primer psbA_1f, and thus we used the different reverse primer trnH_2r, which is slightly more upstream than trnH_1r. All sequence data were registered to DNA Data Bank of Japan (DDBJ) under accession nos. LC566709-LC566825.  (Table S1). Small symbols (circles, triangles and squares) and their colors indicate species/varieties and their locations in Japan. The background colors of the small symbols indicate the northern (black) or southern (gray) clusters in the cpDNA tree.

MIG-seq analysis for nuclear genome-wide datasets
Multiplexed ISSR genotyping by sequencing (MIG-seq, Suyama and Matsuki, 2015) was used for de novo SNP detection. The MIG-seq library was prepared following the slightly modified protocol of Suyama and Matsuki (2015). The 1st PCR was conducted using MIG-seq primer set 1. The 1st PCR products were diluted 50 times with deionized water and used for the 2nd PCR. The 2nd PCR was conducted using primer pairs including tail sequences, adapter sequences for Illumina sequencing, and the index sequence of a reverse primer to identify each individual sample. The 2nd PCR products were multiplexed in the size range of 300-800 bp. The sequencing was performed by an Illumina MiSeq platform (Illumina, San Diego, CA, USA) using a MiSeq Reagent Kit v3 (150 cycle, Illumina). According to Suyama and Matsuki (2015), we skipped the sequencing of the first 17 bases of read 1 (anchor region for 2nd PCR primer, ISSR primer, and anchor region in the 1st primers) and three bases of read 2 (anchor region for 2nd PCR primer) using the 'DarkCycle' option of the MiSeq system. Both ends of the fragments and index sequences were read by pair-end sequences (reads 1 and 2) and index sequencing; 80, 94, and six bases of sequences were determined as read 1, read 2, and the index read, respectively. All raw MIG-seq data were deposited at the DDBJ Sequence Read Archive (DRA) with accession number DRA010527.

Quality control and de novo assembly
The first 14 bases (12 bases of the SSR primer region and two bases of the anchor sequences in the 1st primers) of read 2 sequences were trimmed using the program 'fastx_trimmer' in the FASTX-Toolkit 0.0.14 (http://hannonlab.cshl.edu/fastx_toolkit). Low-quality reads were removed from raw reads by the 'quality_filter' option of FASTX Toolkit 0.0.14 using the settings of q = 30 and P = 40. Extremely short reads containing adapter sequence were also removed from raw reads using TagDust 1.13 (Lassmann et al., 2009). Quality filtered 80 bp-nucleotide sequences from both ends of the fragment were used for SNP discovery. A de novo assembly was performed by ustacks and cstacks (STACKS 1.47, Catchen et al., 2013). 'Stack' is a set of identical sequences and several of those stacks were merged to form putative loci. First, using the ustacks program, each stack was piled within a sample setting the minimum stack coverage at m = 3, and a set of putative loci within a sample was formed by comparing the stacks allowing for two mismatches between stacks within a locus (M = 2). Second, the cstacks program merged the putative loci of each sample allowing for two mismatches (n = 2) and produced the catalog of putative loci in the entire samples. The default settings for the other options were used. The interspersed repeats and low complexity DNA sequences contained in the catalog of putative loci were treated as missing data using RepeatMasker 4.0.7 (Smit et al., 2015). The sequences of the repeat-masked putative loci (hereafter referred to as loci) were used as a reference for short-read mapping. Among the repeat-masked putative loci, 24 loci showed high homology (blastn e-value < 1e-4) to the complete chloroplast genome of H. citrina (Ou et al., 2020, GeneBank Accession No. MN872235). These loci were removed after the mapping process. For short-read mapping, the quality filtered reads from each individual were separately mapped to the repeat-masked putative loci using BWA-MEM alignment 0.7.17 ) with default settings. The sequence alignment/map format files (SAM/BAM) were processed to identify genotypes using SAMtools 1.7 and Bcftools 1.6 . Then, the genotyped sites with coverage depth less than 10 reads were filtered out.

SNP calling and data preparation
The MIG-seq data was obtained from 238 individuals of Hemerocallis. Unfortunately, the outgroup samples, D. ensifolia and X. preissii, had to be eliminated from the data because they did not contain any shared SNP locus with Hemerocallis. Next, two datasets, one composed of sequences and another composed of SNPs, were prepared as follows. For the sequence dataset, we parsed the merged genotype file to provide the sequence data retaining both biallelic SNP sites and non-variable sites with VCFtools 0.1.15 (Danecek et al., 2011). Four different filtering criteria were applied for quality control of sequence data. First, any SNP site where one of two alleles had less than three counts was filtered out because it is difficult to distinguish polymorphisms from sequencing errors when the minor allele count of SNPs is too low (Roesti et al., 2012). Second, loci containing SNPs with high heterozygosity (H o ≥ 0.6) were removed because excess of heterozygosity potentially resulted from artifactual loci built from several paralogous genomic regions. Third, after performing a Hardy-Weinberg equilibrium test for each population with eight or more individuals, we excluded loci where allele frequencies deviated from the Hardy-Weinberg equilibrium at P < 0.01 for two or more populations (Larson et al., 2014). This procedure effectively filtered out the loci of the haploid organelle genomes (i.e. the loci of the mitochondria genome, because the loci of the chloroplast genome were already removed before). Fourth, nucleotide sites with a genotyping rate < 80% were eliminated. For the SNP-only dataset, we extracted all SNPs from the sequence dataset, removed loci with minor allele frequency < 5%, and randomly selected only one SNP site per locus to avoid the inclusion of linked SNPs.
For calculating genetic diversity indices of the nuclear genome-wide sequence, both of the sequence dataset and the SNP-only dataset were used (see next section). For phylogenetic analysis, two additional sequence datasets (genotyping rate ≥ 60% and ≥ 30%) were used along with the above-described sequence dataset (genotyping rate ≥ 80%). This is because the resolution of phylogenetic trees tends to be higher when the tree is constructed by a larger dataset even when it includes loci with lower genotyping rates (Wagner et al., 2013). Among the three datasets, the largest dataset had a genotyping rate ≥ 30% because it included more loci than the other datasets (≥60%, ≥80%) (Table S4). For outlier detection and population structure analysis, the SNP-only dataset was used. For estimating divergence time and migration rate, a part of the sequence dataset (genotyping rate ≥ 80%) was used.

Genetic diversity
The following genetic diversity indices were estimated as additional indicators of past hybridization events. The nucleotide diversity of cpDNA was calculated for species with two or more samples, using DnaSP 6 (Rozas et al., 2017). The sequence dataset (genotyping rate ≥ 80%) and the SNP-only dataset were used to determine total and SNPbased nucleotide diversities of the nuclear genome-wide sequence, respectively; both were computed for species and for populations, using the "site-pi" function of VCFtools (Danecek et al., 2011). Observed and expected heterozygosities were calculated for species and for populations, using VCFtools for the SNP-only dataset.

Phylogenetic analysis
Multiple alignments of the four cpDNA regions were performed using the program MUSCLE 3.8.1551 (Edgar, 2004), and alignment columns containing gaps were trimmed using a heuristic selection method based on similarity statistics of trimAl 1.4.rev15 (Capella-Gutierrez et al., 2009). We used Kakusan 4.0.2016.11.07 (Tanabe, 2007) to find suitable nucleotide substitution models and partitioning strategies for the nucleotide datasets. The data was subdivided into four components: rbcL-atpB (713 bp), psbA-trnH (525 bp), trnL intron (481 bp) and trnL-trnF intergenic spacer (359 bp), and then run through Kakusan4. The AICc criterion (Sugiura, 1978) was used to compare Nonpartitioned, Partitioned_Equal_Mean_Rate, and Separate models. The Parti-tioned_Equal_Mean_Rate (GTR + Γ), which assumes an equal rate of nucleotide substitution across arbitrarily specified partitions, proved optimal. Maximum likelihood phylogenies were inferred using the software RAxML 8.2.10 (Stamatakis, 2014), whereby 1000 replicates of parallelized tree search bootstrapping was conducted. For the phylogenetic analyses of the nuclear genome-wide sequences dataset, we also used a maximum likelihood approach in RAxML 8.2.10. We used a GTR + Γ model of sequence evolution, which requires sequence datasets, and performed 1000 replicates of parallelized tree search bootstrapping.

Outlier detection (SNPs under selection)
To distinguish outlier SNPs that are presumably under natural selection from neutral SNPs in the SNP-only dataset, an outlier test was done with the R package pcadapt (Luu et al., 2017). This method using a SNP-based principal component analysis has a low false-positive detection rate and does not need a priori grouping. In this method, a user is required to choose the optimal K principal components (PCs), typically based on inspection of a scree plot, in which K is the number of PCs with eigenvalues that depart from a straight line. We chose K values from 1 to 8 to detect outliers. Any SNP with a q-value less than 0.01 was considered an outlier. When the SNP-only dataset (67 SNPs) was examined with pcadapt, the outlier test with the optimal K = 3 revealed 15 SNPs putatively under divergent selection.

Population structure analysis
The population genetic structure of Hemerocallis was examined using individual-based principal component analysis (PCA) using GENODIVE 2.0 (Meirmans and Van Tienderen, 2004) and structure analysis (Pritchard et al., 2000). To estimate genetic variation and structure among taxa, the individual-based PCA was performed for two datasets: 52 neutral SNPs and 15 outlier SNPs. We preformed structure analysis using STRUCTURE 2.3.4 (Pritchard et al., 2000) for only the 52 neutral SNPs because the software assumes neutrality. We performed 100 independent runs with a burn-in of 100,000 steps and additional 100,000 steps with the admixture model and estimated log-likelihoods for each number of clusters (K = 1-20). Optimal K values were determined by using the Delta K method of Evanno et al. (2005) in Structure Harvester (Earl and Vonholdt, 2012). Graphical results were obtained using CLUMPAK (Cluster Markov Packager Across K, Kopelman et al., 2015) and R package Pophelper 2.2.9 (Francis, 2017).

Estimation of divergence time and migration rate
Three demographic parameters, population size (θ for 4N e μ), migration rate (m), and divergence time (t) were estimated assuming as Isolation-with-Migration (IM) model using the IMa3 program (Hey, 2010;Hey and Nielsen, 2004). For the IM model, we used the part of the sequence dataset consisting of the haplotypes of the 52 loci with the neutral SNPs, which were detected by the outlier test (average length of the loci: 84.3 ± 0.6 bases, minimum and maximum length are 64 and 92 bases, respectively). We used WhatsHap 0.16 (Martin et al., 2016) to determine the haplotypes of each locus. The haplotype sequences of each locus were aligned by MUSCLE 3.8.1551 (Edgar, 2004), and all columns containing any gap were trimmed using trimAl 1.4.rev15 (Capella-Gutierrez et al. 2009). We collected 100,000 genealogies every 100 MCMC steps, each following a burn-in period of 100,000 steps. The uniform prior limits were set for the models within each species ([θ, m, t] = [10, 50, 1]). The HKY model was applied to all loci, 100 Markovcoupled chains with 0.99 heating curve shape parameter were used. The divergence time was scaled using the geometric mean of the mutation rate per locus per year and assuming a generation time of five years. Because the mutation rate, which is required to estimate the demographic parameters, is not known for Hemerocallis, we used nuclear ribosomal internal transcribed spacer (ITS) substitution rates of the herbaceous angiosperms (average: 5.48 × 10 −9 substitutions per site per year, the lower bound: 1.72 × 10 −9 , the upper bound: 19 × 10 −9 , Kay et al., 2006). These substitution rates are consistent with the substitution rates of noncoding spacer/introns of nuclear genes of the Arabidopsis species (average: 4.48-9.86 × 10 −9 , Huang et al., 2012). In addition, we tested the priors ([θ, m, t] = [10, 50, 25]) with 30 Markov-coupled chains because  suggested that ancestors of H. middendorffii might have originated at least 25 million years ago.
Because the number of samples per population was not large enough to perform a population-based analysis, we defined "a population" as one of the following four taxa (a-d) corresponding to the four major clusters of the nuclear genome-wide phylogenetic tree ( Fig. 3) H. major). These four taxa were representative of clusters I to IV, each containing more than 20 samples from multiple localities. Because of the small sample size, the endemic taxa (H. major and H. middendorffii var. exaltata, discussed in 4.4) and the taxa only found at one locality in Japan (H. lilioasphodelus and H. hakuunensis (Cluster V)) were not included. Although H. middendorffii var. exaltata has 22 samples, we did not include it. Because this endemic variety seems to be monophyletically derived from H. middendorffii var. esculenta in the nuclear genome-wide phylogenetic tree (Fig. 3), H. middendorffii var. esculenta could be more suitable as a representative of cluster II. Treating each group as a population, we run models for each combination of the four groups. The ancestry of Hemerocallis remained unclear because we could not obtain outgroup information from the MIG-seq analysis, and a topological inconsistency was found between the cpDNA and the nuclear genome-wide phylogenies. Thus, we analyzed each pair of taxa separately. Population structure was previously suggested to have little effect on the estimates (Strasburg and Rieseberg, 2010), although the IM models assume that there is no structure within a population (Hey and Nielsen, 2004).

Genotyping and genetic diversity
The sequences of four cpDNA regions were determined for 40 individuals and, after gaps were trimmed, the total length of the sequences was 2078 bp. We identified 29 and 241 polymorphic sites for Hemerocallis and all samples, respectively. There were 16 haplotypes within Hemerocallis. Nucleotide diversity of the cpDNA regions across the genus Hemerocallis was 0.00263; H. middendorffii var. esculenta (0.00024) and H. fulva var. aurantiaca (0.00072) showed nucleotide diversities an order of magnitude smaller than the other taxa (0.00227-0.00254, Table S3).
A total of 68,321,960 raw reads (284,674 ± 7734 reads per sample) were obtained from MiSeq, and after quality control, 53,974,257 reads (224,893 ± 5295 reads per sample) remained. The de novo assembly produced a total of 82,934 putative loci (6,683,013 bp). Then, 224,376 bp of putative loci were masked by RepeatMasker because of simple repeats (200,608 bp) or low complexity (23,768 bp). Loci obtained from a few individuals were mostly removed during the site filtering process. Consequently, the largest sequence dataset (≥30% of genotyping rate, 120,579 bp) had 1615 loci shared by at least 72 individuals (

Phylogenetic analysis
The cpDNA phylogenetic tree showed two major geographical groups named northern and southern clusters (Fig. 2). Hemerocallis middendorffii var. esculenta and var. exaltata were included only in the northern cluster, and H. fulva var. aurantiaca and H. major were included only in the southern cluster. Hemerocallis hakuunensis was grouped with neither the northern nor southern cluster. Hemerocallis citrina var. vespertina and H. fulva var. disticha were included in both clusters. Most populations of these two taxa belonged to the southern cluster but a few populations (H. citrina, pop. 1, 3, 7; H. fulva var. disticha, pop. 40, 41, 42, 48) belonged to the northern cluster. These seven populations were distributed in the southern area of the northern cluster's distribution range (Fig. 2). Regardless of taxa, the distribution of the northern cluster was biased to the northern and Japan Sea-side areas of Honshu island, and the distribution of the southern cluster was biased to southwestern  (Table S1). Arrows indicate populations with a mixture of genetic components from two clusters in the population structure analysis (Fig. 4) The phylogenetic analysis using the largest dataset of MIG-seq (genotyping rate ≥ 30%) produced a topology with the highest bootstrap support and showed five major clusters (Fig. 3). Unlike the cpDNA tree, this MIG-seq tree showed a topology largely consistent with the taxonomy based on floral traits. Cluster I included two nocturnally flowering species, H. citrina var. vespertina (nocturnal half-day flowering) and H. lilioasphodelus (nocturnal one-day flowering), and H. lilioasphodelus formed a monophyletic group with 100% of bootstrap value although it was embedded in H. citrina (hereafter, for simplification, nocturnal half-day flowering and nocturnal one-day flowering are in some cases collectively referred to as nocturnal flowering Phylogenetic analyses using the smaller datasets (genotyping rate ≥ 60%, ≥80%) revealed major clusters almost corresponding to the major clusters in the phylogenetic tree based on the largest dataset (genotyping rate ≥ 30%), but with lower bootstrap supports (Fig. S1). In the phylogenetic analysis using the smallest dataset (genotyping rate ≥ 80%), H. lilioasphodelus and H. hakuunensis were nested within H. citrina var. vespertina.

Population structure
The genetic structure examined by STRUCTURE (Pritchard et al., 2000) at K = 5 (Fig. 4) confirmed the five major clusters of the MIG-seq tree (Fig. 3). However, a mixture of two genetic components were found in three populations (pop. 18, 19, 58) and hybrid group 61 (Fig. 4), which were also located at unique positions in the nuclear genome-wide phylogenetic tree (arrows in Fig. 3). First, the population of H. lilioasphodelus (pop. 19) had a genetic structure with genetic components of H. citrina var. vespertina (red in Fig. 4)  H. middendorffii (green). At K = 2, the optimal value according to delta K, the five major clusters of the MIG-seq tree were not separated.
The individual-based PCA using 52 putatively neutral SNPs showed groupings consistent with the five major clusters of the MIG-seq phylogenetic tree using the largest dataset (genotyping rate ≥ 30%) although some samples of H. middendorffii and H. fulva var. disticha were overlapped (Fig. S2A). In the PCA diagram using 15 outlier SNPs, however, H. middendorffii was well separated from H. fulva var. disticha (Fig. S2B).

Divergence time and migration between taxa
Considering the result of the MIG-seq phylogenetic tree (Fig. 3) Fig. S3). Subsequently, we estimated migration rates between all pairs of the four taxa. As a result, significant migrations were detected in all pairs of the taxa except the migration from H. middendorffii var. esculenta to H. fulva var. disticha (Table 1, Fig. S4). . 4. STRUCTURE plot for K = 2-6. K = 5 best matches the five major clusters of the phylogenetic analysis using the largest MIG-seq dataset. K = 2 has the largest delta K for our data. Species and populations are separated by broad and narrow vertical black lines. The population number may not be shown when the sample number from the population is three or less. The clusters shown at the bottom correspond to the nuclear genome-wide phylogenetic tree.

Fig
Two of the six pairs showed a significant asymmetric migration. The migration rate from H. fulva var. disticha to H. citrina var. vespertina was significantly larger than the migration rate of the reverse direction in that the lower bound of the 95% HPD interval of the former (m 2->1 ) exceeded the upper bound of the 95% HPD interval of the latter (m 1->2 ). Similarly, the migration rate from H. fulva var. aurantiaca to H. citrina var. vespertina was significantly larger than the reverse direction migration rate. These estimates agreed well with the estimates obtained using the larger divergence time prior ([θ, m, t] = [10, 50, 25]), indicating that the estimates of divergence time were robust (Table S5).

Discussion
The nuclear genome-wide phylogeny was not consistent with the chloroplast phylogeny: the former showed five major clusters which agreed with the phenotype-based taxonomy (particularly based on floral traits) but the latter corresponded to the geographic distribution (i.e. a northern and a southern cluster). This discrepancy is probably caused by chloroplast capture, as discussed below. The nuclear genome phylogeny is more reliable than the chloroplast phylogeny because the population structure based on the neutral loci showed that genetic differentiation was well maintained among the five major clusters although significant interspecific gene flow was detected between most pairs of taxa. Adaptation of floral traits to pollinators could be the main factor to maintain the genetic differentiation among the five major clusters. This reliable nuclear genome-wide phylogeny allows consideration of the most plausible speciation process in Hemerocallis offering an explanation of how the different flowering types and the three sections of Hemerocallis (Capitatae, Fulvae, Hemerocallis) have evolved. Our findings could indicate that the different flowering types have diverged early in the evolutionary history of Hemerocallis and have been maintained conservatively.

Inconsistency between chloroplast and nuclear genome-wide phylogeny
In plants, there are many examples of inconsistencies between the phylogenies from chloroplast and nuclear genes attributed to chloroplast capture (Rieseberg and Soltis, 1991) or incomplete lineage sorting (Takahashi et al., 2001). While both explanations are applicable to the inconsistency in Hemerocallis, we suggest below that chloroplast capture is more likely to be the case.
Introgression occurs through hybridization and subsequent backcross, and the specific case of cpDNA introgression is known as chloroplast capture (Rieseberg and Soltis, 1991). Chloroplast capture can occur if the chloroplast and nuclear genomes have different modes of inheritance. In general, chloroplast and nuclear genomes are inherited maternally and biparentally, respectively. Hemerocallis probably has this mode of inheritance as well. In Hemerocallis, introgression is expected because many of the species can easily hybridize in natural environments (Hasegawa et al., 2006;Kawano, 1961;Kawano and Noguchi, 1973) and also through hand-pollination (Kawano and Noguchi, 1973;Matsuoka and Hotta, 1966;Yasumoto and Yahara, 2006). The significant interspecific gene flow revealed in this study (Table 1, Fig. S4) confirmed that introgressions of nuclear genes did occur in the evolutionary history of Hemerocallis. During the process of recurrent backcrossing, not only nuclear genes but also chloroplast genes would have been exchanged between species. The majority of populations of H. citrina and H. fulva var. disticha had the southern cluster type of cpDNA while a few populations of those two species had the northern cluster type of cpDNA (Fig. 2). All other species had only one type of cpDNA (Fig. 2). As a result, H. citrina and H. fulva var. disticha showed much larger nucleotide diversities than other species (Table S3). Those populations of H. citrina and H. fulva with the northern cluster type of cpDNA may have originated from a hybridization with a species distributed in northern Japan (e.g. H. middendorffii), causing chloroplast capture. Incomplete lineage sorting, on the other hand, is provided by stochastic sorting of ancestral polymorphisms (Takahashi et al., 2001). Our results did not support stochastic sorting of ancestral polymorphisms because the inconsistencies between the chloroplast phylogeny and the nuclear phylogeny were not random but geographically biased.

Phylogeny and flowering types
Among the five major clusters of the nuclear genome-wide phylogenetic tree, cluster I and II correspond to a nocturnal flowering group (nocturnal one-day and half-day flowering) and a diurnal one-day flowering group, respectively, and clusters III-V correspond to a group of diurnal half-day flowering species except for the diurnal one-day flowering species H. major. This finding could indicate that the three flowering types (nocturnal flowering, diurnal one-day flowering, and diurnal half-day flowering) diverged early in the evolutionary history of Hemerocallis and have been maintained conservatively and, subsequently, diurnal one-day flowering evolved in H. major of cluster IV, and nocturnal one-day flowering and nocturnal half-day flowering differentiated in H. lilioasphodelus and H. citrina of cluster I, respectively.
The ancestral flower state remained unclear because our nuclear genome-wide phylogenetic tree was not rooted, and a root of the cpDNA tree was trichotomous. To infer the ancestral flower state, we need to consider flower traits of the clade sister to Hemerocallis. According to a recent molecular phylogenetic study of Xanthorrhoeaceae, now included in Asphodelaceae (McLay and Bayly, 2016), Hemerocallis is sister not to Dianella, which we used as an outgroup in our phylogenetic analysis, but to Simethis Kunth or Chamaescilla Mueller ex Bentham. Simethis is a monotypic genus distributed in western Europe and northern Africa. Chamaescilla is a genus including four Australian species. Hemerocallis, Simethis and Chamaescilla compose the hemerocalloid Table 1 Demographic parameters estimated by Isolation-with-Migration models for each pair of taxa. High point and 95% highest posterior density intervals in parentheses for divergence time in million years ago (t), population size in thousands of taxon 1 (θ 1 ), taxon 2 (θ 2 ), ancestral population (θ A ), and migration rate from taxon 1 to 2 (m 1->2 ) and from taxon 2 to 1 (m 2->1 ). Significance of migration was tested by likelihood ratio tests (Nielsen and Wakeley, 2001); *p < 0.05, ** p < 0.01, *** p < 0.001.  March 5, 2020). The flower longevity of Simethis is unknown but all taxa of the johnsonioid clade are diurnal one-day flowering (Clifford and Conran, 1998). Regarding pollination, Simethis and Chamascilla appear to be pollinated by bees (McLay and Bayly, 2016) and the johnsonioid clade contains mainly buzz-pollinated species (Clifford and Conran, 1998). More generally, it was suggested that buzz-pollination by bees is ancestral in Hemerocallidoideae and butterfly or moth pollination evolved later in Hemerocallis (Furness et al., 2013). According to the knowledge summarized above, the common ancestors of Hemerocallis may have been diurnal, one-day or longer than one-day flowering, and their flowers may have been mainly pollinated by bees. Among all Hemerocallis species, H. middendorffii is most similar to the hypothesized ancestor because it is diurnal one-day flowering and mainly pollinated by bees (Lelej and Kupianskaya, 2000). In contrast, all other species do not satisfy these three conditions simultaneously (see in Materials and Methods 2.1). Below, we thus assume that the flowering type of cluster II, including H. middendorffii, is the ancestral flower state of Hemerocallis. Under this assumption, Fig. 5A shows the most parsimonious hypothesis for the evolutionary history of Hemerocallis suggesting that all clusters (I-V) are monophyletic and that nocturnal flowering and diurnal half-day flowering emerged independently from a diurnal one-day flowering ancestor.
The hypothetical evolutionary history of Hemerocallis in Fig. 5 focuses on four flower traits: flower opening time, flower closing time, flower color, and floral scent. Flower opening and closing time together can determine flower duration. Each of these four traits is taxonomically important to classify Hemerocallis species (Hotta, 2016) and relevant for the pollination syndrome (van der Pijl, 1961). Moreover, flower opening and closing time of Hemerocallis are regulated by different major genes (Nitta et al., 2010). More generally, in plants, flower color and flower scent are often governed by major genes (color: e.g., Bradshaw and Schemske, 2003;Hoballah et al., 2007;scent: e.g., Klahre et al., 2011). Therefore, drastic changes in these four flower traits can be caused by a small number of mutations in major functional or regulatory genes.
Under the most parsimonious hypothesis (Fig. 5A), three of the four traits changed in the common ancestor of cluster I ("1" in Fig. 5A): ancestral diurnal one-day flowering changed to nocturnal half-day flowering (a shift of flower opening time from morning to dusk), yellow-orange flower color changed to lemon-yellow, and scentless flowers changed to flowers with a sweet scent. These changes of flower traits are typically associated with a pollinator shift from diurnal insects to hawkmoths (van der Pijl, 1961). On the sister branch, in the common ancestor of cluster III, IV, and V ("2" in Fig. 5A), diurnal one-day flowering was shortened to diurnal half-day flowering through a shift of flower closing time from the next morning to the evening before. This change may be associated with a specialization to diurnal pollinators. Subsequently, in the common ancestor of cluster III and IV ("3" in Fig. 5A), flower color changed from yellow-orange to reddish orange. This color change may be associated with a pollinator shift from bees to butterflies because butterflies prefer red flowers to yellowish flowers (Hirota et al., 2012(Hirota et al., , 2013. Next, within cluster IV ("4" in Fig. 5A), flower color changed from reddish orange to yellow-orange and diurnal half-day flowering was extended to diurnal one-day flowering in the ancestor of H. major through a shift of flower closing time from the evening to the next morning. These two changes may be associated with adaptation to a local pollinator environment of Danjo Islands, which is the only natural habitat of H. major (discussed in 4.3). Finally, in the ancestor of H. lilioasphodelus within cluster I ("5" in Fig. 5A), nocturnal half-day flowering was extended to nocturnal one-day flowering through a shift of flower closing time from the morning to the evening. This could also imply adaptation to a local pollinator environment but our knowledge of pollinators for H. lilioasphodelus on the continent (Russia, China and Korea) are too limited to derive any conclusion. Fig. 5A suggests that two pollinator shifts occurred in Hemerocallis: from bees to hawkmoths ("1" in Fig. 5A) and from bees to butterflies ("3" in Fig. 5A), each associated with a change in flower color from yelloworange to lemon-yellow and to reddish orange, respectively. The former, the evolution of hawkmoth pollination in Hemerocallis ("1" in Fig. 5A) is different from the previously known cases of Aquilegia and Petunia (Klahre et al., 2011;Whittall and Hodges, 2007), in which flower color changed from pink or red to white, resulting in a pollinator shift from bees or hummingbirds to hawkmoths. In Aquilegia and Petunia, a flower color change directly influenced preference of pollinators. In Hemerocallis, however, hawkmoths do not show a consistent preference to lemon-yellow flower color (Hirota et al., 2013) but prefer the high contrast of a UV bullseye pattern as a nectar guide (Hirota et al., 2019). The flower color change from yellow-orange to lemon-yellow may thus be related to a change in the contrast of the UV bullseye pattern. The latter, the evolution of butterfly pollination in Hemerocallis, directly corresponds to pollinators' color preference. Butterflies consistently prefer reddish flower color in Hemerocallis (Hirota et al., 2012(Hirota et al., , 2013. Additionally, bees take longer to find red flowers than flowers with more conspicuous colors (Spaethe et al., 2001). Thus, flower color change from yellow-orange to reddish orange has probably promoted a pollinator shift from bees to butterflies.
In Hemerocallis, flower colors are mainly attributed to anthocyanins and carotenoids (Griesbach and Batdorf, 1995). In some cases, drastic changes of flower color caused by major gene mutations result in a pollinator shift (Bradshaw and Schemske, 2003;Hoballah et al., 2007;Owen and Bradshaw, 2011). Such major genes are often related to transcription factors involved in anthocyanin (e.g. reddish, pink color) or carotenoid (e.g. yellow, orange color) pigmentation synthesis (Quattrocchio et al., 1999;Sagawa et al., 2016). The pollinator shifts from bees to hawkmoths ("1" in Fig. 5A) and from bees to butterflies ("3" in Fig. 5A) were associated with a flower color change from yelloworange to lemon-yellow and to reddish orange, respectively. The former may be related to a reduction or change of floral carotenoid pigmentation, whereas the latter may be related to an increased We also discuss another hypothetical evolutionary history of Hemerocallis because our dataset is limited to Japanese Hemerocallis species, and does not include the four species endemic to China (H. multiflora Stout, H. nana Forrest & W. W. Smith, H. forrestii Diels and H. plicata Stapf, Chen and Noguchi, 2000) and the two species endemic to Korea (H. hongdoensis M.G. Chung & S.S. Kang and H. taeanensis S.S. Kang & M.G. Chung, Hwang and Kim, 2012). The four Chinese species were suggested to belong to sect. Fulvae (Matsuoka and Hotta, 1966), and the two Korean species were described under sect. Fulvae (Chung and Kang, 1994;Kang and Chung, 1997b). These species are all diurnal half-day flowering and have uniformly colored orange or yellow-orange flowers, which are common to H. hakuunensis of sect. Fulvae. Among the six unanalyzed species, the three Chinese species, H. forrestii, H. plicata, and H. nana, are endemic to the highlands of south-western China, especially the northwest of Yunnan (Chen and Noguchi, 2000), which is known to have high plant diversity containing palaeo-endemisms (López-Pujol et al., 2011). Thus, H. forrestii, H. plicata, and H. nana may be a group of old origin and could be an ancestral lineage. Considering this possibility, we below reconstruct an evolutionary history of Hemerocallis by assuming that H. hakuunensis, which has the above character states common to the three Chinese species, is an ancestor.
Under this assumption (Fig. 5B), flower color changed from yelloworange to reddish orange in the common ancestor of cluster III and IV ("7" in Fig. 5B), which may be associated with a pollinator shift from bees to butterflies. On the sister branch, in the common ancestor of cluster I and II ("8" in Fig. 5B), diurnal half-day flowering was extended to diurnal one-day flowering through a shift of flower closing time from the evening to the next morning. The subsequent evolutionary events in cluster I ("9, 10" in Fig. 5B) and IV ("11" in Fig. 5B) occurred in the same order as in Fig. 5A. Interestingly, under this different evolutionary hypothesis (Fig. 5B), the two pollinator shifts from bees to other insects occurred, just as in Fig. 5A: from bees to hawkmoths ("9" in Fig. 5B) and from bees to butterflies ("7" in Fig. 5B). However, the evolutionary path in Fig. 5B requires an additional evolutionary event compared to the path in Fig. 5A: in Fig. 5B, flowering duration once shortened from diurnal one-day flowering to diurnal half-day flowering when the ancestral state of Hemerocallis was derived from closely related genera (Simethis and Chamaescilla) ("6" in Fig. 5B) but this event did not occur in Fig. 5A.
Although our MIG-seq tree is unrooted, the estimated divergence time (0.39-0.53 MYA) is consistent with the previous estimation based on cpDNA (McLay and Bayly, 2016). Therefore, Hemerocallis is likely to have experienced rapid adaptive divergence during the Quaternary. McLay and Bayly (2016) showed, using cpDNA regions, that the ancestor of Hemerocallis could have diverged from the sister genus of Simethis at least 10 MYA. So that, the ancestral state is likely to have been maintained since 10 MYA to 0.39-0.53 MYA when the adaptive radiation happened. During the long-term continuation of the ancestral state, a lot of mutations may have accumulated, resulting in difficulties of reconstructing a rooted tree. While the genus Hemerocallis is distributed in East Asia, the sister genus Simethis occurs in Western Europe and Northern Africa (Gianguzzi et al., 2012) and the second closest genus Chamaescilla is found in Australia (McLay and Bayly, 2016). These three geographic ranges are totally separated from each other. Since the history of the separation of these ranges remains uncertain, it is difficult to determine the genus closest to Hemerocallis to infer its ancestral state.
To understand and reconstruct the evolutionary history of Hemerocallis in greater detail, more information is required: first, a rooted phylogenetic tree; second, more samples of populations and species of Hemerocallis from outside of Japan; and third, more information about the flowering types and pollinators of the taxa closely related to Hemerocallis.

Interspecific gene flow
Significant interspecific gene flow was detected between most pairs of taxa. Moreover, biased gene flow was observed from the diurnal halfday flowering species (H. fulva var. aurantiaca and var. disticha) to nocturnal flowering species H. citrina (Table 1, Fig. S4). Biased gene flow was detected only between pairs of taxa whose flowering time overlaps for a few hours in the early evening only (Hasegawa et al., 2006;Nitta et al., 2010). This biased gene flow probably resulted from a biased pollen flow from early to later flowering species caused by the behavior of crepuscular hawkmoths. Assuming that pollinators have no preference for floral traits, it was theoretically predicted that biased pollen flow from diurnal half-day flowering to nocturnal half-day flowering species always happens because of pollen carry-over (Matsumoto et al., 2013). Crepuscular hawkmoths, one of the main pollinators of H. citrina, foraged rather opportunistically and showed no significant preference to flower color and scent (Hirota et al., 2013) despite of their preference for a high bullseye contrast pattern, which is stronger in flowers of H. citrina than H. fulva (Hirota et al., 2019). This provides a possible explanation for the biased gene flow from H. fulva to H. citrina. By contrast, unbiased bidirectional gene flow was observed between the other pairs of species whose flowering time overlapped over half a day or longer. The longer overlap period may diminish the effect of biased pollen flow from early to later flowering species.
While significant gene flow was detected between most pairs of taxa, the population structure based on neutral SNPs showed that genetic differentiation was well maintained among the five major clusters (Fig. 4). In general, under significant gene flow, closely related taxa are expected to show low genetic differentiation (Cooper et al., 2010). However, when local adaptations by divergent selection have been going on, genetic differentiation among taxa can be maintained not only for a relatively short time (e.g. Aquilegia, Li et al., 2014Li et al., , 2019 but also for a longer time period (e.g. Helianthus, Strasburg and Rieseberg, 2008). In this study, 15 outlier SNPs were found presumably under natural selections, indicating that divergent selection could have promoted local adaptations which possibly have played a role for maintaining the observed genetic difference even in the face of the significant gene flow detected in this study as well. In Hemerocallis, adaptation of floral traits to pollinators could have been more important than adaptation of vegetative traits to abiotic environments because the five clusters are clearly distinguished by floral traits but not by vegetative traits. The impact of interspecific gene flow and local adaptation on the evolutionary history of Hemerocallis is, however, to be studied in more detail using larger datasets.
In spite of the well-maintained genetic structure of the five major clusters, at the population level, three cases of introgression between clusters were found with the population structure analysis in populations 18, 19, and 58 (Fig. 4) (Fig. 4). Kawano (1961) reported a large natural hybrid swarm of H. lilioasphodelus and H. middendorffii at Otanoshike in the eastern part of Hokkaido and suggested that these species are normally isolated by habitats but they can easily hybridize bidirectionally once their populations get into contact with each other because their flowering times totally overlap (both of them are one-day flowering). Second, population 18, the Kechi population of H. citrina on Tsushima island, had genetic components not only from H. citrina (cluster I) but also from H. hakuunensis (cluster V) (Fig. 4). Hotta et al. (1984) (Fig. 4). While there have been no previous reports of introgressions between these two varieties, it is likely that gene flow from var. disticha to var. aurantiaca (pop. 58) happened at some moment.
In H. citrina population 4 and H. middendorffii population 28 at Mt. Ibuki, we found a few individuals of the putative hybrids (hybrid group 61, 62), but no sign of introgression between pop. 4 and 28 was observed in the structure analysis (Fig. 4) In addition, nucleotide diversities and heterozygosities of the parental populations (pop. 4 and 28) were similar to other populations of Hemerocallis (Table S1). These putative hybrids had genetic components of H. citrina (cluster I) as well as H. middendorffii (cluster II). It is notable that one individual from hybrid group 62 (in pop. 28 of H. middendorffii) was almost identical with H. citrina (Fig. 4) although it was growing in the H. middendorffii population. This individual may be a derivative of backcrossing from H. citrina (pollen donor) to a hybrid in pop. 28 of H. middendorffii (pollen recipient). Further study on the cpDNA of this individual is needed because it is not determined from our sample.
According to Hotta (1986), there is a hypothesis that H. major evolved from a hybridization between H. fulva var. aurantiaca and H. citrina. However, this hypothesis contradicts the result of the population structure analysis (Fig. 4), which showed that the genetic components of H. major are almost identical to H. fulva var. aurantiaca and there was no sign of gene flow from H. citrina. Interestingly, there was no individual with an H. fulva var. aurantiaca-like phenotype (reddish orange flowers and diurnal half-day flowering) in the H. major population (yellow-orange flowers and diurnal one-day flowering, Yasumoto, personal observation) although the genetic components of H. major are largely identical to H. fulva var. aurantiaca (Fig. 4). Hemerocallis major may have evolved and diverged from H. fulva var. aurantiaca (Fig. 3) because of the long-time isolation from H. fulva and the local pollinator environment on Danjo Islands. In the world, H. major is only naturally distributed on Danjo Islands. Danjo Islands have been geologically isolated from Kyushu island and the Continent by the sea and have developed their own unique fauna and flora with endemic species and varieties (Ito et al., 2017;Toriba, 1986;Uematsu et al., 1973;Yamaguchi and Ejima, 1973). According to the records of the fauna on Danjo Islands, diurnal and nocturnal moths are abundant but large butterflies and bees are rare (Kato et al., 1967;Miyata, 1973;.

Taxonomic insights
The nuclear genome-wide phylogenetic tree agreed with the phenotype-based taxonomy (Hotta, 2016;Matsuoka and Hotta, 1966), except for the result that H. fulva var. disticha s. str. and var. littorea were not separated clearly (Fig. 3). Hemerocallis sect. Hemerocallis, Capitatae, and Fulvae corresponded with cluster I, II, and III-V, respectively (Fig. 5). Our results also largely agreed with another MIG-seq study of Hemerocallis by Murakami et al. (2020) which found that the following three clusters were clearly separated: the first cluster composed of H. middendorffii varieties (H. sect. Capitatae, equivalent to our cluster II), the second cluster composed of H. fulva varieties (H. sect. Fulvae, equivalent to our cluster III), and the third cluster composed of H. citrina varieties, H. lilioasphodelus varieties, H. hakuunensis and H. major (H. sect. Hemerocallis, closest to our cluster I). In contrast to Murakami et al. (2020), our nuclear genome-wide phylogenetic tree showed two new additional clusters (IV and V) and H. hakuunensis and H. major did not belong to Hemerocallis sect. Hemerocallis (cluster I) but to H. sect. Fulvae (cluster III-V) as suggested by Hotta (2016). In our result, H. hakuunensis formed the monophyletic cluster V, which was distinctly separated from the other clusters (Fig. 3). Also, H. major formed a monophyletic clade which was clearly separated from H. fulva var. aurantiaca in cluster IV. These differences between the two studies can be attributed to the limited sample size of H. hakuunensis (three), H. major (one) and H. fulva var. aurantiaca (none) in Murakami et al. (2020). In contrast, our study used 10, 16, and 33 samples of H. hakuunensis, H. major, and H. fulva var. aurantiaca, respectively, providing more reliable evidence for the phylogenetic positions of H. hakuunensis and H. major.
Our nuclear genome-wide phylogenetic tree offers three suggestions for the intraspecific and species-level taxonomy of Hemerocallis in Japan. First, Hemerocallis fulva var. disticha s. str. and var. littorea are not separated (in cluster III, Fig. 3). Based on this result, we treat var. littorea as a synonym of var. disticha as follows.
Hemerocallis fulva var. littorea occurs in coastal habitats and was distinguished by its evergreen leaves from var. disticha which is deciduous in winter (Matsuoka and Hotta, 1966). However, the inland strains are often evergreen under cultivation and it is hard to distinguish the two varieties morphologically. Second, H. fulva var. aurantiaca has highly diverged from H. fulva var. disticha, and is closer to H. major. The distribution of H. fulva var. aurantiaca and H. major is limited to western Kyushu (Hotta, 1986) and both are distinguished by floral color and morphology. Thus, H. fulva var. aurantiaca may be treated as a distinct species sister to H. major. However, it is hard to distinguish H. fulva var. aurantiaca from H. fulva var. disticha by floral traits because the two varieties are very similar in having reddish orange tepals with dark tawny mottling and human-unrecognized scent. To describe H. fulva var. aurantiaca as a species, we need to define additional diagnostic morphological traits. Third, H. middendorffii var. exaltata endemic to Tobishima and Sado islands was derived from H. middendorffii var. esculenta widely distributed in the higher elevation of northern Japan. Hemerocallis middendorffii var. exaltata has a later flowering season and more flowers (10-30 in well-developed plants) per inflorescence than H. middendorffii var. esculenta (3-10 flowers) (Hotta, 2016;Matsuoka and Hotta, 1966). It is possible that these flowering traits are an adaptation to the local pollinator environment on the islands, like in the case of H. major (discussed in 4.3). This result supports the suggestion of Matsuoka and Hotta (1966) that H. middendorffii var. exaltata is a lineage of H. middendorffii adapted to the lowland environment of Tobishima and Sado islands.

Conclusions
Using 1615 loci of genome-wide SNPs (MIG-seq) and 2078 bp sequences of four cpDNA regions, our study revealed an inconsistency between chloroplast and nuclear genome phylogenies, probably due to chloroplast capture. Significant interspecific gene flow was detected between most pairs of taxa, but the five major clusters are clearly separated on the nuclear genome-wide phylogenetic tree corresponding to floral traits. This could indicate that gene flow among the five clusters is limited at the present time or that pollinator-mediated selection for floral traits is stronger than the effect of gene flow. Among the five major clusters, two clusters corresponded to a nocturnal flowering group and a diurnal one-day flowering group, respectively, and the other three clusters corresponded to a diurnal half-day flowering group except for the diurnal one-day flowering species H. major. This finding could indicate that these three flowering groups have diverged early in the evolutionary history of Hemerocallis. Hemerocallis provides a promising model system to examine the speciation process with gene flow.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.