BSA-Seq for the Identification of Major Genes for EPN in Rice

Shen, Shen; Xu, Shanbin; Wang, Mengge; Ma, Tianze; Chen, Ning; Wang, Jingguo; Zheng, Hongliang; Yang, Luomiao; Zou, Detang; Xin, Wei; Liu, Hualong

doi:10.3390/ijms241914838


by Shen Shen ^†, Shanbin Xu ^†, Mengge Wang, Tianze Ma, Ning Chen, Jingguo Wang, Hongliang Zheng, Luomiao Yang, Detang Zou, Wei Xin ^* and Hualong Liu ^*

Key Laboratory of Germplasm Enhancement and Physiology & Ecology of Food Crop in Cold Region, Ministry of Education/College of Agriculture, Northeast Agricultural University, Harbin 150030, China

^* Authors to whom correspondence should be addressed.

^† These authors contributed equally to this work.

Int. J. Mol. Sci. 2023, 24(19), 14838; https://doi.org/10.3390/ijms241914838

Submission received: 16 August 2023 / Revised: 16 September 2023 / Accepted: 28 September 2023 / Published: 2 October 2023

(This article belongs to the Section Molecular Plant Sciences)


Abstract: Improving rice yield is one of the most important food security issues worldwide and a central goal of rice breeding. The effective panicle number (EPN) is a key determinant of rice yield, so increasing the EPN is a major route to higher yield. Currently, few major quantitative trait loci (QTLs) for EPN in rice have been identified, and research on the underlying genes is also limited. Therefore, the identification and analysis of major genes related to EPN is of great significance for molecular breeding and yield improvement. This study used the japonica rice varieties Dongfu 114 and Longyang 11 to construct an F₅ population consisting of 309 individual plants. Two extreme phenotypic pools were constructed based on the EPN of the population, and QTL-seq analysis identified three major QTL intervals for EPN, from which 34 candidate genes were screened. Temporal expression pattern analysis of these 34 genes then identified six candidate genes with higher expression levels. Haplotype analysis of these six genes using the 3K database was used to select genes whose haplotypes differed significantly in EPN. Finally, five candidate genes related to EPN were obtained.

Keywords: Oryza sativa L.; EPN; BSA-seq; haplotype; candidate genes

1. Introduction

Rice is a major food crop, and increasing its yield is essential for ensuring food security. With the rapid development of the economy and society, the cultivated area is constantly decreasing, while the population is constantly increasing. Therefore, the improvement of rice yield has become an urgent problem to be solved in the future [1]. The improvement of the total rice yield largely depends on the yield per unit area, which is primarily determined by the factors that make up the yield composition [2]. The EPN, grains per panicle, and thousand grain weight are the three components of rice yield composition [3]. The composition factors of rice yield are interrelated and mutually constrained, all controlled by multiple genes, and are sensitive to the environment. The EPN per plant in rice is influenced by the number of tillers, while the number of tillers in rice is constrained by both fertilizer and genetic background. For example, treating nitrogen deficiency during the rice tillering stage and compensating with nitrogen fertilizer during the young panicle differentiation stage can significantly increase the number of effective panicles per plant [4]. The EPN of rice is not only easily affected by environmental factors but also a complex quantitative trait controlled by multiple genes. EPN can seriously affect rice yield, and having stable EPN is one of the most important characteristics of an ideal plant structure [5]. Therefore, improving the EPN of rice plants is the primary objective of rice breeding. The exploration and analysis of major genes related to EPN in rice, as well as the identification of their dominant haplotypes, are crucial for rice breeding and increasing rice yield.

The mapping and cloning of quantitative trait loci (QTLs) for important agronomic traits in rice, such as yield components, are the foundation and prerequisite for molecular breeding of high-yield and high-quality rice. QTL mapping determines the position of a QTL on a chromosome by analyzing the relationship between DNA markers across the genome and measurements of the quantitative trait. It typically involves obtaining a mapping population, constructing a genetic linkage map, genotyping and phenotyping individuals in the segregating population, and analyzing the relationship between genotype and phenotype. Traditional QTL mapping requires molecular markers distributed throughout the genome to genotype many individuals in a population [6]. To ensure sufficient statistical power, this method requires genotype and phenotype analysis of a large number of offspring, whereas bulked segregant analysis (BSA) only requires genotyping of individuals with extreme phenotypes [7]. BSA is used for gene mapping of qualitative traits or of quantitative traits controlled by major genes [8]. When used for quantitative trait mapping, this technique is also known as QTL-seq [9] and is widely applied in genetic breeding. In this method, two parents that differ significantly in the trait are crossed to obtain a segregating offspring population. From this population, a certain number of individuals with extreme phenotypes are selected, and their DNA is pooled for sequencing. The differences in allele frequency at each locus between the two pools are then calculated across the whole genome to determine the regions where trait-related QTLs are located. Initially, the BSA method was widely used to identify QTLs related to specific traits, such as disease resistance, color, and fertility. It has since been applied to QTL and gene mapping for a wide range of traits. BSA-seq was therefore an appropriate choice in this study for locating the major QTLs and exploring the target genes.

A large number of QTLs controlling rice yield traits have been mapped and cloned [10,11], and satisfactory results have been achieved in the molecular analysis of complex agronomic traits in rice [12]. Some QTLs related to EPN in rice have gradually been identified. Xu et al. conducted a QTL analysis of yield-related traits using 292 recombinant inbred lines (RILs) derived from TQ and LT and detected four QTLs affecting the EPN of rice on chromosomes 3, 4, 11, and 12 [13,14]. Zhang et al. constructed three sets of chromosome segment substitution line (CSSL) populations using PA64s, 9311, and Nihon [15]. After whole genome resequencing, they obtained high-density physical maps, which allowed them to locate a QTL on chromosome 1 that controls EPN, with a contribution rate of 20% [15]. Wu et al. used an RIL population derived from a cross between two indica rice varieties, ‘H359’ and ‘Acc85582’, and found five QTLs controlling rice tillering traits on chromosomes 1, 3, and 5 [16]. Miyamoto et al. used an RIL population to locate four QTLs related to tillering on chromosomes 2, 5, 6, and 8 [17]. In recent years, many genes related to important agronomic traits in rice have been cloned, such as MOC1, Ghd7, GS3, and GW2 [12,18]. Li et al. discovered the single-tiller gene MOC1 using mutant materials. It was the first gene identified as controlling rice tillering and is located on chromosome 6 [19]. TAD1 is a tillering and dwarf gene that encodes a coactivator of the anaphase-promoting complex/cyclosome (APC/C), a multi-subunit E3 ligase. It is located on chromosome 3 and is responsible for promoting the expression of the OSH1 gene, thus maintaining the pre-meristem region and facilitating the formation of the axillary meristem [20]. Although some genes related to EPN in rice have been discovered, few of them can be applied in rice production to truly improve yield. Therefore, more effective EPN-related genes still need to be explored.

In this study, the rice varieties Dongfu 114 and Longyang 11 were used as parents to create an F₅ population. Two extreme pools were then constructed based on the EPN of the population. The major QTL intervals for EPN were obtained using the QTL-seq method, and candidate genes associated with EPN were identified through temporal expression pattern and haplotype analysis. The aims of this study were to identify new major loci for EPN in rice and to provide genetic resources for rice breeding, with implications for studying rice tillering mechanisms and improving yield.

2. Results

2.1. Phenotypic Analysis and Evaluation of EPN

In this experiment, three replicates were set for the rice materials used. Of the two parental varieties, ‘DF114’ plants had a higher EPN than ‘LY11’ plants. The EPN of the 309 F₅ progenies (Figure 1) ranged from 11.98 to 18.25, and the 30 progenies with the lowest EPN and the 30 with the highest EPN were assigned to the L-pool and M-pool, respectively, for DNA resequencing. The skewness and kurtosis of EPN in the F₅ population were 0.506 and 0.783, respectively, and the population mean was 14.32 (Table S1). These values were consistent with the characteristics of a quantitative trait, indicating that the data were suitable for QTL analysis.

2.2. BSA-Seq Analysis

The mean coverage depth for the parents and the two pools was 50×. Alignment of the sequences to the ‘Nipponbare’ reference genome identified 867,195 SNPs and 143,441 indels, which were reduced to 247,881 SNPs and 40,409 indels by trimming and filtering (Table S2). A total of 288,290 high-quality SNPs/indels that were homozygous in each parent and polymorphic between the parents were then selected for BSA-seq analysis. ∆(SNP-Index) values (Figure 2A), Euclidean distance (ED) values (Figure 2B), and Fisher’s exact test p-values (Figure 2C) were used to identify candidate EPN-related QTL regions (Table S3). Three significant (p < 0.01) peaks in the ED distribution spanned 24.62–26.47 Mb, 11.82–17.80 Mb, and 22.81–23.23 Mb on chromosomes 7, 9, and 11, respectively. In contrast, the peaks in the ∆(SNP-Index) and Fisher’s exact test p-value distributions covered the entire interval on chromosomes 7 and 11. By taking the intersection of the significant regions from the three methods, we identified significant regions on three chromosomes. These regions had sizes of 1.85 Mb (24.62–26.47 Mb) for qEPN7, 6.41 Mb (15.49–21.90 Mb) for qEPN9, and 0.42 Mb (22.81–23.23 Mb) for qEPN11. Therefore, qEPN7, qEPN9, and qEPN11 were considered the most promising targets for mining candidate EPN genes.

2.3. Putative Candidate Genes for Three QTL Intervals

The three QTL intervals obtained from the above analysis were further screened, resulting in a total of 489 SNPs/indels, of which 238 were distributed on qEPN7, 242 on qEPN9, and 9 on qEPN11. After screening for genes carrying variants in upstream, 5′ UTR, and exonic (non-synonymous) regions, nine genes were obtained on qEPN7, namely Os07g0602700, Os07g0602800, Os07g0602900, Os07g0603100, Os07g0603200, Os07g0603300, Os07g0603500, Os07g0614000, and Os07g0617600. Additionally, 25 genes were obtained on qEPN9, namely Os09g0431500, Os09g0433600, Os09g0473300, Os09g0509800, Os09g0526500, Os09g0526700, Os09g0531600, Os09g0532200, Os09g0533300, Os09g0535200, Os09g0535500, Os09g0538700, Os09g0539400, Os09g0539500, Os09g0540300, Os09g0542100, Os09g0546400, Os09g0547800, Os09g0549400, Os09g0549500, Os09g0549600, Os09g0550400, Os09g0551400, Os09g0551500, and Os09g0551600, for a total of 34 genes.

2.4. Enrichment Analysis of Candidate Genes

We conducted a GO enrichment analysis of the 34 candidate genes and obtained significantly enriched GO terms. Some of the more highly enriched GO terms were selected and plotted (Figure 3A). According to their functional annotations, these GO terms fall into three categories: biological process, cellular component, and molecular function. The most significant biological process terms include DNA replication initiation, cellular response to reactive nitrogen species, and cellular response to inorganic substance. The most significant cellular component terms include the Pwp2p–containing subcomplex of 90S preribosome, replication fork protection complex, and U12–type spliceosomal complex. Among the molecular function terms, glycerophosphodiester phosphodiesterase activity, amine-lyase activity, and DNA replication origin binding have higher significance. These results suggest that the candidate genes may affect tiller formation in rice through these processes, components, and functions. In addition, KEGG enrichment analysis was conducted on the 34 candidate genes. According to Figure 3B, these genes are primarily involved in basal transcription factors, galactose metabolism, ribosome biogenesis in eukaryotes, metabolic pathways, and other pathways. The genes were significantly enriched in all of these pathways except metabolic pathways.

2.5. Temporal Expression Pattern of EPN-Associated Genes

Many studies have found that genes related to tillering exhibit a characteristic temporal expression pattern throughout the growth period. From 20 to 48 days after transplanting (DAT), their expression levels in the roots remained stable and high at 00:00 (R0) and 12:00 (R12) but decreased after 48 DAT [21]. This pattern can explain the stable transition from tillering development to panicle development from a genetic perspective. Therefore, in this study, we explored the temporal expression patterns of genes related to EPN. We used the RiceXPro website to conduct expression clustering analysis on root expression data for the 34 candidate genes, sampled at weekly intervals at 00:00 (R0) and 12:00 (R12) over the course of the growth period. The results showed that 32 genes had expression data in RiceXPro. In R12, eight genes had relatively high expression levels from 20 DAT to 48 DAT (Figure 4A), and in R0, nine genes had high expression levels from 21 DAT to 49 DAT (Figure 4B). After comparing the two panels, we identified six candidate genes that exhibited high expression levels at both time points: Os07g0603300, Os09g0433600, Os09g0549400, Os09g0539500, Os09g0549500, and Os09g0551600. This expression pattern is consistent with the dynamic change in tiller number throughout the growth period: tiller number increases rapidly from 21 DAT and peaks at 49 DAT. After 49 DAT, the expression of genes related to EPN decreased. This decrease may be associated with the transition from vegetative growth to reproductive growth and panicle development on tillers.

2.6. Analysis of Candidate Gene Haplotype by RFGB Database

Through haplotype analysis of the six candidate genes on the RFGB website, we found 12 SNPs in the Os07g0603300 gene, including three in the promoter region and nine in the coding region. In total, 24 haplotypes were obtained, with significant differences in EPN among them. The Os09g0433600 gene had nine SNPs, six in the promoter region and three in the coding region. A total of 11 haplotypes were obtained, with significant differences in EPN among them. The Os09g0549500 gene had 13 SNPs, four in the promoter region and nine in the coding region. A total of five haplotypes were obtained, with significant differences in EPN among them. The Os09g0549400 gene had 18 SNPs, four in the promoter region and 14 in the coding region. A total of nine haplotypes were obtained, with significant differences in EPN among them. The Os09g0551600 gene had nine SNPs, six in the promoter region and three in the coding region. A total of 13 haplotypes were obtained, with significant differences in EPN among them (Figure 5). In contrast, we did not find any SNPs in Os09g0539500. Therefore, we ultimately obtained five candidate genes: Os07g0603300, Os09g0433600, Os09g0549400, Os09g0549500, and Os09g0551600.

3. Discussion

Rice is the staple food for approximately 50% of the world’s population [22], so the demand for higher rice yield is substantial. As one of the key yield components, the EPN greatly affects rice yield and serves as a primary indicator in rice breeding. Improving rice EPN from a genetic perspective is worth exploring in depth. At present, some genes related to tillering have been identified in rice. For example, Zhu et al. constructed a low-panicle-number chromosome segment substitution line (C3074) by backcrossing, with Japan’s Qinghe and Guanglu’ai 4 as parents. Using 1429 recessive individuals derived from NIL-F2–3 populations obtained from C3074 and Guanglu’ai 4, they mapped the major QTL qPN1, which controls the EPN per plant in rice. The QTL was finely mapped to a 34.4 kb region on the long arm of rice chromosome 1, which contains six annotated genes [23]. Chen et al. used Zhenshan97 and Miyang46 as parents to construct a backcross population. By screening multiple generations of selfed materials from the BC2 population, they constructed a fine mapping population in which 18 target intervals segregated in a highly consistent genetic background. The minor-effect QTL qHd1, which controls both heading date and panicle number, was finely mapped to a 950 kb region between RM12102 and RM12108 at the end of the long arm of rice chromosome 1. This locus has a significant effect on panicle number per plant and shows an additive, increasing effect on both heading date and yield traits [24]. Nevertheless, continued exploration of new EPN-related genes is crucial for improving rice yield. So far, few QTLs related to EPN have been identified. One reason is that the number of rice panicles is controlled by a polygenic system, with relatively low heritability and susceptibility to various environmental conditions [25]. Another reason is the absence of a genetically diverse population suitable for mapping QTLs related to panicle number. In addition, there is often a negative correlation between EPN and the other two yield components [2], so improving one component often has a negative impact on the others. High yield at the individual level does not necessarily mean high yield at the population level. How to coordinate the balance between source and sink in rice, and make optimal use of yield-related genes in rice breeding, is a crucial issue. This study selected two rice varieties, Dongfu 114 and Longyang 11, as parents based on their significant difference in EPN. The aim was to identify favorable QTLs and candidate genes associated with higher EPN in rice through BSA-seq analysis and related methods. The findings of this study provide a theoretical foundation for future efforts to improve rice yield.

The BSA-seq method can rapidly localize QTLs. With the advancement of sequencing technology, its cost has decreased significantly. Compared to traditional linkage mapping, QTL-seq can improve work efficiency and provide high-density variant sites. QTL-seq has been successfully applied to many plants, including cucumbers [26], soybeans [27], rice [28], and tomatoes [29]. With the recent advancements in high-throughput sequencing, high-resolution mass spectrometry analysis, and information processing technology, BSA-seq has become more mature: its accuracy has improved and its cost has fallen [30]. The combination of traditional maps and BSA-seq can effectively and quickly narrow down the major QTL intervals [9,31,32]. For example, Guo et al. identified candidate genes for controlling drought tolerance in rice grown in cold regions during fertilization using BSA-seq and RNA-seq [33]. Zhao et al. identified a new locus, qGL3.5, that regulates rice grain length using BSA-seq [34]. Liang et al. used bulked segregant analysis combined with sequencing (BSA-seq) to identify a new pi21 haplotype that confers innate resistance to rice blast [35]. The main advantage of the BSA-seq method is that it sequences pooled DNA. It uses the genotype of one parent as a reference to calculate the SNP index in the extreme pools of offspring. This reduces the cost of DNA extraction and allows polymorphic SNPs between the two parents to be used directly for mapping, without the need to develop new markers [36]. F2, RIL, and DH populations can all serve as target populations for QTL-seq. While screening candidate regions, it is possible to perform functional analysis of candidate genes within those regions, which may lead to precise localization in a single step [9,37]. In this study, three analysis methods were used to map the QTL regions at the 99% significance level. Using the BSA-seq strategy, QTLs related to EPN were located on chromosomes 7, 9, and 11, in three regions containing annotated genes. In general, phenotypic variation is caused by non-synonymous mutations in the coding region of genes, while variation in the promoter region can affect the regulation of gene expression. Therefore, in this study, we first considered which genes in the mapped QTL regions carried significant non-synonymous SNPs or variation in the promoter or CDS region. To screen potential candidate genes in the intervals, we combined QTL-seq with temporal expression pattern and haplotype analysis. This approach reduced the number of candidate genes in the intervals defined by QTL-seq from 34 to 5.

In the past decade, genotype datasets for many rice varieties have been published and used to identify several loci related to important agronomic traits [38,39]. Some large-scale rice collections with sequence and phenotype data provide valuable materials and knowledge for rice research and breeding projects [38,40,41,42]. We found that Os07g0603300 and Os09g0549500, two of the five candidate genes, have been discovered and cloned by previous researchers. Os07g0603300 is GL7, a major QTL that controls grain length and width on rice chromosome 7. It encodes a protein homologous to the Arabidopsis LONGIFOLIA protein and regulates the longitudinal elongation of cells. A 17.1 kb tandem duplication at the GL7 locus causes upregulation of GL7 expression and downregulation of a negative regulator adjacent to GL7, thereby increasing grain length and improving the appearance quality of rice [43]. Os09g0549500, also known as OsU11/U12-31K, is necessary for normal plant development. Artificial microRNA (amiRNA)-mediated knockdown of AtU11/U12-31K in Arabidopsis results in delayed main stem growth, and rice Os31K is able to restore the wild-type phenotype in Arabidopsis plants carrying amiR1-4 [44]. With the continuous development of bioinformatics and the gradual reduction in sequencing costs, many studies on rice tillering traits are no longer limited to initial mapping but pursue more detailed, accurate, and in-depth exploration. Although some progress has been made in studying rice tillering ability, the available information is still quite limited. Therefore, on the one hand, it is necessary to continue mining reliable gene information from the rice genome. On the other hand, it is important to quickly locate and clone functional genes using mutant materials. Accelerating the exploration of rice functional genomics will support future gene cloning. Due to time constraints and other factors, this study did not conduct further research on the selected candidate genes. The next step could be to prioritize the remaining three candidate genes for in-depth research, including cloning, genetic transformation, and analysis of how these genes regulate rice tillering. Validating these findings is important for a more comprehensive understanding of the genetic mechanism of rice tillering.

4. Materials and Methods

4.1. Plant Materials and Construction of Segregating Pools

The japonica varieties ‘Dongfu 114’ (‘DF114’) and ‘Longyang 11’ (‘LY11’) were obtained from Northeast Agricultural University (Harbin, China) and used as the female and male parents, respectively, to create an F₅ population of 309 individuals. In the spring of 2021, ‘DF114’ (n = 48), ‘LY11’ (n = 48), and F₅ (n = 309) individuals were planted in four rows under natural conditions in paddy fields at Acheng Experimental Station (Harbin, Heilongjiang Province, China), and five plants from the center of each plot were selected for evaluation of EPN. Based on the EPN of the 309 F₅ individual plants, the 30 plants with the fewest panicles and the 30 plants with the most panicles were selected as the extreme few-panicle pool (L-pool) and the extreme many-panicle pool (M-pool), respectively.
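The pool construction described above can be illustrated with a minimal Python sketch. It assumes that EPN values are available as a mapping from plant ID to mean EPN; the function name `select_extreme_pools`, the plant IDs, and the toy values are hypothetical and are not taken from the study data.

```python
from typing import Dict, List, Tuple


def select_extreme_pools(
    epn_by_plant: Dict[str, float], pool_size: int = 30
) -> Tuple[List[str], List[str]]:
    """Return (L-pool, M-pool): plant IDs with the lowest and highest EPN."""
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    if 2 * pool_size > len(epn_by_plant):
        raise ValueError("population too small: the two pools would overlap")
    # Rank plants by EPN; ties are broken by plant ID so the result is deterministic.
    ranked = sorted(epn_by_plant, key=lambda pid: (epn_by_plant[pid], pid))
    low_pool = ranked[:pool_size]    # fewest panicles (L-pool)
    high_pool = ranked[-pool_size:]  # most panicles (M-pool)
    return low_pool, high_pool


# Toy population of 10 hypothetical plants; the study population had 309.
toy_population = {f"P{i:02d}": 10.0 + i for i in range(10)}
l_pool, m_pool = select_extreme_pools(toy_population, pool_size=3)
print("L-pool:", l_pool)
print("M-pool:", m_pool)
```

With the study's settings, `pool_size=30` would be applied to the 309 F₅ plants. How ties at the pool boundary were handled in the original work is not stated, so the tie-breaking rule here is only one possible choice.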

4.2. Phenotyping Analysis Genotyping Data and SNP Filtering

DNA samples from Dongfu 114, Longyang 11, and the two extreme pools were submitted to Guangzhou Kideo Biotechnology Co., Ltd. (Guangzhou, China) for QTL-seq analysis. The integrity of the DNA was assessed using agarose gel electrophoresis. The DNA concentration was first measured using Nanodrop and then accurately quantified using Qubit. Sequencing was performed on the Illumina HiSeq platform, with an average coverage depth of 50× for the parents and the two pools. The raw sequencing data were quality-checked and filtered to obtain clean reads. BWA-backtrack software was used to align the clean reads to the Nipponbare reference genome. Picard 2 software was used to remove duplicate reads. GATK 4 software was used to detect and filter SNPs. Finally, SNP sites between the samples and the reference genome were identified. Association analysis was performed using ∆(SNP-Index) [45], ED [46], and two-tailed Fisher’s exact test values [47]. The final QTL interval was determined as the overlapping interval of the three methods.
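The last step, taking the overlapping interval of the three methods, can be sketched as a simple interval intersection. This is a minimal illustration, assuming each method reports one significant region per chromosome as a (start, end) pair in Mb; the function name and the coordinates below are hypothetical and do not reproduce the study's results.

```python
from typing import Optional, Sequence, Tuple

Interval = Tuple[float, float]  # (start_mb, end_mb) on one chromosome


def intersect_intervals(intervals: Sequence[Interval]) -> Optional[Interval]:
    """Return the region shared by all intervals, or None if they do not overlap."""
    if not intervals:
        return None
    start = max(s for s, _ in intervals)  # latest start
    end = min(e for _, e in intervals)    # earliest end
    return (start, end) if start < end else None


# Hypothetical significant regions from three methods on the same chromosome.
regions = [(10.0, 20.0), (12.5, 25.0), (11.0, 18.0)]
print(intersect_intervals(regions))
```

Taking the intersection is conservative: a region is retained only if every method supports it, which trades some sensitivity for fewer false-positive intervals.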

The SNP-index method searches for QTLs by looking for differences in allele frequencies between the two pools. If variation at a locus is not related to the phenotype, the proportion of each allele in the two pools should be roughly equal, and the SNP index in each pool should be close to the theoretical segregation ratio. When Δ(SNP-Index) is close to 0, there is little association between the marker SNP and the trait. Conversely, the closer Δ(SNP-Index) is to 1, the stronger the association between the marker SNP and the trait, making the chromosome region a potential candidate QTL region.
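As an illustration of this calculation, the sketch below assumes the commonly used definition in which the SNP index at a site is the fraction of reads in a pool that carry the non-reference allele, and Δ(SNP-Index) is the difference between the two pools. The direction of subtraction (M-pool minus L-pool), the function names, and the read counts are assumptions made for illustration.

```python
def snp_index(alt_depth: int, total_depth: int) -> float:
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_depth <= 0:
        raise ValueError("total_depth must be positive")
    if not 0 <= alt_depth <= total_depth:
        raise ValueError("alt_depth must be between 0 and total_depth")
    return alt_depth / total_depth


def delta_snp_index(m_alt: int, m_total: int, l_alt: int, l_total: int) -> float:
    """SNP index of the M-pool minus SNP index of the L-pool at one site."""
    return snp_index(m_alt, m_total) - snp_index(l_alt, l_total)


# Hypothetical read counts at two sites, each covered by 50 reads per pool.
unlinked = delta_snp_index(25, 50, 25, 50)  # both pools carry the allele equally
linked = delta_snp_index(45, 50, 5, 50)     # allele enriched in the M-pool
print(unlinked, linked)
```

In practice the per-site values are usually averaged in sliding windows along each chromosome before thresholds are applied; that smoothing step is omitted here.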

The ED method uses sequencing data to identify markers that differ significantly between the two pools and to evaluate the regions associated with the trait. The ED value at a non-target site should tend toward 0. The larger the ED value at a marker, the greater the difference between the two pools at that marker.
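A minimal sketch of the ED calculation follows, assuming the widely used formulation in which ED at a site is the square root of the summed squared differences in the frequencies of the four bases between the two pools. The exact variant used by the sequencing provider is not specified in the text, and the function names and read counts below are hypothetical.

```python
import math
from typing import Dict

BASES = ("A", "C", "G", "T")


def base_frequencies(depths: Dict[str, int]) -> Dict[str, float]:
    """Convert per-base read depths at one site into frequencies."""
    total = sum(depths.get(base, 0) for base in BASES)
    if total == 0:
        raise ValueError("site has no coverage")
    return {base: depths.get(base, 0) / total for base in BASES}


def euclidean_distance(m_pool: Dict[str, int], l_pool: Dict[str, int]) -> float:
    """ED between the base-frequency vectors of the two pools at one site."""
    freq_m = base_frequencies(m_pool)
    freq_l = base_frequencies(l_pool)
    return math.sqrt(sum((freq_m[b] - freq_l[b]) ** 2 for b in BASES))


# Hypothetical read depths at two sites.
same = euclidean_distance({"A": 25, "G": 25}, {"A": 25, "G": 25})
fixed = euclidean_distance({"A": 50}, {"G": 50})  # pools fixed for different bases
print(same, fixed)
```

Under this definition the value ranges from 0 (identical allele frequencies) to the square root of 2 (pools fixed for different bases). Many pipelines raise ED to a power and fit a smoothed curve to suppress background noise; those steps are not shown.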

The two-tailed Fisher’s exact test is based on the hypergeometric distribution. It is used to test whether the allele depth ratio differs significantly between the two pools. The significance is expressed as a p-value: the smaller the p-value, the stronger the evidence that the allele ratio differs between the two pools.
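The test can be sketched with the Python standard library alone. The example assumes a 2 × 2 table of reference and alternate allele depths in the two pools and uses the usual two-sided rule of summing the probabilities of all tables (with the same margins) that are no more likely than the observed one. The function name and the read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows are the two pools; columns are reference and alternate allele depths.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical allele depths (ref, alt) in the M-pool and L-pool.
print(fisher_exact_two_tailed(5, 45, 45, 5))    # strongly different ratios
print(fisher_exact_two_tailed(25, 25, 25, 25))  # identical ratios
```

For genome-wide use, a statistics library would normally be preferred for speed, and the resulting p-values are typically corrected for multiple testing or smoothed across windows.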

4.3. GO and KEGG Enrichment Analysis of Candidate Genes

The candidate genes within the QTL intervals obtained from QTL-seq analysis were annotated using the online software Ensembl (http://www.ensembl.org/index.html, accessed on 25 April 2023) with Nipponbare as the reference genome. First, we mapped genes to each term in the GO database (http://www.geneontology.org, accessed on 10 May 2023) and calculated the number of genes in each term. Then, we applied hypergeometric tests to identify GO terms that were significantly enriched among the candidate genes compared to the background of the entire genome. GO analysis was used to classify genes by cellular component, molecular function, and biological process. KEGG is the primary public database of pathways. Pathway enrichment analysis takes the KEGG pathway as the unit and uses hypergeometric tests to identify pathways that are significantly enriched among the candidate genes compared to the background of the entire genome. Significantly enriched pathways indicate the main biochemical, metabolic, and signal transduction pathways in which the genes are involved. We analyzed the metabolic pathways of these annotated genes using the KEGG website (https://www.kegg.jp, accessed on 10 May 2023).
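The hypergeometric test used for both GO and KEGG enrichment can be sketched as follows. The example computes the probability of observing at least k annotated genes among n candidates, given that K of the N background genes carry the annotation. The function name and the counts are hypothetical; only the candidate list size of 34 comes from the text.

```python
from math import comb


def hypergeom_enrichment_p(
    background: int, annotated: int, candidates: int, hits: int
) -> float:
    """P(X >= hits) when drawing `candidates` genes without replacement
    from `background` genes, of which `annotated` belong to the term."""
    if not 0 <= annotated <= background:
        raise ValueError("annotated must be between 0 and background")
    if not 0 <= candidates <= background:
        raise ValueError("candidates must be between 0 and background")
    if not 0 <= hits <= min(annotated, candidates):
        raise ValueError("hits exceeds what the margins allow")
    upper = min(annotated, candidates)
    # Exact integer sum of the upper tail, divided once at the end.
    tail = sum(
        comb(annotated, i) * comb(background - annotated, candidates - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(background, candidates)


# Hypothetical example: 3 of 34 candidate genes fall in a term that
# annotates 200 of 30,000 background genes.
print(hypergeom_enrichment_p(30000, 200, 34, 3))
```

Because many terms are tested at once, the raw p-values are normally adjusted for multiple testing before a term is called significantly enriched; the correction used in the study is not stated.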

4.4. Temporal Expression Pattern and Haplotype Analysis

Using the RiceXPro website, expression clustering analysis was conducted on root expression data for the candidate genes, sampled at weekly intervals at 00:00 (R0) and 12:00 (R12) throughout the growth period, to screen for genes with higher expression levels. Using the Haplotype Analysis module (HaplotypeAnalysis) of the RFGB database (https://www.rmbreeding.cn, accessed on 25 May 2023), we analyzed the variant information in the promoter and CDS regions of the candidate genes, obtained the haplotypes of each candidate gene, and screened them further. Then, we used the PN phenotype database to analyze the differences in panicle number among haplotypes.

4.5. Statistical Analysis

Differences between parent and progeny EPN were analyzed using IBM SPSS Statistics 26. The average, standard deviation, skewness, and kurtosis of the EPN statistics were calculated. Graphs were created using Adobe Photoshop CC 2019 and Origin 2021.

5. Conclusions

This study used the japonica rice varieties Dongfu 114 and Longyang 11 to establish an F₅ population comprising 309 individual plants. Based on the EPN of the population, the 30 plants with the fewest panicles and the 30 plants with the most panicles were selected to form two extreme pools. QTL-seq analysis was then conducted on the parents and the two pools. The ∆(SNP-Index), ED, and Fisher’s exact test methods together identified the major QTL intervals for EPN on chromosomes 7, 9, and 11 as 1.85 Mb (24.62–26.47 Mb), 6.41 Mb (15.49–21.90 Mb), and 0.42 Mb (22.81–23.23 Mb), respectively. Temporal expression pattern analysis was then performed on the 34 candidate genes to identify six candidate genes with higher expression levels. Haplotype analysis of these six genes using the 3K database was used to exclude genes without significant haplotype differences in EPN, resulting in five high-quality candidate genes. Although further work is needed to elucidate the mechanisms of action of these genes, this study provides resources for breeding programs aimed at improving the EPN.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms241914838/s1.

Author Contributions

Writing (original draft preparation), S.S.; conceptualization, S.X. and S.S.; methodology and data analysis, S.S., S.X., M.W., N.C. and T.M.; data curation, H.Z., J.W., L.Y. and D.Z.; supervision, D.Z. and H.L.; resources, H.L. and H.Z.; visualization, L.Y.; review and editing, H.L. and W.X.; funding acquisition, H.L. and W.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by grants from the China Postdoctoral Science Foundation (Grant No. 2021M693792) and the “Breeding of high quality and resistant rice varieties” (Grant No. 2020ZX16B01), a major scientific and technological project of “Hundreds and Thousands” in Heilongjiang Province, China.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data was created.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

EPN	Effective Panicle Number
QTL	Quantitative Trait Locus
BSA	Bulk Segregant Analysis
GO	Gene Ontology
KEGG	Kyoto Encyclopedia of Genes and Genomes
ED	Euclidean Distance
RIL	Recombinant Inbred Lines
DH	Doubled Haploid
NIL	Near Isogenic Lines

Figure 1. Probability distribution of EPN of 309 rice plants in F₅ population.

Figure 2. Quantitative trait locus (QTL) analysis of rice EPN at maturity using three QTL-seq methods. (A) Manhattan plot showing the distribution of Δ(SNP-index) on chromosomes. (B) Manhattan plot showing the distribution of Euclidean distance (ED5) on chromosomes. (C) Manhattan plot showing the distribution of the log-transformed Fisher’s exact test p-value, –log10(p), on chromosomes. Blue and red lines represent 95% and 99% confidence intervals, respectively, and black lines represent mean values of the three algorithms, which were drawn using sliding window analysis. Numbers on the horizontal coordinates represent chromosome numbers.

Figure 3. GO enrichment analysis and KEGG enrichment analysis for candidate genes. (A) GO analysis. (B) KEGG analysis.

Figure 4. Temporal expression pattern of EPN-associated genes during the whole growth period. Expression data in root at 00:00 (A) and 12:00 (B) were downloaded from the RiceXPro website. The heatmaps represent hierarchical clustering of relative expression levels of 27 candidate genes at different days after transplanting (DAT). The scale for relative expression levels (after normalization by z-score) is denoted by color bars, with red representing high expression, white medium expression, and blue low expression.

Figure 5. Haplotype analysis of candidate genes. (A) Haplotype analysis of Os07g0603300. (B) Haplotype analysis of Os09g0551600. (C) Haplotype analysis of Os09g0433600. (D) Haplotype analysis of Os09g0549500. (E) Haplotype analysis of Os09g0549400.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
