Composition and Diversity of Endophytic Bacterial Communities in the Seeds of Upland Rice Resources from Different Origin Habitats in China

The cultivation of high-quality upland rice, which is drought tolerant and widely adaptable, can help address the food and water shortages facing the world’s rising population, and is therefore of great significance to sustainable agricultural development. Certain microbial agents can further enhance plant drought resistance, including the endophytic bacteria that live in plant seeds without causing infection, making them a vital resource for improving upland rice cultivation and yield. In this study, high-throughput sequencing on the Illumina MiSeq platform was used to investigate the structure and diversity of endophytic bacterial communities in the seeds of 12 upland rice varieties from different areas in Yunnan Province, China. The results showed that 39 endophytic operational taxonomic units (OTUs) coexisted in all samples. At the phylum level, the dominant phylum in the 12 seed samples was Proteobacteria (66.92–99.98%). At the genus level, Pantoea (9.75–99.24%), Pseudomonas (0.11–37.24%), Curtobacterium (0.01–19.90%), Microbacterium (0.01–14.95%), Methylobacterium (0.40–5.86%), Agrobacterium (0.01–4.53%), Sphingomonas (0.04–1.56%), Aurantimonas (0.01–1.45%) and Rhodococcus (0.11–1.09%) were the dominant genera coexisting in all the tested upland rice seeds, representing the core microbiota of upland rice seeds. Furthermore, correlation analysis showed that environmental factors of the upland rice habitats, such as climate, precipitation and altitude, exert significant effects on the composition and diversity of the endophytic bacterial communities in the seeds. The findings of this study contribute significantly to understanding the relationship between upland rice and its endophytic bacteria, which can be developed to enhance drought tolerance and yield in this important crop.


Introduction
Plant endophytes are microbes that live in the tissues and organs of healthy plants, either at a certain stage or at all stages of their life cycle, without causing disease or infection (Wilson 1995). The evolutionary process has made this interaction between plants and endophytes mutually beneficial (Liu et al. 2019). Endophytic bacteria not only provide plants with biological functions, such as promoting growth (Ansary et al. 2018), increasing yield (White et al. 2019), enhancing resistance (Rangjaroen et al. 2019) and inducing the synthesis of secondary metabolites (Mohamad et al. 2018), but their natural products also offer potential for applications in the pharmaceutical, agricultural and other industries (White et al. 2019; Chouhan et al. 2021). Consequently, research into the interactions between endophytes and plants has become a hot topic in the fields of plant science, agronomy and ecology (Vandenkoornhuyse 2015).
Plant seeds are not only integral to plant reproduction, as the preservers and transmitters of plant genetic information, but are also an adaptive strategy to ensure species reproduction under stress and, as such, are highly significant to agricultural production (Domergue et al. 2019). Studies have shown that plant seeds are rich in microbial resources and transfer beneficial microorganisms from generation to generation, thereby directly and indirectly impacting plant growth and development, health, quality and yield, as well as functional components and other biological characteristics (Walitang et al. 2018; Liu et al. 2019). Endophytic bacteria are the microorganisms most widely identified by researchers. A large number of studies have found that the dominant endophytic bacteria in plant seeds belong to the phyla Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes, while the dominant genera include Pantoea, Bacillus, Pseudomonas, Sphingomonas, Microbacterium and Acinetobacter (Zhang et al. 2018b; Wang et al. 2021a,b). However, compared with microorganisms in other plant tissues, research into plant seed endophytes is still limited. In particular, little has been reported on the effects of plant growth-related environmental factors on the diversity and community structure of endophytic bacteria in plant seeds.
China produces the most rice in the world, and rice is the country's dominant food crop. However, water shortages in China have negatively impacted rice cultivation, prompting increased research into the optimization of upland rice, an ecotype that is hardier and more drought resistant than lowland rice (Khan et al. 2021). In China, upland rice is grown mainly in the Yellow-Huaihe River Basin as well as in areas with diverse environments. In recent years, the improvement of plant drought resistance via endophytic bacteria has been investigated (Ullah et al. 2019). The in-depth study of endophytic bacteria in upland rice seeds is not only conducive to the improvement of rice yield, but also contributes to understanding the mechanism of drought tolerance. Moreover, it is not yet clear how environmental factors, such as temperature, altitude and precipitation, affect the core microbiota of upland rice seeds. Here, we analyzed the diversity and community structure of endophytic bacteria in 12 upland rice seed samples from China's Yunnan Province and also examined the correlation between the core microbiota and the temperature, humidity and altitude of this region. The aims of this paper were to clarify the core microorganisms that live in upland rice seeds and to reveal the relationship between these core microbiota and the environmental factors of this growing area.

Source of Upland Rice Seeds
Twelve upland rice seed samples were provided by the Hunan Hybrid Rice Research Center. The origin areas and related information for all samples are provided in Fig. 1.

Sample Surface Sterilization and Treatment
Three replicates of each sample were used in this study. First, the husks of each upland rice seed sample were removed using a small sheller. Then, under aseptic conditions, the following operations were performed in sequential order: the husked seeds were washed three times with prepared sterile water; 5 g of seeds were placed in a clean and sterile 50-mL tube containing 25 mL phosphate buffer (per liter: 7.15 g of NaH2PO4·2H2O, 22.04 g of Na2HPO4·12H2O, 200 µL of Silwet L-77) (Liu et al. 2019; Wang et al. 2021a); seeds were sonicated twice with an Ultrasonic Processor Scientz-IID sonicator (NingBo Scientz Biotechnology Co., Ltd., China) at low power (237.5 W; 950 W × 25%) in an ice bath for 5 min (alternating 30 2-s bursts with 30 2-s rests) (Liu et al. 2019; Wang et al. 2021a). To verify surface sterility, sterile tweezers were used to press surface-sterilized seeds into LB medium (LUQIAO), and the samples were incubated at 30 °C for 72 h.

DNA Extraction
Five grams of surface-sterilized upland rice seeds from each sample were frozen with liquid nitrogen and then quickly ground into a fine powder using a pre-cooled sterile mortar. Thereafter, DNA was extracted using the FastDNA® SPIN Kit for Soil (MP Biomedicals, Solon, OH, USA), according to the manufacturer's instructions.

Amplicon Library Preparation and Sequencing
All PCR amplifications were performed using TransStart FastPfu DNA Polymerase (TransGen, Beijing, China). For the rice seeds, 799F (5'-AAC AGG ATT AGA TAC CCT G-3') and 1492R (5'-GGT TAC CTT GTT ACG ACT T-3') were used for amplification of the 16S rRNA gene (first-round amplification). Thereafter, the 750 bp fragment amplified from the endophytic bacteria was used as the template for second-round amplification of the V6-V8 region (968F: 5'-AAC GCG AAG AAC CTT AC-3' and 1378R: 5'-CGG TGT GTA CAA GGC CCG GGA ACG-3'), with a 4-7 bp barcode added to the 5' end of primers 968F and 1378R. Thermocycling steps were as follows: 5 min at 95 °C; then 20 cycles of 45 s at 95 °C, 30 s at 57 °C, and 30 s at 72 °C. The amplified products were purified and mixed in equivalent amounts. The TruePrep® DNA Library Prep Kit V2 for Illumina (Vazyme, China) was used to prepare the library, after which all samples were sequenced by 300-bp paired-end sequencing on a MiSeq platform using the MiSeq® Reagent Kit v3 (600 cycles) (Illumina).

Sequence Data Processing
Assembly of the paired FASTQ files was performed using Mothur (version 1.39.0) (Schloss et al. 2011). Briefly, paired sequence reads were assembled after the removal of raw reads with ambiguous bases or low quality, such as those with read length < 50 bp or average Q score < 25, and of reads not matching the primer (pdiffs = 0) or barcode (bdiffs = 0). The high-quality DNA sequences were then aligned to the SILVA rRNA reference database (v119) (Quast et al. 2013), and the chimera.uchime module was used to remove chimeric sequences. Reads were then
The rice types are mainly divided into japonica rice and indica rice. Generally, japonica rice has darker leaf color, narrower leaf shape, shorter plant height, and shorter and thicker grain than indica rice. Japonica rice is characterized by cold resistance and drought resistance, while indica rice is characterized by heat resistance, strong light resistance and humidity resistance.
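The read-filtering criteria described in this section (no ambiguous bases, read length of at least 50 bp, average Q score of at least 25, exact primer match) can be illustrated with a minimal Python sketch. This is not the Mothur implementation used in the study; the function name, the in-memory read representation and the example reads are assumptions made purely for illustration, and barcode matching is omitted.

```python
from statistics import mean

# 968F forward primer as given in the text, with spaces removed.
FORWARD_PRIMER = "AACGCGAAGAACCTTAC"


def passes_filter(seq, quals, primer=FORWARD_PRIMER, min_len=50, min_mean_q=25):
    """Return True if a read meets the thresholds quoted in the text.

    seq   -- read sequence as an upper-case string
    quals -- per-base Phred scores (assumed to be the same length as seq)
    """
    if len(seq) < min_len:                      # read length < 50 bp is rejected
        return False
    if any(base not in "ACGT" for base in seq):  # ambiguous bases such as N
        return False
    if mean(quals) < min_mean_q:                # average Q score < 25 is rejected
        return False
    return seq.startswith(primer)               # pdiffs = 0: exact primer match


# Hypothetical reads, for illustration only.
good_read = FORWARD_PRIMER + "ACGT" * 10        # 57 bp, starts with the primer
reads = [
    (good_read, [30] * len(good_read)),
    (good_read, [20] * len(good_read)),         # fails on mean quality
    (good_read[:40], [30] * 40),                # fails on length
]
kept = [seq for seq, quals in reads if passes_filter(seq, quals)]
print(f"kept {len(kept)} of {len(reads)} reads")
```

In practice the same thresholds would be passed to the pipeline's own trimming commands; the sketch only makes the pass/fail logic explicit.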

Data Statistics
Community richness, evenness and diversity analyses (Shannon, Simpson, ACE, Chao and Good's coverage) were performed using Mothur. PCoA (principal coordinates analysis) was performed in Mothur based on the thetaYC distance matrix. The t-test (with 95% confidence intervals) was used to determine whether the means of the evaluation indices differed, with p < 0.05 considered statistically significant. Taxonomy was assigned using the online RDP classifier (Wang et al. 2007) with the default parameter (80% threshold) based on the Ribosomal Database Project (Cole et al. 2009). Differences in genus and family abundance between samples were analyzed using Metastats (White et al. 2009). Spearman correlation coefficients between two variables were calculated using the R command "cor.test". RDA (redundancy analysis) at the genus level was performed using the "vegan" package in R, with average annual precipitation, average annual temperature and altitude selected as the explanatory variables.
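For readers unfamiliar with the indices listed above, the following Python sketch computes Shannon, Simpson, Chao1 and Good's coverage from a vector of OTU counts. It uses common textbook forms of these estimators (Simpson as the sum of n(n-1) over N(N-1), and the bias-corrected Chao1), which are assumed here to correspond to the Mothur calculators; ACE is omitted for brevity. The function name and the example counts are hypothetical.

```python
import math


def alpha_diversity(counts):
    """Compute basic alpha-diversity indices from OTU counts for one sample."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)                       # total number of reads
    if n < 2:
        raise ValueError("at least two reads are needed")
    s_obs = len(counts)                   # observed OTUs
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons

    shannon = -sum((c / n) * math.log(c / n) for c in counts)
    simpson = sum(c * (c - 1) for c in counts) / (n * (n - 1))
    chao1 = s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))   # bias-corrected form
    coverage = 1 - f1 / n                            # Good's coverage

    return {
        "shannon": shannon,
        "simpson": simpson,
        "chao1": chao1,
        "coverage": coverage,
    }


# Hypothetical OTU counts for a single sample.
example_counts = [50, 30, 10, 5, 2, 2, 1]
indices = alpha_diversity(example_counts)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

Higher Shannon values and lower Simpson values (in this form) indicate greater diversity, while Chao1 estimates richness and Good's coverage indicates how completely the sample was sequenced.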

Sequence Accession Numbers
The raw high-throughput sequencing data were submitted to the NCBI database with Accession number SRR13319808-SRR13319843 and BioProject number PRJNA688367.

Diversity Analysis of Endophytic Bacteria in Upland Rice Seeds
According to the information obtained on barcode and frontend primers, the quality control sequences were divided into 36 groups of sequence files, from which a total of 2,089,709 high-quality sequences were obtained, with an average of 58,047 sequences per sample (Supplementary Table S1).
Because of the large sample size, the total value of repeated samples was used as the final calculation. The original diversity data are shown in Supplementary Table S2. According to the difference in distance between the sequences, the 16S rRNA genes obtained were clustered into OTUs for species classification at a similarity level of 97%. A total of 5704 OTUs were generated from all samples, with the number of OTUs in each sample ranging from 322 to 1527 ( 19H011, 19H012, 19H013, 19H014, 19H015, 19H017, 19H018, 19H019, 19H022, 19H024, 19H029 and 19H032, respectively. Diversity (α diversity) indices of the samples include ACE, Chao, Shannon and Simpson values, of which ACE and Chao values are used for sample richness assessment, while Shannon and Simpson values are used for sample diversity assessment. In general, some differences in diversity and richness were found among the samples, with samples 19H011, 19H015, 19H017, 19H018 and 19H024 exhibiting Fig. S1 and Fig. S2).

Bacterial Endophyte Community Compositions and Structures of Upland Rice Seeds
The endophytic bacterial community compositions of the 12 upland rice seed samples from different areas in Yunnan Province are shown at the phylum level in Fig. 2A. The endophytic bacterial community structures of all 12 samples were found to have low diversity at the phylum level, mainly including Proteobacteria and Actinobacteria. Proteobacteria was the main group, with relative abundances in the different samples between 66.92% and 99.98%, followed by Actinobacteria, with abundances ranging from 0.01% to 32.21%. At the genus level, 148 genera were covered by the endophytic bacteria in all upland rice seed samples. The main bacteria with high relative abundances were Pantoea, Pseudomonas, Curtobacterium, Microbacterium, Methylobacterium, Agrobacterium, Sphingomonas, Aurantimonas and Rhodococcus, which ranged from 9.75-99.24%, 0.11-37.24%, 0.01-19.90%, 0.01-14.95%, 0.40-5.86%, 0.01-4.53%, 0.04-1.56%, 0.01-1.45% and 0.11-1.09%, respectively (Fig. 2B).
Relative abundance of shared/unshared genera in each upland rice seed sample. The abscissa represents the sample name, the ordinate represents the relative abundance of species, each color represents one species, and the corresponding rectangular height represents the relative abundance of species. When judging the relative abundance of a species in a sample, only the length of the color rectangle should be considered and not the accumulated heights of other colors below the rectangle (Color figure online)
Classification of the samples at the level of 97% sequence similarity (genus, top 10), presented in Fig. 3, revealed that the abundance distribution of endophytic bacteria in each seed sample was different at the genus level. To explore differences in the structures of endophytic bacterial communities in the 12 upland rice seed samples, PCoA was used to draw the two-dimensional distribution diagram of seed samples shown in Fig. 4.
The distance between samples in the diagram reflects the degree of community structure similarity, with closer distances reflecting more similar community structures. The results of the PCoA showed that samples 19H019 and 19H029 were close to each other, as were samples 19H011, 19H012, 19H013, 19H015, 19H022 and 19H024, which indicated that the structures of the endophytic bacterial communities in samples 19H019 and 19H029 were similar, while those in samples 19H011, 19H012, 19H013, 19H015, 19H022 and 19H024 were similar.
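To make the notion of between-sample distance concrete, the sketch below computes a pairwise dissimilarity using the Yue and Clayton theta measure (available in Mothur as the thetayc calculator), which is assumed here to be the measure underlying the PCoA distance matrix. The sample labels and genus counts are hypothetical and do not reproduce the study data; the ordination step itself is not shown.

```python
def thetayc_distance(counts_a, counts_b):
    """Yue and Clayton dissimilarity (1 - theta) between two communities.

    counts_a, counts_b -- abundance counts for the same ordered list of taxa.
    """
    total_a, total_b = sum(counts_a), sum(counts_b)
    a = [c / total_a for c in counts_a]    # relative abundances, sample A
    b = [c / total_b for c in counts_b]    # relative abundances, sample B
    ab = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    theta = ab / (aa + bb - ab)            # similarity in [0, 1]
    return 1 - theta


# Hypothetical genus counts for three samples (same genus order in each).
samples = {
    "sample_A": [900, 50, 30, 20],
    "sample_B": [880, 60, 40, 20],
    "sample_C": [300, 400, 200, 100],
}

names = list(samples)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = thetayc_distance(samples[first], samples[second])
        print(f"{first} vs {second}: {d:.4f}")
```

A value near 0 indicates nearly identical community structures and a value near 1 indicates almost no shared structure, which is what short and long distances in the ordination diagram summarize.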

Analysis of Environmental Factors Affecting the Community Structure and Diversity of Endophytic Bacteria in Upland Rice Seeds
Fig. 3 Classification of samples at the level of 97% sequence similarity (genus, top 10). The horizontal axis represents the sample names and the vertical axis represents the bacterial genera in the samples. The relative abundance of endophytic bacterial genera from 0 to 1 corresponds to a color gradient of blue, yellow and red; the brighter the red, the higher the relative abundance of the species. Samples with similar community structures were also clustered according to the relative abundance of different bacterial genera in each sample (Color figure online)
In order to further explore the impact of environmental factors in upland rice origin areas on the endophytic bacterial community structure and diversity of the sample seeds, we gathered data on the temperature, precipitation and altitude of the corresponding areas in Yunnan Province, China, presented in Table 3 and Fig. 1B. From these, we examined the relationship between environmental factors, sample distribution and the main dominant bacteria by RDA, with the average annual precipitation, average annual temperature and altitude used as variables (Fig. 5). In the RDA diagram, the effect of each environmental factor on the endophytic bacteria in seeds is characterized mainly by the length of its arrow, while the degree of influence on each genus is reflected by the cosine of the angle between the factor and the genus. Temperature, precipitation and altitude were found to have significant effects on the endophytic bacteria in upland rice seeds, with precipitation and altitude exerting the main influence (Fig. 5). There was a significant positive correlation between the main dominant genus Pantoea and precipitation, temperature and altitude, with the correlation between Pantoea and altitude being the strongest. However, there was a negative correlation between the other dominant bacteria and environmental factors.
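The geometric reading of an RDA diagram described above can be sketched in a few lines of Python: the length of an environmental arrow indicates the strength of its effect, and the cosine of the angle between an environmental arrow and a genus arrow indicates the sign and approximate strength of their association. The coordinates below are hypothetical and are not taken from Fig. 5; in the study the ordination itself was computed with the "vegan" package in R.

```python
import math


def vector_length(v):
    """Euclidean length of a biplot arrow."""
    return math.sqrt(sum(x * x for x in v))


def cosine_of_angle(u, v):
    """Cosine of the angle between two biplot arrows."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (vector_length(u) * vector_length(v))


# Hypothetical biplot coordinates on the first two RDA axes.
environment = {
    "altitude": (0.8, 0.3),
    "precipitation": (0.6, -0.5),
    "temperature": (0.2, 0.1),
}
genera = {
    "genus_X": (0.9, 0.2),     # points roughly along the altitude arrow
    "genus_Y": (-0.7, -0.1),   # points away from the altitude arrow
}

for factor, env_vec in environment.items():
    print(f"{factor}: arrow length {vector_length(env_vec):.2f}")
    for genus, genus_vec in genera.items():
        cos = cosine_of_angle(env_vec, genus_vec)
        relation = "positive" if cos > 0 else "negative"
        print(f"  {genus}: cos = {cos:.2f} ({relation} association)")
```

An acute angle (cosine above 0) corresponds to a positive association and an obtuse angle (cosine below 0) to a negative one.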
By comparing precipitation at sampling sites with the abundance of dominant endophytes in the upland rice seeds, it was found that the abundances of Pseudomonas, Microbacterium, Methylobacterium and Sphingomonas in most of the seeds collected in low precipitation areas were higher than in those from high precipitation areas (Fig. 6). The proportion of other bacteria was low and, therefore, no further analysis was conducted.
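A comparison of this kind can be quantified with a Spearman rank correlation between precipitation and the relative abundance of a genus. The study reports using the R command "cor.test" for Spearman coefficients; the following is a dependency-free Python stand-in that computes only the coefficient (no p-value). The precipitation and abundance values are hypothetical and are not the study data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Hypothetical values: annual precipitation (mm) and relative abundance (%)
# of one genus across six sampling sites.
precipitation = [800, 950, 1100, 1300, 1500, 1700]
abundance = [12.0, 9.5, 10.1, 4.2, 3.0, 1.1]

rho = spearman_rho(precipitation, abundance)
print(f"Spearman rho = {rho:.3f}")
```

A negative coefficient would be consistent with higher abundance at low-precipitation sites, although a significance test would still be needed before drawing conclusions.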

Discussion
Studies on the root microbial diversity of upland rice and its effects on the growth and drought tolerance of this crop have been previously reported (Pang et al. 2020a,b). Our research group has also previously explored the diversity and community structure of the endophytic bacteria found in upland rice seeds in different regions of China and preliminarily revealed the core microbiota of these endophytic bacteria (Wang et al. 2021b). However, until now there have been only limited reports on the endophytic bacteria in upland rice seeds and, in particular, on correlation analysis between these bacteria and environmental factors.
Overall, the main endophytic bacterial groups and community structures of the 12 upland rice seed samples from Yunnan Province were found to be similar. The primary shared genera were Pantoea, Pseudomonas, Curtobacterium, Microbacterium, Methylobacterium, Agrobacterium, Sphingomonas, Aurantimonas and Rhodococcus. Pantoea was the first dominant genus shared by all 12 seed samples. Although some species of Pantoea are well-known plant pathogens, in our previous studies on the endophytic bacteria of upland rice seeds and saline-alkali tolerant rice seeds, we also found Pantoea to be the first dominant genus in all samples (Wang et al. 2021a,b). As a dominant endophyte, Pantoea exists in healthy upland rice and other plant seeds and causes no disease symptoms, indicating that it should have other effects on plants. It has been shown that some species of Pantoea do play an important role in promoting plant growth and improving plant drought resistance (Zhang et al. 2018a; Cherif-Silini et al. 2019; Luziatelli et al. 2020). It has been variously reported that Pseudomonas, Microbacterium, Methylobacterium and Sphingomonas isolated from plants can play an important role in improving the drought tolerance and drought resistance of plants (Wang et al. 2014; Egamberdieva et al. 2015; García-Fontana et al. 2020; Luo et al. 2020; Zhang et al. 2020).
In this study, we also found that the abundance of these dominant endophytic bacteria in the upland rice seeds collected in areas with low precipitation was higher than in the upland rice varieties from areas with high precipitation. As dominant genera, these bacteria participate in the community structure of endophytic bacteria in upland rice seeds, a phenomenon which may be of great significance to the drought resistance of upland rice. It has been reported that microflora in plants can assist in the recruitment of other beneficial microorganisms under biotic or abiotic stresses to protect the plants from pathogens, promote growth and improve resistance to such stresses (Pang et al. 2020a,b; Santoyo 2021). Consequently, we speculate that some drought-resistant bacteria may be actively recruited from the environment by upland rice under drought stress; however, this hypothesis needs to be verified by further research.
In order to further explore the factors influencing the community structure and diversity of endophytic bacteria in upland rice seeds, we investigated and compared environmental factors, such as temperature, precipitation and altitude, in the origin areas in Yunnan Province. Not only were differences found among the environmental factors in different regions but, through RDA, it was also found that temperature, precipitation and altitude exert significant effects on endophytic bacteria. The altitude of the origin areas was found to have the greatest influence on the main dominant genus Pantoea in the upland rice seeds. These results indicate that the community structure and diversity of endophytic bacteria in upland rice seeds are affected not only by upland rice varieties and genotypes, but also by environmental factors such as temperature, precipitation and altitude. Furthermore, the community structure and composition of endophytic bacteria in upland rice seeds are shaped by the combination of upland rice varieties, genotypes and environment, rather than by a single factor. In nature, plants live together with their microorganisms. Plant breeding is, therefore, the cultivation of symbionts of plants and microorganisms (Wang et al. 2015). Since upland rice is drought resistant and drought tolerant, its symbiotic microorganisms should also have corresponding drought tolerance characteristics that enable them to adapt to the local environment. The use of high-throughput sequencing technology in this study to explore the community structure and diversity of endophytic bacteria in upland rice seeds from origin areas in Yunnan Province, China, provided significant findings for the subsequent development of drought-tolerant bacterial resources and the improvement of local upland rice yields.
At the same time, the mechanism of drought tolerance at the microbial level of upland rice can be investigated via comparative analysis of the microbial differences between upland rice and rice, for which this study has now laid a foundation.

Conclusion
Exploring the endophytic microbial community structure and diversity of upland rice seeds provides the basis for understanding the synergistic effect of endophytic bacteria in upland rice as well as the new functions and new substances produced by this synergy. Pantoea, Pseudomonas, Curtobacterium, Microbacterium, Methylobacterium, Agrobacterium, Sphingomonas, Aurantimonas and Rhodococcus were found to serve as major core endophytic bacteria in the 12 upland rice seed samples examined in this study. Overall, while some differences were observed in the endophytic bacterial community structure and diversity in the seed samples, these differences were not significant. Furthermore, it was found that the differences in endophytic bacterial community structure and diversity were not only related to the genotypes of the upland rice seed samples themselves, but were also affected by local temperature, precipitation, altitude and other environmental factors.