On Differential Metabolites in Distinction of Rice (Oryza sativa L.) Origin Based on GC-MS

An analytical method for the metabolomics of 60 rice seed samples from two main rice origins in Heilongjiang Province was developed based on gas chromatography coupled with mass spectrometry (GC-MS). The differential metabolites specific to each origin were identified, and the two origins were distinguished by processing the GC-MS data with the XCMS package on the R platform, combined with multivariate statistical analysis software. In total, 173 peaks were detected, 54 of which were structurally identified, covering amino acids, aliphatic acids, sugars, polyols, and other compounds. Comparison of the Wuchang and Jiansanjiang origins revealed 9 characteristic metabolites in the Wuchang origin and 8 in the Jiansanjiang origin. Ten differential metabolites with significant changes (P < 0.05, VIP ≥ 1) were screened out. These results indicate that the metabolites of rice carry information about its origin and that rice from the two main origins differs in its metabolites. The proposed method is expected to be useful for metabolomic research on rice.


Introduction
Metabolomics is the science that takes low-molecular-weight metabolites in biological samples (such as organic acids, fatty acids, amino acids, and sugars) as its research object and studies them through high-throughput detection, data processing, information integration, and biomarker identification [1]. Since the concept of metabolomics was put forward by Oliver [2] in 1997, it has been widely applied in various fields and has become a powerful means of exploring the inner mechanisms of matter [3]. With the growing emphasis on food safety, identifying or tracing the origin of agricultural products has become a research focus in recent years, and mineral element fingerprint analysis [4] and isotope fingerprint analysis [5] are commonly used for this purpose.
Metabolomics studies all endogenous metabolites in a plant qualitatively and quantitatively and explains metabolite changes from the perspective of systems biology and biological activity [6], so it is expected to become an effective analytical platform for the identification of agricultural products. Nicholson et al. studied the metabolomics of Arabidopsis thaliana from different origins and found that differences in the growth environment led to differences in its amino acids and sugars [7]. Giansante et al. [8] analyzed the composition and content of fatty acids in olive oil from four different origins in Italy by gas chromatography. The results showed that the contents of palmitic acid and linoleic acid in the olive oil differed significantly between origins. With its high resolution, sensitivity, and reproducibility, gas chromatography-mass spectrometry (GC-MS) has become one of the main analytical platforms in metabolomic research [9].
Rice (Oryza sativa L.) is an important food crop that is rich in nutrients and essential trace elements suited to human needs [10], and it has become a hot topic in plant metabolomics research in recent years [11][12][13][14][15][16]. The environment of origin has an important influence on the growth of rice, so the variety and content of the metabolites in rice may carry information about its origin, and the metabolites of rice from different origins may differ markedly.
In this work, rice seeds from two main rice origins in Heilongjiang Province, Wuchang farm (WC) and Jiansanjiang farm (JSJ), were comparatively analyzed by GC-MS-based metabolomics. The results provide a theoretical basis for identifying and distinguishing the origin of rice.

Plant Material.
Rice plants (Oryza sativa L.) were collected from two geographical indication rice reserves, located in Jiansanjiang farm (JSJ) and Wuchang (WC) in Heilongjiang Province. Within the protected area, the main varieties were collected at random by the checkerboard sampling method, following the representative sampling principle [17]. At each sampling point, rice panicles (Japonica rice, 1-2 kg) were collected from different directions. Thirty samples were selected from each origin.

Sample Preparation.
The sample processing and chromatographic methods follow the literature [18,19] with slight modifications.
Rice seeds were pulverized under liquid nitrogen to obtain powdered samples. A 50 mg portion of powdered sample, 800 µL of methanol, and 10 µL of internal standard (2-chlorophenylalanine) were mixed by vortexing for 30 s. The mixture was then centrifuged at 12,000 rpm for 15 min at 4°C. After centrifugation, 200 µL of supernatant was transferred to a GC vial (1.5 mL autosampler vial) and dried under a stream of nitrogen. The dried residue was dissolved in 30 µL of 20 mg·mL⁻¹ methoxyamine hydrochloride in pyridine and held for 90 min at 37°C, followed by the addition of 30 µL of BSTFA and derivatization for 60 min at 70°C. All samples were analyzed within 24 h of derivatization.
GC-MS Analysis.
A 1 µL sample volume was injected with an autosampler. Gas chromatography was performed on a 30 m HP-5ms column with 0.25 mm inner diameter and 0.25 µm film thickness (Agilent J&W Scientific). The injection temperature was 280°C, the interface was set to 250°C, the ion source to 230°C, and the quadrupole to 150°C. Helium (>99.999% purity) was used as the carrier gas at a constant flow rate of 2 mL·min⁻¹. The oven temperature program was 2 min isothermal at 80°C, followed by a 10°C·min⁻¹ ramp to 320°C and a final 6 min hold at 320°C.
The system was then equilibrated for 6 min at 80°C before injection of the next sample. Full scan mode was used, with a scanning range of m/z 50-550.
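As a quick check on the oven program described above, the run time per injection follows directly from the hold and ramp values given in the text. The short Python sketch below is illustrative only: the function and variable names are assumptions introduced here, and the oven cool-down time between runs is not stated in the text and is therefore not included.

```python
def oven_run_time(initial_hold_min, start_c, end_c, ramp_c_per_min, final_hold_min):
    """Return the total oven program time in minutes for a single linear ramp."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min

# Values from the text: 2 min at 80 C, 10 C/min ramp to 320 C, 6 min hold at 320 C
run_min = oven_run_time(2, 80, 320, 10, 6)

# Re-equilibration at 80 C before the next injection (cool-down time excluded)
equilibration_min = 6
cycle_min = run_min + equilibration_min

print(run_min, cycle_min)  # 32.0 38.0
```

Under these assumptions, each chromatographic run lasts 32 min, and at least 38 min elapse between consecutive injections.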

Data Analysis.
GC-MS data analysis was performed at Suzhou Bionovogene (Suzhou, China). The raw GC-MS data were preprocessed with the XCMS package on the R platform. The edited data matrix was then imported into SIMCA-P software (Umetrics AB, Umeå, Sweden) for principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA), and other multivariate statistical analyses. Differences in the metabolome between the groups were examined through the PCA and PLS-DA score plots, and differential metabolites were screened according to the variable importance in the projection (VIP) and the significance level (P < 0.05). Most metabolites were identified by comparison with the standard spectral library of the National Institute of Standards and Technology (NIST) and the Wiley Registry metabolomic database. The identifications were further checked by comparing alkane retention indices with those provided by the Golm Metabolome Database (GMD). In addition, most substances were confirmed with authentic standards.

GC-MS Total Ion Chromatogram.
In all samples, 173 peaks were detected, and 54 metabolites (Table 1), including 11 kinds of sugars, 9 kinds of fatty acids, 12 kinds of polyols, 14 kinds of other derivatives, 14 kinds of organic acids, 2 kinds of amino acids, 1 kind of phosphoric acid, and 1 kind of nucleotide, were identified from the GC-MS raw data. Comparison of the total ion chromatograms of the two groups of samples (Figure 1) showed that the chromatograms of rice from the two origins are similar but slightly different. As can be seen from Figure 1, the baseline is stable, indicating good instrument stability.

The Differential Metabolites of Rice in Two Origins.
The metabolites common to the 30 rice samples of each origin were found by statistical analysis. A total of 26 metabolites were found in the WC origin, and 25 were found in the JSJ origin. After removing the metabolites shared by both origins, nine characteristic metabolites were found in the WC origin: pentanoic acid, pyran glucose, stearic acid, eicosanoic acid, docosanoic acid, 1-monooctadecyl glycerol, trehalose, β-tocopherol, and β-sitosterol. Eight characteristic metabolites were found in the JSJ origin: benzoic acid, fumaric acid, xylose, xylitol, glucose, inositol, sorbitol, and raffinose (Table 1).
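The counts in this paragraph can be reproduced with simple set operations. The following Python sketch uses the characteristic metabolites named in the text; the shared metabolites are not listed there, so they are represented by hypothetical placeholder names. Their number (17) is not stated in the text but is implied by the reported counts (26 − 9 = 25 − 8 = 17).

```python
# Characteristic metabolites named in the text for each origin
wc_unique = {
    "pentanoic acid", "pyran glucose", "stearic acid", "eicosanoic acid",
    "docosanoic acid", "1-monooctadecyl glycerol", "trehalose",
    "β-tocopherol", "β-sitosterol",
}
jsj_unique = {
    "benzoic acid", "fumaric acid", "xylose", "xylitol", "glucose",
    "inositol", "sorbitol", "raffinose",
}

# Hypothetical placeholders for the 17 shared metabolites (names not given in the text)
shared = {f"shared_{i}" for i in range(1, 18)}

wc_all = wc_unique | shared    # metabolites common to all WC samples
jsj_all = jsj_unique | shared  # metabolites common to all JSJ samples

# Removing the metabolites present in both origins leaves the characteristic ones
wc_only = wc_all - jsj_all
jsj_only = jsj_all - wc_all

print(len(wc_all), len(jsj_all), len(wc_only), len(jsj_only))  # 26 25 9 8
```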

Differential Analysis of Rice Metabolomics between the Two Origins.
PCA Analysis.
PCA [20] reduces the dimensionality of the data, eliminates overlapping information, and explains most of the information with a small number of factors, so that similar variables can be distinguished and differences found. A PCA (SIMCA-P) was performed on the data of the two groups of rice samples to study the effect of different growth environments on rice metabolites. The model has three principal components, with cumulative R²X = 0.664 and Q² = 0.572. Both R²X and Q² are greater than 0.5, and the difference between them is less than 0.2. According to the criteria given in the literature [21], this indicates that the PCA model has good fit and predictability without overfitting.
As can be seen from Figure 2, except for 5 abnormal samples, the remaining 55 rice samples lie within the confidence interval. An abnormal sample is one whose data differ considerably from those of the other samples in the same group, so it appears far from them in the figure. There are two possible reasons for such anomalies: an error in the sample pretreatment process, or a large individual difference between the sample and the other samples. In the PCA score plot, the JSJ samples are distributed on the left side of the confidence interval, while the WC samples are mostly distributed on the right side, which indicates that the two origins differ. Within the JSJ group, some samples overlap, indicating high similarity, while others lie far apart, indicating larger individual differences. Differences therefore exist both between the two regions and between samples from the same place.
Although the two groups showed different distributions, the samples of the two groups still overlapped, so unsupervised principal component analysis alone could not fully distinguish the rice samples from the two origins.
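The acceptance rule applied to the PCA model above (both R²X and Q² greater than 0.5, and a gap of less than 0.2 between them) can be written as a small check. This Python sketch only encodes the rule of thumb as quoted in the text; the function name and default thresholds are assumptions for illustration, not part of SIMCA-P.

```python
def model_quality_ok(r2, q2, min_value=0.5, max_gap=0.2):
    """Rule of thumb from the text: both values > 0.5 and their gap < 0.2."""
    return r2 > min_value and q2 > min_value and abs(r2 - q2) < max_gap

# PCA model values reported in the text
pca_ok = model_quality_ok(0.664, 0.572)

print(pca_ok, round(0.664 - 0.572, 3))  # True 0.092
```

A model with a high R² but a much lower Q² (for example 0.9 and 0.6) would fail this check, which is the usual warning sign of overfitting.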

PLS-DA Analysis.
While reducing dimensionality, PLS-DA (Figure 3) incorporates a regression model and applies a discriminant threshold to the regression results, which helps reveal intergroup differences and differential compounds more efficiently. The model has two principal components, with R²X = 0.715, R²Y = 0.87, and Q² = 0.629, and the values of the two principal components are similar. When PLS-DA is used for correlation or discriminant modeling, a permutation test can be applied (Figure 4). In Figure 4, the abscissa is the permutation retention, that is, the proportion of the Y variable that keeps the order of the original model; the points at a retention of 1 are the R²Y and Q² values of the original model. The ordinate is the value of R²Y or Q². The green dots are the R²Y values obtained from the permutation test, and the blue squares are the Q² values. The two dashed lines are the regression lines of R²Y and Q², respectively. The R²Y of the original model is 0.87, which lies between 0.5 and 1 and is close to 1, indicating that the model fits the sample data well. The Q² of the original model is 0.629, which is greater than 0.5, indicating that new samples added to the model would be expected to follow a similar distribution. Overall, the original model explains the difference between the two groups of samples well. The Q² values of the permuted models are smaller than the Q² of the original model, and the intercept of the Q² regression line with the vertical axis is below zero. As the retention decreases, the proportion of permuted Y variables increases and the Q² of the random models gradually decreases. This shows that the original model is robust and is not overfitted.
In view of the above, the PLS-DA model has good predictability without overfitting. Compared with the PCA model (Figure 2), Q² increased, which indicates good reproducibility and high model accuracy. Figure 3 shows that the two groups of samples are completely separated, with no overlapping samples. The JSJ samples are distributed on the left side of the confidence interval, while the WC samples are mainly distributed in its upper part.
Two abnormal samples lie outside the confidence interval. These results indicate that the origin (growth environment) has a great influence on rice metabolites.
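The reading of the permutation plot described above can be sketched numerically: fit a straight line through the Q² values of the permuted models as a function of retention, then check that the intercept is below zero and that every permuted Q² is lower than the original. In the Python sketch below, the permuted Q² values are hypothetical and chosen only for illustration; only the original Q² of 0.629 comes from the text. Real permutation tests use many more permutations than shown here.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical permutation results: retention (x) and Q2 of each model (y).
# The last point (retention = 1) is the original model, Q2 = 0.629 from the text.
retention = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
q2_values = [-0.30, -0.12, 0.05, 0.26, 0.44, 0.629]

slope, intercept = linear_fit(retention, q2_values)

# Two signs of a model that is not overfitted, as described in the text
intercept_below_zero = intercept < 0
permuted_all_lower = all(q < q2_values[-1] for q in q2_values[:-1])

print(round(slope, 3), round(intercept, 3), intercept_below_zero, permuted_all_lower)
```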

Exploration and Identification of Differential Metabolites between the Two Groups.
In this experiment, the variable importance in the projection (VIP, threshold > 1) of the PLS-DA model, together with the P value of Student's t-test (threshold ≤ 0.05), was used to find differentially expressed metabolites. The rice samples of the two origins were analyzed, and ten differential metabolites with significant changes were screened out, covering 4 kinds of fatty acids, 3 kinds of other derivatives, 1 kind of polyol, 1 kind of sugar, and 1 kind of nucleotide. The differential metabolites include both primary and secondary metabolites (Table 2). Eight differential metabolites were higher in rice from the WC origin than from the JSJ origin, that is, upregulated, with fold changes of 1.51-4.24. The contents of glycerol and α-D-methylfructofuranoside were lower in rice from the WC origin than from the JSJ origin, that is, downregulated, with fold changes of 0.52 and 0.06, respectively.
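The screening rule described above (VIP above 1 combined with P ≤ 0.05, then a WC/JSJ fold change to decide the direction) can be sketched as a simple filter. In the Python example below, all metabolite names and numbers are hypothetical and are not taken from Table 2; the `Feature` class and `screen` function are names introduced here for illustration. The VIP and P values are assumed to have been computed beforehand by the PLS-DA model and the t-test.

```python
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    vip: float       # VIP from the PLS-DA model
    p_value: float   # P value from Student's t-test
    mean_wc: float   # mean relative abundance in WC samples
    mean_jsj: float  # mean relative abundance in JSJ samples

def screen(features, vip_min=1.0, p_max=0.05):
    """Keep features with VIP > vip_min and P <= p_max; report WC/JSJ fold change."""
    hits = []
    for f in features:
        if f.vip > vip_min and f.p_value <= p_max:
            fold = f.mean_wc / f.mean_jsj
            hits.append((f.name, round(fold, 2), "up" if fold > 1 else "down"))
    return hits

# Hypothetical values for illustration only (not taken from Table 2)
features = [
    Feature("metabolite_A", 1.8, 0.003, 3.02, 2.00),
    Feature("metabolite_B", 1.2, 0.040, 0.52, 1.00),
    Feature("metabolite_C", 0.7, 0.010, 5.00, 1.00),  # fails the VIP threshold
    Feature("metabolite_D", 1.5, 0.200, 2.00, 1.00),  # fails the P threshold
]

print(screen(features))
# [('metabolite_A', 1.51, 'up'), ('metabolite_B', 0.52, 'down')]
```

Requiring both criteria means a metabolite must contribute to the multivariate separation and also differ significantly in a univariate test before it is reported.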

Hierarchical Cluster Analysis.
The data set was scaled with the pheatmap package in R software (v3.3.2), and bidirectional clustering analysis of samples and metabolites was conducted. Figure 5 is the hierarchical clustering diagram of the relative quantitative values of the metabolites in this experiment. As can be seen from Figure 5, the heat map is divided into two parts, red and green, indicating that the contents of the metabolites differ markedly between the groups. The clustering of the samples from the two origins is shown at the top of the graph. The clustering effect is very good: the samples on the left are mainly from the JSJ origin, and the samples on the right are from the WC origin. Thus, the ten selected differential metabolites are sufficient to distinguish the samples of the two origins.
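The scaling step mentioned above can be illustrated as follows. The text does not say which scaling option was used, so this Python sketch assumes row-wise z-scoring (each metabolite centered and divided by its sample standard deviation across samples), which is one of the options pheatmap offers. The data are hypothetical, and the clustering itself is not reproduced here.

```python
from statistics import mean, stdev

def scale_rows(matrix):
    """Z-score each row (metabolite) across samples: (x - mean) / sample SD."""
    scaled = []
    for row in matrix:
        m, s = mean(row), stdev(row)
        scaled.append([(x - m) / s for x in row])
    return scaled

# Hypothetical relative abundances: rows = metabolites, columns = samples
data = [
    [1.0, 2.0, 3.0, 6.0],          # low-abundance metabolite
    [100.0, 200.0, 300.0, 600.0],  # high-abundance metabolite, same pattern
]
scaled = scale_rows(data)

# After scaling, both rows show the same profile regardless of abundance level
print([round(v, 3) for v in scaled[0]])
print([round(v, 3) for v in scaled[1]])
```

Scaling in this way keeps a few highly abundant metabolites from dominating the color range of the heat map and the distances used for clustering.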

Pathway Analysis of Differential Metabolites.
Metabolic pathway analysis was carried out on the differential metabolites of the two rice origins by using the enrichment analysis of MetaboAnalyst and the metabolic pathway retrieval of KEGG [22]. Five metabolic pathways were found, and the specific information is shown in Table 3. Of the 10 differential metabolites, five (glycerol, indole, thymidine, myristic acid, and 9-(Z)-octadecenoic acid) were found in the target metabolic pathways. The matched pathways were the biosynthesis of fatty acids, glycerol metabolism, the biosynthesis of phenylalanine, tyrosine, and tryptophan, galactose metabolism, and pyrimidine metabolism. This indicates that the difference in origin has an obvious effect on various terminal metabolic pathways of rice.

Conclusions
In this experiment, the metabolomics of rice seeds from two main rice origins (JSJ and WC) in Heilongjiang Province was studied by GC-MS; 173 peaks were detected, and 54 metabolites were identified. Comparison of the two origins revealed 9 unique metabolites in the WC origin and 8 unique metabolites in the JSJ origin, and 10 differential metabolites between the JSJ and WC origins were selected. The results indicate that the metabolites of rice from different origins carry information about their origin, and that differences in metabolites can be used to distinguish rice origins. The proposed method is feasible for the separation and identification of metabolites in rice seeds. As also reported in the literature [19,23], the origin affects rice metabolites, and the metabolites differ between origins. The analysis of metabolic pathways shows that the difference in origin has an obvious effect on various terminal metabolic pathways of rice. This research provides a reference for plant metabolomics. It therefore seems possible to extend this method to the separation and identification of metabolites in other similar samples by varying the experimental conditions.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Consent
Informed consent was obtained from all individual participants included in the study.