Characterization and Differentiation of Grain Proteomes from Wild-Type Puroindoline and Variants in Wheat

Premium wheat with high end-use quality is generally lacking in China, especially high-quality hard and soft wheat. Pina-D1 and Pinb-D1 (puroindoline genes) influence wheat grain hardness (i.e., an important wheat quality-related parameter) and are among the main targets in wheat breeding programs. However, the mechanism by which puroindoline genes control grain hardness remains unclear. In this study, three hard wheat puroindoline variants (MY26, GX3, and ZM1) were compared with a soft wheat variety (CM605) containing the wild-type puroindoline genotype. Specifically, proteomic methods were used to screen for differentially abundant proteins (DAPs). In total, 6253 proteins were identified and quantified via a high-throughput tandem mass tag quantitative proteomic analysis. Of the 208 DAPs, 115, 116, and 99 proteins were differentially expressed between MY26, GX3, and ZM1 (hard wheat varieties) and CM605, respectively. The cluster analysis of protein relative abundances divided the proteins into six clusters. Of these proteins, 67 and 41 proteins were, respectively, more and less abundant in CM605 than in MY26, GX3, and ZM1. Enrichment analyses detected six GO terms, five KEGG pathways, and five IPR terms that were shared by all three comparisons. Furthermore, 12 proteins associated with these terms or pathways were found to be differentially expressed in each comparison. These proteins, which included cysteine proteinase inhibitors, invertases, low-molecular-weight glutenin subunits, and alpha amylase inhibitors, may be involved in the regulation of grain hardness. The candidate genes identified in this study may be relevant for future analyses of the regulatory mechanism underlying grain hardness.


Introduction
Hexaploid bread wheat (Triticum aestivum L.) is a critical food crop worldwide. In China, which is the largest producer of bread wheat, the wheat-growing region comprises approximately 24 million hectares and includes basins, hills, plains, plateaus, and mountains. New high-quality and genetically diverse wheat varieties are urgently needed to meet the increasing demand for wheat due to societal and economic development and increases in living standards. Currently, wheat varieties in China mainly consist of general and mixed wheat, which are insufficient for satisfying the processing demands in the wheat industry. Specifically, there is a lack of premium quality wheat for special end-uses, especially hard and soft wheat types [1].
Grain hardness is an important wheat quality-related trait and one of the main phenotypic targets among wheat breeders [2]. Most wheat varieties are divided into two main classes on the basis of the kernel texture (i.e., hard and soft), with the remaining varieties classified as either medium-hard or medium-soft wheat. Hard and soft wheat varieties mill

Examination of the Puroindoline Genotypes and Grain Hardness Indices of the Experimental Materials
To analyze the DAPs in wheat grains with varying hardness indices, four varieties with diverse puroindoline genotypes (CM605, ZM1, MY26, and GX3) were selected for this study. The soft wheat variety CM605, which carries the wild-type puroindoline-encoding alleles (Pina-D1a/Pinb-D1a), had a grain hardness index of 33.1. In contrast, the hard wheat varieties MY26, GX3, and ZM1 had grain hardness indices of 65.8, 63.0, and 60.7 and Pina-D1a/Pinb-D1b, Pina-D1a/Pinb-D1c, and Pina-D1a/Pinb-D1p genotypes, respectively (Table 1). In each column of Table 1, different lowercase and uppercase letters indicate significant differences at the 0.05 and 0.01 levels, respectively.

Protein Identification and Quantification
A total of 66,369 matched spectra, 32,547 peptides, and 6253 proteins were identified by the tandem mass tag (TMT) analysis of the four wheat varieties. Relative quantitative data were obtained for the 6253 identified proteins. Additionally, 2601 of the detected proteins were annotated according to all four of the following databases: Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Groups (COG), and InterPro (IPR) (Figure 1). The following criteria were used to identify significant DAPs: fold-change >2 (increased abundance) or <0.5 (decreased abundance) and a false discovery rate <0.05. Of the 6253 identified proteins, 208 were identified as DAPs. Moreover, 115 proteins were differentially expressed between MY26 and CM605, of which 37 and 78 proteins were significantly more and less abundant, respectively, in MY26 than in CM605. Furthermore, 116 proteins were differentially expressed between GX3 and CM605, of which 43 and 73 proteins were significantly more and less abundant, respectively, in GX3 than in CM605. Among the 99 proteins that were differentially expressed between ZM1 and CM605, 42 and 57 were significantly more and less abundant, respectively, in ZM1 than in CM605.
The techniques used in this study enabled a high-throughput and high-resolution proteomic analysis. In earlier studies, 1211 quinoa proteins were identified using a label-free quantification method [17]; 6061 proteins in wheat grains were identified by a TMT analysis [18], and 6958 wheat proteins were identified on the basis of iTRAQ data [19]. More specifically, TMT analyses are performed using a multiplexed protein identification and quantitation strategy involving isotope-labeling techniques that provide relative and absolute protein quantities in complex mixtures [20]. In the current study, 6253 proteins were identified and quantified in the grains of four wheat varieties.
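The DAP criteria stated above (fold-change >2 or <0.5 together with a false discovery rate <0.05) can be illustrated with a minimal Python sketch. The function name `classify_dap`, the placeholder protein names, and the example fold-change/FDR values are hypothetical and are not taken from the dataset; only the thresholds and the reported up/down counts come from the text.

```python
from typing import Optional


def classify_dap(fold_change: float, fdr: float,
                 fc_up: float = 2.0, fc_down: float = 0.5,
                 fdr_cutoff: float = 0.05) -> Optional[str]:
    """Label one protein in a variant-vs-CM605 comparison.

    Returns "up" or "down" when both the fold-change and the FDR
    criteria are met, and None otherwise.
    """
    if fdr >= fdr_cutoff:
        return None
    if fold_change > fc_up:
        return "up"
    if fold_change < fc_down:
        return "down"
    return None


# Hypothetical (fold-change, FDR) pairs for placeholder proteins.
example = {
    "protein_A": (2.6, 0.01),   # passes both criteria, more abundant
    "protein_B": (0.4, 0.03),   # passes both criteria, less abundant
    "protein_C": (1.3, 0.001),  # significant but fold-change too small
    "protein_D": (3.0, 0.20),   # large fold-change but FDR too high
}
calls = {name: classify_dap(fc, q) for name, (fc, q) in example.items()}

# Up/down counts reported in the text for each comparison with CM605.
reported = {"MY26": (37, 78), "GX3": (43, 73), "ZM1": (42, 57)}
totals = {variety: up + down for variety, (up, down) in reported.items()}
```

The `totals` dictionary simply re-adds the reported counts, which agree with the 115, 116, and 99 DAPs given for the three comparisons.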

Cluster Analysis of Protein Relative Abundances
A cluster analysis of protein relative abundances was completed to determine the correlation between protein relative abundances and puroindoline genotypes. The protein relative abundance for each sample was obtained. The expression data for all samples were combined for the C-means cluster analysis. The results of the cluster analysis are presented in Figure 2. Proteins were classified into six clusters according to their expression levels. Sixty-seven proteins were significantly more abundant in CM605 than in MY26, GX3, and ZM1 and were classified in Cluster 4. In contrast, 41 proteins were significantly less abundant in CM605 than in MY26, GX3, and ZM1 and were classified in Cluster 5 (Table 2).
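The text does not specify which C-means implementation or settings were used. As an illustration only, the following pure-Python sketch assumes a standard fuzzy c-means procedure with a fuzzifier of m = 2, applied to hypothetical abundance profiles ordered as CM605, MY26, GX3, and ZM1. For brevity it uses two clusters rather than the six reported here; all profile values are invented.

```python
import random


def fuzzy_c_means(profiles, n_clusters, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(profiles), len(profiles[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centers are membership-weighted means of the profiles.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            wsum = sum(w)
            centers.append([sum(w[i] * profiles[i][d] for i in range(n)) / wsum
                            for d in range(dim)])
        # Memberships are updated from Euclidean distances to the centers.
        for i in range(n):
            dist = [max(sum((profiles[i][d] - c[d]) ** 2
                            for d in range(dim)) ** 0.5, 1e-12)
                    for c in centers]
            for k in range(n_clusters):
                u[i][k] = 1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                                    for j in range(n_clusters))
    return centers, u


# Hypothetical relative abundances (CM605, MY26, GX3, ZM1).
profiles = [
    [2.0, 0.9, 1.0, 0.8],  # higher in CM605
    [2.2, 1.0, 0.9, 1.0],
    [1.9, 0.8, 0.9, 0.9],
    [0.5, 1.6, 1.5, 1.7],  # lower in CM605
    [0.4, 1.5, 1.7, 1.6],
    [0.6, 1.7, 1.6, 1.5],
]
centers, memberships = fuzzy_c_means(profiles, n_clusters=2)
# Hard label = cluster with the highest membership for each protein.
labels = [row.index(max(row)) for row in memberships]
```

In this toy example, proteins that are more abundant in CM605 and proteins that are less abundant in CM605 end up with different hard labels, which mirrors the contrast between Cluster 4 and Cluster 5 described above.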

GO Enrichment Analysis of DAPs
Differentially abundant proteins detected by the comparisons between the wheat varieties with differing puroindoline genotypes and the wheat variety with the wild-type puroindoline genotype were included in the GO enrichment analysis. For the MY26 vs. CM605 comparison, the GO enrichment analysis assigned 69 GO terms to 115 DAPs. Among these GO terms, 21 were significantly enriched (p < 0.05). Notably, some proteins were annotated with multiple GO terms. For the GX3 vs. CM605 comparison, 116 DAPs were annotated with 67 GO terms, of which 17 were significantly enriched. For the ZM1 vs. CM605 comparison, 99 DAPs were annotated with 45 GO terms, among which 16 were significantly enriched. Of the enriched GO terms assigned to the DAPs, the following six were shared by all three comparisons: chitin catabolic process, cell wall macromolecule catabolic process, and response to stress (biological process terms) and enzyme inhibitor activity, chitin binding, and chitinase activity (molecular function terms) (Figure 3).
The comparisons of the wheat varieties detected six proteins that were annotated with the above-mentioned six common GO terms (Table 3). Four chitinase-related GO terms (chitin catabolic process, cell wall macromolecule catabolic process, chitin binding, and chitinase activity) were assigned to TraesCS1D01G249600.1. One peroxidase-related term (response to stress) was assigned to two DAPs (TraesCS3B01G577900.1 and TraesCS3B01G578000.1). Another enriched GO term in all three comparisons (enzyme inhibitor activity, which is associated with cysteine proteinase inhibitors and invertase inhibitors) was assigned to a proteinase inhibitor protein (TraesCS4A01G052100.1) and two invertase inhibitor proteins (TraesCS4A01G459900.1 and TraesCS4A01G460900.1).
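The statistical test behind the GO enrichment p-values is not stated in the text. A common choice for this type of analysis is a one-sided hypergeometric test, and the sketch below assumes that approach. The background size (6253 quantified proteins) and the DAP count (115 for MY26 vs. CM605) come from the text, whereas the per-term counts (40 annotated proteins in the background, 5 among the DAPs) are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background: int, n_term_background: int,
                           n_daps: int, n_term_daps: int) -> float:
    """One-sided hypergeometric p-value, P(X >= n_term_daps).

    X is the number of term-annotated proteins among n_daps proteins
    drawn without replacement from n_background proteins, of which
    n_term_background carry the term.
    """
    upper = min(n_term_background, n_daps)
    total = comb(n_background, n_daps)
    tail = sum(
        comb(n_term_background, k)
        * comb(n_background - n_term_background, n_daps - k)
        for k in range(n_term_daps, upper + 1)
    )
    return tail / total


# 6253 and 115 are from the text; 40 and 5 are hypothetical term counts.
p_value = hypergeom_enrichment_p(6253, 40, 115, 5)
is_enriched = p_value < 0.05
```

With these hypothetical counts, fewer than one annotated protein would be expected among the DAPs by chance, so observing five yields a small p-value. In practice, a multiple-testing correction is usually applied across all tested terms.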
Cysteine proteinases exist in a wide variety of plants and are involved in several physiological processes. Most phytocystatins are inhibitors of cysteine proteases and have multiple important functions in plants. For example, they control various physiological and cellular processes in plants, while also inhibiting the activities of exogenous cysteine proteases that are secreted by herbivorous arthropods and pathogens to digest or colonize plant tissues [21,22]. Earlier research established clear correlations among storage protein deposition, cystatin biosynthesis, and decreased cysteine protease activities in storage organs. The functional relationship between cystatins and cathepsin L-like proteases was previously inferred on the basis of their involvement in the mobilization of storage proteins during the germination of barley seeds [23]. A cysteine proteinase (gliadain) that is secreted into the endosperm to digest storage proteins is reportedly regulated by intrinsic cystatins in wheat [24]. Another study identified two wheat cystatins (WC1 and WC4) with inhibitory effects on hydrolysis [25]. In barley, the downregulated production of a cystatin (HvIcy-2), which is one of the proteinaceous inhibitors of the cathepsin F-like protease, influences the grain-filling process [26]. Accordingly, cysteine proteinase inhibitors might contribute to the regulation of grain hardness by affecting the synthesis or hydrolysis of grain storage proteins.
Invertases are hydrolases that catalyze the conversion of sucrose to glucose and fructose. These enzymes are widely found in plants, animals, and microorganisms. On the basis of their solubility, localization, and pH optima, the invertases in higher plants can be divided into the following three groups: cytoplasmic, vacuolar, and cell wall invertases [27]. The unique expression pattern of the rice GIF1 gene, which encodes a cell wall invertase, reflects the close relationship between cell wall invertases and the kernel weight [28]. A previous study on maize showed that the constitutive expression of a cell wall invertase-encoding gene increases the total starch content by up to 20% in transgenic plants (relative to the corresponding content in wild-type control plants) [29]. Plastidic invertases, which are responsible for all of the invertase activities in the chloroplasts of Arabidopsis thaliana leaves, are required for starch accumulation [30]. Some invertases can modulate the starch content in plants, thereby indirectly affecting grain hardness. However, the specific relationship between invertase functions and grain hardness remains undetermined.

KEGG Pathway Enrichment Analysis of DAPs
The DAPs detected by the three comparisons also underwent a KEGG pathway enrichment analysis. The 20 most enriched KEGG pathways among the DAPs revealed by the three comparisons are presented in Figure 4. Of these enriched KEGG pathways, the following five were common to the three comparisons: 'glycosphingolipid biosynthesis-globo and isoglobo series', 'sphingolipid metabolism', 'fluid shear stress and atherosclerosis', 'MAPK signaling pathway-plant', and 'amino sugar and nucleotide sugar metabolism'. The following two DAPs were associated with four enriched KEGG pathways: TraesCS1D01G249600.1 (chitinase) and TraesCS5B01G011700.1 (alpha-galactosidase; α-Gal) (Table 3).
Alpha-galactosidase (EC 3.2.1.22) is a type of exoglycosidase that can specifically catalyze the hydrolysis of α-galactosidic bonds. It has been detected in animals, plants, and microorganisms (archaea, bacteria, and fungi). However, compared with the research on α-Gal in microorganisms, there have been relatively few investigations on α-Gal in plants. Nevertheless, previous research demonstrated that α-Gal in plants is often involved in important physiological processes, including leaf development and senescence [31], seed development and germination [32], fruit softening and ripening [33], and stress responses [34]. Unfortunately, the effects of α-Gal on grain hardness are unknown.

IPR Enrichment Analysis of DAPs
The enriched IPR terms among the DAPs detected by the three comparisons were also determined. The 10 most enriched IPR terms among the DAPs are provided in Figure 5. The following five IPR terms were common to the three comparisons: 'protein of unknown function DUF538', 'chitin-binding, type 1', 'glycoside hydrolase, family 19, catalytic', 'pectinesterase inhibitor', and 'bifunctional inhibitor/plant lipid transfer protein/seed storage helical domain'. The 'protein of unknown function DUF538' IPR term was assigned to two proteins (TraesCS5B01G267400.1 and TraesCSU01G074400.1), both of which were annotated as a plant/protein (protein of unknown function). Both 'chitin-binding, type 1' and 'glycoside hydrolase, family 19, catalytic' were assigned to TraesCS1D01G249600.1, which was annotated as a chitinase. The 'pectinesterase inhibitor' term was assigned to two proteins (TraesCS4A01G459900.1 and TraesCS4A01G460900.1), which were annotated as invertase inhibitors. The 'bifunctional inhibitor/plant lipid transfer protein/seed storage helical domain' term was assigned to three proteins, of which two (TraesCS1B01G011600.1 and TraesCS1B01G011700.1) were annotated as low-molecular-weight glutenin subunits (LMW-GSs) and one (TraesCS2B01G004800.1) was annotated as an alpha amylase inhibitor.
Low-molecular-weight glutenin subunits are polymeric protein components in the wheat endosperm. Their ability to form inter-molecular disulfide bonds with each other and/or with high-molecular-weight glutenin subunits is important for the formation of glutenin polymers and determines the processing properties of wheat dough [35]. A single wheat variety may contain 7-16 different LMW-GSs [36]. Moreover, each LMW-GS differentially influences the processing quality of flour [37]. Generally, most subunits (e.g., Glu-A3d, Glu-B3d, and Glu-D3d) positively affect dough strength. However, other subunits (e.g., Glu-B3j) are negatively correlated with the rheological properties of dough [38]. In the present study, the relative expression levels of two LMW-GSs (TraesCS1B01G011600.1 and TraesCS1B01G011700.1) were negatively correlated with the wheat grain hardness index, suggesting they may have important functions affecting wheat grain hardness.
The grain starch content is reportedly negatively correlated with grain hardness, with increases in the starch content potentially resulting in the production of grains with a relatively soft endosperm texture [39]. In the current study, TraesCS2B01G004800.1 was annotated as an alpha amylase inhibitor that might restrict the hydrolysis of starch, ultimately leading to an increase in the total starch content of wheat grains. The Pfam database contains a large collection of multiple sequence alignments and hidden Markov models for many common protein families [40]. We determined that the Pfam ID (PF00234: protease inhibitor/seed storage/LTP family) of TraesCS2B01G004800.1 is the same as that of puroindoline a (TraesCS5D01G004100.1) and puroindoline b (TraesCS5D01G004300.1), with the latter protein identified as the main determinant of wheat grain hardness [7]. Accordingly, our findings are suggestive of a potentially critical relationship between TraesCS2B01G004800.1 and grain hardness.

Terms/Pathways/Proteins Common to All Three Comparisons
By analyzing the proteins annotated with the six GO terms and five IPR terms or assigned to the five KEGG pathways that were enriched in all three comparisons, 12 proteins annotated with these terms or assigned to these pathways were revealed to be differentially expressed in each comparison (Table 3). Of these 12 proteins, only two were upregulated in the wheat varieties with variant puroindoline genotypes (compared with the wheat variety with the wild-type puroindoline genotype); both proteins belonged to Cluster 5. The remaining 10 DAPs were downregulated in the wheat varieties with variant puroindoline genotypes and belonged to Cluster 4 (Figure 2). The functions of these proteins and their effects on wheat grain hardness are described above. According to our results, several DAPs identified as cysteine proteinase inhibitors, invertases, LMW-GSs, and alpha amylase inhibitors may have regulatory effects on wheat grain hardness. However, the potential relationships between these proteins and grain hardness will need to be experimentally verified.
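The count of 12 proteins can be reproduced from the protein identifiers listed in the preceding subsections, because some proteins are annotated in more than one category. The short Python sketch below groups the identifiers by category (the variable names are illustrative) and takes the union and intersection of the three sets.

```python
# Protein IDs as listed in the text for each category of shared terms/pathways.
go_proteins = {
    "TraesCS1D01G249600.1",  # chitinase
    "TraesCS3B01G577900.1",  # response to stress
    "TraesCS3B01G578000.1",  # response to stress
    "TraesCS4A01G052100.1",  # proteinase inhibitor
    "TraesCS4A01G459900.1",  # invertase inhibitor
    "TraesCS4A01G460900.1",  # invertase inhibitor
}
kegg_proteins = {
    "TraesCS1D01G249600.1",  # chitinase
    "TraesCS5B01G011700.1",  # alpha-galactosidase
}
ipr_proteins = {
    "TraesCS5B01G267400.1",  # DUF538
    "TraesCSU01G074400.1",   # DUF538
    "TraesCS1D01G249600.1",  # chitinase
    "TraesCS4A01G459900.1",  # invertase inhibitor
    "TraesCS4A01G460900.1",  # invertase inhibitor
    "TraesCS1B01G011600.1",  # LMW-GS
    "TraesCS1B01G011700.1",  # LMW-GS
    "TraesCS2B01G004800.1",  # alpha amylase inhibitor
}

# Unique proteins across the three categories.
all_proteins = go_proteins | kegg_proteins | ipr_proteins
# Proteins annotated in every category.
in_every_category = go_proteins & kegg_proteins & ipr_proteins
```

The union contains 12 unique identifiers, and the chitinase TraesCS1D01G249600.1 is the only protein present in the GO, KEGG, and IPR sets.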

Plant Materials
In a previous study, more than 100 wheat varieties were collected from each wheat ecological region in China, after which the puroindoline gene-encoding locus was genotyped and the grain hardness index was calculated. Four wheat varieties that differed in terms of their puroindoline genotypes and grain hardness indices were selected for this study. More specifically, CM605, MY26, GX3, and ZM1 had puroindoline genotypes of Pina-D1a/Pinb-D1a, Pina-D1a/Pinb-D1b, Pina-D1a/Pinb-D1c, and Pina-D1a/Pinb-D1p, respectively. Thus, these four varieties were classified into the following two categories: wild-type puroindoline genotype (Pina-D1a/Pinb-D1a) with a soft grain texture and variants (Pina-D1a/Pinb-D1b, Pina-D1a/Pinb-D1c, and Pina-D1a/Pinb-D1p) with a hard grain texture (Table 1).
The wheat cultivars were grown in an experimental field (36°14′ N, 111°58′ E) at The Wheat Research Institute, Shanxi Agricultural University (Linfen, Shanxi Province, China) from October 2019 to May 2020. For each wheat genotype, individual pods were considered as a biological replicate. Seeds were harvested from the naturally matured spikes, and the moisture content of grain was less than 12%. Three samples were collected per plot, and then each sample was examined three times. The grain hardness index was determined using approximately 100 g seeds per sample and the Single Kernel Characterization System (Model 4100; Perten Instruments, PerkinElmer, Waltham, MA, USA).

Protein Extraction
Individual samples were ground in liquid nitrogen. The ground material was resuspended in SDT lysis buffer (4% SDS, 100 mM DTT, and 10 mM TEAB) prior to a 5 min ultrasonication on ice. The lysate was incubated at 95 °C for 8 min and then centrifuged at 12,000× g for 15 min at 4 °C. The proteins in the supernatant were reduced with 10 mM DTT for 1 h at 56 °C and then alkylated with sufficient iodoacetamide for 1 h at room temperature in darkness. Precooled acetone (4-times volume) was added to the samples, which were then vortexed and incubated at −20 °C for at least 2 h. Samples were centrifuged at 12,000× g for 15 min at 4 °C, and the precipitate was collected. After washing with 1 mL cold acetone, the pellet was dissolved in dissolution buffer (8 M urea and 100 mM TEAB, pH 8.5). The protein concentration was determined on the basis of a Bradford protein assay. Next, 20 µg protein samples were analyzed by 12% SDS-PAGE initially at 80 V for 20 min and then at 120 V for 90 min. The gel was stained using Coomassie brilliant blue R-250 and destained until the bands were clear.

TMT Labeling of Peptides
Each protein sample was mixed with DB dissolution buffer (8 M urea and 100 mM TEAB, pH 8.5) for a total volume of 100 µL. Next, 1.5 µL trypsin and 100 mM TEAB buffer were added, and the samples were mixed and digested at 37 °C for 4 h, after which 1.5 µL trypsin and 2 µL CaCl2 (1 mol/L) were added to each sample before an overnight digestion. Formic acid was added to the digested sample, and the pH was adjusted (<3). The mixture was centrifuged at 12,000× g for 5 min at room temperature. The supernatant was slowly loaded onto a C18 desalting column, which was washed three times with washing buffer (0.1% formic acid and 3% acetonitrile) before samples were eluted using elution buffer (0.1% formic acid and 70% acetonitrile). The eluates were collected and lyophilized. Next, 100 µL 0.1 M TEAB buffer was added to reconstitute the samples, which were then mixed with 41 µL acetonitrile-dissolved TMT labeling reagent. The samples were shaken for 2 h at room temperature. The reaction was terminated by adding 8% ammonia. All labeled samples were mixed (equal volume), desalted, and lyophilized.

Separation of Fractions
Mobile phases A (2% acetonitrile; pH adjusted to 10.0 using ammonium hydroxide) and B (98% acetonitrile) were used for the gradient elution. The lyophilized powder was dissolved in solution A and centrifuged at 12,000× g for 10 min at room temperature. The sample was fractionated using a C18 column (Waters BEH C18, 4.6 × 250 mm, 5 µm) and a Rigol L3000 HPLC system. The column oven was set at 45 °C. The eluates were monitored at a UV wavelength of 214 nm. Fractions were collected at a rate of one tube per minute for a total of 10 fractions. All fractions were dried under vacuum conditions and reconstituted in 0.1% (v/v) formic acid in water.

LC-MS/MS Analysis
Shotgun proteomic analyses were performed using an EASY-nLC™ 1200 UHPLC system (Thermo Fisher, Waltham, MA, USA) coupled with a Q Exactive™ HF-X mass spectrometer (Thermo Fisher, Waltham, MA, USA) at Novogene Genetics, Beijing, China. Specifically, 1 µg sample was injected into a C18 Nano-Trap column (4.5 cm × 75 µm, 3 µm). Peptides were separated in an analytical column (15 cm × 150 µm, 1.9 µm) using a linear gradient elution. The separated peptides were analyzed using the Q Exactive™ HF-X mass spectrometer (Thermo Fisher, Waltham, MA, USA) combined with Nanospray Flex™ (electrospray ion source) (Thermo Fisher, Waltham, MA, USA), with a spray voltage of 2.3 kV and an ion transport capillary temperature of 320 °C. The full scan range was 350 to 1500 (m/z) with a resolution of 60,000 (at m/z 200). The automatic gain control target value was 3 × 10^6, and the maximum ion injection time was 20 ms. The 40 most abundant precursors in the full scan were selected and fragmented by higher energy collisional dissociation for the MS/MS analysis with a 10-plex resolution of 45,000 (at m/z 200). The automatic gain control target value was 5 × 10^4, and the maximum ion injection time was 86 ms. The normalized collision energy was set at 32%; the intensity threshold was 1.2 × 10^5, and the dynamic exclusion parameter was 20 s.
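The precursor selection settings described above (the 40 most abundant precursors, an intensity threshold of 1.2 × 10^5, and a dynamic exclusion of 20 s) can be pictured with a simplified Python sketch. This is not the instrument's acquisition logic: the function `select_precursors`, the exact-m/z exclusion lookup, and the peak values are all hypothetical simplifications, since real acquisition software matches precursors within a mass tolerance and considers charge states.

```python
def select_precursors(peaks, scan_time_s, excluded_until, top_n=40,
                      intensity_threshold=1.2e5, exclusion_s=20.0):
    """Pick up to top_n precursors from one full scan.

    peaks: list of (mz, intensity) tuples.
    excluded_until: dict mapping m/z to the time (s) until which that
    precursor is skipped; it is updated in place for the chosen precursors.
    """
    eligible = [
        (mz, intensity) for mz, intensity in peaks
        if intensity >= intensity_threshold
        and excluded_until.get(mz, float("-inf")) <= scan_time_s
    ]
    # Most abundant precursors first.
    eligible.sort(key=lambda peak: peak[1], reverse=True)
    chosen = [mz for mz, _ in eligible[:top_n]]
    for mz in chosen:
        excluded_until[mz] = scan_time_s + exclusion_s
    return chosen


# Hypothetical full-scan peaks: (m/z, intensity).
peaks = [(500.25, 3.0e6), (620.31, 9.0e4), (710.40, 5.0e5)]
exclusion = {}
first = select_precursors(peaks, 0.0, exclusion)    # both intense peaks chosen
second = select_precursors(peaks, 10.0, exclusion)  # still within 20 s exclusion
third = select_precursors(peaks, 25.0, exclusion)   # exclusion has expired
```

The peak at m/z 620.31 is never selected because its intensity is below the threshold, and the two intense peaks are skipped in the scan at 10 s because they were fragmented less than 20 s earlier.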

Identification and Quantification of Proteins
The proteins corresponding to the spectra from each run were identified by screening the IWGSC RefSeq v1.0 annotated wheat genome database (https://wheat-urgi.versailles.inra.fr/ (accessed on 24 March 2017)) using the search engine Proteome Discoverer 2.4 (PD 2.4; Thermo). The search parameters were as follows: mass tolerance for the precursor ion, 10 ppm; mass tolerance for the product ion, 0.02 Da; fixed modification, carbamidomethylation; dynamic modifications, oxidation of methionine and TMT plex; and N-terminal modifications, acetylation, TMT plex, methionine loss, and methionine loss + acetylation. A maximum of two missed cleavage sites were allowed. To improve the quality of the analysis, PD 2.4 filtered the search results. Specifically, the peptide spectrum matches (PSMs) with a credibility score exceeding 99% were designated as credible PSMs. The identified proteins contained at least one unique peptide. The identified PSMs and proteins with a false discovery rate of no more than 1.0% were retained for further analyses. The protein relative abundances were analyzed by performing a t-test. Proteins with a relative expression level that differed significantly between the experimental and control groups (p < 0.05 and fold-change >2.00 or <0.50) were defined as DAPs.
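To make the 10 ppm precursor mass tolerance concrete, the following Python sketch computes a mass error in parts per million and the corresponding acceptance window. It illustrates the arithmetic only and does not reproduce how Proteome Discoverer applies the tolerance internally; the function names and the m/z values are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_precursor_tolerance(observed_mz: float, theoretical_mz: float,
                               tol_ppm: float = 10.0) -> bool:
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def tolerance_window(theoretical_mz: float, tol_ppm: float = 10.0):
    """Lower and upper m/z bounds implied by a ppm tolerance."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


# Hypothetical precursor at m/z 800.0000.
low, high = tolerance_window(800.0)
accepted = within_precursor_tolerance(800.006, 800.0)  # 7.5 ppm error
rejected = within_precursor_tolerance(800.010, 800.0)  # 12.5 ppm error
```

At m/z 800, a 10 ppm tolerance corresponds to a window of about ±0.008, which is narrower than the fixed 0.02 Da tolerance used for product ions.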

Functional Characterization of DAPs
The GO and IPR functional analyses were conducted using the InterProScan program, and the results were compared with the information in non-redundant protein databases (Pfam, PRINTS, ProDom, SMART, ProSite, and PANTHER). The COG and KEGG databases were used to analyze the protein families and pathways. The DAPs were included in the volcano map analysis, cluster heat map analysis, and GO, IPR, and KEGG enrichment analyses.

Conclusions
To reveal the differences between soft wheat and hard wheat proteomes, three hard wheat varieties (MY26, GX3, and ZM1) with different puroindoline-encoding genes were compared with a soft wheat variety (CM605) with the wild-type puroindoline genotype. A total of 6253 proteins were identified and quantified. Furthermore, a cluster analysis of protein relative abundances detected 208 DAPs that were classified into six clusters. Among these DAPs, 67 and 41 proteins were significantly more and less abundant, respectively, in CM605 than in MY26, GX3, and ZM1. Moreover, six GO terms, five KEGG pathways, and five IPR terms were common among the three comparisons according to the enrichment analysis. Twelve proteins annotated with these terms or assigned to these pathways were differentially expressed in each group. Several proteins had been previously identified (e.g., cysteine proteinase inhibitor, invertase, LMW-GS, and alpha amylase inhibitor) and may be involved in the regulation of grain hardness. To the best of our knowledge, this is the first comparative proteomic analysis of hard and soft wheat varieties with differing puroindoline genotypes. The findings of this study lay the foundation for future investigations on the regulatory mechanism associated with puroindoline-encoding genes, while also providing researchers with potential candidate genes for future studies on wheat grain hardness.
Funding: This study was supported by the Sichuan Science and Technology Program (2022NSFSC1700; 2023NSFSC1925) and the Scientific and Technological Project of Sichuan Academy of Agricultural Sciences (1 + 9KJGG002).

Data Availability Statement:
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD041989.

Conflicts of Interest:
The authors declare no conflict of interest.