Identification of the conserved domains of ADP-Glucose Pyrophosphorylase (AGPase) protein in sweetpotato (Ipomoea batatas (L.) Lam.) and its two wild relatives

Conserved domains are defined as recurring units in molecular evolution and are commonly used to interpret the molecular function and biochemical structure of proteins. The AGPase amino acid sequences of three species from the genus Ipomoea were identified to investigate their physicochemical and biochemical characteristics. The molecular weight (MW), isoelectric point (pI), instability index (II), and grand average of hydropathy (GRAVY) differed considerably among the three species. The aliphatic index (AI) values of sweetpotato AGPase proteins were higher in the small subunit than in the large subunit. The AGPase proteins from sweetpotato contain an LbH_G1P_AT_C domain in the C-terminal region and various domains (NTP_transferase, ADP_Glucose_PP, or Glyco_tranf_GTA) in the N-terminal region. In contrast, most AGPase proteins of its two relatives (I. trifida and I. triloba) contain only the NTP_transferase domain in the N-terminal region. These findings suggest that the conserved domains are species-specific and related to the subunit types of AGPase proteins. The study may enable research on AGPase-related characteristics specific to sweetpotato that do not exist in the other two species, such as starch metabolism and the tuberization mechanism.


Introduction
ADP-glucose pyrophosphorylase (AGPase; EC: 2.7.7.27) is a regulatory enzyme that catalyzes a key step in the biosynthesis of alpha-1,4-glucans (glycogen or starch) in photosynthetic bacteria and plants (Smith-White and Preiss 1992). In higher plants, it is a heterotetramer composed of two different but closely related subunits (α2β2): the "small" subunit (α subunit, 50-54 kDa) and the "large" subunit (β subunit, 51-60 kDa), named for their size difference (Smith-White and Preiss 1992; Ballicora et al. 2004). The small subunit is responsible for the catalytic activity, whereas the large subunit plays regulatory roles (Crevillén et al. 2003; Ballicora et al. 2004). Both subunits are necessary for the optimal activity of the native enzyme in plants; the lack of one subunit reduces AGPase activity and influences starch synthesis (Li and Preiss 1992). In sweetpotato, AGPase is a key enzyme controlling starch synthesis and is considered an important determinant of the sink activity of the roots (Yatomi et al. 1996; Tsubone et al. 2000). Many AGPase genes have been cloned and studied in sweetpotatoes.
Sweetpotato (Ipomoea batatas (L.) Lam.) is a hexaploid (2n = 6x = 90) perennial tuber-forming crop belonging to the family Convolvulaceae (Welbaum 2015). Two non-tuberizing diploid Ipomoea species, I. trifida (H.B.K.) G. Don (2n = 2x = 30) and I. triloba L. (2n = 2x = 30), have been reported to be putative progenitors of sweetpotato and are commonly considered model species for sweetpotato research (Roullier et al. 2013; Wu et al. 2018). In this study, AGPase genes were screened from sweetpotato and its two related species to investigate the conserved domains of the encoded proteins. Differences in these domains can help confirm functional differences of the AGPase protein between sweetpotato and its two relatives.

Methods
Identification of AGPase amino acid sequences
The Sweetpotato Genomics Resource (http://sweetpotato.plantbiology.msu.edu/index.shtml) and NCBI databases (https://www.ncbi.nlm.nih.gov/) were used to identify the AGPase domain-containing proteins in the three species. The amino acid sequence of the AGPase protein IbAGPa1 (BAF47744.2) was used as the query sequence for the BLAST search.
The ProtParam tool (http://www.expasy.org/tools/protparam.html) of ExPASy (Expert protein analysis system, https://www.expasy.org/) was used to compute the physicochemical characteristics of the AGPase proteins in the three species, including the number of amino acids, molecular weight (MW), theoretical isoelectric point (pI), instability index (II), aliphatic index (AI), and grand average of hydropathy (GRAVY) (Gasteiger et al. 2005).
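Two of these indices can be reproduced directly from a sequence. The following is a minimal sketch in Python, not the ProtParam implementation itself. It assumes the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index, AI = X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), where X is the mole percent of each residue. The function names are illustrative and do not come from the study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _clean(sequence: str) -> str:
    """Remove whitespace, upper-case, and reject non-standard residues."""
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return seq


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = _clean(sequence)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    seq = _clean(sequence)

    def mole_percent(residue: str) -> float:
        return 100.0 * seq.count(residue) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )
```

Both functions take a plain one-letter amino acid string, such as the sequence lines of a FASTA record with the header removed.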
Multiple-sequence alignment and phylogenetic tree structure
The amino acid sequences of the AGPase proteins in FASTA format were aligned using the CLC Sequence Viewer 7.6 software (CLC bio, Aarhus, Denmark). A neighbor-joining phylogenetic tree was constructed using MEGA X 10.1 software (Pennsylvania State University, US) with the following parameters: bootstrap analysis of 1,000 replicates, Poisson correction method, and pairwise deletion (Kumar et al. 2018).
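The distance underlying these tree parameters can be illustrated with a short sketch. Under the Poisson correction, the distance between two aligned sequences is d = -ln(1 - p), where p is the proportion of differing sites. With pairwise deletion, sites are dropped only for the pair being compared when either sequence has a gap there. The code below is an illustration of that calculation, not MEGA's implementation; treating "-" and "?" as gap or missing-data symbols is an assumption.

```python
import math

# Assumed symbols for alignment gaps and missing data.
GAP_CHARS = {"-", "?"}


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance between two aligned protein sequences.

    Sites where either sequence has a gap are skipped for this pair only
    (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # pairwise deletion: ignore this site for this pair
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")

    p = differences / compared  # proportion of differing sites
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p) if p else 0.0
```

Applying this function to every pair of sequences in an alignment yields the distance matrix that a neighbor-joining procedure takes as input.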

Identification of AGPase proteins
Forty-five AGPase domain-containing proteins from I. batatas (26 accessions), I. trifida (10 accessions), and I. triloba (9 accessions) were identified and used for the analyses (Table 1). The sizes of these proteins differed markedly; the number of amino acids ranged from 165 to 525, and the molecular weights (MW) ranged from 18.35 to 58.19 kDa.
The isoelectric point (pI), the pH at which the molecule carries no net electrical charge, ranged from 4.71 to 9.53 across all proteins. The average pI values of the I. batatas, I. trifida, and I. triloba AGPases were 6.83, 7.11, and 6.47, respectively. The instability index (II) classifies a polypeptide as stable at ≤ 40 and unstable at > 40. All I. batatas AGPases had II values of 40 or less, whereas some AGPases of I. trifida and I. triloba had values of 40 or more. The aliphatic index (AI), which represents the relative volume occupied by the aliphatic side chains of a polypeptide, was similar in the three species, but it differed between the subunits of I. batatas AGPase: the small subunits had higher AI values than the large subunits. The grand average of hydropathy (GRAVY), which was analyzed to determine the hydropathy of AGPase, showed that I. batatas had different characteristics from the other two species. All I. batatas AGPases showed negative values, whereas some of the I. trifida and I. triloba AGPases had positive values.
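The interpretation of the II and GRAVY values can be written as two small rules. The sketch below follows the II threshold stated above (≤ 40 stable, > 40 unstable) and assumes the usual reading of GRAVY, in which negative values indicate a hydrophilic protein and positive values a hydrophobic one. The function names are hypothetical.

```python
def stability_class(instability_index: float) -> str:
    """Classify a protein by its instability index (II).

    Uses the threshold given in the text: II <= 40 is stable.
    """
    return "stable" if instability_index <= 40 else "unstable"


def hydropathy_class(gravy: float) -> str:
    """Classify a protein by the sign of its GRAVY value.

    Assumption: negative means hydrophilic, positive means hydrophobic.
    """
    if gravy < 0:
        return "hydrophilic"
    if gravy > 0:
        return "hydrophobic"
    return "neutral"
```

Under these rules, every I. batatas AGPase in Table 1 would fall in the "stable" and "hydrophilic" classes, while some I. trifida and I. triloba AGPases would not.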

Conserved domain analysis
Six types of conserved domains with different distributions were found in the AGPase proteins of the three species (Fig. 1b, Supplementary Table 1). Most of the I. trifida and I. triloba AGPases had only the NTP_transferase domain, and some had two conserved domains: NTP_transferase at the N-terminal and Hexapep or Cpn60_TCP1 at the C-terminal. In contrast, the I. batatas AGPase proteins had four types of conserved domains (NTP_transferase, LbH_G1P_AT_C, ADP_Glucose_PP, and Glyco_tranf_GTA_type), and each protein had two conserved domains. All of the I. batatas AGPase proteins had the LbH_G1P_AT_C domain at the C-terminal, but the N-terminal differed according to the subunit. The N-terminal of all large subunits of I. batatas AGPase had the NTP_transferase domain, except for CAB51610.1, whereas all small subunits had the ADP_Glucose_PP domain, except for CAB55496.1, AAA19648.1, and CAA86726.1. These exceptions were all partial sequences and had the Glyco_tranf_GTA_type domain at the N-terminal instead.
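A comparison of this kind amounts to grouping proteins by domain architecture, that is, the ordered list of domains from the N-terminal to the C-terminal. The following sketch shows one way to do this once domain hits with start and end positions are available. The `DomainHit` structure and function names are hypothetical, and any coordinates passed in would come from a domain search, not from this code.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """One conserved-domain hit on a protein (hypothetical structure)."""
    name: str
    start: int  # first residue of the hit, 1-based
    end: int    # last residue of the hit


def architecture(hits: list[DomainHit]) -> tuple[str, ...]:
    """Return domain names ordered from N-terminal to C-terminal."""
    return tuple(hit.name for hit in sorted(hits, key=lambda h: h.start))


def group_by_architecture(
    proteins: dict[str, list[DomainHit]],
) -> dict[tuple[str, ...], list[str]]:
    """Map each distinct domain architecture to the proteins that share it."""
    groups: dict[tuple[str, ...], list[str]] = {}
    for protein_id, hits in proteins.items():
        groups.setdefault(architecture(hits), []).append(protein_id)
    return groups
```

The first element of an architecture tuple is the N-terminal domain and the last is the C-terminal domain, so differences between subunits or species show up as different dictionary keys.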

Phylogenetic analysis
The evolutionary history was inferred using the Neighbor-Joining method (Saitou and Nei 1987). Fig. 1a presents the optimal tree, with a sum of branch lengths of 29.09. This analysis involved 45 amino acid sequences and 512 positions. The conserved domains were labeled on the amino acid sequences (Fig. 1a). The length and type of the domains differed among the species. Based on the phylogenetic tree, the AGPase proteins from these species were classified into two large subunit groups and two small subunit groups.
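The core step of neighbor-joining is choosing which pair of taxa to join next. In the commonly used Q-criterion formulation, Q(i, j) = (n - 2) d(i, j) - Σ d(i, k) - Σ d(j, k), and the pair with the smallest Q is joined. The sketch below shows only this selection step for a symmetric distance matrix. It is an illustration, not MEGA's implementation, and it omits branch-length estimation, matrix updating, and bootstrapping.

```python
from itertools import combinations


def q_matrix(dist: list[list[float]]) -> dict[tuple[int, int], float]:
    """Compute the neighbor-joining Q value for every pair of taxa.

    `dist` is a symmetric matrix of pairwise distances with a zero diagonal.
    """
    n = len(dist)
    if n < 3:
        raise ValueError("neighbor-joining needs at least three taxa")
    row_sums = [sum(row) for row in dist]
    return {
        (i, j): (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        for i, j in combinations(range(n), 2)
    }


def pair_to_join(dist: list[list[float]]) -> tuple[int, int]:
    """Return the indices of the pair with the smallest Q value."""
    q = q_matrix(dist)
    return min(q, key=q.get)
```

Repeating this step, each time replacing the joined pair with a new internal node and recomputing distances, builds the tree one join at a time.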

Discussion
AGPase is an important factor in the formation of tuberous roots in sweetpotato because it is a vital enzyme in starch synthesis (Yatomi et al. 1996; Tsubone et al. 2000). It is also present in I. trifida and I. triloba, as well as in other plants of the genus Ipomoea, but these plants have physiological properties different from those of sweetpotato, such as non-tuberization. Therefore, AGPase is believed to have different structures or functions among plants of the genus Ipomoea. The identification of AGPases in sweetpotato and two non-tuberous Ipomoea species performed in this study is important for understanding the relationships among plants of the genus Ipomoea and the functions of AGPase in each species.
Sweetpotato is a polyploid crop derived from I. trifida, but it is unclear whether it is an autopolyploid or an allopolyploid (Roullier et al. 2013; Wu et al. 2018). The number of AGPases in sweetpotato increased relative to its relatives through whole-genome duplication. This result is consistent with a study showing that the number of rboh genes in the polyploid plant Gossypium hirsutum was higher than in its progenitors G. raimondii and G. arboreum (Wang et al. 2020). Moreover, some AGPases in I. trifida and I. triloba exhibited an II value ≥ 40, which indicates an unstable state, but no AGPase in I. batatas had an II value ≥ 40 (Table 1). This suggests that some genes encoding unstable proteins may have been deleted during the evolution of I. batatas.
A difference in the domain composition of AGPase was observed between sweetpotato and the other Ipomoea plants; I. batatas has a more complex composition (Fig. 1b). The N-terminal domains of the small subunit and the C-terminal domains of sweetpotato AGPases differed from those of the other two species. These results suggest that LbH_G1P_AT_C at the C-terminal, and ADP_Glucose_PP and Glyco_tranf_GTA_type at the N-terminal of the small subunit, contribute to functions and regulation that differ from those in the non-tuberous relatives. Many studies have shown that domain architectures, shaped by events such as the insertion and deletion of domains during evolution, can help distinguish orthologs from paralogs (Björklund et al. 2005; Forslund et al. 2011). Although this study cannot confirm the homologous genes of each AGPase among plants of the genus Ipomoea, the evolutionary process of the genome among these plants, including AGPase, is expected to be revealed through further studies.

Conclusion
Sweetpotato AGPases have relatively conserved domains compared to those of I. trifida and I. triloba. The small subunit of AGPase showed more complex structures in sweetpotato than in the other two species. Sweetpotato AGPase had the LbH_G1P_AT_C domain in the C-terminal region, which was not present in I. trifida and I. triloba. This suggests that the structure of AGPase in sweetpotato, which differs from that in the other two species, plays important roles in certain functions of sweetpotato, such as starch biosynthesis and tuber formation. More isolation studies and further examination of gene expression will be needed to clarify the functional role of sweetpotato-specific domains in tuberization.

Declarations
Funding
Not applicable

Conflict of interest
We declare that we have no conflict of interest.

Availability of data and material
Not applicable

Code availability
Not applicable

Author contributions
All authors contributed to the study conception and design. Kim SH conceived the original research plan; Nie H performed the data collection and wrote the manuscript; Kim SJ, Kim HH, and Kim JS revised the manuscript. All authors have reviewed and approved the final version of the manuscript.