A conserved NAG motif is critical to the catalytic activity of galactinol synthase, a key regulatory enzyme of RFO biosynthesis

Galactinol synthase (GolS) catalyzes the key regulatory step in the biosynthesis of Raffinose Family Oligosaccharides (RFOs). Even though the physiological role and regulation of this enzyme have been well studied, little is known about its active site amino acids and the structure-function relationship with its substrates. In the present study, we investigate the active site amino acids and the structure-function relationship of this enzyme. Using a combination of three-dimensional homology modelling and molecular docking, along with a series of deletions and site-directed mutagenesis followed by in vitro biochemical and in vivo functional analysis, we have studied the active site amino acids of the chickpea and Arabidopsis GolS enzymes and their interaction with the substrate. Our study reveals that the GolS protein possesses several conserved GT8 family-specific motifs, among which the NAG motif plays a crucial role in substrate binding and catalytic activity. Deletion of the entire NAG motif, or deletion or substitution (with alanine) of any residue of this motif, results in complete loss of catalytic activity in vitro. Furthermore, disruption of the NAG motif of the CaGolS1 enzyme disrupts its in vivo cellular function in yeast as well as in planta. Together, our study offers new insight into the active site amino acids of GolS and their interaction with the substrate. We demonstrate that the NAG motif plays a vital role in substrate binding for the catalytic activity of galactinol synthase, which in turn affects overall RFO synthesis.

Downloaded from http://portlandpress.com/biochemj/article-pdf/doi/10.1042/BCJ20210703/922716/bcj-2021-0703.pdf by guest on 08 November 2021. Biochemical Journal. This is an Accepted Manuscript. You are encouraged to use the Version of Record that, when published, will replace this version. The most up-to-date version is available at https://doi.org/10.1042/BCJ20210703


.1.123) is a member of the glycosyltransferase family 8 (GT8) in the CAZy (Carbohydrate-Active enZymes) database and is present only in flowering plants [1,2]. GolS catalyzes the first committed step in the biosynthesis of Raffinose Family Oligosaccharides (RFOs) and is highly conserved among plant species. The enzyme transfers a galactosyl moiety from uridine diphosphate galactose (UDP-alpha-D-galactose) to myo-inositol, yielding UDP and galactinol (O-alpha-D-galactosyl-(1->3)-1D-myo-inositol), which acts as a stepping stone for the biosynthesis of the RFO members [3]. Free galactinol supplies the activated galactosyl moiety used to generate the series of RFOs, including raffinose, stachyose and verbascose, by the enzymes raffinose synthase, stachyose synthase and verbascose synthase, respectively [4]. RFOs play important and diverse physiological roles in plants, such as seed desiccation tolerance [5][6][7][8][9], translocation of photoassimilates [10], and promotion of biotic [11,12] and abiotic stress tolerance [13][14][15][16]. Galactinol and RFOs are also believed to act as osmolytes that maintain cell turgor pressure and stabilize the integrity of cellular proteins and membranes in stressful environments [17]. Most plants have more than one GolS isoform, usually encoded by a small gene family. In Arabidopsis, GolS isoforms are encoded by seven distinct genes, while in chickpea (Cicer arietinum) GolS is encoded by two genes [5,8,15]. GolS genes are differentially expressed across organs and in response to different stimuli to perform distinct physiological functions [4,5,14,18,19,20]. Overexpression of GolS in homologous or heterologous systems confers abiotic stress tolerance in diverse species. For instance, overexpression of AtGolS2 imparts drought stress tolerance in Arabidopsis as well as rice [15,21].
Further, we have previously established a direct role of chickpea GolS in abiotic stress tolerance and seed longevity through reduction of ROS-mediated damage [5,18,22]. The GolS isoforms are apparently monomeric proteins with calculated molecular weights of 35 kDa to 45 kDa. The GolS protein possesses GT8 family-specific motifs such as DxD, HxxGxxKPW and GxG, which are predicted to be important for GolS enzyme activity; for instance, the DxD motif is crucial for divalent cation binding [2]. However, owing to the lack of crystallographic and structural studies of GolS, little is known about the relationship between enzyme structure, active site amino acids and catalysis. Therefore, even though the physiological role, regulation and basic biochemical properties of this enzyme have been elucidated in several plant species, the identification of active site amino acid residues and the structure-function relationship between substrates and enzyme remain largely unexplored.

In our previous study, we reported five distinct GolS isoforms (CaGolS1, CaGolS1′, CaGolS2, CaGolS2′ and CaGolS2′′) in chickpea, and CaGolS1′ (Accession no: KU189227), an alternative splice variant of CaGolS1, was found to be biochemically inactive [5]. These observations prompted us to investigate the biochemical and structural basis of the loss of catalytic activity of this CaGolS1′ […]

[…] by a modified Fiske and SubbaRow (1925) protocol [31]. To the reaction mixture, 100 μL of 2.5% ammonium molybdate (dissolved in 2 N HCl) and 40 μL of Fiske and SubbaRow reducer were added. After 2 min of incubation at room temperature, 40 μL of 34% sodium citrate·2H2O solution was added, and absorbance was immediately measured at 660 nm. The amount of Pi (inorganic phosphate) formed by the hydrolysis of UDP was determined using a standard curve constructed with KH2PO4 and correlated to the amount of UDP produced by the galactinol synthase.
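The standard-curve step described above can be sketched as follows. This is a minimal illustration rather than the analysis actually used in the study: the KH2PO4 amounts and A660 readings are hypothetical, a straight least-squares line through the standards is assumed, and nmol Pi is read directly as nmol UDP on the assumption that one Pi is released per UDP hydrolysed.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pi_from_absorbance(a660, slope, intercept):
    """Invert the standard curve: absorbance at 660 nm -> nmol Pi."""
    return (a660 - intercept) / slope


# Hypothetical KH2PO4 standards (nmol Pi) and their A660 readings.
standard_nmol = [0.0, 10.0, 20.0, 40.0, 80.0]
standard_a660 = [0.02, 0.12, 0.22, 0.42, 0.82]

slope, intercept = fit_line(standard_nmol, standard_a660)

# Hypothetical sample reading; assumed 1 Pi per UDP, so nmol Pi = nmol UDP.
sample_pi_nmol = pi_from_absorbance(0.32, slope, intercept)
print(f"slope={slope:.4f} A660/nmol, intercept={intercept:.4f}, Pi={sample_pi_nmol:.1f} nmol")
```

In practice the sample reading should fall within the range covered by the standards, since the linear fit is not validated outside it.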

Thermo-tolerance assay
To compare the thermotolerance conferred by CaGolS1 and its mutant variant, we cloned CaGolS1 and CaGolS1Δ (N187A) into the pYES-DEST52 vector for yeast expression using Gateway-based cloning (Invitrogen). The pDEST52-CaGolS1 construct and the empty pDEST52 vector were separately transformed into the yeast strain INVSc1 using the PEG-lithium acetate-based transformation protocol. To examine the CaGolS activity of yeast cells transformed with pDEST52:CaGolS1 and pDEST52:CaGolS1∆, the cells were grown to logarithmic growth phase (OD = 1). The cells were centrifuged and washed, and the cell pellet was lysed by vortexing with glass beads to prepare crude lysate [32]. The thermo-tolerance potential was also assessed using a spot assay and growth curves of yeast cells harboring pDEST52:CaGolS1 and pDEST52:CaGolS1∆. For this, the yeast cells were grown to mid-logarithmic phase (A600 0.5, about 1 × 10⁷ cells ml⁻¹) in YEB medium at 30°C, and the cell density was adjusted to A600 0.02. For the spot assay, cultures of equal cell density (A600 0.02) from each treatment were serially diluted up to 10⁻⁶ and spotted (5 μl) on plates in triplicate [33]. The control plate was kept at 30°C, while for thermal stress treatment the plate was kept at 42°C for 8 h and then shifted to 30°C, and growth was monitored. Similarly, growth curves were plotted by measuring the growth (OD600) of yeast cells incubated at 37°C and 30°C. The initial cell density was kept the same for all yeast cells and treatments (OD = 0.2), and growth was recorded every 2 h by measuring the absorbance (OD600). Each treatment was set up in 3 repetitions, with at least four biological replicates.
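The dilution arithmetic behind the spot assay can be illustrated with a short sketch. It takes the stated calibration (A600 0.5 corresponds to about 1 × 10⁷ cells ml⁻¹), the starting density (A600 0.02) and the 5 μl spot volume from the text. Two things are assumptions made only for this illustration: cell number is taken to scale linearly with A600, and the series is taken to proceed in tenfold steps.

```python
# Calibration stated in the text: A600 0.5 is about 1e7 cells per ml.
CELLS_PER_ML_AT_A600_0_5 = 1e7
SPOT_VOLUME_ML = 0.005  # 5 microlitres per spot


def cells_per_ml(a600):
    """Estimate cell density, assuming it scales linearly with A600."""
    return CELLS_PER_ML_AT_A600_0_5 * (a600 / 0.5)


def cells_per_spot(a600_start, dilution_exponent):
    """Expected cells in one spot after a 10**-dilution_exponent dilution."""
    diluted = cells_per_ml(a600_start) * 10.0 ** (-dilution_exponent)
    return diluted * SPOT_VOLUME_ML


# Undiluted culture (exponent 0) down to the 10^-6 dilution (exponent 6).
spots = {k: cells_per_spot(0.02, k) for k in range(0, 7)}

for exponent, count in spots.items():
    print(f"10^-{exponent}: about {count:g} cells per spot")
```

Under these assumptions the expected number of cells per spot falls to single digits around the 10⁻³ dilution, which is worth keeping in mind when comparing growth at the highest dilutions.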

Generation of transgenics
To generate transgenic lines of CaGolS1 and CaGolS1Δ (N187A), we cloned CaGolS1 into the pENTR™/D-TOPO™ vector to generate entry clones. The entry clones were further subcloned into the destination vector (pEarleyGate201) using the Gateway cloning approach (Invitrogen). The constructs were confirmed by sequencing and transferred into Agrobacterium cells (GV3101). Subsequently, floral dip was performed to generate transgenic plants of CaGolS1 and CaGolS1Δ. Positive transformants were screened for Basta resistance (120 mg L⁻¹) until T3 homozygous lines were obtained [34]. The T3 homozygous transgenic lines were further selected on the basis of increased mRNA level, GolS protein and galactinol content.

Western blot analysis of CaGolS1 and CaGolS1Δ mutant lines
Crude protein from the transgenic lines (CaGolS1 and CaGolS1Δ), along with wild type and vector control, was extracted using the protein extraction buffer used in our previous study [35]. The concentration of the isolated protein was determined using Bradford reagent (GE). For protein electrophoresis, 10 μg of protein was separated by 12% SDS-PAGE and then blotted onto a PVDF (polyvinylidene difluoride) membrane using the Bio-Rad blotting system. The membrane was incubated with the primary monoclonal anti-HA antibody (1:2000 dilution) (Merck), followed by a horseradish peroxidase-conjugated secondary antibody (GE Healthcare). As a loading control, tubulin was detected using an anti-tubulin antibody (Sigma) following the same procedure.

RNA extraction and cDNA synthesis for Quantitative PCR assays
Total RNA was extracted using the TRIzol reagent (Sigma), and the RNA samples were quantified using the ND-1000 UV-visible light spectrophotometer [36]. RNA samples with a 260/280-nm ratio of ~2 and a 260/230-nm ratio >2 were retained for analysis. Total RNA (1 μg) was reverse transcribed into cDNA using the Verso cDNA synthesis kit (Thermo Scientific). The absence of genomic DNA in the RNA samples was checked by qPCR before cDNA synthesis. A negative control (No Template Control) was also included in each assay. For normalization, the transcript levels of two stably expressed endogenous reference genes (18S rRNA and EF1α) were used in each assay. The gene-specific primer sets and the reference genes used for qRT-PCR were previously described and validated [37].
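Normalization against two reference genes can be sketched as below. The text does not state which quantification model was used, so this example assumes the widely used 2^-ΔΔCt calculation with 100% amplification efficiency, and averages the Ct values of the two reference genes (equivalent to taking the geometric mean of their quantities). All Ct values and function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target gene minus the mean Ct of the reference genes."""
    return target_ct - mean(reference_cts)


def relative_expression(sample_target, sample_refs, control_target, control_refs):
    """Fold change of sample vs control by the assumed 2^-ddCt model."""
    ddct = delta_ct(sample_target, sample_refs) - delta_ct(control_target, control_refs)
    return 2.0 ** (-ddct)


# Hypothetical Ct values; reference order is (18S rRNA, EF1-alpha).
fold_change = relative_expression(
    sample_target=22.0,
    sample_refs=[12.0, 18.0],
    control_target=25.0,
    control_refs=[12.0, 18.0],
)
print(f"relative expression: {fold_change:.1f}-fold")
```

If primer efficiencies differ appreciably from 100%, an efficiency-corrected model would be more appropriate than this simplified form.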

Controlled Deterioration Test (CDT)
The T3 transgenic seeds expressing CaGolS1 and CaGolS1Δ were subjected to a controlled deterioration test (CDT) to assess their vigor. For this, the seeds were first imbibed in water to raise their moisture content and then subjected to high temperature and relative humidity [5,35,38]. The treated seeds (4 days of CDT) were immediately used for the germination assay and viability test (TTZ assay). For the MDA and H2O2 assays, the seeds were either used immediately or snap-frozen in liquid nitrogen and stored at -80°C for later use.

Germination and TTZ assay
The germination and tetrazolium (TTZ; 2,3,5-triphenyl-tetrazolium chloride) assays were performed essentially as described previously [5]. After CDT treatment, the seeds were evaluated for their germination potential. For this, the seeds were placed on ½ MS medium, kept in a growth chamber and monitored for germination. Seeds were scored as germinated once the radicle had emerged by 1 mm. In addition, the seeds were stained with 1% TTZ to determine their viability before and after CDT and were photographed [23,39,40].

Malondialdehyde and H2O2 estimation
T3 seeds of the transgenic lines, along with wild type (WT) and vector control (VC) seeds, were challenged with CDT and assayed for lipid peroxidation by the thiobarbituric acid (TBA) method and for H2O2 by the potassium iodide (KI) method, essentially as described previously [14,27].

Statistical analysis
The data presented in the manuscript represent the mean and standard deviation of triplicate analyses. One-way analysis of variance (ANOVA) was used to test for statistically significant differences between sample and control values, with Duncan's multiple range test (DMRT) as the post hoc test. Differences in means were considered statistically significant at P < 0.05.
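The one-way ANOVA named above partitions variation into between-group and within-group components; the sketch below computes the resulting F statistic with the standard library only. The three groups of triplicate values are hypothetical and are not data from this study. The post hoc DMRT step and the conversion of F into a P value (which needs the F distribution from a statistics package) are deliberately left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    k = len(groups)      # number of groups
    n = len(all_values)  # total number of observations

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical triplicate measurements for three treatments.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F relative to the critical value for the given degrees of freedom indicates that at least one group mean differs; which groups differ is what the post hoc test then addresses.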

Results
Structural and bioinformatic analysis of CaGolS1 predicts probable active site(s)/domain(s)/motif(s)
As the 3D structure of galactinol synthase is not available, we generated a model using the I-TASSER server as described in the Experimental procedures section. The best model was selected according to the I-TASSER confidence score (C-score), in agreement with the Ramachandran plot, and was used for further bioinformatic analysis (Table 1) […] protein for any suitable substrate (Fig. 1A-B). Several conserved domains/motifs such as FLAG, DxD, HxxGxxKPW, APSAA and NAG, which were reported previously [2,5], are also shown in […] analysis by I-TASSER predicted good binding with UDP in the central cavity of the protein (Table 2), with interacting residues drawn from almost all of the conserved motifs/domains reported above; these data also correspond to the residues predicted by InterProScan as substrate-binding sites. These predictions indicate that CaGolS1 interacts properly with UDP-galactose and that residues from the reported motifs/domains are either completely or partially involved in the interaction. Nevertheless, both I-TASSER and InterProScan also identified some residues outside the reported motifs/domains as potential substrate-binding sites.

Comparative analysis of catalytically active CaGolS1 and inactive CaGolS1′ reveals loss of probable active site and conformational integrity of CaGolS1′
In our previous study, we reported that CaGolS1′, a splice variant of CaGolS1, is enzymatically inactive, and that its lack of biochemical activity is conceivably due to the absence of a 73-aa stretch (Fig. S1) [5]. This suggests that residue(s) within the 73-aa stretch are likely responsible for imparting GolS activity to the molecule. To confirm that the deletion of these 73 aa indeed renders the CaGolS1′ isoform biochemically inactive, we generated a deletion construct in which this 73-aa stretch was deleted from the biochemically active wild type CaGolS1 (designated CaGolS1Δ1). Subsequently, CaGolS1Δ1 was overexpressed in bacteria with a C-terminal hexa-His tag, and the recombinant protein was purified to homogeneity using a pre-packed Ni-NTA column. The purified recombinant protein was assayed for GolS activity, and the results revealed that CaGolS1Δ1 was indeed biochemically inactive (Fig. S2). Next, to investigate whether the deletion of this 73-aa stretch had changed the structural conformation and/or disrupted the active site amino acid residues, we generated the 3D structure of CaGolS1′ using the I-TASSER server as described in the Experimental procedures section. It is important to note that, like other biochemically active GolS proteins, CaGolS1 and CaGolS1′ (Accession no: KU189227) both possess the FLAG, DxD, HxxGxxKPW and APSAA conserved motifs and sequences (Fig. 1E). The 3D model of CaGolS1′, which lacks the 73-aa stretch, showed five α-helices and seven β-sheets, where five pairs of β-sheets were surrounded by α-helices, with several intermediary polypeptide chains of varied length invariably present on the surface of the protein (Fig. 1D). Ligand-binding site prediction by InterProScan and I-TASSER showed a significant decrease in the number of amino acids involved in forming a probable binding site for any suitable substrate in this variant. Furthermore, data from I-TASSER and InterProScan showed that almost 30-35% of the residues predicted as probable substrate-binding sites belong to the 73-aa region. This structural comparison suggests that wild type CaGolS1 is a much more stable and active protein, with a greater possibility of interacting with and binding the substrates UDP-galactose/inositol, than its splice variant. Furthermore, 3D structural and docking analysis with the GolS substrate UDP-galactose revealed that the wild type CaGolS1 protein binds UDP-galactose properly in the central core cavity of the protein, whereas this substrate interacts poorly with surface residues of the CaGolS1′ protein (Fig. 2A-B). Molecular docking of CaGolS1 with UDP-galactose reveals that nearly 50% of the interacting residues belong to the 73-aa region. Therefore, the markedly displaced loops and folds in the structure of the splice variant (CaGolS1′) appear to deter substrate binding and result in an apparent conformational change in the active site, which may account for the loss of catalytic activity of this isoform. The parameters used for bioinformatic analysis and molecular docking are given in Table 1.

Homology modelling and deletion study reveal key motifs for GolS activity
Based on the preceding analysis, we assumed that the 73-aa region contains important motifs/residues that are likely to play an important role in substrate binding for catalysis. To identify the precise domain(s)/motif(s) important for catalysis or for imparting GolS activity, we next performed an in-depth bioinformatic analysis. Initially, BLASTP analysis was carried out using the NCBI server, and the protein was found to be conserved as GolS1 in several plant species. Next, we carried out the Multiple Sequence […] (Fig. S3). Subsequently, to identify which of these conserved domains are important for GolS activity, we generated four deletion variants of CaGolS1 (CaGolS1Δ2, CaGolS1Δ3, CaGolS1Δ4, CaGolS1Δ5) by deleting 18-19-aa sequences spanning these conserved sequences from the 73-aa patch (Fig. S4) using splicing by overlap extension PCR (SOE-PCR). Among these 18-19-aa deletion mutants, LYFNAG lies in the first deletion mutant, FAEQDF in the third and YNLVLAMLW in the fourth, whereas no conserved sequence was observed in the second deletion mutant. All these deletion mutants were bacterially expressed, and the purified enzymes were used for the GolS assay. Our results showed that all deletion mutants were enzymatically inactive (Fig. 3A-C). It can be predicted that these conserved sequences, in whole or in part, play an important role in […] can also be due to the altered conformation caused by the larger deletion of 18-19 aa. Thus, the deletion of a relatively large patch of 18-19 aa (CaGolS1Δ2, CaGolS1Δ3, CaGolS1Δ4, CaGolS1Δ5) essentially leads to a conformational change and was not sufficient to identify the motifs and amino acids that are crucial for the catalytic activity of the GolS enzyme. Hence, to narrow down the precise amino acid residues/motifs forming the putative substrate-binding site within these conserved domains (LYFNAG, FAEQDF, YNLVLAMLW), we cross-checked them against the probable ligand-binding residues predicted by I-TASSER or InterProScan, and observed that NAG and FAE were predicted by both I-TASSER and InterProScan as probable substrate-binding sites. More importantly, on analyzing the selected 3D model, we found that NAG of the LYFNAG conserved motif, located in the central core, actively takes part in forming the substrate-binding cavity of GolS.
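Locating motifs such as DxD, HxxGxxKPW, LYFNAG and FAEQDF in a protein sequence can be illustrated with a small regular-expression scan. This is only a sketch of the general idea, not the pipeline used in the study (which relied on BLASTP, multiple sequence alignment, I-TASSER and InterProScan). The sequence below is a made-up toy string, not the CaGolS1 sequence, and the function name is hypothetical.

```python
import re

# Motif names from the text mapped to regular expressions,
# where "x" (any residue) becomes "." in the pattern.
MOTIFS = {
    "DxD": "D.D",
    "HxxGxxKPW": "H..G..KPW",
    "LYFNAG": "LYFNAG",
    "FAEQDF": "FAEQDF",
}


def find_motifs(sequence, motifs=MOTIFS):
    """Return {motif name: [(1-based start position, matched residues), ...]}."""
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead is used so that overlapping occurrences are all reported.
        regex = re.compile("(?=(" + pattern + "))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]
    return hits


# Toy sequence for illustration only (NOT CaGolS1).
toy_sequence = "MKTLYFNAGVVDADKKFAEQDFRRHLAGTYKPWSS"
hits = find_motifs(toy_sequence)

for name, found in hits.items():
    print(name, found)
```

Positions are reported with 1-based numbering to match the residue numbering convention used for mutants such as N187A.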

Mutation in NAG motif causes a complete loss of GolS activity
To test these possibilities, we bacterially expressed recombinant mutant proteins carrying these deletions [CaGolS1Δ6: NAG and CaGolS1Δ7: FAE] (Fig. 4A-B). The recombinant proteins were purified to homogeneity using a pre-packed Ni-NTA column as described in the previous section, and their activity was subsequently analyzed. Of these mutant variants, CaGolS1Δ6 (NAG) showed a complete loss of activity, whereas CaGolS1Δ7 (FAE) exhibited a nearly 50% reduction in catalytic activity compared to CaGolS1 (Fig. 4C). These results strongly indicate that NAG is crucial for GolS activity, whereas FAE might contribute to activity indirectly: in the FAE mutant, NAG remains at its native position, but the deletion of FAE slightly distorts the conformation of the molecule and thereby alters the active site. To validate these results, we performed molecular docking of both mutants with UDP-galactose. CaGolS1Δ6 interacts poorly with the substrate on the surface of the molecule, indicating that deletion of NAG deforms the active site and leaves the substrate without a proper binding cavity. CaGolS1Δ7 shows a below-moderate level of interaction with UDP, suggesting that deletion of FAE causes a conformational change that blocks the path of UDP to the binding cavity/site of the molecule and thus reduces GolS activity by ~50% (Fig. 2C-D). Moreover, closer inspection of the molecular docking of wild type CaGolS1 with UDP-galactose showed that the NAG residues bind the substrate properly (Fig. 2A). Thus, it can be predicted that NAG (N187, A188, G189) is crucial for imparting GolS activity to the molecule.

To further investigate the catalytic importance of each amino acid residue in the NAG motif (N187, A188, G189), these residues were deleted or replaced using a site-directed mutagenesis (SDM) approach. In the N187A [CaGolS1Δ8] and G189A [CaGolS1Δ9] mutants, GolS activity was completely abolished (Fig. 4C). Further, we deleted either N187 (CaGolS1Δ10), A188 (CaGolS1Δ11) or G189 (CaGolS1Δ12) and assessed the effect of each residue on enzyme activity (Fig. 4C). Our results clearly showed that all the amino acids of the NAG motif are critically important for GolS activity.

The NAG motif is highly conserved across plant species. Therefore, to confirm whether the NAG motif is also important for GolS activity in other plant species, we used a similar SDM approach to disrupt the NAG motif and generated constructs for three mutant variants of AtGolS1 (AtGolS∆1: NAG to AAA; AtGolS∆2: N193A; AtGolS∆3: N193 deleted). Arabidopsis GolS1 (AtGolS1) and its three mutant variants were bacterially expressed and purified using affinity chromatography, and their catalytic activity was then examined and compared with that of wild type AtGolS1 (Fig. 5A-C). As anticipated, the AtGolS∆1, AtGolS∆2 and AtGolS∆3 mutant variants all appeared to be enzymatically inactive, signifying the vital importance of an intact NAG motif for GolS activity across species.
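The two kinds of variants described above, single-residue substitutions (such as N187A) and deletions of one or more residues, can be expressed as simple sequence operations. The sketch below works on a made-up toy sequence in which NAG sits at positions 7-9 rather than 187-189; the function names are hypothetical and the code only illustrates how such variants are defined at the sequence level, not how the constructs were made in the laboratory.

```python
def substitute(sequence, position, expected, replacement):
    """Replace the residue at a 1-based position, checking the original residue."""
    index = position - 1
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


def delete_residues(sequence, start, end):
    """Delete residues from 1-based position start to end, inclusive."""
    return sequence[:start - 1] + sequence[end:]


# Toy sequence for illustration only: the NAG motif occupies positions 7-9.
toy_sequence = "MKTLYFNAGVV"

n_to_a = substitute(toy_sequence, 7, "N", "A")    # analogue of N187A
g_to_a = substitute(toy_sequence, 9, "G", "A")    # analogue of G189A
motif_deleted = delete_residues(toy_sequence, 7, 9)  # whole NAG motif removed
n_deleted = delete_residues(toy_sequence, 7, 7)      # single residue N removed

print(n_to_a, g_to_a, motif_deleted, n_deleted)
```

Checking the expected residue before substituting guards against off-by-one errors in position numbering, which are easy to make when switching between 0-based and 1-based indexing.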

Effect of N187 mutation on CaGolS1 function in yeast
To lend stronger support to the above findings, we further examined our results in vivo. Previously, we reported that CaGolS1 improves thermal stress tolerance in Arabidopsis by limiting the heat stress-induced ROS level [14]. Considering the role of CaGolS1 in heat stress tolerance, we assessed the growth pattern of yeast (INVSc1) harbouring native CaGolS1, CaGolS1∆ (N187A) or the empty vector control under thermal stress, as described in the materials and methods section. Initially, we conducted western blot analysis to confirm that both CaGolS1 and CaGolS1∆ (N187A) are indeed expressed in the yeast cells (Fig. S5). At 30°C, yeast cells transformed with CaGolS1, CaGolS1∆ (N187A) or the empty vector grew equally well and did not show a significant difference in their growth pattern (data not shown). However, yeast cells expressing CaGolS1 grew better than the empty vector control cells after heat stress exposure. After heat stress treatment, the growth of the empty vector-transformed cells was inhibited, while the CaGolS1-transformed cells were noticeably less inhibited under similar conditions and were able to mitigate heat stress-induced growth inhibition (Fig. 6). […] in biochemically inactive GolS which is unable to provide thermotolerance to yeast, in contrast to the native GolS (Fig. 6).

Disruption of the N187 residue in the NAG motif of CaGolS1 significantly affects its function in seed vigor and longevity in planta
Previously, Arabidopsis transgenic plants constitutively overexpressing CaGolS1 were shown to accumulate significantly increased galactinol and raffinose, which consequently improves seed vigor and longevity [5]. In the previous sections, we demonstrated that a point mutation in NAG […] through transcript accumulation (Fig. S6) of the respective transgenic lines, followed by the GolS activity in three independent overexpression lines (OE-1, OE-2, OE-3) (Fig. 7A). Further, western blot analysis was carried out to confirm that CaGolS1 and CaGolS1∆ (N187A) were indeed expressed in the respective transgenic lines (Fig. S7). In the biochemical and metabolic analysis of the transgenic lines, we observed significantly increased total GolS activity and accumulation of galactinol as well as raffinose in CaGolS1-transformed lines compared to wild type or vector-transformed lines (Fig. 7, Fig. S8). In contrast, transgenic plants transformed with CaGolS1∆ did not show any increase in GolS activity or in galactinol and raffinose content, even though the protein was expressed. The galactinol and raffinose contents of the CaGolS1 mutant-transformed lines were fairly similar to those of the wild type or empty vector control lines. Subsequently, a Controlled Deterioration Test (CDT) was conducted to evaluate the seed vigor and longevity of these transgenic lines. As anticipated, seeds from CaGolS1-transformed lines exhibited improved seed vigor, whereas the CaGolS1-N187A-transformed seeds behaved similarly to their wild type counterparts. After 4 days of CDT, CaGolS1-transformed seeds exhibited 50% to 60% germination, while CaGolS1∆-transformed seeds showed only 10-12% germination, like the control seeds (wild type and vector control) (Fig. 7B-D). After CDT, only CaGolS1-transformed seeds showed dark red staining, in contrast to CaGolS1∆ (N187A)-transformed seeds and control seeds, which remained unstained or stained pale red (Fig. 7E). These results clearly indicate that only CaGolS1-transformed seeds were viable after the aging treatment, owing to overexpression of catalytically active CaGolS. As our previous study found a positive influence of galactinol and raffinose on ROS scavenging during seed aging [5], it is tempting to speculate that the reduced germination of CaGolS1∆-overexpressing seeds is accompanied by an overaccumulation of ROS. To address this point, we determined the H2O2 and MDA contents of the transgenic and control seeds after CDT exposure (Fig. 7F, G). In the WT and CaGolS1∆ mutant seeds, both H2O2 and MDA were significantly higher in response to CDT. In general, the mutant variant and control seeds behaved almost identically, and the reduced germination of the CaGolS1∆ mutant was attributed to higher ROS accumulation, which could be indicative of greater seed deterioration after aging. These data clearly suggest that the N187A mutation generates an enzymatically inactive galactinol synthase that cannot contribute to the accumulation of galactinol and its subsequent products, and thereby provide a straightforward explanation for the observed phenotype of reduced seed vigor. Thus, its overexpression did not contribute to seed vigor and longevity, and the seeds showed reduced germination, which is strongly associated with increased cellular death after CDT resulting from the reduced ROS scavenging capacity.

[…] transferring the activated sugar moiety from the donor molecule to its acceptor [43,44]. In the evolutionary spectrum, GolS is restricted to the plant kingdom and belongs to the GT8 family [4,16].
The GolS enzyme catalyzes the key regulatory step of RFO biosynthesis and has therefore raised substantial interest in understanding RFO regulation in plants. Even though the physiological role and regulation of this enzyme have been characterized in several plant species, studies determining its active sites and structure-function relationship have not been carried out so far [45][46][47][48][49]. To date, no crystallographic studies are available that shed light on the active site and catalytic mechanism. In this work, we have identified that the NAG motif is critically important for GolS catalytic activity through homology modelling and site-directed mutagenesis followed by in vitro and in vivo functional characterization. Homology modelling is now regularly used to understand the sequence-structure relationship. Sequence analysis reveals that GolS proteins possess the DxD and HxxGxxKPW motifs and conserved sequences such as NAG and APSAA. Among these, the NAG and DxD amino acid residues are predicted to be part of the GolS enzyme active site, and the DxD motif has further been shown to be required for divalent cation binding [2]. In our previous studies [5,14], we have identified the […]

In this study, we have focused on the biochemical aspects of CaGolS and pinpointed the residues crucial for its catalytic activity. To address this, we generated several variants by deleting or substituting specific residues using an SDM approach to assess the functional consequences for enzyme activity. When we inspected CaGolS1 and CaGolS1′ at the structural level, several structural deformities were present in CaGolS1′ which might be correlated with its lost activity. The 3D structure of CaGolS1 shows a complex structure formed of seven α-helices and nine β-sheets and polypeptide chains of varied length in coil and/or linear form. The ligand-binding residues predicted by InterProScan and I-TASSER lie in the core of the protein molecule, forming a suitable binding cavity/site for the substrate, whereas the CaGolS1′ structure is formed of five α-helices and seven β-sheets, with intermediary polypeptide chains of varied length invariably present on the surface of the protein. In addition, the data retrieved from InterProScan and I-TASSER showed a significant reduction in the number of probable ligand-binding residues, which is believed to be due to the deletion/absence of the 73-aa stretch as compared to the wild type molecule and is expected to be responsible […] S9). In this analysis, we observed that the structural conformation formed by the first hundred aa is almost comparable between the two molecules, with the least deviation, but in the next two segments of the analysis substantial deviations/changes were observed in CaGolS1′ as compared to CaGolS1. This is believed to be due to the absence/deletion of the 73-aa stretch, which results in the loss of GolS activity in CaGolS1′; in addition, on molecular docking with UDP-galactose, UDP interacts poorly with CaGolS1′ as compared to CaGolS1 (Fig. 2).

Besides, the disruption of the NAG domain in the AtGolS enzyme clearly supports our results for CaGolS and indicates that the NAG domain is indeed crucial for GolS across different species. The higher thermotolerance of yeast cells expressing CaGolS1 supports the view that, despite being a plant-specific enzyme, GolS plays a significant role in thermal stress tolerance in yeast. In contrast, yeast cells expressing the N187A mutant, which proved to be enzymatically inactive, behaved similarly to cells harbouring the empty vector (Fig. 6). We can correlate that the thermotolerance […]

Table S1. Primers used in this study.