Structural and Functional Evolution of Isopropylmalate Dehydrogenases in the Leucine and Glucosinolate Pathways of Arabidopsis thaliana*

The methionine chain-elongation pathway is required for aliphatic glucosinolate biosynthesis in plants and evolved from leucine biosynthesis. In Arabidopsis thaliana, three 3-isopropylmalate dehydrogenases (AtIPMDHs) play key roles in methionine chain-elongation for the synthesis of aliphatic glucosinolates (e.g. AtIPMDH1) and leucine (e.g. AtIPMDH2 and AtIPMDH3). Here we elucidate the molecular basis underlying the metabolic specialization of these enzymes. The 2.25 Å resolution crystal structure of AtIPMDH2 was solved to provide the first detailed molecular architecture of a plant IPMDH. Modeling of 3-isopropylmalate binding in the AtIPMDH2 active site and sequence comparisons of prokaryotic and eukaryotic IPMDH suggest that substitution of one active site residue may lead to altered substrate specificity and metabolic function. Site-directed mutagenesis of Phe-137 to a leucine in AtIPMDH1 (AtIPMDH1-F137L) reduced activity toward 3-(2′-methylthio)ethylmalate by 200-fold, but enhanced catalytic efficiency with 3-isopropylmalate to levels observed with AtIPMDH2 and AtIPMDH3. Conversely, the AtIPMDH2-L133F and AtIPMDH3-L134F mutants enhanced catalytic efficiency with 3-(2′-methylthio)ethylmalate ∼100-fold and reduced activity for 3-isopropylmalate. Furthermore, the altered in vivo glucosinolate profile of an Arabidopsis ipmdh1 T-DNA knock-out mutant could be restored to wild-type levels by constructs expressing AtIPMDH1, AtIPMDH2-L133F, or AtIPMDH3-L134F, but not by AtIPMDH1-F137L. These results indicate that a single amino acid substitution results in functional divergence of IPMDH in planta to affect substrate specificity and contributes to the evolution of specialized glucosinolate biosynthesis from the ancestral leucine pathway.

To compensate for their sessile nature, plants evolved mechanisms to cope with rapid environmental changes and challenges (1). The production of specialized metabolites is one of the important mechanisms for the survival and fitness of plants (2). The molecular diversity of these specialized compounds arises from differential modification of common backbone structures, which necessitates the evolution of homologous enzymes with varied specificities (1). In plants, glucosinolates constitute a diverse group of sulfur-containing specialized metabolites (3)(4). Biosynthesis of methionine-derived glucosinolates is initiated by the sequential addition of methylene groups to produce chain-elongated methionine derivatives via an iterative three-step chain-elongation process that mimics the chemistry of leucine synthesis (Fig. 1A).
To date, all the genes involved in the methionine chain-elongation process have been identified and characterized in Arabidopsis thaliana (5)(6)(7)(8)(9)(10)(11)(12)(13)(14). The different enzymes of the methionine chain-elongation pathway for glucosinolate synthesis have evolved from leucine synthesis by gene duplication and functional specification (14-15). For example, four genes in Arabidopsis encode isopropylmalate synthases (IPMS), with two (IPMS1 and IPMS2) serving in leucine biosynthesis and the other two genes encoding methylthioalkylmalate (MAM) synthases (MAM1 and MAM3), which catalyze the committed step in methionine chain-elongation (5, 6, 16). A recent study showed that loss of a C-terminal regulatory domain and a few amino acid exchanges can convert IPMS into MAM (14). Specialization of the Arabidopsis isopropylmalate isomerases (IPMI) for different catalytic properties occurs by changes in the oligomeric composition of these enzymes. IPMI are heterodimeric enzymes consisting of a large subunit encoded by a single gene and a small subunit encoded by one of three genes (8-9, 12). Metabolic profiling of the large subunit mutant revealed accumulation of intermediates in both the leucine pathway and the methionine chain-elongation pathway, demonstrating the dual function of this subunit in both leucine and glucosinolate biosynthesis (10). In contrast, the small subunits are specialized to either leucine biosynthesis or methionine chain-elongation (2,10,12). Furthermore, among the six branched-chain aminotransferases (BCATs) in Arabidopsis, BCAT4 in the cytosol is specifically involved in glucosinolate biosynthesis, whereas BCAT3 in the plastids functions in both amino acid and glucosinolate biosynthesis (7,9). The molecular changes that tailor BCAT activity are unclear.
Previously, we showed that A. thaliana isopropylmalate dehydrogenase 1 (AtIPMDH1) catalyzes the oxidative decarboxylation step in the methionine chain-elongation of glucosinolate biosynthesis and that AtIPMDH2 and AtIPMDH3 are primarily involved in leucine biosynthesis (Fig. 1B) (11,13). These studies highlight the functional specialization of these isoforms, but do not reveal how these activities evolved.
Here we examine the molecular basis for the functional evolution of the IPMDH family in Arabidopsis. The crystal structure of AtIPMDH2, the first determined for a plant IPMDH, reveals an active site structure similar to that of the bacterial enzymes and provides a template for modeling substrate binding in the active site. Analysis of the AtIPMDH2 structure, sequence comparisons, and site-directed mutagenesis demonstrates that a single residue difference in the active site drastically alters substrate specificity of the AtIPMDH isoforms both in vitro and in vivo. This work demonstrates the basis for functional divergence of an AtIPMDH isoform for glucosinolate biosynthesis from those of leucine biosynthesis.

EXPERIMENTAL PROCEDURES
Plants and Growth-Seeds of A. thaliana ecotype Columbia (Col-0) and SALK mutant atipmdh1 (Salk_063423C) were obtained from the Arabidopsis Biological Resource Center (ABRC). Seed germination and plant growth conditions were as previously described (11,13).
Glucosinolate Analysis-Rosette leaves of 4-week-old plants and mature seeds were used for glucosinolate analysis. Glucosinolates were analyzed using HPLC-mass spectrometry, as previously described (11,13).
Protein Expression, Purification, Assays, Crystallization, and Structure Determination-Expression and purification of wild-type and mutant AtIPMDHs as histidine-tagged proteins for functional analysis were performed using nickel-affinity chromatography, as previously described (11). IPMDH assay conditions using either 3-isopropylmalate or 3-(2′-methylthio)ethylmalate as a substrate and the analysis of steady-state kinetic parameters were as previously described (11). All kinetic parameters were determined by directly fitting data to the Michaelis-Menten equation in SigmaPlot.
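A direct fit to the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]), can be illustrated with a minimal sketch. The analysis reported here was carried out in SigmaPlot; the Python below is only an assumed stand-in. It uses the fact that, for a fixed Km, the model is linear in Vmax, and then searches over Km by golden-section search. The function names and the substrate and velocity values are hypothetical and are not data from this study.

```python
import math

def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)

def _best_vmax(substrate, velocity, km):
    # For a fixed Km the model is linear in Vmax, so the least-squares
    # estimate of Vmax has a closed form.
    x = [s / (km + s) for s in substrate]
    return sum(v * xi for v, xi in zip(velocity, x)) / sum(xi * xi for xi in x)

def _sse(substrate, velocity, km):
    # Sum of squared residuals at this Km, using the best Vmax for it.
    vmax = _best_vmax(substrate, velocity, km)
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(substrate, velocity))

def fit_michaelis_menten(substrate, velocity, iterations=200):
    """Direct (untransformed) least-squares fit; returns (vmax, km).

    Assumption: the residual sum of squares has a single minimum in Km
    within three decades below and above the tested substrate range.
    """
    a = math.log10(min(substrate) * 1e-3)
    b = math.log10(max(substrate) * 1e3)
    phi = (math.sqrt(5) - 1) / 2
    c = b - phi * (b - a)
    d = a + phi * (b - a)
    for _ in range(iterations):
        # Golden-section search on log10(Km).
        if _sse(substrate, velocity, 10 ** c) < _sse(substrate, velocity, 10 ** d):
            b = d
        else:
            a = c
        c = b - phi * (b - a)
        d = a + phi * (b - a)
    km = 10 ** ((a + b) / 2)
    return _best_vmax(substrate, velocity, km), km

# Synthetic, noise-free example (hypothetical values, arbitrary units).
true_vmax, true_km = 10.0, 2.0
substrate = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
velocity = [michaelis_menten(s, true_vmax, true_km) for s in substrate]
vmax_fit, km_fit = fit_michaelis_menten(substrate, velocity)
print(f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.3f}")
```

Fitting the untransformed data, rather than a linearized double-reciprocal plot, avoids distorting the error structure at low substrate concentrations. With real data, kcat would follow from Vmax divided by the enzyme concentration in the assay.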
For crystallization of AtIPMDH2, the histidine tag was removed by thrombin digestion and the protein was further purified using size-exclusion chromatography (17). Crystals of AtIPMDH2 were obtained in 5 μl hanging drops of a 1:1 mixture of protein and crystallization buffer (0.16 M ammonium sulfate, 0.08 M sodium acetate trihydrate, 20% PEG 4000, 20% glycerol) at 4°C over a 0.7 ml reservoir. Data collection (100 K) was performed at beamline 19-ID at the Advanced Photon Source, Argonne National Laboratory. Diffraction data were integrated and reduced using HKL3000 (18). The structure of AtIPMDH2 was solved by molecular replacement performed with PHASER (19) using the structure of IPMDH from Salmonella typhimurium (20) as a search model. Model building was performed in COOT (21) and all refinements were performed with PHENIX (22). Data collection and refinement statistics are reported in Table 1. The atomic coordinates and structure factors for AtIPMDH2 have been deposited in the Protein Data Bank (PDB ID code 3R8W).
Site-directed Mutagenesis and Mutant Protein Analysis-Site-directed mutagenesis was performed using the QuikChange PCR method (Stratagene). Bacterial expression vectors for each AtIPMDH (11,13) were used as templates with specific oligonucleotide pairs, as follows: AtIPMDH1- GACCCATCTCAGGTCTCAG-3′ (the mutated codon is underlined and the mutation is shown in bold). Mutant protein expression, purification, and assays were performed as described above for the wild-type enzymes.

RESULTS
Differential Expression and AtIPMDH Metabolic Specialization-The three IPMDH genes in Arabidopsis have overlapping, yet distinct expression patterns. AtIPMDH1 (At5g14200) is highly expressed in leaves and roots; AtIPMDH2 (At1g80560) is weakly expressed throughout the plant; and AtIPMDH3 (At1g31180) is constitutively expressed at high levels in all tissues (11,13,23). To test the possible contribution of differential expression to the specialization of AtIPMDHs, each gene was placed under control of the native AtIPMDH1 promoter and then transformed into an atipmdh1 mutant line (11). As shown in Fig. 2, the altered glucosinolate profile of the atipmdh1 mutant could only be rescued by expression of AtIPMDH1. In addition, the atipmdh1 glucosinolate phenotype could not be rescued if expression was driven by either the AtIPMDH2 or the AtIPMDH3 promoter (data not shown). These results indicate that both the AtIPMDH1 sequence and differential expression are important in AtIPMDH specialization.
Structure of AtIPMDH2-To determine the molecular architecture of a plant IPMDH, the 2.25 Å resolution x-ray crystal structure of AtIPMDH2 was solved by molecular replacement (Table 1). There were four molecules in the asymmetric unit representing two AtIPMDH2 dimers. The monomers of each dimer are related by non-crystallographic symmetry. Each AtIPMDH2 monomer consists of two domains (Fig. 3A). Domain 1 contains seven α-helices (α1-4 and α9-11) and five β-strands (β1-3 and β11-12), along with the N and C termini. Four α-helices (α5-8) and seven β-strands (β4-10) comprise domain 2. Between the two domains, β4 and β5 form the interdomain region. The second domain also serves as the dimerization interface, with β6 and β7 of each monomer as part of an inter-subunit β-sheet and α7 and α8 of each monomer forming a four-helix bundle at the dimer interface. The overall structure of AtIPMDH2 is similar to those of the IPMDHs from various bacteria, including Salmonella typhimurium and Thermus thermophilus (20,24,25), with a root mean square deviation of 1.3-1.7 Å over ∼350 residues. Because the plant and bacterial IPMDHs share ∼50% sequence identity, conservation of key residues defines the active site region situated in a cleft between the two domains of each monomer (Fig. 3A).
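The root mean square deviation quoted above is a distance, in Å, averaged over paired atoms. A minimal sketch of the calculation follows, assuming that the two structures have already been superposed and that equivalent Cα atoms have already been paired; the superposition step itself is not shown, and the program used for the comparison in this work is not stated. The function name and the coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired (x, y, z) coordinates.

    Assumes the two coordinate sets are already superposed and that the
    i-th atom of one list is equivalent to the i-th atom of the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between one pair of equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates in Å: every atom is displaced by (3, 4, 0).
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
print(rmsd(model_a, model_b))
```

Because the squared distances are averaged before the square root is taken, the result carries units of Å rather than Å squared.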
The active site (Fig. 3B) is roughly delineated by α8 at the bottom, with α4 of one monomer and α7 of the adjacent monomer forming opposite sides of the site. Within the active site, all of the residues previously identified in structures of bacterial IPMDHs in complex with isopropylmalate and Mg2+ are also conserved in AtIPMDH2 (24,25). Because efforts to obtain a structure of AtIPMDH2 in complex with ligands did not yield crystals, 3-isopropylmalate and Mg2+ were manually modeled into the plant enzyme based on the positions of these ligands observed in the bacterial structures (Fig. 3, B and C) (24,25). This comparison shows that Asp-264* (asterisk denotes a residue from the adjacent non-crystallographic symmetry related monomer), Asp-288, and Asp-292 are positioned to interact with a catalytically essential divalent metal (i.e. Mg2+ or Mn2+) and that a trio of arginines (Arg-136, Arg-146, and Arg-174) is positioned for charge-charge interactions with the substrate carboxylate groups. Although all of these amino acids are invariant in the bacterial and plant IPMDHs involved in leucine biosynthesis, the side-chain corresponding to Leu-133 is replaced with a phenylalanine in AtIPMDH1 (Fig. 3D), which is the isoform previously shown to be primarily involved in glucosinolate synthesis in Arabidopsis (11). Mechanistically, the conversion of 3-isopropylmalate to 4-methyl-2-oxovalerate in leucine synthesis and the conversion of 3-malate derivatives (e.g. 3-(2′-methylthio)ethylmalate) to 2-oxo acids (e.g. 5-methylthio-2-oxopentanoate) in glucosinolate synthesis likely use a common metal-dependent reaction (Fig. 4); however, different substrate side chains of 3-malate derivatives (Fig. 1B) must fit in the plant IPMDH active site for production of aliphatic glucosinolates with six different chain lengths (C3-C8). Thus, we hypothesize that this single amino acid exchange from the leucine found in AtIPMDH2 and AtIPMDH3 to the phenylalanine in the active site of AtIPMDH1 may contribute to the functional divergence of this isoform for glucosinolate biosynthesis.
Biochemical Analysis of Wild-type and Mutant AtIPMDHs-Previous studies on the AtIPMDHs demonstrate that each isoform accepts 3-isopropylmalate as a substrate (11,13), but a kinetic comparison with a glucosinolate pathway substrate has not been reported. Using both 3-isopropylmalate and 3-(2′-methylthio)ethylmalate, the steady-state kinetic parameters for each AtIPMDH were determined (Table 2). Comparison of the catalytic efficiencies shows that AtIPMDH2 and AtIPMDH3 favor 3-isopropylmalate over 3-(2′-methylthio)ethylmalate by 14,900- and 29,600-fold, respectively. Moreover, these isoforms were ∼20-fold more active with the leucine biosynthesis substrate than AtIPMDH1. In comparison, AtIPMDH1 accepts both substrates with comparable kcat/Km values, but was ∼500-fold more efficient with the glucosinolate substrate than the other two isoforms. These catalytic efficiencies agree with the observed in vivo roles of the AtIPMDH isoforms in glucosinolate and leucine synthesis pathways (11)(12)(13).
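The fold preferences quoted above are ratios of catalytic efficiencies (kcat/Km) for the two substrates. The short sketch below shows the arithmetic only. The function names and the kcat and Km numbers are hypothetical placeholders chosen for easy checking; they are not the values reported in Table 2.

```python
def catalytic_efficiency(kcat, km):
    """Catalytic efficiency kcat/Km (for example, in M^-1 s^-1)."""
    if km <= 0:
        raise ValueError("Km must be positive")
    return kcat / km

def fold_preference(efficiency_a, efficiency_b):
    """How many fold substrate A is preferred over substrate B."""
    return efficiency_a / efficiency_b

# Hypothetical parameters for one enzyme with two substrates.
# kcat in s^-1, Km in M; both substrates must use the same units.
eff_substrate_a = catalytic_efficiency(kcat=10.0, km=1e-5)  # tight, fast
eff_substrate_b = catalytic_efficiency(kcat=1.0, km=1e-3)   # weak, slow
print(fold_preference(eff_substrate_a, eff_substrate_b))
```

The same ratio can be taken between two enzymes acting on one substrate, which is how statements such as "∼20-fold more active than AtIPMDH1" are obtained.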
To investigate the significance of the active site difference in the AtIPMDHs, a series of point mutants (AtIPMDH1-F137L, AtIPMDH2-L133F, and AtIPMDH3-L134F) were generated. Kinetic analysis of these mutants demonstrates the critical role of this active site change in determining substrate specificity (Table 2). In AtIPMDH1, substitution of Phe-137 with a leucine reduced the kcat/Km of the mutant for 3-(2′-methylthio)ethylmalate to values comparable to those observed for AtIPMDH2 and AtIPMDH3. This was also accompanied by improved catalytic efficiency with 3-isopropylmalate, as the AtIPMDH1-F137L mutant was only 2- to 3-fold less efficient with this substrate than AtIPMDH2 and AtIPMDH3. The complementary mutation in either AtIPMDH2 (L133F) or AtIPMDH3 (L134F) yielded mutant enzymes that were ∼30-fold less active with 3-isopropylmalate than the corresponding wild-type proteins, but still comparable to wild-type AtIPMDH1. Moreover, AtIPMDH2-L133F and AtIPMDH3-L134F displayed nearly a 100-fold improvement in activity with 3-(2′-methylthio)ethylmalate as a substrate, to kcat/Km values that were 4- and 7-fold less than those observed with AtIPMDH1. In addition, based on the currently available plant sequences in GenBank™, AtIPMDH1 is the only IPMDH with the unique phenylalanine instead of leucine, which is present in all other species (Fig. 5). These results demonstrate the critical role of the residue at position 133 (AtIPMDH2 numbering) in the evolution of AtIPMDH1 for the methionine chain-elongation reactions of glucosinolate biosynthesis.
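Checking which residue each homolog carries at the position equivalent to Leu-133 amounts to reading one column of a multiple sequence alignment, indexed by the ungapped numbering of a reference sequence. A minimal sketch of that lookup follows. The function name, the sequence labels, and the aligned strings are toy placeholders invented for illustration; they are not IPMDH sequences and not the alignment in Fig. 5.

```python
def residues_at_reference_position(alignment, reference, position):
    """Return {name: residue} for the alignment column that holds the
    given 1-based, ungapped position of the reference sequence."""
    ungapped_count = 0
    for column, residue in enumerate(alignment[reference]):
        if residue != "-":
            ungapped_count += 1
            if ungapped_count == position:
                return {name: seq[column] for name, seq in alignment.items()}
    raise ValueError("position lies beyond the end of the reference sequence")

# Toy alignment (equal-length strings, '-' marks a gap). Not real sequences.
toy_alignment = {
    "ref": "MK-LRG",
    "homolog_1": "MKAFRG",
    "homolog_2": "M--LRG",
}
column = residues_at_reference_position(toy_alignment, "ref", 3)
# Names of sequences whose residue differs from the reference at that site.
variants = [name for name, res in column.items() if res != column["ref"]]
print(column, variants)
```

Counting only non-gap characters in the reference keeps the lookup tied to that protein's own residue numbering, which matters here because equivalent positions carry different numbers in the three isoforms (137, 133, and 134).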
In Vivo Analysis of AtIPMDH Mutant Function-To test whether the amino acid substitution that occurred in AtIPMDH1 contributes to its specific function in vivo, atipmdh1 mutant plants were transformed with each of the mutant AtIPMDH genes driven by the AtIPMDH1 promoter. After isolation of homozygous lines, the glucosinolate profile in each mutant was examined. In comparison to the results shown in Fig. 2, the pronounced glucosinolate phenotype in the atipmdh1 mutant could not be rescued by AtIPMDH1-F137L (Fig. 6), indicating that the active site substitution impaired AtIPMDH1 function for glucosinolate synthesis in vivo. In contrast, the glucosinolate phenotype could be restored to the wild-type profile by expression of either AtIPMDH2-L133F or AtIPMDH3-L134F under the control of the native AtIPMDH1 promoter (Fig. 6). The in planta findings corroborate the conclusion drawn from the biochemical analysis of recombinant proteins and provide evidence for the evolution of AtIPMDH1 by gene duplication and a single critical amino acid substitution.

DISCUSSION
The evolution of specialized metabolism from primary metabolism is a common theme across biochemical pathways in plants (and microbes). Here we explored the molecular basis underlying the divergence of biological function in the IPMDHs in the leucine and glucosinolate biosynthesis pathways of Arabidopsis. Although all three AtIPMDHs accept 3-isopropylmalate, AtIPMDH1 is less efficient than the other isoforms (11,13). Previous work also showed that knock-out mutants of AtIPMDH1 result in reduced levels of leucine and the C4 to C8 aliphatic glucosinolates (11). In contrast, knock-out mutations of the other isoforms did not alter glucosinolate levels but reduced leucine content (11,13). Interestingly, a double mutation of AtIPMDH2 and AtIPMDH3 in Arabidopsis plants led to defects in pollen and embryo sac development, suggesting that leucine synthesis is essential for gametophyte formation. Using a combination of structural and functional analysis, this work demonstrates that a single amino acid change in the AtIPMDH active site leads to functional divergence of these enzymes in leucine synthesis (primary metabolism) and aliphatic glucosinolate synthesis (specialized metabolism).
Functional specification of AtIPMDHs in leucine and glucosinolate biosynthesis has been observed (11,13). To evaluate whether altered expression of AtIPMDH isoforms underlies functional specialization, each isoform gene was expressed under control of the AtIPMDH1 promoter in an atipmdh1 mutant background. Because the glucosinolate profile in the mutant was rescued only by expression of AtIPMDH1 (Fig. 2), it appears that gene duplication and subsequent mutation to a new function is the underlying evolutionary mechanism.
The three-dimensional structure of AtIPMDH2 (Fig. 3) and functional analysis (Table 2) of the AtIPMDHs provide insight on the specific changes required to alter the metabolic roles of these enzymes. A common chemical transformation is required to convert 3-isopropylmalate to 4-methyl-2-oxovalerate in leucine synthesis and 3-malate derivatives to 2-oxo acids in glucosinolate synthesis (Fig. 1). The AtIPMDH active site includes invariant residues for binding of either Mg2+ or Mn2+ (Asp-288, Asp-292, Asp-264*) and for charge-charge interactions with the substrate carboxylate groups (Arg-136, Arg-146, and Arg-174). Likewise, Tyr-181 and Lys-232*, which are proposed to perform general acid-base chemistry in the reaction mechanism (26), are conserved. For both 3-isopropylmalate (leucine synthesis) and 3-malate derivatives (glucosinolate synthesis), the overall reaction (Fig. 4) involves oxidation of the alcohol by deprotonation and hydride transfer to NAD+. This is followed by spontaneous decarboxylation, stabilization of the resulting enolate by the metal ion, and protonation to yield the final product.

TABLE 2. Kinetic parameters of wild-type and mutant AtIPMDHs. All reactions were performed as described under "Experimental Procedures." All kcat and Km values are expressed as mean ± S.E. for n = 3. Substrate columns: 3-isopropylmalate and 3-(2′-methylthio)ethylmalate.
Leucine and glucosinolate synthesis require the same chemistry, but the AtIPMDH active site must accommodate reactants with different side-chains (i.e. isopropyl versus elongated methionine side-chain groups). The AtIPMDH2 structure and sequence analysis reveal a single amino acid difference of a leucine (AtIPMDH2 and AtIPMDH3) versus a phenylalanine (AtIPMDH1) in the active site. This difference occurs in the set of residues proposed to form the substrate interaction surface in the bacterial and plant IPMDH (20,24,25). Both in vitro and in vivo functional analysis of AtIPMDH1-F137L, AtIPMDH2-L133F, and AtIPMDH3-L134F demonstrates that switching this amino acid in each isoform is sufficient to interconvert catalytic efficiency (Table 2) and to change the aliphatic glucosinolate profiles in transgenic plants (Fig. 6). These results suggest that gene duplication of AtIPMDH followed by mutation of one active site residue in AtIPMDH1 leads to its specialized role for glucosinolate synthesis in Arabidopsis.

FIGURE 5. Alignment of IPMDH sequences from different plant species. Amino acid residues with high consensus value (90%) are colored in red. The residues interacting with the malate backbone of the substrates are shaded in yellow and residues interacting with the γ moiety of the substrates are highlighted in light blue. The LF residues in AtIPMDH1 and conserved LL residues in other IPMDH homologs are marked with asterisks. Species abbreviations: Bn, Brassica napus (rape); Ca, Capsicum annuum (pepper); Gm, Glycine max (soybean); Os, Oryza sativa (rice); Pp, Physcomitrella patens (moss); Pt, Populus trichocarpa (poplar); Rc, Ricinus communis (castor); Sb, Sorghum bicolor (sorghum); Vv, Vitis vinifera (grape); and Zm, Zea mays (corn).
Sequence alignment of AtIPMDH homologs in other plant species revealed that the phenylalanine is not present in other homologs (Fig. 5), suggesting they may be primarily involved in leucine biosynthesis. As more genomic sequences become available, the broader implications of this amino acid substitution may become apparent. It should be noted that the three AtIPMDHs have overlapping substrate specificity, but with distinct preferences. AtIPMDH1 is primarily involved in methionine chain-elongation of aliphatic glucosinolate biosynthesis. When AtIPMDH1 is mutated, the functions of AtIPMDH2 and AtIPMDH3 in glucosinolate biosynthesis become evident (13). On the other hand, all three AtIPMDHs are involved in leucine biosynthesis, with AtIPMDH2 and AtIPMDH3 exhibiting dominant roles (11,13). This study has uncovered the molecular basis, i.e. a single amino acid substitution, underlying substrate preference and functional divergence of the AtIPMDHs.
The structure-function analysis of the AtIPMDHs provides insight on the molecular basis for altered function, but it is unclear how the leucine to phenylalanine mutation allows AtIPMDH1 to accommodate the growing methionine chain in subsequent iterations of the glucosinolate synthesis reactions (Fig. 1A). Multiple structures of IPMDH from bacteria indicate that the structural features around the active site are flexible and that active site dynamics likely play a role in substrate recognition and catalysis (27). Moreover, the effect of the longer side-chain on the kinetics of the various glucosinolate biosynthesis pathway enzymes (i.e. BCAT, MAM, IPMI, and IPMDH) has not been explored. In Arabidopsis, multiple lines of evidence strongly support the evolution of the methionine chain-elongation process of glucosinolate biosynthesis from leucine biosynthesis (5-8, 11); however, the molecular underpinnings for this evolution are only beginning to be understood. For example, the substrate specialization of the heterodimeric IPMI is determined by which small subunit associates with the large subunit (2,8,10,12). Recently, the changes needed to convert IPMS from leucine synthesis into a MAM were demonstrated to involve the loss of a C-terminal regulatory domain responsible for feedback inhibition by leucine and a series of amino acid mutations (14). In contrast to the large remodeling of protein structure in IPMS and MAM, the change in substrate specificity of IPMDH requires only one amino acid difference.
Interactions between Arabidopsis and its environment may have driven the co-evolution of the pathways needed to synthesize the core glucosinolate structure and the elongation of the methionine side-chain. The biosynthesis of the glucosinolates has been suggested to have evolved from the prevalent system of cyanogenic glucoside biosynthesis (28-30). Evidence for this includes the wide distribution of cyanogenic glucosides in plants and arthropods, and the conservation of cytochrome P450s in the biosynthesis of glucosinolates and cyanogenic glucosides. In addition, metabolic engineering using cytochromes P450 involved in cyanogenic glycoside biosynthesis allows for the generation of acyanogenic plants that also display altered glucosinolate profiles (28-31). It is evident that when environmental challenges such as insect herbivores present themselves, specialization of enzymes from different pathways contributes to the evolution of methionine-derived glucosinolates for plant survival (32)(33)(34).
In summary, we have determined a key molecular change responsible for altering substrate specificity of the IPMDHs in Arabidopsis and the recruitment of an IPMDH from leucine biosynthesis for the specialized synthesis of glucosinolates. Future studies need to explore protein level changes in other glucosinolate enzymes to understand how the entire glucosinolate pathway evolved.