A Computational Protein Structure Refinement of the Yeast Acetohydroxyacid Synthase

A combined molecular modeling and molecular dynamics simulation was carried out to obtain an improved description of the yeast acetohydroxyacid synthase (AHAS) in aqueous solution. After a thorough homology modeling, the AHAS catalytic dimer was subjected to a molecular dynamics (MD) simulation to analyze its behavior and optimize its geometry. The AHAS 3D molecular structure was analyzed according to the number of salt bridges and hydrogen bonds formed. During 20 ns of MD simulation, an average fluctuation of 3.9 Å was obtained. The cofactor thiamine diphosphate makes a relevant contribution to the system stability; this hypothesis was confirmed by the decrease in the average fluctuation of 0.3 Å. Moreover, the Ramachandran plot revealed no denaturation framework during the time of the simulation.


Introduction
The experimental methods for determining the three-dimensional structure of biomolecules are essential for the study of their structure and, therefore, their functions. 1,2 Presently, due to the technological development and improvement of computers, the amount of structural data is increasing, especially for proteins. 1,3,4 However, several problems in experimental methods are still obstacles to the experimental determination of structures; this is particularly true in the application of X-ray diffraction crystallography (XRD), one of the main techniques for the determination of macromolecular structures. Problems related to protein purification, crystallogenesis, vitrification and harvesting of the crystal, obtaining phase information, structure refinement, and flexibility and/or disorder of parts of the molecules in the crystal can directly affect the 3D structure description process. 1,36][7] In this context, the advances in computing and programming areas in recent decades have resulted in the creation of tools that make the generation of high quality structural models possible. 80][11]
AHAS is the first enzyme in the biosynthetic pathway of branched chain amino acids (valine, leucine and isoleucine) present in plants, seaweed, fungi, bacteria and archaea but not in animals. For this reason, the enzymes involved in the biosynthesis of such amino acids, especially AHAS, are potential targets in the development of agrochemicals (herbicides, fungicides and antimicrobial compounds). 9,12 The molecular structure of AHAS involves two distinct subunits: (i) the catalytic subunit, which has all the apparatus necessary for the catalysis; and (ii) the regulatory subunit responsible for modulating the activity of the catalytic subunit. 10 During its activity, AHAS demands the following cofactors: thiamine diphosphate (ThDP), flavin adenine dinucleotide (FAD), a magnesium ion (Mg2+) in the active site 10,[12][13][14] and, generally, a potassium ion (K+) to guarantee structural stability. 11,150][11]
The dimeric structure described for the AHAS of Saccharomyces cerevisiae (protein data bank (PDB) 1JSC) 13 includes some missing residues, namely, loop regions of undetermined position, giving a disordered conformation to these regions. Thus, the present study involves the use of molecular modeling and molecular dynamics (MD), as well as available experimental data for the enzyme, to propose a full atomic three-dimensional structure description of the yeast AHAS enzyme functional dimer.
Vol. 26, No. 8, 2015

Methodology
Describing the molecular structure of the AHAS enzyme
The amino acid sequence of the AHAS (S. cerevisiae) was obtained online from the protein database of the National Center for Biotechnology Information (NCBI). The missing residues were added based on homology modeling protocols: (i) identification of proper templates by searching the structural database of protein sequences in the PDB 4 using the basic local alignment search tool (BLAST) server 20 and protein BLAST (BLAST-P) algorithm; 21 (ii) selection of templates based on the following criteria: highest percentage of sequence identity and similarity, cofactor integrity and proportion of missing residues on the disordered loops of AHAS molecular structure; (iii) alignment of the selected templates and target amino acid sequences as well as determination of three-dimensional AHAS structure by applying the Swiss-PDB viewer program; 22 (iv) obtaining the AHAS missing loop structure by the SWISS-MODEL server; [22][23][24] and (v) insertion of the cofactors according to the atomic coordinates of the templates.
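The sequence identity criterion in step (ii) can be illustrated with a minimal sketch. The function name and the toy alignment below are hypothetical and are not taken from the study. Conventions differ on whether gap columns count in the denominator; this sketch excludes them, so its values may differ from those reported by BLAST.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over the non-gap columns of a pairwise alignment.

    Both strings must come from the same alignment (equal length, '-' for gaps).
    Assumption: columns containing a gap in either sequence are ignored.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences, for illustration only)
print(percent_identity("ACDE-F", "ACDQGF"))
```

In practice, identity is only one of the criteria listed above; cofactor integrity and the proportion of missing residues still have to be checked in the template structures themselves.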

Molecular dynamics simulation
To evaluate and refine the AHAS molecular structure after all adjustments, an MD simulation was carried out to provide preliminary insights into AHAS behavior in aqueous solution. To this end, three input files were prepared, each in a box of transferable intermolecular potential three-point (TIP3P) 25 water, to evaluate their behavior and interaction energies. The systems are summarized in Table 1 and shown in Figure 1.
All MD simulations were performed by means of the nanoscale molecular dynamics (NAMD) 2.7 program 26,27 using the following protocol: (i) a minimization step at constant number of particles, system volume and temperature (NVT ensemble), where the system temperature was gradually increased in steps of 100, 200 and 298 K until the energy stabilized; (ii) a cutoff distance of 12 Å was used, and long-range corrections were considered using the Ewald sum formalism; 28 (iii) in the equilibration step, a Langevin piston 29 and thermostat were applied at 1.0 bar and 310 K, respectively, using the isothermal-isobaric ensemble with constant number of particles (NpT) for 20 ns. The force field parameters for all simulated systems were taken from the SwissParam web server 30 and added to the "chemistry at Harvard macromolecular mechanics" 27 (CHARMM27) 31 force field. The results were analyzed using visual molecular dynamics (VMD) 32 and Grace 33 programs.

System evaluation
To evaluate the structural quality of the simulated systems, a Ramachandran diagram 34 was plotted. A Ramachandran diagram assessment is based on permitted values of dihedral angles φ (phi) and ψ (psi) of residues in the protein structure. Three parameters were analyzed: (i) accuracy of the atomic coordinates of the AHAS molecular structure; (ii) denaturation of the AHAS during the MD simulation timescale; and (iii) impact of the inserted loops on the AHAS molecular structure. Ramachandran diagrams were calculated using the torsion angle plot program 35 at the StrucTools online server. 36
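As an illustration of what the Ramachandran assessment measures, the following minimal sketch computes φ and ψ for one residue from backbone atom coordinates. It is not the StrucTools implementation; the function names and argument layout are assumptions made for this example. It uses the standard definitions: φ is the dihedral C(i-1)-N(i)-Cα(i)-C(i) and ψ is the dihedral N(i)-Cα(i)-C(i)-N(i+1).

```python
import numpy as np


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points in 3D."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))


def phi_psi(c_prev, n, ca, c, n_next):
    """Backbone torsions of residue i from five backbone atom positions."""
    phi = dihedral(c_prev, n, ca, c)   # C(i-1)-N(i)-CA(i)-C(i)
    psi = dihedral(n, ca, c, n_next)   # N(i)-CA(i)-C(i)-N(i+1)
    return phi, psi
```

Each (φ, ψ) pair is one point in the diagram; the first and last residues of a chain lack one of the two angles and are normally omitted.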

Results and Discussion
For a preliminary evaluation and validation of the AHAS refined model, and to characterize its behavior in aqueous solution, the structure and the trajectories of the three systems, obtained as detailed in Methodology, were analyzed and are presented below.

Molecular modeling
Most studies reported in the literature involving AHAS three-dimensional structures refer to the dimeric form. 9,11,37 The crystallographic structure of chain A from S. cerevisiae AHAS (PDB 1JSC) 13 and also chain A from S. cerevisiae AHAS in complex with the herbicide chlorimuron-ethyl (PDB 1N0H) 38 were 100% identical with the target sequence in BLAST, being highly rated according to this algorithm. The chain A structure from Arabidopsis thaliana AHAS (PDB 1Z8N), 39 considered one of the most representative structures of the enzyme, 9 exhibited 43% identity with the target sequence. Nevertheless, the template selection was based not only on the results presented by BLAST but also on the criteria concerning the integrity of cofactors and the proportion of residues lost in the crystallographic structure. Therefore, the structures chosen for modeling were PDB 1JSC, the only structure free from herbicides and with intact cofactors, and PDB 1Z8N, which shows a minimal proportion of lost residues in the XRD structure determination. The S. cerevisiae AHAS dimer was built by considering that the geometry of the association of its two monomers is similar to the dimeric structure presented in the PDB 1JSC template.
The alignment between the selected template sequences and the S. cerevisiae AHAS was important to the construction of atomic coordinates of the conserved regions in the AHAS model. Figure 2a shows the superposition (structural alignment) of the templates PDB 1JSC (green) and PDB 1Z8N (purple). Because the enzyme core region is more conserved, the superposition was attained successfully. This result was then used to build a preliminary model of the molecular structure of the AHAS (Figure 2b). However, as highlighted in Figure 2b, several interruptions in the polypeptide chain are present, and these interruptions correspond to external variable regions from the templates. Consequently, no structural superposition for these regions was feasible.
The variable regions were modeled with the SWISS-MODEL server, which completed the missing loops between the following anchor residues: serine (SER) 188-alanine (ALA) 198 (α), glutamic acid (GLU) 497-SER514 (β), asparagine (ASN) 795-threonine (THR) 801 (γ), ALA823-valine (VAL) 827 (δ), ALA919-isoleucine (ILE) 927 (ε), proline (PRO) 964-arginine (ARG) 968 (ζ) and phenylalanine (PHE) 1114-histidine (HIS) 1117 (η) (Figure 3b). In the end, the coordinates of the complete AHAS model were obtained with a .pdb extension. Based on the coordinates of the template structure PDB 1JSC, the cofactors were directly inserted into the .pdb file. The complete model obtained is shown in Figure 3a.

Molecular dynamics simulation
After obtaining the simulated trajectories for the three systems described (Figure 1), some major parameters were analyzed to elucidate the behavior of the systems in solution. First, the evolution of the interaction energy (Coulomb and van der Waals) between the protein and water was monitored as a function of time. Table 2 shows that no significant disparities were observed in the energy average values between the systems. The energy values reached their equilibrium at a temperature of 310 K and pressure of 1 bar. The temperature used in the simulations is in the range in which the enzyme is active, according to data available in the enzymatic repository, Braunschweig enzyme database (BRENDA), 40,41 from 30 to 40 °C. 42
To characterize the stability of the system, both hydrogen bonds between AHAS and water molecules and electrostatic interactions between oppositely charged amino acids (salt bridges) were analyzed. For the hydrogen bond analysis, a cutoff distance of 3.0 Å and a cutoff angle of 20° between donor and acceptor atoms were used. 43,44 For the salt bridge analysis, a cutoff distance of 3.2 Å between the oxygen atoms of acidic residues and the nitrogen atoms of basic residues was employed. 45,46 Both hydrogen bonds and the formation of salt bridges are presented in Table 3.
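The geometric criteria above can be sketched as two simple tests. This is an illustrative sketch, not the analysis tool used in the study. The cutoff values are taken from the text, but the function names are hypothetical, and the angle convention is an assumption: the 20° cutoff is interpreted here as the maximum deviation of the donor-hydrogen-acceptor angle from 180°, with the distance measured between the donor and acceptor heavy atoms.

```python
import numpy as np

HBOND_MAX_DIST = 3.0        # Å, donor-acceptor distance cutoff (from the text)
HBOND_MAX_ANGLE = 20.0      # degrees; interpreted as deviation from linearity (assumption)
SALT_BRIDGE_MAX_DIST = 3.2  # Å, O(acidic)-N(basic) distance cutoff (from the text)


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     max_dist=HBOND_MAX_DIST, max_angle=HBOND_MAX_ANGLE):
    """True if the donor-H...acceptor geometry satisfies both cutoffs."""
    d, h, a = (np.asarray(p, dtype=float) for p in (donor, hydrogen, acceptor))
    if np.linalg.norm(a - d) > max_dist:
        return False
    v1, v2 = d - h, a - h
    cos_dha = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    angle_dha = np.degrees(np.arccos(np.clip(cos_dha, -1.0, 1.0)))
    return bool(180.0 - angle_dha <= max_angle)


def is_salt_bridge(acid_oxygens, base_nitrogens, max_dist=SALT_BRIDGE_MAX_DIST):
    """True if any acidic O and any basic N lie within the distance cutoff."""
    o = np.asarray(acid_oxygens, dtype=float).reshape(-1, 3)
    n = np.asarray(base_nitrogens, dtype=float).reshape(-1, 3)
    dists = np.linalg.norm(o[:, None, :] - n[None, :, :], axis=-1)
    return bool((dists <= max_dist).any())
```

Counting the pairs that pass these tests in every trajectory frame, and averaging over frames, gives totals of the kind reported in Table 3.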
Concerning the hydrogen bonds, an average of 809, 794 and 777 bonds were formed between the enzyme and water molecules in systems 1, 2 and 3, respectively. The initial value for hydrogen bonds was smaller compared with the final value for all systems analyzed because the systems do not interact effectively with water molecules at the beginning of the simulation. As the simulation proceeds, the number of hydrogen bonds increases as a consequence of the system solvation.
As expected, the aspartic acid (ASP), GLU and lysine (LYS) residues were the main contributors to the hydrogen bond formation in the three simulated systems because ASP and GLU are negatively charged, and LYS is positively charged. As a result, the side chains of these amino acids can be stabilized by forming hydrogen bonds in aqueous solution. Cysteine (CYS), ILE and tryptophan (TRP) residues contribute much less to the formation of hydrogen bonds due to their hydrophobic character and their being more internalized in the structure of the enzyme.
An important observation in systems 2 and 3 was the number of hydrogen bonds between AHAS and the FAD cofactor. Although the function of this cofactor in the structure and in the enzymatic catalysis is still not clear, 9,11,37 the cofactor's atomic coordinates and interactions with the enzyme residues are well defined. In the simulations, an average of 10 hydrogen bonds was observed between FAD and AHAS for both systems 2 and 3. Pang et al. 13 determined the position of the FAD cofactor in the crystallographic structure of AHAS from S. cerevisiae, in which it is strongly associated with the enzyme through 12 hydrogen bonds established with specific residues that form the binding site of this cofactor. McCourt and Duggleby 9 reviewed all structural information available on the AHAS enzyme (bacterial, fungal and plant) and reported that FAD is linked to the enzyme through 7 hydrogen bonds and another 42 non-bonded interactions. Thus, the average number of hydrogen bonds between AHAS and FAD obtained in the simulation trajectory lies within the range reported in the literature (7 to 12), which can be used as a validation parameter for the final model.
Concerning the salt bridge formation, the initial and final values are close to each other during the entire trajectory, although, in the initial nanoseconds of the simulations of systems 2 and 3, a higher number of salt bridges was observed, stabilizing the initial structure of the protein.
A decrease in the number of these salt bridges was observed in the simulation, which could be attributed to the interactions between water molecules and the charged amino acids (ASP, GLU and LYS). Observing the progression in time of the average number of hydrogen bonds for these systems (Table 3), the number of salt bridges was inversely proportional to the number of hydrogen bonds, which could indicate the influence of the solvation of the molecule. Franca et al. 46 obtained similar results with simulations of the acetyl-CoA carboxylase, in which a decrease in the number of salt bridges was observed during the simulation. This decrease could be attributed to the formation of new hydrogen bonds by charged amino acids, causing a reduction in the number of salt bridges and fluctuations in the three-dimensional structure. These fluctuations were evaluated for all systems along the timeline using the root mean square deviation (RMSD) of the positions of the α carbons in the initial and final structures, presented in Figure 4.
According to Figure 4, system 2 has the highest conformational fluctuation. In system 2, the RMSD varies considerably, with a fluctuation of approximately 3.9 ± 0.3 Å. In contrast, systems 1 and 3 have a smaller conformational variation than the variation shown in system 2, reaching a value of 3.6 ± 0.2 Å after 6 ns.
The lower stability observed in system 2 may be attributed to two factors: (i) insufficient simulation time for the system to reach the required stability, and (ii) the absence of the ThDP cofactor in the simulations, which plays a role in the stabilization. These factors, especially factor (ii), can be studied better through the system 3 simulation, which involves the polypeptide chain of AHAS with all cofactors (FAD, ThDP and the K+ and Mg2+ ions). This system presents smaller conformational fluctuations; after 6 ns of simulation, some variations occurred at approximately 3.5 ± 0.2 Å. Thus, the data presented by this system suggest the importance of the ThDP cofactor in the AHAS enzymatic stability. In the absence of ThDP and the presence of the Mg2+ ion, a greater variation was verified, probably because the Mg2+ has an anchor function towards the ThDP.
For the modeled loops, an average fluctuation of 11.4 ± 0.6 Å was observed for the three evaluated systems in the protein structure. Because the loops are externally located, their greater flexibility compared with the protein core regions can be attributed to solvation effects. In this context, the RMSD is in agreement with the expected values.
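The RMSD values discussed above follow from a simple calculation, sketched below for illustration. The function names are hypothetical. The sketch assumes that every frame has already been superimposed on the reference structure (the text does not state how, or whether, frames were aligned) and that the coordinates passed in are those of the selected atoms only, for example the α carbons or the atoms of one modeled loop.

```python
import numpy as np


def rmsd(coords, reference):
    """RMSD (same length unit as the input) between two (N, 3) coordinate sets."""
    coords = np.asarray(coords, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if coords.shape != reference.shape:
        raise ValueError("coordinate arrays must have the same shape")
    squared_dist = np.sum((coords - reference) ** 2, axis=-1)
    return float(np.sqrt(squared_dist.mean()))


def rmsd_series(trajectory, reference):
    """RMSD of every frame of a (frames, N, 3) trajectory against one reference."""
    return np.array([rmsd(frame, reference) for frame in trajectory])


def mean_and_std(values, times, start_time):
    """Mean and standard deviation of values recorded at or after start_time."""
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    window = values[times >= start_time]
    return float(window.mean()), float(window.std())
```

A summary such as "3.6 ± 0.2 Å after 6 ns" corresponds to calling `mean_and_std` on the RMSD series with `start_time` set to 6 ns, although the exact averaging window used in the study is not stated.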

System evaluation
The initial and final conformations (2 and 20 ns) of the three simulated systems were analyzed in terms of the φ and ψ torsion angles in the Ramachandran diagram (Figure 5). The evolution of this graphic indicated that considerable changes between the torsion angles φ and ψ did not occur after 20 ns of simulation. Despite the presence or absence of the cofactors, no denaturation was observed in the AHAS model.
Regarding the quality of the atomic coordinates, the evaluation of the 1136 enzyme residues showed that a great portion of the residues concentrated in the most favorable (core) and additionally allowed regions for all systems, whereas few residues were located in the generously allowed and disallowed regions. In the diagrams presented, the MD simulation can be observed to change the position of several residues from generously allowed and disallowed regions to allowed regions. However, given the motion in the enzymatic structure in the aqueous environment, other residues were able to move to allowed and disallowed regions over time. By analyzing the interrupted regions after inclusion of missing residues and 20 ns of MD minimization and equilibration, it is possible to infer that the fluctuation range within acceptable values refers to external protein regions. Therefore, the model presented is in agreement with a full atomic model description for the proteins, and the model presented is suggested as an addition to an online protein repository. 8][49]
Thus, the templates used were submitted for evaluation by the server StrucTools, aiming to obtain data for comparison. The residues from the 1JSC and 1Z8N templates concentrated almost completely into the most favorable and additionally allowed regions, presenting only two residues in the generously allowed regions and none in disallowed regions. Therefore, the residues in the most favorable and allowed regions of all three systems are presumed to be in agreement with the values of the templates used. Nevertheless, the three systems showed an average of three residues in the disallowed regions, which is not justified by the quality of the templates used and deserves further investigation.

Conclusions
An improvement in the determination of the AHAS atomic coordinates was obtained by choosing accurate template files; as a result, the position of all missing crystallographic residues could be determined. The model quality was in agreement with the crystallographic template structures, therefore indicating that no damage to the AHAS three-dimensional structure occurred after loop modeling. Despite this agreement, some residues of the AHAS model occupied unfavorable regions in the Ramachandran plot. This result can be explained by the presence of variable regions in the template files. In both the presence and the absence of the cofactors, no denaturation evidence was verified. The presence of all cofactors confers stability onto the AHAS molecular structure, whereas the absence of the cofactors leaves empty space in an important AHAS region, consequently conferring instability onto the model.
In summary, the structural refinement of the AHAS enzyme was obtained according to its accuracy, and a well-coordinated atomic position was obtained. The molecular modeling used in the study provided structural characterization of specific and highly disordered regions of protein loops. ][52]

Figure 2. (a) Superposition of the template structures PDB 1JSC (green) and PDB 1Z8N (purple).(b) Preliminary model obtained from the alignment of the template and the target sequences (chain A shown in green and chain B shown in orange).The interruptions in the polypeptide chain are highlighted, indicating the absence of some loops.

Figure 4. Time evolution of the RMSD of AHAS atoms from the initial structure.

Table 1. Description of the simulated systems