Data for molecular dynamics simulations of B-type cytochrome c oxidase with the Amber force field

Cytochrome c oxidase (CcO) is a vital enzyme that catalyzes the reduction of molecular oxygen to water and pumps protons across mitochondrial and bacterial membranes. This article presents parameters for the cofactors of ba3-type CcO that are compatible with the all-atom Amber ff12SB and ff14SB force fields. Specifically, parameters were developed for the CuA pair, heme b, and the dinuclear center that consists of heme a3 and CuB bridged by a hydroperoxo group. The data includes geometries in XYZ coordinate format for cluster models that were employed to compute proton transfer energies and derive bond parameters and point charges for the force field using density functional theory. Also included are the final parameter files that can be employed with the Amber leap program to generate input files for molecular dynamics simulations with the Amber software package. Based on the high resolution (1.8 Å) X-ray crystal structure of the ba3-type CcO from Thermus thermophilus (Protein Data Bank ID number PDB: 3S8F), we built a model that is embedded in a POPC lipid bilayer membrane and solvated with TIP3P water molecules and counterions. We provide PDB data files of the initial model and the equilibrated model that can be used for further studies.

bilayer membrane and solvated with TIP3P water molecules and counterions. We provide PDB data files of the initial model and the equilibrated model that can be used for further studies. &

Value of the data
The XYZ coordinates of the cluster models of the cytochrome c oxidase (CcO) cofactors can serve as starting points for further electronic structure calculations.
Additional electronic structure calculations can be employed to derive charges and bond parameters for different redox states of the CcO cofactors.
The Amber force field parameters are suitable to set up MD simulations of CcO in the parameterized redox states of the cofactors.
The PDB coordinates of CcO embedded in lipid bilayer membrane and water are useful as starting point for additional simulations with different protonation states or mutations.

Data
The data provided in this article were generated for molecular dynamics (MD) simulations of ba 3type cytochrome c oxidase (CcO) of Thermus thermophilus [1]. Atomic coordinate data of cluster models of CcO cofactors are provided in XYZ coordinate format and atomic coordinate data of the entire CcO model embedded in POPC lipid bilayer and solvated with TIP3P water molecules and counterions are provided in PDB file format. Force field parameter data compatible with the Amber ff12SB or ff14SB force fields are provided in standard Amber OFF library and parameter format for the Cu A pair including coordinating side chains; heme b; and the dinuclear reaction center (DNC), which consists of heme a 3 , Cu B and coordinating residues. Scripts for automated building of the enzymemembrane system with the leap program of the Amber software package are also provided.

Model preparation (PDB data file generation)
The CcO model that was employed for molecular dynamics simulations in Ref. [1] is based on the high resolution (1.8 Å) X-ray crystal structure of the ba 3 -type CcO from Thermus thermophilus in the reduced state in lipidic cubic phase (LCP) crystal as obtained from the Protein Data Bank (PDB: 3S8F) [2]. This crystal structure contains two oxygen atoms between the Fe and Cu centers of the DNC, which have been proposed to be a bridging hydroperoxide [3]. Our model keeps this bridging hydroperoxide and uses oxidation states of the cofactors (see Section 3) in which the heme b and Cu A pair are reduced, that is Fe b 2 þ and Cu A þ -Cu A þ , while the DNC (heme a 3 and Cu B ) is fully oxidized with Fe a3 3 þ and Cu B 2 þ . This model corresponds to state 6 in the CcO reaction cycle presented by Noodleman and coworkers [4,5].
The Hþ þ software [6] was used to determine the protonation states of titratable residues. H þ þ cannot recognize non-standard complex ligands such as heme automatically. We therefore manually supplied PQR files (PDB þcharge þradius) with appropriate charges. For standard residues, the charges from the Amber ff12SB force field were used. The charges of cofactors were obtained through QM calculations as described in Section 3 of this article. General atomic van der Waals radii were used (C: 1.70 Å; N: 1.55 Å; Fe: 1.00 Å; O: 1.50 Å; Cu: 1.40 Å; H: 1.20 Å). The pH was set to 6.5. All Arg, Tyr and Lys residues are predicted to be protonated with high pKa values (Arg 411.7, Tyr 48.9, Lys48.7). All but two Glu residues are predicted to have pKa values o5.4, while Glu203 and Glu131 II have pKa values of 7.0 and 6.7, respectively. We have assumed that all Glu residues are in their standard, deprotonated state. All Asp residues apart from Asp372 have pKa values o4.1 and are thus deprotonated, while Asp372 is protonated. Most His residues have rather low pKa values while His462, His552, and His8 II have pKa values around 6.7 and His376 has a pKa 412. The PDB files provided here have all His residues in their default protonation state (singly protonated on the ε-nitrogen atom), including His376. This corresponds to protonation state A in Ref. [1]. The lowest energy protonation state would have a doubly protonated His376 (state D in Ref. [1]).
The protein including 189 X-ray crystal water molecules was inserted into a palmitoyl-oleoylphosphatidylcholine (POPC) bilayer of 180 lipid molecules and solvated in 0.15 M KCl solution with 18,866 water molecules, followed by removal of K þ ions to neutralize the system. The CHARMM-GUI membrane builder [7], VMD [8], and the AmberTools [9,10] charmmlipid2amber.py program were used as described in Ref. [1]. The resulting PDB data that is provided with this article contains coordinates for protonation state A of Ref. [1]. We provide both the initial coordinates for our model that are based on the crystal structure (cco-lipid-ion-water-crystal.pdb) and coordinates obtained through equilibration using an MD protocol (see Ref. [1]) with the newly developed parameters (ccolipid-ion-water-equilibrated.pdb). Other protonation states or mutants can be easily generated by appropriate renaming of residues in the PDB files. The example build scripts that are provided with this article can be employed by the AmberTools leap program to set up input files for MD simulations with Amber. The build scripts generate files with the Amber Lipid14 force field [11] in combination with the Amber ff12SB force field [12], TIP3P parameters for water [13], the Joung/Cheatham ion parameters [14], and the parameters for non-standard residues that are provided with this article.

Parameter derivation for CcO cofactors (cluster model XYZ data, Amber parameter data)
Force field parameters for the Cu A pair (cua-pair.prm), heme b (heme.prm), and the DNC heme a 3 and Cu B (heme.prm, dnc.prm) were obtained as described in Ref. [1] and detailed below. Parameters for heme b including bonded parameters to the coordinating histidine residues and Lennard-Jones parameters for Cu are based on literature data [15,16]. Parameters for heme a 3 are derived from these heme b parameters and combined with GAFF [17] parameters for the geranyl-geranyl tail and the formyl group. GAFF parameters are also used for the bridging hydroperoxide in the DNC and the special C-N bond between His233 and Tyr237. Harmonic bond and angle constraints are used between all metal centers and the ligands, with values that are based on the crystal structure for the Cu A pair and based on geometries optimized for cluster models with density functional theory (DFT) for the DNC.

DNC cluster model geometry optimization
DFT geometry optimizations were performed with the ADF program [18][19][20] using the OLYP exchange-correlation functional [21] and double-and triple-zeta polarized (DZP and TZP) Slater type basis sets (TZP on the metal atoms and DZP on all other atoms) from the ADF basis set library [22]. The COSMO model [23][24][25] with a dielectric constant of ε ¼18.5 was used to approximate the effect of the protein environment. The geometry optimizations of the DNC employed the same 205 atom cluster model (opt-DNC-full-model.xyz) used previously by Noodleman et al. [4] It contains heme a 3 , the side chain of the axial His384 ligand to heme Fe a3 , the bridging hydroperoxide, Cu B and the side chains of its three coordinating ligands His283, His282 and the special His233 that is covalently linked to Tyr237, as well as the side chains of the three residues Arg449, doubly protonated His376 (positively charged, þ1) and protonated Asp372 above the DNC, Gly232 including backbone atoms linked to His233, and 7 water molecules. The low spin Fe 3 þ and Cu 2 þ centers of the DNC were antiferromagnetically coupled and the positions of the link atoms were constrained to the crystal structure coordinates during the DFT geometry optimizations.

Cofactors charge derivation
Charges for the cofactors were computed with DFT using the OLYP potential and an electrostatic potential (ESP) fit approach as implemented in the SCRF module [26,27] of ADF. For the Cu A pair we employed a 36 atoms cluster model (charge-derivation-cua-pair.xyz). It includes the two Cu atoms and the side chains of the coordinating histidine and cysteine residues His114 II , His157 II , Cys149 II , and Cys153 II . For heme b we used a 97 atoms cluster model (charge-derivation-hemeb.xyz) that includes both the heme b and the side chains of its two axial histidine ligands, residues His72 and His386. For the DNC, we employed the coordinates from the geometry optimized cluster model, added the geranyl-geranyl tail and removed residues Arg449, His376, Asp372 and Gly232. We generated two different charge sets for the DNC, one with deprotonated PRAa 3 (192 atoms, charge-derivation-DNC.xyz), and one with protonated PRAa 3 (193 atoms, charge-derivation-DNC-PRAa 3 H.xyz).
The ESP charges are used both for the cofactors and the coordinating residues, which thus carry charges that differ from the standard AMBER ff12SB charges. Charges on symmetry equivalent atoms were averaged and all charges were scaled uniformly to yield the correct integer charge. Data with residue definitions and these optimized charges are contained in the Amber OFF format library files for the Cu A pair (cua-pair.lib), heme b (hemeb.lib), and the DNC (dnc.lib).

Acknowledgments
We gratefully acknowledge financial support by NIH (Grant R01 GM100934) and the China Scholarship Council (CSC) (No. 201306820006) for funding a two-year research visit of LY in the USA. We also acknowledge the support and help for LY by Professor Sanguo Hong, Ning Zhang, and Hongming Wang of the Department of Chemisty, Institute for Advanced Study, Nanchang University, China. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1053575. We appreciate computer time from the San Diego Supercomputer Center through award TG-CHE130010 to AWG. We also thank the Scripps Research Institute for computational resources. This work was also supported in part through computer time provided by the NIH National Biomedical Computation Resource (NBCR) (Grant P41 GM103426) and Exxact Corporation.