Computational refinement of spectroscopic FRET measurements

This article supplies raw data related to a research article entitled “Joint refinement of FRET measurements using spectroscopic and computational tools” (Kyrychenko et al., 2017) [1], in which we demonstrate the use of molecular dynamics simulations to estimate FRET orientational factors in a benchmark donor-linker-acceptor system of enhanced cyan (ECFP) and enhanced yellow (EYFP) fluorescent proteins. This can improve the recalculation of donor-acceptor distance information from single-molecule FRET measurements.


Type of data
MD simulation setup and force field parameterization of ECFP and EYFP are given. MD sampling performed in the absence and in the presence of 0.1 M buffering ions of Na⁺ and Cl⁻ revealed that the increased ionic strength had no effect on the equilibrium distance distributions between the ECFP and EYFP moieties.
The distribution of the FRET orientational factor, κ², estimated from MD simulations, is found to correspond well to the theoretical curve for an ideal, unrestricted, isotropic re-orientation of dipoles.
The effect of the standard assumption of a single value of the Förster radius R_0 on the accuracy of the donor-to-acceptor distance distributions is examined. Both apparent distributions, calculated from either MD-generated or experimentally measured FRET efficiencies, systematically overestimate the width of the true distance distribution R_DA.

Data
MD simulations for a benchmark donor-linker-acceptor system composed of enhanced cyan (ECFP) and enhanced yellow (EYFP) fluorescent proteins were used to estimate FRET orientational factors, which can improve the recalculation of donor-acceptor distance information from sm-FRET measurements. ECFP and EYFP proteins were linked with a flexible unstructured Gly/Ser peptide linker composed of 3 repeating units of GlyGlySerGlyGlySer (GGSGGS) as shown in Fig. 1.

MD sampling of ECFP-l3-EYFP by multiple discrete MD simulations
Proper conformational sampling of the l3 linker is essential for the accurate analysis of MD simulations of ECFP-l3-EYFP. In general, the convergence of MD sampling of long-chain polypeptides is challenging, because they have multidimensional energy surfaces characterized by many local minima separated by potentially high free-energy barriers, which can lead to kinetic trapping. In our system, however, we did not expect MD sampling to become restricted to a series of localized metastable conformations, because the flexible linker connecting two rigid FPs is not expected to create high energy barriers. Nevertheless, to ensure sufficient conformational sampling, we carried out a series of multiple discrete MD runs, each initiated with random initial structures. Such an approach has been shown to enhance conformational sampling [2][3][4][5].
To investigate the role of starting configurations in the equilibrium structure of ECFP-l3-EYFP, we used a series of four independent parallel MD simulations, each of which was carried out with a different initial conformation of the l3 linker. Fig. 2 shows MD snapshots of the initial and final conformations of ECFP-l3-EYFP for the four different MD runs, referred to as runs 1-4. In runs 1-2, ECFP-l3-EYFP was constructed with l3 in the extended conformation, so that the initial value of the donor-to-acceptor distance R_DA was about 90 Å. In runs 3-4, the ECFP and EYFP moieties were set in different orientations to avoid local conformational trapping during MD sampling. The large-scale parallel MD simulations of runs 1-4 revealed that in all studied systems R_DA converged to a plateau at ~42-46 Å (Fig. 3). From these MD runs, the last 100-200 ns were used for the analysis, during which the average orientation factor approached the theoretical value for unrestricted diffusion, confirming that the time range of our simulations is sufficient for proper conformational sampling of the studied system.
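The convergence check described above, in which independent runs started from different conformations should reach the same R_DA plateau, can be sketched as follows. This is only an illustration on synthetic traces, not the MD data: the relaxation model, the noise level, the plateau value of 44 Å, and the starting distances other than ~90 Å are hypothetical choices made for the example.

```python
import math
import random

def tail_mean(trace, fraction=0.5):
    """Mean of the last `fraction` of a time series (the analysis window)."""
    start = int(len(trace) * (1.0 - fraction))
    tail = trace[start:]
    return sum(tail) / len(tail)

def synthetic_trace(r_start, r_plateau, n_frames, tau, noise, rng):
    """Toy exponential relaxation towards a plateau with Gaussian noise.

    This stands in for an R_DA(t) time trace; it is NOT simulation data.
    """
    return [
        r_plateau + (r_start - r_plateau) * math.exp(-i / tau) + rng.gauss(0.0, noise)
        for i in range(n_frames)
    ]

rng = random.Random(0)
# Hypothetical starting distances in angstrom (only ~90 A is quoted in the text).
starts = [90.0, 90.0, 60.0, 30.0]
traces = [synthetic_trace(r0, 44.0, 2000, 150.0, 1.5, rng) for r0 in starts]

# Compare the plateau estimates of the independent runs.
plateaus = [tail_mean(t, fraction=0.5) for t in traces]
spread = max(plateaus) - min(plateaus)
print([round(p, 1) for p in plateaus], round(spread, 2))
```

A small spread between the tail averages of runs started from very different distances is the kind of evidence the text refers to; a large spread would point to kinetic trapping.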

Effect of ionic strength
Most MD simulations were carried out in water with sodium ions added to counterbalance the protein charge. To estimate the role of ionic strength in the conformational equilibrium of ECFP-l3-EYFP, a control MD simulation (100 ns) was also performed in the presence of 0.1 M buffering ions of Na⁺ and Cl⁻ (Fig. 4A), which revealed only a small effect of ionic strength on the equilibrium distances between the bulk ECFP and EYFP moieties (Fig. 4B).

Orientation factor κ²
The orientation factor κ² describes how the interaction between two electric dipoles depends on their orientations, as shown in Fig. 5. κ² can be defined as shown in Fig. 5B. To calculate the instantaneous κ²(t) from MD trajectories, the angles θ_D and θ_A and the distance R_DA were calculated using the GROMACS analysis utilities g_angle and g_sgangle, respectively.
The instantaneous orientation factor κ²(t) for the ECFP-EYFP FRET pair was calculated from the instantaneous position coordinates of the MD simulation trajectory, which allowed us to estimate the distribution of the donor and acceptor transition dipoles (Fig. 6). The MD-estimated average value of ⟨κ²⟩ = 0.69 is in close agreement with the isotropic limit of 2/3, serving as additional evidence of sufficient conformational sampling. A comparison of the simulated distribution of κ² with the theoretical angular distribution for isotropic dipoles in a three-dimensional system [6,7] is shown in Fig. 6. While overall agreement was observed, the donor-acceptor pair linked by a series of chemical bonds still shows some minor deviations from the ideal angular orientations of random non-interacting dipoles, as also observed for other systems [8][9][10][11].
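As an illustration of the quantity discussed above, the sketch below computes κ² from three vectors and checks numerically that randomly oriented dipoles average to 2/3. It assumes the usual textbook definition κ² = (cos θ_T − 3 cos θ_D cos θ_A)², where θ_T is the angle between the two transition dipoles and θ_D, θ_A are their angles to the donor-acceptor separation vector; the definition actually used in the article is the one given in Fig. 5B. The function names are hypothetical, and the vectors are synthetic rather than taken from the trajectories.

```python
import math
import random

def unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def kappa_squared(mu_d, mu_a, r_da):
    """Orientation factor from donor dipole, acceptor dipole and separation vector."""
    d, a, r = unit(mu_d), unit(mu_a), unit(r_da)
    cos_t = dot(d, a)   # angle between the two transition dipoles
    cos_d = dot(d, r)   # donor dipole vs. separation vector
    cos_a = dot(a, r)   # acceptor dipole vs. separation vector
    return (cos_t - 3.0 * cos_d * cos_a) ** 2

def random_unit_vector(rng):
    """Uniformly distributed direction (normalised Gaussian triple)."""
    return unit((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)))

def isotropic_mean_kappa2(n_samples=100_000, seed=1):
    """Monte Carlo average of kappa^2 for freely and independently rotating dipoles."""
    rng = random.Random(seed)
    x_axis = (1.0, 0.0, 0.0)  # fixing the separation axis loses no generality
    total = 0.0
    for _ in range(n_samples):
        total += kappa_squared(random_unit_vector(rng), random_unit_vector(rng), x_axis)
    return total / n_samples

print(round(isotropic_mean_kappa2(), 2))
```

For trajectory data, the same function would be applied frame by frame to the dipole unit vectors and the chromophore-to-chromophore vector, and the resulting values histogrammed for comparison with the isotropic curve.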

MD-based calculation of FRET efficiency
The apparent FRET efficiency E_DA for the ECFP-l3-EYFP system was calculated from MD-generated time traces by using Eqs. (1) and (2), the MD-estimated instantaneous value of the ECFP-to-EYFP distance R_DA, and the instantaneous orientation factor κ²(t), both evaluated at each MD trajectory step [1].
In these equations, R_0 is the Förster radius characteristic of the specific dye pair involved in FRET. For the ECFP/EYFP pair, R_0 = 48 Å [12], as calculated with a refractive index n = 1.4 [12,13], a donor quantum yield Q_D = 0.4, and the overlap integral J(λ) computed from the normalized ECFP emission spectrum and the EYFP excitation spectrum, the latter normalized to 84,000 M⁻¹ cm⁻¹ at 514 nm. For κ², a value of 2/3 was used, i.e., assuming dynamic averaging of the relative orientations of the ECFP emission and EYFP absorption dipole moments. The mean MD-estimated apparent FRET efficiency was 0.55, and the corresponding E_DA histogram reveals a broad distribution (FWHH = 0.25), which can be compared to the distribution obtained from the sm-FRET measurements (Fig. 7A) [1].
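The equations themselves are not reproduced in this text. Assuming they are the standard Förster relations, which is consistent with the symbols used in this section, they would take the following form. The numbering and the numerical prefactor (valid for R_0 in Å and J(λ) in M⁻¹ cm⁻¹ nm⁴) are assumptions taken from the textbook formulation, not from the article.

```latex
\begin{align}
E_{DA}(t) &= \frac{R_0^{6}}{R_0^{6} + R_{DA}(t)^{6}} \tag{1} \\
R_0 &= 0.211\,\left[\kappa^{2}\, n^{-4}\, Q_D\, J(\lambda)\right]^{1/6} \tag{2}
\end{align}
```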
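The per-frame efficiency calculation can be sketched as below. This is a minimal illustration rather than the authors' analysis script: it assumes the standard Förster relation E = R_0⁶/(R_0⁶ + R_DA⁶) and the proportionality R_0⁶ ∝ κ², so that the quoted R_0 = 48 Å (for κ² = 2/3) is rescaled by the instantaneous κ². The function names and the three example frames are hypothetical.

```python
R0_ISOTROPIC = 48.0            # angstrom, Forster radius quoted in the text for kappa^2 = 2/3
KAPPA2_ISOTROPIC = 2.0 / 3.0   # dynamic-averaging value of the orientation factor

def forster_radius(kappa2):
    """Forster radius rescaled for a given kappa^2, using R0^6 proportional to kappa^2."""
    return R0_ISOTROPIC * (kappa2 / KAPPA2_ISOTROPIC) ** (1.0 / 6.0)

def fret_efficiency(r_da, kappa2=KAPPA2_ISOTROPIC):
    """Standard Forster efficiency for distance r_da (angstrom) and orientation factor kappa2."""
    r0_6 = forster_radius(kappa2) ** 6
    return r0_6 / (r0_6 + r_da ** 6)

# Hypothetical frames: (R_DA in angstrom, instantaneous kappa^2). Not MD data.
frames = [(44.0, 0.10), (44.0, 2.0 / 3.0), (44.0, 1.80)]
efficiencies = [fret_efficiency(r, k2) for r, k2 in frames]
mean_efficiency = sum(efficiencies) / len(efficiencies)
print([round(e, 3) for e in efficiencies], round(mean_efficiency, 3))
```

Even at a fixed distance, the efficiency varies strongly with κ², which is why both quantities are evaluated at every trajectory step.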

Reconstructing R_DA distance distributions from FRET efficiencies
First, we used the MD-generated FRET efficiency E_DA to reconstruct the "apparent" donor-acceptor distance distribution (R_App), assuming a unique Förster distance of R_0 = 48 Å that corresponds to the average orientational factor κ² = 0.69 (Eq. 1). Then, we compared it to the "true" distance distribution (R_True) estimated directly from the MD trajectory (Fig. 7B).
The effect of the standard assumption of a unique R_0 value on the accuracy of the donor-to-acceptor distance distributions is presented in Fig. 8. Both apparent distributions, obtained from either MD-generated or experimentally measured FRET efficiencies under the standard assumptions, systematically overestimate the width of the true distance distribution R_DA. We conclude that careful consideration of the orientational dynamics within a FRET pair is crucial for accurate measurements of distance distributions.
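The broadening described above can be illustrated with a short sketch. It assumes the standard Förster relation inverted for distance, R_App = R_0 (1/E − 1)^(1/6), with a single R_0 = 48 Å, while the "true" efficiency is generated with a κ²-dependent Förster radius (R_0⁶ ∝ κ²). The function names and the example κ² values are hypothetical and are not taken from the trajectories.

```python
R0_FIXED = 48.0                # angstrom, single Forster radius used in the reconstruction
KAPPA2_ISOTROPIC = 2.0 / 3.0   # orientation factor assumed when R0_FIXED is used

def efficiency(r_true, kappa2):
    """Forster efficiency with R0^6 scaled by the instantaneous kappa^2."""
    r0_6 = R0_FIXED ** 6 * kappa2 / KAPPA2_ISOTROPIC
    return r0_6 / (r0_6 + r_true ** 6)

def apparent_distance(e, r0=R0_FIXED):
    """Invert E = R0^6 / (R0^6 + R^6) for R, assuming one fixed R0."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)

r_true = 44.0  # one fixed 'true' distance in angstrom (illustrative)
kappa2_values = [0.2, 2.0 / 3.0, 1.5]  # hypothetical instantaneous orientation factors
apparent = [apparent_distance(efficiency(r_true, k2)) for k2 in kappa2_values]
print([round(r, 1) for r in apparent])
```

Only the frame with κ² = 2/3 returns the true distance; lower κ² values are read as longer distances and higher values as shorter ones, so a single true distance is spread into a distribution of apparent distances.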

Materials
The materials and methods used to prepare the sample and to collect the absorption, fluorescence intensity, fluorescence decay, and sm-FRET data are given in [1,14].

Molecular dynamics simulation setup
The structural model of peptide ECFP-l3-EYFP [12] was designed based on available X-ray structures of ECFP (PDB code: 1CV7) and EYFP (PDB code: 1YFP) [15], connected by the flexible linker TLGMDELYKSGIR(GGSGGS)3-TMVS, referred to as l3. The CHARMM27 force field for proteins, recently adopted for the GROMACS package, was used [16]. In this force field, the CHARMM27 CMAP correction term was implemented. The solvent water was modeled using the special CHARMM (TIP3P) model [16]. The bond length and angle parameters for the chromophore residues of ECFP and EYFP were optimized by density functional theory calculations at the B3LYP/cc-pVDZ level and adopted for the CHARMM27 force field. Partial charges needed for Coulomb interactions were derived from the B3LYP/cc-pVDZ electron densities by fitting the electrostatic potential to point (ESP) charges [17]. The S_0-S_1 transition dipole moment needed for calculation of the orientation factor κ² was derived from time-dependent density functional calculations at the TD-B3LYP/cc-pVDZ level, and the corresponding dipole unit vectors are shown in Fig. 5. The estimated directions of the excited-state transition dipole moments of the ECFP and EYFP chromophores are in agreement with those previously published for the chromophores of the green fluorescent protein family [18]. The MD topology file for polypeptide ECFP-l3-EYFP was built using the GROMACS pdb2gmx utility. The CHARMM27 topology building blocks for the non-native amino acid residues CRO and CRF were implemented using the corresponding GROMACS RTP library.
All MD simulations were carried out at a constant number of particles, constant pressure (P = 1 atm), and constant temperature (T = 298 K, NPT ensemble). Three-dimensional periodic boundary conditions were applied with the z-axis lying along a direction normal to the bilayer. The pressure was controlled isotropically, so that the x, y and z dimensions of the simulation box were allowed to fluctuate independently from each other, keeping the total pressure constant. The reference temperature and pressure were kept constant using the Berendsen weak coupling scheme with a coupling constant of τ_T = 0.1 ps for the temperature coupling and τ_P = 1.0 ps for the pressure coupling [19]. Electrostatic interactions were simulated with the particle mesh Ewald (PME) approach using a long-range cutoff of 0.8 nm [20]. The cutoff distance of Lennard-Jones interactions was also equal to 0.8 nm. All bond lengths in the protein were kept constant using the LINCS routine [21]. The MD integration time step was 2 fs. The MD simulations were carried out using the GROMACS set of programs, version 4.5.5 [22]. Molecular graphics and visualization were performed using the VMD 1.8.6 software package [23].
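For orientation, the run parameters listed above can be collected into a GROMACS .mdp-style parameter set, rendered here from a small Python dictionary. This is a sketch and not the authors' input file: the option names are standard GROMACS .mdp keywords, the values for the time step, thermostat, barostat, cutoffs and constraints follow the text, and everything else (the single coupling group `System`, the compressibility, `rlist`, and the conversion of 1 atm to 1.01325 bar) is an assumption added to make the fragment complete.

```python
# Values stated in the text unless marked as an assumption.
MDP_SETTINGS = {
    "integrator": "md",
    "dt": 0.002,                       # 2 fs time step
    "pbc": "xyz",                      # three-dimensional periodic boundaries
    "tcoupl": "berendsen",
    "tc_grps": "System",               # assumption: one coupling group
    "tau_t": 0.1,                      # ps
    "ref_t": 298,                      # K
    "pcoupl": "berendsen",
    "pcoupltype": "isotropic",
    "tau_p": 1.0,                      # ps
    "ref_p": 1.01325,                  # bar, i.e. 1 atm
    "compressibility": 4.5e-5,         # assumption: typical value for water, bar^-1
    "coulombtype": "PME",
    "rcoulomb": 0.8,                   # nm
    "rvdw": 0.8,                       # nm
    "rlist": 0.8,                      # assumption: neighbour-list cutoff equal to rcoulomb
    "constraints": "all-bonds",
    "constraint_algorithm": "lincs",
}

def render_mdp(settings):
    """Format a settings dictionary as aligned 'key = value' lines."""
    width = max(len(key) for key in settings)
    return "\n".join(f"{key:<{width}} = {value}" for key, value in settings.items()) + "\n"

mdp_text = render_mdp(MDP_SETTINGS)
print(mdp_text)
```

Keeping the parameters in one structure makes it easy to generate matching input files for each of the independent runs.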

Molecular dynamics simulation flowchart
A series of MD simulation runs of ECFP-l3-EYFP, each starting with a different initial configuration, was performed according to the following flowchart. To initialize the MD simulations, the protein was placed in a rectangular unit cell with a minimum distance of 2.0 nm from the box edge. The MD unit cell was filled with TIP3P water molecules by using the GROMACS genbox utility. Next, steepest descent energy minimization was performed, followed by the addition of ions to counterbalance the protein charge, yielding a final system containing 11 Na⁺ ions. The control MD sampling was also performed in the presence of 0.1 M buffering ions Na⁺ and Cl⁻. Before accumulating MD conformational sampling, each starting system was taken through a 10 ns pre-equilibration run. To achieve better MD statistics, four initial configurations of ECFP-l3-EYFP with different distances and orientations between ECFP (donor) and EYFP (acceptor) were sampled. These four conformations of ECFP-l3

Supporting information
Raw simulation data of various conformations of protein ECFP-l3-EYFP are provided in PDB format. Supplementary raw data shown in Figs. 7 and 8 can be found in the online version of the paper.

Transparency document. Supporting information
Transparency data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.dib.2017.03.041.