CYP116B5hd, a self-sufficient P450 cytochrome: A dataset of its electronic and geometrical properties

This paper documents the dataset obtained from the Electron Paramagnetic Resonance (EPR) study of the electronic properties of a self-sufficient cytochrome P450, CYP116B5hd, whose catalytic activity is of interest for synthetic purposes: when isolated, its heme domain can act as a peroxygenase on different substrates of biotechnological interest. Raw data shown in Famulari et al. (2022) and supplementary data in raw and processed forms (figures) are documented and made available in this paper. Simulations of the experimental data are also provided, together with simulation scripts based on EasySpin, a widely used MATLAB toolbox for EPR spectral simulations. The procedure for g-value analysis based on crystal-field theory is detailed as well, offering a tool for comparing FeIII-heme P450 systems. Because of the catalytic interest of the protein, which has been recently discovered, and the reported correlation between g-values and peroxidase function, both the CW-EPR and HYSCORE spectra and the data set of the model CYPBM3hd are also provided. Finally, the materials and methods for enzyme production and purification, sample preparation, and the experimental and spectroscopic procedures are described together with instrumental details. The data files and simulation scripts can be found at https://doi.org/10.5281/zenodo.6418626

The data were collected at low temperature (6-40 K) using a Bruker Elexsys E580 X-band spectrometer (microwave frequency 9.68 GHz) equipped with a cylindrical dielectric cavity and a helium gas-flow cryostat from Oxford Inc.
Data format: Data are in .DTA (file containing the spectral data), .DSC (text file containing the spectral parameters) and .txt (text file containing the spectral data) formats.
Description of data collection: The data were collected as described in Section 5.
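For readers who do not use MATLAB, the .DSC/.DTA file pairs can in principle be read with a few lines of Python. The following is a minimal sketch, not part of the deposited scripts. It assumes a one-dimensional CW spectrum stored as 64-bit floats (descriptor keyword IRFMT set to D) and relies on the descriptor keywords XPTS, XMIN, XWID, BSEQ and IKKF. The function names and the file base name are hypothetical, and two-dimensional HYSCORE files would need additional handling of the second axis.

```python
import numpy as np
from pathlib import Path


def read_dsc(dsc_path):
    """Parse a .DSC descriptor file into a {keyword: value-string} dictionary."""
    params = {}
    for line in Path(dsc_path).read_text(errors="replace").splitlines():
        line = line.strip()
        # Skip blank lines, section markers (#...), comments (*...) and device blocks (.DVC)
        if not line or line[0] in "#*.":
            continue
        parts = line.split(None, 1)
        params[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return params


def read_1d_spectrum(basename):
    """Return (x_axis, intensity) for a 1D spectrum stored as basename.DSC / basename.DTA."""
    p = read_dsc(f"{basename}.DSC")
    n = int(p["XPTS"])
    byte_order = ">" if p.get("BSEQ", "BIG") == "BIG" else "<"
    # Assumption: IRFMT D, i.e. 64-bit floats; other number formats need another dtype.
    data = np.fromfile(f"{basename}.DTA", dtype=byte_order + "f8")
    if p.get("IKKF", "REAL") == "CPLX":
        data = data[0::2] + 1j * data[1::2]   # interleaved real/imaginary parts
    # Linear abscissa from its minimum (XMIN) and width (XWID)
    x = float(p["XMIN"]) + float(p["XWID"]) * np.arange(n) / (n - 1)
    return x, data[:n]
```

A call such as `x, y = read_1d_spectrum("some_spectrum")` (hypothetical base name) would return the abscissa in the unit given by the XUNI keyword of the descriptor file, together with the signal intensity.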

Value of the Data
• The dataset provided consists of a collection of CW-EPR and HYSCORE spectra of CYP116B5hd and CYPBM3hd. Because EPR is sensitive to the close environment of the iron, these data will be of use to EPR researchers for comparison with future studies of other proteins of the family.
• The simulation scripts will help other heme-protein researchers to become familiar with EPR simulations using EasySpin.
• The detailed explanation of the model for g-value analysis clarifies the significance and interpretation of the two crystal-field parameters (Δ and V), providing structural insights into the heme site and a way to classify and quantitatively compare CYP450 enzymes.
• The details of the materials and methods will be of interest for sample preparation of similar proteins and for experimental reproducibility.

Data Description
Data were collected using an EPR spectrometer and processed with EasySpin, a widely used MATLAB toolbox. The dataset in the Zenodo repository (https://doi.org/10.5281/zenodo.6418626) contains raw and processed numerical data. Additionally, below we present figures and tables complementing the ones shown in [1].
Description of the dataset:
• Data type: Experimental spectroscopic measurements (EPR), computer simulations and data analysis. Files in the folder PARACAT_WP5_20220225_MATLAB contain ready-to-plot / ready-to-simulate data in .m format.

Experimental Design, Materials and Methods
Cloning, expression and purification of CYP116B5hd. The construct used in this work was obtained by cloning the initial part of the CYP116B5 gene (coding for the first 442 amino acids, the heme domain) between the NdeI and EcoRI restriction sites of a pET-30a(+) vector, with the insertion of an N-terminal 6xHis tag [8]. Expression and purification of the protein were carried out as described before [8]. Briefly, protein expression was carried out in E. coli BL21 (DE3) cells at 22-24 °C for 24 h in LB medium supplemented with 0.5 mM δ-aminolevulinic acid (δ-Ala) and 100 μM IPTG. For the purification, cells were resuspended and sonicated in 50 mM KPi buffer, pH 6.8, supplemented with 100 mM KCl, 1 mg/mL lysozyme, 1% Triton X-100, 1 mM PMSF (phenylmethylsulfonyl fluoride) and 1 mM benzamidine. After ultracentrifugation at 90,000 g for 45 min at 4 °C, the supernatant was loaded onto a 1 mL His-trap HP column (GE Healthcare) and eluted with a linear imidazole gradient ranging from 20 to 200 mM. The purest fractions were then concentrated, loaded onto a Superdex 200 size-exclusion chromatography column (GE Healthcare) and eluted using 50 mM KPi buffer, pH 6.8, containing 200 mM KCl. The purified protein was then concentrated and stored in 50 mM KPi, pH 6.8, containing 10% glycerol after buffer exchange by ultrafiltration using Amicon Ultra 30,000 MWCO devices. Protein concentration was estimated from the spectrum of the P450-CO complex, obtained upon reduction with sodium dithionite and CO bubbling, using an extinction coefficient of 91,000 M⁻¹ cm⁻¹ [9].
Electron Paramagnetic Resonance (EPR). All the protein samples, in 50 mM KPi buffer, pH 6.8, with 10% glycerol, were mixed with 30% glycerol as glassing agent to an approximate final protein concentration of 200 μM. The samples with histidine and imidazole were prepared by adding an excess of these chemicals to reach a 1:10 protein-to-ligand ratio. The experiments were performed on a Bruker Elexsys E580 X-band spectrometer (microwave frequency 9.68 GHz) equipped with a cylindrical dielectric cavity and a helium gas-flow cryostat from Oxford Inc. The samples were kept continuously frozen with liquid nitrogen to preserve the integrity of the protein.
Continuous-Wave EPR. The spectra were recorded at 20 K or 40 K with a microwave power of 0.07 mW, a modulation amplitude of 1 mT and a modulation frequency of 100 kHz. The spectra were baseline corrected, smoothed and then simulated using the EasySpin toolbox for MATLAB [10].
Hyperfine Sublevel Correlation (HYSCORE) [11]. Pulse EPR experiments were performed at 6 K using the pulse sequence π/2 - τ - π/2 - t_1 - π - t_2 - π/2 - τ - echo, with microwave pulse lengths t_{π/2} = 16 ns and t_{π} = 16 ns. The time intervals t_1 and t_2 were varied in steps of 16 ns or 24 ns, and τ values of 208 ns or 250 ns were chosen. A four-step phase cycle was used to eliminate unwanted echoes. The time traces of the HYSCORE spectra were baseline corrected with a third-order polynomial, apodized with a Hamming window and zero filled. After two-dimensional Fourier transformation, the absolute-value spectra were calculated.
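The processing chain described above (polynomial baseline correction, Hamming apodization, zero filling, two-dimensional Fourier transformation, absolute value) can be sketched in Python with NumPy. This is an illustrative sketch and not the authors' MATLAB script. The function names, the zero-filling factor and the use of the decaying half of the Hamming window are assumptions, and the input is assumed to be a real-valued time trace sampled with the same step in t_1 and t_2.

```python
import numpy as np


def remove_poly_baseline(data, axis, order=3):
    """Subtract a least-squares polynomial of the given order along one axis."""
    n = data.shape[axis]
    t = np.linspace(-1.0, 1.0, n)               # scaled time axis for a stable fit
    vander = np.vander(t, order + 1)            # shape (n, order + 1)
    moved = np.moveaxis(data, axis, 0)          # fit along the first axis
    coeffs, *_ = np.linalg.lstsq(vander, moved, rcond=None)
    return np.moveaxis(moved - vander @ coeffs, 0, axis)


def process_hyscore(trace, dt_ns, zero_fill=2):
    """Return (f1_MHz, f2_MHz, absolute-value spectrum) from a t1 x t2 time trace."""
    data = np.asarray(trace, dtype=float)
    for axis in (0, 1):
        data = remove_poly_baseline(data, axis, order=3)

    # Assumption: apodize with the decaying half of a Hamming window (maximum at t = 0)
    n1, n2 = data.shape
    w1 = np.hamming(2 * n1 - 1)[n1 - 1:]
    w2 = np.hamming(2 * n2 - 1)[n2 - 1:]
    data = data * np.outer(w1, w2)

    # Zero filling is done by the FFT itself through the output size `s`
    size = (zero_fill * n1, zero_fill * n2)
    spec = np.fft.fftshift(np.fft.fft2(data, s=size))

    # Time step in microseconds gives frequencies in MHz
    f1 = np.fft.fftshift(np.fft.fftfreq(size[0], d=dt_ns * 1e-3))
    f2 = np.fft.fftshift(np.fft.fftfreq(size[1], d=dt_ns * 1e-3))
    return f1, f2, np.abs(spec)
```

With a 16 ns time step the accessible frequency range is ±31.25 MHz, which covers the proton Larmor frequency at X-band.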

CW-EPR and HYSCORE Spectra of CYPBM3hd
For comparison purposes, experiments analogous to the ones shown in [1] were performed on the model self-sufficient CYPBM3hd, as shown in Fig. 1. Supporting the CW-EPR spectra, the appearance of extra peaks in Fig. 2b, due to an additional strongly coupled nitrogen, indicates that imidazole also displaces the axial water in CYPBM3hd.

Simulation Parameters of CW-EPR Spectra
The parameters used for simulating the CW-EPR spectra shown in Fig. 1 of [1] are collected in Table 1.

Simulation of CYP116B5hd 1 H HYSCORE Spectrum
The simulation of the experimental HYSCORE spectrum (Fig. 3) yielded the hyperfine coupling parameters a_iso = -1.095 MHz and T = 5.20 MHz, with a β angle of 22°. The dipolar hyperfine coupling value, T, can be used to determine the radial distance, r, between the electron spin and the coupled proton through the point-dipole approximation:

$$T = \frac{\mu_0}{4\pi h}\,\frac{g_e \beta_e\, g_n \beta_n}{r^{3}}$$
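As a worked example, the point-dipole relation can be inverted to estimate r from the simulated value T = 5.20 MHz. The sketch below uses CODATA constants and, as a simplifying assumption, the free-electron g-value for g_e. The low-spin FeIII centre has an anisotropic g-tensor, so a different effective g-value would shift the result. The function name is hypothetical.

```python
import math

MU0 = 1.25663706212e-6       # vacuum permeability, N A^-2
H = 6.62607015e-34           # Planck constant, J s
BETA_E = 9.2740100783e-24    # Bohr magneton, J T^-1
BETA_N = 5.0507837461e-27    # nuclear magneton, J T^-1
G_E = 2.00231930436          # free-electron g-value (assumption, see lead-in)
G_N_1H = 5.5856946893        # proton nuclear g-factor


def point_dipole_distance(t_mhz, g_e=G_E, g_n=G_N_1H):
    """Electron-nucleus distance in angstrom from the dipolar coupling T in MHz."""
    t_hz = t_mhz * 1e6
    # Invert T = mu0 / (4 pi h) * g_e beta_e g_n beta_n / r^3 for r (SI units)
    r_cubed = MU0 / (4.0 * math.pi * H) * g_e * BETA_E * g_n * BETA_N / t_hz
    return r_cubed ** (1.0 / 3.0) * 1e10     # metres to angstrom


r = point_dipole_distance(5.20)
print(f"r = {r:.2f} angstrom")
```

Under these assumptions the estimate is roughly 2.5 Å. Because r scales with the cube root of g_e/T, moderate changes in the effective g-value alter the distance only slightly.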

Analysis of Low-spin Fe III EPR Signals
One of the most useful approaches when interpreting the g-values of a heme low-spin FeIII, a d^5 species, is to consider the relationship between its electron configuration and the gyromagnetic tensor (g). The orientation of the tensor is explained by the so-called counter-rotation theory, considered in detail elsewhere [2,3].
In the following we discuss the connection between the principal values of the tensor and the composition of the ground state. The energy levels of the d orbitals are defined by the metal surroundings: the nature of the ligands and their geometry. Given approximate octahedral symmetry and strong-field ligands, all five d electrons of the metal occupy the t_2g orbitals, d_xy, d_xz and d_yz, resulting in a total electron spin S = ½ (low spin). The energy distribution of the t_2g orbitals can be expressed in terms of the rhombic (V) and axial (Δ) crystal-field parameters, as schematized in Fig. 4. The difference in ligand-field strength between the axial ligands and the porphyrin nitrogen ligands splits the energy of d_xy from that of d_xz and d_yz; this splitting is parametrized by Δ. In turn, asymmetries in the axial ligands can split the energies of d_xz and d_yz; V represents this energy difference. Therefore, Δ is sensitive to the presence and nature of the axial ligands and to the strength of their interaction with the metal, whereas V is related to the nature of the axial ligands and their orientation with respect to the porphyrin ring.
According to the formalism introduced by Griffith [4] and developed by Taylor [5], this low-spin system can be treated as a one-electron system (one hole) in which the spin-orbit coupling energy is of the same order of magnitude as the ligand effects described above. This results in a ground state that can be described as an admixture of the three t_2g orbitals via spin-orbit coupling.
Following this model, the wavefunctions of the ground Kramers doublet are given by Eqs. (1) and (2), where a, b, and c are orbital coefficients. The g-tensor principal values can be expressed in terms of the mixing coefficients a, b, and c. Conversely, when the g-values are known from experiment, the coefficients can be calculated (Eq. (4)). The ratios of the crystal-field parameters to the spin-orbit coupling constant, V/ξ and Δ/ξ, are then calculated from the g-tensor values. Note that all three g-values must be known in order to determine V/ξ and Δ/ξ. The spin-orbit coupling constant of free iron is ξ ≈ 400 cm⁻¹.
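In the convention usually attributed to Taylor [5], the relations described in this paragraph are commonly written as follows. The axis labelling and the signs shown here are assumptions and may differ from the authors' own equations.

```latex
\begin{align*}
|+\rangle &= a\,|d_{yz}^{+}\rangle - i\,b\,|d_{xz}^{+}\rangle - c\,|d_{xy}^{-}\rangle \\
|-\rangle &= -a\,|d_{yz}^{-}\rangle - i\,b\,|d_{xz}^{-}\rangle - c\,|d_{xy}^{+}\rangle \\[4pt]
g_x &= 2\left[a^{2} - (b+c)^{2}\right], \quad
g_y = 2\left[(a+c)^{2} - b^{2}\right], \quad
g_z = 2\left[(a+b)^{2} - c^{2}\right] \\[4pt]
a &= \frac{g_z + g_y}{\sqrt{8\,(g_z + g_y - g_x)}}, \quad
b = \frac{g_z - g_x}{\sqrt{8\,(g_z + g_y - g_x)}}, \quad
c = \frac{g_y - g_x}{\sqrt{8\,(g_z + g_y - g_x)}} \\[4pt]
\frac{V}{\xi} &= \frac{g_x}{g_z + g_y} + \frac{g_y}{g_z - g_x}, \qquad
\frac{\Delta}{\xi} = \frac{g_x}{g_z + g_y} + \frac{g_z}{g_y - g_x} - \frac{V}{2\xi}
\end{align*}
```

The expressions for a, b and c follow from the g-value expressions without imposing normalization, since g_z + g_y - g_x = 2(a + b + c)^2 and g_z + g_y = 4a(a + b + c). This is why the sum a^2 + b^2 + c^2 evaluated from experimental g-values can serve as an independent check of the model.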
If this model is an accurate description of the iron centre, a² + b² + c² ≡ 1. That is, all these equations are valid as long as the t_2g orbitals are purely non-bonding and the remaining two empty e_g orbitals (d_x²-y² and d_z²) lie sufficiently high in energy that their contributions can be neglected. If we define the normalization parameter as m = a² + b² + c², it gives an idea of how suitable the analysis is. If the value is very close to one, the description is valid. If the value is less than one, this has been interpreted as an indication of some degree of covalency, or delocalization of the spin density onto the ligands [5]. Conversely, if m is larger than one, it can indicate mixing of excited states with high orbital contributions (i.e. excited orbitals perturbing the ground state) [6,7] (Table 2).
It is worth noting that the calculated values for m are very close to 1 in all cases, which indicates the suitability of the analysis presented in Table 2 and, therefore, a predominant nonbonding character of the orbital hosting the unpaired electron with negligible admixture of higher energy terms.
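The g-value analysis can be sketched in a few lines of Python. The function below uses the relations commonly attributed to Taylor [5], written out in the code comments. The function name is hypothetical, and the g-values in the example are illustrative numbers of the magnitude typically found for low-spin P450 hemes, not values taken from Table 2 or from the dataset.

```python
import math


def taylor_analysis(gx, gy, gz):
    """Orbital coefficients, crystal-field ratios and normalization from g-values."""
    norm = math.sqrt(8.0 * (gz + gy - gx))
    a = (gz + gy) / norm                     # coefficient of d_yz
    b = (gz - gx) / norm                     # coefficient of d_xz
    c = (gy - gx) / norm                     # coefficient of d_xy
    # Rhombic splitting over spin-orbit coupling constant, V / xi
    v_over_xi = gx / (gz + gy) + gy / (gz - gx)
    # Axial splitting over spin-orbit coupling constant, Delta / xi
    delta_over_xi = gx / (gz + gy) + gz / (gy - gx) - v_over_xi / 2.0
    m = a * a + b * b + c * c                # normalization parameter
    return {"a": a, "b": b, "c": c,
            "V/xi": v_over_xi, "Delta/xi": delta_over_xi, "m": m}


# Illustrative g-values (assumption, not from the dataset)
result = taylor_analysis(gx=1.92, gy=2.25, gz=2.44)
for name, value in result.items():
    print(f"{name:9s} = {value:.3f}")
```

For these illustrative inputs the normalization parameter comes out within about 1% of unity, which is the kind of check discussed above. Multiplying V/ξ and Δ/ξ by ξ ≈ 400 cm⁻¹ converts the ratios to energies.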

Ethics Statements
The work did not involve any human or animal subjects, nor data from social media platforms.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.