Combined molecular docking, homology modeling and DFT
method for the modification of bovine serum albumin (BSA) to
improve fluorescence spectroscopy for phthalate acid esters
chelated with BSA

Although phthalate acid esters (PAEs) do not fluoresce on their own, they can be detected by fluorescence spectroscopy after chelation with bovine serum albumin (BSA). In this study, the types of amino acid residues at the active site where PAEs bind to BSA were determined using molecular docking. Based on the docking results, a modification scheme for BSA intended to raise the fluorescence detection sensitivity for PAEs was proposed, and the novel BSA structures were constructed using a homology modeling method. Density functional theory (DFT) was employed to explore how BSA modification influences the detection of PAEs by fluorescence spectroscopy. The results showed that the docking scores between the BSAs and dimethyl phthalate (DMP), dibutyl phthalate (DBP) and di-n-octyl phthalate (DNOP) increased by up to 26.45%, 16.82% and 16.30%, respectively, indicating that active-site modification of BSA could enhance the binding affinity between BSA and PAEs. The fluorescence intensities of PAEs chelated with the modified BSAs were then calculated. For DMP, DBP and DNOP chelated with the modified BSAs, the fluorescence intensity increased by up to 2.8-, 104.51- and 62.43-fold, respectively, which achieves the aim of theoretically modifying BSA to improve the detection sensitivity of fluorescence spectroscopy for PAEs.


Introduction
Phthalate acid esters (PAEs) are usually synthesised by the esterification of phthalic anhydride with various alcohols (González-Sálamo et al., 2018a), and they have high lipid solubility and low water solubility (González-Sálamo et al., 2018b). PAEs are widely used as plasticisers in food packaging, toys, medical supplies and personal care products because they significantly improve the strength and plasticity of plastic products (Staples et al., 1997). The estimated worldwide annual production of plastic products is 150 million tons, and about 8 million tons of PAEs are consumed each year (Net et al., 2015). Because PAEs are not covalently bonded to polymers (Bope et al., 2019), they are easily released into the environment. Studies have shown that PAEs have endocrine-disrupting, teratogenic, carcinogenic and mutagenic properties (Wang et al., 2015; Guo and Kannan, 2011), and their extensive use further increases the risk to human health. Therefore, finding efficient ways of detecting PAEs in the environment, so as to avoid their potential harm to biological health, has become an environmental problem that needs to be solved.
Presently, the main methods for detecting PAEs are gas chromatography (GC), high-performance liquid chromatography (HPLC), gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). Liang et al. (2009) used GC to determine the PAE content of water and soil in Wuhan. However, GC is only suitable for PAE samples that can be volatilised below 300°C, and GC detectors are vulnerable to contamination and have poor sensitivity (Beens and Brinkman, 2014). Li et al. (2017) utilised HPLC and LC-MS to detect PAE degradation intermediates; the detection process in this method is complex and requires extensive sample pretreatment. Shao et al. (2013) employed GC-MS to detect PAEs in water and sediments. The sample preparation and operating steps of these traditional detection methods are complicated and require considerable manpower, financial resources and time. Moreover, since more than one type of PAE is present in the environment, the PAE spectra obtained by traditional detection methods are susceptible to interference, and environmentally benign compounds can easily be confused with harmful contaminants. Therefore, there is a need for rapid methods that further improve the detection sensitivity for PAEs.
Compared with traditional detection methods, fluorescence spectrometry has the advantages of high sensitivity, simplicity and high selectivity (Zhang et al., 2011). However, PAEs cannot emit fluorescence alone and cannot be detected directly by fluorescence methods. Instead, the interaction of PAEs with proteins can be studied through the protein's endogenous fluorescence spectrum, thereby indirectly realising the detection of PAEs (Xie et al., 2011). Phthalates combine with serum albumin in plasma after entering the human body and gradually play a toxic role in the receptor sites and the transport of plasma (Varshney et al., 2010; Ryan et al., 2011; Wagner et al., 2010). Therefore, in this study, bovine serum albumin (BSA), which has low molecular weight, high solubility, high stability, low cost, high affinity for the ligand and high homology with human serum albumin (HSA), was selected as the target protein for improving the detection sensitivity for PAEs (Duan et al., 2013; Bourassa et al., 2010), so as to study the mechanism of the PAE-BSA interaction and provide guidance for theoretical modification. The molecular docking method has been widely used to study interactions between small molecules and proteins (Karjiban et al., 2014). As a recognised method for predicting receptor protein structures, homology modeling can rapidly and effectively construct the structure of novel proteins (Zhou et al., 2013; Klepeis et al., 2009; Liu et al., 2006), and density functional theory has been widely used in the theoretical calculation of spectra (Kohn and Sham, 1965; Egger et al., 2015). DMP, DBP and DNOP were included in the list of priority control pollutants by the Environmental Protection Agency (EPA) in 1977 (Amir et al., 2005). In China, DMP, DBP and DNOP were also identified as priority control pollutants in 1990 (Jin et al., 2013).
In this paper, three representative PAEs (DMP, DBP and DNOP) were selected to dock with BSA.
Firstly, because PAEs cannot fluoresce alone but can be detected by fluorescence spectroscopy after chelation with BSA, the three representative PAEs (DMP, DBP and DNOP) were docked with BSA using the molecular docking technology in Discovery Studio software. The docking scores between the PAEs and BSA were generated, and their magnitudes were used to evaluate the binding affinity between them. The interaction module in the software was used to identify the types of the key amino acids between the PAEs and BSA.
Secondly, a modification scheme for BSA with higher fluorescence detection sensitivity for PAEs was proposed based on the docking results. The hydrophilic amino acid residues near the low-affinity active sites were replaced with hydrophobic amino acid residues, which could improve the affinity between the molecule and the protein, whereas the hydrophobic amino acid residues were replaced with hydrophilic amino acid residues.
In addition, the novel BSA structures defined by the modification schemes, which are intended to give higher fluorescence detection sensitivity, were constructed using a homology modeling method, and density functional theory (DFT) was employed to evaluate how BSA modification influences the detection of PAEs by fluorescence spectroscopy.
Fig. 1 summarises the design of the novel BSA with higher fluorescence detection sensitivity for PAEs.

Data sources
In this paper, the receptor protein used for molecular docking with dimethyl phthalate (DMP), dibutyl phthalate (DBP) and di-n-octyl phthalate (DNOP) was the commonly used bovine serum albumin (BSA; PDB ID: 3V03) (Shi et al., 2016). The BSA structure (PDB ID: 3V03) and its amino acid sequence were obtained by querying the NCBI database and stored in PDB format.
Molecular docking method used to ascertain the key amino acid residues of PAEs binding to BSA
Discovery Studio 4.0 software was used to explore the docking conformations of PAEs with BSA and with the novel modified BSAs. Firstly, water molecules were removed from the protein model in PDB format, and hydrogen atoms and charges were added. Secondly, the receptor binding cavity of the protein was defined, and the PAEs were optimised using Gaussian 09 software to obtain the optimal configuration. In the case of unknown receptors, the lowest-energy conformation of the molecule was regarded as the dominant stable conformation. The Minimize program was used to optimise the energy of each molecule after applying the Tripos force field and MMFF94 charges (Halgren, 1996). Using the Powell energy gradient method, the maximum number of optimisation iterations was set to 10,000 and the energy convergence limit to 0.001 kJ/mol; all other parameters were left at their default values. The optimised PAEs were loaded into Discovery Studio 4.0 software. To compare the binding affinity between BSA and the target PAEs before and after modification, the LibDock module of Discovery Studio 4.0 was used to dock the target DMP, DBP and DNOP with BSA and with the novel modified BSAs, respectively. The docking scores were obtained by considering the polarity, hydrophobicity, entropy and solvation between ligand and receptor, and were used to evaluate the docking results. The higher the score, the stronger the interaction between the molecules and the more stable the binding (Wang et al., 2017).
Homologous modeling for BSA modification with high detection sensitivity of fluorescence spectroscopy for PAEs
Homology modeling is a computational method for establishing a three-dimensional structural model based on the amino acid sequence of a protein (Yan et al., 2016). In this paper, the amino acid sequence of BSA (PDB ID: 3V03) was obtained by querying the NCBI database (https://www.ncbi.nlm.nih.gov/protein/3V03_A). The key amino acid residues at the active sites of BSA (PDB ID: 3V03) were replaced through rational design and other molecular biological means to obtain modified BSA amino acid sequences. SWISS-MODEL, a protein homology modeling method, is based on the principle of selecting homologous proteins of known structure as templates for predicting the unknown structure, performing a multiple alignment of the target sequence with all homologous sequences, and identifying conserved segments. The three-dimensional structure corresponding to these conserved segments is then copied from the template structure, and the final three-dimensional structure is optimised by energy minimisation. Using the SWISS-MODEL module in the automatic protein modeling server provided by the Glaxo Smith Kline centre in Geneva, Switzerland (http://www.swissmodel.expasy.org), the novel modified BSA amino acid sequences and the template BSA molecule were each submitted to the SWISS-MODEL server.
Finally, the novel modified BSA structures were obtained using the homology modeling method. To further verify the structural rationality of each model, the Ramachandran plot produced by the online evaluation server PROCHECK (http://services.mbi.ucla.edu/SAVES/) was used. A model is considered to be of qualified quality if the proportion of residues in the three regions (core region + allowable region + maximum allowable region) is greater than 95% (Arnold et al., 2006).
DFT for calculation of PAEs' fluorescence emission spectra
Firstly, the time-dependent density functional theory (TD-DFT) method was used to optimise the structure of the excited state (Qiu and Li, 2018). Secondly, the structures of the target PAEs were optimised and their fluorescence spectrum intensities were calculated at the B3LYP/6-31G* level using Gaussian 09 software (Gu et al., 2018; Jiang et al., 2017; Qiu and Li, 2018; Yang, 2016). The fluorescence emission spectra of the optimised target DMP, DBP and DNOP molecules combined with BSA and with the novel modified BSAs were calculated to explore how BSA modification influences the fluorescence detection of the target PAEs.
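The excited-state optimisation step described above can be sketched as a small Python helper that writes a Gaussian-style input file. This is only an illustration under stated assumptions: the source specifies TD-DFT, B3LYP/6-31G* and Gaussian 09, but not the number of excited states, the root that was followed, or the input geometries. The values `nstates=3` and `root=1`, the helper name `build_td_opt_input` and the three-atom placeholder geometry are therefore hypothetical.

```python
def build_td_opt_input(title, atoms, charge=0, multiplicity=1, nstates=3, root=1):
    """Return the text of a Gaussian-style input for a TD-DFT excited-state optimisation.

    nstates and root are illustrative defaults, not values reported in the paper.
    """
    # Route section: optimise the geometry of excited state `root` with TD-B3LYP.
    # 6-31G(d) is the same basis set as 6-31G*.
    route = f"#p opt td=(nstates={nstates},root={root}) b3lyp/6-31g(d)"
    lines = [route, "", title, "", f"{charge} {multiplicity}"]
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")  # the molecule specification ends with a blank line
    return "\n".join(lines) + "\n"


# Placeholder geometry (NOT a phthalate): stands in for an optimised PAE structure.
placeholder_atoms = [
    ("O", 0.000, 0.000, 0.117),
    ("H", 0.000, 0.757, -0.469),
    ("H", 0.000, -0.757, -0.469),
]

input_text = build_td_opt_input("placeholder S1 optimisation", placeholder_atoms)
print(input_text)
```

The emission energy would then be read from the vertical transition at the optimised excited-state geometry; how the authors extracted the reported intensity values is not stated in the text.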

Results
Identification of key amino acids at the binding sites between PAEs and BSA
In this paper, the NCBI database was used to query the structure information of BSA (PDB ID: 3V03) and to identify the binding sites of the target PAEs on BSA (PDB ID: 3V03). The LibDock module of Discovery Studio 4.0 software was used to dock the optimised target DMP, DBP and DNOP with BSA, respectively. As shown in Fig. 2, the amino acid residues surrounding the binding sites of the target DMP, DBP and DNOP molecules on BSA included Arg198, Arg217, Arg194, Ser214, Asp450, Arg256, Trp213, Leu197, Leu454 and Val342. Studies have shown that when a compound binds to a receptor, non-bonded interactions such as hydrophobic interactions, hydrogen bonding, electrostatic interactions and van der Waals forces are present (Johnson et al., 2010; Wang, 2013). Amino acid residues within a certain distance of the binding site play the main roles in these interactions and can be identified as key amino acid residues. Therefore, the above amino acid residues can be preliminarily identified as key amino acid residues for direct interactions between BSA and DMP, DBP and DNOP.
The degradation mechanism of aromatic pyrene based on the structure of the target enzyme JPN2-NDO shows that hydrophobic interactions play an important role in the binding of aromatic pyrene and the target enzyme JPN2-NDO (Jin, 2017). When the hydrophilic amino acid Thr308 of template 1O7N was replaced by the hydrophobic amino acid Val232 of target enzyme JPN2-NDO, the hydrophobic interactions at the active site were enhanced, which is more favourable to the binding of pyrene. Therefore, this paper intended to replace hydrophilic amino acid residues at the active site of BSA with hydrophobic amino acid residues to improve the affinity between the target PAEs and BSA, and thereby improve the efficient detection of target PAEs by BSA.
Design of targeted modification scheme of BSA with high detection sensitivity of fluorescence spectroscopy for PAEs
Rational protein design can identify key amino acid residues related to protein properties based on the protein structure, function and catalytic mechanism (Samant et al., 2014). Mutation sites in the sequence are designed so that specific amino acid residues in the protein are changed through biological means such as substitution or deletion (Johnson et al., 2010), thereby modifying the characteristics of the protein molecule. For example, when Asn179, Asp194 and Glu137 of a cellulase were mutated to Lys through site-specific modification, its thermal stability improved significantly (Hakamada et al., 2001). A study of the effects of amino acids in Cel6A of Thermobifida fusca cellulase on its catalytic activity showed that mutants modified at the Arg237, Glu263, Lys259 and His159 amino acid residues could improve the hydrolysis activity of the enzyme (Zhang et al., 2010). As shown in Tab. 1, according to the hydrophilic and hydrophobic characteristics of the 20 natural amino acid residues (Jiang, 2016), the hydrophilic amino acids Arg198 and Arg256 at the active site of BSA (PDB ID: 3V03) were replaced with hydrophobic amino acids such as Ile, Phe, Val and Leu. In this way, targeted single-residue modification schemes for eight novel modified BSAs were designed.
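The single-residue substitution scheme can be illustrated with a short Python sketch that enumerates point mutants of a sequence. It assumes that the eight schemes correspond to the two sites (Arg198, Arg256) combined with the four named hydrophobic residues (Ile, Phe, Val, Leu); the text says "such as", so this 2 x 4 enumeration is an assumption, and the mapping of each variant to the labels BSA-1 to BSA-8 in Tab. 1 is not reproduced here. The toy sequence is hypothetical and is not the BSA sequence; real work would also need to reconcile PDB residue numbering with sequence indices.

```python
from itertools import product

SITES = {198: "R", 256: "R"}        # Arg198 and Arg256, as named in the text
SUBSTITUTES = ["I", "F", "V", "L"]  # Ile, Phe, Val, Leu


def mutate(sequence, position, expected, new):
    """Return a copy of sequence with the 1-based position changed from expected to new."""
    index = position - 1
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + new + sequence[index + 1:]


def enumerate_variants(sequence):
    """Build every single-point variant for the sites and substitutes above."""
    variants = {}
    for (position, wild_type), new in product(SITES.items(), SUBSTITUTES):
        label = f"{wild_type}{position}{new}"  # e.g. R198V
        variants[label] = mutate(sequence, position, wild_type, new)
    return variants


# Hypothetical toy sequence: 300 alanines with Arg placed at positions 198 and 256.
residues = ["A"] * 300
residues[197] = "R"
residues[255] = "R"
toy_sequence = "".join(residues)

variants = enumerate_variants(toy_sequence)
print(sorted(variants))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when a sequence and a structure file use different residue numbering.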

Homology modeling and rationality verification of novel BSA with high detection sensitivity of PAEs
The amino acid sequence of BSA (PDB ID: 3V03) was obtained from the NCBI database (Fig. 3). The homology of the modified novel BSAs and template proteins was more than 90%, indicating that the selected template protein was reasonable (Benkert et al., 2011).
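As a rough plausibility check on the reported homology of more than 90%, the identity between a template and a single-point variant can be computed position by position. The sketch below is a simplification and not how SWISS-MODEL derives its value, which comes from a sequence alignment; for equal-length sequences that differ by one substitution, a direct comparison is sufficient. The 500-residue toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length for a direct comparison")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical template and a variant with one substitution (R -> V at index 197).
template = "A" * 197 + "R" + "A" * 302
variant = "A" * 197 + "V" + "A" * 302

identity = percent_identity(template, variant)
print(f"identity: {identity:.1f}%")
```

A single substitution in a chain of several hundred residues leaves the identity far above 90%, which is consistent with the template being judged reasonable.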
The Ramachandran diagrams (Figs. 4 and 5) showed that 95.3% of the amino acid residues of the constructed BSA-1 structure were in the optimal region and 4.7% were in the allowable region, while 94.0% of the residues of the constructed BSA-2 structure were in the optimal region and 5.8% were in the allowable region. This indicates that the three-dimensional structures of BSA-1 and BSA-2 constructed by homology modeling are reasonable and reliable. The Ramachandran conformational maps of the eight modified BSAs were also analysed; the results verified that the novel modified BSA structures meet the model-quality criterion, with more than 95% of amino acid residues in the optimal region + allowable region + maximum allowable region (Morris et al., 1992).
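The acceptance criterion can be written out as a short Python check using the percentages reported for BSA-1 and BSA-2. The function name is hypothetical, and because the text gives no percentage for the maximum allowable region of these two models, that term is set to zero here, which can only understate the total.

```python
def passes_ramachandran_criterion(core, allowed, max_allowed=0.0, threshold=95.0):
    """True if the summed percentage of residues in the three regions exceeds the threshold."""
    return core + allowed + max_allowed > threshold


# (optimal region %, allowable region %) as reported in the text.
models = {"BSA-1": (95.3, 4.7), "BSA-2": (94.0, 5.8)}

for name, (core, allowed) in models.items():
    total = core + allowed
    verdict = "pass" if passes_ramachandran_criterion(core, allowed) else "fail"
    print(f"{name}: {total:.1f}% in optimal + allowable regions -> {verdict}")
```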

Evaluation of binding affinity between BSAs after modification and PAEs
The target molecules (DMP, DBP and DNOP) were docked with the eight modified BSAs using the LibDock module in Discovery Studio 4.0 software. The docking scores of the original BSA (PDB ID: 3V03) with the DMP, DBP and DNOP molecules were 61.10, 91.02 and 109.49, respectively. As shown in Tab. 2, the docking scores of the BSA with the target PAEs differed before and after the hydrophilic amino acid residues at the active site of BSA (PDB ID: 3V03) were replaced by hydrophobic amino acid residues. After the active-site residues Arg198 and Arg256 were replaced by hydrophobic amino acid residues such as Ile, Phe, Val and Leu, the docking scores of all the modified BSAs with DMP and DBP improved, by 3.37-26.45% and 8.49-16.82%, respectively, whereas only two modified BSAs showed higher scores with DNOP, by 5.83% and 16.30%, respectively. Most docking scores improved, indicating that the corresponding site-directed modification of BSA (PDB ID: 3V03) could enhance the binding affinity between the modified BSA and the target PAEs. In summary, the eight modification schemes that replace hydrophilic amino acids with hydrophobic amino acids at the active site of BSA could increase the binding of BSA to the target DMP, DBP and DNOP molecules to varying degrees and promote the detection of the target PAEs by BSA. The modification in which Arg198 at the active site of BSA (PDB ID: 3V03) was replaced by the hydrophobic amino acid Val had a particularly marked effect.
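The percentage improvements quoted above follow from a simple relative-change formula, sketched below in Python. The baseline scores and maximum percentage increases are taken from the text; the individual modified scores of Tab. 2 are not reproduced in this section, so the "implied" scores printed here are back-calculated from those two numbers and are illustrative only. The function names are hypothetical.

```python
BASELINE_SCORE = {"DMP": 61.10, "DBP": 91.02, "DNOP": 109.49}     # original BSA
MAX_INCREASE_PCT = {"DMP": 26.45, "DBP": 16.82, "DNOP": 16.30}    # largest reported gains


def percent_change(before, after):
    """Relative change of a docking score, in percent of the original score."""
    return 100.0 * (after - before) / before


def implied_score(before, increase_pct):
    """Docking score that corresponds to a given percentage increase."""
    return before * (1.0 + increase_pct / 100.0)


for ligand, before in BASELINE_SCORE.items():
    after = implied_score(before, MAX_INCREASE_PCT[ligand])
    print(f"{ligand}: {before:.2f} -> {after:.2f} "
          f"({percent_change(before, after):.2f}% increase)")
```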
Changes in the fluorescence intensity of PAEs with BSA before and after modification
To further explore the influence of the novel modified BSAs on the fluorescence detection of the target PAEs, Gaussian 09 software was used to calculate the fluorescence spectrum intensity of the target DMP, DBP and DNOP molecules combined with the novel modified BSAs, using the TD-DFT method at the B3LYP/6-31G(d) level (Qu et al., 2016). The calculated results are listed in Tab. 3.
After the target DMP, DBP and DNOP molecules were bound to BSA (PDB ID: 3V03), the fluorescence intensity of the target PAEs was found to be weak (40.44, 32.35 and 28.31, respectively; Tab. 3). Compared with the fluorescence intensity of the PAEs after binding to the original BSA, the fluorescence intensity of all DMP, DBP and DNOP molecules increased after binding to the novel BSAs. The fluorescence spectral intensities of DMP, DBP and DNOP molecules were increased by 2.10-2.80, [...] instead of Leu and Arg198 instead of Leu) had the most significant improvement in the fluorescence spectrum intensity of the target PAEs. Compared with DMP and DNOP, the novel BSAs had a more significant effect on the fluorescence intensity of DBP molecules.
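The relative size of the effect on each PAE can be made explicit with a short Python sketch. It combines the baseline intensities reported above with the maximum fold increases quoted in the abstract (2.8, 104.51 and 62.43), and it assumes that "X-fold" means the ratio of the modified to the original intensity, which the text does not define. The implied intensities are back-calculated for illustration and are not values read from Tab. 3.

```python
BASELINE_INTENSITY = {"DMP": 40.44, "DBP": 32.35, "DNOP": 28.31}  # original BSA
MAX_FOLD = {"DMP": 2.80, "DBP": 104.51, "DNOP": 62.43}            # from the abstract


def fold_change(before, after):
    """Ratio of the intensity after modification to the intensity before it."""
    return after / before


def implied_intensity(before, fold):
    """Intensity that corresponds to a given fold increase (ratio interpretation)."""
    return before * fold


# Rank the PAEs by the largest fold increase achieved with any modified BSA.
ranking = sorted(MAX_FOLD, key=MAX_FOLD.get, reverse=True)

for ligand in ranking:
    before = BASELINE_INTENSITY[ligand]
    after = implied_intensity(before, MAX_FOLD[ligand])
    print(f"{ligand}: {before:.2f} -> {after:.2f} ({fold_change(before, after):.2f}-fold)")
```

Under this interpretation DBP ranks first, in line with the statement that the novel BSAs affect the fluorescence intensity of DBP most strongly.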

Discussion
In this paper, BSA was theoretically modified to achieve high detection sensitivity for PAEs, and the influence of the novel modified BSAs on the fluorescence intensity of PAEs was analysed. The results showed that the binding affinity of the eight novel modified BSAs for the target DMP, DBP and DNOP was improved to varying degrees. The modification in which Arg198 at the active site was replaced with Val had a particularly marked effect. Three modified BSAs (BSA-7, BSA-8 and BSA-4) showed the most significant improvement in the fluorescence spectrum intensity of the target DMP, DBP and DNOP. The novel BSAs had a more pronounced effect on the fluorescence intensity values of DBP. This theoretical verification showed that the modified BSAs designed using this method satisfy the aim of achieving high detection sensitivity for PAEs and could provide a theoretical basis for the design of modified BSAs with high detection sensitivity for PAEs.