Crystal Structure of MpPR-1i, a SCP/TAPS protein from Moniliophthora perniciosa, the fungus that causes Witches’ Broom Disease of Cacao

The pathogenic fungus Moniliophthora perniciosa causes Witches’ Broom Disease (WBD) of cacao. The structure of MpPR-1i, a protein expressed by M. perniciosa when it infects cacao, is presented. This is the first reported de novo structure determined by single-wavelength anomalous dispersion phasing upon soaking with selenourea. Each monomer has flexible loop regions linking the core alpha-beta-alpha sandwich topology; these loops comprise ~50% of the structure, making it difficult to generate an accurate homology model of the protein. MpPR-1i is monomeric in solution but packs as a crystallographic heptamer with a high solvent content of ~70%. The greatest conformational flexibility between monomers is found in loops exposed to the solvent channel that connect the two longest strands. MpPR-1i lacks the conserved CAP tetrad and is incapable of binding divalent cations. MpPR-1i has the ability to bind lipids, which may have roles in its infection of cacao. These lipids likely bind in the palmitate binding cavity as observed in tablysin-15, since MpPR-1i binds palmitate with an affinity comparable to that of tablysin-15. Further studies are required to clarify the possible roles and underlying mechanisms of neutral lipid binding, as well as their effects on the pathogenesis of M. perniciosa, so as to develop new interventions for WBD.

In addition to lipid binding motifs, SCP/TAPS proteins are characterized by a large central CAP cavity, as large as 1638 Å³ in the case of Pry1 26. Early studies of SCP/TAPS proteins indicated that the central CAP cavity contained a tetrad of residues, two His and two Glu, that bind divalent cations including Zn²⁺ and Mg²⁺ 15,22,29,30. The tetrad was shown to be important for Zn²⁺ binding and heparin-sulfate dependent inflammatory modulation mechanisms of cobra CRISP natrin 29. The tetrad residues are contributed by four poorly conserved CAP motifs defined by Gibbs and colleagues 22. Additionally, the CAP cavity is independent of the lipid cavities and is not connected to them within the monomer. A crystallographic dimer is formed in the Pry1 crystal structure in which the central CAP cavity is connected to the caveolin binding motif (CBM) 26. It remains unknown if this crystallographic dimer has any functional roles 26. Furthermore, the CAP tetrad is not required for sterol transport because SmVAL4, a CAP protein lacking the tetrad, is able to effectively transport sterol in vivo and bind sterol in vitro 31. Additionally, mutating the tetrad did not reduce the ability of Pry1 to bind and transport sterols 27. These studies indicate that SCP/TAPS proteins have independent lipid and cation binding functions.
Despite having a conserved alpha-beta-alpha sandwich topology, SCP/TAPS proteins are ~50% loops, which makes it difficult to predict their structures 13,26,31,32. We present in this report the structure of MpPR-1i, a SCP/TAPS protein expressed by M. perniciosa during the biotrophic stage of WBD, in basidiomes, and in monokaryotic mycelia 33. MpPR-1i has less than 25% sequence identity with any of the structures in the Protein Data Bank, which hampered efforts at solving the structure using molecular replacement. The crystal structure of MpPR-1i was determined using the selenourea (SeUrea) soaking method to solve the phase problem 34. This is the first de novo structure determined by SeUrea phasing.
Using TLC analysis, a neutral saturated lipid was found bound to recombinant MpPR-1i (Figure S.6). Attempts at identifying the lipid by mass spectrometry failed, likely due to experimental limitations related to the ionization of neutral lipids, as was previously observed in studies of HIF-3α, where the authors identified the nature of the phospholipids but were unable to identify neutral lipids 35. Interestingly, the crystal structure of MpPR-1i did not reveal any electron density for bound lipid, which is not unusual considering the low resolution of the structure and could also result from the crystallization agents outcompeting the lipid or from the conformational flexibility of the lipid. The lipid identified by TLC was acquired during recombinant production in E. coli and may not be the same lipid that MpPR-1i binds endogenously when M. perniciosa infects cacao. Future studies beyond the scope of this manuscript include identifying the major lipids secreted during this infective process and determining if MpPR-1i is capable of binding to them.
Structure Determination. All attempts at molecular replacement failed, which was not unexpected since MpPR-1i shares less than 25% sequence identity with any known structure. Despite the large number of sulfur atoms, attempts at single-wavelength anomalous phasing using the S signal (S-SAD) failed. Single-wavelength Se anomalous data were collected to 2.9 Å resolution after soaking a single crystal with SeUrea, and nine SeUrea binding sites were identified. Using these phases, 1225 amino acid residues corresponding to seven monomers were built into the asymmetric unit (Table 1). In the refined model, six SeUrea molecules are located at the interface of adjacent monomers, while three occupy relatively weak binding sites (Figure S.7). SeUrea interacts with the carboxyl group from the side chain of Gln68 and the main chain of Val122 through hydrogen bonds (Figure S.7).
The structure was refined and extended to higher resolution, using a 2.43 Å native data set. Coordinates and structure factors for both models have been deposited in the Protein Data Bank under accession numbers 5V50 (native) and 5V51 (SeUrea).
Overall Structure of MpPR-1i. Each monomer of MpPR-1i has a conserved alpha-beta-alpha sandwich topology made up of 3 β-strands sandwiched between two helical domains, connected by loops (Fig. 1a). One of these loops connects the two longest β-strands, extends out from the core structure, and is exposed to the solvent channel in the crystal. There are seven monomers in the asymmetric unit, which form a pseudo-sevenfold screw axis when viewed along the diagonal of the cell (Fig. 1b, c and d). The MpPR-1i crystal has a very high solvent content, ~70%, which is clearly demonstrated by the solvent channel in the crystal packing viewed along a cell dimension (Fig. 1c).
The main chains of the MpPR-1i monomers are very similar, with rmsd values ranging between 0.19 Å and 0.27 Å. The most variable regions between the monomers are loop regions, notably the solvent-exposed loop connecting the two longest β-strands, as well as the N- and C-terminal loops (Fig. 2a). The amino termini of six monomers have the same orientation, while one (labeled monomer B) has a different orientation (Fig. 2a). While six monomers have conserved C-terminal loops, the main and side chains of the residues starting from Leu155 in the carboxyl terminus of one (labeled monomer C) are flipped into the opposite conformation from the other monomers. Notably, residues Tyr158 and Tyr159 in monomer C are oriented 90° away from what is observed in the other monomers (Fig. 2a and b). The interface between adjacent monomers appears to be crucial for crystal packing and has a buried surface area of ~800 Å² per monomer. None of the intermolecular contacts between monomers have more than 8 hydrogen bonds, and the majority of the residues at the monomer interface are hydrophobic, as illustrated by the interface between monomers A and B (Fig. 2c and d).
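As a rough consistency check on the ~70% solvent content, the Matthews relation can be inverted. The minimal sketch below assumes the common approximation solvent fraction ≈ 1 − 1.23/V_M (which presumes a typical protein partial specific volume of about 0.74 cm³/g), a monomer mass of ~17 kDa as estimated in this report from gels and the sizing column, seven monomers per asymmetric unit, and two asymmetric units per cell for space group P2₁. The function names are illustrative, and the resulting cell volume is only an implied estimate, not a value taken from Table 1.

```python
def matthews_from_solvent(solvent_fraction):
    """Invert solvent = 1 - 1.23 / V_M; returns V_M in cubic angstroms per dalton."""
    return 1.23 / (1.0 - solvent_fraction)


def solvent_from_matthews(v_m):
    """Approximate solvent fraction from the Matthews coefficient V_M."""
    return 1.0 - 1.23 / v_m


def implied_cell_volume(v_m, monomer_mass_da, monomers_per_asu, asu_per_cell):
    """Unit-cell volume (cubic angstroms) implied by V_M and the cell contents."""
    return v_m * monomer_mass_da * monomers_per_asu * asu_per_cell


# Assumed inputs: ~70% solvent, ~17 kDa monomer, 7 monomers per ASU, 2 ASU per P2(1) cell.
v_m = matthews_from_solvent(0.70)
cell_volume = implied_cell_volume(v_m, 17_000.0, 7, 2)

print(f"Matthews coefficient: {v_m:.2f} A^3/Da")
print(f"Implied unit-cell volume: {cell_volume:,.0f} A^3")
```

A Matthews coefficient near 4 Å³/Da is well above the range typical of protein crystals, which is consistent with the unusually open packing described above.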
Central CAP cavity. Like other reported SCP/TAPS protein structures, MpPR-1i has a large central CAP cavity (Fig. 3a,b) 13,15,22,29,36-38. The volume of the CAP cavity of MpPR-1i is 1334.39 Å³, which is comparable to the large size previously observed in Pry1 at 1638 Å³. In many CAP proteins, the central CAP cavity contains a tetrad formed by residues from four signature CAP motifs: His from CAP1, Glu from CAP2, His from CAP3, and Glu from CAP4. These tetrad residues bind divalent cations including Zn²⁺ and Mg²⁺ (Fig. 3c,d) 13,15,16,21,22,24,29,30,39. MpPR-1i, like SmVAL4, lacks the tetrad that binds divalent cations in other SCP/TAPS proteins 31 (Figs 3 and 4). This explains why MpPR-1i does not bind the Zn²⁺ used in the crystallization solution. It remains unknown why some SCP/TAPS proteins have the conserved tetrad while others do not; however, the absence of the tetrad in MpPR-1i means that it lacks the ability to bind divalent cations and will not be involved in heparin-sulfate dependent inflammatory modulation mechanisms like natrin 29.
[...] (Fig. 5a and b). The binding affinity of MpPR-1i for palmitate was determined using our established in vitro lipid-binding assay 27, and this analysis showed that MpPR-1i binds palmitic acid. The estimated equilibrium dissociation constant (Kd) for MpPR-1i is 107 μM, which is comparable to that of tablysin-15, with a Kd of 94 μM 36 (Fig. 5c).
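To illustrate what the two dissociation constants imply, the sketch below evaluates a simple one-site binding model, fraction bound = [L]/(Kd + [L]). The model and the free-ligand concentrations are assumptions chosen for illustration; this report does not state which model was fitted to the data in Fig. 5c.

```python
def fraction_bound(free_ligand_um, kd_um):
    """Fractional occupancy of a single binding site at a given free ligand concentration."""
    return free_ligand_um / (kd_um + free_ligand_um)


# Dissociation constants quoted in the text (micromolar).
KD_UM = {"MpPR-1i": 107.0, "tablysin-15": 94.0}

# Hypothetical free palmitate concentrations (micromolar), for illustration only.
for ligand_um in (10.0, 100.0, 1000.0):
    occupancies = {
        name: fraction_bound(ligand_um, kd) for name, kd in KD_UM.items()
    }
    summary = ", ".join(f"{name}: {occ:.2f}" for name, occ in occupancies.items())
    print(f"[L] = {ligand_um:7.1f} uM -> {summary}")
```

Under this model the two proteins differ by only a few percentage points in occupancy at any given ligand concentration, which is the sense in which the affinities are comparable.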

Discussion
Selenourea phasing. All attempts at molecular replacement failed regardless of the search model used, so we tried phasing by anomalous diffraction. Although the crystallization condition contains zinc acetate, no anomalous signal for Zn²⁺ ions was observed in any of the data sets, which was expected since MpPR-1i lacks the CAP tetrad. SeUrea soaking provided sufficient anomalous signal to phase the crystal structure of MpPR-1i. The low-resolution SAD data at 2.9 Å have enough anomalous signal to locate the Se atoms, and enough reflections to build the whole model even without native data. This approach enables the use of SeUrea quantitatively and can be adapted for phasing other structures. As previously discussed, SeUrea does not form a stable aqueous solution, so a reducing agent like sodium sulfite (Na₂SO₃) or TCEP is added to slow down the oxidation of SeUrea 34. The stability of SeUrea was improved by using a higher concentration of Na₂SO₃ to prepare the 1 M SeUrea/Na₂SO₃ solution, allowing the stock solution to be stored at −20 °C for several months.
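The contrast between the S and Se anomalous signals can be put in rough numbers with the Hendrickson-Teeter approximation, ⟨|ΔF|⟩/⟨|F|⟩ ≈ √(2·N_A/N_P) · f″/Z_eff. The sketch below uses the atom counts given in this report (42 S atoms and nine SeUrea sites among 18,732 atoms). Everything else is an assumption made for illustration: Z_eff ≈ 6.7 for an average protein atom, approximate f″ values near the 0.978 Å data-collection wavelength (about 0.24 e for S and about 3.8 e for Se), and full occupancy of all Se sites, which overstates the Se signal given that occupancies were refined.

```python
import math

Z_EFF = 6.7  # assumed effective scattering (electrons) of an average protein atom


def bijvoet_ratio(n_anomalous, n_total, f_double_prime, z_eff=Z_EFF):
    """Expected <|dF|>/<|F|> at zero scattering angle (Hendrickson-Teeter estimate)."""
    return math.sqrt(2.0 * n_anomalous / n_total) * f_double_prime / z_eff


N_TOTAL = 18_732  # total atoms in the heptamer, as quoted in the text

# Approximate f'' values near 0.978 A; assumed here, not taken from the source.
s_ratio = bijvoet_ratio(42, N_TOTAL, 0.24)
se_ratio = bijvoet_ratio(9, N_TOTAL, 3.8)

print(f"S-SAD expected signal:  {100 * s_ratio:.2f}%")
print(f"Se-SAD expected signal: {100 * se_ratio:.2f}%")
print(f"Se/S signal ratio:      {se_ratio / s_ratio:.1f}")
```

Under these assumptions the expected S signal is a fraction of a percent while the Se signal is several-fold larger, in line with S-SAD failing where SeUrea soaking succeeded.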
Oligomerization of MpPR-1i. MpPR-1i forms a unique crystallographic heptamer, which likely does not have any functional relevance as MpPR-1i forms monomers in solution. Evidence supporting the monomer includes DLS revealing a MW of ~20 kDa, the absence of dimerization peaks in MS, the similar molecular mass of ~17 kDa on both reduced and non-reduced gels, and the elution of the protein from a sizing column as a sharp peak with a molecular mass of ~17 kDa. The formation of both monomers and dimers has been previously reported for other SCP/TAPS proteins. While some, like Na-ASP-2, GLIPR-1, and Pry1, form dimers in solution, others, like SmVAL-4, form monomers 13,15,26,31. Interestingly, none of the dimers formed within the heptamer are similar to the packing of the two-CAP Na-ASP-1 or to the dimer in Pry1 that connects the CAP cavity 14,26. While the formation of the crystallographic heptamer has no apparent functional relevance, it explains the failure of phasing by S-SAD, because the heptamer only has 42 S atoms out of 18,732 total atoms, which gives a weak anomalous S signal compared to the strong Se signal from SeUrea soaking.
Comparison of MpPR-1i with other SCP/TAPS proteins. Using PDBFold, the most similar structures to MpPR-1i were identified as the apo structure of human Golgi-associated PR-1 protein GAPR-1 16,24, Pry1 from yeast 26, SmVAL4 from Schistosoma mansoni 31, the NMR structure of a plant P14a 17, and the structures of human glioma pathogenesis related protein (sGLIPR1) 15. MpPR-1i shares 19.4%, 24.2%, 20.8%, 24.3% and 20.2% sequence identity with these proteins, respectively. While the core alpha-beta-alpha sandwich topology is conserved, MpPR-1i has different loop regions as well as helix and strand lengths compared to the other structures (Fig. 4). The regions of greatest flexibility have been implicated in ligand binding and make up ~40% of the structure.
Interestingly, the caveolin binding motif (CBM) loop, which has been implicated in cholesterol binding in Pry1, is significantly shorter in MpPR-1i than in other CAP proteins (Fig. 4). The shortened CBM loop significantly reduces the size of the sterol binding cavity, rendering it barely large enough to accommodate dioxane and definitely too small to accommodate cholesterol (Fig. 3). Thus, the structural data strongly suggest that MpPR-1i will lack the ability to bind cholesterol. The implications of the small CBM for sterol binding by MpPR-1i are currently being investigated in vivo and in vitro, and the results will be published elsewhere.

Lipid binding function of MpPR-1i.
MpPR-1i gene expression was detected in monokaryotic mycelia, basidiomata, and especially in the green broom stage of the disease 33, which suggests participation in fungal pathogenesis. The observation that MpPR-1i binds to a neutral lipid suggests that it can accommodate fatty acids in its large open palmitate binding cavity between α-helices 1 and 4 (Fig. 6a and b), as observed in SmVAL4 and tablysin-15 31,38. Tablysin-15 is a protein present in the saliva of the horsefly Tabanus yao, which scavenges cysteinyl leukotriene, an eicosanoid lipid that promotes the inflammatory response 38. During plant infection, lipolytic enzymes target host cellular membranes, releasing free fatty acids, such as oxylipins, that have roles in plant immunity 40. Indeed, the binding affinity measured in our established in vitro lipid binding assay was comparable to that previously observed for tablysin-15 28. Therefore, MpPR-1i could act similarly to tablysin-15, sequestering lipids that potentiate the plant defense response. Further studies are needed to determine the binding of MpPR-1i to free fatty acids that are important in plant immunity.

Conclusions
The structure of MpPR-1i was determined by SeUrea phasing. This is the first de novo structure determined using this phasing technique and reveals the applicability of this method to a new structure with >70% solvent content. MpPR-1i is a compact CAP protein that is a monomer in solution but is packed as a high solvent content crystallographic heptamer. The loops connecting the two longest strands are exposed to the solvent channel and exhibit the largest inter-monomer conformational flexibility. MpPR-1i retains the palmitate binding cavity, while the sterol binding CBM cavity is smaller than previously observed in other SCP/TAPS proteins. Future studies include assessing the mechanisms of lipid binding by MpPR-1i.
Mass Spectrometry. Lyophilized protein was reconstituted by addition of water and 5% acetonitrile prior to mass spectrometry (MS) analysis using an Impact II QTOF mass spectrometer (Bruker Daltonics), equipped with a Qtof Control and Electrospray source. MS spectra were acquired in positive ion mode using water, 5% acetonitrile, and 0.1% formic acid. Instrument parameters were set as follows: nebulizer gas (nitrogen) pressure, 2 bar; capillary voltage, 4500 V; ion source temperature, 180 °C; dry gas flow, 9 L min⁻¹; spectra acquisition range, m/z 300-2000.
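For orientation, the sketch below lists which protonation states of a ~17 kDa protein would fall inside the m/z 300-2000 acquisition window in positive ion mode, using m/z = (M + z·m_p)/z. The neutral mass of 17,000 Da is a round-number assumption based on the ~17 kDa estimates in this report, and the function names are illustrative. The window only bounds which charge states are observable; it does not predict which ones electrospray actually produces.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mz(neutral_mass_da, charge):
    """m/z of the [M + zH]z+ ion for a given neutral mass and charge."""
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def charge_states_in_window(neutral_mass_da, mz_min, mz_max, max_charge=100):
    """Charge states whose m/z lies within the acquisition window."""
    return [
        z
        for z in range(1, max_charge + 1)
        if mz_min <= mz(neutral_mass_da, z) <= mz_max
    ]


# Assumed neutral mass of ~17 kDa; acquisition window from the instrument settings.
states = charge_states_in_window(17_000.0, 300.0, 2000.0)
print(f"Observable charge states: {states[0]}+ to {states[-1]}+")
print(f"m/z at {states[0]}+: {mz(17_000.0, states[0]):.1f}")
```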

Recombinant protein expression and purification of
Crystallization and selenourea soaking. Lyophilized
In vitro palmitate binding assay. The radioligand binding assay was performed as described previously 43,44. Purified protein (100 pmol) in binding buffer (20 mM Tris, pH 7.5, 30 mM NaCl, 0.05% Triton X-100) was incubated with [³H]-palmitic acid (100-400 pmol) for 1 h at 30 °C. Protein was separated from unbound ligand by adsorption to Q-sepharose beads (GE Healthcare, USA), the beads were washed, and the protein-bound radioligand was quantified by scintillation counting. To determine non-specific binding, the binding assay was performed without the addition of the protein.
Data Collection and Structure Determination. Synchrotron X-ray diffraction data were collected at a wavelength of 0.978 Å on the Southeast Regional Collaborative Access Team (SER-CAT) 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory, USA. Data sets were processed with HKL2000 45 in space group P2₁ with the "auto-correction" option turned during scaling. The best SeUrea-soaked crystals diffracted to 2.9 Å, while the best native crystals diffracted to 2.43 Å. Attempts to solve the crystal structure of MpPR-1i by molecular replacement by submitting both data sets to the BALBES online server failed 46. Parallel attempts at phasing using multiple MR search models, truncated CAP proteins, and polyalanine models 13-15 with PHASER 47,48 were also unsuccessful. The phenix.anomalous_signal tool in the PHENIX package was used to estimate the correlation coefficient for the anomalous data set processed without merging Friedel pairs 49,50. The correlation coefficient for the anomalous data set (CC ano) at different resolutions is shown in Figure S.8. SHELXD was used to find the substructure of the anomalous data and identified six Se atoms 51; however, attempts to build the polyalanine model using SHELXE failed, even with the relatively higher resolution native data. After switching to Phenix.Autosol for phasing and model building with Phenix.Autobuild, an initial model with R = 0.37 and R free = 0.41 was obtained, indicating that the correct solution was found 52. Buccaneer was adopted for further model building, which resulted in an 88% complete model with R = 0.29 and 984 residues assigned into seven chains. The highest quality single chain was extracted and used as the molecular replacement model in PHASER 53 to generate a more complete model. The SeUrea binding sites were cross-validated against the anomalous difference map and the heavy-atom sites found by Phenix.Autosol, and then incorporated into the model using Coot 53.
Thereafter, the structure was iteratively manually adjusted in Coot and refined using REFMAC5 54,55 and PHENIX 52 . The occupancies of SeUrea molecules were also refined. Data collection and structure refinement statistics are listed in Table 1.
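The R and R free values quoted in this section follow the conventional crystallographic residual, R = Σ| |Fobs| − k|Fcalc| | / Σ|Fobs|, with R free computed in the same way over reflections excluded from refinement. The sketch below is a minimal illustration of that formula only. The amplitudes are made-up toy values, and the single overall scale factor k (a ratio of sums) is a simplifying assumption; refinement programs such as REFMAC5 and PHENIX apply more elaborate scaling.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R factor with a single overall scale factor on f_calc."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("f_obs and f_calc must be non-empty and of equal length")
    scale = sum(f_obs) / sum(f_calc)  # simple overall scale k (an assumption)
    residual = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return residual / sum(f_obs)


# Hypothetical structure-factor amplitudes, for illustration only.
toy_f_obs = [100.0, 80.0, 60.0, 40.0]
toy_f_calc = [90.0, 85.0, 50.0, 45.0]

toy_r = r_factor(toy_f_obs, toy_f_calc)
print(f"R = {toy_r:.3f}")
```

Applying the same function to a held-out subset of reflections gives the corresponding R free; a gap between the two, as in the initial model above (0.37 versus 0.41), is expected because the held-out reflections are not used to fit the model.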