SAXS/WAXS data of conformationally flexible ribose binding protein

Modern artificial intelligence-based protein structure prediction methods, such as AlphaFold2, can predict structures of folded proteins with reasonable accuracy. However, AlphaFold2 provides a static view of a protein, which does not show the conformational variability of the protein, domain movement in a multi-domain protein, or ligand-induced conformational changes it might undergo in solution. Small-angle X-ray scattering (SAXS) and wide-angle X-ray scattering (WAXS) are solution techniques that can aid in integrative modeling of conformationally flexible proteins, or in validating their predicted ensemble structures. While SAXS is sensitive to global structural features, WAXS can expand the scope of structural modeling by including information about local structural changes. We present SAXS and WAXS datasets obtained from the conformationally flexible D-ribose binding protein (RBP) from Escherichia coli in the ribose-bound and unbound forms. The SAXS/WAXS datasets of RBP provided here may aid method development efforts for more accurate prediction of the structural ensembles of conformationally flexible proteins and of their conformational changes.

• Since very few WAXS datasets are available in the public data repositories, the present datasets can aid in method development for accurate calculation of WAXS data from coordinates of the structures [14].

Data Description
RBP, the periplasmic ribose-binding component of the bacterial ribose-specific ABC transporter, is a flexible two-domain protein with a hinge adjacent to the sugar-binding cleft at the domain interface [4-7]. A number of crystal structures of RBP have been determined previously in both sugar-bound and free forms [4-7] (Fig. 1). Since RBP is well characterized, SAXS and WAXS datasets of RBP in free and ribose-bound forms provide a valuable resource for method developers.

SAXS data were collected for both ligand-bound and unbound RBP at three different protein concentrations (Fig. 2A; Table 1). Guinier analysis was performed to estimate the radius of gyration (Rg) and evaluate data quality (Fig. 2B; Table 2). The average Rg for RBP was 2.10 ± 0.01 nm. In the presence of ribose, the average Rg for RBP was 2.05 ± 0.02 nm. RBP has been characterized by SAXS before, showing a similar, small change in Rg between the sugar-bound and sugar-free states [15]. Maximum particle diameters (Dmax) were estimated from the pair distribution function. A modest difference was observed between the average Dmax of RBP (8.43 ± 0.15 nm) and that of RBP bound to ribose (7.76 ± 0.47 nm) (Fig. 2C; Table 2). Comparison of the WAXS data obtained from RBP and RBP-ribose showed the expected differences, consistent with a conformational change between the two states (Fig. 3).
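The Guinier analysis mentioned above can be illustrated with a short sketch. It is not the ATSAS procedure used for the deposited data; it is a plain least-squares fit of ln I(q) against q², assuming the standard Guinier approximation I(q) ≈ I(0)·exp(−q²Rg²/3). The function name guinier_fit, the q_max cut-off and the synthetic curve are illustrative assumptions, and the commonly used validity limit q·Rg ≤ 1.3 for globular particles has to be checked by the user.

```python
import math

def guinier_fit(q, intensity, q_max):
    """Fit ln I(q) = ln I(0) - (Rg**2 / 3) * q**2 over points with q <= q_max.

    q is assumed to be in nm^-1, so Rg is returned in nm. Returns (Rg, I0).
    """
    # Keep only positive intensities inside the chosen low-q window.
    pts = [(qi * qi, math.log(ii)) for qi, ii in zip(q, intensity)
           if qi <= q_max and ii > 0]
    if len(pts) < 3:
        raise ValueError("need at least three points in the Guinier region")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("non-negative slope: no valid Guinier region")
    rg = math.sqrt(-3.0 * slope)             # slope = -Rg^2 / 3
    i0 = math.exp(mean_y - slope * mean_x)   # intercept = ln I(0)
    return rg, i0

# Synthetic, noise-free curve (illustrative values, not the deposited data).
q = [0.05 + 0.01 * k for k in range(60)]
intensity = [100.0 * math.exp(-(qi * 2.10) ** 2 / 3.0) for qi in q]
rg, i0 = guinier_fit(q, intensity, q_max=0.6)
print(f"Rg = {rg:.2f} nm, I(0) = {i0:.1f}, q_max*Rg = {0.6 * rg:.2f}")
```

With real data the fit window is usually adjusted until q_max·Rg stays within the validity limit and the residuals show no systematic trend, which is what the residual plots in Fig. 2B are meant to convey.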

Sample preparation
A commercially synthesized rbsB gene cloned in the pET-15b vector (GenScript) was used for expression in E. coli BL21(DE3) cells. After cell lysis by sonication and centrifugation, the supernatant was used for a two-step purification using a Ni-affinity column (IMAC Sepharose, GE Healthcare) and size-exclusion column chromatography (16/60 Superdex 75, GE Healthcare). Purified protein in buffer A (50 mM Tris pH 7.0, 50 mM NaCl and 10% (v/v) glycerol) was further concentrated. Protein concentrations were determined using the Pierce™ BCA Protein Assay Kit (Thermo Scientific) following the manufacturer's protocol. The samples, and background buffers from the size-exclusion column runs, were frozen in liquid nitrogen and shipped to the synchrotron beamlines for data collection. For RBP-ribose sample preparation, the protein sample and background buffer were spiked with D-ribose.

Data collection and analysis
SAXS data were collected at beamline BM29, European Synchrotron Radiation Facility (ESRF), Grenoble, France [16]. A concentration series experiment was performed at three different protein concentrations (3 mg/mL, 6 mg/mL and 9 mg/mL) for both unliganded and liganded RBP so that the concentration dependence could be assessed. To minimize radiation damage, protein samples were continuously moved during X-ray exposure. Azimuthal averaging, frame averaging (up to 10 frames) and buffer subtraction were performed for each dataset using the beamline software [17,18]. SAXS datasets were further analysed using ATSAS version 3.0.1 [19]. Samples for the WAXS experiment were prepared in the same way as described above. WAXS data were collected at beamline BL-15A2, Photon Factory, Tsukuba, Japan [20]. Since the WAXS signal is weak, up to 60 frames were collected and combined in SAngler [21].

Fig. 2. SAXS data of RBP (left panel) and ribose-bound RBP (right panel). (A) Log I(q) versus q plots (I is intensity in arbitrary units and q is momentum transfer in nm⁻¹, q = 4π sin(θ)/λ, where λ is the X-ray wavelength and 2θ is the scattering angle) at three different protein concentrations. (B) Guinier plots (ln I(q) versus q²) and corresponding residual plots of the SAXS data shown in A. The Guinier plots are shifted on the y-axis for better visualization. (C) Pair distribution function (P(r)) versus r (pair-wise distance in nm) plot for RBP with and without ribose, calculated using the following boundary conditions: P(r) = 0 at r = 0 and at r ≥ Dmax, where Dmax is the estimated maximum particle diameter (Table 2). The pair distribution function plot is normalized to a maximum value of 1 on the y-axis.
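The frame averaging and buffer subtraction steps described in this section can be sketched as follows. This is a minimal illustration and not the algorithm of the beamline software or of SAngler: it assumes an unweighted mean over frames recorded on an identical q grid, uncorrelated errors propagated in quadrature, and no frame rejection or scaling. The function names average_frames and subtract_buffer and all numerical values are hypothetical.

```python
import math

def average_frames(frames):
    """Unweighted mean of frames given as (intensity_list, error_list) pairs.

    All frames are assumed to share the same q grid. The error of the mean
    is propagated in quadrature, assuming uncorrelated frame errors.
    """
    n = len(frames)
    npts = len(frames[0][0])
    mean = [sum(f[0][k] for f in frames) / n for k in range(npts)]
    err = [math.sqrt(sum(f[1][k] ** 2 for f in frames)) / n
           for k in range(npts)]
    return mean, err

def subtract_buffer(sample, sample_err, buffer, buffer_err):
    """Subtract the buffer curve from the sample curve point by point."""
    diff = [s - b for s, b in zip(sample, buffer)]
    err = [math.sqrt(es ** 2 + eb ** 2)
           for es, eb in zip(sample_err, buffer_err)]
    return diff, err

# Two made-up frames with two q points each (illustrative values only).
frames = [([10.0, 20.0], [1.0, 1.0]), ([12.0, 22.0], [1.0, 1.0])]
mean, err = average_frames(frames)
sub, sub_err = subtract_buffer(mean, err, [1.0, 1.0], [0.0, 0.0])
print(mean, sub)
```

Averaging N frames with comparable errors reduces the error of the mean by roughly a factor of √N, which is consistent with collecting more frames for the weaker WAXS signal than for SAXS.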
All dataset files (accessible from SASBDB [1], see Specifications Table) contain three columns: momentum transfer (q), intensity (I) and the error associated with the intensity. Data collection parameters are provided in Table 1. Differences observed in size parameters upon ribose binding to RBP are provided in Table 3.
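A three-column file of this kind can be read with a few lines of code. The sketch below is an assumption about the file layout rather than a description of the SASBDB format: it treats any line whose first three whitespace-separated fields are numeric as a data row and skips everything else (headers, footers, blank lines). The function name read_three_column, the example lines and the file name in the comment are hypothetical.

```python
def read_three_column(lines):
    """Parse q, I and sigma(I) from an iterable of text lines.

    Lines that do not start with three numeric fields are skipped.
    """
    q, intensity, sigma = [], [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            qi, ii, si = (float(f) for f in fields[:3])
        except ValueError:
            continue  # header or other non-numeric line
        q.append(qi)
        intensity.append(ii)
        sigma.append(si)
    return q, intensity, sigma

# Made-up lines standing in for a downloaded file (illustrative values only).
example = ["# sample description", "0.05 100.0 1.5", "", "0.06 98.2 1.4"]
q, intensity, sigma = read_three_column(example)
print(len(q), q[0], intensity[0], sigma[0])

# With a real file (hypothetical name), pass the open file handle instead:
# with open("rbp_saxs.dat") as handle:
#     q, intensity, sigma = read_three_column(handle)
```

The unit of q in a given file should be checked against Table 1 or the SASBDB entry before the data are compared with values calculated from atomic coordinates.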

Table 1
Experimental details for (a) SAXS and (b) WAXS experiments. The table contains the detailed parameters used for data collection and data processing.

Table 2
SAXS data analysis. SAXS data were analysed using ATSAS version 3.0.1. The table shows the values of Guinier Rg, real-space Rg, Dmax and the qRg range used for Guinier analysis for the RBP SAXS datasets.

Table 3
Comparison of sizes obtained from SAXS data of RBP in the presence and absence of ribose. The values of Rg and Dmax observed at three different concentrations were averaged and are shown here.