Analyzing the weak dimerization of a cellulose binding module by analytical ultracentrifugation



Introduction
Cellulose binding modules (CBMs) are proteins that are found as parts of many different enzymes and other proteins that interact with cellulose. Their role is to bind to cellulose, and they typically do so through surface-exposed aromatic side chains. It is well known that CBMs play a crucial role in cellulose degradation and in the function of enzymes [1]. There are over 80 different families of functionally related, but structurally different, proteins that bind to carbohydrates. They form an example of convergent evolution, with different protein scaffolds achieving a similar binding function [2]. The details of these binding mechanisms differ between the individual families, although they share some common elements such as aromatic ring-pyranose stacking as part of the binding interface.
CBMs have been utilized in a large number of biotechnical applications because they provide an easy way to control the binding of proteins to cellulose. In particular, the ability to use recombinant DNA technology to design new engineered proteins with CBMs attached has led to a wide range of applications. For example, the activity of enzymes can be enhanced by increasing their binding to cellulose through CBMs. In one study, swapping between different types of CBMs was used to either enhance or inhibit the activity of lytic polysaccharide monooxygenases (LPMOs) [3]. The activity of hydrolases has also been increased by fusing the enzymes to CBMs [4]. Immobilization to cellulose is another application that has been explored. CBMs can be used as a high-capacity purification tag, a targeting molecule, or an affinity tag for enzyme immobilization and processing [5]. In addition, CBMs can be used to increase the rate of catalysis by creating an association between biocatalysts and substrates [6,7]. Other applications include cell immobilization [8], diagnostics, and even the construction of protective devices against nerve gas [9].
Because CBMs attach to cellulose surfaces, they can also be used to modify the properties of cellulose-based materials. An example of this is improving the mechanical properties of paper. A protein constructed to contain two CBMs at each end of a linker resulted in paper with higher folding endurance [10]. In another example, a recombinant silk-protein-based adhesive for cellulose depended on the binding of CBMs [11].
Here we focus on the CBM CipA from the Clostridium thermocellum cellulosome anchoring protein CipA. CBM CipA belongs to family III according to the CAZy classification [2]. CBM CipA has a nine-stranded β-sandwich structure with a jellyroll topology and belongs to the surface-binding type of CBMs (type A). It has a face that contains a planar linear strip of aromatic and polar residues, which participates in the interaction with crystalline cellulose [12]. CBM CipA binds very strongly to crystalline cellulose, with a dissociation constant (K_D) of 0.6 μM [13-15]. This is an order of magnitude higher affinity than that of, for example, the fungal family I CBMs [16]. Because of this high affinity to cellulose, CBM CipA has great practical value in applications. Because of its wide use, and because multimerization could influence the use of the protein in applications, we decided to investigate the multimerization characteristics of CBM CipA in more detail. Furthermore, a possibility of dimerization was indicated in our previous work using CBM CipA as a fusion partner in silk-material-forming proteins. Fusion proteins containing CBM CipA seemed to associate with each other more strongly than proteins that did not contain CBM CipA. This stronger self-association affected the way they assembled into different protein-based materials [11,17,18]. For CBM CipA, no detailed information on multimerization is available, but the formation of dimers has been reported qualitatively for the closely related CBD Cex from the enzyme Cex, a β-1,4-glycanase from Cellulomonas fimi [19]. Moreover, a cellohexaose-mediated dimerization of a CBM belonging to class 29 has been shown [20].
As a method to study the solution interactions of CBM CipA we chose analytical ultracentrifugation (AUC), which is one of the most powerful techniques available for quantifying weak interactions [21]. AUC experiments are based on analyzing either sedimentation velocity or sedimentation equilibrium. We analyzed CBM CipA by multiple approaches to obtain the K_D, as well as diffusion coefficients and information on the anisotropy of the protein in solution [22]. Molecular dynamics (MD) simulations were performed to broaden the understanding of the system.

Protein expression
The cloning, purification, and characterization of CBM CipA have been described previously [15]. The protein contained a 6xHis-tag at the C-terminus. The sequence of the CBM CipA was: E. coli strain BL21 (DE3) (ThermoFisher Scientific) was used for CBM CipA production. CBM CipA was expressed using MagicMedia (ThermoFisher Scientific) expression medium according to the manufacturer's protocol. Cell lysis was performed with an EmulsiFlex high-pressure homogenizer. The purification process consisted of two parts: first, precipitation of contaminating bacterial proteins and debris by heating, and then affinity purification of CBM CipA by immobilized metal affinity chromatography (IMAC). The sample was heated to 70 °C for 30 min and then clarified by centrifugation at 5000 rpm for 20 min. An ÄKTA-Pure liquid chromatography system (GE Healthcare Life Sciences) with a 5 mL HisTrap column was used for IMAC of the supernatant. Gel filtration was used after the IMAC to transfer CBM CipA into suitable buffers for subsequent experiments.

Analytical ultracentrifugation
Analytical ultracentrifugation experiments were performed using a Beckman Coulter Optima AUC. Sample cells with 12 mm Epon centerpieces and sapphire windows were used. Sample cells were cleaned with deionized water (13.0 MΩ·cm resistivity), 20% ethanol, and 1% Hellmanex detergent. All measurements were made in phosphate-buffered saline (137 mM NaCl, 2.7 mM KCl, 10 mM Na₂HPO₄, and 1.8 mM KH₂PO₄, pH 7.35). The sample volumes were 400 μL. All experiments were performed at 20 °C, and prior to each experiment (except the sedimentation equilibrium experiments) the system was thermally stabilized for 1 h 30 min.
In sedimentation velocity experiments, UV-Vis absorbance at 280 nm and interference detection were used. The experiments were carried out at 50,000 rpm using an An-50 rotor. CBM CipA samples were measured by absorbance in the concentration range from 0.05 mg/mL to 0.7 mg/mL (3-38 μM), which corresponds to an absorbance of 0.1-1.3 OD at 280 nm, and by interference in the range from 1 to 5 mg/mL (55-270 μM). To capture complete sedimentation with a sufficient number of data points, 400 scans were collected for each sample at a rate of one scan every 80 s.
For the sedimentation equilibrium experiments, speeds were chosen so that the reduced buoyant molecular weight of the sedimenting species was in the range of 1-5, which provides optimal steepness of the exponential concentration profile. The time until equilibrium and the reduced buoyant molecular weight were estimated according to [23].
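As a rough illustration of this speed selection, the following Python sketch evaluates the reduced buoyant molecular weight under the conventional definition σ = M(1 − v̄ρ)ω²/(RT), expressed in cm⁻². The definition, the monomer mass of about 18.5 kDa (inferred here from the mg/mL and μM values quoted in the text), and the function name are assumptions made for this example; the partial specific volume, buffer density, temperature, and rotor speeds are the values reported in this work.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)


def reduced_buoyant_mw(mass_kg_per_mol, vbar_ml_per_g, rho_g_per_ml, rpm, temp_k):
    """Return sigma = M (1 - vbar*rho) omega^2 / (R T) in cm^-2."""
    omega = 2.0 * math.pi * rpm / 60.0             # angular velocity, rad/s
    buoyancy = 1.0 - vbar_ml_per_g * rho_g_per_ml  # dimensionless buoyancy term
    sigma_per_m2 = mass_kg_per_mol * buoyancy * omega ** 2 / (GAS_CONSTANT * temp_k)
    return sigma_per_m2 * 1e-4                     # m^-2 -> cm^-2


MONOMER_MASS = 18.5  # kg/mol (numerically kDa); assumed, not stated in the text
VBAR = 0.7148        # mL/g, partial specific volume reported in the text
RHO = 1.0056         # g/mL, buffer density reported in the text
TEMP = 293.15        # K, i.e. 20 degrees C

# Rotor speeds used in the sedimentation equilibrium experiments
sigmas = {rpm: reduced_buoyant_mw(MONOMER_MASS, VBAR, RHO, rpm, TEMP)
          for rpm in (27000, 35000, 40000, 45000)}

for rpm, sigma in sigmas.items():
    print(f"{rpm} rpm: sigma = {sigma:.2f} cm^-2")
```

Under these assumptions the monomer σ lies between about 1.7 and 4.7 cm⁻² for the four speeds, that is, inside the stated 1-5 window.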

Data analysis by Ultrascan
Sedimentation velocity absorbance data were analyzed using the software UltraScan III (v. 4.0, revision 2466) [24]. Rotor-stretching calibration and chromatic aberration corrections were applied during data import. A partial specific volume of 0.7148 mL/g was used, as estimated from the amino acid sequence by the tool implemented in Ultrascan. A density of 1.0056 g/mL and a viscosity of 1.0120 mPa·s for the buffer were obtained from the tabulated values in Ultrascan. During data editing, the meniscus position and the end cutoff distance were entered manually. The plateau position, corresponding to a uniform concentration prior to acceleration, and the baseline buffer absorbance were determined automatically.
Concentration profiles obtained in the sedimentation velocity experiments were analyzed using 2-dimensional spectrum analysis (2DSA) [25]. We used 2DSA for the primary analysis, coupled with simultaneous time-invariant and radially invariant noise reduction, to obtain values for the frictional ratios (f/f_0) and sedimentation coefficients (S). The grid covered 1 to 10 for S and 1 to 4 for f/f_0, with 64 grid points in each direction. These ranges were chosen based on time-derivative analysis so as to cover the distribution of parameters for all species in solution. The 2DSA was done in three steps. First, the time-invariant noise was removed. Second, the meniscus was fitted within a range of 0.03 cm around the previously chosen position: 10 positions were evaluated, the corresponding root mean square deviation (RMSD) values were approximated by a second-order polynomial, and the position with the lowest RMSD was used for further analysis. The third step was an iterative 2DSA refinement consisting of repetitions with the improved meniscus position and fitting of the noise profile. The number of iterations was 10, which was sufficient for convergence of the fit. The grid parameters were the same in all steps.
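The meniscus-fitting step can be pictured with a short Python sketch: RMSD values at a set of trial meniscus positions are approximated by a second-order polynomial, and the vertex of the parabola is taken as the best position. This is only an illustration of the idea, not the UltraScan implementation; the function name and the synthetic RMSD values are hypothetical.

```python
import numpy as np


def best_meniscus(positions_cm, rmsd_values):
    """Fit RMSD(position) with a parabola and return the position of its minimum."""
    positions_cm = np.asarray(positions_cm, dtype=float)
    center = positions_cm.mean()
    # Centering the positions keeps the polynomial fit well conditioned.
    a, b, _c = np.polyfit(positions_cm - center, rmsd_values, 2)
    if a <= 0:
        raise ValueError("fitted parabola has no minimum")
    return center - b / (2.0 * a)


# Hypothetical example: 10 trial positions spanning 0.03 cm, with a synthetic
# RMSD curve whose true minimum is at 6.002 cm.
positions = np.linspace(5.985, 6.015, 10)
rmsd = 0.004 + 2.0 * (positions - 6.002) ** 2

meniscus = best_meniscus(positions, rmsd)
print(f"fitted meniscus position: {meniscus:.4f} cm")
```

With real data the RMSD curve is noisy and only approximately parabolic, so the fitted vertex is an estimate rather than an exact minimum.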
The 2DSA results and the noise reduction profile obtained by the analysis described above were used for further analysis. A genetic algorithm (GA) and a genetic algorithm Monte Carlo (GAMC) [26] optimization were used to extract the main populations of frictional ratio and sedimentation coefficient combinations present in the data. The existence of reversible self-association was analyzed using the van Holde-Weischet (vHW) [27] method. The dissociation constant (K_D) was determined by a discrete model genetic algorithm (DMGA). The specific parameters used in GA, GAMC, vHW, and DMGA are provided in the supplemental information. Hydrodynamic parameters of the monomer and dimer were estimated using the UltraScan Solution Modeler (US-SOMO) [28]. Calculations were performed using the ZENO hydrodynamic computation algorithm [29,30] and the van der Waals overlap bead model [31].

Data analysis by Sedfit
The c(s) [32] and ls-g*(s) [33] analyses of sedimentation velocity absorbance and interference data were performed using Sedfit version 16.1c [34]. We loaded every fifth of the first 350 scans, in total 70 scans per loading concentration, for both the c(s) and the ls-g*(s) analysis. The values for partial specific volume, buffer density, and viscosity were obtained from Ultrascan, as described above. The resolution was 100. Isotherms of weight-average sedimentation coefficients were analyzed and fitted according to the modified Hill equation [35] (Eq. (1)), where S(c) is the weight-average sedimentation coefficient at concentration c, S_mon is the sedimentation coefficient of the monomer, and S_dim is the sedimentation coefficient of the dimer.

Data analysis of sedimentation equilibrium data
Sedimentation equilibrium data were analyzed using Sedphat version 15.2b [34]. Fitting parameters included local concentrations and dissociation constants, while the molecular weight was fixed. Sedfit was used for initial data inspection.

Molecular modelling
All-atom molecular dynamics (MD) simulations were carried out using the Gromacs software (version 5.1.4) [36,37]. We used the particle mesh Ewald (PME) electrostatics calculation scheme [38] and a non-bonded interaction cut-off of 1 nm. The modelling was performed with the Amber03ws force field [39] and the TIP4P/2005 water model [40]. Two systems were studied, one with a single CBM CipA molecule and one with two CBM CipA molecules in the simulation box. A cubic box with an initial size of 8 nm × 8 nm × 8 nm was used for the single-molecule system, and a box of 12.5 nm × 12.5 nm × 12.5 nm for the two-molecule system (Fig. 1). We used the crystal structure PDB ID: 1NBC as the initial structure for CBM CipA. The structure file was edited to leave only the protein itself. The structural model differed from the actual protein by lacking the H6-tag and two amino acids at the beginning and at the end of the sequence. The structure for the two-molecule system was obtained from the final frame of the single CBM CipA simulation, cut with a 1 nm water shell. The initial arrangement of the two molecules relative to each other was chosen based on the location of the β-sheets and the geometric shape. After this, the proteins were solvated with explicit water molecules. The net charge of the system was neutralized by replacing water molecules with Na⁺ ions. Both systems were energy-minimized. After initial equilibration, the simulations were performed in the NPT (isothermal-isobaric) ensemble. Temperature and pressure were controlled by the V-rescale thermostat [41] with a time constant of 0.1 ps and the Parrinello-Rahman barostat [42] with a time constant of 2 ps, respectively. The temperature and pressure reference levels were set at 300 K and 1 bar. Visualization was done with the VMD software (v. 1.9.4a9) [43]. The main simulation was run for 160 ns for the single CBM CipA system and for 450 ns for the two CBM CipA system. In the dimer simulation, the first 150 ns were disregarded in the analysis as the dimer formation period.
Diffusion coefficients were calculated from the mean square displacement (MSD). The stable linear region of the MSD plot was approximated by a straight line according to the equation
$$\langle |\mathbf{r}(t) - \mathbf{r}(0)|^2 \rangle = 6Dt \qquad (2)$$
where ⟨|r(t) − r(0)|²⟩ is the MSD, D is the diffusion coefficient, and t is the time.
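A minimal Python sketch of this estimate is given below, assuming the three-dimensional Einstein relation MSD = 6Dt, an MSD given in nm² and time in ps. The function name, the fit window, and the synthetic MSD curve are hypothetical and serve only to show the unit handling and the linear fit.

```python
import numpy as np


def diffusion_coefficient(time_ps, msd_nm2, fit_start_ps, fit_end_ps):
    """Estimate D (cm^2/s) from the slope of the linear MSD region, MSD = 6 D t."""
    time_ps = np.asarray(time_ps, dtype=float)
    msd_nm2 = np.asarray(msd_nm2, dtype=float)
    window = (time_ps >= fit_start_ps) & (time_ps <= fit_end_ps)
    slope, _intercept = np.polyfit(time_ps[window], msd_nm2[window], 1)
    d_nm2_per_ps = slope / 6.0
    return d_nm2_per_ps * 1e-2  # 1 nm^2/ps = 1e-14 cm^2 / 1e-12 s = 1e-2 cm^2/s


# Synthetic MSD curve generated with D = 7.9e-7 cm^2/s (7.9e-5 nm^2/ps)
time = np.linspace(0.0, 10000.0, 1001)   # ps
msd = 6.0 * 7.9e-5 * time + 0.01          # nm^2, small offset mimics short-time motion

d_est = diffusion_coefficient(time, msd, fit_start_ps=1000.0, fit_end_ps=9000.0)
print(f"D = {d_est:.2e} cm^2/s")
```

The choice of fit window matters in practice: short times are dominated by non-diffusive motion and long times by poor statistics, which is why only the stable linear region is used.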

Analysis of reversibility
Our initial aim was to determine whether CBM CipA self-associates in the chosen concentration range. For this we used two ways of treating the data, the vHW and the ls-g*(s) methods, which are implemented in Ultrascan and Sedfit, respectively. The vHW method is based on a graphical transformation of the sedimentation velocity data and yields a distribution of sedimentation coefficients that is independent of diffusion. With the vHW method, one can differentiate between non-interacting and self-associating species, and one can also distinguish boundary spreading due to diffusion from that due to heterogeneity in the sedimentation coefficient. If the kinetics of the interaction are slow compared to the time of the experiment, or if the system is non-interacting, the components will be separated by the centrifugal force and the boundaries of the concentration profiles will be separated. However, in the case of relatively fast kinetics, the components have sufficient time to re-equilibrate and a single boundary is observed. Increasing the sample concentration changes the ratios between the species and therefore the boundary shape. These differences appear as a shift toward higher sedimentation coefficients, reflecting higher partial concentrations of oligomers. For CBM CipA, the vHW distributions demonstrated self-association behavior (Fig. 2a), showing a shift of the distribution to higher sedimentation coefficients with increasing loading concentration. The presence of this shift clearly indicated that CBM CipA undergoes reversible self-association [22].
The least-squares apparent sedimentation coefficient distribution method, ls-g*(s), was used as an alternative approach to analyzing reversible association. The method calculates the apparent sedimentation coefficient distribution by direct least-squares boundary modelling. A shift in the distributions as a result of concentration changes serves as a robust indication of protein self-association [44]. Moreover, in the case of reversible association, the distributions should follow a certain pattern: the height of the peak should decrease and its width should increase when approaching the K_D concentration, whereas at concentrations higher than the K_D the height of the peak should increase and its width decrease. This pattern represents the successive changes in the composition of the system with concentration, from a state with mostly monomers to states with increasing fractions of oligomers. All of these features were found in the ls-g*(s) distributions shown in Fig. 2b.

Initial analysis of data and estimation of parameters
We continued the analysis in order to establish parameters for possible monomers and oligomers and to choose the appropriate interaction model. We used the two programs, Ultrascan and Sedfit, to obtain starting parameters and to perform qualitative analysis. The starting parameters were then used for further analysis with methods that are designed to take reversible interactions between molecules into account and that allow quantitative information about the interacting system to be collected. Ultrascan has multiple methods for data analysis. We used two of them, 2DSA and GA, together with their Monte Carlo versions. Both methods are intended for non-interacting or irreversibly interacting systems and can therefore, in the case of a reversible system, be used only for estimation of parameters. After initial data treatment and noise reduction by 2DSA, GA was used for further data refinement. The sedimentation profile, fitted data, and residuals at 0.7 mg/mL (38 μM) are shown in Fig. 3a. The GA analysis of datasets at concentrations higher than 0.2 mg/mL (11 μM) revealed two clearly identifiable peaks at 2.15 S and 2.8 S (Fig. 3b). These peaks correspond to molecular weights of 18 and 37 kDa, respectively.
Fig. 2. a) A van Holde-Weischet integral distribution plot for different loading concentrations: 0.05 mg/mL (3 μM), violet triangles; 0.3 mg/mL (16 μM), blue circles; 0.6 mg/mL (32 μM), red diamonds; 1.85 mg/mL (100 μM), green squares; and 4.93 mg/mL (267 μM), yellow crosses. The shift toward higher sedimentation coefficients with increasing loading concentration and the half-parabolic shape of the distributions clearly indicate the presence of reversible self-association in the system. b) ls-g*(s) distributions: 0.05 mg/mL (3 μM), violet line; 0.3 mg/mL (16 μM), blue line; 0.6 mg/mL (32 μM), red line; 1.85 mg/mL (100 μM), green line; and 4.93 mg/mL (267 μM), yellow line. A clear shift of the curves toward higher S values corresponds to a reversible self-association of the components. At the highest concentration the system has switched to a state where over half of the protein is present as dimers. Therefore, we can predict that the K_D value is between 32 and 267 μM.
Next, we conducted a GAMC analysis to evaluate the quality of data fitting and to eliminate peak artefacts due to random noise. In particular, a peak at 3.5 S did not show good convergence, indicating that it was a result of background noise. No significant changes in the two main peaks at 2.15 S and 2.8 S were observed in the pseudo-3D distribution after 64 Monte Carlo iterations (Fig. 4).
In addition, GAMC was used to estimate sedimentation coefficients, diffusion coefficients, frictional ratios, and partial concentrations of the monomers and oligomers in the concentration range from 0.05 to 0.7 mg/mL (3-38 μM) (Table 1). The sedimentation velocity fitting showed low RMSD values. However, all parameters showed strong concentration dependency and variation. This variation is likely due to the system being interacting while the fitting model assumes non-interacting species. The values listed should therefore be seen as fitting (or apparent) parameters. In general, the best agreement is achieved at low and at high concentration, when one of the states, monomer or dimer, prevails.
We obtained similar results using the Sedfit c(s) distribution. Results are shown in Figs. 5 and 6.
Both data fitting with the Sedfit c(s) non-interacting model and with Ultrascan GA showed separation into two peaks, but these analyses are suitable only for irreversible systems or systems with slow kinetics. The analysis is thus potentially insufficient, since the vHW and ls-g*(s) distributions above clearly showed that the system was reversibly interacting.
Since both GAMC and c(s) showed two peaks in the range between 2 and 3 S, we concluded that the simplest and most appropriate model to investigate further is a monomer-dimer interaction model. Hydrodynamic parameters of the monomer were taken from the dataset with the lowest concentration to minimize the influence of the dimer fraction, and were S = 2.1 S and f/f_0 = 1.26.
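To show how a sedimentation coefficient and a frictional ratio together imply a molecular weight, the Python sketch below combines the Svedberg relation s = M(1 − v̄ρ)/(N_A f) with the Stokes friction of the equivalent anhydrous sphere, f_0 = 6πη(3Mv̄/(4πN_A))^(1/3). The water density and viscosity at 20 °C (0.9982 g/mL and 1.002 mPa·s) are standard values assumed here because the reported parameters are normalized to 20 °C and water; the function name is hypothetical, and the dimer frictional ratio of 1.5 is the approximate value quoted for the dimer-like peak.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def molar_mass_from_s(s_svedberg, frictional_ratio, vbar_ml_per_g,
                      rho_g_per_ml=0.9982, eta_pa_s=1.002e-3):
    """Return the molar mass (kDa) implied by s and f/f_0 in SI-consistent units."""
    s = s_svedberg * 1e-13                 # Svedberg -> seconds
    vbar = vbar_ml_per_g * 1e-3            # mL/g -> m^3/kg
    rho = rho_g_per_ml * 1e3               # g/mL -> kg/m^3
    # Radius of the anhydrous sphere is k * M^(1/3), with M in kg/mol
    k = (3.0 * vbar / (4.0 * math.pi * AVOGADRO)) ** (1.0 / 3.0)
    # Solving s = M (1 - vbar*rho) / (N_A * (f/f_0) * 6 pi eta k M^(1/3)) for M
    m_two_thirds = (s * AVOGADRO * frictional_ratio * 6.0 * math.pi * eta_pa_s * k
                    / (1.0 - vbar * rho))
    return m_two_thirds ** 1.5             # kg/mol, numerically equal to kDa


VBAR = 0.7148  # mL/g, from the text

monomer_kda = molar_mass_from_s(2.1, 1.26, VBAR)
dimer_kda = molar_mass_from_s(2.8, 1.5, VBAR)
print(f"monomer-like species: {monomer_kda:.1f} kDa")
print(f"dimer-like species:   {dimer_kda:.1f} kDa")
```

Under these assumptions the two parameter pairs give roughly 18 kDa and 36 kDa, in line with the monomer and dimer assignment.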

Molecular dynamics modelling
Molecular dynamics simulations were conducted as a complementary approach to study molecular parameters. Two types of systems were investigated: one with a single CBM CipA molecule and one with two CBM CipA molecules in the simulation box. The first system was used to estimate the hydrodynamic properties of the CBM CipA molecule as a monomer. From this, a diffusion coefficient of (7.9 ± 0.5)·10⁻⁷ cm²/s was calculated. Compared to the experimental diffusion coefficient at the lowest concentration, (9.8 ± 0.1)·10⁻⁷ cm²/s, the deviation is about 20%. We consider this agreement sufficiently close, taking into account the widely different approaches used.
The second system allowed us to study the process of CBM CipA dimerization. Initially, the CBM CipA molecules were apart from each other in the simulation box (Fig. 1), but a stable dimer had formed after 150 ns of MD simulation. The dimer did not change until the end of the simulation, i.e. during the next 300 ns (Fig. 7). The diffusion coefficient (D) of the CBM CipA dimer was determined to be (6.2 ± 1.6)·10⁻⁷ cm²/s, which is in very good agreement with the measured value of (6.1 ± 0.8)·10⁻⁷ cm²/s. The single monomer and the dimer in the simulation box correspond to concentrations of 6 mg/mL and 3.1 mg/mL, respectively. The latter concentration is significantly above the concentration range studied in the experiments. This suggests that the D determined from the MD simulations should be compared with the diffusion coefficient measured at 0.7 mg/mL, i.e. 6.8·10⁻⁷ cm²/s, rather than with the one for dilute solution.
For both the monomer and the dimer, the D calculated from the MD simulations is underestimated by 12-20%. This might originate from simplifications in the modelling methodology. For example, the D of the TIP4P/2005 water model, which is used in this work, is ~10% smaller than the experimental value [40]. Nonetheless, the MD method clearly reflects the decrease in D caused by dimerization.
A residue contact map (RCM) (Fig. 8) was calculated from the two CBM CipA simulation trajectory to understand which residues are involved in the dimer formation in the simulated system. The RCM shows the mean distance between the CBM CipA residues, that is, each point on the map shows the distance between two residues, as specified by the residue index. We plotted the two CBM CipA molecules with consecutive numbering, meaning that the lower left quadrant shows interactions within molecule 1, the upper right quadrant shows interactions within molecule 2, and both the lower right and upper left quadrants show interactions between the molecules. Notably, an RCM is symmetric with respect to the diagonal from lower left to upper right. Residues on this line have a distance equal to zero, since each residue on this line is compared to itself. Square areas through which the zero line passes display the distances between residues inside a molecule. The lines parallel and perpendicular to the zero line inside these areas identify parallel and antiparallel β-sheets. The distance between residues belonging to different molecules is shown in the square areas perpendicular to the zero line. The RCM highlights at least two areas which can participate in the interactions between the CBM CipA molecules. The first area represents the interaction between residues Met-81 and Ser-82 of one CBM CipA and Gly-65 of the other CBM CipA. The second area shows the interaction between Gln-48 of the first molecule and a group of residues, Ser-63, Glu-102, Pro-103, and Ala-105, belonging to the second molecule.
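The structure of such a map can be sketched in a few lines of Python. The example below assumes one representative position per residue (for example the Cα atom) stored in an array of shape (frames, residues, 3), with the residues of the two molecules numbered consecutively; how the map was actually computed in this work is not specified here, and the function name and random coordinates are purely illustrative.

```python
import numpy as np


def mean_distance_map(coords):
    """Mean pairwise residue distance over all frames.

    coords: array of shape (n_frames, n_residues, 3), positions in nm.
    Returns an (n_residues, n_residues) symmetric matrix with a zero diagonal.
    """
    coords = np.asarray(coords, dtype=float)
    n_frames, n_res, _ = coords.shape
    total = np.zeros((n_res, n_res))
    for frame in coords:  # accumulate frame by frame to limit memory use
        diff = frame[:, None, :] - frame[None, :, :]
        total += np.linalg.norm(diff, axis=-1)
    return total / n_frames


# Hypothetical toy trajectory: 5 frames, two "molecules" of 3 residues each
rng = np.random.default_rng(0)
toy_coords = rng.normal(size=(5, 6, 3))
rcm = mean_distance_map(toy_coords)

n_per_molecule = 3
within_molecule_1 = rcm[:n_per_molecule, :n_per_molecule]   # lower left quadrant
between_molecules = rcm[:n_per_molecule, n_per_molecule:]   # off-diagonal quadrant
print(between_molecules.round(2))
```

Small values in the off-diagonal quadrant correspond to residue pairs from different molecules that stay close on average, which is how candidate interface residues are read off the map.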
We used the program SOMO to predict the results of the sedimentation velocity experiments. The SOMO estimations are compared with the experimental values in Table 2 and show that the experimental and predicted values are in good agreement for the monomer but slightly off for the dimer. Both sets of SOMO data show overestimation of the sedimentation and diffusion coefficients and underestimation of the frictional ratio. The difference between the experimental and SOMO results could be explained by the absence of the poly-His tag and a few other amino acids in the crystal structure, or by tight packing of the protein in the crystal, which leads to an increase in the diffusion coefficient. In the case of the dimer, an inexact partial specific volume could also contribute.

K_D determination
AUC allows different routes toward analyzing interactions and estimating K_D. Here, we compared three different approaches: 1) global fitting in Sedphat, 2) the DMGA method in Ultrascan, and 3) construction of an isotherm on the basis of Sedfit fitting. Together these methods provide quantitative information about the system.
Sedimentation equilibrium experiments, analyzed by global fitting, were conducted to study the oligomerization process in the absence of kinetic effects (Fig. 9a, b, and c). In the experiments, we used three loading concentrations, 0.1 mg/mL (5 μM), 0.25 mg/mL (13 μM), and 0.4 mg/mL (20 μM), in combination with four speeds: 27,000 rpm, 35,000 rpm, 40,000 rpm, and 45,000 rpm. Equilibrium data were collected after 46, 24, 20, and 20 h, respectively, for each speed (Fig. 9d). The dimerization model in Sedphat was the simplest model showing the best fit for the data; adding more complexity did not improve the fitting results. A global fit of the equilibrium data obtained at three loading concentrations and four speeds to a monomer-dimer self-association model gave a dissociation constant K_D of 99 μM.
An analysis using the DMGA implemented in Ultrascan gave K_D values that differed slightly with loading concentration (Table 3). The analysis gave overall lower values than the equilibrium analysis described above, and also showed larger variability. The shift in values and the variability probably reflect the effect of noise in the data. Nonetheless, the analysis confirms the overall order of magnitude of the K_D.
We further analyzed the data through isotherms of weight-average sedimentation coefficients. The weight-average sedimentation coefficients were calculated by integration of the sedimentation coefficient distributions obtained by the c(s) method in Sedfit. Eq. (1) was then used for isotherm fitting. We fixed the value of S for the monomer at 2.1 S and fitted the S of the dimer and the K_D. Fitting of the binding isotherm gave S = 2.8 S for the dimer and K_D = 71 μM (Fig. 10).
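For orientation, the shape of such an isotherm can be sketched in Python with a plain mass-action monomer-dimer model. Note that this is a generic textbook model and not necessarily identical to the modified Hill form of Eq. (1) used for the fit; it assumes K_D = [M]²/[D], total concentration counted in monomer units, and a signal proportional to mass. The function names are hypothetical, while the parameter values (2.1 S, 2.8 S, 71 μM) are those quoted in the text.

```python
import math


def monomer_concentration(c_total_um, kd_um):
    """Free monomer concentration (uM) for 2 M <-> D with K_D = [M]^2 / [D]."""
    # Solves 2 [M]^2 / K_D + [M] - c_total = 0 for the positive root.
    return kd_um * (math.sqrt(1.0 + 8.0 * c_total_um / kd_um) - 1.0) / 4.0


def weight_average_s(c_total_um, kd_um, s_mon, s_dim):
    """Weight-average sedimentation coefficient at total concentration c_total."""
    if c_total_um <= 0.0:
        return s_mon
    f_mon = monomer_concentration(c_total_um, kd_um) / c_total_um  # mass fraction
    return f_mon * s_mon + (1.0 - f_mon) * s_dim


S_MON, S_DIM, KD_UM = 2.1, 2.8, 71.0  # values quoted in the text

isotherm = [(c, weight_average_s(c, KD_UM, S_MON, S_DIM))
            for c in (3.0, 16.0, 32.0, 100.0, 267.0)]
for c, s_w in isotherm:
    print(f"{c:6.1f} uM -> s_w = {s_w:.2f} S")
```

In this model the weight-average value rises monotonically from the monomer value toward the dimer value, and at a total concentration equal to K_D exactly half of the protein mass is in the dimer.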

Conclusions
We showed that CBM CipA forms dimers with a weak K_D of 90 ± 30 μM. In a highly conservative interpretation of the combined data, the K_D of CBM CipA could lie in the range 50-200 μM. A K_D lower or higher than this range would cause significant changes in the AUC data, which would clearly have been observed. Each of the three methods (construction of isotherms on the basis of Sedfit fitting, the DMGA method in Ultrascan, and global fitting in Sedphat) has its own drawbacks, and looking at only one of them would not have given an equally wide understanding of both the dynamics and the affinity of the interaction. However, the results obtained by these methods are not contradictory and their K_D values are close, allowing us to be confident in the stated K_D range for CBM CipA.
An interaction with a K_D of about 90 μM can be classified as weak, on the edge of ultra-weak. Typically, interactions with a K_D over 100 μM are classified as ultra-weak [21]. This K_D corresponds to half of the CBM CipA molecules being present as dimers at a concentration of 1.7 mg/mL. As the K_D for the binding of CBM CipA to cellulose is 0.6 μM [13-15], which is around two orders of magnitude lower, we conclude that dimerization has a negligible effect on cellulose binding (the dimer fraction at such a concentration is about 1% or lower). However, in other uses of CBM CipA, such as in molecular adhesives, much higher concentrations were used, reaching over 100 mg/mL [11,17]. At these concentrations the majority (over 90%) of the protein molecules will be in the dimer form. The dimerization could therefore affect the protein's solution behavior, such as coacervation [18]. Modelling indicated a possible mechanism of interaction between CBM CipA molecules, and the modelled structures gave calculated properties that can be used to estimate the hydrodynamic parameters of oligomers. In general, AUC is suitable for studying weak and ultra-weak interactions, and several examples are found in the literature. Approaches include both sedimentation velocity and sedimentation equilibrium setups and the use of different software packages for analysis [45-48]. Although different approaches are implemented in the different software packages, each package has its own benefits. The approaches did not contradict each other, and they all supported the same overall conclusion.
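The dimer fractions discussed here follow from simple mass action, as the Python sketch below illustrates. It assumes a monomer-dimer equilibrium with K_D = [M]²/[D], total concentration counted in monomer units, and a molar mass of about 18.5 kg/mol (inferred from the mg/mL and μM pairs quoted in the text, not stated explicitly); the function names are hypothetical.

```python
import math

KD_UM = 90.0          # dimerization K_D from the text, in uM
MOLAR_MASS = 18500.0  # g/mol; assumed value for CBM CipA


def dimer_fraction(c_total_um, kd_um=KD_UM):
    """Fraction of protein molecules present in dimers at total concentration c."""
    monomer = kd_um * (math.sqrt(1.0 + 8.0 * c_total_um / kd_um) - 1.0) / 4.0
    return 1.0 - monomer / c_total_um


def mg_per_ml_to_um(c_mg_per_ml, molar_mass=MOLAR_MASS):
    """Convert a mass concentration (mg/mL = g/L) to a molar concentration in uM."""
    return c_mg_per_ml / molar_mass * 1e6


# Half of the molecules are dimeric when the total concentration equals K_D.
half_dimer_mg_per_ml = KD_UM * 1e-6 * MOLAR_MASS

for label, c_um in (("cellulose-binding K_D (0.6 uM)", 0.6),
                    ("total concentration = K_D", KD_UM),
                    ("adhesive conditions (100 mg/mL)", mg_per_ml_to_um(100.0))):
    print(f"{label}: dimer fraction = {dimer_fraction(c_um):.3f}")
print(f"half-dimer concentration: {half_dimer_mg_per_ml:.2f} mg/mL")
```

Under this model the dimer fraction is exactly one half when the total concentration equals K_D, which for 90 μM and the assumed molar mass is about 1.7 mg/mL, and it is close to 1% at 0.6 μM.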

CRediT authorship contribution statement
Dmitrii Fedorov did the AUC measurements, modelling, and data analysis, and wrote the manuscript, Piotr Batys supervised and assisted in the modelling and wrote the manuscript, David B Hayes critically evaluated data, supervised and assisted in AUC data analysis, commented and corrected the manuscript, Maria Sammalkorpi supervised modelling and wrote the manuscript, Markus B Linder supervised the work, and wrote the manuscript.

International Journal of Biological Macromolecules 163 (2020) 1995-2004
⁎ Corresponding author. E-mail address: markus.linder@aalto.fi (M.B. Linder).
https://doi.org/10.1016/j.ijbiomac.2020.09.054
0141-8130/© 2020 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Fig. 1. The initial all-atom molecular dynamics simulation configuration of two CBM CipA molecules in the simulation box. The explicit water molecules are shown as red dots and the Na⁺ ions that are required for charge neutrality as red spheres. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 3. a) Data from sedimentation velocity experiments (blue lines) and simulated data based on the GA analysis (red lines). Residuals are shown in green in the panel below. Experimental and simulated data lie right on top of each other, and the residuals are random and small, indicating a high precision of the fit. b) GA analysis results shown as sedimentation distribution plots at four concentrations: 0.3 mg/mL (16 μM), yellow; 0.5 mg/mL (27 μM), blue; 0.6 mg/mL (32 μM), green; and 0.7 mg/mL (38 μM), red lines. The signal concentration relates to the optical concentration of each component.

Fig. 4. Pseudo-3D plot of the CBM CipA solution composition at a concentration of 0.7 mg/mL (38 μM) after 64 Monte Carlo iterations by GAMC. The pseudo-3D distribution connects the sedimentation coefficient S, the frictional ratio f/f_0, and the partial concentrations of the solution. The graph shows two peaks, a monomer-like peak at 2.15 S and f/f_0 ≈ 1.15 and a dimer-like peak at 2.8 S and f/f_0 ≈ 1.5. The red circle marks the area where the 3.5 S peak would be expected to be found.

Fig. 5. a) Absorbance sedimentation boundaries of 0.6 mg/mL (32 μM) CBM CipA in PBS buffer pH 7.35 at 50,000 rpm (circles) and best-fit solution by the c(s) model (lines). The color gradient represents the time from the beginning of the experiment (violet) to the end (red). b) c(s) distribution showing two peaks at approximately the monomer and dimer positions.

Fig. 6. Left: Absorbance sedimentation boundaries of 0.05 mg/mL (3 μM) CBM CipA in PBS buffer pH 7.35 at 50,000 rpm (circles) and best-fit solution by the c(s) model (lines). The color gradient represents the time from the beginning of the experiment (violet) to the end (red). Right: c(s) distribution, which shows only one peak at S ≈ 2.1 S.

Fig. 7. CBM CipA dimer structure as obtained by the MD simulations. The snapshot corresponds to the final configuration at 450 ns.

Table 1
Sedimentation velocity fitting results by Ultrascan for 7 different loading concentrations. Parameters of the monomer-like and dimer-like peaks and the RMSD of the fitting are shown. Data for the dimer-like peak are shown only for the concentrations where the peak was clearly identifiable. Parameters are normalized to 20 °C and water as solvent.

Table 2
Comparison between experimental results at concentration 0.05 mg/mL (3 μM) for monomer and at concentration 0.7 mg/mL (38 μM) for dimer and SOMO modelling.

Table 3
The value of the dissociation constant K_D as determined at different concentrations using DMGA for analysis of the data.