Stabilization of Non-Native Folds and Programmable Protein Gelation in Compositionally Designed Deep Eutectic Solvents

Proteins are adjustable units from which biomaterials with designed properties can be developed. However, non-native folded states with controlled topologies are hardly accessible in aqueous environments, limiting their prospects as building blocks. Here, we demonstrate the ability of a series of anhydrous deep eutectic solvents (DESs) to precisely control the conformational landscape of proteins. We reveal that systematic variations in the chemical composition of binary and ternary DESs dictate the stabilization of a wide range of conformations, that is, compact globular folds, intermediate folding states, or unfolded chains, as well as controlling their collective behavior. Besides, different conformational states can be visited by simply adjusting the composition of ternary DESs, allowing for the refolding of unfolded states and vice versa. Notably, we show that these intermediates can trigger the formation of supramolecular gels, also known as eutectogels, where their mechanical properties correlate to the folding state of the protein. Given the inherent vulnerability of proteins outside the native fold in aqueous environments, our findings highlight DESs as tailorable solvents capable of stabilizing various non-native conformations on demand through solvent design.


1 Materials and methods
The chemicals used for DES preparation were vacuum dried at 70 °C (ChCl, d-ChCl, Urea, and d-Urea) or 50 °C (ChAc, d-ChAc, Glyc, and d-Glyc) for 24 hours before preparation of the DES. AcOH and d-AcOH were not vacuum dried due to the high vapor pressure of the compound. The different DESs (1:2 ChCl:Urea, 1:2 ChCl:Glyc, 1:2 ChAc:Glyc, 1:2 ChCl:AcOH, and their deuterated analogues) were prepared by weighing the required amounts of each chemical into a round-bottom flask, followed by continuous stirring and heating at 60 °C under an argon atmosphere until a clear liquid was formed. After cooling, the DESs were equilibrated at room temperature in a desiccator for a minimum of 24 h before use. Once ready, the DESs were stored sealed and under a dry atmosphere to avoid water absorption. The residual water in the DESs was measured using Karl-Fischer titration, giving an average water content of 0.24% for 1:2 ChCl:Urea, 0.47% for 1:2 ChCl:Glyc, 0.42% for 1:2 ChAc:Glyc, and 0.63% for 1:2 ChCl:AcOH.
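The weighing step can be illustrated with a short calculation of the component masses for a given molar ratio. This is a minimal sketch: the function name `des_masses`, the 10 g batch size, and the molar masses (standard values for the protiated compounds, not taken from the text) are assumptions made for the example.

```python
# Molar masses in g/mol for the protiated components (standard values,
# assumed here; the deuterated analogues would need their own values).
MOLAR_MASS = {
    "ChCl": 139.62,
    "ChAc": 163.22,
    "Urea": 60.06,
    "Glyc": 92.09,
    "AcOH": 60.05,
}


def des_masses(total_mass_g, molar_ratio):
    """Return the mass (g) of each component for a DES batch.

    molar_ratio maps component name -> moles per formula unit,
    e.g. {"ChCl": 1, "Urea": 2} for 1:2 ChCl:Urea.
    """
    unit_mass = sum(MOLAR_MASS[c] * n for c, n in molar_ratio.items())
    units = total_mass_g / unit_mass  # moles of formula units in the batch
    return {c: units * n * MOLAR_MASS[c] for c, n in molar_ratio.items()}


if __name__ == "__main__":
    # Hypothetical 10 g batch of 1:2 ChCl:Urea
    for name, mass in des_masses(10.0, {"ChCl": 1, "Urea": 2}).items():
        print(f"{name}: {mass:.3f} g")
```

The same call with `{"ChCl": 1, "Glyc": 2}` or `{"ChAc": 1, "Glyc": 2}` covers the other binary DESs listed above.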
Phosphate buffer (10 mM, pH 7 in water and 10 mM, pH 7.4 in D2O) was used as the reference for the protein state in aqueous buffer (hereafter referred to as the native state) in the experiments presented here.

Synthesis and characterization of choline-d9 acetate-d3
Choline chloride (trimethyl-d9) (98% isotopic purity) and sodium acetate-d3 (99% isotopic purity) were used without further purification. All other reagents and solvents were purchased from Sigma-Aldrich. NMR spectra were recorded on a Varian Unity INOVA 400 MHz spectrometer or a Bruker Avance Neo 500 MHz spectrometer. ¹³C NMR spectra were ¹H-decoupled. Spectra were recorded at 298 K. Chemical shifts, expressed in parts per million (ppm), were referenced to the residual signal of the solvent.

Choline (trimethyl-d9) acetate-d3
Using a modification of a reported procedure,1 a solution of sodium acetate-d3 (1.43 g, 16.8 mmol) in methanol/absolute ethanol (67.5 mL:35 mL) was added to a solution of choline chloride (trimethyl-d9) (2.50 g, 16.8 mmol) in methanol (67.5 mL) and the mixture was stirred for one hour at room temperature. Sodium chloride precipitated and was removed by filtration. Acetone (50 mL) was added, after which a second crop of sodium chloride precipitated and was removed by filtration. The solvents were removed in vacuo to provide the title compound as a clear, viscous liquid (2.35 g, 80%).
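The quoted molar amounts and yield can be cross-checked with a few lines of arithmetic. This is only a sketch: the molar masses of the deuterated species are approximate values computed here from standard atomic masses (with D = 2.014) and are assumptions, not values given in the text.

```python
# Approximate molar masses (g/mol); assumed values, not from the text.
M_CHCL_D9 = 148.68   # choline chloride (trimethyl-d9)
M_NAOAC_D3 = 85.05   # sodium acetate-d3
M_PRODUCT = 175.29   # choline (trimethyl-d9) acetate-d3


def percent_yield(mass_product_g, mass_limiting_g, m_limiting, m_product):
    """Isolated yield (%) relative to a 1:1 limiting reagent."""
    moles_limiting = mass_limiting_g / m_limiting
    theoretical_g = moles_limiting * m_product
    return 100.0 * mass_product_g / theoretical_g


# Masses quoted in the procedure above
mmol_chcl = 1000 * 2.50 / M_CHCL_D9
mmol_naoac = 1000 * 1.43 / M_NAOAC_D3
yield_pct = percent_yield(2.35, 2.50, M_CHCL_D9, M_PRODUCT)

if __name__ == "__main__":
    print(f"ChCl-d9: {mmol_chcl:.1f} mmol, NaOAc-d3: {mmol_naoac:.1f} mmol")
    print(f"Yield: {yield_pct:.0f}%")
```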

Sample preparation
Initially, protein stock solutions were prepared in water or D2O by adding the required amounts of BSA or Lyz to water for a protein concentration of ca. 5000 μM. Dilutions of 10- and 50-fold of the 5000 μM stock were performed to obtain the stock solutions at lower concentrations, 500 and 100 μM, respectively. Each protein solution was then centrifuged at 13 000 rpm for 5 min at 25 °C to remove any aggregates from the solution, followed by pipetting of the supernatant into new tubes.
Protein samples in neat (anhydrous) DESs for spectroscopic characterization were prepared by mixing the required amount of each DES and protein stock solution (non-buffered) in a glass vial. The added amount of protein stock solution was at least 20 times lower (v/v) than the added amount of DES to minimize the initial water content. Subsequently, samples were freeze-dried using an optimized protocol as explained below. Aqueous protein solutions were prepared in phosphate buffer following the same dilution protocol. Samples were stored at 4 °C between preparation and measurements and used within 72 hours after preparation to minimize the impact of protein degradation on the results.
The samples for the SANS experiments were prepared in deuterated DESs and buffered D2O using the protocol presented for the protiated analogues. These samples were stored at -20 °C before measurement and used within a week of sample preparation.
The samples at high protein concentration required for the gelation experiments were prepared from highly concentrated, aqueous solutions of Lyz, i.e., 350 mg mL⁻¹. The aqueous stock solution was then centrifuged at 4 500 rpm for 5 min at room temperature to remove any aggregates. Then, 1:2 ChCl:Glyc was added to a final concentration in the stock DES (after water removal) of 140 mg mL⁻¹ and water was subsequently lyophilized to yield the stock solution in DES. Samples were prepared by mixing the stock solution with 1:2 ChCl:Glyc and 1:2 ChCl:AcOH so that the required ratios in the system 1:x:2-x ChCl:Glyc:AcOH were reached at a final protein concentration that was a 3-fold dilution of the stock solution in DES.
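The blending of the two binary DESs into a ternary 1:x:2-x ChCl:Glyc:AcOH mixture can be sketched as follows. Both binary DESs contain one ChCl per formula unit, so the mole fraction of ChCl:Glyc formula units in the blend is x/2. The function name, the molar masses (standard values), and the neglect of the protein mass are assumptions made for this illustration.

```python
# Standard molar masses in g/mol (assumed, not taken from the text)
M_CHCL, M_GLYC, M_ACOH = 139.62, 92.09, 60.05


def mass_fraction_glyc_des(x):
    """Mass fraction of 1:2 ChCl:Glyc in a blend with 1:2 ChCl:AcOH that
    gives the overall molar ratio 1:x:(2-x) ChCl:Glyc:AcOH.

    The protein contribution to the mass is ignored in this sketch.
    """
    if not 0.0 <= x <= 2.0:
        raise ValueError("x must lie between 0 and 2")
    unit_glyc = M_CHCL + 2 * M_GLYC  # mass of one ChCl:2Glyc formula unit
    unit_acoh = M_CHCL + 2 * M_ACOH  # mass of one ChCl:2AcOH formula unit
    f = x / 2.0                      # mole fraction of ChCl:Glyc formula units
    return f * unit_glyc / (f * unit_glyc + (1.0 - f) * unit_acoh)


if __name__ == "__main__":
    for x in (0.5, 1.0, 1.5):
        print(f"x = {x}: {100 * mass_fraction_glyc_des(x):.1f} wt% 1:2 ChCl:Glyc")
```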
Control samples of the DESs without protein, submitted to the same protocols, were tested throughout the research activities for water content using Karl-Fischer titration. The results from the titration showed minor variations in the water content (±0.2 wt%) during this process. The average water content was 0.24% for 1:2 ChCl:Urea, 0.47% for 1:2 ChCl:Glyc, 0.42% for 1:2 ChAc:Glyc, and 0.63% for 1:2 ChCl:AcOH.

Methods
Freeze-drying was performed on an Epsilon 2-6D LSCplus from Martin Christ to remove the water previously added during protein incorporation to the DESs. The samples were loaded in glass vials and introduced in the instrument tray. Initially, the samples were quenched to -40 °C by abruptly lowering the temperature of the trays. Once the samples reached the target temperature and solidified, the pressure in the instrument was gradually reduced to 0.8 mbar to sublimate the water in the sample while the temperature was gradually raised to -15 °C for 24 hours. Subsequently, the temperature of the system was adjusted to room temperature. Finally, the instrument automatically closed the vials under dry atmosphere and the system was vented to reach ambient conditions. This protocol avoids sample bubbling, which could lead to protein degradation during the freeze-drying stages.
The final protein concentration in the samples was determined using a ND-1000 Spectrophotometer from Saveen Werner by measuring the absorbance at 280 nm using the protein extinction coefficient (BSA: ε = 43,824 cm⁻¹ M⁻¹, M = 66,400 g mol⁻¹; Lyz: ε = 38,940 cm⁻¹ M⁻¹, M = 14,300 g mol⁻¹). Corrections for solvent contribution were performed for each sample.
UV-Vis absorption measurements were performed using the Varian Cary 50 UV-Vis Spectrometer. Samples were loaded in 1 mm light path, 10 mm width quartz cuvettes. The absorbance was measured in the range 190-500 nm at 600 nm min⁻¹. The contributions from the solvents were subtracted from each sample. The first-derivative and second-derivative UV spectra were determined in a wavelength range between 240 and 310 nm. The peaks in the tyrosine/tryptophan region (280-300 nm) in the second-derivative UV spectra were analyzed to extract the positions and amplitudes of the peaks as previously described.2
Steady-state fluorescence measurements were carried out on a Cary Eclipse Fluorescence Spectrometer in the 96-well plate configuration at an excitation wavelength of 295 nm (Trp). The emitted intensity was measured between 300-600 nm at 600 nm min⁻¹ using excitation and emission slits of 5 nm. Data were accumulated for 15 scans for each measurement.
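The concentration determination from the absorbance at 280 nm follows the Beer-Lambert law, A = ε·c·l. The snippet below is a minimal sketch using the extinction coefficients and molar masses quoted above; the function name and the default 1 cm path length are assumptions for the example (the effective path length of the actual instrument may differ).

```python
# Extinction coefficients (M^-1 cm^-1) and molar masses (g/mol) as quoted
# in the text above.
PROTEINS = {
    "BSA": {"epsilon": 43824.0, "molar_mass": 66400.0},
    "Lyz": {"epsilon": 38940.0, "molar_mass": 14300.0},
}


def protein_concentration(a280, protein, path_cm=1.0):
    """Return (molar concentration in M, mass concentration in mg/mL)
    from a solvent-corrected absorbance at 280 nm."""
    p = PROTEINS[protein]
    molar = a280 / (p["epsilon"] * path_cm)  # Beer-Lambert: A = epsilon * c * l
    return molar, molar * p["molar_mass"]    # g/L is numerically equal to mg/mL


if __name__ == "__main__":
    # Hypothetical absorbance reading for a Lyz sample
    c_molar, c_mg_ml = protein_concentration(0.3894, "Lyz")
    print(f"{c_molar * 1e6:.1f} uM, {c_mg_ml:.3f} mg/mL")
```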
Excited-state emission fluorescence measurements were conducted on a FS5 Spectrofluorometer from Edinburgh Instruments. The source was a fixed pulsed LED (pulse width 794.7 ps) with an excitation wavelength of 280 nm, thus targeting the Tyr and Trp residues of the protein. Data were acquired using 1-cm path length, 4-cm width, quartz cuvettes with the samples loaded on a temperature-controlled sample stage at 25 °C. The emitted intensity was collected for 50 ns after each pulse with a resolution of 0.0244 ns. The evolution of the emitted intensity with time was collected at 345 nm and data were accumulated to reach 5000 counts at the main pulse to obtain an appropriate signal-to-noise ratio. The instrument response function (IRF) was obtained using the scattering solution of 0.1 wt% SiO2 (LUDOX® HS-30) particles in water.
Far-UV circular dichroism (CD) spectroscopy was performed on a Jasco J-715 between 190 nm and 250 nm at 50 nm min⁻¹. A spectral bandwidth of 1 nm and a response time of 1 s were used, and data were accumulated for 10 scans. Samples were loaded in a 0.1-mm quartz cuvette. The UV-vis absorption of each sample at 280 nm was used to determine the protein concentration and effective pathlength. The contribution from each solvent was subtracted from each spectrum. The mean residue ellipticity ([θ]MR) was calculated as follows:
$$[\theta]_{MR} = \frac{\theta}{n \, l \, [\mathrm{protein}]}$$
where θ is the measured ellipticity in deg, n is the number of amino acid residues (583 for BSA, 113 for Lyz), l is the cuvette path length in cm, and [protein] is the concentration of either BSA or Lyz in the sample in dmol cm⁻³.
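The conversion from measured ellipticity to mean residue ellipticity can be sketched as below, assuming the common definition [θ]MR = θ / (n · l · [protein]) with the units listed above (deg, cm, dmol cm⁻³), which gives deg cm² dmol⁻¹. The function names and the numerical inputs are hypothetical.

```python
def molar_to_dmol_per_cm3(conc_molar):
    """Convert mol/L to dmol/cm^3 (1 mol/L = 10 dmol per 1000 cm^3)."""
    return conc_molar * 0.01


def mean_residue_ellipticity(theta_deg, n_residues, path_cm, conc_dmol_cm3):
    """Mean residue ellipticity in deg cm^2 dmol^-1."""
    return theta_deg / (n_residues * path_cm * conc_dmol_cm3)


if __name__ == "__main__":
    # Hypothetical reading: -10 mdeg for 10 uM BSA in a 0.1 mm (0.01 cm) cell
    conc = molar_to_dmol_per_cm3(10e-6)
    mre = mean_residue_ellipticity(-0.010, 583, 0.01, conc)
    print(f"[theta]MR = {mre:.0f} deg cm^2 dmol^-1")
```

As a consistency check, the same value follows from the widely used form θ(mdeg) / (10 · c(M) · l(cm) · n).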
All spectrometers were equipped with a temperature-controlled sample stage and measurements were performed at 25 °C.
Small-angle neutron scattering (SANS) measurements were performed on D22 and D33 at the Institut Laue-Langevin (Grenoble, France). Samples were loaded in 1 mm path length, 1 cm width, quartz Hellma cells and placed in a temperature-controlled sample changer at 25 °C for measurement. On D22, the sample-to-detector distances were 1.7 m and 8 m for the front and rear detectors, respectively. Measurements were conducted using a neutron wavelength of 6 Å. This setup provided a combined q-range of 0.006-0.65 Å⁻¹. The detectors of D33 were placed at 1.7 m and 8 m from the sample and data were collected using a neutron wavelength of 6 Å. This configuration yielded a combined q-range of 0.0047-0.34 Å⁻¹. The raw data were reduced according to the instruments' protocols using the GRASP V10.17i software,3 using the attenuated direct beam to obtain intensities in absolute scale. The contribution from the empty cell was subtracted, and the noise was accounted for by subtracting the signal recorded with a sintered B4C absorber. In addition, the scattering from the solvent was subtracted, accounting for the incoherent contribution observed at high q values. The reduced data are presented as scattered intensity in absolute scale, I(q) in cm⁻¹, versus momentum transfer, q in Å⁻¹.
Densities (ρ) of the DESs were measured using a vibrating U-tube densimeter DMA-5000 by Anton Paar at 25 °C. The uncertainty in density is estimated to be 0.001 g cm⁻³.
The kinematic viscosity (ν) of the binary and ternary DESs was determined using a capillary-based method in a micro Ubbelohde viscometer. Samples were loaded in the Schott capillaries IIc and III, and the automated Lauda Processor Viscosity PVS1 system determined the flow time with a resolution of 0.01 s. The temperature of the system was kept constant at 25 °C. Three repeats were performed per sample and the error bar corresponds to the standard deviation between these measurements. From the kinematic viscosity and the density values, the dynamic viscosity (η) is given by the formula:
$$\eta = \nu \rho$$
The rheology measurements were carried out on an Anton Paar MCR 301 at 25 °C using a cone plate with a diameter of 25 mm, a cone angle of 1° and a gap of 0.048 mm. Strain sweeps of samples containing Lyz (42 mg mL⁻¹) in DES were performed at a constant frequency of 1 Hz over the strain range of 0.01-100% with 7 measurements per decade. The kinetics of the gel formation at 25 °C were determined by monitoring the evolution of the storage modulus G'. A sample was prepared freshly by mixing 350 µL of Lyz solution (84 mg mL⁻¹) in 1:2 ChCl:Glyc with 350 µL of 1:2 ChCl:AcOH. The sample was gently mixed for a few seconds and placed on the rheometer. G' was collected for 360 min at a constant frequency of 1 Hz and strain of 10% with 6 measurements per minute. A small amount of low-viscosity silicone oil was placed around the geometry to avoid water absorption during measurements.
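The conversion from capillary flow times to dynamic viscosity described in the viscometry paragraph can be sketched as follows. The sketch assumes the simple Ubbelohde relation ν = K·t without kinetic-energy correction; the capillary constant `K`, the flow times, and the density in the usage example are hypothetical values, not data from this work.

```python
import statistics


def kinematic_viscosity(flow_times_s, capillary_constant):
    """Mean and standard deviation of nu (mm^2/s) from repeated flow times.

    Assumes nu = K * t, i.e. no kinetic-energy (Hagenbach) correction.
    """
    nus = [capillary_constant * t for t in flow_times_s]
    return statistics.mean(nus), statistics.stdev(nus)


def dynamic_viscosity(nu_mm2_s, density_g_cm3):
    """eta = nu * rho; (mm^2/s) * (g/cm^3) gives mPa s directly."""
    return nu_mm2_s * density_g_cm3


if __name__ == "__main__":
    # Hypothetical triplicate flow times (s), capillary constant, and density
    nu, nu_sd = kinematic_viscosity([100.0, 101.0, 99.0], 0.3)
    eta = dynamic_viscosity(nu, 1.2)
    eta_sd = dynamic_viscosity(nu_sd, 1.2)  # density uncertainty neglected
    print(f"eta = {eta:.1f} +/- {eta_sd:.1f} mPa s")
```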

Data analysis
The analysis of the time-correlated single photon counting data obtained from the excited-state emission fluorescence was performed in the FAST software package from Edinburgh Instruments. The analyses were conducted using a reconvolution approach, by which the theoretical models were iteratively convoluted with the instrument response function. The theoretical models were determined using the exponential components analysis, as described in the following equation:
$$I(t) = B + \sum_{i} \alpha_i \exp\left(-\frac{t}{\tau_i}\right)$$
where B is the background signal, τi is the decay time, and αi is the weighting of each decay component i. Initially, models containing one, two, or three exponential decays were tested. However, the best results were obtained from the double-exponential decay function, as the addition of an extra component did not significantly improve the quality of the fit. This observation is in good agreement with previous reports.4 Thus, we concluded that the systems can be adequately described using two characteristic decay times and the equation was reduced to:
$$I(t) = B + \alpha_1 \exp\left(-\frac{t}{\tau_1}\right) + \alpha_2 \exp\left(-\frac{t}{\tau_2}\right)$$
The quality of the fits was evaluated through the χ² statistical parameter, which is calculated from the differences between I(tk) and Ic(tk), the measured and calculated emitted intensities at time tk, respectively.
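To make the reconvolution idea concrete, the sketch below builds a double-exponential model, convolves it with an IRF, and scores it against measured counts. It is not the FAST implementation: the function names are hypothetical, the discrete convolution is the simplest possible form, and the Poisson-weighted reduced χ² (weights 1/I(tk), normalized by the degrees of freedom) is an assumed, commonly used definition rather than the one confirmed by the text.

```python
import math


def biexp(t, a1, tau1, a2, tau2):
    """Double-exponential decay without background."""
    return a1 * math.exp(-t / tau1) + a2 * math.exp(-t / tau2)


def reconvolve(irf, decay, background=0.0):
    """Discrete convolution of a normalised IRF with the model decay,
    truncated to the measurement window, plus a constant background.
    irf and decay must be sampled on the same time grid."""
    n = len(decay)
    return [
        background + sum(irf[j] * decay[k - j] for j in range(k + 1))
        for k in range(n)
    ]


def reduced_chi2(measured, calculated, n_params):
    """Poisson-weighted reduced chi-squared (assumed definition)."""
    terms = [(m - c) ** 2 / max(m, 1.0) for m, c in zip(measured, calculated)]
    return sum(terms) / (len(measured) - n_params)


if __name__ == "__main__":
    # Hypothetical grid using the 0.0244 ns channel width quoted earlier
    times = [0.0244 * k for k in range(400)]
    model = [biexp(t, 3500.0, 0.8, 1500.0, 2.5) for t in times]
    irf = [0.25, 0.5, 0.25] + [0.0] * (len(times) - 3)  # toy IRF, unit area
    calc = reconvolve(irf, model, background=5.0)
    print(f"peak of reconvolved model: {max(calc):.0f} counts")
```

In a fitting loop, the parameters (B, α1, τ1, α2, τ2) would be varied by an optimizer until `reduced_chi2` is minimized.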
From the calculated populations of each decay, the intensity average lifetime, <τ>int, was determined, as the main changes were observed in the slow component of the decays. This approach weighs the lifetime of each component by the intensity of that component using the equation:
$$\langle\tau\rangle_{int} = \frac{\sum_i \alpha_i \tau_i^2}{\sum_i \alpha_i \tau_i}$$
The average lifetimes were normalized to the viscosity of the continuum using the modified Förster-Hoffmann equation:5
$$\langle\tau\rangle_{int} = \frac{z}{k_r}\,\eta^{\beta}$$
Or in the logarithmic form:
$$\log\langle\tau\rangle_{int} = \log\left(\frac{z}{k_r}\right) + \beta \log\eta$$
where z, kr, and β represent constants related to the decay mechanism. For a fluorophore with no changes in the decay mode, e.g., a fluorophore in water-glycerol mixtures,5 log(<τ>int) and log(η) are expected to follow a linear relationship. However, changes in the environment of the fluorophore, e.g., a conformational change of the protein, could lead to deviations from linearity.
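The two steps described above, the intensity-weighted average lifetime and the linearity check of log(<τ>int) against log(η), can be sketched as follows. The average uses the standard definition Σαiτi² / Σαiτi; the function names and all numerical inputs are hypothetical.

```python
import math


def intensity_average_lifetime(alphas, taus):
    """Intensity-weighted average lifetime: sum(a*tau^2) / sum(a*tau)."""
    num = sum(a * t ** 2 for a, t in zip(alphas, taus))
    den = sum(a * t for a, t in zip(alphas, taus))
    return num / den


def loglog_fit(viscosities, lifetimes):
    """Least-squares line log10(tau) = intercept + beta * log10(eta).

    Returns (intercept, beta); for the Foerster-Hoffmann form the
    intercept corresponds to log10(z / kr)."""
    xs = [math.log10(v) for v in viscosities]
    ys = [math.log10(t) for t in lifetimes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    beta = sxy / sxx
    return my - beta * mx, beta


if __name__ == "__main__":
    # Hypothetical two-component decay (lifetimes in ns)
    print(intensity_average_lifetime([0.5, 0.5], [1.0, 3.0]))
    # Hypothetical viscosities (mPa s) and lifetimes (ns)
    intercept, beta = loglog_fit([1, 10, 100, 1000], [2.0, 2.6, 3.3, 4.1])
    print(f"beta = {beta:.3f}")
```

Points that fall systematically off the fitted line would then flag a change in the fluorophore environment rather than a pure viscosity effect.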
SANS data were analyzed using a combination of model-free methods that yield information on the shape and size of the protein. The pair-distance distribution function of the scatterer (p(r)) was determined using the indirect Fourier transform (IFT) approach, which determines a histogram of all distances within the scatterer.6 The function is parametrized by the input parameters of the maximum dimension of the scatterer (Dmax) and a regularization constant (α). From the p(r), the radius of gyration of the scatterer (Rg) and the extrapolated forward intensity (I(0)) were calculated using the following equations:
$$R_g^2 = \frac{\int_0^{D_{max}} r^2 \, p(r) \, dr}{2 \int_0^{D_{max}} p(r) \, dr}, \qquad I(0) = 4\pi \int_0^{D_{max}} p(r) \, dr$$
To obtain information on the collective behavior of the protein, the aggregation number (Nagg) was calculated from the scattering data using the experimental volume of the scatterer (V), the volume of the protein monomer (Vm), the concentration of protein in the sample (c, in g cm⁻³), the scattering excess (∆SLD, i.e., the difference between the scattering length density of the protein and that of the solvent, SLDp-SLDs), and Avogadro's number (NA). The scattering length density and monomer volume of the proteins were calculated using the sequence of each protein (PDB files: BSA, 4F5S; Lyz, 1HEW) and the Biomolecular Scattering Length Density Calculator from the Science and Technology Facilities Council, UK.7 The SLDs of the solvents are presented in Table S1.
Table S1 Scattering length density of the solvents used in the SANS characterization.
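How Rg and I(0) follow from a tabulated p(r) can be illustrated numerically, assuming the standard real-space relations Rg² = ∫r²p(r)dr / (2∫p(r)dr) and I(0) = 4π∫p(r)dr. The sketch uses trapezoidal integration and checks itself against the analytical p(r) of a homogeneous sphere, for which Rg = √(3/5)·R. The function names and the 20 Å sphere are assumptions for the example, not part of the ATSAS workflow.

```python
import math


def trapezoid(xs, ys):
    """Trapezoidal integral of ys over xs."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


def rg_and_i0(r, p):
    """Radius of gyration and forward intensity from a tabulated p(r)."""
    area = trapezoid(r, p)
    second_moment = trapezoid(r, [ri ** 2 * pv for ri, pv in zip(r, p)])
    rg = math.sqrt(second_moment / (2.0 * area))
    i0 = 4.0 * math.pi * area
    return rg, i0


def sphere_pr(r, radius):
    """Analytical p(r) shape of a homogeneous sphere (arbitrary scale)."""
    if r > 2.0 * radius:
        return 0.0
    x = r / radius
    return r ** 2 * (1.0 - 0.75 * x + x ** 3 / 16.0)


if __name__ == "__main__":
    radius = 20.0  # hypothetical sphere radius in Angstrom, Dmax = 2R
    r = [i * (2.0 * radius) / 2000 for i in range(2001)]
    p = [sphere_pr(ri, radius) for ri in r]
    rg, i0 = rg_and_i0(r, p)
    print(f"Rg = {rg:.2f} A (analytical: {math.sqrt(0.6) * radius:.2f} A)")
```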
In addition, the modelling of the protein scattering was performed using a generalized form of the Porod approximation.8 This model determines the Porod exponent m (or n) from the power-law decay of the scattered intensity:
$$I(q) \propto q^{-m}$$
The parameter m relates to the fractal dimension of the scatterer and, for polymer coils, can adopt different values depending on the polymer-polymer and solvent-polymer interactions. As such, a value for the slope of 3 relates to a collapsed polymer coil, 2 is a signature of Gaussian chains, and 5/3 is for fully swollen polymer coils.9
The analysis using the IFT method was performed in the ATSAS 3.2.1 suite.10 The analysis of data using the generalized Porod approximation was performed on SasView 5.0.3.11

2 Results from the characterization of the proteins
Table S3 Parameters derived from the analysis of the SANS data presented in Figures 3 and 5: Dmax - maximum dimension of the scatterer, r1 - the apparent size of the protein monomer parameterized as the position of the first peak in the p(r), Nagg - the average number of protein units in a self-association equilibrium, and m and n - negative Porod slopes at high and low q, respectively.

Validation of the exponential component analysis from excited-state emission fluorescence
Initially, we aimed to determine the most suitable approach to analyze our data by using an increasing number of decay components, i.e., one, two, and three. As expected, the addition of extra components to the theoretical model improved the quality of the fit (Figure S4 and Table S4). However, beyond two components, the gains were only marginal.

Analysis from the two-state denaturation model
To probe the possibility of co-existing conformations, e.g., variable populations of folded and unfolded states,12 data were analyzed using a two-state denaturation model. This approach uses four characteristic lifetimes, two associated with the folded protein (τF1 and τF2) and two with the unfolded state (τU1 and τU2), with variable populations of each component. This approach was used to analyze the data for Lyz in 1:1:1 ChCl:Glyc:AcOH. The results from the fit are presented in Figure S5 and Table S5. This approach to fitting the data was found to be unsuccessful. In addition to an increase in the χ² parameter compared to the floating double-exponential model (from 1.33 to 1.52), the two-state denaturation model yielded negative αi values for some components, which is physically unrealistic. As such, we concluded that the variation in the composition of the ternary DESs leads to the stabilization of discrete conformational intermediates instead of co-existing folded and unfolded proteins.
Table S5 Parameters derived from the analysis using a two-state denaturation model presented in Figure S5.

Excited-state emission fluorescence of lysozyme in aqueous buffer
To validate our analysis protocol, we determined the excited-state emission fluorescence for Lyz in aqueous buffer and compared the results to literature values. The results from the characterization are presented in Figure S6.
Figure S1 ¹H NMR spectrum of choline (trimethyl-d9) acetate-d3 (400 MHz, D2O).

Figure S4 Excited-state emission fluorescence data, fits, and residuals of Lyz in 1:1:1 ChCl:Glyc:AcOH using a different number of components: (a) one, (b) two, and (c) three. The instrument response function (IRF) is presented in each plot as grey dots.
Table S2 Parameters derived from the analysis of the spectroscopy data for Lyz and BSA in different DESs and aqueous buffer presented in Figures 2 and 4: λd2,Tyr - Tyr peak position obtained from the peak analysis of the second-derivative UV-vis spectra, λmax,Trp - position of the emission maximum for the Trp peak, and [θ]MR - mean residue ellipticity at 222 nm.

Table S4 Parameters derived from the exponential component analysis presented in Figure S4.