Decoupling Protein Concentration and Aggregate Content Using Diffusion and Water NMR

Protein-based biopharmaceutical drugs, such as monoclonal antibodies, account for the majority of the best-selling drugs globally in recent years. For bioprocesses, key performance indicators are the concentration and aggregate level of the product being produced. In water NMR (wNMR), the water transverse relaxation rate [R2(1H2O)] has previously been used to determine protein concentration and aggregate level; however, it cannot distinguish between the two without an additional technique. This work shows that it is possible to “decouple” these two key characteristics by recording the water diffusion coefficient [D(1H2O)] in conjunction with R2(1H2O), even in the event of overlap in either D(1H2O) or R2(1H2O). This method is demonstrated on three different systems, following appropriate D(1H2O) or R2(1H2O) calibration data acquisition for a protein of interest. Our method highlights the potential use of benchtop NMR as an at-line process analytical technique.


Size-exclusion chromatograms
The peak at an elution volume of approximately 25 mL is a column artefact, and was excluded from any calculations used to determine the percentage monomer content.

Size-exclusion chromatograms
The peak at an elution volume of approximately 25 mL is a column artefact; the peak at approximately 120 mL is histidine, a non-protein component of the mAb concentrate used. These were both excluded from any integral calculations used to determine the percentage monomer content.
To determine the water longitudinal relaxation time [T1(1H2O)], the "T1" pulse sequence was used. This is provided by the manufacturer and was used without modification. The repetition time was kept at 10 s, with 4 scans accumulated. The "T1" pulse sequence is an inversion recovery sequence.¹ To obtain the longitudinal relaxation rate, the longitudinal relaxation time (T1) for the water signal is first extracted by fitting the data to Equation S1:

Fits and residuals plots
where I_t is the signal intensity at time t, and I_0 is the signal intensity at t = 0. A single component fit was used to give T1(1H2O), the reciprocal of which gives R1(1H2O). In these experiments, the following parameters were used: 20 µs radiofrequency (RF) pulse, with a 90° of -6 dB; 200 µs dwell time; 16384 points; 16 delay steps, spaced linearly between a minimum value of 1 ms and a maximum value of 15 s. The total experiment time for one measurement was approximately 14 min.
All NMR data was processed, analysed and visualised in MATLAB R2020b or R2022a (MathWorks, US), using custom scripts. All measurements were recorded in triplicate, with the arithmetic mean value taken to give the sample value; the standard error of this value was used to determine sample error. As the data shown in Figures S20 and S21 has been included for reference only, actual sample values and associated errors are not given.
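The triplicate averaging described above can be illustrated with a short sketch. It is written in Python rather than the MATLAB used in this work, and it assumes that "standard error" means the sample standard deviation divided by the square root of the number of repeats. The function name and the three readings are hypothetical and serve only as an illustration.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (arithmetic mean, standard error of the mean) for repeat measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two repeat measurements")
    mean = statistics.mean(values)
    # Assumption: standard error = sample standard deviation (n - 1) / sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical triplicate R2(1H2O) readings in s^-1 (illustrative values only).
r2_repeats = [0.554, 0.555, 0.556]
r2_mean, r2_sem = mean_and_standard_error(r2_repeats)
print(f"R2 = {r2_mean:.4f} +/- {r2_sem:.4f} s^-1")
```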

Data - BisAb
Data table

Case studies
First case study - BSA
A sample of BSA was studied by a user, and an R2(1H2O) value of 0.555 s⁻¹ and a D(1H2O) value of 2.586 × 10⁻⁹ m² s⁻¹ were calculated.
The standard calibration data for D(1H2O) and R2(1H2O) is consulted (Figure S23):

Figure S23 - Plots showing the effect of concentration and aggregate percentage on R2(1H2O) (A) and D(1H2O) (B) for known BSA calibration samples. The dashed line in (
The solid lines represent linear regression fits to individual sample concentrations and the blue shaded areas represent 95% confidence intervals calculated from the average of the standard error of all data points. In (A), the 95% confidence intervals are multiplied by a factor of 5 for visibility. Sample errors were determined by taking the standard error of the arithmetic mean of three sample measurements; values can be found in Table S2. Error bars are excluded for clarity where the errors are smaller than the symbols used.
The D(1H2O) value of the unknown sample is identified on the y-axis of the reference data plot, and a horizontal line is drawn (see Figure S23B).
A MATLAB script was written to create a (5 × 10) matrix of upper and lower limits of the 95% confidence intervals (obtained from the linear regression fits to the data shown in Figure S23B) for each different BSA sample concentration shown in Figure S23B; this is called "ci_array". The full MATLAB code can be found on Page S24 of the Supporting Information. A (5 × 10) matrix of identical D(1H2O) values (e.g. 2.586 × 10⁻⁹ m² s⁻¹ in this example) was then created; this is called "d_array". The following logical comparison was then performed on the two matrices to identify any D values lying within 0.1% of the upper and lower limits of "ci_array", thus yielding a Boolean array which we call "truth_array":
abs(ci_array - d_array) <= threshold*abs(ci_array)
A horizontal line is then drawn on the reference R2(1H2O) data at the obtained value of 0.555 s⁻¹ (see Figure S23A). This line intercepts the 95% confidence intervals of the R2(1H2O) reference data at four different (and unique) BSA concentrations:
(R2,i) 9.62 mg mL⁻¹, ~36% aggregate.
(R2,ii) 8.26 mg mL⁻¹, ~38% aggregate.
(R2,iii) 6.59 mg mL⁻¹, ~48% aggregate.
(R2,iv) 4.59 mg mL⁻¹, ~78% aggregate.
The only assignment for the unknown BSA sample is therefore a concentration of 9.62 mg mL⁻¹, with 32-36% aggregate.
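The element-wise comparison behind "truth_array" can be sketched as follows. This is a Python illustration, not the authors' MATLAB script: the function name, the 2 × 4 matrix and its values are hypothetical stand-ins for the real (5 × 10) "ci_array", and the default threshold of 0.001 is taken from the 0.1% tolerance stated in the text.

```python
def truth_array(ci_array, d_value, threshold=0.001):
    """Flag each confidence-interval limit that lies within `threshold`
    (relative) of the measured diffusion coefficient.

    Mirrors abs(ci_array - d_array) <= threshold * abs(ci_array), where
    d_array would be a matrix of the same shape filled with d_value.
    """
    return [
        [abs(limit - d_value) <= threshold * abs(limit) for limit in row]
        for row in ci_array
    ]


# Hypothetical 2 x 4 matrix of confidence-interval limits, in units of
# 1e-9 m^2 s^-1. The real "ci_array" is 5 x 10 and comes from the
# calibration fits; these numbers are illustrative only.
ci_array = [
    [2.600, 2.590, 2.585, 2.570],
    [2.640, 2.620, 2.610, 2.587],
]
d_value = 2.586  # measured D(1H2O) from the first case study, same units

truth = truth_array(ci_array, d_value)
# Collect (row, column) positions of the true entries for inspection.
hits = [(r, c) for r, row in enumerate(truth) for c, flag in enumerate(row) if flag]
print(hits)
```

Each true entry marks a calibration limit that the measured value effectively touches, so its row and column identify a candidate concentration and aggregate level.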
Truth array for D = 2.586 × 10⁻⁹ m² s⁻¹
S3. Error bars are excluded for clarity where the errors are smaller than the symbols used.
The R2(1H2O) value of the unknown sample is identified on the y-axis of the reference data plot, and a horizontal line is drawn (see Figure S25A). Upon visual inspection, it is clear that the sample could have two potential identities:
(R2,i) 4.46 mg mL⁻¹, ~85% aggregate.
(R2,ii) 7.14 mg mL⁻¹, ~22% aggregate.
Based on the obtained D(1H2O) value, the user can determine that the concentration of the sample is 7.14 mg mL⁻¹.
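The decoupling step for this case study amounts to filtering the R2(1H2O) candidates by the concentration indicated by D(1H2O). A minimal Python sketch is given here, using the two candidate identities and the concentration quoted above. The `Candidate` type, the function name and the matching tolerance `tol` are hypothetical and are not part of the original workflow.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    concentration: float  # mg mL^-1
    aggregate: float      # approximate % aggregate


def filter_by_concentration(candidates, concentration, tol=0.01):
    """Keep candidates whose concentration matches within tol (mg mL^-1).

    tol is an assumed matching tolerance, chosen only for this sketch.
    """
    return [c for c in candidates if abs(c.concentration - concentration) <= tol]


# Candidate identities read from the R2(1H2O) calibration plot.
r2_candidates = [Candidate(4.46, 85.0), Candidate(7.14, 22.0)]
# Concentration indicated by the D(1H2O) calibration data.
d_concentration = 7.14

assignment = filter_by_concentration(r2_candidates, d_concentration)
print(assignment)
```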
The only assignment for the unknown mAb sample is therefore a concentration of 7.14 mg mL⁻¹.
Figure S1 - Schematic of the protocol followed to generate the different stressed fraction-containing solutions.

Figure S16 - Example data obtained during a "PGSE" pulsed gradient spin echo experiment for three repeat measurements on a 9.62 mg mL⁻¹, 0% stressed fraction BSA solution. The associated fits and residuals for each datapoint were determined in MATLAB R2022a (MathWorks, US). The residuals are multiplied by a factor of 10 for visibility.

Figure S18 - Example data obtained during a "PGSE" pulsed gradient spin echo experiment for three repeat measurements on a 9.03 mg mL⁻¹ BisAb solution before heat-stress. The associated fits and residuals for each datapoint were determined in MATLAB R2022a (MathWorks, US). The residuals are multiplied by a factor of 10 for visibility.

Figure S22 - Normalised SEC chromatogram of a 9.03 mg mL⁻¹ BisAb solution, before (Pre-HT) and after (Post-HT) heat-stress. The peak at an elution volume of approximately 25 mL is a column artefact; the peak at approximately 120 mL is histidine, a non-protein component of the BisAb concentrate used.

Table S1 - Concentrations of the protein solutions studied in this work, as determined by UV-vis spectroscopy.

Table S2 - Data for each BSA solution studied in this work. Percentage aggregate content was determined by size-exclusion chromatography analysis, while the water transverse relaxation rate [R2(1H2O)] and water diffusion coefficient [D(1H2O)] were determined by NMR analysis. Sample errors were determined by taking the standard error of the arithmetic mean of three sample measurements.

Table S3 - Data for each mAb solution studied in this work. Percentage aggregate content was determined by size-exclusion chromatography analysis, while the water transverse relaxation rate [R2(1H2O)] and water diffusion coefficient [D(1H2O)] were determined by NMR analysis. Sample errors were determined by taking the standard error of the arithmetic mean of three sample measurements.

Table S4 - Average effective diameter recorded for a 10.07 mg mL⁻¹ mAb solution at different stressed fractions.
The behaviour of R1(1H2O) with increased stressed fraction for BSA solutions
Sample errors were determined by taking the standard error of the arithmetic mean of three sample measurements. Error bars are excluded for clarity where the errors are smaller than the symbols used.

Table S5 - Data for each BisAb solution studied in this work, before (Pre-HT) and after (Post-HT) heat-stress. The water transverse relaxation rate [R2(1H2O)] and water diffusion coefficient [D(1H2O)] were determined by NMR analysis. Sample errors were determined by taking the standard error of the arithmetic mean of three sample measurements.
abs(ci_array - d_array) <= threshold*abs(ci_array)
Truth array generated using standard calibration data for a BSA sample with a D(1H2O) value of 2.586 × 10⁻⁹ m² s⁻¹. Yellow represents a true value, while blue represents a false value.
From this truth array, the concentrations and % aggregate show that the sample could be any one of the following identities:
(D v) 8.26 mg mL⁻¹, ~64% aggregate.
(D vi) 8.26 mg mL⁻¹, ~71% aggregate.
A sample of mAb was studied by a user, and an R2(1H2O) value of 0.590 s⁻¹ and a D(1H2O) value of 2.5550 × 10⁻⁹ m² s⁻¹ were recorded. The standard calibration data for D(1H2O) and R2(1H2O) is consulted (Figure S25):
The solid lines represent linear regression fits to individual sample concentrations and the blue shaded areas represent 95% confidence intervals calculated from the average of the standard error of all data points. In (A), the 95% confidence intervals are multiplied by a factor of 5 for visibility. Sample errors were determined by taking the standard error of the arithmetic mean of three sample measurements; values can be found in Table S3.

Comparison of 0% and 100% SF data for BSA and mAb
Figure S26 - Plots showing the effect of concentration on R2(1H2O) (A) and D(1H2O) (B) for BSA solutions with 0% (circles) and 100% (diamonds) stressed fraction. Sample errors were determined by taking the standard error of the arithmetic mean of three sample measurements; values can be found in Table S2. Error bars are excluded for clarity where the errors are smaller than the symbols used.
Figure S27 - Plots showing the effect of concentration on R2(1H2O) (A) and D(1H2O) (B) for mAb solutions with 0% (circles) and 100% (diamonds) stressed fraction. Sample errors were determined by taking the standard error of the arithmetic mean of three sample measurements; values can be found in Table S3. Error bars are excluded for clarity where the errors are smaller than the symbols used.