Integrated ultra scale-down and multivariate analysis of flocculation and centrifugation for enhanced primary recovery



Introduction
The bioindustry requires continuous improvement of its bioprocesses, which is especially important with the implementation of quality-by-design. Recent advances in upstream bioprocessing allow the production of higher cell density cultures, which can overload downstream processing. One of the most crucial challenges in primary recovery is the isolation of intracellular products after cell disruption. As well as releasing the product of interest, this step also releases impurities such as DNA and host cell proteins. This raises the viscosity and generates fine cell debris, which affects subsequent unit operations such as filtration and chromatography (Balasundaram et al., 2009; Singh et al., 2016; Kang et al., 2013). The introduction of flocculation after homogenisation and before centrifugation can enhance the efficiency of primary recovery. The main objective of flocculation is the aggregation of cell debris to create larger particles. Depending on the flocculating agent, it can also reduce the amount of nucleic acids, colloids, and lipids. As a result, this can improve the centrifugation performance and simplify purification steps (Balasundaram et al., 2009; Le Merdy, 2015; Salt et al., 1995; Singh and Chollangi, 2017; Van Alstine et al., 2018).
Ultra scale-down (USD) models are powerful tools to increase knowledge about products and optimise their processes. They allow the use of small quantities of material to better understand the impact on the sample during biomanufacturing in a time-effective manner. These techniques aim to mimic full-scale process behaviour by using lab-scale devices and methodologies (Rayat ACME et al., 2016; Masri, 2016). Therefore, establishing a scale-down methodology for flocculation and centrifugation will benefit the development and optimisation of a robust large-scale primary recovery.
To this end, the kompAs® ultra scale-down centrifugation device was previously created at University College London (UCL) to understand and extrapolate laboratory results to a large-scale process. The USD centrifugation technique has been used to study the impact of flocculated material on centrifugation (Chatel et al., 2014; Berrill et al., 2008; Espuny Garcia del Real et al., 2014). However, an ultra scale-down flocculation system that allows the study of multi-parameter effects on performance and protein recovery, with applicability at large scale, has not yet been established.
The growing acceptance of quality-by-design in the biopharmaceutical industry has contributed to the increased use of multivariate data analysis (MVDA). Continuous process enhancement and better product characterisation require a clearer understanding of how process parameters affect product quality attributes and process performance, for improved commercial-scale control and monitoring. The analysis of data profiles from cell culture operations, product comparability assessment, root cause analysis in manufacturing and raw material characterisation are common applications where MVDA is implemented (Rathore et al., 2011; Mercier et al., 2014). Principal component analysis (PCA) and partial least squares (PLS) are useful when implementing process analytical technology (PAT) (Rathore et al., 2014) as well as in scale-down validations (Manahan et al., 2019; Tsang et al., 2014). PCA and PLS reduce the dataset dimensions to projection variables, simplifying its representation. PCA is widely used in data extraction as it identifies major trends, data clusters, and relationships between observations and variables. In contrast, PLS combines features from principal component analysis and multiple linear regression to describe the relationships between multiple process variables and process outcomes (Beckett et al., 2018).
The goal of this work is to establish USD flocculation, which can be used in tandem with USD centrifugation for process development, characterisation, and optimisation. The investigation of the links between flocculation and centrifugation processes is not trivial. Thus, the novelty of this study is to demonstrate the use of the newly established USD system in tandem with MVDA to assess the multi-parameter effects of flocculation on primary recovery performance. Acquisition of data from various conditions is possible because of the nature of ultra scale-down, which enables high-throughput experimentation. Thus, USD studies cover a wide range of parameters, such as those that affect individual unit operations, interactions between operations, and comparisons between different process scales (e.g., USD versus pilot scale).
Specifically, the aim is to apply the tandem USD platform established in this work to a sequential multivariate data analysis that evaluates the impact of: (1) system design features (i.e., vessel impeller type, process shear in the centrifuge feed zone); (2) homogenate (feed) variability; and (3) flocculation process parameters (Camp number, flocculant addition time and floc aging time) on primary recovery performance (e.g., %clarification, supernatant filterability, protein titre, overall protein yield). Additionally, PCA is used to transform the particle size distribution (PSD) to allow a more straightforward assessment of floc growth and strength, eliminating subjective evaluation by visual observation of the PSD. Whether the linked flocculation-centrifugation process at ultra scale-down could predict results at a larger scale is also discussed.

Evaluation and scale-up of flocculation and centrifugation
Flocculated cell debris can break up under exposure to process shear imposed by the continuous centrifuge feed zones, decreasing separation efficiency. Therefore, generating strong and dense flocs is essential to avoid floc breakage. One of the most promising ways to study the sensitivity of flocs to process shear is using the kompAs® USD rotating-disc shear device developed at UCL (Rayat ACME et al., 2016). The method consists of exposing the flocs to different levels of shear; subsequently, the particle size distribution (PSD) is measured, and the PSDs of non-sheared and sheared flocs are compared. The stronger the flocs, the smaller the change from the PSD of non-sheared flocs. This comparison is often done manually (Chatel et al., 2014).
The production of strong flocs is very complex. It depends on several factors such as suspension composition, pH, ionic strength, cell system, flocculating agent (type, addition rate and concentration) and mixing shear environment (Camp number and residence time) (Wang et al., 2011; Barany and Szepesszentgyörgyi, 2004; Pearson et al., 2004a). Consequently, the interaction of these factors should be kept constant to expect similar floc properties across different scales. The Camp number (Ca) is a dimensionless group of variables that ensures a similar shear exposure during mixing by taking the shear rate (G, s⁻¹) and residence time (t, s) into account (Bell and Dunnill, 1982):
$$\mathrm{Ca} = G\,t = \left(\frac{P}{V\mu}\right)^{0.5} t \tag{1}$$
where P is the power dissipated into the fluid suspension (W), V is the suspension volume (m³), t is the mixing time (s) and µ is the suspension viscosity (Pa s). For a stirred tank vessel:
$$P = P_O\,\rho\,N^{3}D^{5} \tag{2}$$
where the power number (P_O) depends on the impeller type and can vary depending on the Reynolds number, ρ is the suspension density (kg m⁻³), N is the stirrer speed (s⁻¹), and D is the stirrer diameter (m). Depending on the type of impeller, the design of the tank and baffle can also change (Doran, 1995). The Camp number and mixing time should be kept constant during scale-up, maintaining a consistent mixing and process shear environment across scales. An early work (Bell and Dunnill, 1982) highlighted that a Camp number equal to or greater than 10⁵ generates strong aggregates.
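As an illustration of how the Camp number definition can be used for scale translation, the following minimal sketch computes Ca from the power input of a stirred vessel and back-calculates the stirrer speed needed to reach a target Ca, assuming Ca = G t with G = (P/(V µ))^0.5 and P = P_O ρ N³ D⁵. The function names are hypothetical, and the power number, density, viscosity and mixing time in the example are assumed placeholder values, not measurements from this study.

```python
import math


def power_input(power_number, density, stirrer_speed, impeller_diameter):
    """Power dissipated by the impeller (W): P = Po * rho * N^3 * D^5."""
    return power_number * density * stirrer_speed ** 3 * impeller_diameter ** 5


def camp_number(power, volume, viscosity, mixing_time):
    """Camp number Ca = G * t, with mean shear rate G = sqrt(P / (V * mu))."""
    shear_rate = math.sqrt(power / (volume * viscosity))
    return shear_rate * mixing_time


def stirrer_speed_for_camp(target_ca, power_number, density,
                           impeller_diameter, volume, viscosity, mixing_time):
    """Stirrer speed N (s^-1) that gives target_ca for a fixed mixing time."""
    shear_rate = target_ca / mixing_time
    power = shear_rate ** 2 * volume * viscosity
    return (power / (power_number * density * impeller_diameter ** 5)) ** (1.0 / 3.0)


# Illustrative inputs only (assumed, not taken from the study):
PO = 5.0          # power number of the impeller, assumed constant
RHO = 1000.0      # suspension density, kg m^-3 (assumed)
MU = 0.002        # suspension viscosity, Pa s (assumed)
T_MIX = 900.0     # total mixing time, s (assumed)
D_USD = 0.020     # 20 mm impeller, m
V_USD = 48e-6     # 47 mL homogenate + 1 mL flocculant, m^3

n_usd = stirrer_speed_for_camp(1e5, PO, RHO, D_USD, V_USD, MU, T_MIX)
ca_check = camp_number(power_input(PO, RHO, n_usd, D_USD), V_USD, MU, T_MIX)
print(f"N = {n_usd * 60:.0f} rpm gives Ca = {ca_check:.3g}")
```

Because the power number depends on impeller type and Reynolds number, the same calculation would need a different P_O value for each impeller and flow regime.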
Flocculating agents, such as polyethyleneimine (PEI), can also act as precipitants (Balasundaram et al., 2009; Singh et al., 2016). The reduction of impurities by flocculation and centrifugation can enhance further downstream processes (Balasundaram et al., 2009; Singh et al., 2016; Felo et al., 2015). However, Pearson et al. observed enzyme losses during polyelectrolyte flocculation and explained possible reasons for these losses (Pearson et al., 2004b). According to their study, the product can either interact with the flocculant or be entrapped in the flocs, leading to changes in product yield. Hence, it is important to assess total protein and product yield at each step of flocculation and centrifugation. If there are losses, this will help identify in which steps they occur and which factors affect them.
The use of the kompAs® USD shear device, together with a laboratory centrifuge, mimics full-scale centrifuge clarification, which can then be used to evaluate floc systems. kompAs® simulates the process shear stress encountered by the process material in the centrifuge entry feed zone within various types of centrifuges (Boychyn et al., 2005). The use of Sigma theory enables the comparison of performance between centrifuges of different sizes and designs by maintaining the ratio of flow rate (Q) to equivalent settling area (Σ) constant (Ambler, 1959). The application of Sigma theory to compare USD and full scale was described elsewhere (Maybury et al., 1998, 2000).

Experimental conditions
Ultra scale-down (USD) studies were performed to investigate factors affecting flocculation or centrifugation performance. The factors evaluated were Camp number (Ca), addition time (time required for flocculant addition while mixing, e.g., to add 10 mL of flocculant over 10 min, the pump was set to 1 mL min⁻¹), aging time (extra mixing time after flocculant addition), impeller type, scale, homogenate preparation (feed) and the shear level experienced at centrifuge feed zones for different centrifuge designs or operations. The performance and quality indicators evaluated were the percentage of solids remaining, optical density at 600 nm, the percentage of protein removed, supernatant total protein and the filterability of the supernatant. Finally, particle size distribution was assessed by PCA to describe floc growth and strength. A sequential analysis of the collected data was performed to evaluate the impact of seven variables on six responses (including particle size distribution), as described in Fig. 1(i). The factors and responses are explained in Fig. 1(ii).

Homogenate preparation
Saccharomyces cerevisiae was used as a biological model system for the USD platform proof of concept. It is a common system to study flocculation (Salt et al., 1995; Milburn et al., 1990) and is used for proof-of-concept studies (Espuny Garcia del Real et al., 2014; Lopes and Keshavarz-Moore, 2013; Radhakrishnan et al., 2018). Saccharomyces cerevisiae has a variety of biotechnological applications (Parapouli et al., 2020) and has been used to express industrially relevant products, such as enzymes and recombinant proteins (Vieira Gomes et al., 2018; Huang et al., 2014; Nielsen, 2013). Samples from blocks of active Baker's yeast (Bioreal®, Riegel Am Kaiserstuhl, Germany) were suspended to 10% (w/v) in 10 mM phosphate buffer at pH 7.5. A high-pressure batch homogeniser (APV Lab 60 Homogeniser, APV, Crawley, UK) was used to disrupt the cells (5 passes at 500 bar and 7 °C). The two feeds were prepared from different blocks on different days. The homogenate was divided into 200 mL bottles and frozen at -20 °C immediately after homogenisation. Prior to each experiment, the homogenate was thawed. For Feed 1, homogenate was quick thawed at room temperature. For Feed 2, homogenate was slowly thawed overnight at -4 °C. The particle size distribution and total protein concentration were measured.
Flocculation and centrifugation experiments using these homogenates are illustrated in Fig. 2.

Ultra scale-down flocculation
A 25% (w/v) polyethyleneimine stock solution (linear PEI, 50% (w/v) in water; Sigma-Aldrich, Gillingham, UK) was used. For the ultra scale-down flocculation, 1 mL of the PEI solution was added by a syringe pump to 47 mL of homogenate over 7.5 or 15 min to achieve a final concentration of 5.6% (w_PEI/w_wcw). The flocculation was performed in a 50 mL baffled beaker with an impeller (six-bladed Rushton turbine, dia 20 mm, or 45° pitched four-blade, dia 18.5 mm).
For the laboratory flocculation, 11.1 mL of the solution was added by a syringe pump to 500 mL of homogenate over 7.5 or 15 min to achieve the same final concentration as the USD.Flocculation was performed in a 600 mL baffled beaker with an impeller (45° pitched four-blade, dia 40 mm).
Aliquots (2 mL) of each flocculated material were centrifuged (Eppendorf 5415 centrifuge, rotor F45-24-11, Stevenage, UK) at 15,000 rpm for 30 min at 4 °C to obtain a well-clarified supernatant, whose optical density (OD_ws) was measured before storage at -80 °C. Total protein concentrations of these samples were measured as described in the analytical methods. Samples of flocculated materials were also taken for particle size distribution and optical density (OD_f) measurements.
To scale up flocculation and ensure similar chemical interactions and product yield across different scales, flocculant addition time, dose, and final concentration (in terms of weight of flocculant per weight of cells) were kept constant. Flocculation was carried out at room temperature.
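The dosing rule above can be sketched numerically. The example below scales the flocculant volume in proportion to the homogenate volume and derives the pump rate from the addition time; it uses the lab-scale dose reported here (11.1 mL of PEI solution per 500 mL of homogenate) as the reference. The function names are hypothetical, and proportional scaling by homogenate volume is an assumption that holds only if the homogenate solids content is the same at every scale.

```python
def flocculant_volume_ml(ref_flocculant_ml, ref_homogenate_ml, new_homogenate_ml):
    """Flocculant volume that keeps the flocculant-to-homogenate ratio constant."""
    return ref_flocculant_ml * new_homogenate_ml / ref_homogenate_ml


def pump_rate_ml_per_min(flocculant_ml, addition_time_min):
    """Constant pump rate that delivers the flocculant over the addition time."""
    if addition_time_min <= 0:
        raise ValueError("addition time must be positive")
    return flocculant_ml / addition_time_min


# Reference: lab scale, 11.1 mL PEI solution per 500 mL homogenate.
for label, homogenate_ml in [("USD", 47.0), ("lab", 500.0), ("pilot", 6700.0)]:
    dose = flocculant_volume_ml(11.1, 500.0, homogenate_ml)
    rate = pump_rate_ml_per_min(dose, 15.0)  # 15 min addition time
    print(f"{label}: {dose:.2f} mL flocculant, pump at {rate:.3f} mL/min")
```

With these inputs the pilot-scale dose comes out close to the 149 mL used here, and the USD dose close to 1 mL.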

Ultra scale-down centrifugation
Flocculated samples were exposed to low shear (ε_diss = 0.045 × 10⁶ W kg⁻¹) and high shear (ε_diss = 0.53 × 10⁶ W kg⁻¹) at 6,000 rpm and 12,000 rpm, respectively, for 20 s in a high-speed rotating-disc device (kompAs shear device, UCL, London, UK). The low shear condition is equivalent to the conditions experienced in the feed zone of a hydro-hermetic disc stack centrifuge. The high shear condition mimics the feed zone of a non-hermetic disc stack centrifuge (Hutchinson et al., 2006). Samples from each shear condition were submitted for particle size distribution (PSD) measurements. Subsequently, the samples were placed in a laboratory centrifuge (Eppendorf 5415 centrifuge, rotor F45-24-11) and run at 6,000 rpm for 5 min at 4 °C to mimic full-scale centrifugation with a ratio of flow rate to equivalent settling area (Q_FS/Σ_FS) of 2.8 × 10⁻⁸ m s⁻¹. The supernatant was collected without disturbing the pellet to measure its optical density (OD_s) before storage at -80 °C for later experiments on filterability and measurement of total protein concentration. Route A in Fig. 2 illustrates USD or lab-scale flocculation followed by USD centrifugation. Route B in Fig. 2 also uses USD centrifugation.
Details of how to set up the ultra scale-down centrifugation methodology can be found in the Supplementary material (S1). Scale-up/scale-down was based on mimicking a similar shear stress environment and maintaining a constant Q/Σ across the scales.
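To make the constant Q/Σ criterion concrete, the sketch below converts between a volumetric flow rate and the equivalent settling area for a given Q/Σ, using the pilot-scale operating point reported in this work (100 L h⁻¹ at Q/Σ = 2.8 × 10⁻⁸ m s⁻¹). The function names are hypothetical, and the calculation ignores any centrifuge-specific correction factor that may have been applied when the reported Q/Σ was derived, so the resulting area is indicative only.

```python
def equivalent_settling_area_m2(flow_rate_l_per_h, q_over_sigma_m_per_s):
    """Settling area (m^2) implied by a flow rate and a target Q/Sigma."""
    flow_m3_per_s = flow_rate_l_per_h * 1e-3 / 3600.0
    return flow_m3_per_s / q_over_sigma_m_per_s


def flow_rate_l_per_h(sigma_m2, q_over_sigma_m_per_s):
    """Flow rate (L/h) that gives the target Q/Sigma for a given settling area."""
    return sigma_m2 * q_over_sigma_m_per_s * 3600.0 * 1e3


Q_OVER_SIGMA = 2.8e-8  # m s^-1, value used for both USD and pilot scale

sigma_pilot = equivalent_settling_area_m2(100.0, Q_OVER_SIGMA)
print(f"Implied settling area at 100 L/h: {sigma_pilot:.0f} m^2")

# A machine with twice the settling area would be fed at twice the flow rate
# to keep Q/Sigma, and hence the expected clarification, unchanged.
print(f"Matched flow rate: {flow_rate_l_per_h(2 * sigma_pilot, Q_OVER_SIGMA):.0f} L/h")
```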

Pilot-scale flocculation and centrifugation
The same PEI solution as in the USD experiments was used for pilot-scale flocculation. A peristaltic pump added 149 mL of PEI solution over 15 min to achieve the same final concentration as in the scale-down experiments. The volume of frozen-thawed homogenate required for the pilot scale was 6.7 L. The flocculation was performed in a 10 L baffled vessel (Bioflo Q452100) with a single impeller (six-bladed Rushton turbine, dia 75.5 mm) and mixed at 210 rpm for 30 min. Well-spun samples from pilot-scale flocculation were taken to obtain the turbidity before storage at -80 °C. Flocculated samples were also taken for particle size distribution and optical density (OD_f) measurements. Flocs were then processed by USD or pilot-scale centrifugation (routes B and C in Fig. 2). The two routes aim to establish whether flocs processed through USD or pilot-scale centrifugation give the same performance. By Route B, flocs are exposed to USD centrifugation under the same conditions previously described in section 2.1.4.
By Route C, a continuous disc stack centrifuge (GEA Westfalia PSC-1 disc stack centrifuge, GEA, Milton Keynes, UK) was operated at 100 L h⁻¹ and 13,500 rpm. This condition corresponds to a Q_FS/Σ_FS of 2.8 × 10⁻⁸ m s⁻¹. During the operation, feed to the centrifuge was continuously mixed. The clarified supernatant was collected after reaching a steady state. Prior to storage at -80 °C, samples were taken to

Total protein concentration
Samples were thawed for at least 1 h at 4 °C before use. A Bradford colorimetric assay (3.1 mL method) was used according to the manufacturer's specifications (Sigma-Aldrich, Gillingham, UK). Before adding the Bradford reagent, the samples were diluted 4-fold in 10 mM phosphate buffer, pH 7.5. The mixtures were gently vortexed and incubated for 5-30 min. They were then transferred to cuvettes and read at 595 nm. A calibration curve was generated using a bovine serum albumin (BSA) protein standard (Sigma-Aldrich, Gillingham, UK). Assays were performed in triplicate.

Percentage of solids remaining
The percentage of solids carry-over is obtained from optical density (OD, (-)) measurements, as shown in Eq. 3:
$$\%\,\text{solids remaining} = \frac{OD_s - OD_w}{OD_f - OD_w} \times 100 \tag{3}$$
where OD_s is the optical density of the supernatant after centrifugation, OD_w is the optical density of the well-spun feed (completely sedimented), and OD_f is the optical density of the feed (dilution required). The optical density was measured at a wavelength of 600 nm with a UV-Vis spectrometer (Jenway 6300, Antylia Scientific, St Neots, UK). Samples were diluted in 10 mM phosphate buffer to bring the optical density into the linear range (below 1.0 OD units). The OD was measured in triplicate. A 10 mM phosphate buffer was used as a blank.
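The solids carry-over calculation can be sketched as follows, assuming the commonly used form %solids remaining = (OD_s - OD_w)/(OD_f - OD_w) × 100 and that the feed reading is multiplied back by its dilution factor before use. The function name and the example readings are hypothetical.

```python
def percent_solids_remaining(od_supernatant, od_well_spun, od_feed_reading,
                             feed_dilution_factor=1.0):
    """Percentage of feed solids carried over into the supernatant.

    od_feed_reading is the OD600 measured on the diluted feed; it is
    multiplied by feed_dilution_factor to recover the undiluted value.
    """
    od_feed = od_feed_reading * feed_dilution_factor
    denominator = od_feed - od_well_spun
    if denominator <= 0:
        raise ValueError("feed OD must exceed the well-spun OD")
    return 100.0 * (od_supernatant - od_well_spun) / denominator


# Hypothetical triplicate-averaged readings, feed diluted 10-fold.
solids = percent_solids_remaining(0.15, 0.05, 0.50, feed_dilution_factor=10.0)
print(f"Solids remaining: {solids:.2f} %")
```

A supernatant that matches the well-spun reference gives 0%, and one that matches the feed gives 100%, which is a quick sanity check on the inputs.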

Particle size distribution
Particle size distribution (PSD) was measured with a small volume of sample by laser diffraction using a HydroMV Mastersizer 3000 (Malvern Instruments Ltd., Malvern, UK). These measurements were taken before and after the flocculation step and after exposure to low and high shear. Samples were dispersed in ultrapure water (Milli-Q, Merck-Millipore, Watford, UK) until the % obscuration measured by the Mastersizer was within 8-15%. The Mastersizer software calculates the particle size distribution in terms of particle volume percentage. The resulting particle size distribution is the average of five readings for each sample, for which the percentage coefficient of variation is below 4%.

Filterability
The supernatant filterability was determined by measuring the amount of supernatant that permeated through the membrane at constant transmembrane pressure. For the samples exposed to low shear, five volume measurements were taken at intervals of 30 s. For high-sheared samples, ten volume measurements were taken every 30 s. Filterability was obtained by calculating V_max using Eq. 4:
$$\frac{t}{V} = \frac{1}{Q_o} + \frac{t}{V_{max}} \tag{4}$$
where V is the total filtrate volume per unit membrane area (L m⁻²) collected over time t (s), V_max is the maximum amount of fluid that will pass through the filter before it is completely plugged (L m⁻²), and Q_o is the initial volumetric filtrate flow rate (L m⁻² s⁻¹). The V_max methodology is also used elsewhere (Lau et al., 2013; Arunkumar et al., 2016; Kong et al., 2010).
This method was performed on a liquid handling robotic platform (Tecan Freedom EVO1, Tecan, Reading, UK). Custom-made filter pods and blocks (Biochemical/Chemical Mechanical Workshop, UCL, London, UK) were run using the vacuum block (TeVacS, Tecan Vacuum Separator). A similar vacuum block was used by Kong et al. and Rayat et al. (Rayat ACME and Lye, 2010; Kong et al., 2010). Specifically, in this study, the custom filter pods were fitted with a 0.22 µm PES filter sheet (Millipore, Watford, UK) with an effective area of 0.28 cm². A custom filter block can accommodate four of these filter pods. Before each run, membranes were flushed with 1.5 mL of Milli-Q water at the same transmembrane pressure as the sample runs. The volume of feed supernatant in each well was 1.5 mL, and the method was run at a constant 70 kPa transmembrane pressure and room temperature. All experiments were performed in duplicate. Filterability was measured only for Feed 2.
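One way to extract V_max from timed volume readings is a straight-line fit of t/V against t, assuming the standard linearised pore-plugging form t/V = 1/Q_o + t/V_max, so that the slope is 1/V_max and the intercept is 1/Q_o. The sketch below uses ordinary least squares in plain Python; the function names and the synthetic readings are hypothetical, and the fitting procedure actually used in this work is not specified here.

```python
def to_l_per_m2(volume_ml, filter_area_cm2):
    """Convert a filtrate volume (mL) on a filter area (cm^2) to L m^-2."""
    return volume_ml / filter_area_cm2 * 10.0  # 1 mL/cm^2 equals 10 L/m^2


def fit_vmax(times_s, volumes_l_per_m2):
    """Least-squares fit of t/V = 1/Qo + t/Vmax; returns (Vmax, Qo)."""
    if len(times_s) != len(volumes_l_per_m2) or len(times_s) < 2:
        raise ValueError("need at least two paired readings")
    xs = list(times_s)
    ys = [t / v for t, v in zip(times_s, volumes_l_per_m2)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    if slope <= 0 or intercept <= 0:
        raise ValueError("data do not follow the plugging model")
    return 1.0 / slope, 1.0 / intercept


# Synthetic readings every 30 s generated from Vmax = 100 L m^-2, Qo = 2 L m^-2 s^-1.
times = [30.0, 60.0, 90.0, 120.0, 150.0]
volumes = [t / (1.0 / 2.0 + t / 100.0) for t in times]
vmax, q0 = fit_vmax(times, volumes)
print(f"Vmax = {vmax:.1f} L/m^2, Qo = {q0:.2f} L/m^2/s")
```

With only five or ten points per run, inspecting the linearity of the t/V against t plot before trusting the fitted V_max is advisable.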

Statistical methods
A combination of principal component analysis (PCA), partial least squares regression (PLS), and analysis of variance (ANOVA) was used to identify important factors and trends in the data. PCA is a method for converting complex data sets into orthogonal components known as principal components (PCs). PCs capture the greatest possible variance in the data. Because all the principal components are orthogonal, any redundant information is eliminated. In contrast, PLS accounts for the covariance of both X- and Y-spaces by combining features of principal component analysis and multiple linear regression. The analysis therefore ensures that the X-scores have as much covariance as possible with the Y-scores, which describes the relationships between process variables and process outcomes. In this case, the reduced variables are latent variables (LVs). The loadings show how much influence the original variables had on the model's outcome (Beckett et al., 2018). Data for PCA and PLS analysis were organised so that rows represent samples or data points, and columns correspond to factors and responses. The dataset can then be uploaded into suitable software, where columns are designated as X-variables and Y-variables.
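The idea behind PCA can be illustrated with a minimal, dependency-free sketch that extracts only the first principal component of a samples-by-variables table by power iteration on the covariance matrix. It is not the algorithm implemented in the commercial software used in this work; the function name is hypothetical, the data are mean-centred but not scaled to unit variance (an assumption), and the example table is invented.

```python
import math


def first_principal_component(rows, iterations=500):
    """Return (loadings, scores, explained_fraction) for PC1.

    rows: list of samples, each a list of variable values (same length).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_variance = sum(cov[i][i] for i in range(p))
    if total_variance == 0:
        raise ValueError("data have no variance")

    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalise; the vector converges to the leading eigenvector unless
    # the starting vector happens to be orthogonal to it.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]

    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centred]
    return v, scores, eigenvalue / total_variance


# Invented table: 4 samples (rows) by 2 responses (columns).
table = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8]]
loadings, scores, explained = first_principal_component(table)
print("PC1 loadings:", [round(x, 3) for x in loadings])
print("PC1 explains", round(100 * explained, 1), "% of the variance")
```

The loadings indicate how strongly each original variable contributes to the component, and the scores place each sample along it, which is what the score plots in this work display for the first two components.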
For the same biological sample, technical repeats were taken as average readings. For each bioprocess condition studied, at least two independent repeats were performed and described as biological repeats. SIMCA version 13.0.3.0 (Umetrics AB, Umea, Sweden) was used for PCA and PLS analysis. Design Expert 11 (StatEase, Minneapolis, US) was used to perform an ANOVA of the collected data. Prior to the data analysis with SIMCA or Design Expert 11, data were organised using MS Excel 2016.

Use of PCA and PLS for factor-screening
To begin our investigation, principal component analysis (PCA) of the responses indicated in Fig. 1(ii), excluding PSD, was used as an exploratory model, as seen in Fig. 3. A PCA model with two principal components (PCs) explains 83.3% of Y-variability.
The first principal component (PC1) represents the largest variance of the Y-dataset, while the second principal component (PC2) is orthogonal to the first and describes the Y-dataset's second largest variance. PC1 (x-axis in Fig. 3) shows a separation by level of shear exposure. PC2 (y-axis in Fig. 3) reveals clusters by homogenate feed. Additionally, the results at the pilot-scale centrifuge (labelled "cent" in Fig. 3A) are similar to those obtained with the high shear USD methodology (labelled "HS" in Fig. 3A). This result indicates that our pilot-scale centrifuge operates as a high-shear centrifuge (compared to the low-shear centrifuge mimic used here).
PCA does not explain how the different factors might be related to the responses. While PCA only considers the variability of one set of data, partial least squares (PLS) finds correlations between input factors and responses. As a result, PLS was used to explore the importance of each factor to the overall response. Using two latent variables, 35.6% of the variation in the factors explains 51% of the variation in the responses. The variable influence on projection (VIP) reflects the importance of each factor in the PLS model in relation to the responses, as shown in Fig. 4. The factors and responses explored here are shown in Fig. 1(ii) and were evaluated using the systems described in  By analysing the PLS loading plot (data not shown), it is possible to determine how each factor impacts each response. The shear level has a strong positive correlation with the percentage of solids remaining and optical density and a strong negative correlation with filterability. The homogenate feed has a strong correlation with the supernatant total protein. In fact, homogenate from Feed 1 has, on average, a 1.5-fold higher supernatant total protein content than Feed 2.
Although Feed 1 and Feed 2 were prepared under the same flocculation and homogenisation conditions, the thawing method was different, and homogenisation was conducted on separate days (from different stocks of Baker's yeast packs). Total protein measurements on the homogenate before and after freeze-thaw did not change. As a result, the freeze-thaw process was ruled out as a cause of the observed total protein difference in the homogenate. This difference was attributed to batch variability between the different stocks of the Baker's yeast used in this study, demonstrating how upstream variations can affect protein release during primary recovery. Camp number had less impact on centrifugation performance and protein content than shear level and homogenate feed. Results suggested that a higher Ca produces a supernatant that is easier to filter (i.e., better filterability), with higher protein yield, lower optical density and a lower percentage of solids remaining.
The effect of feed and shear on flocculation was also reported by Chatel et al. However, the importance of achieving optimal flocculant chemistry (e.g., flocculant dose) and Camp number to overcome the impact of feed and shear on floc formation was not mentioned. When working away from the optimal Ca and flocculation chemistry, the shear level and homogenate variability will have more impact on the performance. In this work, the Ca was varied to improve flocculation and centrifugation performance. Nonetheless, even for Ca higher than 10⁵, significant improvements were not observed, as the impact of shear level and feed was still more important. Parameters relating to flocculation chemistry could be explored further to obtain a robust primary recovery where Camp number can have a more significant impact on performance.
Fig. 5 shows the univariate analysis to evaluate the effect of homogenate feed and centrifuge feed zone type on the most affected responses. All the other factors remained constant, and only the USD data were used. These results show the importance of the shear level experienced by the flocs and of feed variability, and align with those illustrated by the PCA and PLS models in Fig. 3 and Fig. 4.
The protein content after flocculation and centrifugation was found to be different between homogenate Feed 1 and Feed 2. Furthermore, Feed 1 had slightly larger particles (PSD data not shown), resulting in smaller flocs and three times fewer solids in the supernatant than Feed 2 when exposed to high shear. As already concluded from the sequential analysis, the feed variability influenced flocculation and centrifugation performance. Pearson et al. had already identified batch variability as the most important factor for enzyme loss variability (Pearson et al., 2004b). Barany and Szepesszentgyörgyi detected an influence of suspension concentration and composition on flocculation efficiency, showing that suspension characteristics and content may vary from batch to batch (Barany and Szepesszentgyörgyi, 2004).
Fig. 5 also shows that solids and optical density in the supernatant increase as the shear mimicked by the kompAs USD device intensifies, such as may be imposed at a centrifuge feed zone. In this case, the prepared flocs are shear sensitive. They break when exposed to the centrifuge feed zone, affecting clarification. Chatel et al. also observed an impact of centrifuge shear level on the breakage of E. coli flocs (Chatel et al., 2014).
The influence of shear stress and feed variability on flocculation and centrifugation is well known (Chatel et al., 2014; Barany and Szepesszentgyörgyi, 2004; Pearson et al., 2004b). This work's novelty is the application of PCA and PLS to study multi-parameter effects on flocculation and centrifugation. Drawing conclusions using univariate analysis for 65 experimental runs and several parameters and responses would be challenging. The use of MVDA allowed a straightforward detection of shear level and feed variability as the main factors influencing flocculation and centrifugation for the specific flocculation chemistry. This analysis led to the conclusion that Camp number is less relevant if flocculation chemistry is not optimised. The same conclusion cannot be drawn from the univariate analysis in Fig. 5 alone. The univariate result only reveals that shear stress and feed variability affect flocculation and centrifugation for the specific Camp number, scale, impeller type, aging and addition time.

Use of ANOVA to investigate factor-interaction
The previous MVDA by PCA and PLS looked at the overall system (Fig. 1(i) A) with all the variables described in Fig. 1(ii). Following this, an ANOVA was performed to investigate the relationship between specific factors and responses (i.e., indicators of process performance) and, specifically, to assess the performance of each feed for different centrifuge types. This analysis identifies specific relationships between each response and factor (Fig. 1(i) B). Moreover, it is possible to detect whether the process scale (i.e., pilot scale or ultra scale-down) impacts each response. We removed the factors that produced the largest responses (e.g., feed variability and shear), which can mask the impact of the other factors studied here (e.g., addition time, aging time). ANOVA compares the amount of variation between groups of samples with the amount of variation within groups. It allows the identification of statistically significant factors for each response. In this study, a factor was considered important when its p-value was lower than or equal to 0.05.
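The between-group versus within-group comparison that underlies ANOVA can be sketched for the simplest case of a single factor. The example below computes the one-way F statistic in plain Python; it is a simplification of the multi-factor analysis performed in dedicated software for this work, the function name and data are hypothetical, and converting F to a p-value for the 0.05 threshold requires the F distribution from a statistics package or table.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of readings."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and replicated readings")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation explained by the factor (between groups).
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Residual variation among repeats (within groups).
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical % solids remaining at three levels of one factor, three repeats each.
f_stat, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0],
                                      [2.0, 3.0, 4.0],
                                      [5.0, 6.0, 7.0]])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the factor levels differ by more than the scatter among repeats would explain, which is the basis for calling a factor statistically significant.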
Table 1 shows that Camp number is the most critical factor for flocculation and centrifugation performance for specific feeds and process shear. It affects clarification and filterability for both feeds and centrifuge types. The pioneering work by Bell and Dunnill (1982) describes how Ca influences floc strength: Camp numbers higher than 10⁵ give strong, rigid particles, whereas Camp numbers lower than 10⁵ generate weak particles which are difficult to handle and separate from the liquid. Consequently, an increased Ca results in better flocculation performance, improving particle separation. However, it is important to emphasise that the generated flocs are shear sensitive, limiting the effect of Camp number. In this case, changing the Camp number on its own was not enough to make stronger flocs. As a result, other factors should be explored to create strong flocs, such as those related to flocculation chemistry and other mechanisms that could potentially increase floc strength. Examples are flocculant type, flocculant dose and final concentration of flocculant (Barany and Szepesszentgyörgyi, 2004; Pearson et al., 2004a, 2004b). It would be important to optimise these parameters along with Camp number to generate flocs strong enough to withstand the shear level encountered at the centrifuge entry feed zones during solid-liquid separation. Even if they break, the floc particles generated should be large enough (i.e., not generate small fines) to maintain a robust centrifuge performance.
The impact of addition and aging time is unclear from these results, while the use of different impellers only shows an influence on the filterability and total protein concentration of the supernatant of a hydro-hermetic (low-shear) centrifuge. We postulate that the filterability difference might be related to a difference in protein content in solution due to floc packing characteristics when using different impellers. This assumption needs to be further investigated. The higher the protein content, the more difficult the supernatant is to filter (Rayat ACME and Lye, 2010). However, the flocculation protein recovery is independent of the impeller type.
The ANOVA crucially demonstrates that the different production scales used in these experiments (i.e., ultra scale-down, lab scale or pilot scale) do not significantly impact flocculation and centrifugation performance. These results validate the tandem USD flocculation-centrifugation system and establish the utility of this USD platform to mimic larger-scale performance.
Fig. 6 shows the raw data comparing flocculation at different scales followed by USD centrifugation for the same flocculation condition and feed. Floc clarification and subsequent filterability are well predicted by USD for the different centrifuge types. Protein losses across scales and for different centrifuge shear levels were not statistically different. These results are the basis of the ANOVA shown in Table 1. Flocculation performance is similar across the scales, and this is reflected in the subsequent centrifugation step. To ensure an effective scale-up, Camp number, mixing time (aging + addition), final flocculant concentration (%w/w) and flocculant dose (%w/w) were kept constant. This scale-up strategy was explained in Section 1.1. When a different vessel impeller is used, the power number must be changed accordingly.
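The constant-Ca scale-up rule can be sketched numerically. The example below assumes the standard stirred-tank relations P = Np ρ N³ D⁵, G = (P / (μ V))^0.5 and Ca = G t, which we take to correspond to the definitions given earlier in the paper. All vessel dimensions, power numbers and fluid properties are hypothetical and chosen only to show how a change in power number or geometry changes the impeller speed needed for the same G.

```python
import math


def impeller_speed_for_g(g_target, power_number, impeller_diameter,
                         volume, density=1000.0, viscosity=1.0e-3):
    """Impeller speed N (rev/s) giving a mean velocity gradient g_target (1/s).

    Assumes P = Np * rho * N**3 * D**5 and G = sqrt(P / (mu * V)), SI units.
    The default density and viscosity are placeholder values.
    """
    power = g_target ** 2 * viscosity * volume  # W
    return (power / (power_number * density * impeller_diameter ** 5)) ** (1.0 / 3.0)


def camp_number(g, mixing_time_s):
    """Camp number taken as the product G * t (dimensionless)."""
    return g * mixing_time_s


g = 257.0                 # 1/s, value quoted in the text
t_mix = (15 + 15) * 60.0  # s, addition time + aging time

# HYPOTHETICAL vessels: (power number, impeller diameter in m, volume in m^3).
vessels = {
    "USD": (1.3, 0.02, 50e-6),
    "pilot": (5.0, 0.10, 20e-3),
}
for name, (n_p, d, v) in vessels.items():
    n = impeller_speed_for_g(g, n_p, d, v)
    print(f"{name}: N = {n * 60:.0f} rpm, Ca = {camp_number(g, t_mix):.3g}")
```

Keeping G and the mixing time fixed keeps Ca fixed, while the impeller speed is recalculated for each vessel from its own power number and geometry.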
Interestingly, the observed reduction in protein yield is independent of scale and centrifuge type. This protein loss is related to the precipitant characteristic of the flocculant PEI (Singh et al., 2016; Felo et al., 2015; Milburn et al., 1990). The same insights on protein yield were observed at ultra scale-down and at pilot scale.

Use of particle size distribution (PSD) combined with PCA to evaluate floc strength and re-appearance of smaller particles
Having established the ultra scale-down floc-centrifugation system, we evaluated floc strength for specific floc preparations (Fig. 1(i)D). This investigation demonstrates how particle size distribution was used to evaluate floc growth and floc breakage during exposure to process shear (i.e., the shear encountered in the feed zones of centrifuges). Fig. 7 shows the particle size distribution of a specific floc preparation (using different Ca) before and after exposure to shear. This representation allows the visualisation of all particle sizes.
For the high Ca (i.e., 4.63 × 10⁵; Fig. 7a), flocculation forms a monomodal bell-shaped curve that is shifted towards larger particle sizes compared with the low Ca (i.e., 4.63 × 10⁴; Fig. 7b). Also, flocs formed at high Ca (Fig. 7a) show a more gradual disruption as the shear level rises. On the other hand, for low Ca (Fig. 7b), the PSD has a similar shape at low and high shear, indicating weaker flocs (i.e., flocs had already reached their final broken state at low shear, so further increases in shear did not cause further breakage). The difference is also clear regarding the generation of fines: flocs prepared at a Camp number of 4.63 × 10⁵ produced fewer fines than those prepared at a Camp number of 4.63 × 10⁴.
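One simple way to put a number on the "generation of fines" discussed above is the volume fraction below a chosen size cut-off, compared before and after shear. The sketch below is illustrative only: the size classes, volume percentages and the 5 µm cut-off are hypothetical assumptions, not values from this study.

```python
def fines_fraction(sizes_um, volume_pct, cutoff_um):
    """Percentage of total particle volume in size classes below cutoff_um."""
    total = sum(volume_pct)
    fines = sum(v for s, v in zip(sizes_um, volume_pct) if s < cutoff_um)
    return 100.0 * fines / total


# HYPOTHETICAL size classes (um) and volume percentages, for illustration only.
sizes = [0.5, 1, 2, 5, 10, 20, 50, 100]
unsheared = [0, 1, 2, 5, 12, 30, 35, 15]
high_shear = [2, 5, 9, 16, 28, 25, 12, 3]

cutoff = 5.0  # um; an assumed threshold for "fines"
for label, psd in [("unsheared", unsheared), ("high shear", high_shear)]:
    pct = fines_fraction(sizes, psd, cutoff)
    print(f"{label}: {pct:.1f}% of volume below {cutoff} um")
```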
It is known that a higher Camp number generates stronger flocs, whereas a smaller velocity gradient (G in Eq. 1) generates bigger flocs (Bell and Dunnill, 1982). Consequently, flocs from low Ca (Fig. 7b) with low G (26 s⁻¹) should be bigger but weaker than those from high Ca with higher G (257 s⁻¹) (Fig. 7a). However, the results show smaller and weaker flocs when low Ca is applied with low G (Fig. 7b). A possible explanation is that flocs formed at such low Ca (< 10⁵) are so weak that they are easily disrupted while handling samples. As a result, the non-sheared PSD may not accurately represent floc size for such low Ca, as the flocs may break while mixing during, or even before, PSD measurement.
At Ca 1.16 × 10⁶ (not shown), flocs are smaller, and only aggregated fines were not disrupted under shear exposure. It seems that the high shear environment (G = 652 s⁻¹) to which flocs were exposed during their formation either created irreversible breakage or prevented good floc growth and strength. Longer aging time is expected to promote a natural selection of flocs with more inter-particle contacts, with large aggregates being created from small ones. Nevertheless, aging time is beneficial only if the polymer's tails and loops are not permanently broken by shear force (Bell and Dunnill, 1982; Zhou and Franks, 2006). Under a high average velocity gradient, an irreversible break-up of flocs can occur, preventing flocs from growing and getting stronger.
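As a quick consistency check on the Ca and G values quoted above, the sketch below back-calculates the mixing time implied by each pair, assuming the usual definition Ca = G t (our assumption for the form of Eq. 1, which is not reproduced in this section). The helper name `implied_mixing_time` is our own.

```python
def implied_mixing_time(camp_number, velocity_gradient):
    """Mixing time in seconds implied by Ca = G * t."""
    return camp_number / velocity_gradient


# (Camp number, G in 1/s) pairs quoted in the text.
conditions = [(4.63e4, 26.0), (4.63e5, 257.0), (1.16e6, 652.0)]

for ca, g in conditions:
    t_s = implied_mixing_time(ca, g)
    print(f"Ca = {ca:.3g}, G = {g:.0f} 1/s -> t = {t_s / 60:.1f} min")
```

Under this assumption each pair implies a mixing time close to 30 min, which would be consistent with a 15 min addition time plus a 15 min aging time, allowing for rounding of the quoted G values.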
The impact of aging time on PSD under different shear levels was also evaluated for Ca = 4.63 × 10⁴ and Ca = 1.16 × 10⁶ (not shown). This analysis supported the results from the PCA, which revealed no significant effect of aging time on floc strength. Note that, apart from its interaction with Ca for Feed 1, aging time did not show any considerable effect on flocculation and centrifugation performance, as shown in Table 1.
Flocs are shear sensitive for all the conditions studied. These results reinforce the earlier point that other factors related to flocculation chemistry (e.g., flocculant dose) need to be investigated to produce stronger flocs. Nonetheless, we have shown that a multifactor assessment of impacts on floc strength and particle reappearance after process shear is complex and can be difficult to perform. Floc strength had to be evaluated by inspecting the particle size distribution of non-sheared and sheared flocs. Manual inspection and comparison of PSD measurements are tedious. For this reason, PCA of particle size distribution data was used to find clusters or trends in the PSD measurements by Camp number, aging time and addition time (Fig. 1(i)D). PCA was performed using particle size distributions of flocs created under different flocculation conditions and PSDs from the same flocs after exposure to different shear levels (low shear and high shear). The particle size distributions show the particle volume percentage for each particle size (typically between 0.1 and 1000 µm). Thus, each data point represents one flocculation experiment, which includes the PSDs of unsheared and sheared flocs (representing flocs in low-shear and high-shear centrifuge mimics). This analysis was performed using Feed 2 at ultra scale-down. The use of PSD changes at different shear levels to evaluate floc strength has also been reported elsewhere (Rayat ACME et al., 2016; Chatel et al., 2014). The innovative part of the current study is the use of MVDA to study the PSD data. In this way, floc growth and strength are represented by a single point on a PCA score scatter plot (Fig. 8) instead of by three different PSDs (floc, low-sheared floc and high-sheared floc) for each flocculation condition. (For an introduction to how principal components are derived, see Bro and Smilde, 2014.)
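The data arrangement described above (one row per flocculation experiment, built from the unsheared, low-shear and high-shear PSDs) can be sketched as follows. This is a minimal illustration, not the authors' workflow: the PSD values are hypothetical, only four size classes are used, the data are mean-centred without further scaling, and the principal components are found by simple power iteration with deflation rather than by the MVDA software used in the study.

```python
import math


def center_columns(rows):
    """Subtract the column mean from every entry."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_component(rows, iterations=500):
    """Leading principal axis of centred rows, by power iteration on X^T X."""
    # Start from the largest row: it lies in the row space, so it is not
    # annihilated by X (a uniform start would be, since each PSD sums to 100).
    start = max(rows, key=lambda r: sum(v * v for v in r))
    norm = math.sqrt(sum(v * v for v in start))
    if norm == 0.0:
        return [0.0] * len(start)
    w = [v / norm for v in start]
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, w)) for row in rows]
        w_new = [sum(s * row[j] for s, row in zip(scores, rows))
                 for j in range(len(w))]
        norm = math.sqrt(sum(v * v for v in w_new))
        if norm == 0.0:
            break
        w = [v / norm for v in w_new]
    return w


def pca_scores(rows, n_components=2):
    """Return (scores per row, explained variance fraction per component)."""
    x = center_columns(rows)
    total_ss = sum(v * v for row in x for v in row)
    scores, explained = [], []
    for _ in range(n_components):
        w = first_component(x)
        t = [sum(a * b for a, b in zip(row, w)) for row in x]
        explained.append(sum(s * s for s in t) / total_ss)
        scores.append(t)
        # Deflate: remove the part of each row already described by w.
        x = [[a - s * b for a, b in zip(row, w)] for row, s in zip(x, t)]
    return list(zip(*scores)), explained


# HYPOTHETICAL PSDs (volume % in four size classes, fine to coarse).
# Each row = unsheared PSD + low-shear PSD + high-shear PSD of one experiment.
experiments = {
    "Ca 4.63e4, run 1": [20, 40, 30, 10] + [35, 40, 20, 5] + [36, 41, 19, 4],
    "Ca 4.63e4, run 2": [22, 38, 30, 10] + [34, 41, 20, 5] + [37, 40, 19, 4],
    "Ca 4.63e5, run 1": [2, 8, 40, 50] + [5, 15, 45, 35] + [12, 28, 40, 20],
    "Ca 4.63e5, run 2": [3, 9, 38, 50] + [6, 14, 46, 34] + [13, 27, 41, 19],
    "Ca 1.16e6, run 1": [10, 45, 35, 10] + [12, 46, 33, 9] + [15, 47, 30, 8],
    "Ca 1.16e6, run 2": [11, 44, 35, 10] + [13, 45, 33, 9] + [14, 48, 30, 8],
}
scores, explained = pca_scores(list(experiments.values()))
for label, (t1, t2) in zip(experiments, scores):
    print(f"{label}: t[1] = {t1:7.2f}, t[2] = {t2:7.2f}")
print("explained variance fractions:", [round(e, 3) for e in explained])
```

Each experiment collapses to one (t[1], t[2]) point, so conditions that produce similar growth and breakage behaviour plot close together on the score plot.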
In Fig. 8, the PCA score plot shows that two principal components explain 83.2% of the data variability. After colouring the score plot by the different factors, three clusters by Camp number were observed. The first principal component (x-axis, Fig. 8) explains 46.7% of the PSD variability and distinguishes the flocs formed at Ca = 4.63 × 10⁵ (i.e., green circles vs red and blue circles). The second principal component (y-axis, Fig. 8) separates flocs formed at a Camp number lower than 10⁵ (i.e., blue circles vs red and green circles). The main characteristics that lead to the observation of the three groups are floc growth and fines reappearance. Flocs formed at Ca = 4.63 × 10⁵ displayed better growth than those formed at higher and lower Camp numbers. On the other hand, the highest reappearance of fines under shear was observed for Ca = 4.63 × 10⁴. Aging and addition time have little or no impact on the particle size distribution under these flocculation conditions.
These results are consistent with those obtained in the earlier visual inspection of the PSDs. PSD data capture floc growth and strength characteristics that can be used to predict centrifugation performance. Their transformation with PCA simplifies the comparison of multiple conditions. This observation is crucial, as it could potentially streamline the experiments needed to screen flocculation conditions. For this work in particular, it suggests that determining the optimal floc chemistry for robust centrifugation performance will be possible without performing centrifuge experiments.

Conclusion
An ultra scale-down tool was established to study flocculation and centrifugation in primary recovery. The USD flocculation and centrifugation experiments generated a large and complex data set, which precludes the use of univariate analysis to extract actionable conclusions or insights. The use of MVDA facilitated data interpretation and visualisation of the relationships within a complex data set, such as those between the process conditions and the various performance indicators for the bioprocess sequence of floc centrifugation. A sequential multivariate analysis was used to evaluate the multiparameter impact on flocculation and centrifugation for a specific flocculation system. The sequential analysis showed that homogenate feed and process shear (e.g., as a consequence of centrifuge feed zone design) have the most significant effect on the efficiency of this flocculation and subsequent centrifugation step. The Camp number was also shown to be important, but with less impact than shear stress and feed variability. Under the conditions used in this study, the flocs formed are shear sensitive, and the increase in Ca did not significantly improve floc strength. It was shown that high values of Ca (≥ 10⁵) do not translate directly into strong flocs if floc chemistry is not optimal. Flocculation is a complex operation that depends on floc chemical properties, mixing parameters and their interactions. Thus, the development of USD flocculation and centrifugation requires a link to MVDA. In this case, the optimisation of the chemistry side of flocculation (e.g., flocculant type, flocculant dose, pH) is a future requirement to obtain stronger flocs capable of overcoming shear stress and feed variability.
For the feed materials used in the study, losses of total protein content were observed only during the flocculation steps. Additionally, the level of protein yield was not significantly affected by any of the other factors studied.
The data produced at different scales were not significantly different, showing that the established USD floc-centrifuge system can be used to predict pilot-scale performance for these linked operations.
At the end of the sequential analysis, a PCA model assessed floc growth and strength by evaluating the changes in particle size distribution at different shear levels. The study of floc growth and strength through PCA of PSDs gave the same insights as those gained by evaluating floc centrifugation performance. PSD analysis of flocs exposed to various shear levels provides an alternative to actual centrifuge studies when optimising flocculation steps for robust primary recovery. Floc strength characterisation is the subject of separate work in our lab.
For the first time, a USD platform was linked to MVDA to maximise the analysis of experimental conditions and responses and so improve primary recovery characterisation. Thus, the sequential analysis demonstrated in this paper can be used to deduce the conditions required to create strong flocs for primary recovery enhancement. Further research in our lab focuses on applying the established USD flocculation and centrifugation to an industry-relevant bioproduction system.

Fig. 1 - Experimental conditions and sequence of data analysis. (i) Data analysis sequence used in this study: in (A), Principal Component Analysis (PCA) and Partial Least Squares (PLS) were used to evaluate key factors influencing the flocculation and centrifugation process; analysis of variance (ANOVA) was used to determine, in (B), the key factors that affect floc process performance for a specific homogenate feed processed through a certain type of centrifuge (hydrohermetic or nonhermetic) and, in (C), whether the bioprocess is affected by experimental system scale; in (D), particle size distribution (PSD) data were used in PCA to determine whether Camp number (Ca), addition time and aging time affect floc strength, using data from USD runs with a 45° pitched 4-blade impeller. (ii) Factors and responses for evaluating flocculation and centrifugation performance and protein recovery.

Fig. 2 - Experimental sequence. Bioprocess studies at (A) ultra scale-down (USD) or lab scale, (B) pilot flocculation and USD centrifugation and (C) pilot-scale flocculation and centrifugation. Sample points are for the measurement of particle size distribution (PSD), optical density (OD), total protein concentration and filterability.

Fig. 3 - Principal Component Analysis (PCA) score scatter plot coloured by the most important factors: (a) centrifugation feed zone shear level and (b) homogenate batch (feed). The first principal component (t[1]) explains 55.8% of the data variability and the second principal component (t[2]) explains 27.5% of the data variability. The 95% confidence ellipse was built from 65 different flocculation and centrifugation conditions. LS is the mimic of a low-shear centrifuge feed zone (ε_diss = 0.045 × 10⁶ W kg⁻¹), HS is the mimic of a high-shear centrifuge feed zone (ε_diss = 0.53 × 10⁶ W kg⁻¹), and cent is the shear level experienced in the pilot-scale centrifuge feed zone.

Fig. 2A, 2B, or 2C. The highest VIPs shown in Fig. 4 correspond to the homogenate feed, process shear in the centrifuge feed zone and Camp number. This result is consistent with the findings shown by the PCA, where the data are grouped by feed and process shear. By analysing the PLS loading plot (data not shown), it is possible to determine how each factor impacts each response. The shear level has a strong positive correlation with the percentage of solids remaining and optical density and a strong negative correlation with filterability. The homogenate feed has a strong correlation with the supernatant total protein. In fact, homogenate from Feed 1 has, on average, a 1.5-fold higher supernatant total protein content than Feed 2. Although Feed 1 and Feed 2 were prepared under the same flocculation and homogenisation conditions, the thawing method was different, and homogenisation was conducted on separate days (from different stocks of Baker's yeast packs). Total protein measurements on the homogenate before and after freeze-thaw did not change. As a result, the freeze-thaw process was ruled out as a cause of the observed total protein difference in the homogenate. This difference was attributed to batch variability between the different stocks of the Baker's yeast used in this study, demonstrating how upstream variations can affect protein release during primary recovery. Camp number had less impact on centrifugation performance and protein content than shear level and homogenate feed. Results suggested that higher Ca produces a supernatant that is easier to filter (i.e., better filterability), with higher protein yield, lower optical density and a lower percentage of solids remaining. The effect of feed and shear on flocculation was also reported by Chatel et al. However, the importance of achieving optimal flocculant chemistry (e.g., flocculant dose) and Camp number to overcome the impact of feed and shear on floc formation was not mentioned. When working away from the optimal Ca and flocculation chemistry, the shear level and homogenate variability will have more impact on the performance. In this work, the Ca was varied to improve flocculation and centrifugation performance. Nonetheless, even for Ca higher than 10⁵, significant improvements were not observed, as the shear level and feed impact were still more important. Parameters relating to flocculation chemistry

Fig. 4 - Identification of the key factors for the flocculation and centrifugation process: variable importance in projection (VIP) for a PLS model with two latent variables. In this PLS model, 35.6% of the variability in the X variables explains 51% of the variability in the Y variables. The red colour of the first five bars identifies the most important factors that impact flocculation and centrifugation (VIP > 1).

Fig. 5 - Influence of homogenate batch and shear stress on flocculation and centrifugation: ultra scale-down flocculation and centrifugation of different homogenate batches (Feed 1 and Feed 2). The flocs were exposed to low and high shear levels (LS: low shear, equivalent to 0.045 × 10⁶ W kg⁻¹; HS: high shear, equivalent to 0.53 × 10⁶ W kg⁻¹) and centrifuged (at Q/Σ = 2.8 × 10⁻⁸ m s⁻¹). The centrifuge performance was evaluated based on: (a) % solids remaining, (b) OD600 and (c) supernatant total protein after centrifugation. Flocculation experiments were performed with a Camp number of 4.6 × 10⁵ (G = 257 s⁻¹), an aging time of 15 min and an addition time of 15 min. Error bars represent mean ± SD (n = 7 for Feed 1 and n = 8 for Feed 2 for each shear level). N.M.: not measured.

Table 1 -
Fig. 6 - Centrifuge performance of different floc preparations from different scales (USD, lab scale and pilot scale): (a) % solids remaining, (b) OD600, (c) supernatant filterability, (d) supernatant total protein and (e) % loss of protein. Ultra scale-down centrifugation was performed at low and high shear levels (LS: low shear, equivalent to 0.045 × 10⁶ W kg⁻¹; HS: high shear, equivalent to 0.53 × 10⁶ W kg⁻¹). During flocculation, the Camp number was 4.63 × 10⁵, with an aging time of 15 min and an addition time of 15 min, for the second homogenate batch (Feed 2). This corresponds to a G of 257 s⁻¹. Data are presented as mean ± SD: n = 5 for ultra scale-down, n = 1 for lab scale and n = 2 for pilot scale.