Measuring and using scanning-gradient data for use in method optimization for liquid chromatography

method-development


Introduction
High-performance liquid chromatography (HPLC) is an indispensable technique in a wide variety of fields, including food science, environmental chemistry, oil analysis, forensics and (bio-)pharmaceutics. In spite of decades of research and development, the mechanisms of HPLC separation are still not fully understood [1-5]. Among the large number of retention mechanisms available, reversed-phase liquid chromatography (RPLC) is the most common separation mode. In RPLC, analytes are mainly separated based on differences in distribution between a relatively hydrophilic (aqueous/organic) mobile phase and a relatively hydrophobic stationary phase [6]. To facilitate elution of all analytes within an appropriate time window, the solvent strength of the mobile phase can be increased during the run by increasing the percentage of organic modifier in a gradient program. Although many chromatographic methods rely on gradient-elution RPLC as an HPLC workhorse, method development can still be time-consuming, since it relies on the adjustment of several method parameters, including the gradient slope, possible steps in the gradient and the duration of an initial isocratic hold (if not zero). Especially for challenging samples, the large number of adjustable parameters calls for extensive trial-and-error or design-of-experiment optimization, which requires extensive gradient training data. This is particularly true for samples of short-term interest (e.g. impurity profiling for a pharmaceutical ingredient in development) or second-dimension separations in 2D-LC, where RPLC is also predominantly used [7]. Still, too often method development involves a great number of trial-and-error experiments, rendering the use of LC time-consuming and costly.

Fig. 1. Workflow of the method optimization using scanning gradients to obtain retention-model parameters. The workflow starts at the top right with an insufficiently resolved sample, on which scanning gradients are performed. After that, the two (or more) scanning gradients are linked by peak tracking and the retention parameters are calculated. For the optimization, the different parameters that need to be optimized and their boundaries must be defined. The optimization program can predict outcomes for all combinations of the different chromatographic parameters that are varied. After that, the assessment criteria must be defined and applied. The optimized separation can then be verified experimentally, which can either lead to an optimized method or trigger an additional iteration.
To facilitate faster method development, many groups have explored the use of computer-aided method development through retention modelling [8-20]. The aim of this approach is to predict optimal method parameters for a specific sample and a specific chromatographic system (i.e. stationary-phase chemistry and mobile-phase composition) through simulation of retention times. Retention modelling will result in faster method development [16], while it may also yield a better understanding of the influence of different parameters, such as organic-modifier concentration and pH, on the retention [15,21]. It is thus not surprising that retention modelling has been widely applied to predict retention of solutes in RPLC as a function of pH, organic-modifier concentration, charge state of the analyte and temperature [22-24]. Several strategies for retention modelling exist, but some of these require either extensive knowledge of the analytes or large quantities of input data [22,25]. One interesting approach, which does not require any a priori knowledge, is the use of scouting experiments. This strategy is employed in several method-optimization software tools, such as DryLab [26], PEWS² [9] and PIOTR [15,16]. Here, a very limited set of specific pre-set gradients is employed to obtain analyte retention times [27]. A suitable retention model, designed to describe retention as a function of mobile-phase composition, is fitted to the experimental data. This yields the retention parameters for each analyte as described by the model. The model is then used to simulate the separation for all analytes under a large number of different chromatographic conditions. The parameters that need to be varied and their boundaries must be defined. Each of the resulting simulated chromatograms is then evaluated against one or more desirability criteria. The optimal separation conditions can, for example, be determined using the Pareto-optimality approach [28]. This process is described in Fig. 1.
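The Pareto-optimality step mentioned above can be illustrated with a minimal sketch. It assumes two criteria per simulated method, analysis time (lower is better) and minimum resolution (higher is better); the function name, the criteria and the numbers are hypothetical and are not taken from any of the cited software tools.

```python
def pareto_front(candidates):
    """Return the labels of candidates not dominated by any other candidate.

    Each candidate is (label, analysis_time_min, min_resolution).
    A candidate is dominated if another one is at least as good on both
    criteria and strictly better on at least one.
    """
    front = []
    for label, time, res in candidates:
        dominated = any(
            (t2 <= time and r2 >= res) and (t2 < time or r2 > res)
            for _, t2, r2 in candidates
        )
        if not dominated:
            front.append(label)
    return front


# Hypothetical simulated methods (illustrative numbers only)
methods = [
    ("A", 3.0, 0.8),
    ("B", 6.0, 1.6),
    ("C", 9.0, 1.5),   # slower than B and less resolved, hence dominated
    ("D", 12.0, 2.1),
]
print(pareto_front(methods))
```

In this toy set only method C is dominated; the remaining methods represent different trade-offs between speed and resolution, from which the analyst still has to choose.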
Retention-model parameters can be determined from isocratic retention data, from gradient-elution retention data, or from both [9]. Isocratic measurements may yield a more accurate description of the retention as a function of mobile-phase composition, but require more tedious experimental work, whereas scanning gradients are less cumbersome. If the shape of the gradient can be accounted for, isocratic data can be used to accurately predict gradient-elution retention times [29,30]; the opposite is less true [31].
Scanning experiments allow LC methods to be rapidly optimized. However, to the best of our knowledge, several factors that may influence the prediction accuracy in retention modelling have hardly been studied systematically, even though they may ultimately determine the usefulness of retention-time prediction. For RPLC, examples of such parameters include (i) selection of the appropriate retention model and the number of parameters in the regression model, (ii) the effect of the gradient slopes used (e.g. whether the use of faster gradients compromises parameter accuracy), (iii) the minimum number of different gradient slopes required, (iv) the minimum difference (leading to a different ratio) between these slopes, and (v) the number of replicate measurements for each gradient-elution condition.
In this work we have studied each of these aspects systematically using two sets of data with different measurement precision. For each data set, each of the above-mentioned parameters is explained and investigated. Additionally, the feasibility and limitations of extrapolation (i.e. predicting much slower or faster gradients than those used for scanning) were investigated. Finally, the results are summarized, and guidelines are formulated for the successful use of gradient-scanning techniques.

Chemicals
For all measurements concerning the first dataset (Set X), the following chemicals were used. Milli-Q water (18.2 MΩ cm) was obtained from a purification system (Arium 611UV, Sartorius, Germany). Acetonitrile (ACN, LC-MS grade) and toluene (LC-MS grade) were purchased from Biosolve Chemie (Dieuze, France). Formic acid (FA, 98%) and propylparaben (propyl 4-hydroxybenzoate, ≥99%) were purchased from Fluka (Buchs, Switzerland). Ammonium formate (AF, ≥99%), cytosine (≥99%), sudan I (≥97%), propranolol (≥99%), trimethoprim (≥99%), uracil (≥99.0%), tyramine (≥98%) and the peptide mixture (HPLC peptide standard mixture, H2016) were obtained from Sigma Aldrich (Darmstadt, Germany). The peptides in the mixture were numbered one to five according to their elution order in RPLC. The following dyes analysed in this study were authentic dyestuffs obtained from the reference collection of the Cultural Heritage Agency of the Netherlands (RCE, Amsterdam, The Netherlands): indigotin, purpurin, emodin, rutin, martius yellow, naphthol yellow S, fast red B, picric acid, flavazine L and orange IV. Stock solutions of all compounds were prepared at the concentrations and with the solvents indicated in Supporting Material Section S-1, Table S-1. From these stock solutions, analytical samples were prepared by combining portions of the stock solutions in equal ratios; the specific compounds that were combined into mixtures are also indicated in Table S-1.

For the second dataset (Set Y), the following chemicals were used. Milli-Q water (18.2 MΩ cm) was obtained from a purification system (Millipore, Billerica, MA). Purpurin (≥90%), propylparaben (≥99%), emodin, toluene, trimethoprim, and the peptide mixture (HPLC peptide standard mixture) were obtained from Sigma Aldrich (United States). Rutin (≥94%) and cytosine were obtained from Sigma Aldrich (China). Berberine and naphthol yellow S were both obtained from Sigma Aldrich (India). Tyramine (≥98%) was obtained from Sigma Aldrich (Switzerland). Sudan I (≥95%) was obtained from Sigma Aldrich (United Kingdom). Propranolol (≥99%) was obtained from Sigma Aldrich (Belgium). Martius yellow was obtained from MP Biomedical (India). Orange IV was obtained from Eastman Chemical Company (United States). Uracil (≥99.85%) was obtained from US Biological. Flavazine L (Acid Yellow 11) was obtained from Matheson Coleman & Bell Chemicals. Stock solutions of individual compounds were prepared at the concentrations and with the solvents indicated in Supporting Material Section S-1, Table S-2. From these stock solutions, analytical samples were prepared by combining portions of the stock solutions in equal ratios; the specific compounds that were combined into mixtures are also indicated in Table S-2.

Instrumental
Experiments of Set X were performed on an Agilent 1290 series Infinity 2D-LC system (Waldbronn, Germany) configured for one-dimensional operation. The system included a binary pump (G4220A), an autosampler (G4226A) equipped with a 20-μL injection loop, a thermostatted column compartment (G1316C), and a diode-array detector (DAD, G4212A) with a sampling frequency of 160 Hz equipped with an Agilent Max-Light Cartridge Cell (G4212-60008, 10 mm path length, V_det = 1.0 μL). The dwell volume of the system was experimentally determined to be about 0.128 mL by using a linear gradient from 100% A (100% water) to 100% B (99% water with 1% acetone) and determining the gradient delay at 50% of the gradient. The injector needle drew and injected at a speed of 10 μL·min⁻¹, with a 2 s equilibration time. The system was controlled using Agilent OpenLAB CDS Chemstation Edition (Rev. C.01.10 [201]). In this study a Kinetex 1.7 μm C18 100 Å 50 × 2.1 mm column (Phenomenex, Torrance, CA, USA) was used.
The experiments of Set Y were performed on a 2D-LC system composed of modules from Agilent Technologies (Waldbronn, Germany), but configured for one-dimensional operation using the 2D-LC valve to introduce samples to the column, and the 2D-LC software to control the mobile-phase composition and the switching of the 2D-LC valve. This type of setup has been described previously [32,33]. The system included a binary pump (G4220A) with Jet Weaver V35 Mixer (p/n: G4220A-90123), an autosampler (G4226A), a thermostatted column compartment (G1316C), and a diode-array detector (DAD, G4212A) with a sampling frequency of 80 Hz equipped with an Agilent Max-Light Cartridge Cell (G4212-60008, 10 mm path length, V_det = 1.0 μL). The 2D-LC valve used in this case was a prototype (p/n: 5067-4236A-nano) that has fixed internal loops with volumes of about 150 nL. Samples were infused directly into the valve at port #3 using a 1 mL glass syringe and a Harvard Apparatus (p/n: 55-2226) syringe pump at a flow rate of 1 μL/min. The dwell volume of the system was about 0.081 mL. The system was controlled using Agilent OpenLAB CDS Chemstation Edition (Rev. C.01.07 [465]). A Zorbax SB 5 μm C18 80 Å 50 × 4.6 mm column (Agilent) was used.

Analytical methods
Set X was recorded with the following method. The mobile phase consisted of buffer/ACN [v/v, 95/5] (mobile phase A) and ACN/buffer [v/v, 95/5] (mobile phase B). The buffer was 5 mM ammonium formate at pH = 3, prepared by adding 0.195 g formic acid and 0.0476 g ammonium formate to 1 L of water. All gradients for this set started with an isocratic hold at 100% A from 0 to 0.25 min, followed by a linear gradient to 100% B in either 1.5, 3, 3.75, 4.5, 6, 7.5, 9 or 12 min. In all gradients, 100% B was maintained for 0.5 min, after which the composition was brought back to 100% A in 0.1 min. Mobile phase A was kept at 100% for 0.75 min before starting a new run. The flow rate was 0.5 mL·min⁻¹ and the injection volume was 5 μL. The peak tables (S-1 to S-8) can be found in Supplementary Material section S-1. The ten replicate measurements were recorded over a span of multiple days. The buffers used as mobile phase were refreshed several times over the duration of this study.
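The Set X gradient program described above can be written down as a small sketch, which may help when converting programmed %B into the actual ACN fraction. The function names are hypothetical; the sketch describes the composition as programmed at the pump (the dwell volume is not accounted for) and assumes ideal volumetric mixing of A (5% ACN) and B (95% ACN).

```python
def percent_b_set_x(t, t_g):
    """Programmed %B at time t (min) for a Set X gradient of duration t_g (min)."""
    t_hold = 0.25                  # initial isocratic hold at 100% A
    t_grad_end = t_hold + t_g      # end of the linear gradient
    t_hold_end = t_grad_end + 0.5  # end of the hold at 100% B
    t_back = t_hold_end + 0.1      # composition back at 100% A
    if t <= t_hold:
        return 0.0
    if t <= t_grad_end:
        return 100.0 * (t - t_hold) / t_g
    if t <= t_hold_end:
        return 100.0
    if t <= t_back:
        return 100.0 * (t_back - t) / 0.1
    return 0.0                     # re-equilibration at 100% A


def acn_fraction(percent_b):
    """Volume fraction of ACN for A = 5% ACN and B = 95% ACN (v/v)."""
    return 0.05 + 0.90 * percent_b / 100.0


def cycle_time(t_g):
    """Programmed time per injection: hold + gradient + hold + return + re-equilibration."""
    return 0.25 + t_g + 0.5 + 0.1 + 0.75


for t_g in (1.5, 3.0, 12.0):
    print(f"t_G = {t_g:4.1f} min: cycle time = {cycle_time(t_g):.2f} min")
```

Under these assumptions, the gradient from 100% A to 100% B spans ACN fractions from 0.05 to 0.95.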
Set Y was recorded using the following conditions. The mobile phase consisted of buffer (mobile phase A) and ACN (mobile phase B), and the flow rate was 2.5 mL/min. The buffer was 25 mM ammonium formate at pH = 3.2, prepared by adding 5.98 g formic acid (98% w/w) and 2.96 mL of ammonium hydroxide (29% w/w) to 2000.0 g of water. All gradients for this set started at 5% B at 0 min, followed by a linear gradient to 85% B in either 1, 1.5, 3, 3.75, 4.5, 6, 7.5, 9, 12 or 18 min. In all gradients, 85% B was maintained for 0.5 min, after which the composition was brought back to 5% B in 0.01 min. The composition was kept at 5% B for 1 min before starting a new run. Ten replicate retention measurements were made for each gradient-elution condition. The entire dataset was collected using a single batch of mobile-phase buffer, over a period of three days.

Data processing
The in-house developed data-analysis and method-optimization program MOREPEAKS (formerly known as PIOTR [16], University of Amsterdam) was used to (i) fit the investigated retention models to the experimental data, (ii) determine the retention parameters for each analyte from the fitted data, and (iii) evaluate the goodness-of-fit of the retention model. Microsoft Excel was used for further data processing.

Compound selection
Compounds were selected to cover a wide range of chemical properties, including charge, hydrophobicity and size, to increase the applicability of the results to a broad range of applications. To facilitate robust detection, UV-vis was chosen as the detection method. Common small-molecule analytes were included, such as toluene, uracil and propylparaben. In addition, a number of synthetic and natural dyes were selected, which feature favourable UV-vis absorption ranges to facilitate identification. Emodin, purpurin, sudan I and rutin were selected as neutral components. Martius yellow, naphthol yellow S, orange IV and flavazine L were included because of their (multiple) negative charges. The pharmaceutical compounds trimethoprim and propranolol were added to the set to include positively charged analytes. Metabolites, such as tyramine and cytosine, were included, but these analytes eluted around the dead time. The column dead time was determined to be 0.262 min for Setup X, with a standard deviation of 0.0027 min (V_0 = 131 μL), and 0.171 min for Setup Y (determined in 50/50 ACN/buffer), with a standard deviation of 0.0005 min (V_0 = 428 μL). These values were calculated from the hold-up time of uracil (non-retained analyte). A standard mixture of peptides was added, yielding a final list of 18 compounds. The retention times of these compounds were measured for eight different gradient slopes for Set X and ten different gradient slopes for Set Y. Each measurement was repeated ten times over the course of several days for both sets. Set X included three extra components, viz. indigotin, picric acid and fast red B, and two extra peptides, while Set Y included berberine. The analyses of Set Y were performed with a single batch of buffer, yielding highly repeatable retention times, whereas Set X was recorded over a span of a week using multiple batches of prepared buffer. This yielded a dataset with highly repeatable data (Set Y), and a set with less-repeatable data (Set X). Where relevant, the measurement precision is shown in the figures in this paper.
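The dead volumes quoted above follow from the dead times and the flow rates given in the analytical methods, and the same conversion yields dwell times from the reported dwell volumes. The following sketch reproduces this arithmetic; the helper names are hypothetical, and the dwell times are derived values that are not reported explicitly in the text.

```python
def volume_ul(time_min, flow_ml_min):
    """Volume (uL) swept in time_min at a flow rate of flow_ml_min."""
    return time_min * flow_ml_min * 1000.0


def time_min(volume_ml, flow_ml_min):
    """Time (min) needed to sweep volume_ml at a flow rate of flow_ml_min."""
    return volume_ml / flow_ml_min


# Dead volume V0 = t0 * F
v0_x = volume_ul(0.262, 0.5)   # Set X: t0 = 0.262 min, F = 0.5 mL/min
v0_y = volume_ul(0.171, 2.5)   # Set Y: t0 = 0.171 min, F = 2.5 mL/min

# Dwell time tD = VD / F (derived here from the reported dwell volumes)
t_d_x = time_min(0.128, 0.5)
t_d_y = time_min(0.081, 2.5)

print(f"V0: Set X = {v0_x:.1f} uL, Set Y = {v0_y:.1f} uL")
print(f"tD: Set X = {t_d_x:.4f} min, Set Y = {t_d_y:.4f} min")
```

The computed dead volumes agree with the stated values of 131 and 428 μL within rounding.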

Decision on the model
Multiple models to describe retention in LC have been proposed [34]. For RPLC separations the most commonly used model is a linear relationship between the logarithm of the retention factor (k) and the volume fraction of organic modifier (ϕ). This model results in a two-parameter log-linear equation, often referred to as the "linear-solvent-strength" (LSS) model [35].

$$\ln k = \ln k_0 - S_{\mathrm{LSS}}\,\phi \tag{1}$$
where ln k is the natural logarithm of the retention factor at a specific modifier concentration, ln k_0 refers to the isocratic retention factor of a solute in pure water, ϕ refers to the volume fraction of the (organic) modifier in the mobile phase, and the slope S_LSS is related to the interaction of the solute and the (organic) modifier. Another two-parameter (log-log) model was proposed by Snyder et al. to describe the adsorption behaviour in normal-phase liquid chromatography (NPLC) [36]. Its intercept, ln k_1, is the value of ln k extrapolated to ϕ = 1.

$$\ln k = \ln k_1 - R \ln \phi \tag{2}$$
In this model, the R parameter is the so-called solvation number, which represents the ratio of surface areas occupied by adsorbed molecules of the strong eluent component and the analyte. A more extensive form of the LSS model is the quadratic model (QM), proposed by Schoenmakers et al., which introduces a third parameter [27].

$$\ln k = \ln k_0 + S_1 \phi + S_2 \phi^2 \tag{3}$$
In this and subsequent retention-model equations, S_1 and S_2 are empirical coefficients used to describe the influence of the organic modifier on the retention of the analyte. Other three-parameter models are also evaluated in this research, viz. the mixed-mode model (MM, Eq. 4), which was developed for HILIC separations [37], and the well-known Neue-Kuss model (NK, Eq. 5).

$$\ln k = \ln k_0 + S_1 \phi + S_2 \ln \phi \tag{4}$$

$$\ln k = \ln k_0 + 2 \ln\left(1 + S_2 \phi\right) - \frac{S_1 \phi}{1 + S_2 \phi} \tag{5}$$
The NK model allows exact integration of the retention equation, thus simplifying retention modelling in gradient-elution LC [14,38]. The above models all account only for the dependence of retention on the organic-modifier concentration. However, charged compounds can also be retained through secondary interactions in RPLC, which can also depend on the organic-modifier concentration. These secondary interactions may lead to increases in prediction errors, and for that reason the results for individual compounds are shown in Figs. 3, 4 and 6-11. In these models, the organic-modifier fraction is related to the retention factor, which can be calculated from the retention time (t_R) and the column dead time (t_0) when performing isocratic elution.

$$k = \frac{t_R - t_0}{t_0} \tag{6}$$
In this calculation, the obtained retention factor can directly be linked to the experimental organic-modifier concentration. When using gradient elution, retention is described by the general equation of linear gradients [27].

$$\frac{1}{B}\int_{\phi_{init}}^{\phi_{init} + B\left(t_R - \tau\right)} \frac{d\phi}{k(\phi)} = t_0 - \frac{t_{init} + t_D}{k_{init}} \tag{7}$$

In this equation, k(ϕ) is the retention model, expressing the relationship between retention (k) and organic-modifier fraction (ϕ). The slope of the gradient (B) is the change in ϕ as a function of time (ϕ = ϕ_init + Bt), and τ is the sum of the dwell time (t_D), the dead time (t_0) and the programmed time before the start of the gradient (t_init), during which elution is isocratic. In this equation, k_init is the retention factor at the organic-modifier concentration at which the gradient starts. If the analyte does not elute during or before the gradient, the retention time is described by

$$t_R = k_{final}\left(t_0 - \frac{t_{init} + t_D}{k_{init}} - \frac{1}{B}\int_{\phi_{init}}^{\phi_{final}} \frac{d\phi}{k(\phi)}\right) + t_G + \tau \tag{8}$$

in which t_G represents the gradient time and k_final is the retention factor at the final composition of the gradient (ϕ_final).
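To make the relation between a retention model and a gradient retention time concrete, the sketch below integrates dϕ/k(ϕ) numerically along a linear gradient and covers the three cases discussed above (elution before, during, or after the gradient). It is a minimal illustration, not the MOREPEAKS implementation. The functional forms of the five models are the commonly cited ones and are assumed here, as are all function names and parameter values; the result for the LSS model is compared against its well-known closed-form solution.

```python
import math

# Retention models k(phi); commonly cited forms, assumed for this sketch.
def k_lss(phi, ln_k0, s):       # log-linear (LSS)
    return math.exp(ln_k0 - s * phi)

def k_ads(phi, ln_k1, r):       # log-log adsorption model (requires phi > 0)
    return math.exp(ln_k1 - r * math.log(phi))

def k_qm(phi, ln_k0, s1, s2):   # quadratic
    return math.exp(ln_k0 + s1 * phi + s2 * phi ** 2)

def k_mm(phi, ln_k0, s1, s2):   # mixed-mode (requires phi > 0)
    return math.exp(ln_k0 + s1 * phi + s2 * math.log(phi))

def k_nk(phi, ln_k0, s1, s2):   # Neue-Kuss
    return math.exp(ln_k0 + 2.0 * math.log(1.0 + s2 * phi)
                    - s1 * phi / (1.0 + s2 * phi))


def gradient_retention_time(k_of_phi, phi_init, phi_final, t_g, t0, t_d,
                            t_init=0.0, steps=20000):
    """Retention time (min) for a linear gradient, by trapezoidal integration."""
    b = (phi_final - phi_init) / t_g          # gradient slope B
    tau = t_d + t0 + t_init
    k_init = k_of_phi(phi_init)
    target = b * (t0 - (t_init + t_d) / k_init)
    if target <= 0.0:                         # elutes before the gradient arrives
        return t0 * (1.0 + k_init)
    d_phi = (phi_final - phi_init) / steps
    integral, prev = 0.0, 1.0 / k_init
    for i in range(1, steps + 1):
        phi = phi_init + i * d_phi
        cur = 1.0 / k_of_phi(phi)
        area = 0.5 * (prev + cur) * d_phi
        if integral + area >= target:         # elutes within this step
            phi_elute = phi - d_phi + d_phi * (target - integral) / area
            return tau + (phi_elute - phi_init) / b
        integral += area
        prev = cur
    # Still on the column when the gradient ends: isocratic at phi_final
    k_final = k_of_phi(phi_final)
    return k_final * (t0 - (t_init + t_d) / k_init - integral / b) + t_g + tau


def lss_closed_form(ln_k0, s, phi_init, phi_final, t_g, t0, t_d, t_init=0.0):
    """Closed-form LSS solution, valid for elution during the gradient."""
    b = (phi_final - phi_init) / t_g
    k_init = math.exp(ln_k0 - s * phi_init)
    tau = t_d + t0 + t_init
    return tau + math.log(1.0 + s * b * (k_init * t0 - t_init - t_d)) / (s * b)


# Hypothetical analyte and conditions (illustrative values only)
params = dict(phi_init=0.05, phi_final=0.95, t_g=6.0, t0=0.262, t_d=0.256, t_init=0.25)
t_num = gradient_retention_time(lambda phi: k_lss(phi, 5.0, 12.0), **params)
t_exact = lss_closed_form(5.0, 12.0, **params)
print(f"numerical: {t_num:.4f} min, closed form: {t_exact:.4f} min")
```

The numerical route is slower than a closed-form expression, but it works for any of the models without requiring an analytical integral.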
One frequently used measure for model selection is the Akaike Information Criterion (AIC) [39]. AIC values can be calculated upon fitting a model to the data by considering the sum-of-squares error of the fit (SSE), the number of observations (i.e. data points, n) and the number of parameters (p). A more negative value reflects a better description of the data by the tested model. Using more parameters generally makes it easier to fit a model to the data, but according to Eq. 9 adding more model parameters is penalized by the AIC.

$$\mathrm{AIC} = 2p + n\left[\ln\left(\frac{2\pi \cdot \mathrm{SSE}}{n}\right) + 1\right] \tag{9}$$
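The trade-off between goodness-of-fit and the number of parameters can be illustrated with a short sketch. It assumes the least-squares form AIC = 2p + n[ln(2π·SSE/n) + 1]; the function name and the SSE values are hypothetical and serve only to show how the penalty works.

```python
import math


def aic(sse, n, p):
    """AIC for a least-squares fit: 2p + n * [ln(2*pi*SSE/n) + 1]."""
    return 2 * p + n * (math.log(2 * math.pi * sse / n) + 1)


# Hypothetical fits to the same n = 8 retention times (illustrative SSE values)
n = 8
aic_two = aic(sse=0.010, n=n, p=2)     # two-parameter model
aic_three = aic(sse=0.009, n=n, p=3)   # three-parameter model, slightly lower SSE

print(f"two parameters:   AIC = {aic_two:.2f}")
print(f"three parameters: AIC = {aic_three:.2f}")
```

In this example the 10% reduction in SSE achieved by the third parameter does not outweigh its penalty, so the two-parameter model obtains the lower (better) AIC value.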
In Fig. 2A, the average AIC values are plotted for the five different models used to fit Set X (left bars) and Set Y (right bars), using all replicate measurements obtained with eight different gradient durations (1.5, 3, 3.75, 4.5, 6, 7.5, 9 and 12 min). The ratios between the gradient time and the dead time are comparable for the two sets, but not identical. The range in t_G/t_0 values covered is 5.9 to 46.9 for Set X and 5.9 to 105.3 for Set Y. Because the range of values is very similar and strongly overlapping, there is no t_G/t_0 bias in our results. Moreover, since we have made no attempt to predict retention on one system using data collected on the other system (i.e. no method transfer), any differences in t_G/t_0 between the datasets are unimportant in the context of this study. For Set X, the plot suggests that the LSS model describes the data best, but the Neue-Kuss and the quadratic model also yield good AIC values, despite using three parameters. However, the data from Set Y were best described by the log-log adsorption model rather than the log-linear LSS model. This suggests that the noise in Set X may obscure the non-linear trend and that scanning experiments are best carried out under highly repeatable conditions. The appropriateness of a non-linear model is consistent with prior observations described in the literature [24,40,41]. However, the NK model results in a poor description when the input data are limited to three gradient durations (Fig. 2B). The latter plot shows a positive average AIC value for the NK model, which indicates a poor description of the data [42].
An alternative method to assess the goodness-of-fit is to check the accuracy of predictions made using the model. When the model parameters are established using only data from three gradient programs, the retention times of the analytes for the remaining five gradient programs may in principle be predicted and used to validate the model. Models were constructed for each set (X and Y) using the data from the scanning gradients of 3, 6, and 9 min duration. These scanning gradients were selected based on the conventional wisdom that the ratio between the slopes of the two most extreme scanning gradients (the gradient-slope factor, GSF) should be at least three [16,31,43]. At this point it is good to note that the effective slope of a gradient is also related to the span of the gradient (Δϕ = ϕ_final − ϕ_initial) and to the dead time (t_0), so that changes in the gradient slope may also occur when changing the flow rate (see Eq. 10).
The performance of the models was assessed by predicting the retention times for gradients of 3.75, 4.5 and 7.5 min. The results are shown in Fig. 3 for both datasets (X and Y). The prediction errors (ε) were calculated using Eq. 11, where t_R,pred is the predicted retention time and t_R,meas is the mean of all considered experimental retention times of the identical gradient. Where relevant, the following figures will indicate which equation was used, and which data points were included. The Neue-Kuss (NK) model performed poorly (see the retention plots in Supplementary Material, section S-6) when using just three input gradients and, therefore, it was omitted from the figure. The results for Set X in Fig. 3 show that the two-parameter LSS and ADS models generally yield similar or better predictions compared to the three-parameter models. The box-and-whisker plots are based on 30 prediction errors (n_r = 30; 3 predicted retention times in 10 replicates). Larger experimental variation results in a greater spread of predicted values, although the average prediction error often remains low. The narrow boxplots in the bottom half of Fig. 3 illustrate that a higher prediction accuracy can be obtained from more-precise data. The adsorption model (purple) yields significantly lower errors than the LSS model for almost all analytes. The predictions using the mixed-mode model, which was developed for HILIC [37], and the quadratic model exhibit relatively large deviations for Set Y. The robustness of fit was found to be better for both two-parameter models (LSS and ADS) than for the three-parameter models (QM, MM and NK; see Supplementary Material, section S-6), for which a significant spread in prediction error was observed.
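As an illustration of how such prediction errors can be computed, the sketch below evaluates a relative error against the mean of the replicate measurements, and alternatively against each individual measurement. The exact form of the error equations is not reproduced here: the signed, percentage-based definition, the function names and the numbers are assumptions made for illustration.

```python
from statistics import mean


def error_vs_mean(t_pred, t_meas_replicates):
    """Relative error (%) of a prediction against the mean measured retention time."""
    t_mean = mean(t_meas_replicates)
    return 100.0 * (t_pred - t_mean) / t_mean


def errors_vs_points(t_pred, t_meas_replicates):
    """Relative errors (%) of a prediction against each individual measurement."""
    return [100.0 * (t_pred - t) / t for t in t_meas_replicates]


# Hypothetical predicted and measured retention times (min)
replicates = [2.495, 2.500, 2.505]
print(f"error vs mean:   {error_vs_mean(2.510, replicates):.2f} %")
print("errors vs points:", [f"{e:.2f} %" for e in errors_vs_points(2.510, replicates)])
```

Comparing against the mean suppresses the measurement scatter, whereas comparing against individual points shows the spread that the measurement precision adds to the apparent prediction error.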

Effect of scanning speed
The total duration of the three measured scanning gradients determines the total time and resources required to obtain the retention data needed to build a retention model. Retention parameters were obtained for all analytes in Set X using three sets of gradients (Series 2, fast; Series 3, regular; Series 4, slow; see Fig. 4, top). For Set Y an additional series (Series 1, very fast; see Fig. 4, bottom) was included. The GSF between the slowest and fastest gradient in each series was always approximately equal to 3. Retention times were predicted for a gradient with a duration within the range of the gradients used (i.e. interpolation). The performance of Series 1 was assessed by predicting the retention time for a 3-min gradient, and that of Series 2, 3 and 4 with gradients of 3.75, 7.5 and 9 min, respectively. The results are shown in Fig. 4.
For the results shown in Fig. 4, the prediction error was calculated using Eq. 11a, which allowed comparison of the four series. The results in Fig. 4 suggest that the effect of the scanning speed (i.e. the different sets of scanning-gradient lengths used) is insignificant relative to the measurement precision. In addition, most predicted retention times deviate less than 0.5% from the measured retention times. For Set Y, almost all prediction errors are below 0.2%. These errors are smaller than those for Set X, even when using very steep gradients (Series 1). Consequently, there is no evidence to support choosing either a fast or a slow set of scanning gradients. The results suggest that relatively short scanning gradients can be used to build a reliable model. However, if the model can only be used for interpolation, the range of useful applications for a series of short gradients may be very narrow, which could be a reason to opt for a broader range of scanning gradients. This will be addressed below in Section 3.3.
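The principle of obtaining retention parameters from scanning gradients and then interpolating can be sketched for the simplest case: one retention time from each of two gradients and the two-parameter LSS model, for which a closed-form gradient equation is available. This is a minimal illustration with hypothetical names and synthetic data. It solves the two equations exactly by bisection and is not the least-squares fitting procedure used in MOREPEAKS; it also assumes that the analyte elutes during both gradients.

```python
import math


def lss_gradient_time(ln_k0, s, t_g, d_phi, phi_init, t0, t_d, t_init=0.0):
    """Closed-form LSS retention time for elution during a linear gradient."""
    b = d_phi / t_g
    k_init = math.exp(ln_k0 - s * phi_init)
    tau = t_d + t0 + t_init
    return tau + math.log(1.0 + s * b * (k_init * t0 - t_init - t_d)) / (s * b)


def fit_lss_two_gradients(t_r1, t_g1, t_r2, t_g2, d_phi, phi_init, t0, t_d, t_init=0.0):
    """Solve ln k0 and S from one retention time on each of two scanning gradients."""
    tau = t_d + t0 + t_init
    b1 = d_phi / t_g1

    def ln_k0_from_first(s):
        # Invert the closed-form expression for gradient 1 at a trial value of S
        x = (math.exp(s * b1 * (t_r1 - tau)) - 1.0) / (s * b1)
        return math.log((x + t_init + t_d) / t0) + s * phi_init

    def residual(s):
        # Mismatch for gradient 2 when ln k0 is forced to reproduce gradient 1
        t_pred = lss_gradient_time(ln_k0_from_first(s), s, t_g2, d_phi,
                                   phi_init, t0, t_d, t_init)
        return t_pred - t_r2

    lo, hi = 0.1, 100.0            # assumed search range for S
    for _ in range(200):           # plain bisection
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    s = 0.5 * (lo + hi)
    return ln_k0_from_first(s), s


# Synthetic "measurements" generated from assumed parameters (ln k0 = 5, S = 12)
cond = dict(d_phi=0.90, phi_init=0.05, t0=0.262, t_d=0.256, t_init=0.25)
t_r_3min = lss_gradient_time(5.0, 12.0, 3.0, **cond)
t_r_9min = lss_gradient_time(5.0, 12.0, 9.0, **cond)

ln_k0_fit, s_fit = fit_lss_two_gradients(t_r_3min, 3.0, t_r_9min, 9.0, **cond)
t_r_6min = lss_gradient_time(ln_k0_fit, s_fit, 6.0, **cond)   # interpolation
print(f"ln k0 = {ln_k0_fit:.3f}, S = {s_fit:.3f}, predicted t_R (6 min) = {t_r_6min:.3f} min")
```

With noise-free synthetic data the assumed parameters are recovered; with real data, replicate measurements and a regression over all scanning gradients would be needed to judge the effect of measurement precision.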

Effect of number of replicate measurements
Building a model using more replicate measurements will generally decrease the influence of the measurement precision on the prediction error. This raises the question of how many replicates suffice (i.e. yield an acceptable prediction error). To investigate this, retention times were predicted for gradient times of 4.5 and 7.5 min as a function of the number of replicate measurements used (i.e. the number of replicates sampled from the total of ten measurements available for each gradient in this study). In all cases, the retention parameters were established for each compound using scanning gradients of 3, 6 and 9 min. The resulting prediction errors for all compounds are shown in Fig. 5 as a function of the number of sampled replicates. Note that the number of points used is much larger for a small number of replicates, as the total pool of experiments allows many more variations.
The trends in Fig. 5 suggest a small improvement in prediction accuracy for Set X (Fig. 5A) as more replicate measurements are sampled, whereas this is not the case for Set Y (Fig. 5B). This is in agreement with the fact that Set X features a lower measurement precision than Set Y. The precision obtained for Set X only becomes similar to that of Set Y when seven or more replicate measurements are used. Although more replicates are usually thought to reduce the effect of experimental variation, Fig. 5B suggests that with high-precision retention-time measurements a single set of experiments may suffice. This is perhaps counterintuitive, but the model is constructed using a total of three gradients. Apparently, with high-precision measurements the model is constrained sufficiently to yield a robust prediction performance. This is also in line with the improved AIC values for the non-linear adsorption (ADS) model for Set Y (see Fig. 2).

Fig. 6. Average prediction errors relative to the measured point of the retention times of each compound for a gradient time of 4.5 and 7.5 min, using 1 to 10 replicate measurements of the experimental scanning gradients for Set X (top, using the LSS model) and Set Y (bottom, using the ADS model). Prediction errors were calculated using Eq. 11b prior to averaging. The spread (standard deviation) of the predicted retention times is indicated by the error bar and the measurement precision is indicated in grey on the right of each cluster. See Supplementary Material, Section S-8, Fig. S-14 for the remainder of the compounds.
Fig. 6 shows the prediction error as a function of the number of replicate measurements for each compound separately for Set X (top) and Set Y (bottom). Generally, the results are in agreement with those of Fig. 5. However, for a number of compounds the influence of the number of replicates is much more pronounced for Set X and, to a lesser extent, also for Set Y. Compounds such as martius yellow, naphthol yellow S, rutin and trimethoprim feature a relatively low measurement precision in Set X. With the exception of rutin, these compounds are charged under the mobile-phase conditions, and thus their retention may be more sensitive to small changes in buffer concentration and pH. In contrast to Set Y, Set X was measured over the span of days, using several batches of buffer. Therefore, chromatographers are encouraged to take all possible measures to maximize the measurement precision before recording scanning gradients. Another difference between Set X and Set Y was the column used; the two stationary phases vary in the extent to which they can interact with analytes through secondary interactions. This could lead to larger prediction errors for charged species.

Replicate scanning gradients or spread their duration?
Another practically relevant question is whether the accuracy of the predictions can be improved by increasing the number of different gradient times that are used, rather than repeating measurements with the same gradient time. To test this, two different sets of scanning gradients were considered, each using a total of six scanning gradients, and thus six retention times per compound for fitting the model. The first set (A) consisted of three replicate measurements each of the 3-min and the 9-min scanning gradients. The second set (B) comprised single measurements from six different scanning gradients (1.5, 3, 3.75, 6, 9 and 12 min duration). The retention times from gradients that were not used to build the model (4.5 and 7.5 min) were used to test the accuracy of prediction. This process was carried out in triplicate, using three different sets of retention times. The absolute errors in the resulting replicates of predicted retention times were pooled before conversion to relative errors and creation of the plots shown in Fig. 7. This was performed with the LSS model for Set X (X1, top left) and the ADS model for Set Y (Y2, bottom right), indicated with the blue background. To make sure that the findings were not model-dependent, the analysis was repeated with the other two-parameter model for each set (X2 and Y1).

Fig. 7 shows that the prediction errors are similar for the set of two gradients performed in triplicate and the set of six different gradients. It is clear that using a non-optimal model (X2 and Y1) increases the prediction error, which is consistent with the results shown in Fig. 3. The difference in prediction error between Fig. 7-X1 and Fig. 7-Y2 is due to the difference in measurement precision between Set X and Set Y. Fig. 7 applies to two-parameter models; for models depending on more data (e.g. Neue-Kuss) this conclusion may not be valid. When the measurement precision is lower, it may be beneficial to use multiple replicates (see Fig. 6). For this reason, and because running fewer different methods with more replicates is easier than measuring a larger number of different gradients just once, replicate measurements may be preferred over a wider spread, at the cost of a reduced interpolation range in t_G.

Effect of the gradient-slope factor of the two most extreme scanning gradients
The gradient-slope factor between the two most extreme scanning gradients (GSF, Eq. 10) is typically chosen around three [16]. For example, when a 3-min scanning gradient is chosen as a starting point, the other scanning gradient that needs to be measured will typically be (at least) 9 min in duration (assuming identical composition span and column dead time). The origin of the GSF ≥ 3 recommendation is unclear. In this section we investigate the effect of the magnitude of the GSF. Combining a 3-min scanning gradient with gradients of 1.5, 3.75, 4.5, 6, 7.5, 9, or 12 min duration will result in GSF values of 0.5 (or 2), 1.25, 1.5, 2, 2.5, 3, and 4, respectively. Previously (Figs. 3, 4, 6, 7), we used the prediction accuracy for a specific gradient as a measure to assess the effects of various parameters. However, this approach cannot be used to compare the influence of the GSF, because a specific gradient will sometimes be within and sometimes outside the range of slopes spanned by the two scanning gradients. Thus, for comparison, the retention parameters (i.e. slopes and intercepts of the retention models: ln k_0 and S values for the data of Set X, described by the LSS model, and ln k_1 and R values for the data of Set Y, described by the ADS model) were obtained for each GSF value and for each compound (with ten replicate measurements per GSF value). The resulting values were then compared with the benchmark values obtained for GSF = 3. In Fig. 8-X1 and 8-X2, respectively, the ln k_0 and S parameters are shown for data Set X, and in Fig. 8-Y1 and 8-Y2, respectively, the ln k_1 and R parameters are shown for data Set Y (all relative to the values obtained for GSF = 3). The extent of the agreement between the calculated parameters indicates a high similarity between the models.
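The GSF values listed above can be reproduced with a few lines of arithmetic. The sketch assumes, as stated in the text, an identical composition span and column dead time for both gradients, so that the ratio of the slopes reduces to the ratio of the gradient times; the function name is hypothetical.

```python
def gradient_slope_factor(t_g_ref, t_g_other):
    """Slope of the reference gradient divided by the slope of the other gradient.

    With identical span and dead time, slope = span / t_g, so the ratio
    of slopes equals t_g_other / t_g_ref.
    """
    return t_g_other / t_g_ref


reference = 3.0                                   # min
others = [1.5, 3.75, 4.5, 6.0, 7.5, 9.0, 12.0]    # min
gsf_values = [gradient_slope_factor(reference, t) for t in others]

for t, gsf in zip(others, gsf_values):
    print(f"3 min + {t:5.2f} min -> GSF = {gsf:.2f}")
```

When the span or the flow rate differs between the two gradients, this simple ratio of gradient times no longer applies and the effective slopes have to be compared instead.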
The plots of Set X in Fig. 8 show that variations in the model parameters are mostly small, except for the fastest scanning gradients (1.5 and 3 min, gradient-slope factor of 0.5, dark blue points). In that case ln k_0 and S tend to covary. The largest variations are observed for charged compounds (e.g. Fig. 8-X2, naphthol yellow S and orange IV) and for rutin, and variations tend to increase with decreasing gradient-slope factor. In the plots for Set Y (Fig. 8-Y1 and 8-Y2) similar trends are visible for martius yellow and toluene. The plots for Set Y include two extra gradient-slope factors (0.33 and 6, based on 1-min and 18-min gradients, respectively). The results for these two additional factors follow a similar pattern. The data for a factor of 0.5 show a larger deviation from the black line (factor of 3) than those for a factor of 2, and the data for a factor of 0.33 deviate significantly from the black line. The data in Fig. 8 suggest that scanning gradients of 3 and 3.75 min (factor of 1.25) produce retention times similar to those obtained from scanning gradients of 3 and 9 min (factor of 3). To verify this, the retention times for the 7.5-min gradient were predicted using fitting parameters obtained from various combinations of scanning-gradient data (with 10 replicates). The results are shown in Fig. 9. Other approaches to establishing the effect of the gradient-slope factor on the prediction error are described in Supplementary Material, section S-10, Figs. S-18 to S-24.
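The parameters compared in Fig. 8 belong to two retention models that can be sketched as follows. The functional forms are the commonly used log-linear and log-log expressions and are assumed here, since the model equations are not reproduced in this section. Here phi denotes the volume fraction of strong solvent, and the numerical values in the usage lines are arbitrary placeholders, not fitted values from Set X or Set Y.

```python
import math


def ln_k_lss(phi, ln_k0, S):
    """Log-linear (LSS) model, assumed form: ln k = ln k0 - S * phi."""
    return ln_k0 - S * phi


def ln_k_ads(phi, ln_k1, R):
    """Log-log (ADS) model, assumed form: ln k = ln k1 - R * ln(phi)."""
    return ln_k1 - R * math.log(phi)


# Placeholder parameters, chosen only to make the arithmetic easy to follow.
print(ln_k_lss(0.5, ln_k0=5.0, S=8.0))   # 1.0
print(ln_k_ads(1.0, ln_k1=-1.0, R=2.0))  # -1.0
```

Under these assumed forms, ln k_0 is the value of ln k at phi = 0 and ln k_1 is the value at phi = 1. Because each model has one intercept and one slope, an error in the slope can be partly compensated by the intercept over a narrow range of phi, which is one plausible reading of the covariation of ln k_0 and S noted above.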
Fig. 9 shows that a gradient-slope factor greater than 3 does not always result in the smallest error. A factor of 4 or 6, based on longer (12 or 18 min) gradients, was expected to yield the most reliable results, but the prediction errors are typically greater than for a factor of 2 or 3. This could feasibly be explained by a lower measurement precision in longer gradient runs, but when the measurement precision is increased, as is the case for Set Y, the same trends are observed. The detrimental effect of using long gradients is more severe for a factor of 6 than for a factor of 4. All these results suggest that the prediction accuracy depends less on the gradient-slope factor than on the proximity of the slope of the scanning gradients to that of the predicted gradient. For example, when retention for a 7.5-min gradient is predicted, the closest scanning gradients are those of 6 min (factor of 2) and 9 min (factor of 3). These conditions result in the lowest prediction errors in Fig. 9. Scanning gradients that differ more from the one that is to be predicted, for example longer gradients of 12 min (factor of 4) or 18 min (factor of 6), or shorter gradients of 4.5 min (factor of 1.5) or 3.75 min (factor of 1.25), result in increased prediction errors, independent of whether interpolation or extrapolation is required. These effects are observed more clearly for Set Y, where the measurement precision is higher. For Set X, the lowest gradient-slope factors yield the highest deviation for charged compounds, such as naphthol yellow S, orange IV and flavazine L. Low factors (below 1) also yield large prediction errors with the data from Set Y. The main conclusion from Fig. 9 is that the proximity of the slope of the scanning gradients to that of the predicted gradient matters much more than the value of the gradient-slope factor per se.
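The proximity argument can be illustrated with a small sketch that ranks candidate scanning gradients by how far their slope lies from that of the target gradient, and checks whether a pair of scanning gradients brackets the target. The distance measure used here (the absolute log-ratio of gradient durations, which equals the log-ratio of slopes when the composition span is identical) is an illustrative assumption and not a metric used in this study. All function names are hypothetical.

```python
import math


def slope_distance(t_scan_min, t_target_min):
    """Absolute log-ratio of two gradient durations (assumed proximity measure)."""
    return abs(math.log(t_scan_min / t_target_min))


def rank_by_proximity(candidates, target):
    """Sort candidate gradient durations from closest to farthest from the target."""
    return sorted(candidates, key=lambda t: slope_distance(t, target))


def is_interpolation(t_a, t_b, target):
    """True when the target duration lies between the two scanning gradients."""
    return min(t_a, t_b) <= target <= max(t_a, t_b)


candidates = [1.5, 3.75, 4.5, 6, 9, 12, 18]  # second scanning gradient, in min
ranked = rank_by_proximity(candidates, 7.5)
print(ranked[:2])                     # [9, 6]
print(is_interpolation(3, 9, 7.5))    # True
print(is_interpolation(3, 6, 7.5))    # False
```

With the 3-min gradient as the fixed partner, the 9-min gradient brackets the 7.5-min target whereas the 6-min gradient does not, yet both are the closest candidates, in line with the observation that proximity matters regardless of interpolation or extrapolation.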

Limits of use
Generally, it is not advisable to extrapolate the retention model to predict retention times for gradients that are shorter or longer than those used for scanning. When applying scanning gradients to the development of LC × LC methods, it is interesting to investigate whether retention times obtained using very short gradients (i.e. similar to conditions used for ²D separations) can be used to predict retention times under gradient conditions where shallower slopes are used (i.e. ¹D methods). For example, the reference scanning-gradient set (i.e. 3, 6 and 9 min) is thought to be best used to predict retention times for gradients with durations between 3 and 9 min. This conventional wisdom is tested in this section of the paper. Using the retention parameters obtained with the reference scanning-gradient set to predict retention for faster gradients, such as 1.5 min, is expected to yield higher prediction errors than scanning sets that embrace this gradient time (Fig. 9). In the top two graphs of Fig. 10, the prediction error for a 1.5-min gradient is shown for all compounds, calculated from a model constructed using retention times obtained from scanning gradients of 3, 6, and 9 min for different numbers of replicates (1 to 10). The prediction error for Set X remains relatively large as the number of replicates increases, irrespective of the measurement precision. This conclusion may be affected by the relatively low flow rate used for such a short gradient time. At higher flow rates, faster gradients are less affected by deformation of the gradient profile [30]. Set Y was recorded with a higher flow rate and a higher precision and, again, the prediction error does not appear to decrease with an increasing number of replicate measurements.
The same approach was used to predict retention times by extrapolation towards shallower gradients. Using the same reference gradient set, the retention times of all compounds were predicted for the 12-min gradient as a function of the number of experiments (Fig. 10). The prediction error decreases with increasing number of replicate measurements for compounds with a large experimental variation (naphthol yellow S, martius yellow) in Set X. The same pattern was observed for other charged compounds (see Supplementary Material, section S-11, Fig. S-25). However, for all the other compounds in Set X and for all compounds in Set Y the prediction error is barely affected by the number of replicate measurements, which is consistent with our earlier conclusion regarding Set Y (see Fig. 6).
The prediction errors resulting from extrapolation towards either slower or faster gradients (Fig. 10) are higher than for gradients with a slope within the range used to establish the model parameters (Fig. 6), but extrapolation towards shallower gradients yields smaller errors than extrapolation towards steeper gradients. Especially for highly charged compounds with low experimental precision, such as martius yellow or naphthol yellow S, multiple replicate measurements may enhance the predictive ability of the model. The same pattern is observed for fast red B and picric acid (Supplementary Material, section S-11, Fig. S-26). However, for compounds with highly repeatable retention times the prediction error is not affected by the number of replicates.
Since gradient-scanning techniques are used for the development and optimization of 2D-LC methods [7,44], prediction of first-dimension retention times (i.e. in slow gradients) from second-dimension retention times (i.e. fast gradients) is of interest. In the previous section, the retention times were predicted for a 12-min gradient using the reference set of scanning gradients (3, 6 and 9 min). The same predictions (12-min gradient) were also made using a model based on retention data from a set of faster gradients (1.5, 3 and 4.5 min) from Set X. For Set Y, retention times for an even slower gradient (18 min) could be predicted using a model constructed from data from an even faster set of scanning gradients (1, 1.5 and 3.75 min). Fig. 11 shows that large errors of up to 4% result from the prediction of retention times for the slow gradient (12 min) from the model based on fast scanning gradients for Set X. In a hypothetical 20-min gradient, this amounts to a difference of 48 s. For Set Y, these errors increase when the difference between the lengths of the target and scanning gradients increases. In almost all cases the retention in slow gradients is overestimated by the model.
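The 48 s figure follows directly from the relative error, assuming for illustration that the 4% error applies to a time scale equal to the full 20-min gradient duration:

```latex
\[
0.04 \times 20~\text{min} = 0.8~\text{min} = 48~\text{s}
\]
```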

Concluding remarks
In this paper we describe a systematic, in-depth study of the application of retention modelling to the development and optimization of RPLC separations. Two data sets were recorded (X and Y), using the same analytes and similar instrumentation, but in different locations and under slightly different conditions. Set X was recorded under typical LC conditions and as such may be representative of common practice. In Set Y, conditions were chosen to minimize the experimental measurement variability, including the use of a higher flow rate (2.5 compared to 0.5 mL/min; see ref. [32]) and precise control over re-equilibration time following gradient elution [45]. This latter data set represents the highest precision achievable in our hands. Five different retention models were investigated. For Set X, a log-linear (or "linear solvent strength", LSS) model was found to provide the best fit of the data; for Set Y a log-log ("adsorption", ADS) model proved optimal. Generally, at least two scanning gradients (for a two-parameter model) that differ in their (effective) slopes by at least a factor of three are used [16,31,43]. Therefore, a benchmark set of three scanning gradients with durations of 3, 6 and 9 min was designated in this study (from 5% to 95% or 5% to 85% of strong solvent for Set X and Set Y, respectively). Fig. 12 was constructed by condensing the effects of the investigated parameters on the prediction accuracy for all compounds studied. We draw the following conclusions from the resulting data.
• Whereas it is frequently recommended that the slopes of scanning gradients used to obtain retention data should vary by a factor of three or so, we do not see any evidence in our results that supports this guideline. That is, similar retention prediction errors were obtained from models based on scanning gradients with slopes varying by a factor of three and from models based on gradients with slopes varying by a factor of as little as 1.25. We also observe that the speed (i.e., absolute analysis or gradient time) does not have a strong impact on prediction error. On the other hand, the data show that the proximity of the slope of the gradient for which retention is to be predicted to that of one of the scanning gradients used to build the model is a far stronger determinant of retention prediction error. With decreasing proximity, it becomes more important that the slope of the target gradient lies between the slopes of the scanning gradients (i.e., interpolation is better than extrapolation, as one would expect). These findings have obvious implications for the design of experiments: using scanning gradients with a large variation in slopes is not required per se, but using a large range of slopes enables prediction of retention for a wider array of gradients without extrapolating.
• When designing experiments for the purpose of building a retention model, one has to decide how to allocate instrument time and choose whether to make more repeat measurements for a small number of scanning gradients, or fewer repeat measurements for a larger set of gradient times. Using prediction error as a metric of model performance, the data do not show any general preference for sets of scanning gradients focused on replicate measurements (e.g., three replicate measurements each of two different gradients) or ones focused on using many different gradient times (e.g., one measurement each of six different gradients). However, in cases where the variability of retention measurements in scanning gradients is high, the predictive performance of models can be improved by making more repeat measurements.
• Finally, predicting retention times for relatively slow gradients using a model constructed from data obtained from fast gradients led to relatively large prediction errors. Unfortunately, this makes it impractical to accurately predict first-dimension retention times using models constructed from second-dimension retention data for use in the development and optimization of comprehensive two-dimensional liquid chromatography.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 2A suggests that the Neue-Kuss model describes the retention relatively well when eight different gradients are used to establish the model (supported by Fig. S-3, using the full set of

Fig. 2. Comparison of average AIC values for all studied components for the five different models using A) all replicate measurements from eight measured gradients (1.5, 3, 3.75, 4.5, 6, 7.5, 9 and 12 min), B) all replicate measurements from the gradients with durations of 3, 6 and 9 min exclusively. For every pair, the first bar depicts the AIC value of Set X and the second bar represents Set Y. See Supplementary Material, section S-3, Tables S-9 through S-18 for a full list of all determined AIC values for all individual components and section S-4, Fig. S-1 for a plot of the AIC values for the complete set of gradients of Set Y.

Fig. 3. Comparison of the prediction errors (for gradient times of 3.75, 4.5, and 7.5 min) relative to the measured points for Set X (top) and Set Y (bottom) using retention parameters obtained from retention data for gradient times of 3, 6 and 9 min in the linear solvent strength (LSS, dark blue), adsorption (ADS, purple), quadratic (QM, orange) and mixed-mode (MM, yellow) models, calculated using Eq. 11a. The box-and-whisker plots are all based on a total of 30 prediction errors, i.e. ten replicates for three different predicted gradients. The whiskers represent the distance from the minimum to the first quartile (0%-25%) and from the third quartile to the maximum (75%-100%) of each set of predictions. The box indicates the interquartile range between the first and third quartile (25%-75%), and the median (50%) is indicated by the horizontal line inside the box. Data are shown for a selected number of analytes. See Supplementary Material, section S-5, Fig. S-2 for the results for the remainder of the compounds in this study.

Fig. 4. Comparison of prediction errors relative to the measured retention times using three (Set X, top) or four (Set Y, bottom) different sets of scanning gradients, with different total durations. Predictions were made with the LSS model for Set X and the ADS model for Set Y and the prediction error was calculated using Eq. 11a. See Supplementary Material, section S-7, Fig. S-13 for the remainder of the compounds. See text for further explanation.

Fig. 5. The relative prediction errors calculated using Eq. 11b for all compounds investigated in this study as a function of the number of sampled replicates from the total pool of experiments for Set X (A) and Set Y (B). The cross represents the mean and the points indicate outliers.

Fig. 7. Prediction error relative to the measured retention time for two different sets of input scanning gradients, one created by repeating measurements and one by spreading measurements. Predictions were performed in triplicate for 4.5-min and 7.5-min gradients, with the LSS model (X1, Y1) and the ADS model (X2, Y2) for both Set X and Set Y. Prediction errors are calculated using Eq. 11b. The cross represents the mean and the points indicate outliers. See Supplementary Material, section S-9, Fig. S-15 for the remainder of the compounds.

Fig. 8. Model parameters obtained for Set X (LSS model; X1, ln k_0; X2, S) and Set Y (ADS model; Y1, ln k_1; Y2, R), all relative to the values obtained for a gradient-slope factor of 3 (black line). Data points reflect averages based on ten replicate measurements. See Supplementary Material, section S-10, Fig. S-16 for the remainder of the compounds.

Fig. 9. Prediction error of retention relative to the measured retention times in a 7.5-min gradient calculated with various combinations of scanning gradients (indicated by the values at the bottom of the figure; one gradient is always 3 min in duration) for Set X (LSS model) and Set Y (ADS model). Prediction errors are calculated using Eq. 11a. Results are based on ten replicate measurements. See Supplementary Material, section S-10, Fig. S-17 for the remainder of the compounds.

Fig. 10. Prediction errors relative to the measured point for retention in a 1.5-min and a 12-min gradient for each compound as a function of the number of replicate experiments, using the reference set of scanning gradients (3, 6, and 9 min) for Set X (using the LSS model; 1.5 min, first frame; 12 min, third frame) and Set Y (using the ADS model; 1.5 min, second frame; 12 min, fourth frame). Prediction errors are calculated using Eq. 11b. The measured precision is shown in grey to the right of each cluster. See Supplementary Material, section S-11, Fig. S-25 and S-26 for the remainder of the compounds.

Fig. 11. Prediction error relative to the measured point for the retention times of all compounds in a 12-min (top) and in an 18-min (bottom) gradient predicted from models constructed using two or three different sets of scanning gradients. Data are based on 10 replicate measurements. Predictions are made with the LSS model for Set X and the ADS model for Set Y. Prediction errors are calculated using Eq. 11a. See Supplementary Material, section S-11, Fig. S-27 for the remainder of the compounds.

Fig. 12. Combined results of all scanning-gradient parameters. The box-and-whisker plots represent the average prediction error of all the compounds for Set X (top) and Set Y (bottom). Predictions are made with the LSS model for Set X and the ADS model for Set Y. Prediction errors are calculated using Eq. 11a for columns with heading Model, Speed, GSF and ²D to ¹D, and Eq. 11b for columns with heading Nr. of repeats, Repeat or spread, Extrapolation 1.5 and Extrapolation 12.