Strategies for predictive modeling of overloaded oligonucleotide elution profiles in ion-pair chromatography

,


Introduction
Oligonucleotides (ONs) are an emerging class of drugs of diagnostic and therapeutic use [1].During synthesis and storage of ONs, a significant number of full-length product (FLP)-derived impurities are generated [2].Methods that qualitatively and quantitatively characterize ONs and their impurities, and can robustly separate them at preparative scale, are critical for the delivery of high-quality ONs.One popular method used in the separation of ONs is ion-pair chromatography (IPC).It is based on reversed-phase liquid chromatography (RPLC).
Unmodified ONs such as ribonucleic acid (RNA) and deoxyribonucleic acid (DNA) are sensitive to endogenous nucleases and are therefore promptly degraded in vivo [3].One strategy to overcome this drawback is the addition of phosphorothioate (PS) modifications to the ON backbone [4].Despite improved nuclease resistance and protein binding properties of such modified ONs, PS modifications produce ONs with a large number of diastereomers [5].More specifically, 2 n diastereomers, with n being the number of PS linkages [6].Thus, for a 16-nucleotide-long fully PS-modified ON, one can expect 32,768 diastereomers.Traditional IPC-based methods can only partially differentiate these diastereomers, which results in peak broadening [7,8].In this context it is worthwhile mentioning that the quality control (QC) of a newly approved therapeutic ON is more challenging than the QC of classical small active pharmaceutical ingredients (APIs) [8].
IPC is a technique for separating charged polar solutes for which RPLC does not provide satisfactory retention [9,10].Adding lipophilic charged components called ion-pair reagents (IPRs) to the mobile phase causes a drastic increase in the retention of charged polar solutes, enabling their separation.IPC has been under development for almost 40 years [10,11].In recent years, the interest in the technique has increased mainly due to the renewed interest of (bio)pharmaceutical industries in the development of ON therapeutics [8].The chromatographic mechanism behind the retention of charged solutes during IPC is still not fully understood.Generally, two main mechanisms are accepted: (i) formation of a less polar ion-pair complex between the IPR and the solute in the mobile phase, followed by interaction of this complex with the stationary phase; and (ii) adsorption of IPR to the stationary phase, followed by retention of solute on this IPR adsorbed on the stationary phase [9].
IPC has been well studied at the analytical scale [8].Nonetheless, such a technique has yet a lot to gain from studies at the preparative (overloaded) scale, especially when it comes to modeling it.In general, IPR is usually considered as changing the retention and is not explicitly modeled [12].Åsberg et al. predicted the overloaded elution profiles of peptides in the absence of additives such as trifluoroacetic acid (TFA) using an equilibrium-dispersive (ED) model together with Langmuir isotherm equations [13].They showed that including TFA in the mathematical model does not influence the ability of the model to correctly predict the overloaded elution profiles.A similar model was applied in overloaded elution profiles of 9-and 10-mer deoxythymidine (dT)-based ON in IPC [14].As this study was limited to relatively simple ONs, further examination of this model using more representative ONs, especially modified ones, is needed.
Recently, we developed a multilayer electrostatic-modified adsorption model describing the adsorption process in IPC [15,16].This multilayer-based model assumes that the IPR adsorbs to the stationary phase, creating a primary layer with active sites that serve as ion-exchange sites for the solute.The solute can then adsorb on the established IPR layer, creating a secondary layer on the stationary phase.In the mathematical description, electrostatic attraction and repulsion were also required.The model was able to predict profiles from Langmuirian to U-shaped, and then back again to Langmuirian profiles, without changing the equations or model parameters [15].Our previous findings illustrate that advanced models can be used in fundamental studies to gain knowledge of the chromatographic process and explain the various phenomena taking place in it.However, from a process perspective, complex models are tedious and challenging to be validated.Therefore, simple mathematical models are desired when it comes to predicting overloaded concentration profiles and using them for process optimization.
The aim of this study is twofold: i) to develop and investigate different adsorption isotherm models that can predict the overloaded concentration profiles of ONs eluted in IPC; and ii) to investigate how the PS modification affects the modeling and separation of closely related impurities.For this purpose, two types of models will be investigated.In the first case, simple isotherm models will be used.They will be one-and two-component competitive Langmuir isotherm models, modified to the gradient mode, that are suitable for process optimization.In the second case, a multilayer isotherm model that considers IPR adsorption on the stationary phase will be derived and used to describe elution profiles in the competitive two-component case.This more advanced model is best suited for fundamental investigations.The onecomponent case involves FLP, while the two-component competitive case includes FLP and FLP with its n -1 truncated ON.As model compounds, three different heteromeric ONs with varying number of phosphorothioate linkages will be employed, enabling us to investigate how the degree of PS modification affects the separation.

Retention modeling
The retention of a solute is dependent on the concentration of a cosolvent in the eluent.Often, the simple linear solvent strength (LSS) theory is used.According to LSS theory, the retention factor's (k) dependency on the volume fraction (φ) of the organic modifier in the eluent can be expressed as [17]: where k w is the retention factor for the solute in the eluent without a modifier and S is the so-called solvent strength parameter, describing the retention dependence of the solute to the modifier content in the mobile phase.
In the case of the solutes being separated using a linear co-solvent gradient, the retention time of the solute can be expressed as [17]: with where k init is the retention factor of the solute in the initial co-solvent composition, t 0 is the column hold-up time, t d is dwell time, t g is the gradient time, G is the gradient steepness factor, and Δφ is the change in the modifier fraction between the beginning and end of the gradient.
To calculate overloaded elution bands, one needs to use a column model together with an appropriate description of the adsorption/ desorption process.In this study, the ED model is used as the column model.It can be described as in [18]: where z is the position along the column, t is the time, u is the superficial velocity, ε t is the total porosity, and D a is the apparent dispersion coefficient; q i and c i are the concentrations of the i:th compound in the stationary and mobile phases, respectively.Danckwerts-type boundary conditions were used at the column inlet and outlet [18].The gradient was linear and has been described by Åsberg et al. [19].

One-component and two-component competitive Langmuir adsorption models
The ED column model needs to be supplemented with an appropriate description of the adsorption/desorption process.For process optimization, when triangular overloaded elution bands are observed, the most commonly used model is the classical Langmuir adsorption isotherm.In gradient modeling, a common assumption is that the adsorption isotherm model itself is not affected, with only the adsorption isotherm parameters being changed.Using the LSS theory, the Langmuir isotherm model modified to gradient elution for the one component can be written as: where a 0 and K 0 are the Henry constant and association equilibrium constant for a mobile phase without any modifier, respectively.S a and S K are the parameters expressing the rate of change of the Henry constant and association equilibrium constant with the modifier in the eluent, respectively.The competitive two-component Langmuir adsorption model for component i can be written as: Observe that the S parameter in the numerator and denominator of the isotherm equations Eqs. ( 5) and (( 6)) can be different.For simplicity, often the same S parameter (S K = S a ) is used in modeling the overloaded elution bands in the gradient mode [12,20].From the above isotherm M. Leśko et al. equations, it follows that the Henry and equilibrium constants change during the gradient elution.Because the Henry constant is the product of the monolayer saturation capacity (q s ) and the equilibrium constant (K), the monolayer saturation capacity is affected by the amount of modifier in the eluent.Unsurprisingly, the use of different S parameters has been shown to improve the agreement between calculated and experimental elution bands [21].

Multilayer two-component competitive adsorption model
The simple adsorption isotherm models presented in Eqs. ( 5) and (6) assume that the IPR plays a more passive/hidden role and that the adsorption of the IPR to the stationary phase can be ignored.From a fundamental perspective this is not a reasonable assumption, because it is known IPR adsorbs to the stationary phase [22] and that this phenomenon affects the adsorption isotherm [15].To develop a more realistic model that incorporates this knowledge and that is simpler than our previously published multilayer model [15], the following assumptions were made: 1 The solute is not adsorbed to the stationary phase if using a mobile phase without any IPR.This assumption is based on the fact that highly polar ONs are generally not retained without any IPR in the eluent.2 The IPR adsorbs to the stationary phase and the adsorption process will depend on the amounts of IPR and organic modifier present in the mobile phase.Here, we will assume that the adsorption process can be described using the gradient-modified Langmuir model (see Eq. ( 7)). 3 The IPR layer serves as the charged surface on which ONs can adsorb.
The monolayer saturation capacity is defined as a 0,IPR , where q IPR is the stationary-phase concentration of IPR and α i is a factor accounting that ONs generally interact with several IPRs in the IPR layer on the stationary phase.For simplicity, we assume that the adsorption process can be described by a competitive Langmuir adsorption model.Here, we will ignore the fact that the adsorption isotherm parameters are directly affected by the amount of cosolvent in the eluent.
From the first and second assumptions we can state that the adsorption of IPR to the stationary phase can be expressed as: where a 0,IPR equals K 0,IPR •q s .According to the third assumption, the adsorbed IPR serves as the charged surface on which the ONs can adsorb, thus: where K 1 and K 2 are the equilibrium constants for the components 1 and 2, respectively.
To simulate the overloaded elution profiles, the multilayer twocomponent competitive adsorption model was used in combination with the ED model (Eq.( 4)).

Chemicals and materials
The mobile phase additives (dibutyl)ammonium acetate (DBuAA) and tributylammonium acetate (TbuAA) were prepared by mixing, respectively, (dibutyl)amine (≥99.5 %, CAS number: 11-92-2) and tributylamine (≥99.5 %, CAS number: 102-82-9) with acetic acid (≥99.5 %, CAS number: 64-19-7).Uracil (>99 %, CAS number 66-22-8) was used as a dead time marker.All these reagents were purchased from Sigma Aldrich (St. Louis, MO, USA).The mobile phases were prepared using HPLC or Optima grade acetonitrile (MeCN) (Fisher Scientific, Loughborough, UK) and deionized water with a resistivity of 18.2 MΩ cm delivered from a Milli-Q EQ 7000 water purification system (Merck Millipore, Darmstadt, Germany).10 μmol or 250 nmol aliquots of lyophilized ONs (see Table 1) were purchased from Integrated DNA Technologies (Leuven, Belgium).The ONs were used without further purification, and prepared as stock solutions of 10 mg mL -1 in deionized water.To minimize degradation, samples were kept at 4 • C. In this study, three different 16-mer ONs with the same sequence but varying degrees of PS modifications were used.These ONs are usually referred to here as FLP.Nonetheless, to differentiate between them, the following simplified naming convention is also used: (i) the 16-mer native ON is referred to as MALAT; (ii) the partially PS-modified 16-mer ON is referred to as MALAT-50PS; and (iii) the fully PS-modified 16-mer ON is referred to as MALAT-PS.See Table 1 for more details.For the impurities, the following naming is used: (a) the 5′ end truncated n -1 impurity of the three FLP ONs is referred to as FLP-dT; and (b) the 5′ end phosphodiester impurity of each PS-modified ON is referred to as (P = O) 1 .See Table 1 for more details.
Except for the purity assessment above, all other experiments described here were carried out on an Agilent 1260 Infinity II Bio-Inert LC system (Agilent Technologies, Palo Alto, CA, USA).An XBridge C18 column (2.1 × 100 mm, 5 µm, 145 Å; Waters Corporation) was employed here.

Characterization of FLP-containing ONs
Conditions adapted from Rentel et al. [23] were employed for the characterization of the MALAT, MALAT-50PS, and MALAT-PS ONs.In each analysis, 3 µL of 0.1 mg mL -1 of each ON was injected into the column.The mobile phases consisted of 10/90 (v/v%) MeCN/10 mM TBuAA (eluent A) and 100 % MeCN (eluent B).The gradient program entailed ramping eluent B from 30 % to 70 % over 33 min, followed by a 1-min wash step at 80 % eluent B and a 10-min re-equilibration step at 30 % eluent B. The flow rate and column temperature were set at 0.25 mL min -1 and 50 • C, respectively.Ultraviolet (UV) chromatograms were acquired using a photodiode array detector, with wavelength ranging from 210 to 400 nm and resolution set to 2.4 nm.UV chromatograms were then extracted at 260 nm.
Acquisitions on the mass spectrometer were performed in negative polarity, high-resolution mode, with a scan time of 0.2 s and a scan range of 500-3000 m/z.For the ESI parameters, the capillary voltage was set at 1.27 kV, sampling cone at 45 V, source temperature at 150 • C, desolvation temperature at 550 • C, and gas flows at 50 L h -1 (for the cone) and 998 L h -1 (for desolvation).Lockspray correction was performed using leucine-enkephalin (100 pg µL -1 ) with m/z 554.2615 and a scan time of 0.2 s at 30-s intervals.

Fundamental study
To comprehensively study adsorption isotherms, one needs to explore the adsorption process across a range of concentrations.This is achieved by having a substantial column load at the overloaded studies.For the overloaded injections, samples were further diluted from the stock solution to 5 mg mL -1 and 0.5 mg mL -1 for the FLP and n -1 ONs, respectively.Analytical samples were diluted to 0.1 mg mL -1 .The injection volumes in the overloaded experiments were 5 µL, 10 µL, 17 µL, and 25 µL in the case of single component, and 5 µL, 10 µL, 15 µL, 20 µL, and 30 µL in the case of two components.The analytical injections were carried out with an injection volume of 3 µL.The different column loads in each gradient slope considered here were utilized to enable determination of calibration curves.The same calibration curves were used for the FLP and n-1 ONs.Eluents were prepared by weighing the appropriate amounts of water and MeCN; however, we will refer here to the volume percentage (v/v%).The concentration of DBuAA was 90 mM.The eluents were stirred for at least 12 h before use.The gradient of the MeCN ran from 21.2 % to 44.0 %.Five different gradient slopes were involved: 2.28 % min − 1 , 1.14 % min − 1 , 0.76 % min − 1 , 0.57 % min − 1 , and 0.456 % min -1 .The gradient parameters were established in such a way as to allow elution of all ONs used here within a reasonable retention time, while keeping the other experimental operational conditions constants, such as mobile phase flow rate and column temperature.
Uracil was used as the void volume marker to calculate the total porosity of the column.Here, it was determined to be 0.562.The external porosity was estimated to be 0.347 based on the Blake-Kozeny-Carman equation [24] and the pressure drops were measured in the column at four different flow rates and using eluent containing 30 % MeCN.The temperature of the column and of the eluent at column inlet was kept at 50 • C. The mobile phase viscosity for the average column pressure was taken from the Reference Fluid Thermodynamic and Transport Properties (REFPROP) database [25].The permeation coefficient in the Blake-Kozeny-Carman equation was set to 150.The external porosity is needed in calculation of the apparent dispersion coefficient, see Section 4.2.
The experiments were carried out using the still-air column temperature control mode at 50 • C. The temperature was selected to reduce potential secondary ON structures affecting the separation.Flow rate was 0.25 mL min -1 .The signals were recorded at 260 nm in the analytical mode and 300 nm or 305 nm in the overloaded mode.Repeatability was checked by making two injections, after each change in the slope of the gradient.

Calculations
The column model was solved using the method of Orthogonal Collocation on Finite Elements (OCFE) to discretize the spatial derivatives [26].To solve the sets of ordinal differential equations obtained after the discretization a C-language Variable-coefficients Ordinary Differential Equation solver (CVODE), available in a Suite of Nonlinear and Differential/Algebraic equation Solver package (SUN-DIAL) [27], was used.The isotherm model parameters were determined using the inverse method according to the procedures applied by Åsberg et al. [19,28], with small modifications as described below.As the optimization method, the deterministic Levenberg-Marquardt procedure was used.The precision of the model was calculated based on the formulas given by Seber and Wild [29] for the 95 % confidence interval of Student's t-test.
In the one-component case, the Langmuir isotherm model has four unknown parameters (i.e., a 0 , K 0 , S a , and S K ; see Eq. ( 5)).Simultaneously estimating all four parameters is rather difficult and often results in no satisfactory agreement between calculated and experimental elution profiles.The main difficulty is to find appropriate starting isotherm parameters so that the optimization algorithm can find a global minimum.This happens especially when the deterministic method for estimating the parameters is applied, which is inefficient in the presence of multiple local minima of the objective function.To achieve more efficient calculation, the isotherm model parameters were estimated in two steps.In the first step, the retention times of the analytical peaks were investigated.In cases where the disagreement between the analytical and the end of the overloaded profiles were notable, the analytical retention times were estimated from the tail of the overloaded elution profile.More specifically, from where the concentration was equal to 1 % of the concentration at the peak apex.This is a valid estimation because for the Langmuir isotherm, the analytical peak elutes at the end of the triangular overloaded elution profiles.In other words, in the first step of determining the isotherm model parameters, the denominator (1 + K 0 ⋅exp( − S K φ)c) of Eq. ( 5) was defined as 1 because the concentration, c, is very low.Thus, the parameters a 0 and S a were estimated from the stated retention times in the experiments conducted using gradient slopes 2.28 % min − 1 , 0.76 % min − 1 and 0.456 % min − 1 .In the second step, the parameters K 0 and S K were estimated based on three overloaded concentration profiles obtained for the same gradient slopes as in the first step, while the parameters determined in the first step were kept constant.Chromatograms for 25 µL, 10 µL, and 25 µL for gradient slopes of 2.28 % min − 1 , 0.76 % min − 1 , and 0.456 % min − 1 , respectively, were used in the second step.
In the two-component case, the parameters of the competitive Langmuir isotherm model Eq. ( 6)) were estimated as described above using the data obtained for gradient of 2.28 % min − 1 , 0.76 % min − 1 , and 0.456 % min − 1 .The parameters a 0 and S a for FLP were estimated again based on data obtained for the two-component case, due to a small shift observed between the elution times of the FLP bands in the one-and two-component cases.The parameters of the two-layer model (Eqs.(7) and ((8)) were estimated in one step based on the same overloaded concentration profiles as in the competitive Langmuir isotherm model.In this case it impossible to perform the estimation of the parameters in two step and thus simplification of the determination of the model parameters.The parameters of the IPR isotherm model (Eq.( 7)) were determined together with those of the solute isotherm model (Eq.( 8)) based on the elution profiles of MALAT-PS.For the other ONs considered in this study, the parameters for IPR adsorption came from the MALAT-PS case and were kept constant while determining the solute adsorption model.

Results and discussion
This section is divided into four parts.In Section 4.  model is introduced and verified based on the two-component case.

Analytical investigation
The different FLP ONs employed here were first analyzed by LC coupled to a mass spectrometer with high-resolution accurate-mass capabilities.These analyses revealed the presence of many impurities in addition to the ones selected for this study (i.e., FLP-dT and (P = O) 1 ).As summarized in Fig. S1a-c and Table S2a-c (see Supplementary material), different truncated impurities corresponding to a variety of shortmers were detected in MALAT, MALAT-50PS and MALAT-PS.Moreover, phosphodiester impurities were detected in MALAT-50PS and MALAT-PS.Using a combination of UV and mass spectrometry, the FLP purity in these ON samples were determined to be 85.4 %, 82.2 %, and 83.0 % for MALAT, MALAT-50PS, and MALAT-PS, respectively.
To investigate the separation of ONs and their closely-related impurities, small amounts of the different species were individually injected and separated using a very shallow gradient (0.456 % min -1 ).Three different FLP ONs were investigated: unmodified ON (MALAT), partially PS-modified ON (MALAT-50PS), and fully PS-modified ON (MALAT-PS).Closely-related n -1 impurities truncated from the 5′ end (i.e., FLP-dT) of all three FLP ONs, as well as the phosphodiester (P = O) 1 impurities of the PS-modified ONs (i.e., MALAT-PS and MALAT-50PS) were also investigated (see Table 1).Fig. 1 shows the chromatograms from a 3-µL injection of 0.1 mg mL -1 of each ON with gradient slope 0.456 % min -1 .The inserts in Fig. 1 show zoomed chromatograms for the FLP-dT and (P = O) 1 impurities overlaid with their corresponding FLP.For MALAT, only FLP-dT is overlaid in the insert.By inspecting Fig. 1, one sees that retention increases with the amount of PS modifications.This is expected due to the increased hydrophobicity of ONs with the introduction of sulfur atoms to the phosphodiester linkages.In the inserts, one can see that all FLPs (black lines) are well separated from their corresponding FLP-dT impurities (red lines) and that the selectivity between the PS-modified FLPs and their respective (P = O) 1 impurities is much lower.
To understand how sensitive the separation of ONs is to changes in % MeCN in the eluent, four additional gradient slopes were also evaluated (i.e., 0.57 % min − 1 , 0.76 % min − 1 , 1.14 % min − 1 , and 2.28 % min − 1 ).From the retention factor determined from these five gradients, k init and the S parameters were calculated using Eqs.( 2) and ( 3).The coefficient of determination (R 2 ) correlating theoretical and experimental retention times at different gradient slopes are ≥0.9994(see Table S2).Fig. 2 presents the corresponding calculated k init and S parameters with 95 % confidence interval.The S parameters are similar for the FLPs and their respective impurities, i.e., FLP-dT and (P = O) 1 (Fig. 2a).This indicates that it is difficult to resolve FLP from its similar impurities by adjusting the gradient slope.For instance, the selectivity factors calculated for the FLP and its (P = O) 1 impurity are 1.04 and 1.03, while for the FLP and its FLP-dT impurity they are 1.07 and 1.11 for MALAT-PS and MALAT-50PS, respectively.These values correspond to the shallowest gradient used in this study (i.e., 0.456 % min -1 ).For steeper gradients, the selectivity factors will decrease.Similar to the S parameter, the retention factor at the initial composition (k init ) of the eluent has no significant difference between any given FLP and its respective impurity (see Fig. 2b and Table S2).However, there are significant differences in k init between the FLPs with different degrees of PS-modifications.The difference between MALAT and MALAT-50PS is 1003 % while between MALAT and MALAT-PS is 2370 %.Incorporation of a sulfur atom into the phosphodiester backbone will increase the hydrophobicity of the oligonucleotide, and as a consequence, drastically increase the value of k init .From this calculation also follows that the initial composition of the eluent is more sensitive parameter than S parameter influencing the retention time and must therefore be carefully kept constant to achieve consistent separation.

Prediction of overloaded concentration profiles: one-component Langmuir case
Modeling the overloaded concentration profiles in chromatography entails applying a dynamic column simulation model together with an appropriate description of the adsorption/desorption process.The chromatographic process involving ON elution in the presence of IPR is complex [30], and deriving a model that considers all the underlying mechanisms under IPC is challenging [15].Moreover, the use of such a complex model in process optimization is time consuming.On the other hand, despite faster, a simple model can result in inadequate prediction of elution bands.With that in mind, in this and next section of this manuscript, overloaded concentration profiles will be derived using relatively simplistic adsorption isotherm models.In Section 4.4, a more advanced adsorption isotherm model that considers IPR adsorption will be used to predict overloaded concentration profiles.Positive and negative aspects of these different models will be discussed.
For the one-component case, the three FLP-containing ONs were used as model compounds (see Table 1).Fig. 3 shows, as dotted lines, the overloaded elution profiles from a 25-µL injection containing 5 g L -1 of MALAT, MALAT-50PS, and MALAT-PS using the same five different gradient slopes employed in the analytical investigation (i.e., 0.456 % min − 1 , 0.57 % min − 1 , 0.76 % min − 1 , 1.14 % min − 1 , and 2.28 % min − 1 ).The theoretical overloaded elution bands are shown as solid lines.The validation of the model for the other injection volumes (i.e., 5 µL, 10 µL, and 17 µL) is presented in Fig. S2.The adsorption isotherm model parameters for predicting the elution bands were determined directly from the experimental data obtained in gradient mode, because it is impractical to elute ONs in isocratic mode.The isotherm model parameters determined for MALAT, MALAT-50PS, and MALAT-PS are summarized in Tables S3a-c.
The results from Figs. 3 and S2 illustrate the one-component model is suitable to predict overloaded elution bands from the steepest gradient (2.28 % min -1 ), where sharp elution bands are obtained, to the shallowest gradient (0.456 % min -1 ), with low and wide elution bands, as well as for different injection volumes (i.e., 5 µL, 10 µL, 17 µL, and 25 µL).They also show the ability of the model to predict elution profiles is inversely proportional to the degree of PS modifications on the ONs.This is due to the distortion of the elution profile as a consequence of the decrease in resolution between FLP and closely-eluting impurities for ONs with increasing degree of PS modifications.By inspecting the elution profiles, one can see that the retention and sensitivity of the gradient slope increases as the number of PS modifications on the ONs is increased.These results illustrate the ED column model together with the proposed Langmuir isotherm equation modified to the gradient mode is a suitable choice for predicting the elution band of modified and unmodified ONs, as well as of ONs with varying degrees of PS modifications.It should be noted here that crude ONs (i.e., not purified after synthesis) were used for predicting the overloaded elution bands.Therefore, it is not a surprise that additional elution zones expected in these samples (see Fig. S1a-c and Table S2a-c) can negatively impact the elution bands predicted by the one-component model.Nevertheless, these results highlight that a simplistic model is able to adequately predict the overloaded elution band of unmodified and modified ONs over quite a wide range of gradient slopes, even for crude synthetic compounds.
Let us now consider some issues regarding the modeling of the elution bands in the studied case.The ED column model requires the apparent dispersion coefficient to be determined.Differently from linear chromatography (i.e., analytical chromatography with very low loads), this parameter is not as important for predicting overloaded concentration profiles.This is because the shape of the elution band is mainly controlled by the thermodynamics of the process in preparative mode [18].Changing the apparent dispersion coefficient by ±30 % does not usually cause any significant change in the calculated elution bands.However, this parameter should be also determined using appropriate method, especially when the load of the column is not high and then the apparent dispersion coefficient can take higher influence on the calculated elution bands.Here, the apparent dispersion coefficients were 2.28 % min -1 , 1.14 % min -1 , 0.76 % min -1 , 0.57 % min -1 , and 0.456 % min -1 (from left to right).(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)estimated based on the analytical peaks (i.e., at very low loads) obtained for the shallowest gradient (i.e., 0.456 % min − 1 ) by adjusting the heights and widths of the calculated analytical peaks according to the corresponding experimental data.The estimated apparent dispersion coefficients were verified against steeper gradients.A comparison of calculated and experimental analytical peaks is presented in Fig. S3.Generally, the predicted heights and widths of the analytical peaks were in satisfactory agreement.The highest relative error is 3.8 % for the retention time calculations while for the calculations of the peak width at half height, the relative errors are in most cases less than 10 %.Despite not tested here, it is possible relative errors can be decreased by incorporating a gradient slope dependency into the apparent dispersion coefficient.For the MALAT ONs, the steeper the gradient slope, the more significant the disagreement between calculated and experimental peak width.The relative error of the calculation of the peak width reaches 43.4 % for the steepest gradient slope.The discrepancy can be caused by the dynamic changes in the column due to the faster changes of the MeCN concentration in the mobile phase.The apparent dispersion coefficients were determined to be 0.016 cm 2 min -1 , 0.085 cm 2 min -1 , and 0.085 cm 2 min -1 for MALAT, MALAT-50PS, and MALAT-PS, respectively.The MALAT-50PS and MALAT-PS ONs presented lower column efficiency.This is likely a consequence of peak broadening due to the existence of diastereomers (see Table 1) with almost the same retention time as the analyte under consideration [7,31].
In order to further confirm the above result, the apparent dispersion coefficients were also calculated using an expression derived from the comparison of the General Rate (GR) and Equilibrium-Dispersive (ED) models [32].The average values of these parameters for 21.2 to 35.0 % (v/v%) MeCN in the eluent (the concentration range of MeCN over which the ONs were eluted) are 0.073, 0.088, and 0.094 cm 2 min -1 for MALAT, MALAT-50PS, and MALAT-PS, respectively.The difference between unmodified and PS-modified ONs is much smaller than that following from the estimation based on the analytical peaks.This confirms that the lower efficiency for MALAT-50PS and MALAT-PS in comparison with MALAT is caused by the existence of diastereomers.
A simple model, such as the one-component case, uses the same S parameter for a and K.As a consequence, it does not correctly describe overloaded concentration profiles.Fig. 4 presents the calculated (solid lines) and experimental overloaded (dotted lines) elution profiles of MALAT-PS for an injection volume of 25 μL and gradient times 10 and 50 min.As it can be seen, especially for the shallowest gradient, the calculated elution band has a rounded diffusion part, which disagrees from the corresponding experimental band; this disagreement increases with increasing injection volumes.For the lowest injection volumes (i.e., 5 µL and 10 μL; Fig. S2), the problem of the rounded diffusion part of the simulated elution band is not present and the model gives almost the same agreement as in the final version of the isotherm model presented in Fig. S2.As it can also be seen, extrapolating the calculated injection volumes to 50 µL and 100 μL results in even more rounded elution bands.However, the experimental elution band at these column loads is still triangular (see Fig. S4).The explanation to this observation can be found by closely considering the values of the isotherm model parameters and the isotherm model itself.The same S parameter in the Langmuir isotherm model denotes that the saturation capacity remains constant during gradient elution and that only the equilibrium constant decreases with increasing MeCN content in the eluent.Using just one S parameter, the monolayer saturation capacity for MALAT-PS equals 9.395 g L -1 , while the equilibrium constant decreases from 273.208 L g -1 to 0.739 L g -1 when increasing the MeCN content in the eluent from 21.2 % to 35.0 % (v/v%).The saturation capacity is rather low.This will result, at high solute loads, in the stationary phase being saturated and elution profiles containing a plateau, as observed in the simulations (see Fig. 4).Using an isotherm with different S parameters, the equilibrium constant decreases from 40.286 L g -1 to 1.071 L g -1 , while the saturation capacity decreases from 63.716 g L -1 to 6.487 g L -1 when increasing the MeCN content in the eluent from 21.2 % to 35.0 % (v/v%).In this case, the initial monolayer saturation capacity is much higher than in the case of the isotherm model with the same S parameter, and using this equation, the triangular overloaded concentration profiles can be calculated even for much larger injection volumes.The change in the saturation capacity during gradient elution can be attributed to the mechanisms driving IPC.With increasing MeCN content in the eluent, the IPR adsorption will decrease, resulting in a decreasing surface charge density on the stationary phase surface [33].Thus, the saturation capacity of the model used for simulation must also decrease with increasing MeCN content in the eluent to properly predict the overloaded elution bands.

Prediction of overloaded concentration profiles: two-component competitive Langmuir case
The two-component system studied here consisted of the three FLP ONs and their respective n -1 impurity (i.e., FLP-dT, see Table 1), with Fig. 4. Comparison between experimental overloaded concentration profiles (dotted lines) and calculated overloaded concentration profiles obtained using the Langmuir isotherm model (solid lines; see Eq. ( 5)) with constant saturation capacity for gradient slopes of a) 2.28 % min − 1 and b) 0.456 % min − 1 .Sample injected: 25 μL of 5 g L -1 MALAT-PS.The dashed and dashed-dotted lines show the calculated elution profiles for the 50 μL and 100 μL injection volumes, respectively.(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) M. Leśko et al. the concentration of the ONs being 5 g L -1 and 0.5 g L -1 for FLP and n -1, respectively.Again, crude compounds were used here, so additional chromatographic peaks are expected.Fig. 5 presents the elution profiles (solid lines) of the two-component case resulting from a 30-µL injection of MALAT, MALAT-50PS, and MALAT-PS.
The method used to estimate the isotherm model parameters was the same as for the one-component case.The parameters of Eq. ( 6) were estimated in two steps using the experimental data for the gradient slopes 2.28 % min − 1 , 0.76 % min − 1 and 0.456 % min − 1 .Because there was a shift in the elution times of the FLP bands between the one-and two-component cases (data not shown), the parameters a 0 and S a for FLP were re-estimated this time using data obtained for the two-component case.The isotherm model parameters of the two-component case are presented in Table S4a-c; the predictions of the concentration bands for other injection volumes (i.e., 5 µL, 10 µL, 15 µL and 20 µL) are presented in Fig. S5.
As observed in Fig. 5, an overall suitable agreement between simulated and experimental elution bands was achieved for the twocomponent case.It can also be noticed that the suitability of the agreement is inversely proportional to gradient steepness; the steeper the gradient, the more significant the difference between simulated and experimental elution bands.These results illustrate the model can predict very sharp and narrow elution profiles for the steepest gradient (i.e., 2.28 % min − 1 ) and low and wide elution bands for the shallowest gradient (i.e., 0.456 % min − 1 ), independently of column load and degree of PS modifications in the ONs.In our opinion, this confirms that a relatively simple mathematical description of the complex chromatographic process taking place in IPC is enough to predict overloaded elution bands for general process optimization.
As already mentioned above, the overloaded elution bands modeled here and in Section 4.2 were carried out using crude ONs.Nonetheless, theoretical chromatographic systems consisting of either just one or two components were used as model systems.The detailed characterization of the three FLP-based ONs revealed a significant number of impurities present in these samples (Fig. S1a-c; Tables S1a-c).Most of these impurities are less retained than the FLP and occur at significantly lower quantities, thus not interfering much with the main overloaded elution bands.However, the closely related impurities, such as n -1, n -2 and n -3, are eluted close to the main peak (see Fig. 3 and S2).The injection of large volumes of sample causes them to be co-eluted with the main component or to be displaced by the main component (Fig. 3).As a result, we observe the front of the triangular peaks (concentration bands) are often not as sharp as in classical Langmuirian profiles.Moreover, MALAT-50PS and MALAT-PS ONs contain the (P = O) 1 impurity, which propagates along the column slightly faster than its related FLP (Fig. 1).Therefore, the front of the triangular elution bands of MALAT-PS and MALAT-50PS ONs is more deformed than the one of MALAT ON (see Figs. 4 and 5).For simplicity purposes, the overloaded profiles calculated here do not account for these additional impurities.Therefore, the lower the purity of the ONs, the lower the ability of the model to correctly predict elution profile of the main component (see Figs. S2 and S5).To more accurately model these complex systems, one would have to determine the concentration profiles of all impurities as well as the FLP.Such an optimization goes against the purpose of having a simple model suitable for process optimization and was therefore not attempted here.
It is important to ensure repeatable separations.By inspecting Figs.S2 and S5 in the Supplementary materials, we found very good repeatability of the experiments, which is essential for correct modeling and predictions.This is evident in the diffusive parts of the elution profiles, where the profiles for all considered injection volumes overlaps.

Prediction of overloaded concentration profiles: multilayer twocomponent competitive case
Because of the limitations discussed above, the simple one-and twocomponent-based models are generally insufficient for more fundamental studies.Therefore, in Section 2 a thermodynamically consistent two-layer two-component isotherm model is derived.The main difference from the simple two-component case is that here the adsorption of the lipophilic IPR on the stationary phase is also accounted for.This adsorption of the IPR results in a positively charged layer that attracts the negatively charged ONs.The common type I isotherm equation in the Langmuir model was employed to account for the IPR adsorption on the C18 stationary phase [18].To also account for the gradient elution, the Langmuir isotherm model was modified to the gradient mode (see Eq. ( 7)).Here, gradient mode was selected because MeCN can wash IPR away from the stationary phase, thus impacting the electrostatic-based adsorption of ONs during gradient-based elution.Therefore, the model was made more representative of the mechanisms driving IPC.Assuming the gradient program used in our study, the calculated concentration of IPR on the stationary phase decreases from 29.54 g L − 1 (at 21.2 % MeCN) to 0.006 g L − 1 (at 44.0 % MeCN).
Fig. 6 presents the results of predicting the elution bands for injection volume of 30 µL of MALAT, MALAT-50PS, and MALAT-PS using the twolayer two-component model.The model validation at lower injection Fig. 5. Comparison between experimental overloaded profiles (dotted lines) and calculated overloaded concentration profiles obtained using the competitive Langmuir isotherm model (solid lines; see, Eq. ( 6)) of the twocomponent system for a) MALAT, b) MALAT-50PS, and c) MALAT-PS.Sample injected: 30 μL of 5 g L -1 and 0.5 g L -1 FLP and n -1, respectively; the gradient slopes are: 2.28 % min -1 , 1.14 % min -1 , 0.76 % min -1 , 0.57 % min -1 and 0.456 % min -1 (from left to right).(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)volumes (i.e., 5 µL, 10 µL, 15 µL and 20 µL) is presented in Fig. S6.Overall, the agreement between calculated and experimental elution bands is satisfactory, albeit less satisfactory than the Langmuir isotherm model presented for the simple two-component case above (compare Figs. 5 and S5 to Figs. 6 and S6).As in the simple two-component Langmuir isotherm model, the best agreement is observed for the shallowest gradients (i.e., 0.456 % min − 1 , 0.57 % min − 1 , and 0.76 % min -1 ).We speculate that other factors might be also playing a role at steeper gradient slopes.One potential factor is the formation of significantly different equilibria states for IPR in the mobile and stationary phases at different gradient slopes.The equilibrium state is fully established when the concentration of the modifier is constant.The faster the changes in the concentration of the modifier, the further the system is from equilibrium.Thus, negatively impacting the agreement between experimental and predicted elution profiles for different gradient slopes, especially for steeper gradient slopes.Additional studies are needed to further evaluate our speculation.
The isotherm model parameters of the two-layer two-component model were estimated based on the same experimental data used in the two-component Langmuir isotherm model.They are presented in Tables S5a-c.It should be noted that there are six model parameters to be determined in this two-layer two-component model.This is a rather large number, thus challenging to be estimated.For instance, in the example shown here, all parameters had to be independently adjusted to identify a suitable starting point for the inverse solver.This is time consuming, but crucial for the performance of the deterministic algorithm used in this study.This example illustrates that the two-layer twocomponent model is significantly more complex than the simple cases discussed above, thus difficult to be applied in process optimization.However, the advantage of this model is that it is based on the mechanisms driving IPC and on the fundamentals of chromatography -therefore, making it a thermodynamically consistent model.Despite not tested here, this model has the potential to be used in fundamental investigations aiming to better understand the ON retention mechanisms in IPC.It should be noted that the model contains the description of the IPR adsorption in explicit form, and that the model parameters of this description are estimated indirectly based on the overloaded concentration bands of the solutes.
It is also important to note that the isotherm model parameters for IPR were estimated based on the elution profiles obtained for MALAT-PS.These parameters were kept constant for the other two ONs and only the isotherm parameters for the solutes were estimated.We arbitrarily assumed the adsorption of the IPR is independent of the ON and introduced the α parameter.This parameter makes the model more generic with reasonable prediction capabilities.To further improve the predictivity of the model, additional detailed studies regarding IPR adsorption on the C18 column are needed.We understand that determining the IPR adsorption on stationary phases is laborious and difficult.However, outcomes of the two-layer two-component model presented here are promising and advocate for future development.

Conclusion
IPC is an important and widely used chromatographic technique for the analysis and purification of ONs and other ionic compounds.In this study, we focused on modeling the overloaded concentration profiles of ONs eluted in IPC, more specifically, in the gradient mode.Simple and more advanced models were proposed and evaluated.
We have shown the ED model together with Langmuir's onecomponent description of the adsorption/desorption process and the two-component competitive Langmuir isotherm model could predict the overloaded elution profile of ONs in gradient-mode IPC.Despite not including a complex description of the chromatographic phenomenon, known to occur in IPC, the mathematical models were able to predict the overloaded concentration bands of three unmodified and PS-modified ONs.
To correctly predict the elution profiles, the isotherm models needed various solvent strength (i.e., S) parameters introduced in the numerator and denominator of the equation.Such a change caused both the equilibrium constant and saturation capacity to be functions of the amount of organic modifier in the eluent.In practice, this significantly increased the ratio of the value of the saturation capacity to the equilibrium constant, enabling the elution bands to be correctly calculated at the highest injection volumes (i.e., 17 µL, 20 µL, 25 µL and 30 µL).If the isotherm with different S parameters and MALAT-PS as solute are applied the equilibrium constant is decreasing from 40.286 L g − 1 to 1.071 L g − 1 while the saturation capacity is decreasing from 63.716 g L − 1 to 6.487 g L − 1 with increasing acetonitrile content in the eluent from 21.2 % to 35.0 % (v/v%).In contrast, the saturation capacity determined for the isotherm with the same S parameters equaled 9.395 g L -1 , while the equilibrium constant decreased from 273.208 L g -1 to 0.739 L g -1 when increasing the MeCN content in the eluent from 21.2 % to 35.0 %.Such a relatively low saturation capacity makes it impossible to correctly predict the shape of the overloaded concentration profiles for high column loads.
As an alternative to the two-component competitive Langmuir adsorption model, a more fundamental description of the adsorption/ desorption process was proposed with reliable results.This model is based on the assumption that IPR adsorbs on the stationary phase and thus dynamically creates the first layer.This layer serves as the charged surface on which the ON molecules can adsorb.In gradient elution it was assumed that the organic modifier affects only the strength of the IPR adsorption.The agreement between the calculated and experimental elution bands was adequate in most predicted profiles using the twolayer model.However, the two-layer-based predicted elution profiles are less reliable than those predicted using the two-component competitive Langmuir isotherm model.As in the case of simple description, a more significant disagreement was observed for the steepest gradient; in the two-layer model, this disagreement was even more significant.This suggests the existence of additional factors not accounted for in both models and that exerts more influence at this gradient.Another significant factor contributing to this discrepancy is the fact that the organic modifier changes from 21.2 % to 44.0 % within 10 min.This can cause dynamic changes in the chromatography column and finally result in difficulties predicting the concentration profile.Models for predicting the two-component system have eight and six unknown parameters for the two-component competitive Langmuir and two-layer models, respectively.Despite the two-layer model having fewer parameters to be determined, the estimation process is incomparably more difficult in this case, because one cannot conduct the estimation in two steps and reduce the number of the estimated parameters.In other words, all parameters must be adjusted in such a way as to provide a reliable starting point for the inverse solver.It requires more work and considerable experience to obtain a satisfactory result.Moreover, it should be noted that to calculate the overloaded elution bands, an additional equation must be included in the ED model for IPR, decreasing the calculation speed.Therefore, when the main aim is to predict the overloaded elution bands and the optimization process, we recommend using the ED model connected with the Langmuir isotherm model modified to gradient mode.The advantage of the two-layer model is its foundation in the mechanisms considered in IPC, which permits more fundamental investigation of the chromatographic process.
The mathematical modeling in this study was conducted based on the elution profiles obtained for crude ONs.The synthetic impurities influenced the shape of the overloaded elution profiles, especially for MALAT-50PS and MALAT-PS where (P = 0) 1 impurities propagate slightly faster than the elution profile of the FLP.Thus, the front of the triangular elution bands is more deformed.The mathematical models used in this study do not include other impurities apart from FLP-dT.Therefore, they give the classical sharp triangular elution bands followed from the applied isotherm type I.In other words, the ability to correctly predict the elution bands is decreasing with increasing the contamination of the ONs caused for example by the incomplete sulfurization process.This makes it more difficult to derive the correct mathematical model and influences its predictivity in a negative fashion.
In summary, we have demonstrated that the mathematical models derived here resulted in reliable prediction of overloaded elution bands of ONs.More importantly, our results indicate that ED model together with gradient-adapted Langmuir's isotherm model has the potential to be the simple and efficient predictive model needed for industrial oligonucleotide optimization processes.Moreover, our results suggest that factors such as IPR equilibria states in the mobile and stationary phases at different gradient slopes may also play a role in IPC of oligonucleotides.Further evaluation of these equilibria states and other factors might improve predictivity of simple and, especially, more advanced models.Despite only tested here on MALAT-derived oligonucleotides (without and with varying degrees of PS modifications), we believe the models presented here are general and likely applicable to other shorter and longer ONs that present elution profiles with Langmuirian behavior.

Fig. 1 .Fig. 2 .
Fig. 1.Chromatograms generated after 3-μL injections of samples containing 0.1 g L -1 of a) MALAT, b) MALAT-50PS, and c) MALAT-PS.The gradient slope is 0.456 % min -1 .The inset plots in the top left show the FLP and the impurities FLP-dT and (P = O) 1 .(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 1
1, the analytical investigation is discussed.Sections 4.2 and 4.3 present the simple adsorption isotherm model for the one-component and two-component cases, respectively.Finally, in Section 4.4, a new two-layer adsorption Summary of the synthetic FLP, n -1, and (P = O) 1 ONs used in this study, including their sequences, number of PS linkages, and number of diastereomers.* indicates the location of PS linkages within the ON sequences.