A framework for model reliability and estimability analysis of crystallization processes with multi-impurity multi-dimensional population balance models

The development of reliable mathematical models for crystallization processes may be very challenging due to the complexity of the underlying phenomena, the inherent Population Balance Models (PBMs) and the large number of parameters that need to be identified from experimental data. Due to the poor information content of the experiments, the structure of the model itself and correlation between model parameters, the mathematical model may contain more parameters than can be accurately and reliably identified from the available experimental data. A novel framework for parameter estimability for guaranteed optimal model reliability is proposed, then validated on a complex crystallization process. The latter is described by a differential-algebraic system which involves a multi-dimensional population balance model that accounts for the combined effects of different crystal growth modifiers/impurities on the crystal size and shape distribution of needle-like crystals. Two estimability methods were combined: the first is based on a sequential orthogonalization of the local sensitivity matrix and the second is Sobol, a variance-based global sensitivity technique. The framework provides a systematic way to assess the quality of two nominal sets of parameters: one obtained from prior knowledge and the second obtained by simultaneous identification using global optimization. A cut-off value was identified from an incremental least-squares optimization procedure for both estimability methods, providing the required optimal subset of model parameters. The implemented methodology showed that, although noisy aspect ratio data were used, the 8 most influential and least correlated parameters could be reliably identified out of twenty-three, leading to a crystallization model with enhanced prediction capability.


Introduction
Crystallization is an important separation process, extensively used in most chemical industries, either as a method of production or as a method of purification or recovery of solids. Many substances of scientific, technological, and commercial importance are crystalline, ranging from large-tonnage commodity materials to high-value specialty chemicals, such as active pharmaceutical ingredients (APIs). The pharmaceutical industry relies heavily on crystallization, as 70% of all pharmaceutical formulations and 90% of APIs involve at least one crystallization step during the manufacturing process ( Pena et al., 2015; Alvarez and Myerson, 2010 ). Besides, crystallization is one of the key steps in the production chain, and reliable crystallization models underpin model-based activities such as process design, control, real-time optimization and Quality-by-Design ( Su et al., 2015; Nagy, 2009; Mascia et al., 2013; Lakerveld et al., 2013; Aamir et al., 2009; Benyahia et al., 2012; Ramin et al., 2018; Benyahia, 2018 ). A prerequisite to apply model-based control strategies is the availability of a predictive mathematical model.
The most fundamental approach for modelling particulate processes, such as crystallization, is the population balance model (PBM) framework coupled with kinetic expressions, mass and energy balances, which yields a set of nonlinear integro-partial differential equations. This set provides a rigorous approach to model the dynamic evolution of the dispersed phase system's properties, such as the crystal size and shape distribution (CSSD) ( Majumder et al., 2012; Sato et al., 2008; Borsos and Lakatos, 2014; Kumar et al., 2008 ). Although the PBM framework is based on first principles, a general theoretical mathematical expression for the determination of the crystallization kinetics does not exist and hence, empirical or semi-empirical expressions (e.g. power law) are used, which in most cases account for supersaturation as the key variable ( Rawlings et al., 1993; Cao et al., 2012 ). Although the benefits of mathematical models are widely accepted, setting a unified rigorous framework for building reliable and predictive models is still an open subject, particularly for pharmaceutical processes. In order to obtain accurate model predictions, identification of the unknown model parameters is often required. However, in many cases, first-principles models comprise a large number of parameters which often cannot be estimated reliably from the available experimental data. In addition, the quality and the information content of the available experimental data can be affected by many factors such as noisy measurements, limited number of data points, poor design of experiments (DoE) and limited range of operating conditions ( Chu and Hahn, 2011 ).
Furthermore, a weak influence of a parameter on one or more of the measured responses and/or a high correlation between the effects of different parameters on the model predictions can also lead to unreliable and inaccurate identification of the unknown parameter values, which in turn degrades the prediction capability of the mathematical model ( Kravaris et al., 2013; Benyahia et al., 2013; Eghtesadi and McAuley, 2014 ). Of course, mismatch could also arise from the model structure itself, since several assumptions are commonly made in order to simplify the numerical representation of the system and reduce its complexity, with the risk of neglecting some of the key underlying phenomena and consequently reducing the prediction capabilities of the model.
Several approaches have been developed to cope with some of these problems, such as modifying the model structure, incorporating additional measured outputs (e.g. using different PAT) and improving the information content of the experimental data by utilizing DoE approaches. However, before deciding whether the mathematical equations should be modified, or supplementary experiments should be designed and performed, one key step is to investigate whether the available experimental data contain enough information to uniquely and reliably identify all the model parameters, or alternatively, the subset of the model parameters that could be identified reliably and lead to the most predictive mathematical model. This could be achieved by evaluating the structural identifiability and estimability (i.e. practical identifiability) of the model parameters ( McLean and McAuley, 2011; Sin et al., 2010 ). The structural identifiability approach evaluates whether the parameters are locally or globally identifiable based solely on the model structure, while the estimability appraises whether the parameters can be identified uniquely by using the available experimental data or data from a proposed set of experiments ( McLean and McAuley, 2011; Walter and Pronzato, 1997 ). The estimability or practical identifiability methodology depends on the domain of variability of the model parameters and experimental conditions, whereas the structural identifiability is totally independent from both. The objective of the estimability analysis is to identify how many of the model parameters can be estimated accurately from the available data, while the ones with low estimability potential can be set to certain nominal values without degrading the prediction capability of the model ( Chu and Hahn, 2011 ). Consequently, the estimability potential can be defined as a measure of the effects of parameters on the experimental outputs and/or correlation among the model parameters.
In this work, only the estimability of the model parameters is evaluated.
Different approaches have been developed and proposed to help identify the most appropriate subset of parameters for estimation based on the estimability approach. Degenring et al. (2004) proposed a method for parameter selection based on principal component analysis (PCA), which is a statistical procedure that converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables. A parameter selection was obtained by using different PCA methods ( Jolliffe, 1972 ), which provided different parameter ranking outcomes. The PCA-based approach was applied in more recent investigations ( Schittkowski, 2007; Quaiser and Mönnigmann, 2009 ) and was proven to be less robust compared to the orthogonalization and the eigenvalue methods discussed below. The eigenvalue method, introduced by Vajda et al. (1989) and improved independently by other researchers ( Schittkowski, 2007; Quaiser and Mönnigmann, 2009 ), determines the most estimable subset of parameters based on the eigenvectors and eigenvalues of the Fisher information matrix (FIM). Although the method has shown better accuracy compared to other methods, it sometimes becomes challenging to match eigenvalues with specific parameters ( McLean and McAuley, 2011 ). The singular value decomposition ( Velez-Reyes and Verghese, 1995 ) and the correlation and collinearity methods ( Brun et al., 2002; Sin et al., 2010 ) were also proposed for the estimability analysis. The main drawback of the correlation and collinearity techniques is the fact that only the directions of the sensitivity vectors are considered, without taking into account the magnitude of the sensitivities ( McLean and McAuley, 2011 ). This limitation led Brun and coauthors (2002) to propose a robust approach that combined a method based on a scalar measure of the FIM and the collinearity method.
A more robust approach for performing the estimability analysis is based on the orthogonalization of the sensitivity matrix, which was initially introduced by Yao et al. (2003) and later improved by Lund and Foss (2008) and Thomson et al. (2009) . The method ranks the parameters according to both their individual effect on the measured responses and the correlation between the parameters. Due to the efficiency of this forward-selection method, it has been employed widely in complex chemical and biochemical systems ( Surisetty et al., 2010; Thomson et al., 2009; Jayasankar et al., 2009; Onyemelukwe et al., 2018 ).
Despite the popularity of the estimability analysis in numerous scientific areas, such as polymer science, environmental engineering and biology, this class of methods is still novel in the area of crystallization and its inherent benefits are not well understood, as only a very limited number of studies have been reported in the literature. Chen et al. (2004) presented a model discrimination approach for model-based design using the D-optimal criterion for the parameter set selection. However, only four parameters were considered, making the benefits of the method unclear. Some of the benefits of the parameter selection methods were discussed by Czapla et al. (2009) , who used an approach proposed by Brun et al. (2002) to select the most sensitive model parameters of a preferential batch crystallization of enantiomers. However, both studies utilized an arbitrary cut-off value for the parameter selection. A more comprehensive study was presented by Samad et al. (2013a,b) , where two global sensitivity analysis techniques, Morris screening and the standardized regression coefficients, were utilized to identify the most significant parameters. Although these techniques may be useful for the classification of the parameters in terms of sensitivity, the correlation of the parameters was not considered during the ranking procedure but estimated afterwards.
Considering all the challenges inherent to parameter selection and identification discussed above and with the scope of improving the current methodology for parameter identification for crystallization processes, a new framework ( Fig. 1 -see next section) is proposed for a systematic and optimal selection of the parameter subset with the highest estimability potential for guaranteed model reliability. As a case study, a batch cooling crystallization process is considered under the presence of multiple impurities, more specifically crystal growth modifiers (CGM), which can affect, besides product purity, the growth and potentially the nucleation kinetics and hence the size and shape distribution of the final crystals. A novel morphological multi-dimensional population balance model that incorporates mechanisms for multisite competitive adsorption of the impurities on the crystal faces, coupled with mass balance equations is used ( Borsos et al., 2016 ).
To the best of our knowledge, it is the first time that the modified Gram-Schmidt orthogonalization algorithm and Sobol analysis are combined and applied in the area of crystallization, and equally the first time that the estimability analysis in general is being applied to assess the model reliability of a PBM that takes into consideration the presence of impurities. The complexity of the case study provides an opportunity to show the capabilities of the methodology with the scope of building more reliable and high-fidelity models for the pharmaceutical industry for process design, optimization and advanced control that would enhance the implementation of model-based Quality-by-Design (QbD).

Method
The proposed methodology ( Fig. 1 ) combines a sequential orthogonalization method, which takes into account the overall magnitude of the local sensitivities and the correlation between the parameters, with a variance-based global sensitivity ranking method (Sobol). In both cases, a rigorous approach is used to identify the cut-off values based on the minimization of the maximum likelihood criterion. To assess the consistency and quality of the methodology, the correlation coefficients are calculated and the parameter estimates are assessed against their confidence domains. The proposed methodology enables a more robust classification of the model parameters based on the estimability potential, and provides a key tool to analyze the information content of the experimental data and consequently the quality of the measurements and the employed sensors (PAT). As such, the method helps identify the parameters that could be estimated accurately from the available data and informs whether additional data are required to identify a specific model parameter (e.g. new experiments or additional data from another sensor).
The estimability approach aims at identifying the model parameters more reliably from the existing data and requires an initial nominal vector of model parameters, commonly obtained from prior knowledge of the process. In the case of lack of prior knowledge or uncertain model parameters (extremely poor estimates or broad confidence intervals), the estimability framework described in this paper provides a methodology to help identify the set of nominal parameters. Although a variety of different approaches have been applied to estimate the model parameters of complex and highly nonlinear chemical processes, nonlinear optimization algorithms have been widely adopted due to their accuracy and efficiency ( Rawlings et al., 1993 ). In this paper, a hybrid global optimization technique that combines a genetic algorithm and a local deterministic method (sequential quadratic programming) was used to identify the unknown parameters.
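The hybrid global-local strategy can be illustrated with a minimal, self-contained sketch (not the authors' implementation): a toy weighted least-squares fit of a two-parameter exponential decay, in which a small genetic algorithm supplies a global candidate that is then polished by a derivative-free compass search standing in for the SQP step. All model, bound and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "experiment": noiseless first-order decay with two unknowns.
t = np.linspace(0.0, 10.0, 25)
p_true = np.array([2.0, 0.5])
y_meas = p_true[0] * np.exp(-p_true[1] * t)

def wls(p):
    """Weighted least-squares objective (unit weights for simplicity)."""
    return float(np.sum((y_meas - p[0] * np.exp(-p[1] * t)) ** 2))

lb, ub = np.array([0.1, 0.01]), np.array([5.0, 2.0])

# Global stage: minimal genetic algorithm (truncation selection,
# blend crossover, Gaussian mutation, elitism via retained parents).
pop = rng.uniform(lb, ub, size=(60, 2))
for _ in range(80):
    fitness = np.array([wls(ind) for ind in pop])
    parents = pop[np.argsort(fitness)[:20]]
    pick = rng.integers(0, 20, size=(40, 2))
    alpha = rng.uniform(size=(40, 1))
    children = alpha * parents[pick[:, 0]] + (1 - alpha) * parents[pick[:, 1]]
    children += rng.normal(0.0, 0.02, size=children.shape) * (ub - lb)
    pop = np.clip(np.vstack([parents, children]), lb, ub)
best = min(pop, key=wls)

# Local stage: derivative-free compass search stands in for SQP polishing.
step = 0.1 * (ub - lb)
for _ in range(60):
    improved = True
    while improved:
        improved = False
        for j in range(2):
            for s in (+1.0, -1.0):
                trial = best.copy()
                trial[j] = np.clip(trial[j] + s * step[j], lb[j], ub[j])
                if wls(trial) < wls(best):
                    best, improved = trial, True
    step *= 0.5

print("estimate:", best, "objective:", wls(best))
```

The two-stage design mirrors the rationale in the text: the population-based search avoids poor local minima, while the local refinement recovers the accuracy that stochastic search alone lacks.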
To maximize the benefits of the methodology, the estimability approach was implemented in both cases: the case where the initial nominal vector of parameters exists from prior process knowledge and the case where all parameters of the nominal vector should be identified globally and simultaneously by minimizing the weighted least-squares error. Both estimability approaches, the sequential orthogonalization and Sobol (variance-based) methods, rank the model parameters by order of importance. The ultimate objective of the estimability approach is then to find the optimal subset of model parameters that guarantees maximum model reliability. As a consequence, an estimability threshold or cut-off value is required to identify the subset of parameters that should be subject to re-estimation, to maximize model accuracy, and the subset of parameters that should be kept at nominal values without degrading the prediction capability of the model. An optimal subset of parameters can be obtained by running a sequential parameter estimation procedure, identifying the top i parameters (where i = 1, 2, …) each time and calculating the corresponding objective function value. The cut-off value is obtained when the improvement in the objective function due to an additional parameter becomes insignificant. If the model prediction capability with the optimal parameter subset is unsatisfactory, the method suggests running additional experiments, redesigning the experiments (e.g. optimal experimental designs) and/or selecting additional or alternative PAT tools with the scope of increasing the information content of the data.
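The incremental cut-off procedure can be sketched as follows on a hypothetical linear toy model (not the MIAM): the top-i ranked parameters are re-estimated while the rest are frozen at their nominal values, and the cut-off is placed where the relative drop in the objective becomes insignificant. The ranking, the 1% threshold and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear model: parameters 3 and 4 have a negligible effect.
t = np.linspace(0.0, 1.0, 50)
X = np.column_stack([t, t**2, 1e-4 * np.sin(t), 1e-4 * np.cos(t)])
p_true = np.array([1.0, -0.5, 2.0, 1.0])
y = X @ p_true + rng.normal(0.0, 1e-3, t.size)

p_nominal = np.zeros(4)          # values used for the non-estimated parameters
ranking = [0, 1, 2, 3]           # assumed output of the ranking step

def sse_with_top(i):
    """Re-estimate the top-i ranked parameters; freeze the rest at nominal."""
    idx = ranking[:i]
    rest = [k for k in range(4) if k not in idx]
    resid = y - X[:, rest] @ p_nominal[rest]
    p_hat, *_ = np.linalg.lstsq(X[:, idx], resid, rcond=None)
    return float(np.sum((resid - X[:, idx] @ p_hat) ** 2))

sse = [sse_with_top(i) for i in range(1, 5)]

# Cut-off: first subset size whose next parameter improves the
# objective by less than 1% (an illustrative threshold).
cut = next(i + 1 for i in range(len(sse))
           if i == len(sse) - 1 or (sse[i] - sse[i + 1]) / sse[i] < 0.01)
print("SSE profile:", sse, "-> retain top", cut, "parameters")
```

In the real workflow, each `sse_with_top` call would be a full nonlinear estimation run against the DAE model rather than a linear least-squares solve.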

Process model
The Multi-Impurity Adsorption Model (MIAM) was developed by Borsos et al. (2016) as a novel mathematical model for crystallization processes considering multi-impurity adsorption mechanisms, with the purpose of process design, optimization and control. The model was built to predict the dynamic evolution of the size and shape distribution during crystallization under the presence of impurities. The complete set of differential-algebraic equations (DAEs) that represent the MIAM is given in Table 1. The effect of the crystal growth modifiers was monitored in real time by using an in-situ video imaging probe: Lasentec Particle Vision and Measurement V819 (PVM). Images were automatically acquired at a frequency of six images per second and analyzed by Lasentec's image and stat acquisition software, where blob analysis was utilized to monitor the aspect ratio. In more detail, the cooling crystallization of pure potassium dihydrogen phosphate (KDP) in deionized water was investigated under the presence of aluminum sulfate (Crystal Growth Modifier: CGM1) and sodium hexametaphosphate (CGM2), and aspect ratio (AR) measurements were obtained as experimental outputs. As presented by Borsos et al. (2016) , divalent and trivalent metal ions preferentially adsorb onto the {100} KDP crystal facet, hindering the crystal growth on that facet, while anionic growth modifiers prefer to adsorb onto the {101} KDP crystal facet, inhibiting the crystal growth along the corresponding length. Hence, CGM1 is likely to adsorb onto the {100} facet, leading to more needle-like shaped crystals, while CGM2 tends to adsorb onto the {101} facet, causing the opposite effect by generating crystals with a lower aspect ratio. Thus, the CGMs considered in this case have competing effects. Multidimensional population balance equations (PBEs) with two characteristic lengths x_l = { x_1 , x_2 } were considered to model the evolution of the crystal shape distribution ( Fig. 2 ). To integrate/solve the PBM, the concentrations of the solute and the impurities are also required, which are calculated by coupling the corresponding mass balances.
The overall model is summarized in Table 1 and it consists of a set of differential-algebraic equations (DAEs) combined with algebraic equations that describe the kinetics and thermodynamics.
The mathematical model requires 23 parameters and can be represented, for notational expediency, by the following general form of differential-algebraic equations (DAEs):

$\dot{x} = f(x, z, p, t), \qquad 0 = g(x, z, p, t)$

where x is the vector of the differential state variables, z is the vector of the algebraic state variables, and p is the vector of the parameters.
A multivariate nonlinear dynamic regression model can be considered for the mathematical description of the relation between the model predictions and the measured outputs:

$y_{ij} = \hat{y}_{ij}(p, t_{ij}) + \varepsilon_{ij}$

where $y_{ij}$ is the j th measurement of the i th experimental output, $\hat{y}_{ij}$ is the corresponding model prediction, $t_{ij}$ is the j th sampling time of the i th output and $\varepsilon_{ij}$ is the measurement error, assumed to be uncorrelated and Gaussian distributed with zero mean.

Estimability analysis
In the current study, the estimability analysis consists of three main steps. In the first step, the relative effect of each model parameter on the measured outputs is determined through local sensitivity analysis of the dynamic system. Sensitivity analysis is a fundamental tool that determines how variations of the outputs relate to variations of the input variables. The second step is to apply the orthogonalization algorithm with the scope of ranking the parameters in descending order, in terms of impact on the outputs and minimum correlation between the parameters. Finally, a parameter estimation procedure is performed incrementally and sequentially in order to identify the threshold (cut-off value) on the objective function, which in turn helps select the optimal, most estimable subset. These steps are thoroughly described and discussed below.

Ranking the model parameters -orthogonalization method
The development of an effective solution to the parameter selection problem requires the quantification of the influence of each parameter on the measured outputs. This approach indicates which parameters are the most important and most likely to affect the model predictions. The first step of the estimability analysis is the evaluation of the sensitivity coefficients, which can be calculated analytically or numerically. The numerical approach consists in applying a perturbation to the nominal values of the parameters according to the backward finite difference method as follows:

$\left. \frac{\partial y_i}{\partial p_j} \right|_{t_k} \approx \frac{y_i(p_j, t_k) - y_i(p_j - \Delta p_j, t_k)}{\Delta p_j}, \qquad j = 1, 2, \ldots, N_p$

where $N_p$ is the number of parameters. It should be mentioned that the relative perturbation applied to the nominal values of the parameters was equal to −2% (i.e. $\Delta p_j / p_j = -0.02$). As such, the local sensitivity can be calculated for each sampling or measurement time. As the model parameters and outputs have different units and numerical values that could span several orders of magnitude, a normalization of the local sensitivities is often applied with respect to the parameters' nominal values and the corresponding model outputs, in order to make a more reliable comparison between the inherent effects of the parameters. The normalized sensitivity coefficients are given by the following equation:

$Z_{ij}\big|_{t_k} = \frac{p_j}{\,y_i|_{t=t_k}} \left. \frac{\partial y_i}{\partial p_j} \right|_{t_k}$

where $p_j$ is the nominal value of the j th parameter, $y_i|_{t=t_k}$ is the model prediction of the i th output evaluated at sampling time $t_k$ using the nominal vector of parameters, and $j = 1, 2, \ldots, N_p$. After the sensitivity coefficients have been calculated, a sensitivity matrix Z is constructed by stacking the normalized coefficients over all outputs and sampling times. The sensitivity matrix has dimension $(N_y \times N_m) \times N_p$, where $N_y$ is the number of measured outputs, $N_m$ is the number of measurements or sampling times, and $N_p$ is the number of model parameters.
Hence, each column represents the sensitivity coefficients with respect to one particular parameter, while each row captures the sensitivities of a specific output to the whole set of parameters at a particular sampling time.
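The finite-difference construction of the normalized sensitivity matrix can be sketched in a few lines. Here a hypothetical two-output algebraic model stands in for the integrated DAE system; the model, parameter values and all names are illustrative.

```python
import numpy as np

def model(p, t):
    """Hypothetical two-output model standing in for the integrated MIAM DAEs."""
    return np.vstack([p[0] * np.exp(-p[1] * t),
                      p[0] / (1.0 + p[2] * t)])

t_k = np.linspace(0.1, 5.0, 20)        # sampling times
p_nom = np.array([2.0, 0.4, 1.5])      # nominal parameter vector
rel_step = 0.02                        # 2% backward perturbation, as in the text

y0 = model(p_nom, t_k)                 # shape (N_y, N_m)
N_y, N_m = y0.shape
N_p = p_nom.size

# Z[(i, k), j] = (p_j / y_i(t_k)) * dy_i/dp_j, by backward finite differences.
Z = np.empty((N_y * N_m, N_p))
for j in range(N_p):
    dp = rel_step * p_nom[j]
    p_pert = p_nom.copy()
    p_pert[j] -= dp                    # backward step
    dy = (y0 - model(p_pert, t_k)) / dp
    Z[:, j] = (p_nom[j] / y0 * dy).ravel()

print("sensitivity matrix shape:", Z.shape)   # (N_y * N_m) x N_p
```

Each column then corresponds to one parameter and each block of N_m rows to one output, matching the stacking described above.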
The orthogonalization method provides an efficient forward-selection technique that has been applied extensively for parameter ranking and selection. The technique is relatively simple to implement and, most importantly, it ranks the parameters more reliably, as both the magnitude of the effect of each model parameter on the outputs and the correlations between the effects of different parameters are considered simultaneously. This is paramount since both phenomena are critical for parameter selection and discrimination, and consequently for the prediction capabilities of the mathematical model. If a perturbation of a model parameter has a minor effect on the outputs, then the parameter cannot be identified accurately from the data. This can be mathematically determined by calculating the norm of the sensitivity vectors (the norm of the columns $Z_i$). Conversely, large magnitudes/norms indicate significant effects on the outputs. At the same time, if perturbations of two or more model parameters have similar trends/effects on the outputs, then the parameters are highly correlated. As a result, the impact of one parameter overlaps with the impact of the other, and hence these parameters cannot be reliably and uniquely identified from the data. It should be noted that the orthogonalization method selects sequentially the least correlated and most influential parameters. The correlation can also be evaluated using the FIM (e.g. linear dependency of the sensitivity vectors), as described later. In this work, a modified Gram-Schmidt orthogonalization algorithm ( Yao et al., 2003 ) is used to help rank sequentially the model parameters according to the magnitude of the sensitivities and the least correlation effect. The sequential orthogonalization algorithm is presented in Table 2.

Table 2 Orthogonalization algorithm for the estimability analysis. Notation: $Z_i$: sensitivity vector corresponding to the parameter $p_i$; $\lambda$: cut-off value; $r_i$: orthogonal projection (residual) of $Z_i$; $P_j$: set of estimable parameters; $X_j$: matrix of the selected parameter vectors at the j th stage. 1. Select the parameter with the highest effect: find the index k that maximizes $\| Z_k \|$ and set $X_1 = Z_k$. 2. Orthogonalization: compute the orthogonal projections $r_i$ of the remaining columns of Z onto the complement of the space spanned by $X_j$. 3. Select the next parameter with the highest effect (largest residual norm $\| r_i \|$), append its sensitivity vector to $X_j$, and repeat steps 2-3 until the largest residual norm falls below the cut-off value $\lambda$.

The first parameter is selected, then all vectors of the scaled sensitivity matrix are sequentially projected onto an orthogonal basis (the sensitivity vectors with the highest magnitude). The orthogonalization method makes it possible to rank the parameters according to their estimability potential. However, the development of a reliable and robust methodology for the selection of the optimal subset of parameters remains an open subject in the literature, since arbitrary cut-off values are applied in most cases. In this work, an optimization-based approach is utilized for the optimal parameter selection based on the maximum likelihood approach:

$\Phi(p) = \sum_{e=1}^{N_e} \sum_{i=1}^{N_y} \sum_{j=1}^{N_m} \frac{\left( y_{ij} - \hat{y}_{ij}(p, t_{ij}) \right)^2}{\sigma_{ij}^2}$

where $y_{ij}$ is the experimental measurement, $\hat{y}_{ij}$ the corresponding model prediction, $\sigma_{ij}^2$ the measurement variance, $N_y$ is the number of outputs and $N_e$ is the number of experiments.
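A minimal version of the sequential Gram-Schmidt ranking can be written as follows (illustrative code, not the authors' implementation). The toy matrix makes columns 0 and 2 nearly collinear so that the correlation handling is visible.

```python
import numpy as np

def rank_parameters(Z, cutoff=1e-8):
    """Sequential orthogonalization (after Yao et al., 2003): repeatedly pick
    the column with the largest residual norm, then project the remaining
    columns onto the orthogonal complement of the selected ones."""
    R = np.array(Z, dtype=float)
    remaining = list(range(R.shape[1]))
    ranked, magnitudes = [], []
    while remaining:
        norms = {j: np.linalg.norm(R[:, j]) for j in remaining}
        k = max(norms, key=norms.get)
        if norms[k] < cutoff:          # remaining parameters are inestimable
            break
        ranked.append(k)
        magnitudes.append(norms[k])
        q = R[:, k] / norms[k]
        remaining.remove(k)
        for j in remaining:            # deflate the component along q
            R[:, j] -= (q @ R[:, j]) * q
    return ranked, magnitudes

# Columns 0 and 2 nearly collinear; column 1 influential and independent.
Z = np.array([[1.0, 0.0, 1.01],
              [1.0, 3.0, 0.99],
              [1.0, 0.0, 1.00]])
ranked, magnitudes = rank_parameters(Z)
print(ranked, magnitudes)
```

Once one of the two correlated columns is selected, the other retains only a tiny residual norm and drops to the bottom of the ranking, which is exactly the behaviour that pushes correlated parameters below the cut-off.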
The maximum likelihood criterion was also used in the parameter estimation problem to identify the initial set of model parameters (i.e. nominal set).

Global Sensitivity Analysis (GSA)
Sensitivity analysis has been extensively applied as a technique for model simplification, model calibration and process understanding through computer-aided design ( Varma et al., 2005; Saltelli et al., 2004 ). Local sensitivity analysis (LSA) techniques are widely accepted by the research community due to their low computational cost. However, LSA techniques can only determine the sensitivity of each input separately, without taking into account the overall contributions of the input variables to the output predictions. In GSA methods, a simultaneous perturbation of all parameters (inputs) is performed within specific bounds, as opposed to LSA techniques where the inputs are varied one at a time. Hence, GSA approaches are capable of measuring not only the relative impact of each input variable, but also the interactions between them. The variance-based global sensitivity techniques, which are used here, depend on the calculation of the following ratio:

$\frac{V\big( E( y_{ij}(p,t) \mid p_i ) \big)}{V\big( y_{ij}(p,t) \big)}$

where $E( y_{ij}(p,t) \mid p_i )$ denotes the expectation of the output y conditional on a fixed value of the input $p_i$, and the variance is calculated over all possible values of the inputs. In our case, the inputs are the vector of the unknown parameters p. Many GSA techniques have been developed, with the most well established being the method of Sobol (2001) . The Sobol method decomposes the output function $\hat{y}(p_1, \ldots, p_k)$ into terms of increasing degrees of interactions between the model inputs as follows:

$\hat{y}(p_1, \ldots, p_k) = f_0 + \sum_i f_i(p_i) + \sum_{i<j} f_{ij}(p_i, p_j) + \cdots + f_{12 \ldots k}(p_1, \ldots, p_k)$

In general, there are infinite ways to decompose the function $\hat{y}(p_1, \ldots, p_{N_p})$. However, for independent factors, the decomposition based on orthogonal terms becomes unique ( Sobol, 2001 ) and the functions can be calculated through multidimensional integrals as follows:

$f_0 = \int_{K^k} \hat{y}(p)\, dp$

$f_i(p_i) = \int \hat{y}(p)\, dp_{\sim i} - f_0$

$f_{ij}(p_i, p_j) = \int \hat{y}(p)\, dp_{\sim ij} - f_i(p_i) - f_j(p_j) - f_0$

where $dp_{\sim i}$ and $dp_{\sim ij}$ denote integration over all variables except $p_i$, and except $p_i$ and $p_j$, respectively, and $K^k$ is the sampling space. Sobol's method employs Monte Carlo approximations to calculate the integrals described in Eqs. (9)-(11) ( Sobol, 2001; Saltelli et al., 2005 ).
Similarly, Eq. (8) can be re-written as a variance decomposition ( Eq. (12) ) and sensitivity indices ( Eq. (13) ):

$V(y) = \sum_i V_i + \sum_{i<j} V_{ij} + \cdots + V_{12 \ldots k}$

where $V_i$ is the contribution of the parameter $p_i$ to the total variance $V(y)$, while $V_{ij}$ is the contribution inherent to the interactions between the two parameters $p_i$ and $p_j$. Hence, these contributions (i.e. partial variances) can be used to calculate the first-order sensitivity index for the parameter $p_i$, which evaluates the main effect of $p_i$ on the output (i.e. the ratio of the partial variance of $p_i$ to the total variance):

$s_i = \frac{V_i}{V(y)}$

In a similar way, the second-order index $s_{ij}$ and the total-order sensitivity index $s_{Ti}$ can be determined from:

$s_{ij} = \frac{V_{ij}}{V(y)}, \qquad s_{Ti} = 1 - \frac{V_{\sim i}}{V(y)}$

where $V_{\sim i}$ is the variance contribution of all terms that do not involve $p_i$. The total sensitivity index $s_{Ti}$ determines the total contribution of the parameter $p_i$, considering both direct and indirect effects. Hence, the difference between $s_{Ti}$ and $s_i$ indicates the degree of interaction. More information regarding the method of Sobol and other variance-based techniques can be found in Saltelli et al. (2008) .
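The Monte Carlo estimation of the first-order and total indices can be sketched with the Saltelli/Jansen pick-freeze estimators. The implementation below is illustrative, tested on a hypothetical additive function whose exact indices are S1 = 0.2 and S2 = 0.8 with no interactions.

```python
import numpy as np

def sobol_indices(f, k, n=20000, seed=0):
    """First-order and total Sobol indices via the Saltelli/Jansen
    Monte Carlo estimators, for inputs uniform on [0, 1]^k."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, k))
    B = rng.uniform(size=(n, k))
    fA, fB = f(A), f(B)
    V = np.var(np.concatenate([fA, fB]))     # total output variance
    s1, sT = np.empty(k), np.empty(k)
    for i in range(k):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                  # A with column i taken from B
        fABi = f(ABi)
        s1[i] = np.mean(fB * (fABi - fA)) / V        # first-order index
        sT[i] = 0.5 * np.mean((fA - fABi) ** 2) / V  # total index (Jansen)
    return s1, sT

# Additive test function y = p1 + 2*p2: analytically S1 = 0.2, S2 = 0.8,
# and S_Ti = S_i because there are no interactions.
s1, sT = sobol_indices(lambda P: P[:, 0] + 2.0 * P[:, 1], k=2)
print("first-order:", s1, "total:", sT)
```

For the crystallization model, `f` would wrap the DAE integration and the uniform samples would be rescaled to the parameter bounds before evaluation.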

Results and discussion
Here, a rigorous selection procedure of the optimal subset of parameters, based on the estimability approach, is developed and applied to the MIAM. The method combines two estimability methods: the first associates local sensitivities with a sequential orthogonalization procedure and the second uses a variance-based global sensitivity selection. Only the optimal subset of parameters requires identification (the rest of the parameters can be set to their nominal values) with a guaranteed minimum model mismatch (i.e. high prediction capability).
It should be emphasized that the parameters of the novel Multi-Impurity PBM model were previously identified by Borsos and co-authors (2016) using a sequential identification methodology, which attempted to identify decoupled kinetic parameters while taking several parameters from the literature. For instance, the parameters associated with the primary nucleation and the crystal growth of the two different facets were obtained from the literature, while the kinetic parameters inherent to the two crystal growth modifiers (i.e. impurities) were estimated from on-line image analysis data. However, the addition of additives/impurities might affect the kinetics of nucleation and growth ( Epstein, 1982; Kubota, 2001 ), and hence the nominal parameter vector might not be reliable enough.
Commonly, the estimability approach requires a set of nominal parameter values which represent a reasonable initial guess, usually obtained from the literature or prior process knowledge. To guarantee a generic robust framework for parameter estimation and extend the discussions above, the estimability approach is developed for two scenarios: the case where the nominal parameter values are those obtained by Borsos and co-authors (2016) in conjunction with the literature, and the case where no prior knowledge of the model parameters is available. In the latter case, a simultaneous identification approach, based on a hybrid global optimization technique that combines a genetic algorithm and a local deterministic method, was used to identify the nominal parameter values presented in Table 3 . In both cases the estimability approach will help identify the best parameter estimates of the optimal subset that guarantees maximum prediction capability, given the available experimental data. It is worth mentioning that this forward estimability method can also be used for the optimal design of experiments for improved parameter estimation ( Benyahia, 2009; Benyahia et al., 2011 ).
One crucial step after the identification of the 23 unknown parameters is the evaluation of the uncertainty of the model estimates, since it provides information regarding the robustness and predictive capability of the model. One way of assessing these uncertainties is through confidence domains. In this study, a method based on the FIM is used to estimate the 95% confidence intervals of the model parameters. The mean values of the model parameters identified with the simultaneous approach and the corresponding confidence intervals are presented in Table 3 .
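A minimal sketch of the FIM-based confidence interval computation, assuming a known sensitivity matrix Z at the optimum and i.i.d. Gaussian measurement errors; the synthetic Z and parameter values below are illustrative, not taken from the paper:

```python
import numpy as np
from scipy import stats

def confidence_intervals(Z, sigma2, p_hat, alpha=0.05):
    """95% confidence bounds from the Fisher Information Matrix.

    Z      : (n_meas x n_par) sensitivity matrix dy/dp at the optimum
    sigma2 : measurement-error variance
    p_hat  : identified parameter vector
    """
    n_meas, n_par = Z.shape
    fim = Z.T @ Z / sigma2           # Fisher Information Matrix
    cov = np.linalg.inv(fim)         # approximate parameter covariance
    t_val = stats.t.ppf(1 - alpha / 2, n_meas - n_par)
    half_width = t_val * np.sqrt(np.diag(cov))
    return p_hat - half_width, p_hat + half_width

# Example with a synthetic, well-conditioned sensitivity matrix
Z = np.column_stack([np.linspace(1.0, 10.0, 10), np.ones(10)])
p_hat = np.array([1.0, 2.0])
lo, hi = confidence_intervals(Z, sigma2=0.01, p_hat=p_hat)
```

Broad intervals (large diagonal entries of the covariance) flag poorly identifiable parameters, which is exactly the diagnostic used in Table 3.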
Although the confidence intervals associated with some of the nominal parameter estimates are reasonably narrow, most of the model parameters, highlighted in red, present broad confidence intervals. Statistically speaking, this indicates that these parameters are unidentifiable and, consequently, that their estimates are not reliable. Hence, a good fit should be combined with the estimation of confidence regions and the corresponding correlation matrix, so that the reliability of the estimated unknown parameters can be assessed ( Table 3 ). In most cases, these broad confidence domains are caused by strong correlations amongst the parameters. The correlation effects will be thoroughly discussed in conjunction with the estimability analysis in the next sections.

Local estimability analysis: Local sensitivity analysis (LSA) and orthogonalization algorithm
Although estimability analysis essentially aims at improving the parameter estimates in order to enhance the model prediction capability, the initial parameter estimates used as nominal values can play a crucial role in the quality of the sensitivity analysis, both LSA and GSA, and may consequently determine the outcome of the estimability analysis. Poor nominal model parameters would potentially lead to inaccurate parameter ranking, which may degrade the predictive capability of the mathematical model ( Benyahia, 2009 ). The estimability analysis is carried out, as explained before, in three steps: a local sensitivity analysis is performed to evaluate the relative effect of the parameters on the process outputs; the model parameters are then ranked in descending order, in terms of sensitivity magnitude and correlation, using the orthogonalization algorithm ( Table 2 ); finally, an incremental optimization approach, consisting of a sequential identification of the top i parameters (where i = 1, 2, …), is used to determine the threshold or cut-off value and identify the optimal subset of model parameters.
In Fig. 3 , the variation of the dynamic sensitivities of some model parameters is presented. The first selected parameter ( Fig. 3 (a)), g 1 , which is the exponent of the growth kinetic equation in the x 1 dimension (i.e. along the length of the crystal), shows very high sensitivities at all times. This means that g 1 has a strong effect on the model predictions (outputs) and, consequently, its estimability potential may be very high depending on the concurrent correlation effects. The same holds for the second selected parameter ( Fig. 3 (b)), k e, CGM 2 , which describes the thermodynamic mass distribution coefficient for CGM2 (i.e. sodium hexametaphosphate). These two parameters are likely to be ranked high in terms of estimability potential, meaning that the information obtained from the measurements in the considered time window will be adequate for their accurate estimation. On the other hand, k p , 0 and k m , 0, CGM 1 show very weak sensitivities at all times. For instance, the sensitivities associated with k m , 0, CGM 1 always remain below 6 × 10 −7 , which indicates that these model parameters are likely to be practically unidentifiable or inestimable.
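The dynamic local sensitivities discussed above can be approximated by central finite differences around the nominal parameter vector. The sketch below uses a hypothetical stand-in model (a simple exponential) rather than the full PBM; the scaled sensitivity s_ij = (p_j / y_i) dy(t_i)/dp_j makes parameters of very different magnitudes comparable:

```python
import numpy as np

def dynamic_sensitivities(model, p_nom, t_grid, rel_step=1e-4):
    """Central finite-difference approximation of the scaled local
    sensitivities s_ij = (p_j / y_i) * dy(t_i)/dp_j around p_nom.

    model(p, t_grid) must return the predicted output (e.g. mean AR)
    at each time point -- here a hypothetical stand-in for the PBM.
    """
    p_nom = np.asarray(p_nom, dtype=float)
    y0 = model(p_nom, t_grid)
    S = np.zeros((len(t_grid), len(p_nom)))
    for j, pj in enumerate(p_nom):
        dp = rel_step * max(abs(pj), 1e-12)
        p_up, p_dn = p_nom.copy(), p_nom.copy()
        p_up[j] += dp
        p_dn[j] -= dp
        # scaled central difference at every sampling time
        S[:, j] = pj * (model(p_up, t_grid) - model(p_dn, t_grid)) / (2 * dp * y0)
    return S

# Illustrative model y = p0 * exp(p1 * t): the scaled sensitivity to p0
# is identically 1 and to p1 is p1 * t, which the code should recover.
t = np.linspace(0.1, 1.0, 5)
p_nom = np.array([2.0, 0.5])
model = lambda p, tg: p[0] * np.exp(p[1] * tg)
S = dynamic_sensitivities(model, p_nom, t)
```

Columns of S with uniformly small magnitude correspond to practically inestimable parameters, mirroring the behaviour seen for k p , 0 and k m , 0, CGM 1 .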
Besides the relative effect of the parameters on the outputs, the sensitivity analysis can give a very good indication of the existence of correlations between the parameters. Similar sensitivity trajectories indicate strong correlation as seen in the sensitivity trajectories of k ads , 0, CGM 1 and k des , 0, CGM 1 ( Fig. 4 (a)) and G des , 1 and k ads , 0, CGM 2 ( Fig. 4 (b)). These outcomes are also consistent with the correlogram (correlation matrix) depicted in Fig. 6 .
Similar conclusions can be drawn from Fig. 5 , where the whole parameter set is presented in a box plot illustrating the variation of the estimated model parameters. The parameters may be classified into three discrete subgroups according to their contribution to the output: high, moderate and low. As such, parameters such as { k e, CGM 2 , G ads , 2 , G des , 2 , g 1 , k g 1 , g 2 , G des , 1 } may be classified as having a high impact, while { k ads , 0, CGM 1 , k des , 0, CGM 1 , β 1 , G ads , 1 , k ads , 0, CGM 2 , k des , 0, CGM 2 , β 2 , k g 2 } and { G min , 1 , k m , 0, CGM 1 , K e, CGM 1 , G min , 2 , k m , 0, CGM 2 , k p , 0 , E p , k e } present moderate and low sensitivity to the imposed perturbation, respectively. Hence, a considerable number of the parameters showed low sensitivity values. This lack of sensitivity suggests that the model is over-parametrized ( Saltelli et al., 2008 ). However, model discrimination is beyond the scope of this paper; all parameters are considered essential for other physical aspects of the model performance, and the insensitive ones may be set to their nominal values without degrading the prediction capabilities of the model. These observations suggest that the vector of model parameters, as a whole, is practically unidentifiable from the available data, and that further analysis is needed to select an optimal subset of parameters. It should be highlighted that this classification is based solely on the observed variation of the sensitivities and is presented as a preliminary qualitative analysis of the results; the formal implementation of the estimability approach and the identification of the cut-off value are discussed in the subsequent sections. A robust approach for ranking the model parameters according to their estimability potential is based on the orthogonalization algorithm ( Table 2 ), which takes into account both the sensitivity magnitude (i.e.
Euclidean norm) and correlation during the sequential selection of the most estimable parameter. The results obtained with the modified Gram-Schmidt orthogonalization algorithm are presented in Table 4 .

Table 4
Ranking of parameters with the highest estimability potential.

Method: Orthogonalization algorithm
Parameter ranking: g 1 , k e, CGM 2 , G ads , 1 , g 2 , G ads , 2 , k g 1 , β 1 , G des , 2 , k ads , 0, CGM 1 , k des , 0, CGM 1 , k g 2 , G des , 1 , β 2 , k des , 0, CGM 2 , k ads , 0, CGM 2 , k e , K e, CGM 1 , k p , 0 , k m , 0, CGM 2 , G min , 1 , E p , G min , 2 , k m , 0, CGM 1

The exponents of the growth kinetic equations in the { x 1 , x 2 } dimensions (i.e. g 1 and g 2 ) indicate high estimability potential. This was expected since these parameters represent the exponential factors of the crystal growth rates used in the model algebraic equations (i.e. empirical power-law expressions). The adsorption energies of the impurities (i.e. G ads , 1 and G ads , 2 ) also appear to have a significant impact on the outputs, as their high ranking illustrates a high estimability potential. It is also evident that the kinetic parameters corresponding to the nucleation mechanism, { k p , 0 , E p , k e }, are ranked low because of their weak sensitivity coefficients at the sampling times. This reveals how critical the incorporation of estimability analysis is to the design of experiments and, consequently, to mathematical modelling. Moreover, it is known that AR measurements provide negligible information about nucleation phenomena. This limitation may be overcome by incorporating additional PAT to measure the solute concentration and the number of particle counts (e.g. focused beam reflectance measurement, FBRM) and by considering these variables as process outputs in the estimability framework.
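A minimal sketch of a sequential orthogonalization ranking in the spirit of the algorithm of Table 2. This is one possible implementation (using least-squares projections onto the already-selected columns), a sketch rather than the paper's exact algorithm; the small test matrix is synthetic:

```python
import numpy as np

def orthogonalization_ranking(Z):
    """Rank parameters by estimability potential via sequential
    orthogonalization of the scaled sensitivity matrix Z (n_meas x n_par):
    at each step, the column with the largest residual norm (magnitude not
    explained by the already-selected columns) is chosen, which penalizes
    parameters correlated with those already ranked.
    """
    n_par = Z.shape[1]
    remaining = list(range(n_par))
    ranking = []
    X = np.empty((Z.shape[0], 0))      # selected (regressor) columns
    for _ in range(n_par):
        if X.shape[1] > 0:
            # residual of each remaining column after projection on span(X)
            coef = np.linalg.lstsq(X, Z[:, remaining], rcond=None)[0]
            resid = Z[:, remaining] - X @ coef
        else:
            resid = Z[:, remaining]
        norms = np.linalg.norm(resid, axis=0)
        best = remaining[int(np.argmax(norms))]
        ranking.append(best)
        remaining.remove(best)
        X = np.hstack([X, Z[:, [best]]])
    return ranking

# Column 0 has the largest norm; column 2 is almost collinear with it,
# so the small but orthogonal column 1 is ranked ahead of column 2.
Z = np.array([[10.0, 0.0, 9.9],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])
ranking = orthogonalization_ranking(Z)
```

This illustrates why a high-sensitivity parameter can still be ranked low: once a correlated parameter is selected, its residual information content collapses.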
The estimability analysis revealed that the data are not adequate to accurately estimate the nucleation kinetics. This is key, especially for systems utilizing different sensors. As such, the information content of each sensor may be assessed and, consequently, the number of parameters that can be estimated from each individual PAT, or from their combination (e.g. sensors providing different outputs), may be determined. This may also be employed to evaluate the accuracy of the measuring method, with the aim of collecting more accurate measurements. Hence, the estimability analysis can be utilized to select the appropriate PAT and, consequently, the most efficient strategy to collect the experimental data needed to improve the accuracy of the model parameters, which in turn enhances the model reliability in key applications such as process design and control. The sequential orthogonalization approach helps select the parameters with the highest sensitivity and least correlation. To provide a more rigorous insight into the correlation effects, a correlation matrix can be computed using the Pearson method ( Kendall et al., 1977 ). The 23 × 23 correlation matrix obtained from the covariance matrix (the inverse of the FIM) is presented in Fig. 6 . As shown, strong positive or negative correlations exist between some parameters. For instance, the parameters p 20 and p 23 present a very strong negative correlation; hence, any change in p 20 can be compensated by an inverse change in p 23 . Similar interaction patterns occur between the kinetic parameters describing the nucleation phenomenon (i.e. p 21 , p 22 , p 23 ), which present strong positive and negative correlations. In general, the presence of high correlations, especially when many parameters are involved, can make the identification process difficult and inaccurate (non-unique).
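The Pearson correlation matrix of Fig. 6 follows directly from the inverse of the FIM; a minimal sketch, with a synthetic 2 × 2 example whose off-diagonal correlation is 0.5 by construction:

```python
import numpy as np

def correlation_from_fim(fim):
    """Pearson parameter correlation matrix from the inverse of the
    Fisher Information Matrix (the approximate covariance matrix):
    corr_ij = cov_ij / sqrt(cov_ii * cov_jj)."""
    cov = np.linalg.inv(fim)
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)

# Synthetic example: covariance with variances 4 and 9, covariance 3,
# so the off-diagonal Pearson correlation is 3 / (2 * 3) = 0.5.
cov_true = np.array([[4.0, 3.0],
                     [3.0, 9.0]])
fim = np.linalg.inv(cov_true)
corr = correlation_from_fim(fim)
```

Entries of `corr` close to ±1 flag parameter pairs (such as p 20 and p 23 above) whose effects on the output can compensate each other.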
In order to identify the optimal subset of parameters that maximizes model reliability, a cut-off value should be set as a boundary between the parameters with high estimability potential, which can be identified reliably, and the remaining, poorly identifiable parameters, which should be set to their nominal values without degrading the prediction capability of the model. As such, the cut-off value is critical, as it affects both the cost and quality of the estimability approach and, consequently, the prediction capability of the model.
To identify the cut-off value and, consequently, the optimal subset of parameters, an iterative approach is performed. It consists of incrementally identifying subsets of model parameters from the experimental data according to their estimability potential ( Table 4 ), starting with the top-ranked parameter, then the top two parameters, and so forth. This approach helps identify the objective function threshold (i.e. cut-off value) beyond which further improvements are insignificant and, consequently, the optimal identifiable subset of parameters. The results of the iterative incremental approach are depicted in Fig. 7 . Typically, when a mean square error approach is considered, the objective function, J ( p ), decreases until a plateau is reached. The initial point of the plateau can be considered the cut-off value, as no significant improvement can be achieved from that point onwards, which consequently sets the limit of the optimal identifiable parameter set. Fig. 7 (a) indicates that the top 7 ranked parameters, in the case of the nominal values obtained from Borsos et al. (2016) , and the top 8 ranked parameters, in the case of the simultaneous optimization approach, are sufficient to capture the information contained in the experimental data. Although using more parameters may lead to a slight decrease in the objective function, as depicted in Fig. 7 , the estimability approach guarantees the best trade-off between model reliability and the minimum set of parameters to be identified. Fig. 7 also confirms that different nominal parameter values, as clearly shown in Fig. 7 (a) and (b), lead to different threshold values (340 in case a and 290 in case b). In this particular case, the estimability approach implemented with the nominal vector obtained from the simultaneous identification approach outperforms the one carried out with Borsos and co-authors' nominal values obtained sequentially.
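The incremental cut-off identification can be sketched as follows. The three-parameter objective below is a deliberately simple hypothetical example in which only the two top-ranked parameters matter, so the plateau (and hence the cut-off) appears at k = 2:

```python
import numpy as np
from scipy.optimize import minimize

def incremental_estimation(objective, p_nom, ranking, tol=1e-3):
    """For k = 1..n, identify only the top-k ranked parameters (the others
    are fixed at their nominal values) and record the objective J(k).
    The cut-off is the first k beyond which J stops improving markedly.
    """
    J = []
    for k in range(1, len(ranking) + 1):
        free = ranking[:k]

        def obj_k(p_free):
            p = p_nom.copy()
            p[free] = p_free          # free top-k parameters only
            return objective(p)

        res = minimize(obj_k, p_nom[free], method="Nelder-Mead")
        J.append(res.fun)
    # plateau detection: first k whose relative improvement is below tol
    cutoff = len(J)
    for k in range(1, len(J)):
        if J[k - 1] - J[k] < tol * max(J[0], 1e-12):
            cutoff = k
            break
    return J, cutoff

p_nom = np.zeros(3)
ranking = [0, 1, 2]                   # e.g. from the orthogonalization step
objective = lambda p: (p[0] - 1.0) ** 2 + (p[1] - 2.0) ** 2
J, cutoff = incremental_estimation(objective, p_nom, ranking)
```

In the real problem, `objective` is the least-squares mismatch between the PBM predictions and the AR measurements, and the plateau detection is replaced by visual inspection of the J(k) profile (Fig. 7).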
It should be noted that the objective functions show non-smoothness in both cases, which is likely due to the high nonlinearity and stiffness of the set of ODEs and to the increased correlation between the parameters as more parameters are added. This non-smooth behaviour may also indicate that the local solver became trapped in local optima.

Global sensitivity analysis
The global sensitivity analysis (GSA) is utilized here to assess the performance of the model itself and to cross-validate the local estimability analysis discussed earlier. The method provides an alternative way to rank the model parameters and identify the optimal set of parameters that can be estimated from the experimental data. In this case, the total order sensitivity index is used to rank the parameters, followed by an incremental optimization-based selection approach whose performance is compared against the previously described estimability approach based on local sensitivities.
The Sobol analysis is performed as follows. Firstly, a nominal set of model parameters is defined, followed by the definition of a probability distribution for each individual parameter. In this work, a Gaussian distribution with a 2% variance was assigned to every parameter. Narrow limits are applied since population balance models of crystallization processes are, in general, highly stiff, which might have a considerable effect on the computational burden. Random combinations of parameter values are then generated from the assigned probability distribution functions, and the model output is evaluated for the different parameter sets along with the associated uncertainties. Finally, the sensitivity indices are calculated to assess the effect of the parameters and rank them accordingly.
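The sampling-and-index-estimation procedure can be sketched with the Saltelli/Jansen Monte-Carlo estimators. The additive test model and standard normal inputs below are illustrative assumptions (the paper assigns Gaussian distributions around the nominal parameter values); the cost is (d + 2)·n model evaluations for d parameters, matching the (Np + 2)N cost cited in the text:

```python
import numpy as np

def sobol_indices(model, sample_params, n=1000, seed=0):
    """Monte-Carlo estimators of the first-order (S_i, Saltelli) and
    total-order (ST_i, Jansen) Sobol indices.

    sample_params(rng, n) must return an (n x d) matrix of parameter
    samples drawn from the assumed input distributions.
    """
    rng = np.random.default_rng(seed)
    A = sample_params(rng, n)
    B = sample_params(rng, n)
    fA, fB = model(A), model(B)
    var = np.var(np.concatenate([fA, fB]))
    d = A.shape[1]
    S, ST = np.zeros(d), np.zeros(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]                          # A with column i from B
        fAB = model(AB)
        S[i] = np.mean(fB * (fAB - fA)) / var       # first order
        ST[i] = 0.5 * np.mean((fA - fAB) ** 2) / var  # total order (Jansen)
    return S, ST

# Hypothetical additive model y = 4*x1 + 2*x2 + 0*x3 with standard normal
# inputs: the analytical indices are S = ST = (0.8, 0.2, 0).
model = lambda X: 4.0 * X[:, 0] + 2.0 * X[:, 1]
S, ST = sobol_indices(model, lambda rng, n: rng.standard_normal((n, 3)),
                      n=5000)
```

Parameters with both low S_i and low ST_i, like the third input here, contribute neither directly nor through interactions, which is the situation observed for the nucleation kinetics.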
The global sensitivity analysis is performed here considering two different scenarios. In the first scenario, the effects of the parameters are analyzed considering only the model-predicted outputs inherent to the set of DAEs representing the studied system. Hence, the impact of the parameters on the joint moments and on the concentrations of the solution and impurities is investigated based exclusively on simulations (i.e. without considering the sampling times). The second scenario considers the mean AR measurements; here, Sobol analysis is applied to decompose the variance associated with the difference between experimental and simulated data (i.e. global estimability analysis). In more detail, the computation of the root mean square error provides the information needed for parameter ranking and model selection (cut-off value determination). The 23 unknown model parameters estimated in this work and defined in Table 3 are used as inputs for the sensitivity and estimability analysis.
It was demonstrated that a trade-off between the computational accuracy and efficiency of the first and total order sensitivity indices can be achieved at a cost of ( N p + 2 ) N model evaluations ( Saltelli et al., 2005 ), where N is the number of samples, typically between 5 × 10 2 and 1 × 10 3 , and N p is the number of parameters (23 in our case). In this analysis, a conservative approach is adopted by considering N = 1 × 10 3 , giving a total of 25 × 10 3 evaluations for each scenario. The results were validated using different numbers of samples ( N ) to ensure consistency and robustness.
The results are summarized in Figs. 8 and 9 , where the first and total order Sobol sensitivity indices are presented in descending order for the two scenarios. Fig. 8 (a) and (b) indicate that the first and total order Sobol sensitivity indices yield the same order of priority for the first scenario, which illustrates that certain parameters have a considerable impact on the output variable (i.e. AR) both directly (the relative impact of each input variable) and indirectly (through interactions among the input parameters). In a similar way to the discussion above, the effect of randomly generated subsets of parameters on the mean square error between the measured and predicted mean AR is analyzed for the second scenario.
The greater the sensitivity indices, the more critical the parameters are for the model. Figs. 8 and 9 show that the parameters g 1 and g 2 , the exponents of the growth kinetic equations in the x 1 and x 2 dimensions respectively, possess the highest total sensitivity indices. This was expected since a growth-dominated physical system is under investigation. The analysis also demonstrates that G ads , 2 , G des , 2 and K e, CGM 2 , which represent the adsorption kinetics, desorption kinetics and the thermodynamic mass distribution coefficient for CGM2 respectively, can be reliably identified. This can be anticipated as well, since it was experimentally proven ( Borsos et al., 2016 ) that CGM2 (i.e. sodium hexametaphosphate) has a more prominent effect than CGM1 (i.e. aluminum sulfate): when both growth modifiers are present in the system, the decrease in AR is driven by CGM2, even when lower amounts of CGM2 are used. The nucleation kinetics present low sensitivity values, which is consistent with the outcomes of the estimability analysis based on local sensitivities. Comparing the two scenarios, the majority of the parameters show a significant lack of sensitivity. However, interesting patterns emerge in both scenarios: sensitive parameters (high s i values) affect the output through both direct and indirect effects (high s t values), whereas parameters with moderate and low sensitivity values cannot affect the system even indirectly (i.e. through interactions) from a sensitivity standpoint. The total order indices, presented in Figs. 8 (b) and 9 (b), are used to identify the cut-off value for the selection of the optimal subset of model parameters. As can be seen, the values of the total order indices decrease until a plateau is reached.
The initial point of the plateau can be considered the cut-off value, since the addition of more parameters from that point onwards does not improve the prediction capability of the model. The Sobol analysis indicates that a cut-off value can be identified directly from the total order indices, and accordingly the top 7 and top 8 parameters are sufficient to build a reliable model for the 1st and 2nd scenarios respectively. However, for the sake of consistency and to enable a reliable comparison between the two estimability methods, the cut-off value is identified from the profile of the objective function associated with the parameter identification problem (minimization of the error between the model predictions and the experimental data). The profiles of the objective functions for the two scenarios, obtained by the incremental iterative approach described above for the LS-based estimability, are depicted in Fig. 10 . As in the previous case, the objective function decreases significantly with the introduction of the top few parameters. The diagram confirms that selecting the top 8 parameters is sufficient to maximize the prediction capabilities of the model. Despite these consistent outcomes, the selection process through an incremental iterative parameter estimation procedure is highly advised, as it is more reliable than a selection based on the magnitudes of the total order sensitivity index. Fig. 10 also shows non-smooth behavior similar to Fig. 7 .
To make a reliable and effective comparison between the two methods described in this paper (the estimability method based on LS and the Sobol method with two scenarios), the parameter ranking and optimal parameter sets are summarized in Table 5 , which presents the parameter ranking based on the orthogonalization algorithm and the Sobol analysis. Although each method yields a different classification, some consistency was achieved, as the same four parameters, highlighted in red, were identified by both methods as those with the most prominent effects. The inconsistencies can be explained by the fact that the methods rely on essentially different approaches, local and global sensitivities; besides, the quality of the nominal vector of parameters can play a key role in both cases. Although both techniques can be used separately, the outcomes of the analysis show that their combination provides a more systematic and robust selection of the subset of parameters that guarantees optimal model prediction capabilities, based on the available data. In addition, the methodology can provide a basis for assessing the quality and quantity of the experimental data or, alternatively, inform or help design the experiments and/or measurements (DoE) required to improve the estimability potential of a specific parameter, which in turn helps improve the prediction capabilities of the mathematical model, particularly in the case of multi-dimensional population balance models. Although Sobol analysis provides one of the most accurate methods for calculating parameter sensitivities, it does not systematically consider the correlation between parameters during the ranking process, as opposed to the local estimability approach, which addresses the correlation issue quite effectively: the sequential orthogonalization is used precisely to exclude highly correlated parameters from being selected amongst the optimal top-ranked set.
Finally, to further demonstrate the benefits of the estimability analysis and appraise the prediction capability of the model built with the optimal subset of parameters, the model predictions are compared against the experimental data as well as against the predictions of the model built without the estimability approach ( Borsos et al., 2016 ). Three different experiments associated with mean AR measurements are used, as shown in Fig. 11 . It should be noted that obtaining accurate AR data is very challenging. The PVM is currently the main technique available to measure the AR in real time, despite the inherently noisy and non-smooth data, as clearly seen in Fig. 11 , which is commonly associated with most image-based monitoring tools.
Although both models seem to provide a good fit, Fig. 11 (a) and (b) show that the mathematical model with the estimability approach demonstrates a better prediction capability. This outcome is consistent with the incremental objective function analysis ( Fig. 7 ). The results show that the model built by identifying the 8 most estimable parameters outperforms the one built by identifying all parameters sequentially, as can also be seen in Fig. 12 . It becomes clear that the estimability approach makes the identification process more accurate and less laborious, as a reduced set of parameters is identified while the remaining parameters are kept at their nominal values without compromising the prediction capability of the mathematical model.

Conclusions
Parameter estimability is essential to assess whether the model parameters can be reliably identified from existing data, which consequently provides a key step towards more predictive and robust mathematical models. Within this perspective, a novel estimability framework that combines a sequential orthogonalization of the local sensitivity matrix with Sobol, a variance-based global sensitivity technique, was proposed. The estimability analysis requires an initial, or nominal, vector of model parameters. When a nominal vector of parameters is not available, or when the initial parameter estimates are considered highly uncertain, the framework suggests a simultaneous identification of the whole set of parameters using a hybrid global optimization approach. The estimability procedure can then be conducted using the nominal vector of parameters in conjunction with the available experimental data. The systematic combination of the two estimability methods guarantees a robust selection of the optimal subset of parameters: the set that can be identified most reliably with a guaranteed maximum model prediction capability. As such, both parameter significance and correlations should be considered to rank the model parameters. The framework suggests a systematic methodology, based on the parameter identification objective function, to identify the cut-off value, which marks the boundary between the parameters that can be reliably identified (the optimal subset) and those that should be set to their nominal values. When the resulting model prediction capability is not satisfactory and/or only a very limited number of parameters can be identified reliably, the method suggests collecting additional experimental data based on an appropriate design of experiments.
As a validation step, the methodology was applied to a complex multi-dimensional morphological population balance model for batch crystallization processes, which combines the effects of different crystal growth modifiers/impurities on the crystal size and shape distribution of a population of needle-like crystals. Initially, two situations were considered regarding the nominal vector of parameters: parameters obtained from the literature and parameters identified using simultaneous global optimization. The first evaluation of the quality of the nominal vector of parameters revealed that most of the nominal parameters are inherently uncertain, based on the confidence domains, which justifies the need for the estimability analysis. The 23 model parameters were ranked in terms of highest local sensitivity magnitude and least correlation, in the case of the sequential orthogonalization method, and in terms of total order sensitivity indices, in the case of Sobol. The correlation patterns confirmed the existence of strong correlations between some parameters, which helped explain the resulting parameter ranking. The incremental least-squares parameter identification procedure helped determine the cut-off value and, consequently, the optimal subset of parameters, which turned out to comprise 8 parameters with both methods. Despite some slight differences in parameter ranking, the two estimability methods consistently captured the most significant parameters. However, it is highly recommended to run both methods to maximize the benefits of the estimability approach and minimize the least-squares value at the cut-off, which guarantees maximum model prediction capability. The case study showed that, although noisy AR data with low information content were used, a set of the most influential and least correlated parameters could be identified, providing enhanced prediction capabilities of the dynamic model of the studied crystallization process.
As a consequence, the framework can be extremely valuable for complex model systems in which a large number of parameters needs to be identified from data with low information content, as is commonly encountered in real systems. The proposed framework can also embed an optimization of the experimental campaign to maximize the information content and reduce the cost inherent to redundant experimental information. For systems utilizing different sensors, the information content of each sensor can be assessed and, consequently, the number of parameters that can be estimated from each individual PAT, or from their combination (e.g. sensors providing different outputs), can be determined, which helps select the most appropriate PAT depending on the targeted level of prediction capability and application (e.g. process control).

Appendix A. Population balance model

To describe the dynamic evolution of the crystal shape distribution, a multi-dimensional population balance equation (PBE) with two characteristic lengths x = { x 1 , x 2 } was considered, which can be written as:

\frac{\partial n(t,x)}{\partial t} + \sum_{i=1}^{2} \frac{\partial \left[ G_i \, n(t,x) \right]}{\partial x_i} = B_p \, \delta(x - x_0) \qquad (A.1)

where n ( t, x ) is the number density function, δ( x − x 0 ) is the delta distribution that characterizes the formation of the nuclei, B p is the primary nucleation rate and G i is the crystal growth rate of the i th characteristic crystal facet. The initial and boundary conditions of the PBE are, respectively:

n(0, x) = n_0(x) \qquad (A.2)

n(t, x) = 0, \quad x \in \partial_{sz} \qquad (A.3)

where ∂ sz is the space boundary of the particle size.
The model can be reduced from a partial differential equation (PDE) to a set of ordinary differential equations (ODEs) using the standard method of moments (SMOM). Since only average properties are needed to determine the mean crystal AR, the SMOM provides an efficient and accurate way to estimate the key characteristics of the crystal population.

Applying the moment transformation, with the joint moments defined as \mu_{m,r} = \int_0^{\infty}\!\int_0^{\infty} x_2^{m}\, x_1^{r}\, n(t,x)\, dx_1\, dx_2 , yields:

\frac{d\mu_{m,r}}{dt} = m\, G_2\, \mu_{m-1,r} + r\, G_1\, \mu_{m,r-1} + B_p\, x_{2,0}^{m}\, x_{1,0}^{r} \qquad (A.5)
This set of ODEs, coupled with the component mass balances for the solute and impurities, constitutes a comprehensive moment-based model for crystallization processes in the presence of one or multiple impurities/additives. The most critical joint moments are interpreted as follows: μ 0,0 is the total number of crystals ( #/m³ ) and μ 2,1 represents the crystal volume in a unit volume of suspension ( m³/m³ ). Although these are the only joint moments with a direct physical meaning, other moments can be used to determine further key properties of the crystal population. In particular, the moments can be used to determine the mean crystal size of the total population along each characteristic length ( Eqs. (A.6) and (A.7) ), while the mean AR of the crystals ( Eq. (A.8) ) is estimated as the ratio of the mean sizes:

\bar{x}_1 = \frac{\mu_{0,1}}{\mu_{0,0}} \qquad (A.6)

\bar{x}_2 = \frac{\mu_{1,0}}{\mu_{0,0}} \qquad (A.7)

AR = \frac{\bar{x}_1}{\bar{x}_2} \qquad (A.8)

It should be noted that the model is based on several assumptions, mainly:
• All newly formed crystals have a nominal size L x1,n ≈ L x2,n ≥ 0. Hence, the initial nuclei size can be taken as L n ≈ 0 (in most modelling studies of crystallization processes, the initial nucleus size is set to zero for practical purposes).
• The process operates under well-mixed conditions, so the system can be assumed perfectly mixed. Hence, a lumped-parameter model is developed, since the dependent variables (e.g. the density function, the concentrations of the different chemical compounds and the moments) are functions of time only, not of spatial location.
• Only primary nucleation and crystal growth are considered, since only these phenomena were detected experimentally; agglomeration and breakage can therefore be neglected.
• Size-independent growth rates are assumed for the two characteristic facets, since the SMOM is applied for the identification of the parameters.
• Two different impurities and two different active sites are taken into account, located on two different crystal facets.
• There is no interaction between the active sites.
• The impurity effect on nucleation is insignificant and hence considered negligible.
• An equilibrium adsorption model is considered.
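Under the assumptions above (constant, size-independent growth rates and near-zero nuclei size), the moment model of Eqs. (A.5)-(A.8) can be sketched as follows. The kinetic values are hypothetical and the moment index convention follows Eqs. (A.6)-(A.7) (second index counts powers of x 1); with zero nuclei size, nucleation feeds only the zeroth moment:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative constant kinetics (the real model uses supersaturation-
# and impurity-dependent growth rates); these values are hypothetical.
G1, G2, Bp = 2e-7, 5e-8, 1e6   # growth rates [m/s], nucleation [#/(m^3 s)]

def moment_odes(t, mu):
    # mu = [mu00, mu01, mu10, mu11]; mu_{m,r} carries x2^m * x1^r,
    # so d(mu_{m,r})/dt = m*G2*mu_{m-1,r} + r*G1*mu_{m,r-1} (+ Bp for mu00)
    mu00, mu01, mu10, mu11 = mu
    return [Bp,                        # d(mu00)/dt : primary nucleation
            G1 * mu00,                 # d(mu01)/dt : first order in x1
            G2 * mu00,                 # d(mu10)/dt : first order in x2
            G1 * mu10 + G2 * mu01]     # d(mu11)/dt : cross moment

mu_init = [1e8, 0.0, 0.0, 0.0]         # seeded population of near-zero size
sol = solve_ivp(moment_odes, (0.0, 3600.0), mu_init, rtol=1e-8)
mu00, mu01, mu10 = sol.y[0, -1], sol.y[1, -1], sol.y[2, -1]
x1_mean = mu01 / mu00                  # Eq. (A.6), mean length
x2_mean = mu10 / mu00                  # Eq. (A.7), mean width
AR = x1_mean / x2_mean                 # Eq. (A.8), mean aspect ratio
```

With constant growth rates, both first-order moments integrate the same μ 0,0 history, so the mean AR equals G1/G2 regardless of the nucleation rate; impurity-dependent growth rates break this ratio, which is precisely the effect the MIAM captures.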