Machine learning for viscoelastic constitutive model identification and parameterisation using Large Amplitude Oscillatory Shear

Identification and parameterisation of constitutive models can be a challenging task in rheology. We investigate the use of Random Forest (RF) regression to estimate viscoelastic constitutive model parameters using Large Amplitude Oscillatory Shear (LAOS) data. Specifically, we deploy the RF to predict constitutive model parameters using the spectra of Chebyshev coefficients pertaining to the stress-strain-strain rate Lissajous curves. As far as we know, this is the first time Machine Learning (ML) has been utilised for this predictive task. We test three constitutive models: the linear and exponential Phan-Thien-Tanner (PTT) models, and the RoLiE-Poly model. For both PTT models, the RF estimator predicts the model parameters from the Chebyshev spectra with high accuracy. For the RoLiE-Poly model, the RF estimator demonstrates lower accuracy in model validation, however the predicted parameters still accurately capture the rheological behaviour in both oscillatory shear and oscillatory extensional flow. This is because multiple sets of RoLiE-Poly model parameters can yield very similar rheological responses and indicates practical identifiability issues under the investigated conditions. This does not represent an inadequacy of the framework, but rather a fundamental challenge in estimating model parameters using rheometric data. Overall, the study highlights the strong potential ML has to offer for selecting and parametrising constitutive models based on rheometry data. Our results can help in allowing for widespread robust, modelling of viscoelastic fluid flows.


Identifying constitutive viscoelastic models
Viscoelastic fluid flows are ubiquitous across a wide range of industries and sectors, ranging from Fast Moving Consumer Goods (FMCG) [1,2] to health-care [3][4][5][6].With respect to the the former example, constitutive models are particularly important for understanding how product design and formulation affects rheological behavoiour in process equipment (mixing devices [7,8], pipe bends [9], contractions [10] etc).This is a key step in transitioning from product development and accelerating towards manufacture.Moreover, given the current climate emergency and the need to make our societies more sustainable, a particular challenge within FMCG and many other sectors at present is the rapid incorporation of novel and sustainable raw ingredients.Since the introduction of new materials into products can influence their rheology [11], it is vitally important that we develop tools to accurately and rapidly identify suitable constitutive models, to allow for robust digital modelling.Without access to robust digital modelling tools, industries still have to rely on trial-and-error experiments to scale-up and roll-out new products, which wastes time and increases carbon emissions.Additionally, although it gives rise to complex fluid behaviour, viscoelasticity can be a desirable material property in many cases.For example, this is the case for hydrogels used within the context of regenerative medicine.Viscoelasticity is a key property of biological soft-tissues, as it helps create the mechanical environments required to direct cell morphology appropriately [12,13].Therefore, the effective design and manufacture of viscoelastic hydrogels via digital modelling is crucial for regenerative technology.
A vast number of constitutive equations (or models) have been proposed to enable continuum scale modelling of viscoelastic fluid flows.To model these flows with, for example, advanced numerical tools such as Computational Fluid Dynamics (CFD), one must accurately select and parameterise (or fit) a particular constitutive model using some experimental data.Usually, this data is obtained uing shear rheometery.Provided that an analytical solution exists for the chosen constitutive model under the chosen rheometric test (or kinematics), model parameter estimation is relatively straightforward as solution to a nonlinear program [14].However, if no analytical solution exists for the particular choice of constitutive model and flow, parameter estimation becomes more challenging.For such cases, the constitutive model can be solved numerically using an appropriate scheme [15].
The numerical solution can then be incorporated into an optimisation scheme for estimation of the parameters [16,17].This is, however, more complex and time-consuming than the case where the model can be solved analytically, and in practice model nonlinearity poses a considerable challenge.Some of the most physically advanced and recently developed constitutive models tend to include large numbers of fitting parameters.This can pose a challenge to parameter estimation both from the perspective of practical identifiability and computation.In the former case, this might be resolved through the collection of more rheometric data.Note that in the previous discussion regarding model fitting, it is already assumed that an appropriate constitutive model has already been selected.If one is not certain as to which constitutive model is best to use for a given material, this parameterisation process must be repeated for a range of models in order to assess which one best fits the experimental data.Moreover, if spatial heterogeneity arises in the flow due to complex phenomena such as shear banding [18,19] (which is commonly observed for many personal care products such as shampoos and shower gels [20]), this must also be captured in the numerical modelling in order to properly fit the model response to the rheometric data.This means the numerical simulations must resolve the flow spatially, which dramatically increases the number of discretised equations to be solved and makes the parameterisation process even more complex and time-consuming.
Considering the disadvantages of the traditional fitting methods discussed previously, we therefore propose, in this proof-of-concept study, an alternative approach for constitutive model identification and parameterisation based on Machine Learning (ML) [21].
In the following sub-sections, we will give brief reviews of the deployment of Machine Learning within the fields of chemical engineering and then, more specifically, rheology and constitutive modelling.We will also introduce to the reader constitutive modelling for viscoelastic materials, and detail the particular constitutive models employed in this investigation.

Machine Learning in Chemical Engineering
ML has been widely investigated in Chemical Engineering applications as far back as the early 1990s.The process systems engineering community, which develops computational tools for design and operation of multi-scale systems, has demonstrated particular interest in these algorithms [22,23].What unites this interest is the use of flexible, black-box functions to approximate relationships for which one has observations, but few processes to reason about mechanistically.For example, in operational decision problems such as control there has been interest in the development of Reinforcement Learning (RL) [24].RL can be conceptualized as a data-driven approach to identifying an approximately optimal policy function approximation.Depending on the problem, one may be agnostic of the optimal policy function structure and parameters.In these cases, RL provides an attractive solution method.As a result of its generality, it has observed applications in production scheduling [25], supply chain optimisation [26] and integrated design and control [27].
Another domain where model structure may evade reasoning processes include complex reaction systems and their process dynamics.For example, Gaussian process have been used to model the discrete-time state transitions of biochemical reaction systems [28].However, purely black-box approaches have limited extrapolation ability.For this reason, hybridization of ML and first principles models has had academic and industrial impact.This is because the resultant models perform in both extrapolative and interpolative predictive tests [29][30][31].
Highly dimensional data sources have historically provided challenge to model identification.However, many ML models are able to identify structure within the data, learn lower dimensional representations of it, and then make robust predictions.For example, [32] proposed the use of deep learning and data assimilation to forecast drop dynamics in a microfluidic device from real-time video recordings.This is also exploited in application areas such as soft-sensing [33,34], which provides means to estimate process variables that are conventionally difficult to measure by utilising other available process sensor measurements.
ML has also demonstrated application, beyond operational decision-making, in design and analysis.For example, surrogate modelling of CFD simulations presents as attractive means of model-based design [35,36].In general, these approaches aim to identify a nonlinear relationship between the design decisions and the design objective for exploitation by model-based optimisation.Finally, it is worth noting that although ML describes a set of blackbox models, once the model has been identified, it can be interrogated for the purpose of analysis.For example, Kim and Maravelias [37] demonstrated the use of ML for analysing the relationship between features of batch production scheduling mixed-integer linear programs and respective solution times.
Having provided wider context and perspectives on the motivation for use of ML across chemical engineering, we now review its application within rheology.

Machine Learning in rheology
Recently a number of authors have employed ML for data-driven constitutive modelling of complex fluids [38][39][40][41][42][43].In the large majority of these studies, the ML algorithm is trained to reproduce observed rheological behaviour by formulating a 'generalised' or 'meta' non-linear constitutive model.That is, a generic model which satisfies required physical principles such as material-frame indifference [44], but is not absolutely restricted in terms of the forms or positions of the non-linear functions within the model.Therefore, these data-driven models could represent blends of pre-existing constitutive models or even new constitutive models entirely.Mahmoudabadbozchelou et al. [38] presents a multi-fidelity neural network approach for modelling the rheology of complex fluids.In this method, high fidelity data (experimental or numerical) is used in combination with low fidelity data generated for simple Generalised Newtonian Fluid (GNF) models (in particular the power-law and Herschel-Bulkley models) for the neural network training.The results showed that the multi-fidelity approach was able to reproduce experimental rheometric data much more accurately than the more traditional approach, which is purely data-driven and is not trained based on constitutive models.Mahmoudabadbozchelou et al. [43] present a framework which the authors call 'Rheoloy-informed Graph neural Netorks' or RhiGNets.In this study, the authors use a physics-informed approach to recover rheological behaviour based on a small set of experimental data for a given material.This is particularly useful for constitutive models containing very large numbers of fitting parameters, where the generation of large training data sets is computationally demanding due to the dimensionality of the problem.This is particularly the case for recently developed constitutive models [45][46][47] for Thixotropic Elasto-Visco-Plastic (TEVP) materials, which typically contain more than 10 fitting parameters [48].
For the majority of available CFD codes (at least those primarily of interest in industry settings), the user is still restricted to using a set of chosen constitutive models, rather than ML informed models.Also, even if ML informed constitutive models are implemented and become the norm in major CFD codes, it is still important to be able to quickly and accurately select and parameterise well-known constitutive models derived from micro-structural theories.This is particularly the case when one wishes to undertake fundamental studies of the fluid dynamics of these constitutive models, which is still a popular and impactful topic [49][50][51][52][53].
In this proof-of-concept study, we explore a means by which to approximate the mapping between rheometric data and an optimal parameterisation of a constitutive model by leveraging the highly flexible, and nonlinear set of models encompassed by Machine Learning.Specifically, we explore the use of Random Forest (RF) regression.The major benefit of this is that we mitigate dependence on optimisation-based parameter estimation routines to quickly provide a model estimate.The manuscript is organized as follows: in Section 2, viscoelastic constitutive models of interest to this work are introduced; Section 3 presents the proposed framework to enable parameterization of a constitutive model from Large Amplitude Oscillatory Data (LAOS) via the inference process of an ML model; in Section 4, key results and discussions are presented before we subsequently present our conclusions from the study.

Preliminary: Viscoelastic constitutive modelling
Here, viscoelastic constitutive models of interest to this work are introduced.For a more thorough introduction to constitutive modelling and viscoelasticity, the reader is referred to the book by Bird et al. [54].We start by detailing the Phan-Thien-Tanner [55] constitutive models.A generic polymer network-type constitutive model may be written as where A is the conformation, or configuration, tensor describing the stretching and orientation of the ensemble of microscopic entities (polymers or micelles etc), λ is the viscoelastic relaxation time, and D(A) and C(A) are scalar functions describing the rates of microstructural destruction and creation, respectively.A here is defined as A ≡ tr(A) ≡ A 11 +A 22 +A 33 .

□
A denotes the objective Gordon-Schowalter time derivative [56] defined as where D is the rate-of-strain tensor defined as D ≡ (∇u + ∇u T )/2.The parameter ζ (where 0 ≤ ζ ≤ 2) physically describes "slip" between the polymeric material and the solvent.In the case that ζ = 0, the upper-convected (or Oldroyd [57]) derivative is obtained, denoted by where the reference frame translates, rotates, and deforms affinely with the material.In the case that ζ = 1, the co-rotational (or Jaumann) derivative is obtained, where the reference frame only translates and rotates with the material.Finally, in the case that ζ = 2, the (less frequently employed) lower-convected derivative is obtained.The extra-stress tensor τ (the total or Cauchy stress tensor minus the isotropic pressure) is recovered from A by In the Phan-Thien-Tanner (PTT) family of viscoelastic models, which are popular choices of constitutive models for many viscoelastic materials [58][59][60], the assumption is made that . The original PTT model [55], which we denote as the linear PTT (lPTT) model, uses a linear function for D(A), which is given by where ε represents the rate of destruction of the junction points in the Lodge-Yamamoto network theory, from which the PTT model is derived.The PTT model was, shortly after, modified to include an exponential term for the junction breakage [61].In this model, which we denote as the exponential PTT (ePTT) model, D(A) is defined as Therefore, if one wishes to use either the lPTT or ePTT model to simulate the flow of a real-world viscoelastic material, the parameters η p , η s , λ, ζ, and ε must be obtained via fitting with experimental data.The analytical or numerical solution for the constitutive model is obtained, most often, by substituting into Equations ( 1) and (3) known kinematics (i.e. a prescribed velocity gradient tensor ∇u), which may be time-dependent.In the case of simple (or homogeneous) shear flow, where the 1,2, and 3 components of the orthonormal coordinate system are in the velocity, velocity gradient, and vorticity directions, respectively, the only nonzero component of ∇u is (∇u) 21 = γ.The resulting solution, with initially guessed model parameters, is then to be compared with the experimental data for the fitting of the parameters.
We also test in this study the ROuse LInear Entangled POLYmers (RoLiE-Poly) model, which was proposed by Likhtman and Graham [62].The RoLiE-Poly model is derived from a refined version of the Doi-Edwards tube theory for entangled polymers [63,64] and accounts for the physical (microscopic) processes of reptation, convective constraint release (CCR), chain stretch and retraction [65].The model is given by and τ is then recovered using Equation (3) with ζ = 0, since the RoLiE-Poly model uses the upper-convected derivative.The parameter z defines the ratio between the reptation relaxation time λ and the Rouse relaxation time λ R .Note that the entanglement number Z is often used in the literature and is defined as Z = z/3.For clarity, we do not refer to lowercase z in Equation ( 6) as the entanglement number, it is simply, for our purpose, a model parameter that we desire to fit.β CCR is the convective constraint release parameter, where 0 ≤ β CCR ≤ 1. δ is a fitting parameter, whose optimum value has been found to be −0.5 [63].Accordingly, we use δ = −0.5 in this study, and so the non-linear model parameters (those other than λ, η p , and η s ) we seek are z and β CCR .

Methodology
As this is primarily a proof-of-concept study, we use only the three aforementioned constitutive models to highlight the methodology we are proposing.The methodology is designed such that it can be easily repeated for further sets of constitutive models.This is particularly useful since new and improved constitutive models are frequently being proposed.In this section, we will introduce the reader to the rheometric flows we use for model fitting and validation, and then we will detail the RF algorithm and its implementation in the proposed workflow for this study.In this study, we focus solely on Large Amplitude Oscillatory Shear (LAOS) as the rheometric test to be employed for the fitting of the model parameters.During an ideal oscillatory shear flow, the strain γ(t) follows a sine wave as γ = γ 0 sin(ωt) where γ 0 and ω represent the amplitude and the frequency of the oscillation respectively.The strain rate γ(t) ≡ dγ/ dt, is then given as γ = γ0 cos(ωt) where γ0 ≡ γ 0 ω is the strain rate amplitude.
The velocity gradient tensor is therefore expressed as Due to the time-periodicity of the applied flow, the response of a viscoelastic material or constitutive model to oscillatory shear flow can be decomposed, in an ideal case, as where G ′ and G ′′ are, respectively, the storage and elastic moduli, which quantify elasticity and viscosity, respectively [66].For Small Amplitude Oscillatory Shear (SAOS), the stress response of the material or model is linear such that a single mode of G ′ and G ′′ is sufficient to reconstruct the stress (i.e. the response is linear).During a LAOS experiment or simulation, the non-linearity in the stress response can be interpreted as higher-order harmonics in the stress response.Experimentally, non-linearity arises due to intra-cycle microstructural changes which are induced by the large deformations (or rates of deformation) in the flow.
LAOS is considered to be especially useful for fitting constitutive models to experimental data due to the fact it captures both transient and non-linear viscoelastic behaviour [16].Responses of viscoelastic materials and constitutive models under LAOS flow can be rich with physical information.Moreover, even in the non-linear viscoelastic regime, the responses of some viscoelastic models can only be distinguished from one another in transient flows such as LAOS [67,68].However, whilst analytical solutions exist for various constitutive models in SSSF, it is not always possible to obtain analytical solutions in LAOS for even relatively simple constitutive models.Various authors have obtained analytical solutions, utilising symbolic computational tools, in a regime denoted as Medium Amplitude Oscillatory Shear (MAOS) [69], where the response is only weakly non-linear.In this case, the fitting of these viscoelastic constitutive models to experimental MAOS data does not have to involve numerical simulations.In the MAOS regime, however, the stress response might be largely insensitive to the values of the model parameters, due to the response only being weakly non-linear.Currently, for LAOS, the only viable option for fitting constitutive models involves minimising the error between a numerical solution and experimental data through optimisation.This can often be challenging and expensive due to high model nonlinearity.ML presents therefore as a potential tool for quickly fitting constitutive models to experimental LAOS data, by parameterising the mapping between LAOS data and optimal model parameters.
We now define the following dimensionless variables Therefore, according to Equations ( 1) through ( 6), and Equation ( 9), the lPTT and ePTT models can be written for homogeneous oscillatory shear flow in dimensionless form as where The RoLiE-Poly model can be written as We then recover the dimensionless extra-stress (reaffirming that ζ = 0 for the RoLiE-Poly The dimensionless groups which arise from the non-dimensionalisation are the Deborah number De = λω, representing dimensionless frequency, the Weissenberg number W i = λ γ0 , representing dimensionless amplitude, and the viscosity ratio β = η s /(η s + η p ).Note here that we assume the flow is spatially uniform (creeping flow and no shear-banding), and so ∇A = ∇τ = 0.The dimensionless velocity gradient tensor is given for the oscillatory shear flow as Equations ( 9) through ( 14) form a set of closed Ordinary Differential Equations (ODEs), which we solve numerically using MATLAB's ode15s solver for each set of model parameters.
This solver uses built-in adaptive time-stepping.All simulations are initialised with A = I would likely be linear, meaning the shear stress response would not be sensitive to the nonlinear model parameters (hence they would not be able to be determined).Also, since we are using a non-dimensional stress based on γ0 η 0 where η 0 = η p + η s is the zero-shear viscosity, the zero-shear viscosity must also be known for a given material for the implementation of this proposed methodology.However, both λ and η 0 are typically very easy to measure.
There is no reason that the ML algorithm could not be trained on data for multiple values of De and/or W i simultaneously (e.g.data for an amplitude sweep).However, as this is primarily a proof-of-concept study, and we wish to limit the amount of generated training data, we use one fixed point in Pipkin space.
The assumption that the flow is homogeneous, at least in the direction of the velocity gradient, is not realistic since it is widely known that the inclusion of a nonzero slip parameter ζ in the constitutive model results in shear-banding for a wide range of model and system parameters [70][71][72].The RoLiE-Poly model is also widely studied specifically for its ability to predict shear banding [73,74].We could relax this assumption by solving both the constitutive model and the momentum equation in a 1D gap of fluid and allowing the flow to shear-band, as is done in the study of John et al. [68].However, since we wish to create a large training database containing thousands of LAOS simulations for varying sets of parameters, this would take considerably longer.In this study, we just wish to highlight the potential for ML to correlate LAOS results with constitutive model parameters, and so we assume spatially uniform flow.If this method were to be implemented to process real rheometry data, the 1D method would need to be implemented in the RF training to account for potential shear-banding.

Large Amplitude Oscillatory Extension (LAOE)
We also check the accuracy of the parameter predictions of the RF algorithm in a homogeneous oscillatory planar (i.e.2D) extensional flow.We test this flow to check if the rheological behaviour under this flow is more sensitive to the model parameters than for LAOS.To simulate this flow, we impose a velocity gradient tensor given by (∇u and solve the resulting system of Equations after substituting this into the constitutive model (see (Appendix B)), as was done for the oscillatory shear flow in previous sections.We use the first normal stress difference N 1 = τ22 − τ11 as the measure of the polymeric stress.Note that for the non-dimensionalisation of the equations, the extensional strain rate ε is used instead of the shear strain rate γ.Therefore, W i = λ ε0 .
Whilst oscillatory homogeneous shear flows can be easily created experimentally with modern shear rheometers, homogeneous extensional flows are significantly more challenging to realise.Thus, whilst the methodologies and results for this study for LAOS can be readily implemented, this section of the study is more hypothetical in nature.Many candidates for extensional rheometers (planar, uniaxial, and biaxial) have been proposed.These include, but are not limited to, four-roll mills [75], fibre spinning devices [76], counter-rotating drums [77], Capillary Break-up Extensional Rheometers (CaBER) [78], Filament Stretching Extensional Rheometers (FiSER) [79], and microfluidic devices [80].As shown in Equation (15), we will limit our attention to planar extension in a 1 − 2 plane.Perhaps one of the most promising candidates for a planar extensional rheometer is the Optimised Shape Cross-slot Extensional Rheometer (OSCER) proposed by Haward et al. [81].

Stress Decomposition and Chebyshev Analysis
As shown by Equation ( 8), a particular elegance of LAOS (and LAOE) data is that the periodic shear-stress response (usually containing hundreds or thousands of data points) can be represented as a relatively small set of basis functions and coefficients by considering only a sufficient number of Fourier modes.In the context of utilising ML for model fitting, this is useful in two ways.Firstly, the dimensionality of the problem is drastically reduced (i.e. the machine need only be trained on the small set of coefficients rather than the individual data points in the stress waveform).Secondly, this technique removes requirements to match the sampling rates of the experimental data with that used for training data; enabling a uniform characterisation of stress waveforms gathered from different users.
We decompose the periodic stress response using the Stress Decomposition (SD) technique introduced by Cho et al. [82].Here the shear-stress is decomposed into the sum of an instantaneous elastic stress τ e and an instantaneous viscous stress τ v as Following the methodology first detailed by Ewoldt et al. [83], we then approximate τ e and τ v using Chebyshev polynomials of the first kind, denoted by T , as In general, the quality of this approximation increases as the order n of the polynomials increases; however, we use a finite n that provides an adequate representation of the wave-form.Once the data for the limit-cycle has been generated for all data points in the model parameter space, we fit a series of n Chebyshev polynomials to all waveforms.We then reconstruct the waveform from the fitted coefficient spectra and compute the total error across all data points between the real and reconstructed waveforms (real here meaning numerical, not experimental).We choose a value of n where the error does not change with increasing n.In essence, we ensure the series representation of the waveform is converged to the real waveform for all waveforms.For the LAOE flow, we decompose and interpolate the waveform for N 1 using Chebyshev polynomials in the same way as described previously for LAOS.
In Equation ( 17), ψ denotes any variables other that De and W i on which there is a dependence of e n and v n .In the context of constitutive models, this would represent the set of non-linear fitting parameters (such as L 2 in the FENE-P model [84] or ε and ζ in the PTT models).Experimentally, whilst it is likely not possible to explicitly define ψ, this might reflect, for example, variables such as polymer concentration, or the degree of chain-branching etc.The relationship between e n and ψ, and that between v n and ψ, can be found analytically for some constitutive models using MAOS solutions, as discussed.In this study, however, we wish to use ML to correlate the viscoelastic model fitting parameters ψ with the coefficients e n and v n in the fully non-linear (LAOS) regime.To the best of our knowledge, there exists no simple optimisation-free method for generally parameterising non-linear viscoelastic constitutive models under LAOS.

Data generation
For a given constitutive model to be tested, we first define the upper and lower bounds for each model parameter, and then we fill the model parameter space ψ with a number n p of data points, i.e. each point in this space corresponds to a particular parameterisation of the constitutive model.For some model parameters (i.e. in certain dimensions in the parameter space), it is intuitive to use log-spacing rather than linear spacing for the data points.For example, we know that for β → 1, the response will practically be Newtonian.
We desire that the training and validation data contains more responses for lower values of β, where the model response will be more non-linear, than for higher values of β.Therefore, we use a lower bound for β of 10 −5 and an upper bound of 0.9, and we space the data points logarithmically (with base 10) in this dimension.For the PTT models, we use lower and upper bounds, respectively, of 0 and 0.2 for both ζ and ε.We chose to limit ζ at 0.2 due to the fact that larger values led to highly un-realistic LAOS responses, at least for the viscoelastic materials we are generally concerned with.In any case, we only wish here to assess if the RF algorithm can learn the complex relationship between model parameters and Chebyshev spectra.Using 0 ≤ ζ ≤ 0.2 provided us with a wide range of Lissajous curves (and hence Chebyshev spectra) for this purpose.For the RoLiE-Poly model, we use lower and upper bounds of 1.2 and 100, respectively, for z.And we use lower and upper bounds of 0 and 1, respectively, for β CCR .For both of these parameters, logarithmic (base 10) spacing was used.
We use Latin Hypercube Sampling (LHS) to sample the parameter space [85].This is particularly appealing because it provides better coverage of the space than naïve random sampling, especially when the size of the dataset is relatively small.This is important given the predictive performance of ML models on interpolation tasks is strong, but generally poor under extrapolation.

Random Forest Modelling
The ML algorithm we utilise in this study is the Random Forest (RF) algorithm, which will now be detailed.RFs are bootstrap ensembles of Decision Trees (DTs), which are each identified on subsets of the training data available to the modeller.In the context of this work, DTs are mulitvariate, scalar-valued functions, h : R nz → R.They have a logic structure, which partitions input data into a set of K disjoint regions, {R k } k=1:K .Each disjoint region, R k ⊂ R nz , is associated with its own weak estimator, which generates a prediction given the model input [86].This is outlined by Fig. 1.Typically, the weak estimator is a constant, y k ∈ R, such that the prediction of the DT can be expressed as: where I k (z) : R nz → Z 2 is the indicator function that takes a value of 1 if the input, z, is partitioned into region, R k , and a value of 0 otherwise.This means that given M datapoints to train the DT, D DT = {z i , v i } i=1:M , a region R k is associated with a subset of the inputoutput data, D R k DT ⊂ D DT .This data can be used to determine the the weak estimator, y k , and this is usually done in a non-parametric fashion as [86]: When deployed within the bootstrap aggregation framework provided by the RF, predictions from each of N DTs, are averaged to generate a prediction, ȳ: where the index j denotes a single DT estimator within the RF.Uncertainty estimates can also be gained by estimating the variance of the ensemble's predictions.Due to the interpretable structure of DTs one can also gain measures of the input features' importance to making predictions [87].This is quantified by metrics such as permutation importance [88], and can provide information to the modeller about the calibration of the model provided one has intuition regarding those variables that are strongly correlated with the prediction.
RFs have a number of hyperparameters, which control the subsampling of data for identification of each DT, as well as the complexity of the model itself.In the case of the DT, complexity essentially refers to the definition and depth of the logic structure.This controls the number of disjoint regions and the order of interactions between input features captured in the prediction [86].When viewed in the context of a forest, complexity refers to the the selection of the number of DTs, N .We denote the depth of the trees in the forest (the number of splits that each tree can make) as d max .In the following, we approach the RF  hyperparameter optimisation problem via grid search and 10-fold cross validation.
It is worth noting, in this work we are interested in identifying a multivariate, vectorvalued function to map from Chebyshev polynomial coefficients to the optimal parameterisation of a constitutive model.In order to do this, we construct independent scalar-valued RF models, one for each parameter in the constitutive model.We then combine the independent predictions of each RF model to provide a parameter estimate, given a setting of Chebyshev coefficients.In this way, we allow for a partitioning of the input space that accounts for the different relationships between Chebyshev polynomial coefficients and constitutive model parameters.

Workflow
Here, for clarity, we briefly outline the general workflow we are proposing to create a ML

lPTT and ePTT models
In this section, we present the results for the lPTT and ePTT models.

lPTT model
For both PTT models, it was found from an initial grid search of hyperparameters that the optimal hyperparameters for the RF algorithm were d max = 10 with N = 100.Figure 3 shows, for the training data and validation data, parity plots for the RF predictions of the 3 viscoelastic model parameters (β, ζ, and ε) with varying amounts of training data.
< l a t e x i t s h a 1 _ b a s e 6 4 = " l X h I 0 R g     is given in Table 1.
Note that 90% of the data is used for the regression (blue symbols) and the remaining 10%  3.
In order to test how well the RF predicted parameters predict the rheological behaviour under LAOS, we solve the ODEs for the lPTT model under LAOS (at the same conditions as those used for the training, De = 1 and W i = 10) using both the real parameters and the RF predicted parameters.Of course, for perfect predictions of all the model parameters, there is no need to check the predictions of the rheological behaviour since the real and predicted parameterised constitutive models are identical.In Figure 4, we highlight the accuracy of the predictions by showing the viscous Lissajous curves (stress vs strain rate) for 6 random data points in the validation data set.The pink solid lines are generated using the real model parameters, and the blue dashed lines are generated using the RF predicted parameters.Note that the RF predicted parameters are of course predicted from the real Lissajous curves (pink solid line).Here, we have used the real value for β rather than the RF predicted value since this parameter can be measured directly using a flow curve measurement.However, the RF prediction for β is still highly accurate, as shown in Figure 3, and so the Lissajous curves would still be predicted exceptionally well even with the RF value of β.The results in Figure 4 show that the RF algorithm is able to predict the correct LAOS behaviour even for drastically different Lissajous curves.
To quantitatively assess the LAOS predictions for all validation data points, we compare the Chebyshev coefficients between the real and RF predicted Lissajous curves.In Figure 5, we show the parity plots for all viscous and elastic Chebyshev coefficients for the validation data in Figure 3 for n p = 16 3 .There is very good agreement between the predicted and real model parameters, as highlighted by the values of R 2 displayed in the plots.The RF algorithm is, thus, seemingly able to the map between Chebyshev coefficients and constitutive model parameters with high levels of accuracy. .Note that we have used the real value for β rather than the RF predicted value.The value of β is given in each plot.Each plot also contains a table with the real and RF predicted model parameters.

ePTT model
Next, we turn our attention to the ePTT model.It was found, again, that using hyperparameters of d max = 10 and N = 100 provided optimal performance of the RF algorithm.
Figure 6 shows the model parameter parity plots for the training and validation data for the ePTT model using n p = 16  6.
dictions of the model parameters from the inputted LAOS data.This is reinforced by Figure 7, which visualizes the parity plot between the predicted and actual chebyshev coefficients in validation.
In Figure 8 we show the effect of the RF algorithm hyperparameters (d max and N ) on the predictions of ζ and ε.The data here is all validation data (i.e.not used for training).
n p = 16 3 in all cases.The predictive accuracy does seem to be somewhat sensitive to the hyperparameters used, which is completely expected and is frequently observed in other applications of RF.The accuracy of the predictions of both model parameters is increased as d max is increased (increasing beyond 10 did not yield any further improvement in the predictions).The number of estimators N does not seem to have much impact on the accuracy between 50 and 100, indicating N > 50 is sufficient for this application of RF.It is often found for RF that the predictive error decreases initially as N is increased and then converges to a plateau at a particular value of N .Evidently, N = 50 is beyond this value.
Overall, this highlights that the RF algorithm is a very robust regression method for this application of constitutive model parameterisation, and could potentially be useful in other areas of rheology.

RoLiE-Poly model
In this subsection, we present the results for the RoLiE-Poly model.We have observed already that the RF algorithm is very capable at predicting the lPTT and ePTT model parameters from the Chebyshev spectrum of LAOS responses.We now test the developed workflow/methodology for the more physically complex RoLiE-Poly model.Note that W i and De in Equation ( 12      are only significant provided W i is large enough relative to z [65].In the limit that z → 0 (see Equation ( 6), the RoLiE-Poly model converges to the quasi-linear Oldroyd-B model.at De = 1 and W i = 6.The accumulated strain during the oscillation will be higher in this   10).This figure highlights that the rheology is still predicted with almost perfect accuracy even though some of the model parameters were not predicted well.
case, and the non-Newtonian behaviour will be expected to be more pronounced.In Figure 15, we show the 6 examples (the 6 data points labelled in Figure 10) of the viscous Lissajous curves for the oscillatory extensional flow at De = 1 and W i = 6.For points 1, 5, and 6, the oscillatory extensional rheology is still predicted remarkably well considering the model parameters are not predicted particularly well.For points 2, 3, and 4, it is now clear that the poor parameter predictions do translate to poor predictions of the rheological behaviour in this flow.This highlights two interesting points.Firstly, the RF algorithm, despite not predicting the correct parameters in some cases, can still provide parameter predictions provide the correct rheological behavior even in different flow types and deformation rates.
Secondly, to improve the accuracy of the RF predictions, extensional flow data with strong enough velocity gradients should likely be used in conjunction with, or potentially instead of, shear flow data for training.This may help resolve some of the practical identifiability issues observed, and will be explored in future work.However, unless the user wishes to simulate such a flow using the parameterised model, our results show that the RF algorithm will provide an adequate set of model parameters (even if it is one set of a number of sets).See Figure 10 for the parity plots for the RF parameter predictions.Training was undertaken using LAOS at W i = 1 and W i = 50.The numbers 1-6 in the plots correspond to the data points with poor predictions labelled in Figure 10.This figure highlights that the extensional rheological behaviour is now poorly captured for some of the data points with poor parameter predictions.

Conclusion
We have investigated the use of Random Forest (RF) regression for the prediction of viscoelastic constitutive model parameters from LAOS data.We use the RF regression method to essentially solve the inverse problem for parameterisation of viscoelastic constitu- We investigated the linear and exponential PTT models, as well as the RoLiE-Poly model.
For both versions of the PTT model tested, the RF algorithm was able to predict the model parameters using the input Chebyshev coefficients for a LAOS simulation with high levels of accuracy.The accuracy of the predictions was, naturally, better when the amount of training data was increased.It was found that using a max depth of 10 and 100 estimators for the RF algorithm gave the best model performance.The model performance was found to decrease slightly when the max depth for the algorithm was decreased.However, reducing the number of estimators did not significantly affect the predictive performance, at least within the range investigated.
The RF algorithm was less capable of predicting the model parameters for the RoLiE- Poly model, at least with the hyper-parameter sets we investigated.However, despite not being able to predict all of the data points accurately, the model parameters predicted still represented the rheological behaviour with high levels of accuracy.This was shown to be caused by the fact that multiple sets of model parameters can yield very similar LAOS behaviours.It was found that the shear rheological behaviour was predicted accurately at the conditions used for the training (De = 1 and W i = 50), as well as at different conditions (De = 0.5 and W i = 100), despite the difference between the real and predicted parameter values.We also tested the parameter predictions in a homogeneous oscillatory extensional flow.At De = 1 and W i = 3, the model parameters predicted by the RF algorithm still captured the extensional rheological behaviour well.At De = 1 and W i = 6, however, the extensional rheological was not captured well for some of the data points where the parameter predictions were inaccurate.This suggests combining LAOS and LAOE data (and training at high enough values of W i) might help to improve predictive performance.
In this current study, we have limited ourselves to homogeneous flows (i.e.spatially uniform velocity gradients).To implement this methodology in real-world applications where many materials will exhibit shear banding during the rheolometric testing, the RF model should be trained using simulated rheological data where the flow is allowed to shear-band.
This will involve solving the constitutive models in both space and time, and obtaining the waveform for the stress at the top boundary.However, overall, we have shown that the RF algorithm is capable of learning the complex relationship between non-Newtonian rheology and constitutive model parameters, and therefore has strong potential to aid in modelling and simulation of complex fluid flows, which is of particular use across a wide range of sectors and industries.Particularly, the methodology that we have developed and proposed in this

(
i.e. τ = 0) and are run until the response converges to the limit cycle (i.e. a steady-periodic state is reached).The equations solved in this study are presented in full in Appendix A and Appendix B.The Deborah number quantifies the unsteadiness of the system (i.e. the ratio of a flow time-scale and a material time-scale), whilst the Weissenberg number represents the degree of anisotropy in the flow (i.e. the ratio of elastic and viscous forces).For oscillatory flows, the De − W i space is known as Pipkin space.LAOS corresponds to the region in Pipkin space where the value of De is moderately large and the value of W i is large.In this study, for the lPTT and ePTT models, we use values of De = 1 and W i = 10 for the ML training.We use De = 1 and W i = 50 for the training for the RoLiE-Poly model (this will be discussed later).For the trained ML algorithm to be utilised for fitting of experimental data, this means λ must be measured separately using a SAOS measurement.Only then can ω and γ0 be calculated such that the experimental LAOS data corresponds to the chosen values of De and W i for the training.If we were to train the machine using a fixed values of ω and γ0 , and were to include λ as one of the fitting parameters, many of the responses in the training data where γ/γ 0 = [−1, 1] and γ/ γ0 = [−1, 1] are the normalised strain and normalised strain rate, respectively.G ′ n and G ′′ n are recovered from e n and v n , respectively, as (a) Domain partitioning via a decision tree (b) Decision tree model ensembling

Figure 1 :
Figure 1: Figurative description of the key characteristics of decision tree models: a) partitioning of the domain through the identification of hyperplanes, b) ensembling of the weak estimator provided by each decision tree in the forest.
framework for constitutive model selection and fitting: 1 -Choose a viscoelastic constitutive model. 2 -Fill the space of constitutive model parameters with training and validation points through LHS sampling.3 -Solve, for each point in the training and validation data, the system of ODEs for the constitutive model under LAOS for the limit-cycle (either solving directly for the limit-cycle[89] or solving in time until the response is converged to the limitcycle).4 -Obtain e n and v n for all data, ensuring that n is large enough such that the most non-linear response in the data set is well-represented by the truncated series of Chebyshev polynomials T n .5 -Train the RF model to correlate the Chebyshev coefficients, e n and v n , with constitutive model parameters.This is detailed by Fig. 2a.Our method essentially solves an inverse problem for model parameterisation.Once steps 1 through to 5 have been completed, one then only needs to obtain some experimental LAOS data for a given material at De = 1 and W i = 10 (or any other chosen values of De and W i used for the training) and the trained RF algorithm will provide the values of the model parameters ψ which best fits the data.Steps 1 through to 5 only need to be completed once for a given constitutive model, whilst numerous sets of experimental data for different materials can be analysed almost instantly by the trained RF model, as depicted in Fig. 2b.For the task of constitutive model selection, one need only repeat steps 1 through 5 for a set of constitutive models.If experimental LAOS data is analysed by each trained machine (i.e.for each viscoelastic model), the user can then choose the best fitting model qualitatively, or some quantitative error can be used to select the most appropriate model.Note that we have chosen LAOS as rheometry protocol, but there is no reason this framework could not be extended for other rheometry protocols such as flow curves or CaBER tests etc.(a) Workflow of model construction (b) Model inference

Figure 2 :
Figure 2: Figurative description of a) the workflow for model construction, and b) model inference when presented with a new material.

2 <
7 T w 5 r 8 7 b r D X j z G f 2 4 Q e c 9 y + 6 T I + n < / l a t e x i t > n p = 16 3 l a t e x i t s h a 1 _ b a s e 6 4 = " U X 6 Y p W / n

2 <
r g Z D N 7 8 A P O + x e 0 N I + j < / l a t e x i t > n p = 12 3 l a t e x i t s h a 1 _ b a s e 6 4 = " D c W i a 4 q n

2 <
j P 8 O I o 5 8 l 5 d d 5 m r R l n P r M P P + C 8 f w F L 0 4 9 u < / l a t e x i t > l a t e x i t s h a 1 _ b a s e 6 4 = " 1 A D e e y H W K

2 <
e H G U 8 + S 8 O m + z 1 g U n m 9 m D H 3 D e v w B F u 4 9 q < / l a t e x i t > l a t e x i t s h a 1 _ b a s e 6 4 = " Q T E 3 O Y k U 0 J w r j D n E m S s u m 9 6 S F B I = " > A A A B 7 H i c d V B N S 8 N A E N 3 4 W e t X 1 a O X x S J 4 K k k p f t y K X j x W M G 2 h D W W z n b R L N 5 u w O x F K 6 W / w 4 k E R r / 4 g b / 4 b t 2 2 E K v p g 4 P H e D D P z w l Q K g 6 7 7 6 a y s r q 1 v b B a 2 i t s 7 u 3 v 7 p Y P D p k k y z c Hn i U x 0 O 2 Q G p F D g o 0 A J 7 V Q D i 0 M J r X B 0 M / N b D 6 C N S N Q 9 j l M I Y j Z Q I h K c o Z X 8 b g j I e q W y W 3 H n o M v E 8 2 r V K + r l S p n k a P R K H 9 1 + w r M Y F H L J j O l 4 b o r B h G k U X M K 0 2 M 0 M p I y P 2 A A 6 l i o W g w k m 8 2 O n 9 N Q q f R o l 2 p Z C O l e X J y Y s N m Y c h 7 Y z Z j g 0 v 7 2 Z + J f X y T C 6 D C Z C p R m C 4 o t F U S Y p J n T 2 O e 0 L D R z l 2 B L G t b C 3 U j 5 k m n G 0 + R R t C N + f 0 v 9 J s 1 r x z i v e X a 1 c v 8 7 j K J B j c k L O i E c u S J3 c k g b x C S e C P J J n 8 u I o 5 8 l 5 d d 4 W r S t O P n N E f s B 5 / w L d t o 6 7 < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " e A S 5 L d o g l A L U 4 C L L G h Z 8 s L F K d x M = " > A A A B 7 H i c d V D L S g N B E O y N r x h f U Y 9 e B o P g K e y G 4 O M W 9 O I x g p s E k i X M T i b J k N n Z Z a Z X i C H f 4 M W D I l 7 9 I G / + j Z N k h S h a 0 F B U d d P d F S Z S G H T d T y e 3 s r q 2 v p H f L G x t 7 + z u F f c P G i Z O N e M + i 2 W s W y E 1 X A r F f R Q o e S v R n E a h 5 M 1 w d D 3 z m / d c G x G r O x w n P I j o Q I m + Y B S t 5 H c e O N J u s e S W 3 T n I M v G 8 a u W S e J l S g g z 1 b v G j 0 4 t Z G n G F T F J j 2 p 6 b Y D C h G g W T f F r o p I Y n l I 3 o g L c t V T T i J p j M j 5 2 S E 6 v 0 S D / W t h S S u b o 8 M a G R M e M o t J 0 R x a H 5 7 c 3 E v 7 x 2 i v 2 L Y C J U k i J X b L G o n 0 q C M Z l 9 T n p C c 4 Z y b A l l W t h b C R t S T R n a f A o 2 h O 9 P y f + k U S l 7 Z 2 X v t l q q X W V x 5 O E I j u E U P D i H G t x A H X x g I O A R n u H F U c 6 T 8 + q 8 L V p z T j Z z C D / g v H 8 B A m 2 O 0 w = = < / l a t e x i t > ⇣ < l a t e x i t s h a 1 _ b a s e 6 4 = " 3 n S s A K W k 3 a E b U S q a x c a c P u 6 T V H E = " > A A A B 8 n i c d V D L S g N B E J y N r x h f U Y 9 e B o P g K e y G 4 O M W 9 O I x g n n A Z g m z k 9 l k y O z M M t M b C E s + w 4 s H R b z 6 N d 7 8 G y f J C l G 0 o K G o 6 q a 7 K 0 w E N + C 6 n 0 5 h b X 1 j c 6 u 4 X d r Z 3 d s / K B 8 e t Y 1 K N W U t q o T S 3 Z A Y J r h k L e A g W D f R j M S h Y J 1 w f D v 3 O x O m D V f y A a Y J C 2 I y l D z i l I C V / N 6 E a J Y Y L p T s l y t u 1 V 0 A r x L P q 9 e u s Z c r F Z S j 2 S 9 / 9 A a K p j G T Q A U x x v f c B I K M a O B U s F m p l x q W E D o m Q + Z b K k n M T J A t T p 7 h M 6 s M c K S 0 L Q l 4 o a 5 O Z C Q 2 Z h q H t j M m M D K / v b n 4 l + e n E F 0 F G Z d J C k z S 5 a I o F R g U n v + P B 1 w z C m J q C a G a 2 1 s x H R F N K N i U S j a E 7 0 / x / 6 R d q 3 o X V e + + X m n c 5 H E U 0 Q k 6 R e f I Q 5 e o g e 5 Q E 7 U Q R Q o 9 o m f 0 4 o D z 5 L w 6 b 8 v W g p P P H K M f c N 6 / A N O v k Z 4 = < / l a t e x i t > " < l a t e x i t s h a 1 _ b a s e 6 4 = " Z h 1 E e h Y B y A Z L m F f B A Z W K O P a e l c 4 = " > A A A C L n i c d V D L S g M x F M 3 4 t r 6 q L t 0 E i 6 A L y 0 w p P n a i C C 4 V b C u 0 p W Q y t 2 0 w 8 y C 5 U 1 q G + S I 3 / o o u B B V x 6 2 e Y 1 i l

Figure 3 :
Figure 3: Parity plots (real versus predicted) for lPTT model parameters.Hyperparameters: d max = 10 and N = 100.Grey shaded area represents ±10% error.For each plot (training and validation) the value of R 2 is given in Table1.

Figure 4 :
Figure 4: Examples of viscous Lissajous curve (stress vs strain rate) predictions for lPTT model with n p = 163 .Note that we have used the real value for β rather than the RF predicted value.The value of β is given in each plot.Each plot also contains a table with the real and RF predicted model parameters.

Figure 5 :
Figure 5: Parity plots (real vs predicted) for Chebyshev coefficients for lPTT model with n p = 16 3 .This data corresponds to the validation data shown in Figure 3.

Figure 6 :
Figure 6: Parity plots for training and validation data for the ePTT model parameters.RF algorithm was trained at De = 1 and W i = 10.RF hyperparameters: d max = 10 and N = 100.n p = 16 3 .

Figure 7 :
Figure 7: Parity plots (real vs predicted) for Chebyshev coefficients for ePTT model with n p = 16 3 .This data corresponds to the validation data shown in Figure 6.
) are based on the reptation relaxation time, and not on the Rouse relaxation time.If instead we base W i and De on the Rouse relaxation time, the value of λ R would need to be known for the material beforehand.Basing W i and De on the reptation relaxation time λ simplifies matters as λ can be determined fairly easily from a SAOS measurement.We train the RF algorithm for the RoLiE-Poly model under LAOS at De = 1 and W i = 50.A larger value of W i is used for this model due to the fact the effects of Rouse relaxation (and hence the sensitivity of the Chebyshev spectra on the parameter z) < l a t e x i t s h a 1 _ b a s e 6 4 = " V 6 h U G M 1 a O 7 Y c j j 7 0 V V e J t c v d g 7 k = " > A A A B 6 n i c d V D L S g N B E O y N r x h f U Y 9 e B o P g K e y E 4 O M W 9 O I x o n l A s o T Z y W w y Z H Z 2 m Z k V w p J P 8 O J B E a 9 + k T f / x t l k h S h a 0 F B U d d P d 5 c e C a + O 6 n 0 5 h Z X V t f a O 4 W d r a 3 t n d K + 8 f t H W U K M p a N B K R 6 v p E M 8 E l a x l u B O v G i p H Q F 6 z j T 6 4 z v / P A l O a R v D f T m H k h G U k e c E q M l e 6 w 6 w 7 K F b f q z o G W C c b 1 2 i X C u V K B H M 1 B + a M / j G g S M m m o I F r 3 s B s b L y X K c C r Y r N R P N I s J n Z A R 6 1 k q S c i 0 l 8 5 P n a E T q w x R E C l b 0 q C 5 u j y R k l D r a e j b z p C Y s f 7 t Z e J f X i 8 x w Y W X c h k n h k m 6 W B Q k A p k I Z X + j I V e M G j G 1 h F D F 7 a 2 I j o k i 1 N h 0 S j a E 7 0 / R / 6 R d q + K z K r 6 t V x p X e R x F O I J j O A U M 5 9 C A G 2 h C C y i M 4 B G e 4 c U R z p P z 6 r w t W g t O P n M I P + C 8 f w F w a I 1 A < / l a t e x i t > 100 < l a t e x i t s h a 1 _ b a s e 6 4 = " 0 H 6 x u b W / n t w s 7 u 3 v 5 B 8 f C o r a J E M m y x S E S y 6 1 G F g o f Y 0 l w L 7 M Y S a e A J 7 H i T m 7 n f e U C p e B T e 6 W m M b k B H I f c 5 o 9 p I z d q g 6 x u b W / n t w s 7 u 3 v 5 B 8 f C o r a J E M m y x S E S y 6 1 G F g o f Y 0 l w L 7 M Y S a e A J 7 H i T m 7 n f e U C p e B T e 6 W m M b k B H I f c 5 o 9 p I z d q g

8 <
6 W 7 b m r G z m G H 7 A e v 8 C n x m M 0 w = = < / l a t e x i t > l a t e x i t s h a 1 _ b a s e 6 4 = " / F 3 0 B j m / I E w t w c c a t i u 3 b W h 1 g 6 8= " > A A A B 6 X i c d V D L S g N B E O y N r x h f U Y 9 e B o P g K e y E 4 O M W 9 O I x i n l A s o T Z y W w y Z H Z 2 m Z k V w p I / 8 O J B E a / + k T f / x t l k h S h a 0 F B U d d P d 5 c e C a + O 6 n 0 5 h Z X V t f a O 4 W d r a 3 t n d K + 8 f t H W U K M p a N B K R 6 v p E M 8 E l a x l u B O v G i p H Q F 6 z j T 6 4 z v / P A l O a R v D f T m H k h G U k e c E q M l e 6 w Oy h X 3 K o 7 B 1 o m G N d r l w j n S g V y N A f l j / 4 w o k n I p K G C a N 3 D b m y 8 l C j D q W C z U j / R L C Z 0 Q k a s Z 6 k k I d N e O r 9 0 h k 6 s M k R B p G x J g + b q 8 3 t y I 6 J o p Q Y 8 M p 2 R C + P 0 X / k 3 a t i s + q + L Z e a V z l c R T h C I 7 h F D C c Q w N u o A k t o B D A I z z D i z N x n p x X 5 2 3 R W n D y m U P 4 A e f 9 C w J h j Q Y = < / l a t e x i t > 10 < l a t e x i t s h a 1 _ b a s e 6 4 = " 5 R + q L H e Z L + r R O e 4 3 L k t 5 a P A 3 R U k = " > A A A B + X i c d V D L S s N A F J 3 4 r P U V d e l m s A i u S l K K j 1 3 R j c s K 9 g F t C J P J p B 0 6 M w k z k 2 I J + R M 3 L h R x 6 5 + 4 8 2 + c t B G q 6 I G B w z n 3 c s + c I G F U a c f 5 t F Z W 1 9 Y 3 N i t b 1 e 2 d 3 b 1 9 + + C w q + J U Y t L B M Y t l P 0 C K M C p I R 1 P N S D + R B P G A k V 4 w u S n 8 3 p R I R W N x r 2 c J 8 T g a C R p R j L S R f N s O / W z I k R 5 L n n H 0 k O e + X X P q z h x w m b h u s 3 E F 3 V K p g R J t 3 / 4 Y h j F O O R E a M 6 T U w H U S 7 W V I a o o Z y a v D V J E E 4 Q k a k Y G h A n G i v G y e P I e n R g l h F E v z h I Z z d X k j Q 1 y p G Q / M Z B F S / f Y K 8 S 9 v k O r o 0 s u o S F J N B F 4 c i l I G d Q y L G m B I J c G a z Q x B W F K T F e I x k g h r U 1 b V l P D 9 U / g / 6 T b q 7 n n d v W v W W t d l H R V w D E 7 A G X D B B W i B W 9 A G H Y D B F D y C Z / B i Z d a T 9 W q 9 L U Z X r H L n C P y A 9 f 4 F h z e U Q A = = < / l a t e x i t > d max < l a t e x i t s h a 1 _ b a s e 6 4 = " Q Z 8 Q n D 4 h c o B K L k p l D q 4 t g / n i J 3 M = " > A A A B 6 H i c d V D J S g N B E K 2 J W 4 x b 1 K O X x i B 4 C j M h u N y C X j x J A m a B Z A g 9 n Z q k T c 9 C d 4 8 Q h n y B F w + K e P W T v P k 3 d p I R o u i D g s d 7 V V T V 8 2 L B l b b t T y u 3 s r q 2 v p H f L G x t 7 + z u F f c P W i p K J M M m i 0 Q k O x 5 V K H i I T c 2 1 w E 4 s k Q a e w L Y 3 v p 7 5 7 Q e

Figure 8 :
Figure 8: Parity plots for ζ and ε predictions for ePTT model with various RF hyper-parameters.This data is validation data (i.e.not used for training).n p = 16 3 .

Figures 9
Figures 9 and 10 show the model parameter parity plots for the training and validation data, respectively, for the RoLiE-Poly model.Here, we used n p = 8 3 .We also found the predictions were most accurate using d max = 20 and N = 200 for the RF algorithm, which is a more complex structure than that which was used for the lPTT and ePTT models.Whilst the predictive accuracy for the training data seems reasonably high for both z and β CCR , the accuracy of the predictions of the validation data is significantly reduced for β CCR .At this point, we must question whether this inaccuracy in terms of model parameter prediction actually translates into inaccuracy of the prediction of the rheological behaviour (which could suggest the RF algorithm is not adequate for this particular constitutive model).To

Figure 9 :
Figure 9: Parity plots for training data for the RoLiE-Poly model parameters.RF algorithm was trained at De = 1 and W i = 50.RF hyperparameters: d max = 20 and N = 200.n p = 8 3 .

Figure 10 :
Figure 10: Parity plots for validation data for the RoLiE-Poly model parameters.RF algorithm was trained at De = 1 and W i = 50.RF hyperparameters: d max = 20 and N = 200.n p = 8 3 .

Figure 11 :
Figure 11: Parity plots for the first 5 Chebyshev coefficients for the RoLiE-Poly model under LAOS at De = 1 and W i = 50 (i.e. the same conditions as that used for the training).Note how the prediction of the shear rheology is highly accurate even though the prediction of model parameters is poor in some cases.

Figure 12 :
Figure 12: Examples of viscous Lissajous curves (stress vs strain rate) for RoLiE-Poly model under LAOS at De = 1 and W i = 50.The 6 plots correspond to the labelled data points with poor predictions in Figure 10.This figure highlights clearly that more than one set of model parameters can give very similar shear rheology.

Figure 14 shows the parity plots for the first 5
Figure 14 shows the parity plots for the first 5 Chebyshev coefficients for an oscillatory extensional flow at De = 1 and W i = 3.It is clear that, even in this extensional flow, the RoLiE-Poly model parameters predicted by the RF algorithm trained using LAOS data at De = 1 and W i = 50 (see Figure 10) still work very well at predicting the rheological behavior.Finally, we now check if the predicted parameters in Figure 10 (based on training data for LAOS at De = 1 and W i = 50) work at predicting homogeneous oscillatory extensional flow

Figure 13 :
Figure 13: Parity plots for the first 5 Chebyshev coefficients for RoLiE-Poly model under LAOS at De = 0.5 and W i = 100.The parameter predictions were made for LAOS data at De = 1 and W i = 50 (see Figure10).This figure highlights that the shear rheology is still predicted remarkably well even in a flow which was not used in the training, despite the seemingly poor parameter predictions in some cases.

Figure 14 :
Figure 14: Parity plots for the first 5 Chebyshev coefficients for RoLiE-Poly model under LAOE at De = 1 and W i = 3. Parameter predictions were made using LAOS data at De = 1 and W i = 50 (see Figure10).This figure highlights that the rheology is still predicted with almost perfect accuracy even though some of the model parameters were not predicted well.

Figure 15 :
Figure 15: Examples of viscous Lissajous curves for RoLiE-Poly model under LAOE at De = 1 and W i = 6.See Figure10for the parity plots for the RF parameter predictions.Training was undertaken using LAOS at W i = 1 and W i = 50.The numbers 1-6 in the plots correspond to the data points with poor predictions labelled in Figure10.This figure highlights that the extensional rheological behaviour is now poorly captured for some of the data points with poor parameter predictions.
tive models.For a given constitutive model, we solve the model under oscillatory shear flow at a given value of De and W i for a wide range of model parameters.The RF regression is then used to create a black-box type function mapping from Chebyshev spectra to model parameters.This tool can then be used to rapidly parameterise viscoelastic models.Previously, one would need to solve the ODEs for each constitutive model numerically in each iteration of an optimisation-based fitting procedure, which currently hinders the widespread use of digital modelling for viscoelastic fluid flows.

2A 12 A 22 A 12 11 +2A 12 A 22 A 12
investigation could be used to strong effect in industry to drive a change from wasteful trialand-error experimentation towards digital modelling for robust and sustainable process scaleup and optimisation.Moreover, since our methodology can allow for rapid parameterisation of constitutive models, one can quickly and accurately model the rheological behaviour of new products made with novel and sustainable raw ingredients.Appendix A. ODEs for PTT/ePTT model under LAOS Noting that A 13 = A 23 = 0 in simple shear flow, and that A is a symmetric second-order tensor, Equation(10) can be written out in full as cos( t) A 22 A 12 cos( t) 0 A 22 )cos( t) 0 (A 11 + A 22 )cos( t) 2A 12 cos( tform of D(A) depends on the specific PTT model used (i.e.PTT or ePTT).See Equation (11) for the forms of D(A) for the PTT and ePTT models.Note that regardless of the form of D(A), dA 33 / d t = 0 at all times, and so, for initial conditions of A = I, A 33 = 1 at all times.Appendix B. ODEs for RoLiE-Poly model under LAOS and LAOE To solve the RoLiE-Poly model under LAOS, Equation (12) can be written out in full as cos( t) A 22 A 12 cos( t) 0

Table 1 :
red symbols) is used to validate the predictive accuracy of the fit (i.e. it is not used in the regression).The training data is predicted well by the RF algorithm, even when relatively limited amounts of training data are used.The prediction of the validation data seems to be more sensitive to the amount of training data used, with the accuracy of interpolation increasing with dataset size.With n p = 16 3 (i.e. an average of 16 data points in each Values of R 2 for each of the parity plots in Figure dimension of the parameter space), the prediction of the validation data is highly accurate, with only a few outliers.The values of R 2 for each fit are displayed in Table1.The values of R 2 increase with dataset size in all cases.

Table 2 :
Values of R 2 for each of the parity plots in Figure 3. Again, the RF algorithm is able to make highly accurate pre-

Table 3 :
Values of R 2 for each of the parity plots in Figure8.