Model-inferred mechanisms of liver function from magnetic resonance imaging data: Validation and variation across a clinically relevant cohort

Estimation of liver function is important to monitor progression of chronic liver disease (CLD). A promising method is magnetic resonance imaging (MRI) combined with gadoxetate, a liver-specific contrast agent. For this method, we have previously developed a model for an average healthy human. Herein, we extended this model, by combining it with a patient-specific non-linear mixed-effects modeling framework. We validated the model by recruiting 100 patients with CLD of varying severity and etiologies. The model explained all MRI data and adequately predicted both timepoints saved for validation and gadoxetate concentrations in both plasma and biopsies. The validated model provides a new and deeper look into how the mechanisms of liver function vary across a wide variety of liver diseases. The basic mechanisms remain the same, but increasing fibrosis reduces uptake and increases excretion of gadoxetate. These mechanisms are shared across many liver functions and can now be estimated from standard clinical images.


Introduction
Measurements of liver function are important to determine the optimal therapeutic strategy in cases of severe chronic liver disease (CLD), and for prevention of post-treatment hepatic failure [1]. Estimating liver function is also important when planning surgical treatment, because postoperative hepatic function insufficiency is associated with both morbidity and mortality [2]. Sensitive biomarkers for liver function would also be useful for the management and early identification of drug-induced liver injury (DILI), which is a leading cause of acute liver failure [3] and also of drugs being withdrawn from the market [4].
Different options for estimation of liver function are used clinically today, but they all have some shortcomings. For instance, the primary clinical screening tool for liver injury in clinical trials, serum alanine aminotransferase (ALT), neither indicates the severity of liver injury nor estimates liver function [5]. In addition, ALT (and other transaminases) only indicates injury at a late stage when substantial tissue damage has already occurred [6]. Alternative methods for measuring liver function include Indocyanine-Green 15 retention rate (ICGR15) [7] and Tc-99m galactosyl human serum albumin (GSA) [8], which both measure the liver´s capacity to clear substances from the blood, and the galactose breath test [9], which measures the liver's metabolic capacity. These are all examples of global indicators that provide indirect measurements of liver function. Furthermore, GSA involves the injection of a radioactive isotope, which from a practical point of view is cumbersome and suffers from limited spatial and temporal resolution, and importantly is not widely available. In summary, biomarkers that are sensitive and respond early to changes in liver function would be beneficial both in a clinical setting as well as in the pharmaceutical industry and regulatory agencies [10,11]. Because of the low quality of available measures of liver function, little is known about the more detailed mechanisms of liver function, and about how these mechanisms change at different stages of CLD.
One of the most promising state-of-the-art methods for assessing clearance-based liver function is to use magnetic resonance imaging (MRI) in combination with the liver-specific contrast agent gadoxetate (Bayer Schering Pharma, Berlin, Germany). This method has the potential to allow investigation of liver function at a regional level without the need for any ionizing radiation. As a liver-specific contrast agent, gadoxetate is actively accumulated within hepatocytes [12] and is commonly used for characterizing lesions. The gadoxetate uptake is mainly associated with the function of the organic anion-transporting polypeptide 1 (OATP1) family of transporters [13]. The subsequent excretion into the bile occurs via the multidrug resistance-associated protein 2 (MRP2) transporter [14]. These transporter proteins have important functions, such as mediating the clearance of bilirubin, toxins, drugs, and other organic solutes [15]. For these reasons, gadoxetate MRI has the potential to facilitate study of the aspect of liver function that has to do with these uptake and excretion processes.
There are a number of previous studies which indicate the high potential of gadoxetate MRI as a biomarker for liver function. Early studies established a correlation between gadoxetate MRI and common clinical markers for liver function [16]. A more recent study demonstrates the ability of gadoxetate MRI to predict liver failure after surgery [17]. Furthermore, a recent prospective follow-up study on patients with primary sclerosing cholangitis showed that quantitative gadoxetate MRI could predict solid clinical endpoints, such as liver transplantation, cholangiocarcinoma, and liver related death [18]. The approach used in that study separated the population into two clear groups, one with >90% survival and one with <60% survival. Finally, in rats, similar analyses have shown promising results of using gadoxetate MRI as a biomarker for DILI [19,20].
All the above clinical studies have used a simple analysis, called relative enhancement, which simply compares signal intensity before and after gadoxetate injection; therefore, the studies could not elucidate the detailed mechanisms of liver function. More advanced approaches can make use of the information in an entire time series of images and use this to extract different gadoxetate transport rates. Such methods require the use of mathematical models. One common class of such models is the perfusion-based model [21,22]. These models require images with a very high temporal resolution, which limits the spatial resolution, meaning that the images cannot be used for conventional radiological reading. An alternative to perfusion-based models is models based on simulation and optimization of ordinary differential equations. These models do not need such high temporal resolution, but can utilize the high-spatial low-temporal resolution images used in clinical MRI protocols today. One such model, describing how gadoxetate is distributed in the whole body, as well as taken up and excreted by the liver, was described by Forsgren and colleagues [23]. While the Forsgren model can arguably be viewed as the most realistic gadoxetate uptake model, it has several shortcomings. The model is the most realistic in the sense that it is the only compartment model to describe the dynamics in liver, blood, spleen, and extracellular extravascular space. Furthermore, the model has been validated in healthy humans with gadoxetate doses up to 20 times higher than the clinical dose used for model training. On the other hand, the model has not been personalized. One effective approach used for such personalization is non-linear mixed-effects modeling (NLME; Fig 1A and 1B). NLME is effective because it can deal with low-informative data [24], which could allow for fewer images, shorter clinical examinations, and more reliable parameters. However, NLME has not been applied to any gadoxetate uptake model. Another shortcoming with the Forsgren model is that it has not been tested in patients with liver diseases. Therefore, it is not known how the mechanisms in the model vary across different stages and etiologies of CLD. Furthermore, the model has not been validated with respect to other important independent measures such as biopsies and post-procedural blood samples. These limitations have been due to the lack of relevant data.
In this study, a new modeling framework is created (Fig 1C) by combining i) state-of-theart MRI processing of high resolution gadoxetate-enhanced time series [25,26]; ii) the mechanistic gadoxetate uptake model, [23]; and iii) NLME model parametrization methods. To validate the model, a large clinical study was conducted by recruiting 100 patients, who were subjected to a variety of different measurements. These new data validate the model in three different ways. First, the extended model can describe patient variation across all stages of CLD. Second, the model can predict quantified images from later time points, which were not included in the estimation data; this implies the possibility of a shorter clinical protocol. Third, the model can predict independent validation data from blood samples and biopsies. Finally, it is demonstrated how the estimated model properties, such as OAPT1 and MRP2 transport rates, change with varying severity of CLD. These results point to a new avenue for estimation of liver function. The 'non-linear mixed effect' (NLME) method. The shapes of the population parameter distributions are first postulated, then distributions are parametrized to all datasets, and finally each patient is parametrized following the population parameter distributions with a joint-likelihood function. This allows NLME to use the global information obtained from an entire cohort, which is utilized to improve model parametrization for each individual subject. (C) The framework consists of gadoxetate-enhanced images, which are processed to obtain gadoxetate concentrations in the liver. A mechanistic systems pharmacology model, describing how gadoxetate is taken up and excreted, is fitted to the data using NLME parameterization to obtain reliable pharmacokinetic parameters, which can be used as biomarkers for liver function. (D) Schematic diagram of the mechanistic model for quantification of liver transporter function. Rounded rectangles represent compartments in the model, with arrows indicating gadoxetate fluxes between blood plasma and extracellular extravascular space (EES; k diff ), elimination via the kidneys to urine (CLr), uptake into hepatocytes (k ph ), back-flux from hepatocytes into blood plasma (k hp ), and excretion from hepatocytes into bile (k hb ). Gadoxetate injection into the blood-plasma compartment is indicated in blue. Gray areas show the signal part of the model in which compartmental gadoxetate concentrations are combined to predict the information in the gadoxetateenhanced MRI time series. https://doi.org/10.1371/journal.pcbi.1007157.g001

Study population
A total of 100 patients, with suspected CLD were included in the study and each underwent an MRI examination followed by two liver biopsies. Of these, eight patients were excluded because they aborted the examination and one patient was excluded due to poor data quality, giving a final cohort of 91 patients. The demographic characteristics and clinical diagnoses of the final study population are presented in Table 1. None of the included patients showed signs of hepatic decompensation.

The model framework is applicable to all stages of chronic liver disease
All patients were given an injection of gadoxetate and several MR-images were acquired over a period of 30 minutes for each patient. Time series for each patient were made by quantifying the change in R 1 relaxation rate (ΔR 1 ), which is proportional to gadoxetate concentration [27], in the liver and spleen. These time series were used to parametrize the mechanistic model ( Fig  1D) for each individual patient using the NLME method. That is, the NLME algorithm was used to identify optimal model parameter values for each patient (e.g., describing the function of OATP1 and MRP2) such that the model predictions of the MRI data in the liver and the spleen matched the measured MRI data. Fig 2A and 2B shows the observed values and the model predictions for two typical patients, one without fibrosis (F0) and one with cirrhosis (F4). Goodness-of-fit for was assessed each patient. In all but one of the 91 patients, the model predicted the observed MRI data without being rejected by the goodness-of-fit test. This indicated that the same mechanisms were at play in all stages and etiologies of CLD, and that only the quantitative details were different.

Measurements of gadoxetate concentrations in blood and biopsy samples validates the model
shows a comparison between the gadolinium concentrations (the paramagnetic nucleus responsible for the contrast enhancement in gadoxetate) measured in the blood samples and biopsies and the gadolinium concentrations predicted by the mathematical model. At a group level, there were no differences between the gadolinium concentrations predicted by the mathematical model and the concentrations measured using inductively coupled plasma sector field mass spectrometry (ICP-SFMS) (Fig 2C and 2E). At an individual level, there was a moderate Lin's concordance correlation between the predicted and measured gadoxetate concentrations in blood samples (r c = 0.62; Fig 2D). However, there was only a low correlation in liver biopsy samples (r c = 0.31; Fig 2F).

NLME enables short-duration gadoxetate MRI examinations
To assess whether the NLME-model parameterization method outperforms the standard twostage approach (STS), and whether it is possible to reduce the examination time to 10 min from gadoxetate injection, the dataset for each patient was divided into two parts: all data points from within 10 min after gadoxetate injection were used as estimation data, and the validation data included the remaining later time points. Both the NLME and STS parametrization methods were used to parametrize the model with the estimation data. The resulting model predictions were compared to the estimation data and validation data, and goodnessof-fit was assessed for each patient individually. The NLME parameterization was implemented using a 'leave-one-out' design, meaning that one data set at a time was truncated while all other data sets were complete. This design was to demonstrate how NLME could be used in a clinical situation where the distributions of the parameters have already been determined in clinical studies. Four patients had insufficient data for this analysis (there was no available data after 10 min), so the test included data from 87 patients.
Both the NLME and STS methods produced model predictions that passed the statistical test for goodness-of-fit for the estimation data in all 87 patients. The NLME method predicted the data in the validation dataset in 81 patients (93%) without being rejected, compared with 37 patients (43%) with the STS method. Fig 3A and 3B shows an example of a patient for whom the NLME method could predict the validation data, while the STS method failed, particularly when predicting the liver signal.
Significant Lin's concordance correlation was observed between predicted and measured blood plasma gadoxetate concentrations when using the NLME parametrization to data from the first 10 min (r c = 0.60). Notably, there was no significant difference between the predicted Model-Inferred Mechanisms of Liver Function and measured blood plasma concentrations with prediction based on parametrization to data from the first 10 min. Correlation between predicted and measured blood plasma gadoxetate concentrations based on the STS parametrization was not as strong as with the NLME parametrization (r c = 0.24).
The array of model parameters for each patient, estimated by the NLME parametrization method using data from the first 10 min, were compared with the same parameters estimated by the NLME parametrization method using the full dataset (Fig 3C-3H). The OATP function, i.e. hepatocyte uptake rate (k ph ; Fig 3C), was unaffected by the sparse (�10 min) estimation shows a simulation of the same patient after STS parameterization. The natural logarithms of the model parameter values hepatocyte uptake rate (k ph ; C-D), hepatocyte elimination rate (k hb ; E-F), and hepatocyte to plasma flux (k hp ; G-H), estimated with the NLME method using the full data set, were compared with the parametrization using data up to 10 minutes. Significant differences were observed for k hp . The solid line is a linear regression and the dotted lines are 95% confidence intervals.
https://doi.org/10.1371/journal.pcbi.1007157.g003 with a highly significant correlation between sparse and full estimations (Fig 3D). At a group level, the MRP2 function, i.e. hepatocyte elimination rate (k hb ; Fig 3E), was unaffected by the sparse estimation data. However, the individual values for each patient were affected and hence the correlation was poor (Fig 3F). The hepatocyte to plasma flux (k hp ; Fig 3G) was significantly affected by the sparse estimation data, with a non-significant correlation (Fig 3H).

Hepatic accumulation of gadoxetate is significantly affected in patients with fibrosis
In the liver, ΔR 1 is lower in patients with increased fibrosis stage (Fig 4A). In the spleen, ΔR 1 appears to be unaffected (Fig 4B). Furthermore, the hepatocyte uptake rate of gadoxetate by OATP1, differentiated between fibrosis stages, is shown in Fig 4C. The figure shows that the uptake is decreased in patients with advanced fibrosis and cirrhosis. Furthermore, the hepatocyte excretion rates were differentiated between cirrhosis and both advanced fibrosis and mild fibrosis (Fig 4D). Finally, Table 2 shows a confusion matrix of the ability of k ph to identify patients with advanced fibrosis, i.e. �F3, when using a cut-off of 0.00198 s -1 .

Discussion
A new next-generation framework to measure liver function using MRI was developed. This framework was successfully applied and validated with liver biopsy and blood samples in a clinical setting in a diverse cohort. More specifically, the model could describe data from all patients and it was able to adequately predict gadoxetate levels in both blood plasma and biopsies. Furthermore, the introduced NLME method for parameter estimation is more robust on shorter protocols, compared to the previously used STS method; this allows for shorter examinations. Finally, the validated model allowed for the examination of how the biomedical mechanisms for clearance-based liver function vary across different stages of CLD.
The new framework is validated in several different ways. First, the model was validated by the fact that it could be used to extrapolate data points not used for parameter estimation (example in Fig 3A and 3B). Second, the model could also be fitted to data from patients with a wide variety of different chronic liver diseases (Table 1). Third, the concentrations of gadoxetate in both liver biopsy and blood samples were measured by ICP-SFMS (Fig 2C-2F). On a group level, there was no significant difference between the predicted and measured gadoxetate concentrations. On an individual level, there was a moderate correlation in the blood, while there was a low individual correlation in the biopsy. This lower correlation may be due to contamination of the biopsy samples from gadoxetate in the bile ducts. These correlations do not necessarily mean that our modeling framework should be used for individual predictions of gadoxetate concentrations. However, the results do show that our modeling framework produces realistic parameter values. Last, the framework was also validated by the fact that the model parameters corresponding to OATP1 and MRP2 functions varied as expected in the patient population (Fig 4).
The variation of the parameters across the patient population requires some additional remarks. First, the population covered a wide spectrum of both etiologies (Table 1) and severity and we used liver fibrosis to indicate severity. Second, hepatic uptake via OATP1 transporters (k ph ) decreased significantly with increasing levels of fibrosis. Similar results were previously obtained in studies with perfusion-style model frameworks [28,29]. Possible reasons for reduction of OATP1 function include restricted access of gadoxetate to the hepatocytes, reduction in the number of functional hepatocytes, and competitive inhibition. Third, with respect to hepatic excretion via MRP2 (k hb ), a significantly higher excretion rate was estimated in patients with cirrhosis, compared to patients with lower levels of fibrosis. Other studies have reported mixed results. Previously, a small study using a perfusion model indicated the opposite, that gadoxetate excretion decrease in cirrhotic humans [30]. Therefore, it is interesting to look at studies of gene expression. Some studies on cirrhotic rats have shown an upregulation of MRP2 [31][32][33], which is consistent with our findings. In contrast, one study found a lower expression of MRP2 in rats with fibrosis [34]. In humans, the picture is also mixed and CLD has been found to be associated with either no difference [35], a slight increase [36], or in some CLD etiologies, a decrease [37] in MRP2 expression.
By using the NLME parameterization scheme, the time needed for MRI-examinations could be reduced, while still being able to estimate reliable parameters, as well as predicting  both the liver and spleen signals (Fig 3). This reduction was accomplished because NLME allows for information to be shared among the parameter estimations of all patients, thus requiring fewer new datapoints per patient. This reduction in the examination time is beneficial, since it reduces cost and patient discomfort, and requiring only a few images also allows for our method to be included for liver function evaluation in short abbreviated MRI-protocols. Such protocols, (sometimes called AMRI) are gaining popularity, e.g. when screening cirrhotic patients for hepatocellular carcinoma [38,39]. It can be noted that while the STS scheme failed to predict the liver signal, STS was still able to predict the spleen signal. The reason for this is that almost all dynamic information of the spleen signal is contained within the first ten minutes. Removing all later time points should therefore not affect the ability to predict the spleen signal. Furthermore, it can also be noted that while the χ 2 -test was used to evaluate the goodness-of-fit of the model, the NLME framework offers other methods for assessing model performance, such as visual predictive check and normal predictive distribution errors. However, since NLME was only used to increase the robustness of the predictions of the individual patients, the χ 2 -test should be enough. Comparing different methods would be interesting, but was unfortunately beyond the scope of this work.
Another strength of this work is that data are presented from a new clinical study, where 100 patients were recruited. The patients were selected to represent the actual flow of patients being referred to a hepatology department, with a normal variation in both disease etiology and severity. This gives a more realistic picture of the clinical situation, as most other studies have either been small or not prospective. Additionally, the study included dual biopsies, as well as blood samples, from the patients, in conjunction with the MR examination. These rare measurements allowed for extensively validate the model. Lastly, the study was conducted over a span of around six years. Such a long time could be seen as a limitation, as changes occur to an MR-system over time, such as software upgrades. On the other hand, this could also be seen as a strength of the method, since it was found that all data could easily be analysed in the same framework.
Although this methodology is still in the research phase, the methodology is better suited for clinical implementation, compared to other similar methods, for a variety of reasons. First, the modeling framework uses the same type of clinical images, already collected in routine examinations. Therefore, the liver function estimation can easily be included in clinical workflows or studies that already use gadoxetate MRI, by simply adding a few more breath holds. Second, the model is based on simulations of ordinary differential equations, which has additional advantages. For instance, the model, unlike previous non-simulation-based models [19,21,22], can easily be combined with other models describing detailed processes in the liver, and thus can possibly characterize other aspects of liver function, such as metabolic aspects [40]. Third, the simulation-driven model can also be combined with more zoomed out wholebody models. The result of such combinations is multi-level models which can simultaneously describe multiple organs and processes in the body [41][42][43]. For all these reasons, the framework could be further extended and reused in a variety of different contexts, both regarding clinical implementation and research.
In conclusion, this study presented a new integrative MRI-based framework for estimating liver function. The extendable framework has been validated in a variety of ways and has allowed for a new and deeper look into the variation of mechanistic parameters across a clinically relevant cohort.

Study design and population
Between 2007 and 2014, 100 patients were recruited on referral to the Linköping University Hospital, Linköping, Sweden for evaluation of chronic (> 6 months) elevation of levels of one or more of ALT (>1.10 μkat/L for men and >0.75 μkat /L for women), aspartate aminotransferase (AST; >0.75 μkat /L for men and >0.60 μkat /L for women), and serum alkaline phosphatase (ALP; >1.80 μkat /L regardless of gender). All patients who, on clinical indication or as part of a clinical study, needed a liver biopsy for histopathological evaluation were asked to participate in the study. Exclusion criteria included contraindications for MRI (presence of pacemaker devices, implants with ferromagnetic properties, pregnancy, and claustrophobia) and liver biopsy (presence of primary or secondary coagulative disorder, prothrombin time > 1.5 times the international normalized ratio, platelet count <50×10 9 /L, hepatic malignancy, and clinical signs of decompensated cirrhosis).
A diagnostic work-up was performed prior to MRI, including a physical examination, laboratory investigations, and ultrasonography. The pathologist was blinded to the results of the diagnostic work-up, and the radiologists were blinded to the diagnosis and the pathology findings.

Ethics statement
This study was approved by the regional ethics committee (Reference No. M72-07 T5-08). All patients gave informed consent to participate before the inclusion.

Gadoxetate-enhanced magnetic resonance imaging
MRI was performed within two months of the diagnostic work-up with a Philips Achieva 1.5 T MR scanner (Philips Healthcare, Best, Netherlands) and a phased-array body coil. Singlebreath-hold symmetrically sampled T1-weighted gradient-echo two-point Dixon 3D images [44] were acquired using sensitivity encoding [45].
All patients received a bolus injection of gadoxetate (gadolinium ethoxybenzyl diethylenetriamine pentaacetic acid, or Gd-EOB-DTPA, marketed as Primovist in Europe and Eovist in the USA by Bayer Schering Pharma, Berlin, Germany), at a dose of 0.1 ml/kg, administered intravenously at a rate of 1 mL/s by a power injector (Medrad Spectris Solaris, Pittsburgh, PA, USA), followed by a 30 mL saline flush. Image time series were acquired prior to (nonenhanced) and directly following gadoxetate injection (Fig 5). The post-injection time series corresponded to the arterial and portal-venous phases, as well as 3 min, 10 min, 20 min, and 30 min following injection. Additional acquisitions between 3 min and 30 min were added from 2012 and onwards.
The field of view (FOV) and acquisition matrix were adjusted to accommodate patients of different sizes. Higher temporal resolution was used during the initial contrast agent wash-in phase, the arterial phase. The non-enhanced and post-injection images were acquired using the following sequence parameters: repetition time = 6.5 ms, echo time = 2.3 ms and 4.6 ms, flip angle = 13˚, typical acquisition matrix = 168×168, typical FOV = 261 mm by 200 mm by 342 mm, slice thickness = 4 mm. We used interpolation, with zero padding in the z-direction, and up-interpolated from 4 to 2 mm.

Image post-processing
The acquired in-phase and opposite-phase images were reconstructed into separate water and fat images by our previously developed inverse-gradient method [26]. The signal intensity of MR-images is not absolute. Hence, if images are acquired in a time series, the signal intensity in the images can vary, even though all images are acquired using the same parameters. We corrected for this variation by using voxels of pure adipose tissue as an internal reference throughout the time series. [25] This was an important step in the quantification process.
To extract signal intensities (SIs) for the quantitative analysis, two clinical radiologists (BN, ND; with more than ten years of experience in abdominal radiology) placed ROIs in the reconstructed water-image time series, in the liver (N = 7), and spleen (N = 3). Liver ROIs were placed in both the left and right liver lobes to avoid any large vessels or focal lesions, but not strictly following the Couinaud segmental division. The sizes of ROIs were arbitrarily chosen by the radiologists. However, the ROIs were adjusted to be equal in size and approximate position throughout the time series. Landmarks in the images were used to correct for movement of the liver between the acquisitions. Fig 5 shows an example of ROI placement.
Mean SIs in the ROIs were normalized and the relaxation rate (R 1 ) was calculated as described previously [46]. The induced change in the relaxation rate (ΔR1) was directly proportional to gadoxetate concentrations [27].
For quality assurance, the image data were inspected visually for quality issues and potential data exclusion. As the two radiologists independently placed ROIs in the images, they took particular notice of cases of poor image quality (such as artifacts resulting from breathing or post-processing failure). Then, both radiologists reviewed these cases of potential poor quality and reached a consensus about whether to exclude the images, to return the images for manual image reconstruction, or to accept the images. After the radiologists were satisfied, the data analyst continued to search for any significant outliers in the extracted time series. The radiologists were then instructed to review these latter outliers, but they were not told why each case was to be reviewed. If they were still satisfied with the placement of ROIs and with the image quality, nothing was corrected.

Estimation of the lower limit of data uncertainty
The lower limit of uncertainty in the extracted gadoxetate time series was estimated by calculating the average standard deviation in the normalized signal intensities (Eq 1, where S = SI (t)/SI(t = 0)).

sS ¼ jSj
ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi sSIðtÞ SIðtÞ The uncertainty was then averaged over the entire study population; for this averaging, each entry was in turn the mean of each patient's liver and spleen ROIs. Fig 6 shows a histogram of the estimate of the lower limit of uncertainty, and the fitted normal distribution had a mean of 0.18. Unless the standard error of the mean across the spleen or liver ROIs exceeded 0.18, this lower limit value was used as the standard deviation in the following statistical test for the mechanistic model goodness-of-fit.

Mathematical model and model parametrization
The whole-body model devised by Forsgren and co-workers [23] was used here to quantify the liver function. Fig 1C shows a schematic diagram of the model, which has two parts: a dynamic model and a signal model. Briefly, the dynamic model describes five separate fluxes of gadoxetate: between the blood plasma and the extracellular extravascular space (EES; k diff ); elimination via the kidneys to urine (CLr); uptake into the hepatocytes (through the OATP1 family transport proteins; k ph ); back-flux from the hepatocytes into the blood plasma (through the transport protein MRP3; k hp ); and excretion from the hepatocytes into the bile (through the transport protein MRP2; k hb ): where C hep , C p , and C ees is the gadoxetate concentration in the hepatocytes, blood plasma, and EES respectively. V l , V ees , and V p are the volumes of the liver, EES, and blood plasma respectively (assumed to be 1.43, 14.77 and 2.57 L). Alb is the fraction of Gadoxetate not bound to serum albumin (assumed to be 0.9), v h is the volume fraction of hepatocytes in the liver (assumed to be 0.68), and u is the injection of gadoxetate. CLr is assumed to be 118 mL/min. The signal model was used to predict ΔR 1 in the gadoxetate MRI time series as a function of the gadoxetate concentrations in the compartments. The model takes into account the parenchyma volume fractions as well as the in situ tissue-specific relaxivity properties of gadoxetate [23]: where ΔR 1,l and ΔR 1,s are the ΔR 1 in the liver and spleen respectively, v p,l and v ees,l are the volume fractions of plasma and EES in the liver (assumed to be 0.12 and 0.20), and v p,s and v ees,s are the fractions of plasma and EES (assumed to be 0.35 and 0.20) in the spleen. ξ is an arbitrary scaling parameter and r 1,hep , r 1,p , and r 1,ees are the tissue-specific relaxivities in the hepatocytes, blood plasma, and EES respectively (assumed to be 10.7, 7.3, and 6.9 mmol -1 s -1 ). The model was parametrized separately using the STS method and the NLME method, described in Fig 1A and 1B. The STS parametrization was performed by minimizing the following costfunction, which follows a χ 2 -distribution: where y and σ are the measurements and standard deviation of the measurements respectively, y is the predicted data as a function of time and the estimated model parameters (p), and the index i indicates liver or spleen. When using NLME, all the optimized parameters have two parts, a fixed effect and a random effect. The fixed effect is the same across all patients and represents the typical parameter value. The random effect describes how each individual deviate from the typical value and is thus allowed to vary across the population, but is still constrained to a normal distribution: where p j is a generic parameter for patient j, θ p is the fixed effect for parameter p, and η j p is the random effect for parameter p for patient j. If p is postulated to be normally distributed, Eq 8 is used, while Eq 9 is used if p is lognormally distributed. More details of the STS parametrization are described in [23] and the details of NLME in [24,47].
Population distributions in the NLME model parametrization were defined as a normal distribution for the scale parameter (ξ) and lognormal distributions for the four rate parameters (k diff , k ph , k hp , and k hb ). The distributions were a priori parametrized from the results of parametrizing the model to healthy human patients, which has been described previously (Table 3 in [23]), where the expectation values were ξ = 1.6, k diff = 1.7 ms -1 , k ph = 4.7 ms -1 , k hp = 28 ms -1 , and k hb = 38 ms -1 . The a priori standard deviations were chosen such that the optimization algorithm would not be unnecessarily limited (ξ = 1, k diff = 0.1 ms -1 , k ph = 0.1 ms -1 , k hp = 0.01 ms -1 , and k hb = 0.01 ms -1 ).
The mechanistic model framework assumes that the compartments are well mixed containers (a fundamental property of ordinary differential equation models). In addition, there are interfering effects from the bolus injection during the arterial and portal-venous phases. Therefore, only data at the 3 min point and later after contrast injection were included in the model parameterization.

Blood sampling and elemental analysis
Immediately following the MR examination, 3 mL venous blood (collected in a 3 mL BD Vacutainer sterile hematology tube with K 2 -EDTA) was drawn from each patient for elemental analysis of gadolinium content. Samples were transferred to 4 mL sterile low-temperature freezer vials (VWR, Sweden) for freezing and storage at -80˚C. The frozen samples were sent to an external laboratory (ALS Scandinavia AB, Luleå, Sweden) for elemental analysis by ICP-SFMS: 0.20 mL from each thawed blood sample was mixed with 1.00 mL 'super pure' HNO 3 (pure with respect to traces of metal) and digested in a 600 W microwave oven operating at 75% power for 30 min. Each of these mixtures was then diluted up to 10.00 mL with MilliQ ultrapure water for ICP-SFMS analysis, which had a detection limit for gadolinium of 0.05 μg/L.

Liver biopsy and histopathology
Immediately after completion of the MR examination and blood sampling, two ultrasonographically guided liver biopsy procedures were performed, on an outpatient basis. The biopsy samples were obtained percutaneously with a 1.6 mm BioPince needle (BioPince Full Core Biopsy Instrument, Argon Medical Devices, Plano, TX, USA) in either the left or right liver lobe depending on which location offered the best combination of a successful biopsy and maximum patient safety. A histopathologist graded and classified one of the biopsy samples according to the Batts and Ludwig system [48], through which fibrosis was staged as no fibrosis (F0), portal and/or perisinusoidal fibrosis (F1), periportal and perisinusoidal fibrosis (F2), bridging fibrosis (F3), and probable or obvious cirrhosis (F4). The biopsies were also graded for inflammation. All biopsy samples were graded by the same histopathologist.
The second biopsy sample of each pair was weighed and directly frozen at -80˚C. The frozen samples were later freeze dried, and the dry weight was measured prior to submission to our external analysis partner (ALS Scandinavia AB) for elemental analysis. The dried samples were digested by adding 2.50 mL 'super pure' HNO 3 and 0.25 mL H 2 O 2 followed by a 30 min treatment at 170˚C in a microwave oven. The samples were then diluted to 5.00 mL with MilliQ ultrapure water, for ICP-SFMS analysis.

Statistical analysis
The goodness-of-fit of the model to the data was investigated on a subject basis using a χ 2 test (Eq 7; α = 0.05) with degrees of freedom equal to the number of observations in the gadoxetate-enhanced MRI time series. Group differences were investigated using an unpaired twotailed Mann-Whitney U-test (α = 0.05). A paired two-tailed Mann-Whitney U-test was used when comparing two model parametrization method estimates of model parameters (α = 0.05). Linear regression and Lin's concordance correlation were used to investigate correlation between variables that measure or describe similar entities. For correlation between non-similar variables, a Pearson correlation coefficient was calculated. An ANOVA with Tukey's posttest was used to investigate sources of variation and biomarker performance between fibrosis stages.
Supporting information S1 File. Matlab data. The file contains the time series for each patient together with the fibrosis stage, saved in .mat format. (MAT)