Defining vitamin D status using multi-metabolite mathematical modelling: A pregnancy perspective

Vitamin D deficiency is linked to adverse pregnancy outcomes such as pre-eclampsia (PET) but remains defined by serum measurement of 25-hydroxyvitamin D3 (25(OH)D3) alone. To identify broader changes in vitamin D metabolism during normal and PET pregnancies we developed a relatively simple but fully parametrised mathematical model of the vitamin D metabolic pathway. The data used for parametrisation were serum vitamin D metabolites analysed for a cross-sectional group of women (n = 88); including normal pregnant women at 1 st (NP1, n = 25) and 3rd trimester (NP3, n = 21) and pregnant women with PET (n = 22), as well as non-pregnant female controls (n = 20). To account for the effects various metabolites have upon each other, data were analysed using an ordinary differential equation model of the vitamin D reaction network. Information obtained from the model was then also applied to serum vitamin D metabolome data (n = 50) obtained from a 2nd trimester pregnancy cohort, of which 25 prospectively developed PET. Statistical analysis of the data alone showed no significant difference between NP3 and PET for serum 25(OH) D3 and 24,25(OH)2D3 concentrations. Conversely, a statistical analysis informed by the reaction network model revealed that a better indicator of PET is the ratios of vitamin D metabolites in late pregnancy. Assessing the potential predicative value, no significant difference between NP3 and PET cases at 15 weeks gestation was found. Mathematical modelling offers a novel strategy for defining the impact of vitamin D metabolism on human health. This is particularly relevant within the context of pregnancy, where major changes in vitamin D metabolism occur across gestation, and dysregulated metabolism is evidenced in women with established PET.

The data used for parametrisation were serum vitamin D metabolites analysed for a cross-sectional group of women (n = 88); including normal pregnant women at 1 st (NP1, n = 25) and 3rd trimester (NP3, n = 21) and pregnant women with PET (n = 22), as well as non-pregnant female controls (n = 20). To account for the effects various metabolites have upon each other, data were analysed using an ordinary differential equation model of the vitamin D reaction network. Information obtained from the model was then also applied to serum vitamin D metabolome data (n = 50) obtained from a 2nd trimester pregnancy cohort, of which 25 prospectively developed PET.
Statistical analysis of the data alone showed no significant difference between NP3 and PET for serum 25(OH) D3 and 24,25(OH) 2 D3 concentrations. Conversely, a statistical analysis informed by the reaction network model revealed that a better indicator of PET is the ratios of vitamin D metabolites in late pregnancy. Assessing the potential predicative value, no significant difference between NP3 and PET cases at 15 weeks gestation was found. Mathematical modelling offers a novel strategy for defining the impact of vitamin D metabolism on human health. This is particularly relevant within the context of pregnancy, where major changes in vitamin D metabolism occur across gestation, and dysregulated metabolism is evidenced in women with established PET.
It is still unclear whether vitamin D-deficiency during pregnancy is simply a manifestation of the broader prevalence of low vitamin D status in populations across the globe, or whether this reflects a normal physiological drop in 25(OH)D3 concentrations during pregnancy, or if pregnancy is a stress test which can exacerbate and unmask pathological vitamin D deficiency. Whatever the case, it is important to recognise that, to date, studies have relied on measurement of a single parameter to define optimal vitamin D in pregnancy -namely serum levels of 25(OH)D3, the major circulating form of vitamin D. We have hypothesized that measurement of total serum concentrations of 25(OH)D3 alone provides only a limited view of vitamin D status, particularly in pregnancy where significant changes in vitamin D physiology occur from early gestation [20]. This includes changes in the serum carrier vitamin D binding protein (DBP) during pregnancy [21] which may influence the bioavailability of unbound or DBP-bound 25(OH)D3 [22].
During pregnancy there is also a dramatic rise in maternal circulating concentrations of the active hormonal form of vitamin D, 1,25-dihydroxyvitamin D (1,25(OH) 2 D3) from the 1 st trimester of gestation [23]. Thus, the impact of vitamin D during pregnancy may be better defined by analysis of multiple serum vitamin D metabolites -the vitamin D metabolome. In women with PET we have previously reported significant dysregulation of multiple vitamin D metabolic pathways in the 3 rd trimester of pregnancy, with decreased serum 1,25(OH) 2 D3 levels and enhanced catabolic 24,25-dihydroxyvitamin D3 (24,25(OH) 2 D3) and 3-epi-25(OH)D3 concentrations observed for women with PET [20]. Conversely, in a study of pregnant women in the 1 st trimester of pregnancy who subsequently progressed to normal or PET pregnancies we did not observe any significant variations in serum vitamin D metabolites between normal and PET pregnancies [24]. The aim of the current study was to develop a mathematical model of the serum vitamin D metabolome to provide an improved statistical analysis for screening for adverse events in pregnancy, such as PET, by identifying markers of vitamin D function and disease based on analysis of multiple vitamin D metabolites. Specifically, we aimed to detect a signal within vitamin D metabolite data recorded both in early (1st trimester) and late (3rd trimester) pregnancy, to allow classification of cases into high and low risk of PET. The study found that, rather than considering individual metabolite levels, inspection of specific ratios of these metabolites (as determined by the mathematical model) was a more informative predictive measure. Implementing data from a larger cohort of patients will transform this study from proof of concept into a strong predictive screening tool for PET. Though we have utilised data specifically with pregnancy in mind, the approach presented provides a novel analytical tool for the study of vitamin D metabolites more generally in human health and disease.

Human pregnancy vitamin D metabolite datasets
Serum vitamin D metabolite data obtained from two distinct sets of pregnant women was utilised to ascertain the accuracy of each statistical approach (one considering individual metabolite levels, the other their ratios). The first group of women were from a crosssectional pregnancy study which included 3 groups of pregnant women booked in the West Midlands, UK (i) first trimester uncomplicated pregnancies (8-13 weeks) (NP1; n = 25), (ii) healthy 3 rd trimester pregnancies (> 37 weeks) (NP3; n = 21) and (iii) third trimester pregnancies complicated by PET (PET; n = 22). A healthy non-pregnant female 'control' group (n = 20) was also recruited for comparative serum vitamin D analysis as described in detail previously ( [24]. As previously described, sera samples from n = 50 low-risk nulliparous pregnant women taken at 15 weeks gestation (2 nd trimester) were analysed. Of these, n = 25 prospectively developed PET and n = 25 were selected as normotensive controls matched for maternal age, ethnicity and body mass index (BMI) ( Table 1) [24]. Vitamin D metabolites were extracted from serum for LC-MS/MS analysis and measured as described previously [20,24,25]. Serum concentrations were obtained as outlined in Supplemental Table 1 and Supplemental  Table 2.

Analysis of serum vitamin D metabolites
Analysis of serum concentrations of vitamin D metabolites (25(OH) D3, 3-epi-25(OH)D3, 1,25(OH) 2 D3 and 24,25(OH) 2 D3) was performed using previously reported liquid chromatography-tandem mass spectrometry (LC-MS/MS) methods [20,25]. In brief, samples were prepared for analysis by protein precipitation and supported liquid-liquid extraction. Analysis of serum was performed on a Waters ACQUITY ultra performance liquid chromatography coupled to a Waters Xevo TQ-S mass spectrometer. The LC-MS/MS method was validated based on US Food and Drug Administration guidelines for analysis of these metabolites, with CVs for each vitamin D metabolite analysis is shown in Table 2.

Regression analysis of vitamin D metabolite data across pregnancy
Serum vitamin D metabolite data from the West Midlands healthy pregnant (NP1 and NP3) and non-pregnant cohorts were first assessed by (linear) robust regression analysis to estimate the relationship between metabolome concentrations and gestation progression. This was performed using iteratively reweighted least squares with a bi-square weighting function using the MATLAB Curve Fitting Toolbox [26]. Robust regression using iterative re-weighting and non-standard weighting functions was chosen in favour to the more standard linear least-squares regression due to its capabilities to reliably handle outliers present in the raw data.

Principal component analysis
We began by applying a standard statistical analysis to the data independent of the mathematical model. We examined serum vitamin D metabolites for PET and healthy patient serum data from the SCOPE data for all patients where all five metabolites were available (N = 16 cases and N = 22 controls). Each metabolite value was normalised by multiplying by a scaling factor such that the mean for each normalised metabolite became equal to one. We then ran a principal component analysis (PCA) on the normalised dataset using the MATLAB 'PCA' function.

Linear classification analysis
The task of statistically classifying metabolite data into groups of healthy normotensive and PET women in an interpretable manner can be conveniently carried out using linear support-vector machines (SVMs) [26,27]. In this approach we find, given the training data, the best decision function which maps data to one of the two possible outcomes (healthy versus PET in our case) based on a linear combination of various features of the data, e.g. concentrations of different vitamin D metabolites. Using SVMs we can select the best possible linear decision function in terms of the accuracy of its predictions.

Kinetic mathematical model of vitamin D function
To account for the effects that vitamin D metabolites have on each other, initially the full chemical reaction network was modelled as a system of ordinary differential equations (in the next section this was reduced to a simpler but equally informative model that can be fully parametrised). As outlined in Table 3 and Fig. 1, variables were used to design a mathematical kinetic model for vitamin D physiology that incorporated the main cellular features of vitamin D metabolism and function; active, inactive and epimerised forms of vitamin D, as well as the key catabolic enzyme 24-hydroxylase [33,34]. These cellular variables were then used as a basis to model circulating vitamin D metabolites.
The metabolite network for vitamin D (Fig. 1) was mathematically interrogated by constructing a set of ordinary differential equations (ODEs), depicted in Tables 4 and  5, describing the time-dependent behaviour of vitamin D metabolites in the metabolite network. Based on saturation kinetics for vitamin D metabolism [28], we assumed that the conversion of 25(OH)D3 to 1,25(OH) 2 D3 follows Michaelis-Menten kinetics. For simplicity, and to facilitate model parametrisation, all other reactions were assumed to obey mass action kinetics (i.e. either they do not saturate or they do not saturate at the levels present in our data).

A reduced mathematical model of vitamin D function
As shown in Table 3, serum concentrations of 3-epi-1a,25(OH) 2 D3 and 24-hydroxylase enzyme activity were not measured in the current study. A reduced mathematical model of vitamin D metabolism was therefore designed by applying simplifying assumptions into the complete model given in Table 5, incorporating solely 25(OH)D3; 1,25(OH) 2 D3; 24,25(OH) 2 D3 and 3-epi-25(OH)D3; and their effective mutual interactions (Fig. 2). Firstly, we assume that levels of 24-hy-droxylase are relatively constant so that we can assume conversion of 25(OH)D3 into 24,25(OH) 2 D3 and loss of 1,25(OH) 2 D3 from the system occur at a constant rate. Secondly, as we have no information regarding 3-epi-1a,25(OH) 2 D3, we can simply consider loss of 3-epi-25(OH)D3 and 1,25(OH) 2 D3 from the system to also capture their loss through conversion into 3-epi-1a,25(OH) 2 D3. Thus, these simplifying assumptions (that have the benefit of facilitating complete model parameterisation) should not have any untoward effect on overall data. The resulting reduced network has 8 unknown reaction constants and kinetic rate equations (see Tables 6 and  7) and can be described by a system of four ODEs (see Table 7). The equations can be fully parameterised (see Supplementary Information), and the steady states readily obtained (Table 8). These steady states are used to inform the statistical analysis in Section 3.4.

Regression analysis of vitamin D metabolite data across pregnancy
Consistent with previous single metabolite data [24], regression analysis of data from the first group of pregnant women (West Mid-lands) showed increased serum concentrations of 1,25(OH) 2 D3 across normal pregnancy, but also revealed trends towards increased concentrations of 25(OH)D3, 3-epi-25(OH)D3, and 24,25(OH) 2 D3 from 1st to 3rd trimester (Fig. 3). Data for women with PET were not included in the regression analysis but are shown for reference, providing further evidence for dysregulation of vitamin D metabolism in PET relative to healthy 3rd trimester pregnancies.

Principal component analysis (PCA)
Using strategies outlined in Section 2.4, PCA analysis was carried out to further interrogate serum vitamin D metabolite data from the West Midlands group. The first two components (PCA1 and PCA2) explained 74% of data variance but did not appear to separate PET and healthy 3rd trimester pregnancies (Supplemental Fig. 1). It was possible that another classifier could distinguish the normal and PET groups based on the remaining variation, or that a non-linear projection was required. To investigate these possibilities, we ran multiple classification algorithms using the MATLAB Classification Toolbox [32], however no strong classifiers were identified. We therefore utilised to two linear classifier analyses: one independent from the mathematical model and one informed by the mathematical model.

Strategy 1: linear classifier analysis of metabolite data
Using data from the West Midlands group and 2nd trimester data from the SCOPE group we investigated a possible linear classifier strategy for defining normal healthy and PET pregnancies based on serum vitamin D metabolites. This approach aimed to delineate control/healthy normotensive and case/PET pregnant women by using a data dividing plane, as illustrated in Fig. 4. The dividing plane found is the best possible given the data in this study. A one-dimensional linear classifier specifies a single criterion for a single metabolite to predict whether the data represents control or case. A two-dimensional classifier, as illustrated in Fig. 4, sets combined criterion for two metabolites. For example, the threshold for 3-epi-25(OH)D3 increases as 25(OH)D3 increases and vice versa. For a combination of more than two metabolites the linear classifier generalises to provide a criterion based on the combined metabolite data. The efficacy of the linear classifiers is determined by the accuracy of the classification (percentage of correct classifications on the data set). Only classifiers achieving significantly > than 50% accuracy were of interest.
From earlier work [24] we postulated that individual vitamin D metabolite data cannot accurately predict future risk of PET onset. As shown in Fig. 4, individual metabolite classifiers did not reach accuracies > 68%. However, combining the data of four metabolites (25(OH)D3, 3-epi-25(OH)D3, 1,25(OH) 2 D3, 24,25(OH) 2 D3) for a single and finding the best four-dimensional linear classifier increased the accuracy of distinguishing between PET and normal pregnancies to 74%, suggesting that measurement of multiple vitamin D metabolites has the potential to be a more accurate representation of vitamin D 'status'.

Strategy 2: linear classifier analysis on metabolite ratios informed by the mathematical model
An alternative strategy to analyse serum vitamin D metabolite data in pregnancy is to consider metabolite ratios. Rather than calculating all possible combinations of vitamin D metabolites to form pairwise ratios, informative ratios, designated as α, β and γ (Table 9), were determined directly from the steady state relationships outlined in the reduced kinetic model for vitamin D metabolism ((3.1b)-(3.1d), Table 8).
Using the estimated Michaelis constant K (Supplementary Table 4) individual patient data were transformed (non-linearly) into three metabolite ratios (Table 9), reflecting the different major pathways in vitamin D metabolism. For example, rather than looking at isolated 3-epi-25(OH)D3 values, the concentration of epimerised 25(OH)D3 in relation to 25(OH)D3 was considered, encoded by ratio 'β'. This approach appears more informative than interpretation of single 3-epi-25(OH)D3 concentrations which alone do not help explain whether an increased intake of 25(OH)D3 determines downstream vitamin D metabolite concentrations, or whether it is the epimerisation pathway that is specifically upregulated.
The effect of transforming metabolite data into ratios before finding the best linear classifier was also assessed. This revealed significant improvement in the accuracy of classifiers. For example, solely utilising the 3rd trimester epimerisation ratio (3.2b) as the linear classifier, results in an accuracy of 72% in delineating between healthy and PET pregnancies. Combining metabolite ratios together resulted in an accuracy of 84% for the West Midlands 3rd trimester data. The high accuracy for the linear classifier based on the epimerisation ratio, β, alone suggests that changes in the vitamin D epimerisation pathway may be a marker of PET. Conversely, as illustrated in Fig. 5, the combined metabolite ratio analysis did not improve the overall accuracy of classification in the 2nd trimester SCOPE data.

Discussion
Vitamin D-deficiency has been linked to the pregnancy disorder PET, a condition that can increase morbidity and mortality in pregnant women, and their unborn and newborn babies. To date, studies of vitamin D and PET have focused on serum measurement of a single vitamin D metabolite, 25(OH)D3 despite the fact that pregnancy is characterised by distinct changes in vitamin D metabolism [29,30]. The metabolic pathway for vitamin D is relatively well-defined and multiple vitamin D metabolites are measurable from serum. Using data from pregnant and non-pregnant women, where concentrations of four of these metabolites have been measured, the principal aim of this study was to use a combination of data analysis and mathematical modelling to identify differences in vitamin D metabolism between normal and PET pregnancies. In particular, the data from 2 nd trimester pregnant women in the SCOPE study investigated to determine whether the models we used could accurately classify healthy normotensive (control) and PET (case) pregnant women prior to the onset of PET symptoms based upon early 2nd trimester vitamin D metabolite data.
Mathematical modelling of metabolic networks can often lead to large numbers of ordinary differential equations governed by many parameters that are often neither measurable nor straightforward to estimate from experimental data. To address this, data from 3 rd trimester pregnant women were utilised to develop a kinetic model of vitamin D metabolism incorporating 25(OH)D3, 3-epi-25(OH)D3, 1,25(OH) 2 D3 and 24,25(OH) 2 D3. From this we were able to reduce the full vitamin D network to include only these metabolites without disturbing the network shape surrounding them. In conjunction with literature values for half-life times this facilitated full parametrisation of the ordinary differential equation model of the reduced vitamin D network, an achievement that is rare in related mathematical models.
In the current study we compared two main modelling strategies, (i) linear classifier analysis using metabolite data and (ii) linear classifier analysis using metabolite ratios; the latter was motivated by analysis of our reduced kinetic model of vitamin D metabolism. It was shown that this permits more accurate classification of trends in vitamin D metabolism in normal and disease pregnancy compared to individual metabolite data analysis. Moreover, utilising combined metabolite ratios to reflect the interplay between the major metabolic pathways 1α-hydroxylation, epimerisation and 24-hydroxylation, a more accurate classification of 3 rd trimester healthy controls and PET cases was obtained.
The high accuracy for the linear classifier based on the epimerisation ratio suggests changes in the epimerisation pathway of the vitamin D metabolism could be a marker of PET. Since this was only visible in the third trimester it appears changes in vitamin D metabolism are gradual and that significant dysregulation develops as a result of PET, rather than being a precursor to disease development, although it is important to recognise that the current study involved relatively small samples sizes.

Summary and future work
The current study describes a novel strategy for understanding the differences in vitamin D metabolism between pregnant and non-pregnant women and distinguishing those pregnant women who have developed the pregnancy disorder PET. A major limitation of the current study was the small sample numbers in both the 2 nd trimester (SCOPE) and 3 rd trimester (West Midlands) studies. Future studies will aim to collect data from larger diverse cohorts of women to improve the reliability of both the statistical analyses and the model parametrisation described in the current mathematical analysis. The aim of mathematical modelling in this fashion is to generate new hypotheses for variations in vitamin D metabolism during pregnancy. In turn this may help to identify novel intervention and supplementation strategies for both normal pregnancy and PET. However, the potential applications for this mathematical modelling strategy extend far beyond pregnancy, with clear opportunities for refining our overall approach to the assessment of vitamin D related health and disease.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.   Table 6.

Figure 4. A two-dimensional linear classifier (dotted line) incorporating serum 3-epi-25(OH)D3 (y-axis) and 25(OH)D3 [nM] (x-axis) concentrations for the SCOPE (15 weeks of pregnancy) (left) and West Midlands (third trimester) data (right).
The linear classifier classes serum data in the grey section as PET case and the remainder as being healthy pregnancies. False-positives and false-negatives are clearly visible and lead to a classification accuracy of only 56% (left) and 68% (right).   Table 1 Demographic summary and analysis of participants in West Midlands and SCOPE women analysed in this study.
The West Midlands group includes non-pregnant female controls, normal pregnant women (n = 20) at first (NP1, n = 25) and third trimester (NP3, n = 21) and women with PET (n = 22  Table 2 Accuracy and precision data for vitamin D metabolite analyses.
Intra-day precision was calculated based on %RSD following the analysis of six replicates for each of the three concentration levels of each metabolite. Inter-day precision was performed over three days calculated based on %RSD following the analysis of six replicates for each of the three concentration levels of each metabolite per day for three days. Accuracy represents the percentage deviation from the true value and was calculated based on the analysis of six replicates of each concentration level for each metabolite.   Table 4 Reaction rate terms for the full kinetic model of vitamin D metabolism shown in Fig. 1.
Reaction numbers correspond to Fig. 1. Metabolites and enzymes with abbreviations are shown in Table 3. Vitamin D metabolite concentrations and the Michaelis constant K are measured in nM. Non-catalytic rate constants (k 3 -k 14 ) and catalytic rate constants (V 7 , V 9 ) are measured in d −1 and nM −1 d −1 respectively (d = days). Base production constants (P 25(OH)D3 and P 24-hydroxylase ) and maximal enzyme rate a are measured in nM•d −1 .  Kinetic equations representing the vitamin D metabolism model network depicted in Fig.  1.

Reaction Description
Reaction terms Ri for i = 1, …,14 correspond to the respective rate terms as given in Table 4 Table 7 Kinetic equations for the reduced vitamin D metabolism model network depicted in Fig. 2.
Reaction terms R, for i = 1, …8 correspond to the respective rate equations as depicted in Table 6.  Table 8 Steady-state equations for reduced kinetic model of vitamin D metabolism.
Relationships between metabolite concentrations at steady-state for the reduced kinetic model of vitamin D metabolism, depicted in Fig. 2 are shown. Metabolites are listed with abbreviations in Table 3. Reaction parameters (k 3 -k 8 , K, a, P 25(OH)D3 ) as defined in the rate equations in Table 6. Concentrations and Michaelis constant K are measured in nM. Non-catalytic rate constants (k 3 -k 8 ) are measured in d −1 . Base production constants (P 25(OH)D3 ) and maximal enzyme rate (a) are measured in nM·d −1 .