Plasma proteins associated with circulating carotenoids in Nepalese school-aged children

Carotenoids are naturally occurring pigments that function as vitamin A precursors, antioxidants, anti-inflammatory agents or biomarkers of recent vegetable and fruit intake, and are thus important for population health and nutritional assessment. An assay approach that measures proteins could be more technologically feasible than chromatography, thus enabling more frequent carotenoid status assessment. We explored associations between proteomic biomarkers and concentrations of 6 common dietary carotenoids (α-carotene, β-carotene, lutein/zeaxanthin, β-cryptoxanthin, and lycopene) in plasma from 500 Nepalese children aged 6–8 years. Samples were depleted of 6 high-abundance proteins. Plasma proteins were quantified using tandem mass spectrometry and expressed as relative abundance. Linear mixed effects models were used to determine the carotenoid:protein associations, accepting a false discovery rate of q < 0.10. We quantified 982 plasma proteins in >10% of all child samples. Among these, the relative abundances of 4 were associated with β-carotene, 11 with lutein/zeaxanthin and 51 with β-cryptoxanthin. Carotenoid-associated proteins are notably involved in lipid and vitamin A transport, antioxidant function and anti-inflammatory processes. No protein biomarkers met criteria for association with α-carotene or lycopene. Plasma proteomics may offer an approach to assess functional biomarkers of carotenoid status, intake and biological function for public health application. The original maternal micronutrient trial, from which the data for this follow-up study were derived, was registered at ClinicalTrials.gov: NCT00115271.


Introduction
Carotenoids are pigments occurring naturally in fruits and vegetables [1]. They are compounds synthesized from eight isoprenoid units, and more than 700 are found in nature [2][3][4][5][6]. Structurally, hydrocarbon carotenoids are referred to as carotenes and oxygenated carotenoids are termed xanthophylls. Among them, α-carotene, β-carotene, β-cryptoxanthin, lycopene, lutein, and zeaxanthin are the major dietary carotenoids found in human plasma [7]. While α-carotene, β-carotene and β-cryptoxanthin are provitamin A carotenoids, meaning they can be metabolized to retinol, lycopene, lutein and zeaxanthin cannot be converted to vitamin A [8]. Provitamin A carotenoids may be particularly important dietary sources for maintaining vitamin A status in impoverished regions, such as rural Southern Asia [9], where vitamin A deficiency (VAD) has been shown to affect 20-35% of young children, school-aged adolescents and women of reproductive age [10][11][12] and is widely associated with low intakes of carotenoid-rich foods such as dark green leaves, yellow and orange vegetables and fruits, and eggs [12]. Carotenoids also appear to have in vivo antioxidant [13] and immunoregulatory [14] properties that are thought to give rise to frequent associations between their dietary intake or circulating concentrations and reduced risks of cardiovascular disease [15], various cancers [16] and macular degeneration [17,18]. Thus, plasma carotenoid concentrations may comprise a class of in vivo biomarkers that reflect both a diverse and nutritious diet [19][20][21] and the nutritional, antioxidant and anti-inflammatory health of populations.
However, as molecules largely detected by chromatographic methods [22], carotenoids represent a group of micronutrients that are rarely assessed in low- and middle-income countries, signaling a need to explore novel approaches for their assessment in populations. In exploring plasma proteomics as an approach to ascertain potential biomarkers of micronutrient, functional, and health status in an undernourished population of school-aged children in Nepal, we have revealed associations between clusters of circulating proteins and micronutrient status (vitamins A, E, D and K, copper and selenium) [23][24][25][26], inflammation [27], cognition [28] and anthropometry [29]. Findings to date suggest that plasma proteomics can identify proteins predictive of nutritional and health status that are candidate biomarkers with the potential to be measured by multi-analyte approaches for protein quantification. Missing from the emerging knowledge base is evidence of proteins that reflect plasma carotenoids, whose assessment could benefit from assays more readily conducted than conventional biochemical methods. The objective of this study was to explore the direction, strength and plausibility of association between plasma proteins and plasma carotenoid concentrations in a rural population of Nepalese school-aged children.

Methods
Study cohort and field data collection. We obtained plasma samples from 3305 children 6-8 years of age living in the southern plains district of Sarlahi, Nepal, born to mothers who had previously participated in a 5-arm antenatal micronutrient supplementation trial [30]. Following stratification by original maternal supplement allocation group, 1000 samples were randomly selected (200 per original trial group) for multiple biochemical assessments from children with multiple aliquots of plasma samples, complete data records from both the original trial and current follow-up study, and valid birth size measures [31]. Of these, 500 samples were randomly chosen for proteomics analysis, maintaining original trial balance. Children from whom samples were selected have been described in detail previously and are typical of children in the region [23][24][25][26][27][28][29][31]. Follow-up data were collected on household socioeconomic characteristics, dietary frequencies and morbidity history for the previous 7 days, and anthropometry (weight, height, mid-upper arm circumference), as reported earlier [24]. Weight-for-age Z-score (WAZ), height-for-age Z-score (HAZ) and body mass index (BMI)-for-age Z-score (BMIZ) were used to characterize nutritional status [32]. Venous blood was drawn from children following an overnight fast, processed into plasma aliquots and stored in liquid nitrogen in a field laboratory. Frozen plasma was transported in vapor phase liquid nitrogen shippers to the micronutrient analysis laboratory at Johns Hopkins University in Baltimore, Maryland, USA, and stored at −80°C until analysis. In both the original field trial and follow-up study, informed consent was obtained and protocols were approved by the Nepal Health Research Council in Kathmandu, Nepal, and the Institutional Review Board at Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland, USA.
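The two-stage stratified selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the eligible pool, the field names `child_id` and `arm`, and the choice of 100 samples per arm in the second stage (one way to "maintain original trial balance") are hypothetical and are not taken from the study records.

```python
import random
from collections import defaultdict


def stratified_sample(records, group_key, n_per_group, seed=0):
    """Draw n_per_group records at random, without replacement, from each stratum."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for rec in records:
        by_group[rec[group_key]].append(rec)
    sample = []
    for group in sorted(by_group):
        members = by_group[group]
        if len(members) < n_per_group:
            raise ValueError(f"group {group!r} has only {len(members)} records")
        sample.extend(rng.sample(members, n_per_group))
    return sample


# Hypothetical eligible pool: 5 trial arms with 300 eligible children each.
eligible = [{"child_id": i, "arm": i % 5} for i in range(1500)]

# Stage 1: 200 per arm (1000 samples) for biochemical assessment.
stage1 = stratified_sample(eligible, "arm", 200, seed=1)

# Stage 2 (assumed): 100 per arm (500 samples) for proteomics.
stage2 = stratified_sample(stage1, "arm", 100, seed=2)
```

Fixing the seed makes the draw reproducible, which helps when the selected sample list has to be audited later.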
Plasma carotenoid analyses. Plasma carotenoids, including β-carotene, lutein and zeaxanthin, β-cryptoxanthin, α-carotene, and lycopene, were analyzed by HPLC (Waters 2795) with a quaternary gradient pump, autosampler, photodiode array detector and Empower 2 software following the procedure of Yamini et al. [33]. The peaks of lutein and zeaxanthin could not be resolved, so the two carotenoids are reported together as lutein/zeaxanthin. Separation of carotenoids was achieved using an Allsphere ODS-2, 5-μm, 4.6-mm column (Alltech) and a Supelguard Discovery C18 2-cm × 4.0-mm guard column (Sigma-Aldrich). The assay was calibrated using the National Institute of Standards and Technology standard reference material SRM968d.
Plasma proteomics analysis. Mass spectrometric and proteomics procedures developed for this study have been reported elsewhere [23,34]. In brief, 500 plasma samples (40 μL) were immunoaffinity-depleted of six high-abundance proteins (albumin, IgG, IgA, transferrin, haptoglobin and antitrypsin), which constitute 85-90% of total plasma protein content, using a Human-6 Multiple Affinity Removal System (MARS) LC column (Agilent Technologies). Protein extracts (100 μg each) were TCA/acetone precipitated, trypsin digested and labeled with isobaric mass tags (iTRAQ 8-plex reagents). Seven samples plus one pooled sample for quality control were then combined in each iTRAQ experiment, fractionated by strong cation exchange (SCX) chromatography and analyzed on an LTQ Orbitrap Velos mass spectrometer (Thermo Scientific). MS/MS spectra were searched against the RefSeq 40 database using Mascot (Matrix Science) through Proteome Discoverer software (version 1.3, Thermo Scientific) to quantify proteins with respect to the within-iTRAQ medians of log2-transformed and normalized reporter ion intensities derived from Proteome Discoverer. Data were obtained from 72 iTRAQ experiments, with an average of 589 ± 65 proteins quantified per iTRAQ experiment. A total of 4705 proteins were detected, with 982 quantified in > 10% (n > 50) of all samples [23] and 146 proteins measured in all 500 samples, representing the plasma proteome for this study.
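As a rough illustration of what "relative abundance with respect to within-iTRAQ medians" can mean, the sketch below log2-transforms reporter ion intensities from a single experiment, normalizes each channel, and centers each protein on its own within-experiment median. It also applies the > 10% detection filter. The function names, the channel-median normalization step and the toy intensities are assumptions made for illustration; the estimation procedure actually used in the study is the one described in [34].

```python
import math
from statistics import median


def relative_abundance(intensities):
    """Compute log2 relative abundances for ONE iTRAQ experiment.

    intensities: {protein: [reporter ion intensity per channel]}.
    Returns {protein: [log2 abundance relative to the protein's
    within-experiment median]}.
    """
    proteins = list(intensities)
    n_channels = len(intensities[proteins[0]])
    logged = {p: [math.log2(v) for v in intensities[p]] for p in proteins}
    # Assumed normalization: subtract each channel's median log2 intensity
    # so that channels with different total loading become comparable.
    channel_medians = [
        median(logged[p][c] for p in proteins) for c in range(n_channels)
    ]
    normalized = {
        p: [x - m for x, m in zip(logged[p], channel_medians)] for p in proteins
    }
    # Center each protein on its median across channels in this experiment.
    return {p: [x - median(vals) for x in vals] for p, vals in normalized.items()}


def frequently_quantified(detection_counts, n_samples, min_percent=10):
    """Keep proteins quantified in more than min_percent of all samples."""
    return [
        p for p, n in detection_counts.items() if 100 * n > min_percent * n_samples
    ]


# Toy experiment with three channels (hypothetical values).
toy = {"A": [100.0, 200.0, 400.0], "B": [100.0, 100.0, 100.0], "C": [50.0, 50.0, 50.0]}
rel = relative_abundance(toy)

# With 500 samples, a protein must be quantified in more than 50 of them.
kept = frequently_quantified({"A": 500, "B": 51, "C": 50}, n_samples=500)
```

Because each protein is expressed relative to a median within its own experiment, values are comparable within an experiment but need an experiment-level term when pooled across experiments.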

Statistical analysis
Detailed information on estimation of protein relative abundance from reporter ion intensities within each iTRAQ experiment has been published elsewhere [34]. We applied linear mixed effects (LME) models to determine the association between the log2-transformed plasma concentration of each carotenoid and the relative abundance of individual plasma proteins, accounting for multiple iTRAQ experiments.
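The reason an experiment-level term is needed can be illustrated with a simpler stand-in for the LME model: a within-experiment (fixed-effects) slope estimator, which centers both variables within each iTRAQ experiment before computing an ordinary least-squares slope. This is not the authors' random-intercept model, which was fitted in R; it is a minimal sketch, with hypothetical data, of how experiment-to-experiment offsets are kept from distorting the carotenoid:protein slope.

```python
from collections import defaultdict


def within_experiment_slope(experiment, protein, carotenoid):
    """OLS slope of carotenoid on protein after centering both within experiment.

    experiment: experiment label per sample
    protein:    protein relative abundance per sample (log2 scale)
    carotenoid: log2 carotenoid concentration per sample
    """
    groups = defaultdict(list)
    for r, p, n in zip(experiment, protein, carotenoid):
        groups[r].append((p, n))
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        p_mean = sum(p for p, _ in pairs) / len(pairs)
        n_mean = sum(n for _, n in pairs) / len(pairs)
        for p, n in pairs:
            sxy += (p - p_mean) * (n - n_mean)
            sxx += (p - p_mean) ** 2
    if sxx == 0.0:
        raise ValueError("protein abundance does not vary within any experiment")
    return sxy / sxx


# Hypothetical data: two experiments with different offsets, common slope 0.5.
exp_id = [1, 1, 1, 2, 2, 2]
prot = [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0]
caro = [0.5, 1.0, 1.5, 2.5, 3.0, 3.5]
slope = within_experiment_slope(exp_id, prot, caro)
```

A random-intercept LME treats the experiment offsets as draws from a common distribution rather than as separate fixed constants, so its slope estimate can differ slightly from this one, particularly when experiments are small.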
The expected value of the log2 carotenoid concentration for each individual protein from the LME model can be expressed as $E\{N_{rk}\} = b_0 + B_r + b_1 P_{rk}$, where $N_{rk}$ is the log2-transformed plasma concentration of a given carotenoid in sample $k$ of iTRAQ experiment $r$, and $P_{rk}$ is the corresponding protein relative abundance estimate. The parameter $b_0$ is the intercept, which is the overall mean log2 concentration of the carotenoid; $B_r$ is the random deviation of experiment $r$ from this mean; and $b_1$ is the slope of the nutrient:protein association. Statistical significance of a protein:nutrient association was assessed by a two-sided test of the hypothesis $b_1 = 0$. For each significant nutrient:protein association, a q-value, that is, a p-value adjusted to control the false discovery rate (FDR), was reported [35]. Protein:nutrient correlation (r) and $R^2$ were calculated based on the observed plasma carotenoid concentrations and their respective best linear unbiased predictions from the LME models [36]. For each plasma carotenoid, the tables list all proteins with an FDR less than 10% (q < 0.10) in their association, together with the corresponding Human Genome Organization (HUGO) gene symbol [37], the number of samples with detected protein values (n), the protein:nutrient correlation (r), the amount of variance in nutrient concentration explained by the protein ($R^2$), the p-value derived from testing the fixed effects slope of carotenoid concentration on protein abundance, the chance-adjusted p-value (q), the slope ($b_1$), denoting the relative (%) change in carotenoid concentration per 2-fold (100%) increase in relative abundance of each protein, and the GenInfo identifier (gi) accession number. Datasets of plasma carotenoid concentrations and protein relative abundance presented in this study are available in Supplementary Table 1. All analyses were conducted using open source software built under the R statistical computing environment [38].

Table notes (correlations among plasma carotenoids). Correlation coefficients were generated using all complete pairwise data. All of the correlations were statistically significant (p < 0.0001) except log2 β-carotene with log2 α-carotene (p = 0.56), log2 β-carotene with log2 lycopene (p = 0.96) and log2 α-carotene with log2 lycopene (p = 0.0786).

Table notes (proteins associated with plasma β-carotene).
a Four proteins quantified by mass spectrometry and estimated by linear mixed effects (LME) modelling in > 10% of the samples (50 < n ≤ 497) that are correlated with plasma log2 β-carotene, subjected to a false discovery rate (FDR) cutoff of 10% (q < 0.10), and listed in increasing order of q, defined as candidate protein biomarkers for a plasma β-carotene proteome.
b n represents the number of child plasma samples in which a protein was detected and quantified by iTRAQ MS.
c b1 represents the percent change in plasma β-carotene (in μmol/L) per 2-fold (100%) increase in protein relative abundance.
d GenInfo Identifier sequence number, as assigned to all nucleotide and protein sequences by the National Center for Biotechnology Information at the National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Table notes (proteins associated with plasma lutein/zeaxanthin).
a Eleven proteins quantified by mass spectrometry and estimated by linear mixed effects (LME) modelling in > 10% of the samples (50 < n ≤ 500) that are correlated with plasma log2 lutein/zeaxanthin, subjected to a false discovery rate (FDR) cutoff of 10% (q < 0.10), and listed in increasing order of q, defined as candidate protein biomarkers for a plasma proteome of lutein/zeaxanthin.
b n represents the number of child plasma samples in which a protein was detected and quantified by iTRAQ MS.
c b1 represents the percent change in plasma lutein/zeaxanthin (in μmol/L) per 2-fold (100%) increase in protein relative abundance.
d GenInfo Identifier sequence number, as assigned to all nucleotide and protein sequences by the National Center for Biotechnology Information at the National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
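Two of the quantities reported in the statistical analysis lend themselves to a short worked sketch. First, if both the carotenoid concentration and the protein relative abundance are on the log2 scale (an assumption consistent with the methods), a slope b1 corresponds to a percent change of (2^b1 − 1) × 100 per doubling of protein abundance. Second, FDR-adjusted p-values can be computed in several ways; the Benjamini-Hochberg step-up adjustment shown here is a common choice used as a stand-in, and is not necessarily the q-value procedure of reference [35]. The function names and example p-values are hypothetical.

```python
def percent_change_per_doubling(b1):
    """Percent change in carotenoid per 2-fold increase in protein abundance,

    assuming both variables are modelled on the log2 scale.
    """
    return (2.0 ** b1 - 1.0) * 100.0


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four proteins tested against one carotenoid.
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = benjamini_hochberg(pvals)
candidates = [i for i, q in enumerate(qvals) if q < 0.10]
```

On this reading, a slope of 1 means the carotenoid concentration doubles (a 100% increase) when protein abundance doubles, and a slope of −1 means it halves.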

Results
Nutritional status and demographic characteristics of study children (n = 500) are shown in Supplementary Table 2. Study children were undernourished as reflected by low anthropometry Z-scores: 48.5%, 39.1%, and 16.1% of them were considered underweight, stunted, and thin (weight-for-age, height-for-age, and BMI-for-age Z-scores < −2, respectively), relative to the World Health Organization (WHO) reference population [32]. For β-carotene, 41.6% of children had plasma concentrations < 0.09 μmol/L (Table 1), which is considered low [33]. While the plasma xanthophyll carotenoids, β-cryptoxanthin and lutein/zeaxanthin, were detected among all children, the plasma carotenes α-carotene and lycopene were more likely to be below detection limits. Values below the detection limit were not included in the distribution of values shown in Table 1.
Of the 982 proteins quantified in > 10% of samples, 4 were associated with plasma β-carotene, 11 with lutein/zeaxanthin, and 51 with β-cryptoxanthin, meeting an FDR threshold of 10% (q < 0.10). No proteins met this criterion of association for plasma α-carotene or lycopene.
We examined the extent of correlation across proteins associated with log2 β-cryptoxanthin, the carotenoid with the largest associated plasma proteome, restricted to associations with FDR < 5% (Fig. 2). Across pairs of proteins in the β-cryptoxanthin proteome, the correlation coefficients (r) ranged from 0.28 to 0.96. Proteins positively and negatively associated with β-cryptoxanthin were also consistently correlated with each other in the directions expected from their associations with β-cryptoxanthin, with the exception of protein phosphatase, Mg2+/Mn2+ dependent, 1M (PPM1M) and leucine rich repeat containing 47 (LRRC47), which were also more weakly correlated with other proteins than most.
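Because most proteins were quantified in only a subset of samples, a protein:protein correlation of this kind has to be computed on the samples in which both proteins were measured. The sketch below shows a Pearson correlation restricted to pairwise complete observations. It assumes missing values are coded as `None` and that pairwise deletion was the approach used for the protein correlations; the source states this explicitly only for the carotenoid correlations.

```python
import math


def pairwise_complete_pearson(x, y):
    """Pearson r using only samples where both x and y are observed (not None)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least 3 complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("one variable is constant over the complete pairs")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical relative abundances for two proteins across five samples.
protein_1 = [1.0, 2.0, None, 4.0, 5.0]
protein_2 = [2.0, 4.0, 6.0, None, 10.0]
r = pairwise_complete_pearson(protein_1, protein_2)
```

With pairwise deletion, each coefficient in the correlation matrix can rest on a different number of samples, which is worth keeping in mind when comparing weak and strong correlations.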

Discussion
Provitamin A carotenoids play important roles as dietary precursors of vitamin A that may take on particular significance in impoverished regions, such as rural Southern Asia, where vitamin A deficiency (VAD) is endemic among young children, adolescents and women of reproductive age [11]. Carotenoids also may have important antioxidant [13], immunological [14] or metabolic [39] functions and thus serve as indicators of general population health [40]. However, carotenoids are infrequently assessed. Given strengthening evidence supporting the use of plasma proteomics for assessing population status with respect to other micronutrients [23][24][25][26], inflammation [27], cognition [28] and growth [29] in this setting, we explored protein biomarkers associated with circulating log2-transformed concentrations of six common dietary carotenoids and identified associations that were plausible in their direction and strength.
We observed four proteins associated with β-carotene, eleven with lutein/zeaxanthin, and fifty-one with β-cryptoxanthin, all with a probability of false discovery below ten percent. APOA1, a major component of high density lipoprotein (HDL) in plasma [41], was positively associated with each of the three carotenoids, possibly reflecting shared lipoprotein transport or co-existing antioxidant, anti-inflammatory and other metabolic functions [42]. On the other hand, TNIP1, an inhibitor of the pro-inflammatory transcription factor NF-κB [43,44], was negatively correlated with all three carotenoids. We had also shown relative abundance of TNIP1 to be positively associated with the acute phase reactant alpha-1-acid glycoprotein (AGP), or orosomucoid, in this population [27], explained by a negative feedback loop whereby TNIP1 is upregulated by inflammation in order to maintain immune homeostasis [44]. TNIP1 also functions as a retinoic acid receptor corepressor in the presence of its ligand [45].
Nearly all proteins negatively associated across the proteomes of β-carotene, lutein/zeaxanthin, and β-cryptoxanthin were previously found to be positively correlated with the inflammation markers AGP and C-reactive protein (CRP) [27]. Among these proteins, complement factor B (CFB) and complement 9 (C9) are involved in regulation of complement activation [46]; haptoglobin (HP) and haptoglobin-related precursor (HPR) are responsible for scavenging of heme iron from plasma in response to inflammation and oxidative stress in red blood cells [47,48], and these proteins were negatively associated with β-cryptoxanthin. Inflammatory proteins such as the AGP isoforms orosomucoid 1 (ORM1), which was also inversely associated with β-carotene, and orosomucoid 2 (ORM2) [49]; serpin peptidase inhibitor, clade A, member 3 (SERPINA3), also known as alpha-1-antichymotrypsin, which increases in the blood during the inflammatory response [50,51]; and inter-alpha-trypsin inhibitor heavy chain H4 (ITIH4), a type II acute-phase protein involved in the inflammatory response to trauma [52], were all negatively associated with plasma β-cryptoxanthin.

Table 5. Plasma proteins negatively associated with plasma log2 β-cryptoxanthin in 6-8 year old children of rural Nepal (n = 500). a
[Table body not recovered; surviving column headers: Gene Name, HUGO Gene Symbol.]
a Twenty proteins quantified by mass spectrometry and estimated by linear mixed effects (LME) modelling in > 10% of the samples that were negatively correlated with plasma log2 β-cryptoxanthin (p < 0.01, q < 0.10), listed in increasing order of q, defined as negatively associated protein biomarkers of a plasma β-cryptoxanthin proteome.
b n represents the number of child plasma samples in which a protein was detected and quantified by iTRAQ MS (excludes subsequent imputations required for multivariable LME models).
c b1 represents the percent change in plasma β-cryptoxanthin (in μmol/L) per 2-fold (100%) increase in protein relative abundance.
d GenInfo Identifier sequence number, as assigned to all nucleotide and protein sequences by the National Center for Biotechnology Information at the National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
Somewhat surprisingly, the proteome for β-carotene, to our knowledge the most metabolically active carotenoid in human tissue and a specific vitamin A precursor, was quite limited in size (n = 4) and overlapped with that of β-cryptoxanthin, with the exception of PKM.
Despite a modest, albeit universally detectable, concentration of β-cryptoxanthin in plasma, its proteome was far more extensive than that of β-carotene. Variation in carotenoid hydrophobicity [53] may offer one explanation for this difference. Being less hydrophobic than β-carotene, β-cryptoxanthin is more likely located on the lipoprotein surface than in the core, where β-carotene is transported, thus allowing more extensive interactions with circulating proteins than are possible with β-carotene. Second, as β-cryptoxanthin is known to be carried by HDL [54], it is notable that nearly half of the proteins found to be positively (ANTXR2, APOA1, APOA2, APOC1, APOC3, APOD, CLU, GPLD1, GSN, IGFALS, LUM, PCYOX1, PLTP, PON1 and RBP4) and negatively (CFB, C9, HP, ITIH4, LBP, ORM1, ORM2 and SERPINA3) associated with β-cryptoxanthin are known constituents of the HDL complex in human circulation [55].
Carotenoids exert their biological activity as antioxidants due to their extended conjugated carbon-carbon bonds [13]. The protective roles of carotenoids have been explored in blood plasma, where β-carotene, lutein, and zeaxanthin inhibited lipid peroxidation and hemoglobin oxidation but, surprisingly, lycopene and β-cryptoxanthin did not [56]. While an antioxidant function of β-cryptoxanthin has not been demonstrated in in vivo studies, we found it to be positively correlated with SEPP1, the major plasma carrier for selenium [57], an essential trace element that displays antioxidant activity by serving as an essential cofactor of glutathione peroxidase [58]. Plasma β-cryptoxanthin was also positively associated with PON1, an antioxidant/anti-inflammatory protein mostly synthesized by the liver and primarily associated with serum HDL [59]. To our knowledge, this is the first study demonstrating strong associations between antioxidant/anti-inflammatory PON1 and SEPP1 with plasma β-cryptoxanthin in a human population.
Plasma β-cryptoxanthin was positively correlated with relative abundance of IGFALS, IGFBP3, CNDP1, and cartilage oligomeric matrix protein (COMP), proteins that we have previously reported to be positively associated with child height and arm muscle mass in this population of school-aged Nepalese children [29], suggesting that β-cryptoxanthin nutriture, as reflected in plasma, is associated with general nutritional status, although mechanisms explaining this relationship remain unknown.
Lutein and zeaxanthin, measured together, were associated with an intermediate proteome of 11 proteins. While present in plasma, lutein and zeaxanthin are concentrated in the macula, the central region of the retina [60]. These macular carotenoids protect the retina from light-induced damage by filtering blue light [61]. Both lutein and zeaxanthin are effective antioxidants, like other major carotenoids found in human plasma [62]. Lutein has been shown to protect against inflammation by reducing the production of pro-inflammatory factors observed in retinal injury [63]. There was a positive correlation between plasma lutein/zeaxanthin and proteoglycan 4 (PRG4), a glycoprotein recently identified at the ocular surface, where it functions as a lubricant [64] and where its loss results in inflammation [65].
In summary, a plasma proteomics approach has revealed an extensive proteome that covaries with the plasma concentration of β-cryptoxanthin, despite its low circulating level in a generally undernourished rural population of Nepalese school-aged children. The number and diversity of plasma proteins associated with β-cryptoxanthin suggest involvement in vitamin A metabolism, lipid transport and immunoregulation. Moreover, our findings lead us to propose, for the first time, an in vivo antioxidant function of β-cryptoxanthin. Our findings suggest that plasma proteins could be measured in populations as surrogates for carotenoid intake or status, and help reveal protein:carotenoid functional relationships. More work is merited in this line of study to verify these novel findings and probe their implications for carotenoid assessment, metabolism, function and health.