An improved approach to report creatinine-corrected analyte concentrations in urine

Abstract
Traditionally, urinary analyte concentrations (UACObs) are divided by the observed urine creatinine (UCRObs) concentrations to correct for hydration. However, this method ignores the variability in urine creatinine levels due to factors such as age, gender, and race/ethnicity. Consequently, a method was developed to derive a correction factor that adjusts for most, if not all, of the factors that may affect urine creatinine concentrations. This correction factor is applied to UCRObs to determine UCRCorr, which can then be used in place of UCRObs to compute the modified creatinine-corrected analyte concentration as UACObs/UCRCorr instead of UACObs/UCRObs. For this study, urine creatinine data from the National Health and Nutrition Examination Survey (NHANES) for 2007–2010 were used to develop this correction factor to account for variability in urine creatinine due to age, race/ethnicity, gender, and body mass index. The correction factor β and its standard error (SE) were computed for each of the 64 age-race/ethnicity-gender categories. To compute the creatinine-corrected analyte concentration, the observed analyte concentration was divided by the corrected value of observed urine creatinine, where the corrected value of urine creatinine was the observed value minus the correction factor on the log10 scale. The correction factor for each participant was a random number drawn from the normal distribution with mean β and standard deviation SE. The proposed methodology was applied to the 2009–2010 NHANES data for urinary 3-phenoxybenzoic acid, the 2013–2014 NHANES data for urinary cadmium and lead, and the 2011–2012 NHANES data for urinary perchlorate, nitrate, and thiocyanate.


Literature review and statement of the problem
Analyte concentrations in urine are often reported as creatinine-corrected concentrations. If the observed analyte concentration is UAC Obs and the observed urine creatinine concentration is UCR Obs, then the creatinine-corrected analyte concentration, UAC Corr1, is reported as UAC Obs/UCR Obs. Since UCR Obs is most often measured in spot urine samples rather than 24-h urine samples, hydration correction becomes necessary. Reporting UAC Corr1 rather than UAC Obs is intended to adjust for hydration. This way of reporting UACs implicitly assumes that UCR Obs is affected by urinary dilution only. Barr et al. (2005) used data from the National Health and Nutrition Examination Survey (NHANES, www.cdc.gov/nchs/nhanes.htm) for the years 1988-1994 and showed that age, gender, race/ethnicity, and body mass index (BMI) also affect UCR Obs. For example, Barr et al. (2005) showed non-Hispanic blacks (NHB) to have higher mean UCR Obs than non-Hispanic whites (165.4 vs. 124.6 mg/dL) and females to have lower mean UCR Obs than males (113.5 vs. 148.3 mg/dL). In addition, children aged 6-11 years and senior citizens aged ≥70 years were shown to have the lowest levels of UCR Obs (102.1 and 97.99 mg/dL, respectively) and those aged 12-19 and 20-29 years were shown to have the highest levels (161.5 and 161.8 mg/dL, respectively); UCR Obs was higher for samples collected in the morning than for samples collected in the evening; for a unit increase in BMI, UCR Obs was found to increase by 1.3 g/dL; persons with diabetes had lower UCR Obs than persons without diabetes; and kidney function was also shown to affect UCR Obs (Barr et al., 2005). However, Stiegel, Pleil, Sobus, Angrish, and Morgan (2015) found poor correlation between a kidney injury panel and UCR Obs. In kidney stone patients with Type II diabetes, HbA1c was found to be correlated with UCR Obs (Fram, Moazami, & Stern, 2015).
Decreased UCR Obs was found to be associated with sleep deprivation (Giskeodegard, Davies, Revell, Keun, & Skene, 2015).
Consequently, the values of UCR Obs must be corrected for the effect of factors other than urinary dilution before computing creatinine-corrected analyte concentrations. If the corrected value of urinary creatinine is denoted as UCR Corr , then "true" creatinine corrected analyte concentration denoted as UAC Corr2 should be computed as UAC Obs /UCR Corr . In order to account for the effect of all factors that affect UCR Obs , Barr et al. (2005) recommended using unadjusted UAC Obs in the regression models as dependent variable with UCR Obs used as one of the independent variables. This author fully supports this recommendation. When UCR Obs is used as one of the independent variables in statistical models, UAC Obs continues to be reported in per unit volume of the urine, for example, ng/mL or μg/L. However, data on differences in UCR Obs by age, gender, and race/ethnicity as provided by Barr et al. (2005) can still be used to compute UAC Corr2 as will be seen in this communication. Recently, O'Brien, Upson, Cook, and Weinberg (2015) proposed a two-stage model to adjust for the effect of factors other than dilution on UCR Obs . It would be of interest to compare the performance of single-stage adjustment model as proposed by Barr et al. (2005) and two-stage adjustment model as proposed by O'Brien et al. (2015). However, this may be a topic for future research and beyond the scope of this study as described in the next section.

Study objectives and proposed methodology
The sole objective of this study was to evaluate how the traditional method of computing UAC, i.e. UAC Corr1, performs compared with the corrected method, i.e. UAC Corr2, for a selected number of urinary analytes. The correction factor needed to convert log10-transformed values of UCR Obs, or log10(UCR Obs), to log10-transformed values of UCR Corr, or log10(UCR Corr), will be determined by fitting a regression model with log10(UCR Obs) as the dependent variable and age, race/ethnicity, gender, and BMI as the independent variables. The regression slope β and its standard error SE for each of the 64 combinations of age, race/ethnicity, and gender, presented as an Excel table, will provide the correction factor needed to convert log10(UCR Obs) to log10(UCR Corr) for each of these 64 demographic groups. The data presented in this table can be used in practical clinical situations where UCR Obs and UAC Obs are available but UAC Corr2 is needed. A large data-set on UCR from NHANES for 2007-2010 will be used to fit the proposed model. The applicability of the correction factors developed from the 2007-2010 model will be tested on NHANES data for 2011-2012 and 2013-2014.

Materials and methods
All data used for this study are available in the public domain from NHANES and were collected with the necessary approvals of the Institutional Review Boards of the National Center for Health Statistics and the Centers for Disease Control and Prevention.

Urine creatinine database
Data from NHANES (www.cdc.gov/nchs/nhanes.htm) from demographic, urine creatinine (UCR), and body measure files for those aged ≥6 years for the period 2007-2014 were downloaded and match merged. The sampling plan for NHANES is a complex, stratified, multistage, probability cluster design intended to be representative of the civilian, non-institutionalized U.S. population. Sampling weights are created in NHANES to account for the complex survey design, including oversampling, survey non-response, and post-stratification. A total of 31,964 participants with non-missing values of UCR were available for analysis. For the purpose of this study, the overall database for 2007-2014 was split into three databases, namely, data for 2007-2010, 2011-2012, and 2013-2014. Detailed sample sizes are given in Table 1. All data analyses completed for this study incorporated sampling weights as well as survey design characteristics, namely, stratification and clustering.

Databases for urine cadmium and lead; urine perchlorate, nitrate, and thiocyanate; and urine 3-phenoxybenzoic acid
In order to generate a database for 3-phenoxybenzoic acid (3-PBA), data from NHANES for 2009-2010 from demographic, body measures, and pyrethroids, herbicide, and organophosphate metabolite files were downloaded and match merged by the ID for each participant labeled as SEQN in NHANES data files. A total of 2,703 participants aged ≥6 years with non-missing values of 3-PBA were available for analysis. Details are given in Table 2. Percent observations at or above the limit of detection (LOD) for 3-PBA were 73.4%.
In order to generate a database for urinary cadmium (UCD) and lead (UPB), data from NHANES for 2013-2014 from demographic, body measures, and urinary metal files were downloaded and match merged by the ID for each participant labeled as SEQN in NHANES data files. A total of 2,681 participants aged ≥6 years for UCD and UPB were available for analysis. Details are given in Table 2. Percent observations at or above LOD for UCD were 89.3% and 97.2% for UPB.
In order to generate a database for urinary perchlorate (UPC8), nitrate (UNO3), and thiocyanate (UTHIO), data from NHANES for 2011-2012 from demographic, body measures, and UPC8, UNO3, and UTHIO files were downloaded and match merged by the ID for each participant labeled as SEQN in NHANES data files. A total of 2,506 participants aged ≥6 years were available for analysis. Details are given in Table 2. Percent observations at or above LOD for UPC8, UNO3, and UTHIO were 100%, 99.7%, and 99.9%, respectively. All values below the LOD were imputed as LOD/√2.
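The match merge on SEQN and the LOD/√2 imputation described above can be sketched as follows. This is a minimal illustration in Python rather than the SAS code used for the study; the record layout, the column names other than SEQN, the toy values, and the LOD value are all assumptions made for the example.

```python
import math

def match_merge(*tables):
    """Inner-join lists of records on the participant ID 'SEQN'."""
    # Index each table by SEQN, then keep only IDs present in every table.
    indexed = [{row["SEQN"]: row for row in table} for table in tables]
    common = set(indexed[0]).intersection(*indexed[1:])
    merged = []
    for seqn in sorted(common):
        record = {}
        for index in indexed:
            record.update(index[seqn])
        merged.append(record)
    return merged

def impute_below_lod(value, lod):
    """Replace a result below the LOD with LOD / sqrt(2)."""
    return lod / math.sqrt(2) if value < lod else value

# Hypothetical toy records; these are not NHANES values.
demo = [{"SEQN": 1, "age": 34}, {"SEQN": 2, "age": 9}, {"SEQN": 3, "age": 71}]
metals = [{"SEQN": 1, "ucd": 0.21}, {"SEQN": 3, "ucd": 0.01}]

merged = match_merge(demo, metals)
LOD_UCD = 0.036  # assumed LOD, for illustration only
for rec in merged:
    rec["ucd"] = impute_below_lod(rec["ucd"], LOD_UCD)
```

Participants missing from any one file drop out of the merged set, which is why the analytic sample sizes differ from file to file.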

Outcome variables
Since the distribution of UCR Obs was found to be positively skewed (skewness = 1.1, see Table 3), log10-transformed values of UCR Obs were used as the outcome/dependent variable for the regression model fitted to predict the values of UCR. Log10-transformed values of 3-PBA, UCD, UPB, UPC8, UNO3, and UTHIO were used to compute geometric means for these six analytes by both the traditional and the modified methods of computing creatinine-corrected urinary analyte concentrations.
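As an illustration of how a geometric mean follows from log10-transformed values, the sketch below computes a weighted geometric mean in Python. It is a simplified point estimate only: the weights stand in for NHANES sampling weights, the values are made up, and the stratification and clustering needed for valid standard errors are not handled here.

```python
import math

def weighted_geometric_mean(values, weights):
    """Weighted geometric mean computed on the log10 scale."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    # Weighted mean of log10 values, then back-transform.
    mean_log = sum(w * math.log10(v) for v, w in zip(values, weights)) / total_weight
    return 10 ** mean_log

# Hypothetical concentrations and weights chosen so the result is easy to check.
ucr = [10.0, 100.0, 1000.0]
weights = [1.0, 2.0, 1.0]
gm = weighted_geometric_mean(ucr, weights)
```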

Statistical analysis
All data were analyzed using SAS University Edition (www.sas.com). Specifically, Proc SURVEYREG was used to compute unadjusted geometric means (UGM). Pairwise comparisons to evaluate statistical differences between UGMs were done using the t-test. Pairwise differences between UGMs were considered statistically significant if p < 0.05.

Analysis of urine creatinine data
First, a regression model with log10(UCR Obs) as the dependent variable and age, gender, race/ethnicity, and BMI as independent variables was fitted to the NHANES 2007-2010 data. The regression slopes (β) and their standard errors (SE) were computed for each of the 64 categories formed by 2 genders, 4 race/ethnicities, and 8 age categories. Values of log10(UCR Corr) were computed for each participant in each of the 64 categories by subtracting a randomly drawn normal variate N(β_i, SE_i^2) for the ith category from log10(UCR Obs). Table 4 provides β and SE for each of these 64 categories. Next, a regression model with log10(UCR Corr) as the dependent variable and age, gender, race/ethnicity, and body mass index as the independent variables was fitted to the NHANES 2007-2010 data. If the procedure of deriving log10(UCR Corr) from log10(UCR Obs) was a success, then in the model with log10(UCR Corr) as the dependent variable, the effects of age, gender, and race/ethnicity should no longer be statistically significant. These results are provided in Table 5. The adequacy of the method used to derive log10(UCR Corr) from log10(UCR Obs) was further tested by applying the procedure to NHANES data for 2011-2012 and 2013-2014. These results are provided in Table 6. Results for the urinary analytes are provided in Table 7 for 3-PBA, in Table 8 for UCD and UPB, and in Table 9 for UPC8, UNO3, and UTHIO.
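The per-participant correction step can be sketched in Python as below. The β and SE values shown are hypothetical placeholders, not the Table 4 estimates; the category keys, the function names, and the choice of zero for the reference category are assumptions made for illustration, and unit conversion between the analyte and creatinine is omitted.

```python
import math
import random

# Hypothetical (beta, SE) values keyed by (gender, race/ethnicity, age group).
# These are NOT the Table 4 estimates; they only illustrate the lookup.
CORRECTION = {
    ("F", "OTH", "70+"): (0.0, 0.0),      # assumed reference category
    ("M", "NHB", "20-29"): (0.30, 0.02),
}

def corrected_log_ucr(ucr_obs, category, rng):
    """log10(UCR Corr) = log10(UCR Obs) minus a draw from N(beta, SE^2)."""
    beta, se = CORRECTION[category]
    return math.log10(ucr_obs) - rng.gauss(beta, se)

def uac_corr2(uac_obs, ucr_obs, category, rng):
    """Analyte concentration divided by the corrected urine creatinine."""
    ucr_corr = 10 ** corrected_log_ucr(ucr_obs, category, rng)
    return uac_obs / ucr_corr

rng = random.Random(2016)  # fixed seed so the draw is reproducible
traditional = 2.0 / 150.0  # UAC Corr1 = UAC Obs / UCR Obs
modified = uac_corr2(2.0, 150.0, ("M", "NHB", "20-29"), rng)
```

Because the draw is random, repeated runs without a fixed seed give slightly different corrected values for the same participant; fixing the seed makes an analysis reproducible.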

Urine creatinine statistics
When the distribution of an analyte is positively skewed, the mean of the distribution is supposed to be substantially higher than its geometric mean (GM) and that is exactly what was observed for the distribution of UCR Obs (Table 3). Irrespective of age, gender, and race/ethnicity, means were generally higher than GM by about 20-30%. For example, for females, while mean was 106.1 mg/dL, the GM was 82.3 mg/dL (Table 3) for a difference of about 29%.
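The percentage quoted for females can be reproduced with a short calculation. The helper name below is an assumption; the two input values are the ones reported in the text.

```python
def percent_higher(larger, smaller):
    """Percent by which `larger` exceeds `smaller`."""
    return 100.0 * (larger - smaller) / smaller

# Mean and geometric mean of UCR Obs for females, as quoted from Table 3.
female_gap = percent_higher(106.1, 82.3)
```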

Adequacy of fitted models for UCR Obs
As expected, neither the gender-age-race/ethnicity categories nor gender, age, and race/ethnicity remained statistically significant when the models were fitted to the modified values of log10(UCR Obs), i.e. log10(UCR Corr), for the 2007-2010 data. Also as expected, R^2 decreased from about 15% to about 3% (Table 5). This is explained further in the Discussion section. However, when the models for log10(UCR Corr) were fitted to the 2011-2012 and 2013-2014 data, the effects of gender and race/ethnicity remained statistically insignificant but the effect of age became statistically significant (Table 6).

Statistics for 3-PBA
UGMs for 3-PBA based on UCR Obs and UCR Corr are presented in Table 7. UGMs based on UCR Corr were higher than those based on UCR Obs irrespective of age, gender, and race/ethnicity. Males had lower UGMs than females (p < 0.01) based on UCR Obs but these differences were not observed for UGMs based on UCR Corr (Table 7). Similarly, based on UCR Obs, UGMs for NHW were higher than for NHB (p = 0.03) but these differences disappeared for UGMs based on UCR Corr (Table 7).

Statistics for UCD
UGMs for UCD based on UCR obs and UCR Corr are presented in Table 8. UGMs based on UCR Corr were higher than those based on UCR obs irrespective of age, gender, and race/ethnicity. However, the magnitude by which UGMs for UCD Corr2 was higher than UGMs for UCD Corr1 varied by gender, race/ethnicity, and age. For example, for females, UGM for UCD Corr2 was 0.210 ng/mg creatinine and UGM for UCD Corr1 was 0.174 ng/mg creatinine or a difference of about 21%. For those aged 12-19 years, UGM for UCD Corr2 was 0.11 ng/mg creatinine and UGM for UCD Corr1 was 0.058 ng/mg creatinine or a difference of about 90%. Males had lower UGM for UCD Corr1 than females (p < 0.01, Table 8) but UGMs between males and females for UCD Corr2 were not statistically significantly different (Table 8). NHW had lower UGMs for UCD Corr2 than NHB (p < 0.01) but these differences were not observed between the UGMs based on UCD Corr1 .

Statistics for UPB
UGMs for UPB based on UCR Obs and UCR Corr are presented in Table 8. UGMs based on UCR Corr were higher than those based on UCR Obs irrespective of age, gender, and race/ethnicity. Statistically significant differences in UGMs between males and females were not observed for UPB Corr1 but males had higher UGM than females for UPB Corr2 (p < 0.01, Table 8). The order of UGMs by race/ethnicity was OTH > NHW > HISP > NHB for UPB Corr1 but NHB > OTH > HISP > NHW for UPB Corr2 (Table 8). While NHW had higher UGM for UPB Corr1 than NHB (p = 0.02), the reverse was observed for UPB Corr2 (p < 0.01, Table 8).

Statistics for UPC8
UGMs for UPC8 Corr2 were consistently higher than UGMs for UPC8 Corr1 . However, the magnitude of differences between UGMs for UPC8 Corr1 and UPC8 Corr2 varied with age, gender, and race/ethnicity. For example, UGMs for A12 were 2.699 and 5.205 ng/mg creatinine for UPC8 Corr1 and UPC8 Corr2 , respectively, for a difference of about 93%. For OTH, UGMs for A12 were 3.593 and 4.790 ng/mg creatinine for UPC8 Corr1 and UPC8 Corr2 , respectively (Table 9), for a difference of about 22%. While UGMs for A12 were statistically lower than for A20+ (p < 0.01) for UPC8 Corr1 , these differences were not observed for UPC8 Corr2. The order of UGMs by race/ethnicity for UPC8 Corr1 was OTH > NHW > HISP > NHB but for UPC8 Corr2 , the order was HISP > NHW > OTH > NHB (Table 9).

Statistics for UTHIO
UGMs for UTHIO Corr2 were consistently higher than UGMs for UTHIO Corr1. However, the magnitude of differences between UGMs for UTHIO Corr1 and UTHIO Corr2 varied with age, gender, and race/ethnicity. For example, UGMs for A6 for UTHIO Corr1 and UTHIO Corr2 were 1.277 and 1.671 μg/mg creatinine, respectively (Table 9), for a difference of about 31%. On the other hand, UGMs for NHB for UTHIO Corr1 and UTHIO Corr2 were 1.0 and 1.887 μg/mg creatinine, respectively (Table 9), for a difference of about 89%. While NHW had statistically significantly higher UTHIO Corr1 than NHB (p < 0.01, Table 9), these differences were not found to be statistically significant for UTHIO Corr2. Conversely, while NHB had statistically significantly higher UTHIO Corr2 than HISP (p < 0.01, Table 9), these differences were not found to be statistically significant for UTHIO Corr1.

Discussion
Traditional methods to compute creatinine-corrected analyte concentrations in urine ignore variability in the observed levels of urine creatinine due to factors other than hydration. In this paper, a modified method to compute creatinine-corrected analyte concentrations that also adjusts for variability in the observed urine creatinine measurements due to age, gender, race/ethnicity, and BMI was presented. The correction factors, given as regression slopes (β) with their standard errors (SE), that need to be applied to the observed values of urine creatinine before using them in the denominator to compute creatinine-corrected analyte concentrations were presented for 64 combinations of 2 genders, 8 age groups, and 4 racial/ethnic groups in Table 4. For an individual located in one of the 64 age-race/ethnicity-gender categories, a random number from a normal distribution, N(β, SE^2), should be subtracted from the log10-transformed observed value of urine creatinine before the creatinine-corrected analyte concentration is computed for that individual. The Analysis ToolPak freely available in Excel can be easily used to generate this random number by providing β as the mean and SE as the standard deviation.
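Written out, the subtraction on the log10 scale implies a multiplicative relationship between the two creatinine-corrected concentrations. The short derivation below uses c_i as an assumed symbol for the random draw for an individual in category i; it is a restatement of the procedure described above, not an additional result.

```latex
\begin{align*}
c_i &\sim N(\beta_i,\ \mathrm{SE}_i^2) \\
\log_{10}(\mathrm{UCR}_{\mathrm{Corr}}) &= \log_{10}(\mathrm{UCR}_{\mathrm{Obs}}) - c_i
  \;\Longrightarrow\;
  \mathrm{UCR}_{\mathrm{Corr}} = \mathrm{UCR}_{\mathrm{Obs}} \cdot 10^{-c_i} \\
\mathrm{UAC}_{\mathrm{Corr2}} &= \frac{\mathrm{UAC}_{\mathrm{Obs}}}{\mathrm{UCR}_{\mathrm{Corr}}}
  = \frac{\mathrm{UAC}_{\mathrm{Obs}}}{\mathrm{UCR}_{\mathrm{Obs}}} \cdot 10^{c_i}
  = \mathrm{UAC}_{\mathrm{Corr1}} \cdot 10^{c_i}
\end{align*}
```

A positive draw therefore raises the corrected concentration relative to the traditional one; for instance, c_i = 0.1 corresponds to a factor of 10^0.1 ≈ 1.26.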

Urine creatinine levels
The order of observed urine creatinine means and geometric means (Table 3) by age, gender, and race/ethnicity in this study was the same as reported by Barr et al. (2005). However, for every age, gender, and race/ethnic category, means reported by Barr et al. (2005) were higher than those observed in this study. For example, while the mean reported by Barr et al. (2005) for NHW was 124.6 mg/dL, the mean observed in this study was 115.3 mg/dL, a difference of 9.3 mg/dL. For those aged 12-19 years, the mean reported by Barr et al. (2005) was 161.5 mg/dL and the mean observed in this study was 147.2 mg/dL, a difference of 14.3 mg/dL. On the other hand, for those aged 40-49 years, the mean reported by Barr et al. (2005) was 124.6 mg/dL and the mean observed in this study was 122.5 mg/dL, a minimal difference. The data reported by Barr et al. (2005) were for the years 1988-1994 and the data reported for this study were for the years 2007-2010. It is possible that the levels of UCR have decreased over time. More work will be needed to confirm this observation and explain the factors that may be responsible for decreasing time trends in the observed levels of UCR.

Adequacy of the model fitted for UCR Corr
The sole purpose of fitting a model for UCR Corr was to remove the variability in UCR Obs that can be attributed to gender, age, and race/ethnicity. If the fit for the model for UCR Corr was a success, estimated model effects for gender, age, and race/ethnicity should not be statistically significant. And, in fact, this is what was observed (Table 5). In the model fitted for UCR Obs , estimated correction factors to be applicable to UCR Obs were based on 64 combinations of age, gender, and race/ethnicity with the combination representing OTH females aged ≥70 being used as the reference category. It is certainly possible to use different numbers (lower or higher) of the combinations of age, gender, and race/ethnicity with a different combination of age, race/ethnicity, and gender, for example, NHB males aged 20-29 years as the reference category. It should not make a major difference but it is unknown in what way, this could have affected the estimated correction factors and the final model fit. Since the sole purpose of fitting a model for UCR Corr was to remove the variability attributable to gender, age, and race/ethnicity, R 2 for the model for UCR Corr should be expected to be smaller than the R 2 for the model for UCR Obs and that is exactly what was observed (Table 5).
There is always a concern that a model fitted for one data-set may not perform well when used for a different data-set. In order to address that concern, correction factors estimated by fitting model for UCR Obs for NHANES 2007-2010 data were used to fit models for UCR Corr for both NHANES 2011-2012 and 2013-2014 data-sets. While model effects remained statistically insignificant for both gender and race/ethnicity for models for both 2011-2012 and 2013-2014 data-sets, model effect for age was observed to be statistically significant for the models for both 2011-2012 and 2013-2014 data (Table 6). However, out of a total of 28 possible pairwise combinations of eight age groups, only four pairwise comparisons for 2013-2014 data-set and three pairwise comparisons for 2011-2012 data-set were found to be statistically significant.
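The count of 28 pairwise comparisons follows from choosing 2 of the 8 age groups. A small Python check is given below; the age-group labels are assumed for illustration, since only some of the eight groups are named explicitly in the text.

```python
from itertools import combinations

# Assumed labels for the eight age groups; only the count matters here.
AGE_GROUPS = ["6-11", "12-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"]

# Every unordered pair of distinct age groups.
pairs = list(combinations(AGE_GROUPS, 2))
n_pairs = len(pairs)

# Share of pairwise comparisons reported as significant in each validation set.
share_2013_2014 = 4 / n_pairs
share_2011_2012 = 3 / n_pairs
```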

Urinary creatinine corrected analyte concentrations -two alternate approaches
In order to compare the mean or geometric mean values of UAC Corr1 and UAC Corr2, it is necessary to understand the factors and the direction of effect they may have on both the numerators and denominators used in computing UAC Corr1 and UAC Corr2. As has been shown in this study as well as by Barr et al. (2005), NHB had higher levels of UCR than NHW. As such, in order to neutralize the effect of race/ethnicity on UCR Obs, UCR Obs will need to be adjusted downwards for NHB and upwards for NHW, that is, UCR Corr < UCR Obs for NHB and UCR Corr > UCR Obs for NHW. If race/ethnicity did not affect UAC Obs, then UAC Corr1 < UAC Corr2 for NHB and UAC Corr1 > UAC Corr2 for NHW. If race/ethnicity does affect both UCR Obs and UAC Obs, then the difference in mean or geometric mean values of UAC Corr1 and UAC Corr2 may be small or large, positive or negative. Consequently, small differences, if observed between UAC Corr1 and UAC Corr2, should not be of concern, nor should it be concluded that adjustment in the values of UCR Obs is of no significance. Emphasis should be placed on appropriate analytical methodology, and research has shown that the values of UCR Obs, in addition to urinary dilution, are also affected by age, gender, race/ethnicity, BMI, and possibly other factors. It should also be remembered that there are multiple factors, possibly acting in opposite directions, which affect UCR Obs. For example, for NHB children 6-11 years old, UCR Obs needs to be adjusted downward because of NHB race/ethnicity but upwards because of age. In order to assess the adequacy of analyte estimates based on UCR Corr, it would be unwise to rely on comparisons between the analyte levels based on UCR Obs and UCR Corr, as alluded to above.
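A small numerical sketch can show why group comparisons may change once creatinine is corrected. Assuming, for simplicity, that every member of a group receives the same correction β on the log10 scale (the random SE component is ignored), the group geometric mean of UAC Corr2 equals that of UAC Corr1 multiplied by 10^β. The group labels, geometric means, and β values below are hypothetical.

```python
def corr2_from_corr1(gm_corr1, beta):
    """Geometric mean of UAC Corr2 given that of UAC Corr1 and a fixed beta."""
    return gm_corr1 * 10 ** beta

# Hypothetical groups: A starts higher under the traditional method,
# but B receives the larger correction.
gm_corr1_a, beta_a = 1.00, 0.10
gm_corr1_b, beta_b = 0.90, 0.20

gm_corr2_a = corr2_from_corr1(gm_corr1_a, beta_a)
gm_corr2_b = corr2_from_corr1(gm_corr1_b, beta_b)

# The ranking of the two groups reverses after correction.
order_flipped = (gm_corr1_a > gm_corr1_b) and (gm_corr2_a < gm_corr2_b)
```

When two groups differ in β, the ratio of their geometric means shifts by 10 raised to the difference in β, so an ordering or a statistically significant difference seen under one method need not carry over to the other.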
Pairwise analyte differences based on the use of UCR Obs and UCR Corr can switch from being (i) statistically significant to statistically insignificant as was seen for male-female differences for 3-PBA (Table 7) and for A12-A20+ differences for UPC8, (ii) statistically insignificant to statistically significant as was seen for NHW-NHB differences for UCD (Table 8), and (iii) statistically