Clinical review: Predictive value of neutrophil gelatinase-associated lipocalin for acute kidney injury in intensive care patients

Neutrophil gelatinase-associated lipocalin (NGAL) may be an early marker of acute kidney injury (AKI), but elevated NGAL occurs in a wide range of systemic diseases. Because intensive care patients have high levels of comorbidity, our objective was to conduct a systematic review of the literature to evaluate the value of plasma and urinary NGAL to predict AKI in these patients. We conducted a systematic electronic literature search of MEDLINE through PubMed, EMBASE, and Cochrane Library for all English language research publications evaluating the predictive value of plasma or urinary NGAL (or both) for AKI in adult intensive care patients. Two authors independently extracted data by using a standardized extraction sheet including study characteristics, type of NGAL measurements, and type of outcome measures. The primary summary measure was area under receiver operating characteristic curve (AuROC) for NGAL to predict study outcomes. Eleven studies with a total of 2,875 (range of 20 to 632) participants were included: seven studies assessed urinary NGAL and six assessed plasma NGAL. The included studies varied in design, including observation period from NGAL sampling to AKI follow-up (range of 12 hours to 7 days), definition of baseline creatinine value, and urinary NGAL quantification method (normalizing to urinary creatinine or absolute concentration). AuROC values for the prediction of AKI ranged from 0.54 to 0.98. Five studies reported AuROC for use of renal replacement therapy ranging from 0.73 to 0.89, and four studies reported AuROC for mortality ranging from 0.58 to 0.83. There were no differences in the predictive values of urinary and plasma NGAL. The heterogeneity in study design and results made it difficult to evaluate the value of NGAL to predict AKI in intensive care patients. NGAL seems to have reasonable value in predicting use of renal replacement therapy but not mortality.


Introduction
Acute kidney injury (AKI) is frequent in critically ill patients admitted to intensive care units (ICUs) and is independently associated with increased morbidity and mortality [1]. For many years, serum creatinine (sCr) has been the principal marker of AKI even though it is widely acknowledged that sCr is not reliable during acute changes in kidney function and varies with gender, age, muscle mass, dietary intake, and hydration status. sCr does not reflect real-time decline in glomerular filtration rate (GFR), because creatinine has to accumulate as a result of a decrease in GFR before increased concentrations are detectable. A real-time marker of AKI may allow the institution of earlier, and therefore more effective, renoprotective therapies; one such marker is neutrophil gelatinase-associated lipocalin (NGAL).
NGAL, also known as lipocalin-2 (lcn2), is a 25-kDa protein and member of the lipocalin superfamily [2]. It was named after its expression in neutrophils and found to have bacteriostatic effects by interfering with bacterial siderophore-mediated iron uptake [3]. NGAL expression has been shown to increase in response to inflammation in epithelial cells regularly exposed to microorganisms [2] and in response to cellular oxidative stress [4]. Increases in plasma NGAL have been reported in a wide range of systemic diseases, including acute infections, pancreatitis, heart failure, and cancer [5][6][7][8], but in recent years the potential role of plasma and urinary NGAL as early markers of AKI has been studied. A study in mice showed marked urinary NGAL increase within 2 hours of renal injury, by far preceding conventional markers of AKI [9]. In children undergoing elective cardiac surgery, plasma and urinary NGAL measurements at 2 hours after surgery were highly predictive of the development of AKI within 72 hours; area under receiver operating characteristic curves (AuROCs) were 0.91 and 0.99, respectively [10]. Differences between plasma and urinary NGAL kinetics are likely because of local synthesis and excretion of NGAL in the distal tubules of the nephron, further supported by a calculated fractional NGAL excretion of more than 100% [11].
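To make the fractional excretion argument concrete, the following is a minimal sketch of the generic fractional excretion calculation applied to NGAL. It assumes the standard clearance-ratio form (urine-to-plasma ratio of the marker divided by the urine-to-plasma ratio of creatinine); the exact method used in [11] is not described here, and the function name, parameter names, and example values are hypothetical.

```python
def fractional_excretion_percent(urine_marker, plasma_marker,
                                 urine_creatinine, plasma_creatinine):
    """Generic fractional excretion (%), assuming the clearance-ratio form.

    The marker concentrations must share one unit, and the two creatinine
    concentrations must share one unit; the units then cancel.
    """
    if min(urine_marker, plasma_marker, urine_creatinine, plasma_creatinine) <= 0:
        raise ValueError("all concentrations must be positive")
    marker_ratio = urine_marker / plasma_marker
    creatinine_ratio = urine_creatinine / plasma_creatinine
    return 100.0 * marker_ratio / creatinine_ratio


# Hypothetical values, for illustration only: a result above 100% means more
# NGAL appears in urine than filtration alone could deliver, which is
# consistent with local tubular synthesis.
fe_ngal = fractional_excretion_percent(
    urine_marker=3000.0, plasma_marker=200.0,      # NGAL, ng/mL
    urine_creatinine=10.0, plasma_creatinine=1.0,  # creatinine, mg/dL
)
print(f"fractional NGAL excretion: {fe_ngal:.0f}%")
```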
Patients admitted to the ICU have higher levels of comorbidity than other patient categories, possibly confounding the value of NGAL as a marker of AKI. This is supported by a study showing higher plasma and urinary NGAL levels in septic AKI versus non-septic AKI patients [12]. Also, the exact onset of a renal insult in intensive care patients is often less clear, and this further hampers the interpretation of elevated NGAL in these patients.
Therefore, the aim of this review was to systematically evaluate the predictive value of plasma and urinary NGAL measurement for AKI in ICU patients. Given the presumed confounding increase in measured NGAL during acute infections, we aimed to do a subgroup analysis in patients with sepsis.

Methods
This systematic review was conducted according to PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [13]. The following inclusion criteria were defined a priori: studies of adults in an ICU setting evaluating the value of urinary or plasma NGAL measurements (or both) as an early marker of AKI. Studies not performed primarily in an ICU setting (for example, surgery) and articles not in English were excluded.

Information sources and search strategy
We conducted an electronic search in MEDLINE through PubMed, EMBASE, and Cochrane Library in February 2012. The search was conducted with the following search string: (NGAL OR 'neutrophil gelatinase-associated lipocalin' OR lipocalin-2 OR lcn2) AND ('acute kidney injury' OR AKI OR 'acute renal injury' OR ARI OR 'acute renal failure' OR ARF OR 'acute kidney failure' OR AKF). Two authors (PH and MW) independently screened all articles for inclusion. Differences were discussed and resolved with a third party (AP). A supplemental search was conducted by screening citations of review articles and research papers to identify potential studies not captured by the search string.

Study selection and data extraction
For studies that fulfilled inclusion criteria, two authors (PH and MW) independently extracted data by using a standardized extraction sheet (Supplemental Digital Content 1). When differences in opinion occurred, they were resolved by discussion involving a third party (AP). Data were extracted from each included study on (a) study characteristics (including setting, inclusion and exclusion criteria, year of publication, and population size), (b) type of NGAL measurements (including plasma or urine or both, assay used, and timing and frequency of sampling), and (c) type of outcome measure (including AKI definition, observation period for AKI, use of renal replacement therapy (RRT), and mortality).

Assessment of methodological quality and risk of bias in individual studies
There are, to our knowledge, no specific guidelines on how to assess the methodological quality of individual studies of diagnostic markers. We used the following criteria to assess the risk of bias: (a) prospective study design and analysis, (b) a validated diagnostic scale defining AKI: Acute Kidney Injury Network (AKIN) or Risk, Injury, Failure, Loss, and End-stage kidney disease (RIFLE), (c) clearly described selection criteria for study participants, (d) sufficient description of NGAL measurements permitting replication of the study, and (e) any potential conflict of interests described.
Both the RIFLE [14] criteria and the AKIN [15] criteria were accepted as validated diagnostic scales as they have shown comparable predictive values for mortality and ICU length of stay in a large cohort of ICU patients [16]. For studies using sCr criteria only, AKI definition was determined by modified RIFLE/AKIN and still accepted as valid. Studies fulfilling four or five criteria were classified as studies having low risk of bias, and if three criteria were fulfilled, the studies were classified as having medium risk of bias. Otherwise, they were classified as studies having high risk of bias.
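The classification rule above can be sketched as a small function. This is an illustrative sketch only, not the extraction tool used in the review; the class and field names are hypothetical labels for criteria (a) to (e).

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class BiasCriteria:
    """One flag per criterion (a)-(e); field names are hypothetical."""
    prospective_design: bool               # (a)
    validated_aki_definition: bool         # (b) RIFLE/AKIN, incl. modified sCr-only
    clear_selection_criteria: bool         # (c)
    replicable_ngal_description: bool      # (d)
    conflicts_of_interest_described: bool  # (e)


def classify_risk_of_bias(criteria: BiasCriteria) -> str:
    """Four or five criteria fulfilled: low; three: medium; otherwise: high."""
    fulfilled = sum(astuple(criteria))
    if fulfilled >= 4:
        return "low"
    if fulfilled == 3:
        return "medium"
    return "high"


# Hypothetical study that meets every criterion except (e).
example = BiasCriteria(True, True, True, True, False)
print(classify_risk_of_bias(example))
```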

Summary measures
AuROC was the primary measure for the value of NGAL to predict AKI, RRT, and mortality. Furthermore, the AuROC of AKI stratified for severity was extracted if reported. If NGAL thresholds were reported, sensitivity, specificity, and positive and negative predictive values were extracted. If the study conducted a sensitivity analysis to exclude patients with reduced kidney function on entry, AuROC and method were extracted.
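For readers less familiar with these summary measures, the sketch below shows one common way to compute them. It assumes the rank-based (Mann-Whitney) interpretation of AuROC, in which the value is the probability that a randomly chosen patient with the outcome has a higher NGAL than a randomly chosen patient without it, and it treats values at or above the threshold as test-positive. The included studies may have used other estimators; function names and the NGAL values are hypothetical.

```python
def auroc(values_with_outcome, values_without_outcome):
    """Rank-based AuROC: P(case value > non-case value), ties count 0.5."""
    if not values_with_outcome or not values_without_outcome:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for case in values_with_outcome:
        for control in values_without_outcome:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(values_with_outcome) * len(values_without_outcome))


def threshold_metrics(values_with_outcome, values_without_outcome, threshold):
    """Sensitivity, specificity, PPV, and NPV for 'value >= threshold'."""
    tp = sum(v >= threshold for v in values_with_outcome)
    fn = len(values_with_outcome) - tp
    fp = sum(v >= threshold for v in values_without_outcome)
    tn = len(values_without_outcome) - fp
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "ppv": tp / (tp + fp) if tp + fp else nan,
        "npv": tn / (tn + fn) if tn + fn else nan,
    }


# Hypothetical NGAL concentrations (ng/mL), for illustration only.
ngal_aki = [150, 220, 310, 480]
ngal_no_aki = [60, 90, 150, 200]
print(auroc(ngal_aki, ngal_no_aki))
print(threshold_metrics(ngal_aki, ngal_no_aki, threshold=200))
```

An AuROC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.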

Meta-analysis and subgroup analysis
We planned to conduct a meta-analysis of the predictive value of plasma and urinary NGAL for AKI and a subgroup analysis of patients with sepsis.

Study trial flow
The search string produced a total of 1,041 potentially relevant articles. No additional studies were found in the supplemental hand search, but one unpublished study was identified through personal contact. Figure 1 shows the flowchart of study selection.

Study characteristics
We included 11 studies (Table 1); for further study characteristics, see Supplemental Digital Content. The studies were published between 2009 and 2011; one study was only in abstract form, and one was unpublished at the time of analysis. Nine studies were in general ICU patients, one was in ICU patients with multiple trauma, and one study was in patients with septic shock. The 11 studies included a total of 2,875 patients (range of 20 to 632).
Five studies measured urinary NGAL only, four measured plasma NGAL only, and two studies measured both plasma and urinary NGAL; that is a total of seven studies of urinary NGAL and six of plasma NGAL. All studies of plasma NGAL reported absolute concentration, whereas urinary NGAL studies varied: three reported NGAL in absolute concentration, and four reported NGAL normalized to urinary creatinine.
The included studies varied in kidney-specific exclusion criteria: some excluded patients with a history of chronic kidney disease, and others excluded patients based on admission sCr. AKI definition varied among studies: four used either RIFLE or AKIN creatinine and urine output criteria, four used either modified RIFLE or modified AKIN with creatinine criteria only, one used both AKIN and RIFLE criteria, and one reported two AuROC values using modified AKIN and modified RIFLE criteria, respectively. In one study AKI was not reported, but the use of RRT was. The included studies used different definitions of baseline creatinine: some used admission sCr, and others used sCr obtained prior to admission. When creatinine obtained prior to admission was used, 28% to 51% of patients had missing values in the studies that reported this. The handling of missing baseline values varied: some used admission creatinine, some used back-calculation with the Modification of Diet in Renal Disease (MDRD) formula, and one study used multiple imputation. Seven studies used NGAL samples at or close to admission to evaluate AuROC for AKI, whereas two studies reported that AuROC values were retrospectively based on NGAL samples taken at a fixed time point prior to AKI (12 hours and 2 to 3 days, respectively).
According to our predefined risk-of-bias assessment, nine studies had low and two studies medium risk of bias (Table 2). In all studies having use of RRT as an outcome measure, clinicians were blinded to NGAL concentrations, reducing the risk of NGAL values affecting the decision to use RRT.

Value of NGAL to predict study outcomes
Table 3 shows the results of the included studies' primary analyses of the value of NGAL to predict AKI, RRT, and mortality. The incidence of AKI across the studies ranged from 14% to 72% of patients included in the primary analyses. The follow-up time from the assessment of NGAL to AKI ranged from 12 hours to 1 week. The AuROC values for prediction of AKI ranged from 0.54 to 0.98 in all included studies and from 0.54 to 0.92 in the studies of general ICU patients. In five studies, AuROC for prediction of use of RRT was reported, and these values ranged from 0.73 to 0.89 (Table 3). In four studies, sensitivity analyses were conducted to exclude patients with pre-existing reduced kidney function (Table 4). In the remaining studies in which sensitivity analyses were not performed, kidney-specific characteristics of patients included in the primary analyses were presented.
Two studies conducted analyses stratifying for AKI severity. The results are presented in Table 5, showing an increase in AuROC values with increasing degree of AKI.
Our plans of conducting a meta-analysis and a subgroup analysis of patients with sepsis were aborted because of heterogeneity and lack of data, respectively; see the Discussion for further argumentation.

Discussion
The aim of this review was to systematically evaluate articles investigating the value of plasma or urinary NGAL (or both) to predict AKI in adult ICU patients. The results of the included studies varied greatly, as did those of studies in general ICU patients only. Put another way, the results ranged from a predictive value equivalent to flipping a coin to NGAL being an excellent early marker of AKI. A reason for these results may be the marked differences in study design. The observation period for AKI varied greatly between studies, but there was no clear association between length of observation period and AuROC: studies with observation periods of 5 days or more appeared to have higher AuROC values, whereas the two studies reporting AuROC values for more than one observation period using the same AKI definition showed a decline in AuROC with longer observation period. Also, kidney-specific characteristics of patients included in the calculation of AuROC varied, making direct comparison between studies less reliable; this probably contributed to the marked range in AKI incidence even among studies performed in general ICU populations. When RIFLE or AKIN criteria are used to define AKI, a baseline creatinine value for each included patient is required. The included studies used different definitions of baseline creatinine (see Supplemental Digital Content), and this may have caused one patient to be classified as having AKI in one study but not in another, even though the studies appeared to use the same criteria for AKI. Some studies excluded patients with reduced kidney function at inclusion, whereas others conducted a sensitivity analysis based on author-defined kidney impairment. We believe the latter approach to be more appropriate because it provides more information to clinicians and researchers. Moreover, the varying methods in the individual studies make it difficult to compare the results.
For the studies reporting the secondary outcomes, use of RRT and mortality, NGAL performed more homogeneously than for AKI. Studies reporting AuROC for both use of RRT and AKI showed that NGAL performed more consistently well in predicting use of RRT than in predicting AKI. This finding, combined with the finding that NGAL performed better with increasing severity of AKI, indicates that NGAL has greater potential as an early marker of severe AKI in the ICU. A possible explanation is that ICU populations have higher levels of comorbidity and thereby other sources of NGAL, but further studies are needed to confirm this hypothesis. For the five studies reporting the value of NGAL to predict mortality, NGAL performed homogeneously poorly with the exception of one study with a low mortality rate.
Two studies evaluated both plasma and urinary NGAL. The largest study of this review, with 632 participants, found no significant difference between AuROC for plasma and urinary NGAL; this is interesting given the presumed differences in metabolism of plasma and urinary NGAL.
As noted, some studies used urinary NGAL-creatinine ratios as opposed to absolute NGAL concentrations. A recent study showed significant increase in intra-individual variations in urinary NGAL when using absolute concentrations compared with concentrations normalized to creatinine [17]. Using NGAL concentrations normalized to creatinine, however, has also been criticized, especially during non-steady state as in AKI, in which urinary creatinine excretion rate changes over time and there is active tubular secretion of creatinine [18]. The recommendation given in the latter article was to use timed collections providing biomarker excretion rates. A comparison of the three methods of measuring NGAL was conducted by Endre and colleagues [19], who showed comparable AuROC values for NGAL to predict outcomes, though favoring normalizing to urinary creatinine. We aimed to conduct a meta-analysis, but given the variations in study design, we do not believe that meta-analyses would contribute useful data on the value of NGAL to predict AKI. As only one study was conducted exclusively in patients with sepsis and none of the other studies reported AuROC for this subgroup, the planned subgroup analysis was also aborted.
There are general limitations and challenges when conducting studies evaluating the value of NGAL to predict AKI. Firstly, the use of a creatinine-based AKI definition as the reference standard is problematic: creatinine is not an ideal marker of AKI, and this complicates the evaluation of potential early markers of AKI. A recent meta-analysis of cardiac surgical and ICU patients proposed that an NGAL increase in patients not fulfilling conventional AKI criteria may be a sign of subclinical AKI with significantly increased risk of need of RRT and not a false-positive test result as often reported [20]. Whether or not this theory applies when exclusively examining ICU patients with presumably more abundant sources of confounding NGAL is not known, but further studies are called for. Secondly, the handling of missing baseline creatinine values may confound results. The Acute Dialysis Quality Initiative (ADQI) recommends back-calculating baseline creatinine with the MDRD formula from an assumed GFR of 75 mL/minute per 1.73 m² [14], but controversy exists, resulting in great variations in the handling of missing baseline values. Thirdly, the observation period from NGAL sampling to AKI by conventional criteria poses a challenge. The longer the observation period, the higher the risk of including renal insults acquired after NGAL sampling. Conversely, if the time period is short, there is a risk of excluding late AKI.
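As an illustration of the back-calculation mentioned above, the sketch below inverts the four-variable MDRD equation to estimate a baseline sCr from an assumed GFR of 75 mL/minute per 1.73 m². It assumes the original (non-IDMS) coefficients: 186, exponents of -1.154 for sCr and -0.203 for age, and multipliers of 0.742 for female sex and 1.210 for black race. Laboratories using IDMS-traceable creatinine assays typically use a different leading coefficient, so the coefficient is left as a parameter. Function and parameter names are hypothetical.

```python
def mdrd_egfr(scr_mg_dl, age_years, female, black, coefficient=186.0):
    """Four-variable MDRD estimate of GFR in mL/minute per 1.73 m^2.

    Assumes the original (non-IDMS) coefficients; sCr is in mg/dL.
    """
    egfr = coefficient * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.210
    return egfr


def back_calculated_baseline_scr(age_years, female, black,
                                 assumed_egfr=75.0, coefficient=186.0):
    """Solve the MDRD equation for sCr (mg/dL) at an assumed GFR."""
    demographic_factor = coefficient * age_years ** -0.203
    if female:
        demographic_factor *= 0.742
    if black:
        demographic_factor *= 1.210
    # eGFR = factor * sCr^-1.154  =>  sCr = (eGFR / factor)^(-1 / 1.154)
    return (assumed_egfr / demographic_factor) ** (-1.0 / 1.154)


# Hypothetical patient with no pre-admission creatinine on record.
baseline = back_calculated_baseline_scr(age_years=65, female=True, black=False)
print(f"estimated baseline sCr: {baseline:.2f} mg/dL")
```

Because the estimate depends only on age, sex, and race, every patient with the same demographics receives the same baseline, which is one source of the controversy noted above.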
More studies are needed to further clarify the role of NGAL as an early marker of AKI in intensive care patients. Some form of consensus on study design is paramount in order to make comparison of results more meaningful. Studies on selected patient groups in the ICU would be desirable as general ICU patients constitute a heterogeneous population. We recommend calculating AuROC for AKI based on patients not fulfilling AKI criteria on entry and applying ADQI recommendations when a baseline creatinine value obtained prior to admission is missing. The proportion of patients with missing baseline creatinine should be stated, and a sensitivity analysis excluding these patients would be desirable. Given the characteristics of NGAL, an AKI observation period of approximately 3 to 5 days seems to be appropriate. The optimal quantification method of urinary NGAL has not yet been established, and we recommend reporting both absolute concentration and concentration normalized to creatinine and, if possible, the NGAL excretion rate.
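The three urinary NGAL quantification methods named in this recommendation differ only in simple arithmetic, sketched below. The units (NGAL in ng/mL, urinary creatinine in mg/mL, timed collection in mL and hours) and all function names and values are assumptions chosen for illustration; individual studies may report other units.

```python
def ngal_creatinine_ratio(ngal_ng_per_ml, urine_creatinine_mg_per_ml):
    """NGAL normalized to urinary creatinine, in ng NGAL per mg creatinine."""
    if urine_creatinine_mg_per_ml <= 0:
        raise ValueError("urinary creatinine must be positive")
    return ngal_ng_per_ml / urine_creatinine_mg_per_ml


def ngal_excretion_rate(ngal_ng_per_ml, urine_volume_ml, collection_hours):
    """NGAL excretion rate from a timed urine collection, in ng per hour."""
    if collection_hours <= 0:
        raise ValueError("collection time must be positive")
    return ngal_ng_per_ml * urine_volume_ml / collection_hours


# Hypothetical sample: the same absolute concentration reported three ways.
absolute = 200.0  # ng/mL
print("absolute concentration (ng/mL):", absolute)
print("normalized (ng/mg creatinine):", ngal_creatinine_ratio(absolute, 0.5))
print("excretion rate (ng/h):", ngal_excretion_rate(absolute, 300.0, 4.0))
```

Reporting all three from the same sample lets later reviews compare studies regardless of which convention each one chose.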

Conclusions
This systematic review has shown that studies evaluating plasma and urinary NGAL as early markers of AKI in ICU patients showed great heterogeneity in design and results. The results varied from NGAL being virtually useless to NGAL being an excellent early marker of AKI. The results for the secondary outcome measures, use of RRT and mortality, were more homogeneous, with NGAL being a reasonable predictor of use of RRT. In contrast, NGAL appeared to be a poor predictor of mortality.

Key messages
• The results of the value of plasma and urinary NGAL in predicting AKI in intensive care patients varied, and so NGAL cannot at present be recommended as a marker of AKI in the ICU.
• Differences in study design, including observation period for AKI, NGAL quantification method, and kidney-specific patient characteristics, made comparison across studies less reliable and led to the aborting of a planned meta-analysis.
• Studies investigating the value of NGAL in predicting the use of renal replacement therapy showed homogeneously reasonable predictive value of NGAL.

Area under receiver operating characteristic curve presented as ± 2 × standard errors or (95% confidence intervals). AKIN, Acute Kidney Injury Network; RIFLE, Risk, Injury, Failure, Loss, and End-stage kidney disease.