Introduction

Parenteral nutrition (PN) has been in common use since the 1960s [1] and is accepted as the standard of care for patients with chronic non-functioning gastrointestinal tracts [2, 3]. The appropriate use of PN in the intensive care unit (ICU), however, is somewhat more controversial [4, 5].

A recent survey revealed that, depending on country of admission, 19–71% of patients received PN as their sole source of nutritional support at some time during their ICU stay [6]. The promotion of high-quality evidence has been shown to result in a more consistent approach to the provision of nutritional support, which led to improved patient outcomes [7].

Although the results obtained from well conducted level I studies often disagree with the findings of meta-analyses based on level II trials [8], rigorously conducted systematic reviews can be used to support clinical decision making [9]. The strength of the conclusions reached by a systematic review, however, is intimately related to the quality of the individual trials included in it [10]. The importance of considering individual trial quality can be illustrated by contrasting the results of two recent meta-analyses.

The Cochrane Injuries Group Albumin Reviewers’ [11] meta-analysis reported that treatment with human serum albumin increased the risk of death compared with either crystalloids or no treatment. This meta-analysis included numerous pseudo-randomised trials, which are known to be subject to allocation bias [12]. A subsequent meta-analysis that specifically addressed individual components of trial methodological quality found that when only high-quality trials were considered, there was no significant effect of albumin use on overall mortality [13]. The inclusion of trials of low methodological quality is known to have a substantial impact on the conclusions reached by meta-analyses [14].

Previous systematic reviews of nutritional support interventions [15, 16, 17, 18] have identified high-quality studies using composite methodological quality scales [19], which combine different dimensions of trial quality into an overall summary score. Most composite scales are known to have used weak methodology when selecting items for inclusion and ranking them for relative importance [20].

The composite scale selected to reflect overall trial quality can dramatically influence the conclusions of a meta-analysis [21]. Because composite scales may mask important differences in true methodological quality, the use of a ‘component approach’ has been recommended [10, 22]. A component approach assesses key methodological dimensions individually, without the calculation of a summary score [14].

The purpose of this meta-analysis was to use a component approach to investigate the effect of methodological quality on the overall conclusions reached when trials comparing the use of standard PN to standard enteral nutrition (EN) in critically ill patients were aggregated.

Materials and Methods

Literature search

Medline (http://www.PubMed.org) and EMBASE (http://www.Ovid.com) were cross-searched using sensitive (broad) search statements [23] customised to each engine to detect all controlled trials, overviews and evidence-based guidelines of primary feeding interventions. Reference lists of overviews and evidence-based guidelines were hand searched. Experts and industry representatives were contacted to ensure that trials were not missed. The final closeout date for the search process was 30 April 2003.

Study selection

All controlled trials comparing primary feeding interventions and published in the English language [14, 24] were reviewed. Only truly randomised trials comparing standard EN to standard PN and reporting clinically meaningful outcomes were eligible. Standard EN and PN solutions were defined as any solution not supplemented with additional glutamine, arginine or other immune enhancing ingredients. PN was defined as an intravenous solution containing protein and a source of non-protein energy with or without lipids [18].

Because previous overviews detected heterogeneity between critically ill and non-critically ill patient populations [18], an explicit, objective definition of a critically ill patient population was applied [25]. Trials conducted in non-critically ill patient populations were not considered for inclusion. A study was determined to have been conducted in a critically ill patient population if the manuscript reported one of the following:

  • The patients were recruited in an ICU.

  • The inclusion criteria described were such that the patients would normally be cared for in an ICU (e.g. all patients were receiving invasive mechanical ventilatory support).

  • The patients were suffering from a condition that usually requires care in an ICU (e.g. severe thermal burns of >40–50% total body surface area, multi-trauma that required urgent laparotomy).

  • The patients had an average ICU length of stay longer than 2 days.

  • A majority of the patients received a therapy that is delivered in the ICU (e.g. invasive mechanical ventilation).

  • A severity of illness score was reported that was commensurate with the patients being critically ill.

True methodological quality

Methodological quality was based on the reporting of three key methodological components in the published manuscript: (a) presentation of an intention to treat (ITT) analysis, (b) maintenance of allocation concealment during randomisation and (c) appropriate use of blinding (e.g. participants, investigators, outcome adjudicators) [14, 22].

To investigate the effect of trial quality on overall conclusions, a series of meta-analyses was planned. The initial meta-analysis considered trials that addressed all three dimensions of quality, followed by meta-analyses based on each component individually.

To address the impact of incomplete follow-up, a sensitivity analysis was undertaken. The primary meta-analysis was conducted on all trials presenting complete follow-up. Since some degree of random loss to follow-up may occur, complete follow-up was defined as full reporting on at least 95% of all patients. The sensitivity analysis was conducted including trials reporting loss to follow-up by study arm, where the total loss did not exceed 10% of all patients. The conservative assumption was made that each lost patient encountered an undesirable outcome.

The presence of excessive loss to follow-up, defined as loss of more than 10% of patients, was considered a major methodological flaw [26, 27]. Trials with excessive loss to follow-up were not eligible for consideration [28].
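The follow-up rules described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the class and function names and the example counts are hypothetical, death is taken as the undesirable outcome, and the handling of a trial with 5–10% loss that was not reported by study arm (treated here as ineligible) is an assumption, since the text does not address that case.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    randomised: int  # patients randomised to this arm
    reported: int    # patients with a reported outcome
    deaths: int      # deaths among patients with a reported outcome


def follow_up_category(arms, loss_reported_by_arm):
    """Classify a trial as 'primary', 'sensitivity' or 'excluded'."""
    total = sum(a.randomised for a in arms)
    lost = total - sum(a.reported for a in arms)
    # Integer comparisons avoid floating-point error at the 5% and 10% cut-offs.
    if lost * 100 <= 5 * total:
        return "primary"        # complete follow-up: at least 95% reported
    if lost * 100 <= 10 * total and loss_reported_by_arm:
        return "sensitivity"    # loss of no more than 10%, reported by arm
    return "excluded"           # excessive loss (assumption: also when loss is not given by arm)


def impute_worst_case(arm):
    """Count every patient lost to follow-up as a death (conservative assumption)."""
    lost = arm.randomised - arm.reported
    return arm.deaths + lost, arm.randomised


# Hypothetical trial: 100 randomised, 93 reported, loss given by arm.
trial = [Arm(randomised=50, reported=47, deaths=5),
         Arm(randomised=50, reported=46, deaths=8)]
category = follow_up_category(trial, loss_reported_by_arm=True)
imputed = [impute_worst_case(a) for a in trial]
```

Under these assumptions, the imputed counts for the sensitivity analysis use all randomised patients as the denominator in each arm.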

Any differences in interpretation were resolved by discussion.

A priori defined subgroup analysis

A treatment-subgroup interaction was investigated for trials comparing early EN initiation to PN and trials comparing late EN initiation to PN. Early EN was defined as feeding within 24 h of ICU admission or initial injury [7].

Outcomes assessed

A clinically meaningful outcome was defined as a direct measure of how a patient feels, functions or survives [29]. Any trial explicitly reporting clinically meaningful outcomes (e.g. a validated quality of life instrument, duration of survival, quality-adjusted survival or landmark mortality) was considered for inclusion. Trials reporting only unvalidated surrogate outcomes were not eligible for inclusion [30].

Previous reviewers have suggested that infectious complications (ICs) are clinically important [15]. The presence of an IC was defined by positive culture results. Since it is generally difficult to determine when one infection has resolved, and a second, independent infection has begun, the proportion of individual patients with positive cultures was abstracted in preference to the total number of positive cultures [31].

Statistics

Meta-analyses were conducted with a fixed effects model [32] using the odds ratio metric [33]. The presence of heterogeneity was assessed using the χ2 statistic [32] and the I2 measure [34]. The presence of a priori hypothesised subgroup differences was investigated using a formal test of treatment-subgroup interaction [35].

Primary analysis was conducted using RevMan (version 4.2 for Windows, Oxford, UK; Cochrane Collaboration, 2003). Formal tests of treatment-subgroup interactions were conducted using PC SAS proc logist (version 6.12, SAS Institute, Cary, N.C., USA). Primary stratification by study was maintained using dummy variable coding. A p value less than 0.05 indicated statistical significance, while a p value greater than 0.05 but less than 0.10 indicated a trend towards statistical significance.

A p value less than 0.10 was used to indicate the presence of potentially important heterogeneity or subgroup-treatment interactions.
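For readers unfamiliar with the calculations behind these statistics, a minimal sketch follows. It pools per-trial log odds ratios with inverse-variance weights and derives the χ2 heterogeneity statistic (Cochran's Q) and I2 from the same weights. The inverse-variance method is an assumption made for illustration; the text does not state which fixed effects estimator RevMan applied, so the sketch need not reproduce the reported figures. The function names and the 0.5 continuity correction for empty cells are likewise illustrative choices.

```python
import math


def log_odds_ratio(events_a, n_a, events_b, n_b):
    """Log odds ratio (arm A vs. arm B) and its variance from a 2x2 table."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    if 0 in (a, b, c, d):
        # Assumed continuity correction: add 0.5 to every cell if any cell is empty.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_fixed_effect(trials):
    """Inverse-variance fixed-effect pooling.

    trials: list of (events_a, n_a, events_b, n_b) tuples, one per trial.
    """
    estimates = [log_odds_ratio(*t) for t in trials]
    weights = [1 / var for _, var in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    # Cochran's Q: weighted squared deviations of trial estimates from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    df = len(trials) - 1
    # I2: share of total variability beyond that expected by chance, floored at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "q": q,
        "df": df,
        "i2": i2,
    }


# Hypothetical data: (deaths on PN, n on PN, deaths on EN, n on EN) per trial.
example = pool_fixed_effect([(4, 40, 9, 42), (6, 55, 10, 50), (3, 30, 5, 31)])
```

An odds ratio below 1 in this orientation favours arm A; Q is referred to a χ2 distribution with `df` degrees of freedom to obtain the heterogeneity p value.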

Results

Literature search and study selection

Cross-searching Medline (from 1966) and EMBASE (from 1980) revealed 2,287 unique abstracts. Independent review of all abstracts resulted in the retrieval of 465 papers. Figure 1 presents the results of the study selection process using the flow-diagram recommended by the Quality of Reporting of Meta-analyses (QUOROM) conference participants [36]. No non-English language publications were identified on this topic. Twenty-two trials were found to compare the effects of EN to PN on clinically meaningful outcomes in a critically ill patient population.

Five of the 22 publications were based on subgroups of patients that were reported in a subsequent larger published trial [37, 38, 39, 40, 41] and thus did not qualify for inclusion as individual trials.

Three of the remaining 17 trials compared immune enhanced EN and/or PN [42, 43, 44], and one trial was pseudo-randomised, using an alternating date of admission allocation sequence [45].

Of the remaining 13 trials one failed to report outcomes on 21% of all randomised patients [46], and a second failed to report outcomes on 12% of all randomised patients [47]. Neither of these trials reported loss to follow-up by study arm.

Nine studies qualified for consideration in the primary analysis with two additional studies qualifying for the sensitivity analysis. Eight studies presented 100% follow-up [48, 49, 50, 51, 52, 53, 54, 55], and one study reported complete follow-up (outcomes reported on 95% of randomised patients) [56]. One paper reported outcomes on 94.3% of randomised patients [57], and one reported outcomes on 91.5% of randomised patients [58]. Loss to follow-up in all three trials was reported by study arm.

Table 1 presents further details describing the study population, criteria used to identify the critically ill patient population and nutritional support goals for each of the 11 included trials.

Fig. 1

QUOROM flow-chart illustrating the study selection process. RCT Randomised controlled trial; N number of papers; EN enteral nutrition; PN parenteral nutrition

Table 1 Summary of study characteristics for the 11 included trials (EN enteral nutrition, ENR EN received, PN parenteral nutrition, PNR PN received)

True methodological quality

None of the 11 trials explicitly reported the maintenance of allocation concealment, and only one reported the use of blinding [48]. Nine of the 11 presented sufficient follow-up to allow the conduct of an ITT analysis.

Reporting of infectious complications

Six of the trials with complete follow-up reported positive cultures. Three trials reported the number of individual patients with positive cultures [48, 53, 56], and three reported the total number of positive cultures but not the number of patients with positive cultures [49, 55, 52]. The types of ICs reported by each trial are presented in Table 2.

Table 2 Infectious complications reported by study (UTI urinary tract infections)

Publication bias

Formal assessment of the funnel plot did not yield any evidence of a publication bias.

Primary analysis: mortality

Landmark mortality was the only clinically meaningful outcome reported in all trials. When the nine trials presenting ITT results were aggregated, a statistically significant mortality benefit was evident for the use of PN [odds ratio (OR) 0.51, 95% confidence interval (CI) 0.27–0.97, p=0.04; Fig. 2]. The χ2 test for heterogeneity was non-significant (p=0.50) and the I2 measure was zero.

Fig. 2

Total parenteral nutrition (TPN) vs. enteral nutrition (EN): effect on mortality, primary ITT analysis. OR Odds ratio; N total number of patients in the group; n number of patients who died in the group

Primary analysis: infectious complications

When the six trials reporting positive culture results were aggregated, there was a significant increase in ICs with PN use (OR 1.66, 95% CI 1.09–2.51; p=0.02; Fig. 3). The χ2 for heterogeneity was non-significant (p=0.16). The I2 measure of heterogeneity was 37.7%.

Fig. 3

Total parenteral nutrition (TPN) vs. enteral nutrition (EN): effect on infectious complications, primary ITT analysis. OR Odds ratio; N total number of patients in the group; n number of patients with infectious complications (or total number of infectious complications) in the group

A priori subgroup analysis

Timing of feeding and effect on clinically meaningful outcomes

Six of the nine trials commenced enteral feeding within 24 h of intensive care admission or injury [48, 49, 53, 50, 51, 55]. In all cases early EN was achieved via transpyloric or jejunal feeding tubes.

Three trials did not meet the definition of early enteral feeding (<24 h). Two began EN within 48 h of ICU admission or injury [54, 56] while the third enrolled patients when there was an “actual or anticipated inadequate oral intake for longer than 7 days” [52].

With stratification by study maintained, there was a potentially important difference in the magnitude of the treatment effect for PN vs. early EN (OR 1.07) compared to PN vs. late EN (OR 0.29), i.e. a treatment-subgroup interaction (p=0.055).

When the use of PN was compared to the provision of early EN, there was no significant effect on mortality (OR 1.07, 95% CI 0.39–2.95, p=0.89; Fig. 4). The χ2 test for statistical heterogeneity was non-significant (p=0.75) and the I2 measure was zero.

Compared to the provision of delayed EN there was a statistically significant mortality benefit in favour of the use of PN (OR 0.29, 95% CI 0.12–0.70, p=0.006). There was no evidence of statistical heterogeneity (p=0.60) and I2 was zero. Details regarding the timing of the onset of feeding for each trial are presented in Table 1.
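As a rough cross-check of the interaction reported above, the two subgroup estimates can be compared directly on the log odds scale. This is not the method used in the analysis, which relied on logistic regression with dummy variables for study; it is a simpler approximation that recovers standard errors from the published, rounded 95% CIs and assumes the two subgroups are independent. The function names are illustrative.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def interaction_test(or_1, ci_1, or_2, ci_2):
    """z test for the difference between two independent log odds ratios."""
    diff = math.log(or_1) - math.log(or_2)
    se = math.sqrt(se_from_ci(*ci_1) ** 2 + se_from_ci(*ci_2) ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value from the normal distribution
    return z, p


# Subgroup estimates as reported: PN vs. early EN and PN vs. delayed EN.
z, p = interaction_test(1.07, (0.39, 2.95), 0.29, (0.12, 0.70))
```

With these rounded inputs the approximation gives a p value between 0.05 and 0.10, in line with the reported p=0.055, although exact agreement with the regression-based test should not be expected.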

Fig. 4

Total parenteral nutrition (TPN) vs. enteral nutrition (EN): effect on mortality, sensitivity analysis and subgroup analysis. OR Odds ratio; N total number of patients in the group; n number of patients who died in the group

Timing of feeding and effect on infectious complications

Of the six trials reporting ICs, four compared early EN to PN. When aggregated, the use of PN approached a statistical trend towards more ICs compared to early EN (OR 1.47, 95% CI 0.90–2.38, p=0.12). Potentially important heterogeneity was present (p=0.07), and the I2 measure was 56.9%. Only two trials comparing late EN to PN reported ICs [52, 56]. Due to the small number of trials, a meta-analysis could not be conducted. Individually, neither trial reported a significant difference in ICs.

Sensitivity analysis

Two additional trials qualified for consideration in the sensitivity analysis [57, 58]. Based on aggregation of all 11 trials the statistically significant mortality benefit remained in favour of PN use (OR 0.56, 95% CI 0.33–0.93, p=0.03; Fig. 4). The χ2 test for heterogeneity was non-significant (p=0.51) and the I2 was zero.

Both additional trials compared PN with late EN. In one study patients were required to have persistent coma (Glasgow Coma Scale <9) for at least 24 h prior to randomisation [58], and in the second patients were enrolled within 4–6 days of sepsis or surgery [57].

Considering these additional trials in the analysis of PN compared to late EN, the significant mortality benefit of PN remained (OR 0.44, 95% CI 0.24–0.81, p=0.008; Fig. 4). The test for statistical heterogeneity was non-significant (p=0.35) and the I2 measure was 10.0%.

Discussion

We used a component approach to assess the effect of methodological quality on the results obtained when trials comparing the use of standard PN to standard EN were aggregated. Since the extensive search did not detect any non-English language publications, it is unlikely a significant language bias exists [14] on this topic.

When ITT trials were considered, a statistically significant mortality benefit was evident in favour of PN (OR 0.51). Based on an a priori specified subgroup analysis this overall benefit was attributable to trials that compared the use of PN to delayed EN (OR 0.29). Although ICs were significantly increased (OR 1.66), given the evidence of a mortality benefit the clinical importance of these infectious complications should be questioned.

Clinically meaningful outcomes

Although the results of the current overview may appear novel, they are robust. When two additional trials were considered in the a priori defined sensitivity analysis, the statistically significant mortality benefit in favour of PN remained (OR 0.56), and, as suggested during the review process, when a more conservative random effects model was applied to the ITT trials, the benefit in favour of PN also remained (OR 0.49, 95% CI 0.25–0.98, p=0.04). Finally, although a previous review conducted on this topic reported non-significant results (OR 0.91, 95% CI 0.51–1.62), which resulted in a recommendation for the use of EN in preference to PN, it did not rule out the possibility of a mortality benefit in favour of PN [15]. Indeed, the point estimate of mortality benefit obtained in this current meta-analysis (OR 0.51) falls within the 95% confidence interval obtained in the previous review [15], which included numerous studies of questionable methodological quality [45, 46, 47].

The purpose of using a component approach to appraise trials was to investigate the impact of bias on the overall conclusions [14]. Nine of 11 trials considered in this meta-analysis presented complete (≥95%) reporting and follow-up, which enables the conduct of an ITT analysis. An ITT analysis is known to provide the most conservative estimate of treatment effect [59] and protects against attrition bias, which results from non-random loss to follow-up [27]. Recent editorials reveal a strong prejudice against the use of PN in critically ill patients [60]. It is possible that the current findings are due to the exclusion of low-quality trials that overestimated the benefit of EN due to an inherent prejudice against PN use.

The results obtained from this current overview are also consistent with the evidence-based recommendations (EBRs) made in a recent cluster randomised trial evaluating the impact of evidence-based ICU feeding guidelines [7]. The guideline evaluated in this cluster randomised trial included a strong EBR for early feeding (within 24 h of ICU admission) with preference to the enteral route. If it was anticipated that enteral feeding could not be initiated within 24 h of ICU admission, the early use of PN was recommended. Our a priori specified subgroup analysis compared the effect of PN in trials that initiated early EN (<24 h) to the effect of PN in trials in which EN was delayed. While there was no benefit from the use of PN when EN was initiated early, there was a statistically significant benefit from the use of PN in trials in which EN was delayed (OR 0.29, p=0.006). The statistically significant benefit in this subgroup remained when additional trials were considered in the sensitivity analysis.

Infectious complications

The current overview documented a statistically significant increase in ICs associated with PN use (OR 1.66). A previous review also documented an increase in ICs (OR 2.51, 95% CI 1.66–3.79, p<0.0001) [15]. Although the point estimate for increased ICs obtained from the current overview (OR 1.66) is more conservative, it is contained within the 95% CI generated by the previous review, which included trials of questionable methodological quality [46, 47]. It is known that estimates of treatment effect decrease when more rigorous definitions of clinical infections are employed [61]. Previous reviews pooled suspected infections (e.g. fever of unknown origin) with positive culture results to obtain ‘total infectious complications’. Considering that no trials of PN vs. EN adjudicated suspected infections using an objective, blinded, repeatable process, it is likely that this outcome is highly susceptible to bias. We employed a more objective definition of infection based on positive culture results and obtained a marginally lower estimate of IC rates due to PN use.

A comprehensive review of the clinical importance of ICU-based infections found that the nature of the organism, the site of infection and the interaction between organism and site all had a significant impact on outcome [62]. Based on this finding an approach to classifying and combining ICU infections that accurately reflects the contribution that the infecting organism makes to patient outcome was proposed [62]. Due to incomplete reporting, it was not possible to classify and combine infections based on risk of outcome (e.g. severe infection, moderate infection, sub-clinical infection). Although the χ2 test for statistical heterogeneity was non-significant (p=0.16), the I2 measure of heterogeneity was 37.7%, suggesting that it may not be appropriate to pool all types of infections.

Statistical heterogeneity is said to be present in a meta-analysis when the differences in outcomes between studies are greater than expected by chance alone [34]. In the presence of unexplained heterogeneity, pooling of study results may not be appropriate, even with a random effects model [32]. Although the χ2 test is commonly used to detect the presence of heterogeneity, this test has very poor power when the number of included studies is low [63]. The I2 statistic is an accepted measure of heterogeneity that does not depend on the number of included studies for interpretation [34]. Interpretation of the I2 measure reveals that 37.7% of the total variability in ICs was attributable to true differences between studies (heterogeneity) rather than random variability (sampling error). Although there are no clear guidelines as to what constitutes an unacceptable level of heterogeneity as measured by I2, a value of 20% reflects the presence of moderate heterogeneity [34].
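The relationship between the χ2 heterogeneity statistic (Q) and I2 can be made concrete with a short sketch. It assumes the usual definition I2 = 100% × (Q − df)/Q with df equal to the number of studies minus one, floored at zero; the function names are illustrative.

```python
def i_squared(q, n_studies):
    """I2 (%) from the heterogeneity statistic Q and the number of studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def implied_q(i2_percent, n_studies):
    """Q implied by a reported positive I2; inverts the definition above."""
    df = n_studies - 1
    return df / (1 - i2_percent / 100)


# Infectious complications analysis: six trials, reported I2 of 37.7%.
q_ic = implied_q(37.7, 6)
```

Because Q is compared against its degrees of freedom, I2 is zero whenever Q does not exceed df, which is why several of the mortality analyses report an I2 of zero alongside a non-significant χ2 test.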

Regardless of whether the clinician interprets information obtained from all trials (OR 1.66), from trials reporting the number of patients with infections (OR 1.88) or from trials reporting only the total number of infections (OR 1.46; Fig. 3), the ORs, and thus the estimate of increased ICs attributable to PN, are remarkably similar. Indeed, as suggested during the review process, if the incidence of infection is considered, which accounts for patient-time at risk, the OR does not change (OR 1.51). Regardless of the approach used to aggregate infections, the clinician should be aware that the use of PN was associated with an increase in infectious complications. Because not all infections lead to similar outcomes, and mutually inconsistent definitions of infections were used in each trial, the clinical importance of this finding is open to interpretation.

True methodological quality

Although previous reviewers have based their assessment of methodological quality on published manuscripts [15, 16, 17, 25, 64], it has been suggested that direct communication with the authors of contributing trials could improve our understanding of methodological quality. Due to resource and logistical constraints we were unable to communicate directly with the authors of the publications reviewed in this meta-analysis. Although the current state of knowledge regarding the importance of methodological quality is based on assessments of published manuscripts, not direct communication with authors [10, 12, 14, 22, 36], we believe that this is an area that merits further research.

The two main areas of methodological quality that were consistently found to be deficient in the manuscripts reviewed for this meta-analysis were the appropriate use of blinding and explicit reporting of allocation concealment.

Blinding

Since the appropriate use of blinding can reduce overoptimistic estimates of treatment effects by up to 26% [10], many novel and innovative processes have been developed for blinding complex intensive care interventions [65, 66]. Regardless of the complexity of the intervention, if a subjective outcome (e.g. ventilator-associated pneumonia, suspected infection) is important, it is always possible to blind outcome adjudicators. Because it may be important to understand whether patients, health care providers, researchers, outcome adjudicators, data collectors and even the data analysts were blinded, use of the term ‘double blinded’ is discouraged in preference to an explicit list of exactly who was blinded [67].

Allocation concealment

Trials with inadequate or unclear reporting of allocation concealment are known to produce up to 40% larger estimates of treatment effects [10, 12]. Allocation concealment refers to a process used to randomise participants so that patients, clinicians and researchers cannot predict or influence which participants are assigned to a given intervention (definition of allocation concealment contained on the Consolidated Standards of Reporting Trials web site http://www.consort-statement.org/allocationconcealment.htm, accessed 21 June 2004). A single sentence describing the use of ‘sealed, opaque, sequentially numbered envelopes’ or a ‘central call-in randomisation centre’ is sufficient to assure the reader that allocation concealment was maintained [12].

Summary

The inclusion of trials of low methodological quality is known to have a substantial impact on the results and conclusions obtained from meta-analyses [14]. Based on the results of the nine trials presenting complete follow-up, meta-analysis revealed a significant mortality benefit in favour of the use of PN. A priori specified subgroup analysis demonstrated the benefit from PN use was greatest in trials in which EN was delayed (>24 h). Although we also documented an increase in ICs with PN use, we were unable to separate sub-clinical infections from serious infections; in the face of improved mortality, the clinical importance of these ICs could not be established.

The overall findings of this meta-analysis would not lead us to recommend the use of PN in patients in whom EN could be initiated within 24 h of ICU admission or injury. In consideration of the overall results, including the increase in ICs, a grade B+ EBR [68] could be generated for the use of PN in patients in whom EN could not be initiated within 24 h of ICU admission or injury. This grade B+ EBR, where the ‘B’ indicates the recommendation is based on level II evidence and the ‘+’ indicates there is no heterogeneity between trials (‘Evidence-based Recommendations section of the Evidence-based Decision Making in Critical Care web site’ http://www.EvidenceBased.net, accessed 16 June 2004), is consistent with the explicit recommendation made for PN use in a recent cluster randomised trial of evidence-based ICU feeding guidelines that resulted in an overall 10% reduction in mortality [7].

It is important to recognise that the clinical trials included in this meta-analysis constitute level II evidence [68]. Because meta-analyses based on level II evidence often disagree with the findings of subsequent, well-conducted level I trials [8], we strongly recommend the conduct of a level I study addressing the use of standard PN in patients in whom standard EN cannot be started for at least 24 h after ICU admission or initial injury. This study should be adequately powered to detect a difference in a clinically meaningful outcome, should employ a blinded assessment of clinically important infections and should embrace all major aspects of study design that are known to reduce bias. In addition, the conduct of such a trial would present the ideal opportunity to conduct a full economic analysis of the use of PN compared to EN, expressed as costs per life saved [69].