What is new?
- The results of RCTs and nRCTs differ significantly in at least 20% of the variables, but more often when the number of participants in RCTs increases.
- Between-study heterogeneity is more common in nRCTs than in RCTs.
- Comparing cholecystectomy patients' baseline characteristics, the external validity of RCTs did not differ from that of nRCTs.
Randomized controlled trials (RCTs) are widely accepted to represent one of the highest levels of evidence in the hierarchy of research designs. Treatment comparisons based on RCTs are recognized as the most valid method to avoid selection and confounding biases in clinical research. If properly designed and well conducted, RCTs are likely to have high internal validity, that is, they measure what they are intended to measure. Thus, using RCTs, researchers are able to detect even small or moderate treatment effects [1]. In contrast, it has been claimed that treatment effects may be overestimated in observational studies and nonrandomized controlled trials (nRCTs) [2], [3], [4] because of their lack of internal validity (e.g., insufficient baseline comparability of the groups) [5].
The predominance of RCTs has been criticized for several reasons. First, RCTs are assumed to have insufficient external validity, that is, the applicability of the results of RCTs to the general population may be low [6]. This is often illustrated by highly selected study participants [7]: the patients included in RCTs are, on average, younger or healthier than those included in nRCTs. Second, a notable drawback is the high cost of RCTs, which results from the numerous quality requirements that must be met when an RCT is performed [8]. Third, a reliable estimation of the incidence of rare side effects may be difficult, as RCTs are often based on small sample sizes. Finally, performing RCTs of nonpharmacological interventions, such as surgery, is also questionable [9], [10] because of problems with the standardization of surgical procedures, blinding, or a lack of acceptance by patients and surgeons [11].
Given these disadvantages of RCTs, performing nRCTs may have some merits. Although nRCTs may be superior to RCTs in terms of external validity, RCTs are superior in terms of efficacy results. However, depending on the study characteristics, some nRCTs may closely approximate the “true” efficacy result [12]. It has been suggested that, for specific medical topics, both RCTs and nRCTs may sometimes yield very similar results [13], [14]. In addition, various studies have identified different methodological aspects that increase the scientific value of nRCTs [15], [16], [17]. For example, by using data from a general practice database, the results of a large RCT for the assessment of hormone replacement therapy in women at risk of coronary heart disease could be accurately replicated [18]. Although the reputation of nRCTs has improved, most health care agencies accept coverage of novel pharmacological interventions only in cases where data from RCTs indicate a significant increase in clinical effectiveness.
In the area of laparoscopic cholecystectomy (LC), surgeons and researchers have not yet agreed on the optimal approach for evaluating a surgical procedure. Because the advantages of LC compared with open cholecystectomy (OC) appeared overwhelming for many years, RCTs were not performed [19]. Even in high-quality journals, observational data from nRCTs were accepted for publication [20], because it was considered impossible to conduct RCTs on this topic. When RCTs and even blinded trials were eventually performed [21], [22], LC was found to be less advantageous than expected from previous nRCT data. However, bile duct injury, which may occur when performing LC, was primarily seen in case series and registry studies [23], [24]. This adverse event may never be detected when relying solely on RCT data. Thus, nRCTs might be of considerable value in the evaluation of surgical procedures [25].
In summary, there is still controversy about whether and under what circumstances the results of nRCTs agree with the results of RCTs. As RCTs are currently more accepted, the scientific value of nRCTs has not yet been sufficiently justified. LC might serve as an ideal showcase, because a wide variety of studies were published on this procedure in a short period of time. Although some modifications of the LC technique have been developed, for example, mini-instruments or fewer trocars, none of these has gained widespread acceptance, so LC remains a highly standardized technique. The aims of this literature analysis were as follows: first, to compare the results of RCTs vs. nRCTs in terms of their internal validities (study results); second, to compare the results of RCTs vs. nRCTs in terms of their external validities (baseline characteristics); and third, to assess which characteristics of nRCTs are associated with less-reliable study results.