An overview of systematic reviews of complementary and alternative therapies for infantile colic

Background: Infantile colic is a distressing condition characterised by excessive crying in the first few months of life. The aim of this research was to update the synthesis of evidence of complementary and alternative medicine (CAM) research literature on infantile colic and establish what evidence is currently available.
Methods: Medline, Embase and AMED (via Ovid), Web of Science and Central via Cochrane library were searched from their inception to September 2018. Google Scholar and OpenGrey were searched for grey literature and PROSPERO for ongoing reviews. Published systematic reviews that included randomised controlled trials (RCTs) of infants aged up to 1 year, diagnosed with infantile colic using standard diagnostic criteria, were eligible. Reviews of RCTs that assessed the effectiveness of any individual CAM therapy were included. Three reviewers were involved in data extraction and quality assessment using the AMSTAR-2 scale and risk of bias using the ROBIS tool.
Results: Sixteen systematic reviews were identified. Probiotics, fennel extract and spinal manipulation show promise to alleviate symptoms of colic, although some concerns remain. Acupuncture and soy are currently not recommended. The majority of the reviews were assessed as having high or unclear risk of bias and low confidence in the findings.
Conclusion: There is clearly a need for larger and more methodologically sound RCTs to be conducted on the effectiveness of some CAM therapies for IC. Particular focus on probiotics in non-breastfed infants is pertinent.
Systematic review registration: PROSPERO: CRD42018092966.


Description of the condition
Infantile colic (IC) is a common childhood condition affecting 5% to 20% of infants worldwide [1,2]. In 2016, the new Rome IV criteria for colic defined it as 'an infant who is less than five months of age when symptoms start and stop; recurrent and prolonged periods of infant crying, fussing or irritability reported by caregivers that occur without any obvious cause and cannot be prevented or resolved by caregivers; no evidence of infant failure to thrive, fever or illness [3]'. Prior to this, colic was most commonly diagnosed using the Wessel 'rule of three' criteria [4].
Much research has been conducted over the past 50 years to try to establish the underlying aetiology. Formula intolerance, immaturity of gastrointestinal tract, food allergies, intestinal cramping or excessive gas formation have all been suggested [5], alongside psychosocial causes, e.g. maternal anxiety and maternal-infant bonding issues [6], but its pathophysiology remains unclear. Although it is considered a self-limiting condition, it can be distressing for both parents and babies.

Conventional treatment options
Our lack of understanding of IC makes it difficult to find an effective treatment. Current conventional treatments include dietary (particularly mother's diet), physical, behavioural and pharmacological approaches. With little evidence to support the first three approaches, the medication simethicone is commonly used. However, this is no longer recommended on the NHS website [7].

CAM treatment options
Dissatisfaction with conventional health care and shortage of treatment options may lead parents to seek complementary and alternative (CAM) healthcare for their infants. CAM has been defined as '… diagnosis, treatment and/or prevention which complements mainstream medicine by contributing to a common whole, by satisfying a demand not met by orthodoxy or by diversifying the conceptual framework of medicine' [8].
New parents in particular find IC stressful, resulting in high usage of CAM in this population [9]. Thus, further investigation is needed to evaluate the effectiveness and safety of CAM approaches and treatments. Information and advice regarding the treatment or management of colic is currently available to parents from a wide range of generally unregulated sources (e.g. websites), which often make claims that have no empirical support [10]. The main CAMs used for IC are probiotics, spinal manipulation, herbal medicine and acupuncture. A description of each alongside its justification for use in colic can be found in Additional file 1.

Previous overview of reviews
There have been several previous overviews of reviews investigating CAM interventions for IC which have predominantly focused on spinal/chiropractic manipulation [11][12][13] or have included IC alongside other conditions affecting children [14]. Most lack adequate quality assessment of the included systematic reviews.

Why is it important to do this review?
Several trials have been published in recent years that indicate an effect of some CAMs for IC; these have mainly included trials of probiotics [6,15,16,17]. The aim of this overview is to synthesise CAM research on IC and establish what evidence is currently available. As systematic reviews are considered the least biased source of evidence to evaluate the effectiveness of a particular intervention, this overview will focus on reviews of IC.

Methods
This systematic overview was conducted following a predetermined written protocol registered on the PROSPERO database: registration number CRD42018092966. To be considered eligible for this overview, reviews were required to meet the following criteria:
Type of reviews: all systematic reviews of randomised controlled trials (RCTs) of infantile colic (IC).
Type of participants: human subjects diagnosed with infantile colic using standard diagnostic criteria (e.g. the Wessel criteria). No restrictions regarding gender, condition duration or intensity were applied. Age was restricted to infants under 1 year.
Type of intervention: reviews of effects of any CAM therapies. Reviews that included multiple CAM therapies were also included, as long as the CAM therapies were not used in combination.
Type of comparator: placebo, active treatment, no treatment, treatment-as-usual or waitlist control groups.
Type of outcome: any review that included studies that reported measures of colic severity (e.g. parent-reported crying diaries, questionnaires and parental interviews).
The full criteria are listed in Table S4, Additional file 1.

Data sources
Medline, Embase and AMED (via Ovid), Web of Science and Central via Cochrane library were searched from their inception to September 2018. Google Scholar (first 20 pages) and OpenGrey were searched for grey literature and PROSPERO for ongoing reviews. All reviews from 2011 were assessed for eligibility. The search strategy was structured using subject (MeSH) headings, text word terms and their derivatives: homoeopath, acupuncture, spinal manipulation, hypnosis, reflexology, phytotherapy, probiotics, infant, colic, systematic review, meta-analysis. Full details of the search can be found in Additional file 1.

Data extraction
One reviewer (RP) extracted data and summarised the review characteristics (see Table 1). Extracted data were checked by another reviewer (VL or PD). Disagreements were resolved through discussion. Information was extracted on author, date of review, country, list of studies included in the individual review, intervention and comparator summary, number of participants, diagnosis criteria, meta-analysis results or summary of main between-group results, whether a sensitivity or subgroup analysis was conducted, risk of bias assessment and adverse events. We reported the standardised mean difference (SMD) and 95% confidence intervals (CI) and results of any tests of heterogeneity reported in the relevant meta-analyses. Individual study results (mean and standard deviation (SD)) of continuous variables and any between-group statistical analysis were reported when there were no pooled results available (Additional file 1: Table S7). Pooled odds ratios or risk ratios and associated 95% CIs were reported for any dichotomous data.

Data synthesis
Due to the expected overlap of studies and heterogeneity between reviews (particularly with regard to intervention and comparator arms), we conducted a narrative synthesis of the findings rather than pooling of meta-analyses from the included reviews. We applied the following thresholds for the interpretation of the reported I² statistic that assesses heterogeneity [18] in any reported meta-analysis:
0% to 40%: might not be important
30% to 60%: may represent moderate heterogeneity
50% to 90%: may represent substantial heterogeneity
75% to 100%: considerable heterogeneity

Assessment of methodological quality/bias of the included reviews
The quality of each systematic review was assessed using the newly developed AMSTAR-2 scale (Assessing the methodological quality of systematic reviews [19]). This is the revision to the validated and frequently used AMSTAR scale [20]. This was used alongside the ROBIS tool (Risk of Bias in Systematic Reviews tool [21]). Three reviewers (RP, VL, PD) independently assessed each review using both tools. Two reviewers had limited experience of using the ROBIS tool, so a third reviewer, who helped develop the tool (PD), was asked to complete both ratings. Meta-analyses were checked by a statistician experienced in meta-analyses (CP).
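The I² thresholds listed under Data synthesis overlap, so a single value can fall into two bands. The following is a minimal illustrative sketch, not part of the overview's methods, of how a reported I² might be mapped to every applicable label. The names `I2_BANDS` and `interpret_i2` are hypothetical, and treating each band as inclusive at both ends is an assumption.

```python
# Overlapping I² bands as listed in the thresholds above.
# Assumption: each band is inclusive at both ends.
I2_BANDS = [
    (0, 40, "might not be important"),
    (30, 60, "may represent moderate heterogeneity"),
    (50, 90, "may represent substantial heterogeneity"),
    (75, 100, "considerable heterogeneity"),
]


def interpret_i2(i2_percent):
    """Return every band label whose range contains the given I² value (in %)."""
    if not 0 <= i2_percent <= 100:
        raise ValueError("I² must lie between 0 and 100")
    return [label for low, high, label in I2_BANDS if low <= i2_percent <= high]


if __name__ == "__main__":
    # I² values reported by some of the included meta-analyses.
    for value in (0, 56, 89, 98):
        print(value, interpret_i2(value))
```

Returning a list rather than a single label keeps the overlap visible: an I² of 56% sits in both the moderate and the substantial band, and the choice between them remains a matter of judgement.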

Deviation from the protocol
There were two changes to the protocol; originally, we were going to exclude reviews that reported on multiple conditions of which infantile colic was one, but we changed this to include reviews that had at least two studies of IC described, to capture more data. We also allowed active treatments to be the comparator option.

Results of the literature search
The search identified 903 potentially relevant papers and 669 titles/abstracts were screened (see Fig. 1). Forty-three full text articles were assessed for eligibility and 16 systematic reviews were included in this overview. Results of the included reviews are presented in Table 1 (details of the individual studies can be found in Additional file 1: Table S7). Only identifiable randomised controlled trials (RCTs) from each review are reported. The summarised AMSTAR-2 scores and ROBIS scores are presented in Tables 2 and 3. Further justification statements for ROBIS are presented in Additional file 1: Table S6. All excluded reviews are listed in Additional file 1: Table S5.
Dobson et al.'s [26] Cochrane review of spinal manipulation therapy (SMT) found that five of the six included studies reported a beneficial effect on the course of colic. They found manipulation therapies had a greater effect on daily crying time, reducing it by 1 h 12 min/day on average with moderate heterogeneity (MD = −1.20; 95% CI −1.89 to −0.51, I² = 56%). However, there was no evidence of a difference when restricting to the studies judged as having a low risk of performance bias (blinding of parent outcome assessors). Three studies measured full recovery (odds ratio [OR] = 11.12; 95% CI 0.46 to 267.52, I² = 89%), but confidence intervals were extremely wide and heterogeneity was substantial. Adverse events were only assessed in one study [83] and none were found. Overall, firm conclusions cannot be drawn from such limited data. The GRADE (Grading of Recommendations Assessment, Development and Evaluation) evaluation indicates low quality of evidence for both outcomes. The results from AMSTAR-2 and ROBIS indicate a high level of confidence in the findings and were judged to be of low risk of bias in all domains.
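The mean difference of −1.20 corresponds to the stated 1 h 12 min/day if it is expressed in decimal hours, which is an assumption made here for illustration. A minimal sketch of the conversion follows; the helper name `hours_to_h_min` is hypothetical, and it reports the magnitude of the difference only.

```python
def hours_to_h_min(hours):
    """Convert a duration in decimal hours to (whole hours, minutes).

    The sign is dropped, so the result is the magnitude of the difference.
    """
    total_minutes = round(abs(hours) * 60)
    return divmod(total_minutes, 60)


if __name__ == "__main__":
    # Point estimate and 95% CI bounds reported for daily crying time,
    # assumed to be in decimal hours.
    for value in (-1.20, -1.89, -0.51):
        h, m = hours_to_h_min(value)
        print(f"{value} h -> {h} h {m} min")
```

Under the same assumption, the confidence interval spans a reduction of roughly half an hour to just under two hours per day.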
Gleberzon et al.'s [27] narrative synthesis included three RCTs of SMT for infantile colic [68,71,84]. Two of these RCTs were reported in Dobson et al.'s [26] review. Browning et al.'s [84] trial was excluded from the Dobson review as it compared two active treatments (spinal manipulation and occipito-sacral decompression). Just one trial indicated an effect on IC symptoms compared to comparator [68]. There were issues with both quality (AMSTAR-2) and bias (ROBIS) with this review.

Herbal medicine
There was one main review of herbal medicine [30]. It assessed various types of herbal medicine for several conditions. We extracted just the data on infantile colic. Evidence was found for different fennel preparations (e.g. oil, tea, herbal compound Colimil) in treating children with colic, whereas peppermint oil was not found to be effective. No serious adverse effects were reported.
Herbal medicine was also assessed in the four multiple CAM reviews [22][23][24][25]. The same studies were included. Gutierrez-Castrellon et al. [25] pooled various herbal extract studies (peppermint, fennel seed oil, Colimil (containing Matricariae recutita, Foeniculum vulgare and Melissa officinalis)) in their network meta-analysis (NMA) and demonstrated a weak effect (WMD = −61.2 min, 95% CI 0.8 to −122.0, P = 0.05, I² = 98%). The wide confidence intervals, considerable heterogeneity and the crossing of the line of no effect suggest that herbal medicine has limited effect on crying time. In conclusion, Gutierrez-Castrellon et al. do not recommend herbal medicine for colic.
Harb et al. [24] conducted a subgroup meta-analysis focussing only on extracts containing fennel and demonstrated it to be effective for reducing colic symptoms in solely breast-fed infants. However, this analysis also had considerable heterogeneity and wide confidence intervals (MD = −72.07, 95% CI −126.43 to −17.70, I² = 99.5%). It should be noted that much of the fennel oil evidence stems from one trial (Arikan et al. 2008 [73]) which was at particular risk of bias with respect to blinding, selective reporting and randomisation. Massage was one of the arms in this trial; a therapy where it is impossible to blind the therapist. Further, the wide confidence intervals and the crossing of the line of no effect suggest that this trial has limited, if any, value. These issues certainly cast doubt on the overall findings.
Perry et al. [22] and Bruyas-Bertholon et al. [23] both reported results that demonstrated fennel herbal extracts to be effective in reducing colic symptoms, but both concluded that the methodological issues of the individual studies (discussed above) call these results into question.
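Several of the judgements above rest on whether a confidence interval crosses the line of no effect. As a rough sketch of that check, assuming a null value of 0 for difference measures (MD, WMD) and 1 for ratio measures (OR, RR), the hypothetical helper `crosses_no_effect` below accepts the bounds in either order, since the reviews report them both ways.

```python
def crosses_no_effect(bound_a, bound_b, null_value=0.0):
    """Return True if the interval between the two bounds contains the null value.

    Use null_value=0.0 for mean differences and null_value=1.0 for ratio
    measures such as odds ratios or risk ratios.
    """
    low, high = sorted((bound_a, bound_b))
    return low <= null_value <= high


if __name__ == "__main__":
    # Pooled herbal extract estimate [25]: 95% CI 0.8 to -122.0 min.
    print(crosses_no_effect(0.8, -122.0))
    # Fennel subgroup analysis [24]: 95% CI -126.43 to -17.70.
    print(crosses_no_effect(-126.43, -17.70))
```

A check of this kind says nothing about precision or heterogeneity: the fennel subgroup interval excludes zero, yet it is wide and comes with an I² of 99.5%.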

Acupuncture
One systematic review of acupuncture [29] used individual participant raw data (IPD), which is considered the gold standard of evidence synthesis due to the high level of precision and consistency [38]. A difference was found at the mid time-point analysis favouring acupuncture (MD = −24.9 min, 95% CI −46.2 to −3.6, P = 0.02, I² = 9%), but not after the removal of the one unblinded trial [47] (MD = −13.8, 95% CI −37.5 to 9.9, P = 0.25, I² = 0%). The GRADE evaluation indicates moderate quality evidence. No major adverse events occurred, although acupuncture induced some crying during treatment, believed to be linked to the insertion of needles into the infant [39,40]. This was a well-conducted review that was judged to be low risk of bias in most domains, but with low confidence in the results (AMSTAR-2). One issue with this review is that the authors assessed all their own trials [46,78,84], although they appeared to be objective in their evaluation. Gutierrez-Castrellon et al. [25] pooled two of the above studies [53,57] and initially found an effect favouring acupuncture (at weeks 1 and 2), but this was no longer evident by week 3 (WMD = −11.2 min, 95% CI 2.0 to −23.0, P = 0.08, I² = 0%). There were issues with bias and quality in the review process. Bruyas-Bertholon et al. [23] echoed these findings in their multiple CAM review.
Sung et al. [31] examined the effectiveness of L. reuteri DSM17938 versus either placebo or the drug simethicone. They meta-analysed three trials of breast-fed infants [15,62,74]   This appears to be due to differences in how the authors estimated means + SD from medians. Subgroup analysis of probiotics versus indistinguishable placebo showed a similar reduction in crying time at 21 days (MD = −55.48, 95% CI −59.46 to −51.49, I² = 0%). In a separate analysis, probiotics were also shown to improve treatment success at 21 days (RR = 0.06, 95% CI 0.01 to 0.25, P < 0.000001, I² = 0%). Urbanska et al. [33] and Xu et al. [34] restricted their analysis to trials of L. reuteri versus indistinguishable placebo only. Urbanska et al. [33] pooled three trials and found that L. reuteri They also found that L. reuteri improved colic treatment effectiveness at 3 but not 4 weeks.
The most up-to-date review was by Sung et al. [37], who compared L. reuteri DSM17938 versus placebo using individual participant raw data (IPD). Assessments took place at days 7, 14 and 21. The probiotic group averaged less crying and/or fussing time than the placebo group at all time points. The results at day 21 using IPD were −25.4 min (95% CI −47.3 to −3.5), P < 0.05 (adjusted for baseline). Intervention effects were particularly impressive in breastfed infants (MD = −46.4 min, 95% CI −67.2 to −25.5, P < 0.05), but not so in formula-fed infants (MD = 41.0, 95% CI −20.1 to 102.2, P > 0.05).

Other CAM therapies
Several other CAMs were assessed as part of the multiple CAM reviews [22][23][24][25]. Three trials favoured sugar solutions over either placebo control [59,61], or no treatment [73]. Three trials [66,73,82] assessed massage, of which a pooled estimate of two trials [66,82] demonstrated an effect. One trial on reflexology [67] found no difference compared to the control. Two trials assessed soy added to formula [60,78], but it was not possible to report the pre-washout data [60] or distinguish the soy data from other supplements analysed.

Adverse events
With regards to the safety of the reported CAMs, there have been some concerns raised in recent years. In particular, manipulation therapies have come under scrutiny [41,42] but serious adverse events are rare [43]. Reported adverse events from probiotics are generally mild gastrointestinal disorders such as abdominal cramping, nausea, diarrhoea, flatulence and taste alteration [44].
Poor reporting of adverse events (AEs) is a common criticism of CAM research [45]. It is particularly difficult in infants who cannot communicate their responses effectively. AEs in our included reviews were primarily based on parental reports. However, nine of the 16 reviews did report on AEs, with the majority reporting that there were none. The acupuncture review [29] had the highest number of AEs from acupuncture (i.e. bleeding at acupoint, increased hiccoughing), although these are relatively minor. Just six reviews did not report, or only partially reported, on AEs [23,25,27,31,32,36]. In addition, several trials ignored safety issues by not providing the reasons why subjects dropped out. Also, due to the small sample sizes of the trials, it is difficult to draw definitive conclusions on safety.

Results of AMSTAR-2
A summary of the AMSTAR-2 results can be found in Table 2. Most reviews were rated as having critically low confidence in the results, four were rated as low and one (the Cochrane review [26]) was considered to have high confidence in the results. Seven questions that relate to critical domains were identified by Shea et al. [19]; more information about these domains can be found in Additional file 1.

Results of ROBIS
The ROBIS tool is divided into four domains (see Table 3 for summary of results and Additional file 1: Table S6 for full results). With regard to domain 1, which assessed any concerns regarding specification of study eligibility criteria, six reviews achieved a low risk of bias rating overall [22,26,28,29,32,33]. Domain 2 assessed any concerns regarding methods used to identify/select studies: six achieved a low risk of bias rating overall [26,28,29,31,32,34], two were rated as unclear [22,37] and the remaining eight achieved a high risk of bias rating. Domain 3 assessed concerns regarding methods used to collect data and appraise studies. Ten of the 16 reviews achieved a low risk of bias rating overall. With regard to domain 4, which assessed concerns regarding the synthesis and findings, the majority were rated as high risk of bias. The final section of the tool provides a rating for the overall risk of bias of all the reviews; four achieved a low rating [22,26,32,34], four an unclear rating [28,29,31,37] and eight a high rating [23-25,27,30,33,35,36].

Summary of the main results

Manipulation
From the six reviews that reported on spinal manipulation, the most robust evidence comes from Dobson et al.'s [26] Cochrane review, rated as good quality (AMSTAR-2) with low risk of bias (ROBIS). The other reviews had issues with bias (apart from two [22,28]) and quality. Most trials indicate a positive effect on crying time alongside other improvements (e.g. 'recovery from colic'). Blinding was an issue in most trials as the clinician will always know whether they performed a treatment. Thus, the effectiveness of the intervention was called into question when the analyses were restricted to trials with adequate blinding.

Herbal medicine
From the five reviews that reported on herbal medicine, preparations containing peppermint demonstrated no evidence of effect. Preparations containing fennel oil demonstrated some effect for reducing the symptoms of colic, but there were limitations to these findings with regard to quality and bias in the review process. Well-designed clinical trials are required to strengthen the evidence.

Acupuncture
One systematic review of acupuncture [29] found some evidence of effect of acupuncture, but this disappeared when the unblinded trial was removed. The authors conclude that needle acupuncture should not be recommended for infantile colic on a regular basis. This review was rated 'low' risk of bias in most domains of ROBIS, but with low overall confidence in the results (AMSTAR-2). Gutierrez-Castrellon et al. [25] supported these findings and Bruyas-Bertholon et al. [23] found an initial effect that disappeared at week 3. It is important to note that using acupuncture in infants may raise ethical issues for future trials, as the infant's response to needle insertion is difficult to evaluate [46] and parents may feel anxious about needles being inserted into their child [40].

Probiotics
The majority of probiotic reviews indicated that L. reuteri is effective in reducing symptoms of colic. Some reviews were of better quality than others as assessed by AMSTAR-2 [32] and ROBIS [32,34]. There were issues of quality with most of these reviews, so caution is needed before firm conclusions can be drawn. The following issues were identified. Simethicone was used as a placebo in one trial [62], which was then included in three meta-analyses [31,32,37]. As it is an active treatment, the results of these meta-analyses need to be interpreted with caution. Several reviews [33,34,36,37] reported a difference in the efficacy of L. reuteri between breastfed and formula-fed infants; however, on closer inspection, there was just one study of formula-fed infants, which included some mixed-fed babies. Thus, their conclusions over-emphasise the impact of their results. Reviewers [35][36][37] combined different outcomes from the Sung et al. [6] trial. Either 'crying' or 'crying/fussing' times were analysed, and it was unclear why reviewers either combined or separated these outcomes. The use of medians in some reported meta-analyses was also problematic and it was unclear if the conversion to means had actually taken place. It was also unclear why different pooled estimates were found when using the same studies [31,32].
Overall, these reviews suggest that probiotic L. reuteri may lead to reduced crying time in IC; however, cautious interpretation must be taken due to substantial heterogeneity and small number of trials in the analyses, and low-quality evidence of most of the reviews.

Other CAMs
There was some support for both sugar solutions [59,61,73] and massage [66,82], but the trials of reflexology [67] and soy formula [60,78] did not show support. Concerns regarding the high phytoestrogen content of soy [46,47], reiterated in a recent Cochrane review of dietary interventions for IC (October 2018) [48], make soy difficult to recommend.

Future work
It is important to highlight that the nonspecific effects (e.g. the therapeutic effects of time, attention and touch alongside the placebo effect) of many CAM therapies included in this review are poorly understood but may play a role.
The self-limiting nature of infantile colic means that RCTs are the best way to assess the effectiveness of treatments. Given that there was little convincing evidence for acupuncture, and because funding for CAM research is difficult to obtain, additional research should focus on treatments that offer the most robust evidence. Thus, as encouraging results were demonstrated for manipulation, fennel extracts and sugar solutions, these CAMs require further investigation through well-designed RCTs.
Recent research into colic has focused on probiotics. The majority of reviews concluded that further trials into probiotics for breastfed-only infants are no longer needed. On closer inspection, this conclusion might be premature as the quality of the evidence is currently low. Its role in formula-fed babies certainly requires further research but trialling this will be more problematic as infant formulas commonly contain probiotics [49]. Crying time as measured by parental diaries was the main outcome in most reviews, which is highly subjective; more consideration is needed to accurately measure crying time in future trials (e.g. phone audio or video recordings of the colic episodes).
CAM therapies are difficult to study as some of the most common treatments (e.g. acupuncture, osteopathy and chiropractic) cannot be adequately blinded. Even trials of other CAMs (such as herbal remedies) have had difficulty in blinding, making it impossible to totally remove bias from the research studies. However, it appears that parents are driving the use of CAM therapies due to limited routine medical care solutions for the problem of infant colic. Therefore, those therapies with promising emerging evidence that have, so far, been found safe or without adverse events, such as herbal medicines (in particular, fennel oil), probiotics, chiropractic and osteopathic manual therapies, may provide reasonable approaches to the problem. Nevertheless, the evidence is far from definitive and more high-quality research is required to help parents decide on the most efficacious therapy for their infant suffering from colic.

Potential bias in the overview process
One reviewer (RP) assessed their own work in this overview [22]; however, two other reviewers also assessed each review using AMSTAR-2 and ROBIS. One of the developers of the ROBIS tool was involved (PD). She was invited for her level of expertise in using the ROBIS tool, as the other reviewers had limited experience.

Strengths and limitations
The search was thorough and included grey literature searching. We believe the systematic approach taken in this overview limits bias. Difficulties in using both ROBIS and some questions on AMSTAR-2 may have led to errors in assessment.

Conclusions
Spinal manipulation shows promise to alleviate symptoms of colic, although concerns remain as positive effects were only demonstrated when crying was measured by unblinded parent assessors. Fennel is the most promising herbal remedy, but again concerns about the quality of the included studies mean that any conclusions must be cautious.
The majority of the reviews indicate that L. reuteri DSM17938 should be recommended for breastfed infants with colic, but caution is needed due to the poor quality of the included reviews. Its role in formula-fed babies, in particular, needs further research. Acupuncture and soy are currently not recommended. More rigorous clinical trials are needed for these interventions.
Additional file 1. Description of included CAMs [51-53,55]. Table S4: table of inclusion/exclusion criteria. Details of the search and data extraction [56]. Table S7: Summary of systematic reviews of CAM treatments for infantile colic. Criteria for assessing confidence in AMSTAR-2.