Cochrane Database of Systematic Reviews Protocol - Intervention

Oral probiotics for the treatment of infantile colic

Abstract

Objectives

This is a protocol for a Cochrane Review (intervention). The objectives are as follows:

To assess the effects of probiotics for infantile colic in infants younger than four months of age.

Background

Description of the condition

Although infantile colic (IC) is considered to be a self‐limiting and benign condition, it understandably leads to exhaustion, anxiety, and concern in parents and caregivers, and is a common reason for consultation with healthcare professionals (Lucassen 2001). Defined as periods of inconsolable, unexplained, and incessant crying in a seemingly healthy infant, it is a common condition, affecting 4% to 28% of infants, depending on the case definition used (Lucassen 2001). A recent systematic review reported that prevalence rates of IC ranged between 2% and 73%, with a median of 18% (Vandenplas 2015). Aside from the impact of case definition, parental perception (St James‐Roberts 1995), methods of data collection, parental well‐being (Rautio 1999), and cultural variations in infant care practices (Sondheimer 2002), all affect the prevalence of IC. There is growing recognition that colic represents the upper end of the normal crying curve of healthy infants, and on average, peaks at six weeks and diminishes by 12 weeks (Brazelton 1962; Wolke 2017). The cause of infantile colic remains unclear. In the evaluation of excessive crying, organic pathologies account for less than 5% of cases. Urinary tract infection is the most common; a range of other gastrointestinal, psychosocial, and neurodevelopmental disorders contribute to this statistic (Johnson 2015).

The 'rule of three', or the Wessel criteria, has been traditionally used to define IC (Wessel 1954). This triad includes unexplained episodes of paroxysmal crying for more than three hours per day, for three days or more per week, for at least three weeks. In 2006, the Rome III working group, recognising that most IC crying tends to resolve spontaneously by three to four months of age, stipulated that in order to make a diagnosis of IC, all the following criteria must be present in infants less than four months of age (Hyman 2006).

  1. Paroxysms of irritability, fussing (generally considered as behaviour that is not quite crying, but not content either (Barr 1988)), or crying that starts and stops without obvious cause.

  2. Episodes that last three or more hours per day, and occur on at least three days, for at least one week.

  3. No failure to thrive.
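
The crying‐time thresholds in the Wessel and Rome III definitions differ in small but consequential ways: strictly more than three hours versus three or more hours per day, and at least three weeks versus at least one week. The following Python sketch is illustrative only and is not part of either set of criteria. The function names and the diary layout (a list of weeks, each a list of daily crying hours) are assumptions made for this example; weeks are not required to be consecutive, and the non‐quantitative elements (unexplained crying, age, no failure to thrive) are not modelled.

```python
from typing import Callable, List

Week = List[float]  # hypothetical layout: crying hours for each day of one week


def qualifying_weeks(weeks: List[Week], hours_ok: Callable[[float], bool]) -> int:
    """Count weeks with at least three days that satisfy the hours threshold."""
    return sum(1 for week in weeks if sum(1 for h in week if hours_ok(h)) >= 3)


def meets_wessel_duration(weeks: List[Week]) -> bool:
    """Wessel: more than 3 h/day, on 3 or more days/week, for at least 3 weeks."""
    return qualifying_weeks(weeks, lambda h: h > 3.0) >= 3


def meets_rome_iii_duration(weeks: List[Week]) -> bool:
    """Rome III: 3 or more h/day, on at least 3 days, for at least 1 week."""
    return qualifying_weeks(weeks, lambda h: h >= 3.0) >= 1
```

Under these assumptions, an infant who cries for exactly three hours on three days of a single week satisfies the Rome III duration element but not the Wessel one.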

In 2016, the Rome IV criteria for functional gastrointestinal disorders in infants and toddlers were published (Benninga 2016). Rome IV abandoned the modified Wessel criteria used in Rome III (episodes lasting three or more hours per day, occurring on at least three days per week, for at least one week), as the Rome committee felt the minimum crying time of three hours per day was too arbitrary, and there was no clinically significant difference between a child who cried for two hours and 50 minutes per day and one who cried for three hours per day. The criteria for IC were revised, and the Rome IV criteria include:

  1. Infant is less than five months old when the symptoms start and stop;

  2. Recurrent and prolonged periods of infant crying, fussing, or irritability reported by caregivers that occur without obvious cause, and cannot be prevented or resolved by caregivers; and

  3. No evidence of infant failure to thrive, fever, or illness.

Of value, Rome IV also included diagnostic criteria for clinical research purposes that recognise the intensive and time‐consuming demands on parents to complete seven‐day behaviour diaries (Benninga 2016), which include the preceding diagnostic criteria, and both of the following.

  1. During a telephone or face‐to‐face screening interview with a researcher or clinician, caregiver reports that the infant has cried or fussed for three or more hours per day during three or more days in seven days; and

  2. Total 24‐hour crying plus fussing in the selected group of infants is confirmed to be three hours or more, when measured by at least one prospectively kept 24‐hour behaviour diary.

Because the condition is self limiting in most cases, regardless of any identified cause, offering no treatment is a viable option (Benninga 2016). The aetiopathogenesis of infantile colic remains undefined. It is most likely multifactorial, and may relate to behavioural factors (i.e. psychological and social) or biological components (i.e. food hypersensitivity, allergy, gut microorganisms, dysmotility). A range of explanatory theories has been proposed, including intestinal gas overproduction, forceful intestinal contraction, cow's milk protein hypersensitivity (Lucassen 2000), transient lactase deficiency (Kanabar 2001), infant's temperament (Canivet 2000), and mother's postpartum adjustment (Akman 2006). 

From an immunological perspective, cow's milk protein in infant formula or breast milk has been considered allergenic to the infant, thus inducing the symptoms of colic. Consequently, a low allergen maternal diet or hypoallergenic infant formula has been proposed as a form of treatment (Hill 2005; Iacovou 2018; Schach 2002). Since evidence shows that 25% of infants with moderate or severe colic have cow's milk protein‐dependent colic (Hill 2005), which improves in response to a hypoallergenic diet (Campbell 1989; Iacono 1991), dietetic treatment should be considered as a therapeutic approach (Perry 2011).

Lactose intolerance, due to a relative lactase deficiency, has been identified as a possible causative factor in infantile colic (Kanabar 2001). Malabsorption of carbohydrates results in colonic fermentation of sugars and a subsequent increase in levels of hydrogen gas (Infante 2011). The rapid production of hydrogen distends the colon, which when combined with the production of lactic acid and lactose within the bowel lumen, causes an influx of water via osmosis, leading to further distension of the bowel (Kanabar 2001). While some studies have revealed increased breath hydrogen levels in colicky infants (Hyman 2006; Moore 1988), this finding is inconsistent with those from other studies (Mentula 2008).

There is also a growing body of evidence suggesting that the intestinal microbiota in colicky infants differs from that in healthy controls. Lower counts and specific colonisation patterns of intestinal lactobacilli have been observed in colicky infants (Savino 2004; Savino 2005a; Savino 2010). Furthermore, coliform bacteria, namely Escherichia coli, have been found to be more abundant in the faeces of colicky infants, suggesting a role for coliform colonic fermentation, and consequent excessive intra‐intestinal air production, aerophagia, and pain (Rhoads 2009; Savino 2009). There is also evidence that the microbiota in infants without colicky symptoms is more diverse (De Weerth 2013).

Human milk contains prebiotics, defined as indigestible oligosaccharides, which are thought to enhance the proliferation of certain probiotic bacteria within the colon, especially the Bifidobacterium species; this is mimicked in many formula milks (Thomas 2010). Because of this, investigations of prebiotics alone as a dietary intervention for infantile colic in trials have not been conducted (Gordon 2018).

Description of the intervention

In a diverse microbial environment, microbes, such as Bifidobacterium and Lactobacilli, use most of the available nutrients that keep the growth of proteobacteria under control, commonly referred to as eubiosis. If the bacterial ecosystem becomes disrupted, proteobacteria may take over and prevent the growth of other organisms, stimulating intestinal inflammation; a state referred to as dysbiosis (Simonson 2021). Probiotics may serve a valuable role in the treatment of infantile colic.

Probiotics are live microorganisms that when administered in adequate amounts, confer a health benefit to the host (Sanders 2008). Lactobacillus and Bifidobacterium species are the organisms most frequently used as probiotics. Prebiotics are indigestible food ingredients that benefit the host by selectively stimulating favourable growth or activity, or both, of one or more indigenous probiotic bacteria (Roberfroid 2007). Synbiotics are products that contain both probiotics and prebiotics (Piątek 2021). These agents can be prepared as tablets, capsules, suspensions, dry foods, or granules. Since the licensing arrangements for prebiotic and probiotic preparations vary from agent to agent, there is a wide range of specific regimens available.

Commonly used probiotics for infantile colic include Lactobacillus reuteri (American Type Culture Collection Strain 55730 or DSM 17938 (Turco 2021)), Lactobacillus delbrueckii subsp. delbrueckii DSM 2007, and Lactobacillus plantarum MB 456, all of which have proven inhibitory activity against gas‐forming coliforms; and Bifidobacterium strains (Savino 2009), namely Bifidobacterium breve B632 (DSM 24706), B2274 (DSM 24707), B7840 (DSM 24708), Bifidobacterium longum subsp. longum B1975 (DSM 24709 (Chen 2021)).

How the intervention might work

There is growing evidence that suggests that supplementation with probiotics can modulate intestinal bacterial patterns by aiding the colonisation of beneficial bacteria (Boirivant 2007; Hill 2014). There is also an accumulation of evidence showing that infants with IC have more gas‐forming Clostridium difficile, Klebsiella pneumoniae, and Escherichia coli in their intestines, and less microbial diversity than their non‐colic counterparts (De Weerth 2013; Rhoads 2009; Savino 2009). It has been postulated that probiotics can suppress intestinal inflammation by preventing the overgrowth of inflammation‐inducing microbes and gas‐forming coliforms (Gareau 2010; Savino 2011). In theory, probiotic bacteria can also influence sulfate‐reducing bacteria, methanogens or acetogens, or both, which play an important role in the functioning of the gut (Nakamura 2010; Wong 2019).

Why it is important to do this review

Infantile colic, despite being a benign condition, is a significant source of maternal and paternal anxiety and depression (De Kruijff 2021), and impaired family functioning (Smart 2007), and is the most common reason for seeking medical advice within this age group (Wake 2006). Given its universal prevalence, the direct and indirect economic burden associated with IC is notable. In the UK, cost burden analysis estimates the direct cost to the NHS of infant colic and functional gastrointestinal disorders to be over GBP 70 million per annum, with over GBP 30 million spent on prescriptions for medicine (Mahon 2017). Furthermore, IC is associated with maternal postpartum depression, early breastfeeding cessation, parental guilt and frustration, shaken baby syndrome, formula change, and long‐term behaviour problems, all of which take a toll on both the healthcare system (which likely is far in excess of the estimate given above), and the health of all members of the family (Nocerino 2020). It has also been suggested that colic in infancy increases the susceptibility to recurrent abdominal pain, allergic diseases, and psychological disorders in childhood (Savino 2005b).

A previous Cochrane Review, published in 2019, found that prophylactic probiotics made little or no difference to the occurrence of infantile colic, but appeared to reduce crying time (Ong 2019). This finding was echoed in a further systematic review, which specifically highlighted the effectiveness of Lactobacillus reuteri DSM 17938 in reducing the duration of crying (Hjern 2020). Core to both of these were questions about the specific preparations, length of therapy, core characteristics of the infants and the presence of concurrent interventions, all of which need consideration. The impact of IC, alongside the increasing use of probiotics in neonatology and paediatrics, and the relatively low cost and availability of probiotics, reinforces the need to rigorously evaluate the current evidence on the efficacy and safety of probiotics in the field of infantile colic. This review is timely, as the number of trial registrations for infantile colic research continues to grow rapidly.

Objectives

To assess the effects of probiotics for infantile colic in infants younger than four months of age.

Methods

Criteria for considering studies for this review

Types of studies

We will include randomised controlled trials (RCTs) with any level of blinding. We will include standard parallel design RCTs and cross‐over RCTs. For cross‐over studies, we will treat the first treatment period as a parallel trial. For the purpose of analysis of efficacy and safety, we will only use data from the first treatment period.

Types of participants

Studies will be eligible for inclusion if they:

  1. Include infants up to four months of age at study enrolment; and

  2. Explicitly define the diagnostic criteria used for the case definition of infantile colic (IC). This may include, but is not limited to, the Wessel criteria (Wessel 1954), Rome III criteria (Hyman 2006), or Rome IV criteria (Benninga 2016).

We will not exclude studies based on the method of infant feeding (i.e. breastfed, formula‐fed, or a mixture of the two).

We will exclude studies involving infants born prematurely (< 32 weeks' gestation), infants with clinical illness, and infants who have received antibiotics, probiotics, prebiotics, or any combination of these medications in the period preceding the administration of trial products.

In studies that include participants across an age range that spans the four‐month cut‐off, we will only use the data for participants aged less than four months of age at study enrolment, provided these data are delineated from participants aged four months and above. If these are not available, we will request these from authors. If not provided, then the study will be excluded.

Types of interventions

Intervention

We will consider any study that describes the use of a probiotic, administered either alone or together with a prebiotic (as a synbiotic), regardless of the duration of the intervention. We will accept administration of the probiotic or synbiotic by any enteral route, and in any preparation (tablet, capsule, suspension, dry food, or granule), dose, frequency, and length of administration.

We will include studies in which parental training or reassurance is provided, as long as both the treatment and control arms receive the same training or reassurance, or both. Further to this, we will include studies in which dietary modifications were made to the mother's diet, as long as the modifications extend across both the intervention and control arms. 

We will exclude studies that investigate prebiotics only (i.e. without a probiotic).

Comparator

A range of therapeutic agents is frequently used in practice for infants with colic, including simethicone and herbal supplements. For the control arm, we will include any study that describes conventional care (e.g. simethicone), placebo, an active comparator (including other probiotics or synbiotics), or no treatment. We will include studies in which the interventional agent is administered alongside conventional care, provided the control arm receives the same conventional therapies. Control agents may include inert or prebiotic preparations.

Main comparisons

  1. Oral probiotics versus placebo or no intervention

  2. Oral probiotic versus active comparator

Types of outcome measures

For all proposed outcomes, we will collect data on the final outcomes from the end of the study period, in addition to outcomes reported at any other time interval. This will include (but not be limited to) outcomes reported on day 7, day 14, day 21, day 28, and day 35, and at 6 months, and 12 months.

We did not select the number of cases of infantile colic as the first primary outcome for a number of reasons. First, whilst the Rome III (Hyman 2006), Rome IV (Benninga 2016), and Wessel criteria (Wessel 1954), are internationally recognised diagnostic criteria for case definition of infantile colic, they are not proven to have validity as monitoring tools. This is especially an issue with the Rome IV criteria, due to the removal of any quantitative element to colic, meaning that once a person has met the criteria for diagnosis, defining a transition point for resolution becomes challenging. Second, in our group's work with caregivers, it is clear that once diagnosed, the priority is on improvements in crying time and treatment success. As such, we selected global success and crying time at study end points to be the two most important primary outcomes. 

Primary outcomes

  1. Global success, as defined by the primary study (dichotomous outcome). This may be defined in a number of ways, and may include the number of infants who experienced a reduction in specific symptoms by a set proportion or by an absolute value.  In studies that report both caregiver and physician assessments, preference will be given to the former.

  2. Crying time at study end points (continuous outcome, reported as minutes/day, hours/day, hours/week, or any other interval descriptor)

  3. The number of cases of infantile colic at the end of the study, determined with recognised criteria (i.e. Wessel criteria, Rome III criteria, Rome IV criteria), and reported by the primary study (dichotomous outcome)

  4. Withdrawal due to adverse events (dichotomous outcome). As per Zorzela 2016, we define an adverse event as an unfavourable or harmful outcome that occurs during, or after, the use of a therapeutic intervention, but is not necessarily caused by it.

Secondary outcomes

  1. Change in crying time between study start and study end in each group (continuous outcome, reported as minutes/day, hours/day, hours/week, or any other interval descriptor)

  2. Parental or family quality of life, assessed using validated scoring tools, at study end points (continuous outcome). These may include, but are not limited to: the PedsQL (Paediatric Quality of Life Inventory (Varni 2011)) and the Infant Colic Questionnaire (ColiQ (Bellaiche 2021)).

  3. The total number of serious adverse events, as defined by the primary study (dichotomous outcome). If sufficient information is available, we will specify individual serious adverse events.

  4. The total number of adverse events (dichotomous outcome). We plan to take an 'exploratory' approach to adverse event reporting, since there are no specific, significant safety concerns pertaining to probiotic use for infantile colic. We have chosen to take this approach to handle all unanticipated adverse events that are reported in the included studies, which may generate new signals to add to existing safety profiles.

Search methods for identification of studies

We will identify relevant trials by searching the sources described below.

Electronic searches

We will search the following databases and trial registers.

  1. Cochrane Central Register of Controlled Trials (CENTRAL; current issue), in the Cochrane Library, which includes the Cochrane Developmental, Psychosocial and Learning Problems Specialised Register

  2. MEDLINE(R) and Epub Ahead of Print, In‐Process, In‐Data‐Review & Other Non‐Indexed Citations, Daily and Versions(R) Ovid (1946 onwards). We will run separate searches to find non‐indexed records.

  3. Embase Ovid (1974 onwards)

  4. CINAHL EBSCOhost (Cumulative Index to Nursing and Allied Health Literature; 1937 onwards)

  5. APA PsycINFO Ovid (1806 onwards)

  6. Web of Science Core Collection Clarivate (Science Citation Index – Expanded; Social Sciences Citation Index; Conference Proceedings Citation Index – Science; Conference Proceedings Citation Index – Social Science and Humanities; 1970 onwards)

  7. Proquest Dissertations & Theses Global (1637 onwards)

  8. Cochrane Database of Systematic Reviews (CDSR; current issue), in the Cochrane Library

  9. Epistemonikos (www.epistemonikos.org/en/)

  10. ISRCTN registry (www.isrctn.com/)

  11. US National Institutes of Health Ongoing Trials Register ClinicalTrials.gov (clinicaltrials.gov)

  12. Australian and New Zealand Clinical Trials Registry (anzctr.org.au)

  13. World Health Organization International Trials Registry Platform (WHO ICTRP; apps.who.int/trialsearch)

The search strategy for MEDLINE is in Appendix 1. It uses the sensitivity maximising version of the Cochrane highly sensitive search strategy for identifying RCTs or quasi‐RCTs, as recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Lefebvre 2021). We will adapt this strategy for other databases, without imposing any date or language restrictions. 

If we identify non‐English studies that are potentially eligible based on their abstract, we will place them in the Awaiting classification section, and organise translation from one of a number of sources. These sources include the language departments of the authors' host higher education institution, through the Cochrane network, or via links to other researchers in the field. 

Searching other resources

Supplementary searching

We will inspect the bibliographies of included studies for references to randomised controlled trials that may be relevant to this review. We will contact the authors of included studies to request any missing or incomplete data. Further to this, we will also inspect the reference lists of relevant systematic reviews.

Handsearching

We will handsearch conference proceedings from Digestive Disease Week, United European Gastroenterology Week, and the European Society for Paediatric Gastroenterology, Hepatology and Nutrition annual scientific meetings from the past two years, to identify other potentially relevant studies that may not be indexed in bibliographic databases, or if published, not published in full. Concerns have been raised regarding the accuracy of data reported in abstract publications (Pitkin 1999). Therefore, if we identify references to relevant unpublished or ongoing studies, we will attempt to collect sufficient extra information to enable inclusion in this review. We will only include studies from the grey literature if sufficient data are reported to judge eligibility for inclusion. If data are incomplete, we will contact the study authors to verify the eligibility of the study, and we will only include the study if suitable data to assess quality and outcomes are supplied.

Data collection and analysis

Selection of studies

Two review authors (CGC and VS) will independently screen studies for eligibility at the title, abstract, and full‐text review stages, as described below. We will use the systematic review management system Covidence to upload search results, screen abstracts and full‐text study reports, and export data into Excel. We will select studies in accordance with the recommendations in the Cochrane Handbook for Systematic Reviews of Interventions (Lefebvre 2021).

  1. Merge search results from different sources using the reference management system, Covidence, in which duplicate records of the same report will be removed (i.e. records reporting the same journal title, volume, and pages).

  2. Screen the titles and abstracts of all records yielded by the search, discarding those that are clearly irrelevant, and progressing all others with a reasonable possibility of inclusion.

  3. Retrieve the full‐text report of potentially relevant records.

  4. Link multiple reports of the same study. We will not discard secondary reports of a study, since they may contain valuable information about the study.

  5. We will carefully examine full‐text reports, selecting studies against the inclusion and exclusion criteria of this review (see Criteria for considering studies for this review). For each study we exclude at full‐text screening, we will assign a reason for exclusion to each record. We will also progress studies for which further information is required to determine eligibility. Studies that may appear to meet the eligibility criteria, but when inspected further, do not meet criteria for inclusion, will be documented in the characteristics of excluded studies tables, with specific reasons for exclusion.

  6. Correspond with investigators, when required, to clarify study eligibility. This may involve requesting further information to enable us to make a judgement. Note: we will not omit studies from this review solely because measured outcome data were not reported.

  7. Differences in assessment between review authors will be managed through discussion. If disagreement persists, adjudication by a third review author (MG) will take place.

We will outline the selection process in a PRISMA flowchart (Liberati 2009; Page 2021a).
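
The duplicate‐removal rule in step 1 (records reporting the same journal title, volume, and pages) can be illustrated with a short Python sketch. This is not how Covidence is implemented; the record layout (a dictionary with `journal`, `volume`, and `pages` keys) and the function names are assumptions made for the example, and the normalisation (trimming whitespace, ignoring case) is one plausible choice among several.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]  # hypothetical record layout for this sketch


def dedup_key(record: Record) -> Tuple[str, str, str]:
    """Build a comparison key from journal title, volume, and pages."""
    return (
        record["journal"].strip().lower(),
        record["volume"].strip(),
        record["pages"].strip(),
    )


def remove_duplicates(records: List[Record]) -> List[Record]:
    """Keep the first record seen for each (journal, volume, pages) key."""
    seen = set()
    unique = []
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Keeping the first record for each key preserves the order in which the sources were merged, which makes it easier to trace a retained record back to the database it came from.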

Data extraction and management

We will develop a data extraction form a priori, as per the recommendations in the Cochrane Handbook for Systematic Reviews of Interventions, and will pilot the form on two random RCTs to ensure it is fit for purpose (Li 2021). Two pairs of review authors will each be assigned 50% of the reports meeting the inclusion criteria (pair 1: CGC and CW; pair 2: VS and RG). Each review author within the pair will independently extract and record the data using the pre‐designed data extraction form. Following data extraction, the within‐pair review authors will compare extractions, and discuss and resolve any differences. A fifth author, not involved in the extraction (MG), will adjudicate in instances of persisting disagreement. CGC will upload the extracted data into RevMan Web (RevMan Web 2022).

We will record the study title, author list, year of publication, and country of publication for each study. Following this, we will extract the following data for each study.

  1. Method: study design, setting (i.e. hospital, primary care), study period (period of time through which participants were enrolled)

  2. Participants: inclusion criteria, exclusion criteria, concurrent therapies, number randomised, number analysed, post‐randomisation dropouts and exclusions, sex, age (mean ± standard deviation (SD)), method of diagnosing colic, mode of infant feeding (i.e. breastfed only, formula‐fed only, mixed feeding)

  3. Intervention: number of arms within the trial, intervention group treatment regimen, control group treatment regimen, length of the intervention, timing of follow‐up

  4. Outcome: outcomes reported (as per Primary outcomes and Secondary outcomes). This will include the number of cases of colic at the study end point (and at other time points, if reported), crying time at baseline and at the study end point (and at other time points, if reported), the definition of global success, global success at the study end point (and at other time points, if reported), adverse events and the number of withdrawals due to adverse events (we will note whether studies actively monitored for adverse events, or if they simply provided spontaneous reporting of adverse events). For continuous outcomes, we will extract the mean and SD at baseline and each time point. If the mean value is not provided, we will extract the median and interquartile range instead.

  5. Other: trial registration details, conflicts of interest, funding details, and risk of bias assessments (based on study design and duration, sequence generation, allocation concealment, blinding of outcome assessors, and evaluation of the success of blinding). We will also record details of any email communication with the study authors.

Assessment of risk of bias in included studies

The included studies will be split into two groups, each allocated to a pair of review authors (pair 1: CGC and CW; pair 2: VS and RG). Within the pairs, each review author will independently evaluate each study for risk of bias, using the criteria recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011) and set out in Appendix 2. 

Using RoB 1, we will assess risk of bias across the following domains: sequence generation; allocation concealment; blinding of parents and health professionals; blinding of outcome assessment; incomplete outcome data; selective outcome reporting; and other potential threats to validity. Review author pairs will discuss their judgements, and resolve any differences in opinions. A fifth review author, not involved in making risk of bias assessments (MG), will resolve any persisting disagreements. We will present the risk of bias judgements for each study in the risk of bias tables in RevMan Web (RevMan Web 2022).

We will assess the risk of bias as high, low, or unclear for each domain. We will consider studies that receive a judgement of high risk of bias in one or more domain(s) to be at high risk of bias overall; those that receive a judgement of low risk of bias in all domains to be at low risk of bias overall; and those that receive a judgement of unclear risk of bias in one or more domains to be at unclear risk of bias overall.
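
The rule for deriving an overall judgement from the domain‐level judgements can be sketched in Python as follows. The function name and the dictionary layout are assumptions for illustration. The text does not say what happens when a study has both high and unclear domains; this sketch assumes that a single high judgement takes precedence, which is labelled in the code.

```python
from typing import Dict

VALID = {"low", "high", "unclear"}


def overall_risk_of_bias(domain_judgements: Dict[str, str]) -> str:
    """Combine per-domain RoB 1 judgements into one overall judgement."""
    values = [judgement.lower() for judgement in domain_judgements.values()]
    if not values:
        raise ValueError("at least one domain judgement is required")
    for value in values:
        if value not in VALID:
            raise ValueError(f"unknown judgement: {value!r}")
    # Assumption: 'high' in any domain outranks 'unclear' in other domains.
    if "high" in values:
        return "high"
    if all(value == "low" for value in values):
        return "low"
    # No high domains and at least one unclear domain.
    return "unclear"
```

For example, a study judged low in every domain except an unclear allocation concealment would be returned as unclear overall.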

Measures of treatment effect

We will determine measures of treatment effect as per the recommendations set out in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2021).

Dichotomous outcomes

For dichotomous outcomes, we will calculate the risk ratio (RR) and corresponding 95% confidence interval (CI). 
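
As a worked illustration of this calculation, the Python sketch below computes a risk ratio and its 95% CI from a two‐by‐two table using the usual large‐sample standard error of the log risk ratio. The function name and arguments are assumptions for the example; it does not handle zero cells, which would need a continuity correction or another method, and it is not a substitute for the analysis performed in RevMan Web.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def risk_ratio(events_t: int, total_t: int,
               events_c: int, total_c: int) -> Tuple[float, float, float]:
    """Return (RR, lower 95% limit, upper 95% limit) for treatment vs control."""
    if events_t <= 0 or events_c <= 0:
        raise ValueError("zero event counts are not handled in this sketch")
    rr = (events_t / total_t) / (events_c / total_c)
    # Standard error of ln(RR) from the 2x2 table.
    se_log_rr = math.sqrt(1 / events_t - 1 / total_t + 1 / events_c - 1 / total_c)
    lower = math.exp(math.log(rr) - Z_95 * se_log_rr)
    upper = math.exp(math.log(rr) + Z_95 * se_log_rr)
    return rr, lower, upper
```

With 10 events among 100 treated infants and 20 among 100 controls, the function returns a risk ratio of 0.5 with an interval that just crosses 1, showing how a halving of risk can still be statistically uncertain in a small trial.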

Continuous outcomes

For continuous outcomes measured on the same scale, we will extract mean change or end point data to calculate a mean difference (MD) with corresponding 95% confidence intervals. 

For continuous outcomes measured on different scales, we will extract mean change from baseline or end point data and the corresponding SDs or standard errors (SEs) to calculate a standardised mean difference (SMD) with 95% CIs. We will not combine change and postintervention data in a single analysis using SMD. Instead, we will use postintervention SDs rather than change score SDs.
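
The two effect measures for continuous outcomes can be sketched in Python as below. The function names and argument order are assumptions for illustration. The SMD shown is Hedges' adjusted g with a common large‐sample approximation for its standard error; the exact variance formula used by RevMan Web may differ slightly, so the numbers are indicative rather than definitive.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def mean_difference(m1: float, sd1: float, n1: int,
                    m2: float, sd2: float, n2: int) -> Tuple[float, float, float]:
    """Return (MD, lower 95% limit, upper 95% limit) on the original scale."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, md - Z_95 * se, md + Z_95 * se


def standardised_mean_difference(m1: float, sd1: float, n1: int,
                                 m2: float, sd2: float, n2: int
                                 ) -> Tuple[float, float, float]:
    """Return (Hedges' g, lower 95% limit, upper 95% limit)."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd
    g = (1 - 3 / (4 * df - 1)) * d  # small-sample correction
    # Approximate standard error (an assumption of this sketch).
    se = math.sqrt((n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2)))
    return g, g - Z_95 * se, g + Z_95 * se
```

For example, mean crying times of 100 and 120 minutes per day (SD 20, 50 infants per group) give an MD of −20 minutes per day and an SMD just under −1.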

If both continuous and dichotomous data are available for an outcome, we will include only the continuous outcome in the primary analysis. If some studies report an outcome as a dichotomous measure and others use a continuous measure of the same construct, we will convert the results of the dichotomous measure to a standardised mean difference (SMD), provided we can assume that the underlying continuous measure has a normal or near‐normal distribution or logistic distribution. Otherwise, we will undertake two separate analyses.
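
One widely described way to perform this conversion, and the one assumed in the sketch below, is to re‐express the log odds ratio as an SMD by multiplying it by √3/π, which follows from the logistic distribution assumption. The function names are hypothetical, and the protocol does not state that this exact formula will be used.

```python
import math
from typing import Tuple

LOGISTIC_FACTOR = math.sqrt(3) / math.pi  # approximately 0.5513


def log_odds_ratio(events_t: int, total_t: int,
                   events_c: int, total_c: int) -> Tuple[float, float]:
    """Return (ln OR, SE of ln OR) from a 2x2 table without zero cells."""
    a, b = events_t, total_t - events_t
    c, d = events_c, total_c - events_c
    if min(a, b, c, d) <= 0:
        raise ValueError("zero cells are not handled in this sketch")
    return math.log((a * d) / (b * c)), math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def log_or_to_smd(log_or: float, se_log_or: float) -> Tuple[float, float]:
    """Convert a log odds ratio and its SE to an SMD and its SE."""
    return LOGISTIC_FACTOR * log_or, LOGISTIC_FACTOR * se_log_or
```

The converted SMD and its standard error can then be pooled with SMDs from studies that reported the outcome on a continuous scale.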

Unit of analysis issues

We will assess all included trials to determine the unit of randomisation, and whether this unit of randomisation is consistent with the unit that has been analysed (i.e. the number of observations in the analysis matches the number of units that were randomised (Deeks 2021)).

Studies with multiple treatment arms

For studies comparing more than two intervention groups, we will make multiple pair‐wise comparisons between all possible pairs of intervention groups. To avoid double counts, we will divide shared intervention or comparator groups evenly among the comparisons. For dichotomous outcomes, we will divide both the number of events and the total number of participants. For continuous outcomes, we will only divide the total number of participants, and leave the means and standard deviations unchanged. If we find a treatment arm is not relevant to our study outcomes, we will exclude the group from our analysis. We will clearly document all decisions in the characteristics of included studies tables.
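
The splitting of a shared group can be illustrated with the Python sketch below. The function names are assumptions for the example. Division is exact here, so fractional participants or events can result; whether and how to round them is a decision for the review team and is not specified in the text.

```python
from typing import Tuple


def split_shared_dichotomous(events: float, total: float,
                             n_comparisons: int) -> Tuple[float, float]:
    """Divide both events and participants of a shared group evenly."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return events / n_comparisons, total / n_comparisons


def split_shared_continuous(mean: float, sd: float, total: float,
                            n_comparisons: int) -> Tuple[float, float, float]:
    """Divide only the participants; the mean and SD are left unchanged."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return mean, sd, total / n_comparisons
```

For a three‐arm trial with one placebo arm of 60 infants and 12 events compared against two probiotic arms, each comparison would use a placebo group of 30 infants with 6 events.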

Cross‐over studies

We will include cross‐over studies, but we will only pool their data if before and after cross‐over data are reported separately, and we will only use pre‐cross‐over data. This is to avoid the risk of carry‐over effects from the prior intervention to the second phase of the study.

Cluster‐randomised studies

It is unlikely that we will find cluster‐randomised trials, because such a design is uncommon in this field. If we do identify relevant cluster‐RCTs, we will only use data if the trial authors used appropriate statistical methods to take the clustering effect into account. If it is unclear whether appropriate controls for cluster effects have been carried out, we will contact the study authors to obtain the necessary information. If appropriate controls have not been applied, we will request the individual patient data (IPD), and re‐analyse the data using the generic inverse‐variance method to adjust for correlation, as outlined in the Cochrane Handbook for Systematic Reviews of Interventions (Deeks 2021). We will exclude cluster‐RCTs in a sensitivity analysis to assess their impact on the results (Sensitivity analysis).

Trials with repeated measurement

In studies with repeated measurements for the same infant over different time points (e.g. assessing crying time at multiple time points), we will prioritise measurements taken at the end of the period. We will conduct separate analyses for data from different points of measurement. 

Dealing with missing data

We will assess missing data and dropouts for each included study. We will attempt to contact the primary study authors to request any relevant missing data. If study authors provide the missing data, we will include these data according to intention‐to‐treat (ITT) principles. For all outcomes in all studies, we will carry out analyses, as far as possible, on an ITT basis; that is, we will attempt to include all infants randomised to each group in the analyses, and we will analyse all infants in the group to which they were allocated, regardless of whether they received the allocated intervention. If applicable, we will conduct a sensitivity analysis where we include studies with available data (see Sensitivity analysis).

If we are unable to obtain a response from study authors after two attempts, we will make no further efforts to obtain information. We will analyse the information that is available and state our assumptions about whether the data are 'not missing at random' (i.e. due to unfavourable outcomes or non‐adherence to treatment) or 'missing at random'.

If there are missing summary data, we will attempt to contact the study authors for missing data. If we are unable to obtain these data (i.e. standard deviation or mean of the outcomes), we will aim to derive calculated values as per the recommendations in the Cochrane Handbook for Systematic Reviews of Interventions (Deeks 2021). For missing continuous data, we will estimate standard deviations from other available data, such as standard errors, or we will impute them using the methods suggested in Deeks 2021. We will conduct analyses for continuous outcomes based on participants completing the trial, in line with available case analysis; this will assume that data are missing at random. If there is a discrepancy between the number randomised and the number analysed in each treatment group, we will calculate and report the percentage lost to follow‐up in each group. When it is not possible to obtain missing data, we will record this in the data extraction form, report it in our risk of bias tables, and discuss the extent to which the missing data could alter the results, and hence the conclusions of the review. 
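As a rough sketch of how a missing standard deviation might be derived from other reported statistics, the functions below use two standard relationships for a single group mean: SD = SE × √n, and, for a 95% CI in a reasonably large sample, SD = √n × (upper − lower) / 3.92. The function names and the example numbers are hypothetical; the CI formula assumes a normal approximation and would need a t‐based divisor for small samples.

```python
import math

def sd_from_se(se, n):
    """Recover the SD of a group from the SE of its mean."""
    return se * math.sqrt(n)

def sd_from_ci95(lower, upper, n):
    """Recover the SD of a group from the 95% CI of its mean.

    Assumes a large-sample (normal) interval, so the interval width
    equals 2 * 1.96 = 3.92 standard errors.
    """
    return math.sqrt(n) * (upper - lower) / 3.92

# Hypothetical report: mean daily crying time with SE 2.0 minutes, n = 25.
print(sd_from_se(2.0, 25))
# Hypothetical report: 95% CI 8.04 to 11.96 minutes, n = 100.
print(sd_from_ci95(8.04, 11.96, 100))
```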

Assessment of heterogeneity

We will consider clinical heterogeneity across a number of key characteristics, including participants, interventions, comparators, and outcomes. We will assess methodological heterogeneity by examining the methodological characteristics (i.e. variation in study design and outcome measurement tools), and conducting risk of bias assessments with RoB 1 (Higgins 2011). 

To test for statistical homogeneity or heterogeneity of effect sizes between studies, we will inspect the forest plots (Egger 1997), and use a Chi² test. A P value of less than 0.05 will give an indication of the presence of heterogeneity. Inconsistency will be quantified and represented by the I² statistic, a quantity that describes the approximate proportion of variation in point estimates that can be attributed to heterogeneity rather than sampling error. We will interpret the values as described in the Cochrane Handbook for Systematic Reviews of Interventions (Deeks 2021):

  1. 0% to 40%: might not be important;

  2. 30% to 60%: may represent moderate heterogeneity;

  3. 50% to 90%: may represent substantial heterogeneity; or

  4. 75% to 100%: considerable heterogeneity.

We will not pool data in a meta‐analysis if we detect a considerable degree of statistical heterogeneity (I² > 75%). If there is considerable statistical heterogeneity, we will visually review studies, and investigate whether the heterogeneity can be explained on clinical or methodological grounds, in which case, we will conduct subgroup analysis as planned, and report appropriately (see Subgroup analysis and investigation of heterogeneity). If we cannot find reasons for the considerable statistical heterogeneity, we will present the results narratively, in detail.

We will also report Tau² as an estimate of between‐study variation when using the random‐effects model.
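The heterogeneity statistics named above can be illustrated with a short sketch. It computes Cochran's Q (the basis of the Chi² test), I², and Tau² using the DerSimonian and Laird moment estimator; the choice of that estimator is an assumption made for illustration, and the effect estimates are invented.

```python
def heterogeneity(effects, ses):
    """Return Cochran's Q, I² (as a percentage), and Tau² (DerSimonian-Laird)."""
    weights = [1 / se ** 2 for se in ses]           # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2

# Hypothetical mean differences in crying time (minutes/day) and their SEs.
effects = [-25.0, -40.0, -10.0, -55.0]
ses = [8.0, 10.0, 7.0, 12.0]
q, i2, tau2 = heterogeneity(effects, ses)
print(f"Q = {q:.2f} on {len(effects) - 1} df, I² = {i2:.1f}%, Tau² = {tau2:.1f}")
```

Q would be compared against a Chi² distribution with (number of studies − 1) degrees of freedom, and the I² value read against the bands listed above.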

Assessment of reporting biases

If we pool data from 10 or more studies in a meta‐analysis, we will construct funnel plots to explore the relationship between the intervention effect estimate and standard error of the intervention effect estimate (Page 2021b). If there appears to be asymmetry, we will explore whether publication bias or small study effects explain it, using Begg's and Egger's tests (Begg 1994; Egger 1997).
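For illustration, Egger's regression approach can be sketched as an ordinary least‐squares regression of the standardised effect (effect / SE) on precision (1 / SE), where an intercept far from zero suggests funnel plot asymmetry. The function name and data below are hypothetical, and the sketch stops at the intercept and its standard error rather than computing a P value.

```python
import math

def egger_intercept(effects, ses):
    """Return the Egger regression intercept and its standard error.

    Regresses effect/SE on 1/SE by ordinary least squares.
    Requires at least three studies.
    """
    n = len(effects)
    x = [1 / se for se in ses]                     # precision
    z = [y / se for y, se in zip(effects, ses)]    # standardised effect
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    resid_var = sum((zi - intercept - slope * xi) ** 2
                    for xi, zi in zip(x, z)) / (n - 2)
    se_intercept = math.sqrt(resid_var * (1 / n + x_bar ** 2 / sxx))
    return intercept, se_intercept

# Hypothetical log risk ratios and SEs from 10 studies.
effects = [-0.60, -0.45, -0.80, -0.30, -0.95, -0.20, -0.50, -1.10, -0.35, -0.70]
ses = [0.30, 0.22, 0.40, 0.15, 0.48, 0.12, 0.25, 0.55, 0.18, 0.35]
intercept, se_intercept = egger_intercept(effects, ses)
print(f"Egger intercept = {intercept:.2f} (SE {se_intercept:.2f})")
```

The ratio of the intercept to its standard error would be compared against a t distribution with (number of studies − 2) degrees of freedom.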

We will search for protocols or study records of all RCTs included in the review. Where available, we will compare the outcomes reported in the protocol or study record with those in the published report. If we are unable to find the protocol or study record, even after contacting the study authors, we will compare the outcomes reported in the Methods section with those reported in the Results section of the published report. We will identify outcome reporting bias where outcomes are included in the protocol, study record, or Methods section of the published report, but are not included in the Results section (Assessment of risk of bias in included studies).

Data synthesis

We will provide a narrative synthesis of the key characteristics for the included studies (i.e. number of included studies; study designs of the included studies, characteristics of the participants included across the studies; interventions used in both treatment and control groups; and outcome measures reported).

We will pool risk ratios (RR) for dichotomous outcomes and mean differences (MD), or standardised mean differences (SMD) for continuous outcomes, alongside 95% confidence intervals (CI). We will carry out standard pair‐wise meta‐analysis if two or more studies assessed similar populations, interventions, and outcomes. We will analyse studies using RevMan Web (RevMan Web 2022). We will synthesise data using the random‐effects model with inverse‐variance weighting, as this approach minimises the imprecision of the pooled effect estimate (Deeks 2021). 
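A minimal sketch of random‐effects, inverse‐variance pooling of risk ratios is given below. It is not the RevMan Web implementation: the DerSimonian and Laird estimate of between‐study variance is assumed, the trial counts are invented, and studies with zero events in either arm would need a continuity correction that is not shown.

```python
import math

def log_rr(events_t, n_t, events_c, n_c):
    """Return the log risk ratio and its SE for one study (no zero cells)."""
    rr = (events_t / n_t) / (events_c / n_c)
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    return math.log(rr), se

def pool_random_effects(effects, ses):
    """Pool effects with inverse-variance weights and DerSimonian-Laird Tau²."""
    w = [1 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add Tau² to each study's variance.
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return pooled, se_pooled, tau2

# Hypothetical counts: (events_probiotic, n_probiotic, events_control, n_control)
trials = [(10, 50, 20, 50), (8, 40, 15, 42), (12, 60, 18, 58)]
logs = [log_rr(*t) for t in trials]
pooled, se_pooled, tau2 = pool_random_effects([l[0] for l in logs],
                                              [l[1] for l in logs])
rr = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * se_pooled), math.exp(pooled + 1.96 * se_pooled))
print(f"RR = {rr:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, Tau² = {tau2:.3f}")
```

Pooling is done on the log scale and the result is back‐transformed, which is why the confidence interval for the RR is not symmetric around the point estimate.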

We will examine the following comparisons: oral probiotics versus placebo or no intervention; and oral probiotics versus active comparator, focussing only on direct comparisons, and making no assumptions about indirect comparisons. 

If we are unable to carry out a meta‐analysis (e.g. the data are too heterogeneous, statistical heterogeneity is high, or there are too few studies), we will present a narrative summary of the results, which we will report according to the Synthesis Without Meta‐analysis (SWiM) guideline (Campbell 2020).

Subgroup analysis and investigation of heterogeneity

We will undertake subgroup analyses of potential effect modifiers if enough data are available (at least two studies for each analysis). We will use the formal test for subgroup interactions in RevMan Web (RevMan Web 2022), and will use caution in the interpretation of subgroup analyses, as per the recommendations in the Cochrane Handbook for Systematic Reviews of Interventions (Deeks 2021). The magnitude of the effects will be compared between the subgroups by assessing the overlap of the CIs of the summary estimate. Non‐overlap of the CIs indicates statistical significance. 

  1. Specific probiotic preparations or species

  2. Probiotic dose

  3. Length of therapy

  4. Time of outcome measurement (i.e. day 7, day 14, day 21, day 28, and day 35; at 6 months, and 12 months)

  5. Parental education, reassurance, or dietary modifications as adjunct therapy

  6. Type of feeding (i.e. breastfed only, formula‐fed only, mixed feeding)
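The comparison of two subgroup summary estimates can be sketched as follows. The sketch reports both a z‐test for the difference (for two subgroups, z² equals the between‐subgroup Chi² statistic on one degree of freedom) and the CI‐overlap check described above. The function name and the numbers are hypothetical.

```python
import math

def compare_two_subgroups(est_a, se_a, est_b, se_b):
    """Compare two subgroup summary estimates on the analysis scale."""
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided P value
    ci_a = (est_a - 1.96 * se_a, est_a + 1.96 * se_a)
    ci_b = (est_b - 1.96 * se_b, est_b + 1.96 * se_b)
    overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
    return {"z": z, "p": p, "ci_overlap": overlap}

# Hypothetical pooled SMDs: breastfed subgroup vs formula-fed subgroup.
result = compare_two_subgroups(est_a=0.0, se_a=0.1, est_b=0.3, se_b=0.1)
print(result)
```

In this invented example the two 95% CIs overlap even though the test for the difference gives P < 0.05, which illustrates that non‐overlap implies a statistically significant difference but overlap does not rule one out.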

Sensitivity analysis

We plan to undertake sensitivity analyses on the primary outcomes, to assess whether the findings of the review are robust to the decisions made during the review process. In particular, we will reanalyse the data:

  1. excluding studies at high or unclear risk of selection and performance bias;

  2. excluding studies that did not use the international consensus definitions of Wessel or Rome criteria (Lacy 2016; Wessel 1954);

  3. using a fixed‐effect model; and

  4. excluding studies with imputed data, including only studies with available data.

Summary of findings and assessment of the certainty of the evidence

We will assess the overall certainty of the evidence for the primary outcomes (i.e. global success, crying time, number of cases of infantile colic, and withdrawal due to adverse events) and selected secondary outcomes (parental or family quality of life, and the total number of adverse events) using the GRADE approach (GRADEpro GDT; Guyatt 2013). We will report these outcomes at the end of study time point. The GRADE approach appraises the certainty of a body of evidence, based on the extent to which one can be confident that an estimate of effect reflects the item being assessed. Evidence from RCTs starts as high‐certainty, but may be downgraded by up to three levels due to risk of bias, indirectness of evidence, unexplained heterogeneity, imprecision (sparse data), and publication bias. Two review authors (CGC and VS) will independently assess the overall certainty of the evidence for each outcome after considering each of these factors, and grade it as one of the following.

  1. High certainty: we are very confident that the true effect lies close to that of the estimate of the effect

  2. Moderate certainty: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different

  3. Low certainty: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect

  4. Very low certainty: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect

The review authors will manage their differences in assessment through discussion. If disagreement persists, they will ask a third review author (MG) to adjudicate.

We will justify all decisions to downgrade the certainty of the evidence in the table footnotes, and we will make comments to aid the reader's understanding of the review, where necessary.

We will construct summary of findings tables for the following comparisons.

  1. Oral probiotics versus placebo or no intervention

  2. Oral probiotics versus active comparator