Educational interventions to prevent paediatric abusive head trauma in babies younger than one year old: A systematic review and meta-analyses

Background: Paediatric abusive head trauma (AHT) occurs in young children due to violent shaking or blunt impact. Educational and behavioural programmes modifying parent/infant interactions may aid primary prevention. This systematic review aims to assess the effectiveness of such interventions to prevent AHT in infants.
Methods: We searched Embase, MEDLINE, PsycINFO, the Cochrane Library, CINAHL databases and trial registries to September 2021, for studies assessing the effectiveness of educational and behavioural interventions in preventing AHT. Eligible interventions had to include messaging about avoiding infant shaking or about its dangers. Randomised controlled trials (RCTs) reporting results for primary (AHT, infant shaking) or secondary outcomes (including parental responses to infant crying, mental wellbeing), and non-randomised studies (NRSs) reporting primary outcomes were included. Evidence from combinable studies was synthesised using random-effects meta-analyses. Certainty of evidence was assessed using the GRADE framework. PROSPERO registration CRD42020195644.
Findings: Of 25 identified studies, 16 were included in meta-analyses. Five NRSs reported results for AHT, of which four were meta-analysed (summary odds ratio [OR] 0.95, 95 % confidence interval [CI] 0.80 – 1.13). Two studies assessed self-reported shaking (one cluster-RCT, OR 0.11, 95 % CI 0.02 – 0.53; one cohort study, OR 0.36, 95 % CI 0.20 – 0.64, not pooled). Meta-analyses of secondary outcomes demonstrated marginal improvements in parental response to inconsolable crying (summary mean difference 1.58, 95 % CI 0.11 – 3.06, on a 100-point scale) and weak evidence that interventions increased walking away from crying infants (summary incidence rate ratio 1.52, 95 % CI 0.94 – 2.45). No intervention effects were found in meta-analyses of parental mental wellbeing or other responses to crying.
Interpretation: Low certainty evidence suggests that educational programmes for AHT prevention are not effective in preventing AHT. There is low to moderate certainty evidence that educational interventions have no effect or only marginally improve some parental responses to infant crying.

Introduction
Paediatric abusive head trauma (AHT) is an injury inflicted on young children, usually through either a direct blow to the head or violent shaking. Most victims are younger than six months (Blumenthal, 2002), and perpetrators are usually parents, in particular fathers or father figures (Starling, Holden, & Jenny, 1995). Among infants under one year old in high-income countries, the incidence is 21-35 cases per 100,000 infants (Minns, Jones, & Mok, 2008). Around 18-25 % die due to the injuries sustained, and up to 80 % of survivors are left with lifelong cognitive or neurological impairment (Barr, 2012), such as damaged vision and hearing. AHT is associated with significant financial impact to health and social care, legal costs related to safeguarding processes, and societal costs related to long-term care needs of survivors (Beaulieu, Rajabali, Zheng, & Pike, 2019; Miller, Steinbeigle, Lawrence, et al., 2018; Stabile & Allin, 2012).
The pathway from education to changing behaviour first requires learning and knowledge, which in turn may change attitudes, leading to behaviour change (Ajzen & Fishbein, 1980). Educational and behavioural programmes aim to prevent AHT by modifying carer/infant interactions, particularly during times of peak infant crying (Dias, Cappos, Rottmund, et al., 2021). Most preventive interventions therefore focus on educating carers on patterns of infant crying, and the dangers of shaking their infant (Barr, n.d.; Altman et al., 2011; Bechtel et al., 2011; Bechtel, Gaither, & Leventhal, 2020; Dias et al., 2005; Groisberg, Hashmi, & Girardet, 2020). One such intervention, implemented in New York State and evaluated in an uncontrolled observational study, reported a large reduction in cases (Altman et al., 2011; Dias et al., 2005). These findings were followed by development and implementation of similar programmes, including "The Period of PURPLE Crying" (Barr, n.d.), "I promise", "Take 5" (Bechtel et al., 2011) and the ICON programme (ICON, 2020).
Evaluations of these programmes have been limited in size and scope. The programmes have been implemented and maintained with varying success, and a key challenge has been the paucity of robust evidence on their effectiveness. Identifying whether any of these programmes are effective in reducing this catastrophic injury is essential to inform preventive policies.
We aimed to systematically review preventative strategies, evaluating the effectiveness of educational interventions aimed at reducing AHT in infants younger than one year old.

Study design
We included randomised controlled trials (RCTs). Comparative non-randomised intervention studies (NRSs) were also included if they reported our primary outcomes (AHT, incidence of infant shaking), as AHT is a rare outcome which is rarely measured in RCTs. We excluded uncontrolled before-after/pre-post designs, as studies without a comparison group cannot provide estimates of intervention effectiveness.

Participants
The target population included parents, expectant parents, and carers of children under one year old (at the point of enrolment or start of intervention). We also included studies where an intervention was delivered to health and/or social care professionals supporting parents and carers (e.g. training to enable delivery of messages about shaking and AHT), provided the study reported outcomes in infants and/or parents.

Interventions
We included educational or behavioural interventions aimed at preventing AHT. Interventions primarily focused on managing infant crying were also included if the intervention included messaging about the dangers of infant shaking, or instructions to not shake the baby. Studies did not have to state an explicit aim of preventing AHT. Eligible settings included primary, secondary and community health/social care.

Comparators
We included studies that compared interventions with either no intervention, standard/usual care, or an alternative educational/ behavioural intervention. The comparison group could also include some elements of AHT education as in many countries this constitutes standard care.

Outcomes
The primary outcomes were the incidence of AHT and the incidence of infant shaking (e.g. self-reported by parent or carer). Secondary outcomes in parents/carers were (i) response to inconsolable crying, (ii) mental health (e.g. anxiety and depression), (iii) confidence/self-efficacy, (iv) emotional regulation (e.g. use of coping strategies), (v) stress/frustration, and (vi) frequency of seeking support (e.g. from GP or health visitor). Secondary outcomes in infants were (i) frequency of crying, (ii) sleep patterns (e.g. hours of continuous or total sleep), (iii) other forms of abuse (including maltreatment or neglect), and (iv) mortality. We did not explore knowledge and learning based outcomes (e.g. knowledge of shaken baby syndrome). Whilst learning is an important step on the pathway to behaviour change, the measurement of the behaviour change itself is required to assess the effectiveness of educational interventions. Outcomes were not used as an eligibility criterion for RCTs, but NRSs were only included if they reported one of the primary outcomes.

Literature search
We searched Ovid Embase, MEDLINE, PsycINFO, the Cochrane Library and CINAHL from inception to September 2021, without language or date restrictions. We also searched trial registries (ClinicalTrials.gov and the WHO International Clinical Trials Registry Platform) and grey literature (Open Grey, Google Scholar, theses, and dissertation databases). Search terms were developed by an information specialist (SD) with the study team and adapted for each database (Appendix 1). We supplemented these searches by examining reference lists of included studies and reviews of similar topics, running citation searches on Web of Science and Google Scholar, and contacting experts.

Study selection
Identified titles and abstracts were double-screened for relevance. Discrepancies were resolved through discussion or escalation of the discrepant record to full-text assessment. For the potentially relevant records, full-text articles were obtained, and each was independently assessed against the inclusion criteria by two reviewers, with any discrepancies resolved through consensus and, if necessary, by referral to a third reviewer.

Data extraction
Data extraction forms were developed in a custom-built Microsoft Access database and piloted on two studies prior to use. For each study, we extracted information on study design, location, population characteristics, intervention details, outcomes, and estimates of intervention effectiveness. Data extraction was carried out by one reviewer and checked by another; discrepancies were resolved through discussion. When necessary, authors were contacted for additional information or clarification.

Risk of bias assessments
Risk of bias in RCTs was assessed using the Cochrane risk of bias tool "RoB-2" (Sterne, Savovic, Page, et al., 2019) for key results included in the meta-analyses. For NRSs, we used the "ROBINS-I" tool (Sterne, Hernan, Reeves, et al., 2016). We pre-specified key confounding factors expected to be relevant for most NRSs in this review: socioeconomic status, ethnicity or race, parent's age, sex of the parent and/or partner, parental substance misuse, mother's pre-existing/acute mental illness, and previous safeguarding concerns. The effect of interest was assignment to the intervention (i.e. intention to treat), for risk of bias assessment of both RCTs and NRSs. Risk of bias assessments were carried out independently, in duplicate, by two co-authors and final judgements agreed through consensus.
We used the pilot version of the "RoB-ME" (Risk of Bias due to Missing Evidence) tool (Page, Sterne, Boutron, et al., 2020) to explore the potential impact of evidence unavailable for synthesis due to selective non-reporting (e.g. non-significant results not reported in study publications). Constructing funnel plots to consider the possibility of small-study effects, missing evidence, and/or publication bias was not feasible due to the small number of included results (Sterne, Sutton, Ioannidis, et al., 2011). Instead, we contacted authors of registered but unpublished studies to informally assess the approximate volume of unavailable results (see detailed methods in Appendix 2).

Synthesis and certainty of evidence
We narratively summarised data from all included studies, grouped by outcomes of interest. Meta-analyses were carried out where study design, interventions and outcomes were similar enough to combine results. Primary analyses used random-effects meta-analyses, due to anticipated heterogeneity between studies (e.g. interventions were expected to differ in content, mode of delivery, settings, dosing and timing), with fixed-effects models provided as sensitivity analyses (with both estimates shown on the same forest plot when feasible). The 'meta summarize' and 'meta forestplot' user-built commands in Stata Statistical Software: Release 16 (StataCorp. 2019. College Station, TX) were used to generate and present summary estimates of effect, 95 % confidence intervals (CIs), and the percentage of variation across studies due to heterogeneity (I²) for the analysed results (see detailed methods in Appendix 2).
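The pooling step described above can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird estimator of between-study variance, which is a common and simple choice but is not stated in the text as the estimator used; the function name and the input values are hypothetical and are not data from the included studies. Effects are pooled on an additive scale (for odds ratios, the log scale).

```python
import math

def random_effects_meta(estimates, std_errors, z=1.96):
    """Pool study effects (e.g. log odds ratios) with a DerSimonian-Laird
    random-effects model. Returns the pooled effect, its CI, tau^2 and I^2."""
    k = len(estimates)
    # Inverse-variance (fixed-effect) weights
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: percentage of variation across studies due to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - z * se_pooled, pooled + z * se_pooled),
        "se": se_pooled,
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical log odds ratios and standard errors (illustrative only)
example = random_effects_meta([0.0, 1.0], [0.5, 0.5])
summary_or = math.exp(example["pooled"])  # back-transform to the OR scale
```

When the between-study variance is estimated as zero, the random-effects and fixed-effect estimates coincide, which is why the two models can give near-identical summaries for results with I² close to 0 %.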
We rated the certainty in the overall body of evidence by applying the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework for our main synthesised outcomes (Guyatt, Oxman, Vist, et al., 2008). We followed the revised GRADE guidelines, which propose that when the ROBINS-I tool is used to assess risk of bias, NRSs can start at "high certainty" (same as RCTs) and be downgraded for identified limitations (Schunemann, Cuello, Akl, et al., 2019). We used the GRADE-Pro web application to construct our summary of findings table (GRADEpro, 2020).

Risk of bias
Risk of bias assessments for 14 outcomes from 10 included RCTs are presented in Table 3. Three studies (Arshadi et al., n.d.; Fujiwara et al., 2020a; Groisberg et al., 2020) were at high risk of bias, two (Barr et al., 2009a; Barr et al., 2009b) at low risk of bias (each with two assessed outcomes), and the remaining RCTs had some concerns for bias. The most frequent reasons for bias concerns were (i) no mention of attempts to conceal the random sequence, (ii) unblinded outcome assessments, (iii) high levels of missing data, and (iv) evidence or suspicion of selective reporting of results. Risk of bias assessments for NRSs are presented in Table 4. One non-randomised CBA study (Vinchon et al., 2020) was judged to be at critical risk of bias as no attempt was made to control for confounding, and results were reported as a proportion of the national AHT cases from the intervention region compared with another region, which rendered them unusable. The remaining study results were at serious risk of bias. The most common issues were inadequate control of confounding, selection bias, issues with classification of intervention status or outcome measurement, and potential selective reporting.
Assessment of risk of bias due to missing evidence (RoB-ME) identified one pilot RCT (Cook et al., 2015), which remains unpublished due to problems with collection of its primary outcome (not an outcome of interest for our review); the reason for missing results was unrelated to the values of the results required for our synthesis and was thus not considered at risk of bias. Two published studies measured, but did not report, parental depression and/or anxiety, and the reasons for non-reporting were unclear (Barr et al., 2009b; Fujiwara et al., 2012). One study author confirmed the unreported result was 'non-significant' (Fujiwara et al., 2012). Two studies measuring parental stress were available as a conference abstract only (Lou et al., 2011) and a protocol only (JPRN-UMIN000012445, n.d.). We therefore reached a judgement of "some concerns" for missing evidence for syntheses of parental depression, anxiety, and stress. There was no clear evidence of risk of bias due to missing evidence for other syntheses. Recently registered unpublished RCTs were assumed ongoing and not considered as missing evidence. The matrix of reported results used for RoB-ME assessments is presented in Appendix 4, Table S2.

AHT and incidence of infant shaking
No RCTs reported results for AHT. Two USA-based case-control studies (Bechtel et al., 2020; Keenan & Leventhal, 2010), and two CBA studies (Dias et al., 2017; Zolotor et al., 2015) comparing introduction of two different state-wide programmes with five control US states, were included in the meta-analysis for AHT and found no evidence of reduction in AHT associated with the intervention (summary OR 0.95, 95 % CI 0.80 to 1.13, I² = 2 %; Fig. 2). The subgroup estimate from two larger and more robust state-level CBA studies is consistent with no effect (OR 1.01, 95 % CI 0.85 to 1.19, I² = 0 %). Estimates from the two small case-control studies suggest an intervention benefit but with a very wide CI, consistent with both benefit and harm (Fig. 2). One further CBA study (Vinchon et al., 2020), at critical risk of bias, reported a non-significant reduction in AHT, from 18.8 to 13.6 cases/year in the intervention region. One cluster RCT (Fujiwara et al., 2020a) and one NRS (Fujiwara et al., 2020b) found that fewer mothers in the intervention groups admitted to shaking their baby "violently" or "hard". Another RCT (Groisberg et al., 2020) had no reported instances of parents shaking their babies in either group (n = 164; non-estimable). These results were not meta-analysed due to the different study designs, but are presented separately in a forest plot (Fig. 3). Whether any reported shaking events led to AHT was not reported in any study.

Table 3
Risk of bias assessments for randomised controlled trials (using RoB2 tool).
Columns: Risk of bias domains and judgements; Reasons for concerns.
Surviving entry: Some concerns. Little information on randomisation and concealment (possibly due to limited content in conference abstract) and no protocol, trial registry or SAP available.
SAP: statistical analysis plan. Individual domain and overall 'Low' risk of bias judgements are coloured green, 'Some concerns' risk of bias judgements are coloured yellow, and 'high' risk of bias judgements are coloured red. Further, all overall risk of bias judgements are in bold to help guide the reader to the importance of this column.

Table 4
Risk of bias assessments for non-randomised studies (using ROBINS-I tool).
Columns: Risk of bias domains and judgements.
Surviving entry: Negative controls for outcome and exposure suggest serious unmeasured confounding. Recall bias cannot be ruled out. Some concerns for selection bias over exclusion of cases with uncertain intervention status. Missing confounder data was differential for cases and controls. Protocol not available, possibility of selective reporting cannot be ruled out.
Individual domain and overall 'Low' risk of bias judgements are coloured green, 'Some concerns' risk of bias judgements are coloured yellow, and 'high' risk of bias judgements are coloured red. Further, all overall risk of bias judgements are in bold to help guide the reader to the importance of this column.
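As a worked check on the AHT summary estimate (summary OR 0.95, 95 % CI 0.80 to 1.13), the sketch below back-calculates the standard error of the log odds ratio from the reported interval. It assumes a Wald-type interval that is symmetric on the log scale with a critical value of 1.96; the function names are hypothetical and this is an illustration, not the review's analysis code.

```python
import math

def log_se_from_ci(lower, upper, z=1.96):
    """Standard error on the log scale implied by a ratio's 95 % CI,
    assuming the interval is symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def log_scale_midpoint(lower, upper):
    """Point estimate implied by the CI: the geometric mean of its limits."""
    return math.exp((math.log(lower) + math.log(upper)) / 2)

# Reported summary estimate for AHT (values taken from the text)
or_point, ci_low, ci_high = 0.95, 0.80, 1.13

se_log_or = log_se_from_ci(ci_low, ci_high)
implied_or = log_scale_midpoint(ci_low, ci_high)
z_stat = math.log(or_point) / se_log_or

print(f"SE of log OR: {se_log_or:.3f}")
print(f"OR implied by the CI limits: {implied_or:.2f}")
print(f"z statistic: {z_stat:.2f}")
```

The implied point estimate agrees with the reported OR to rounding, and the z statistic lies well inside ±1.96, which matches the statement that there was no evidence of a reduction in AHT.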

Parental responses to infant crying
Four RCTs evaluating the PURPLE crying intervention (Barr et al., 2009a; Barr et al., 2009b; Fujiwara et al., 2012; Fujiwara et al., 2020a) measured four coping responses to infant crying, which the trialists transformed into a scale of 0-100 (higher score indicated response in the desired direction). One cluster-RCT measured 'Active coping' (Fujiwara et al., 2020a), and found no difference between groups (MD 0.59 points on the 0-100 scale, 95 % CI −0.73 to 1.91, n = 2655). Three RCTs (Barr et al., 2009a; Barr et al., 2009b; Fujiwara et al., 2012) found no improvement in the 'response to general crying' (summary MD 0.08, 95 % CI −0.84 to 1.00; n = 3794, I² = 0 %). The same three RCTs found a negligibly small improvement in the intervention group for the 'response to unsoothable crying' (summary MD 1.58, 95 % CI 0.11 to 3.06, n = 3497, I² = 0 %). All four RCTs (Barr et al., 2009a; Barr et al., 2009b; Fujiwara et al., 2012; Fujiwara et al., 2020a) assessed 'self-talk' in response to inconsolable crying (e.g. telling yourself the crying would end or that there was nothing that could be done); the summary estimate suggested no difference between groups (MD 1.38, 95 % CI −1.27 to 4.02, n = 6146, I² = 61 %). Forest plots with summary estimates for these meta-analyses are presented in Fig. 4 (fixed effects models shown in Appendix 4 Fig. S1).
Fig. 2. Meta-analysis of controlled before-after study and case-control study results for abusive head trauma, by subgroup and overall. OR: odds ratios; IRR: incidence rate ratios; CI: confidence interval. Overall and domain-level risk of bias judgments for the presented result from ROBINS-I tool: L: low, M: moderate, S: serious, C: critical. Risk of bias domains from ROBINS-I tool: 1: Confounding; 2: Classification of interventions; 3: Selection into the study; 4: Deviations from intended interventions; 5: Missing data; 6: Outcome measurement; 7: Selective reporting.
The same four RCTs also measured the number of times per day the mother walked away from her inconsolable child (as recorded in participant diaries), a behaviour encouraged as a coping strategy. This occurred more frequently in the intervention group compared with control (summary incidence rate ratio [IRR] from three trials (Barr et al., 2009a; Barr et al., 2009b; Fujiwara et al., 2012) was 1.52, 95 % CI 0.94 to 2.45, n = 3049, I² = 61 %; Appendix 4 Fig. S2). The fourth RCT (Fujiwara et al., 2020a) dichotomised this outcome (ever vs. never), and found weak evidence of more frequent walking away in the intervention group (37.8 % vs. 34.6 %, RR 1.09, 95 % CI 0.96 to 1.24, n = 2655). Another RCT (Lou et al., 2011) measured likelihood of maternal walking away using a five-point scale (in a lab-based setting), and found that mothers in the intervention group were more likely to do so (t = 2.1, p = 0.04).
One study (Cala et al., 2020) comparing "all babies cry" (ABC) material to other educational material measured change in number of calming strategies employed for parenting stress from enrolment to follow-up and found no difference between groups (MD 0.3, 95 % CI −0.2 to 0.8, n = 115, p = 0.256).

Parental stress/frustration
This outcome was measured in seven RCTs. In three RCTs (Barr et al., 2009a; Barr et al., 2009b; Fujiwara et al., 2012) evaluating PURPLE crying, mothers rated frustration related to infant crying on a six-point Likert scale; there was no difference between groups (summary SMD −0.01, 95 % CI −0.08 to 0.05, n = 2999, I² = 23 %; Appendix 4 Fig. S4). Another RCT (Groisberg et al., 2020), evaluating different methods of delivering PURPLE crying material, reported the weekly frequency of parental frustration, and found no difference between groups (p = 0.66, n = 164). One RCT (McRury & Zolotor, 2010) (n = 35) comparing the "Happiest baby" intervention to other material reported parenting stress index at 6 and 12 weeks, and found higher stress in the intervention group at both time points (p = 0.07 and p = 0.01, respectively; effect estimates not reported). One RCT (Arshadi et al., n.d.), evaluating the impact of hospital discharge planning, found that maternal stress (using the Parental Stress Scale) was lower in the intervention group (p < 0.001, n = 92). A final RCT (Lou et al., 2011), assessing the effect of PURPLE crying material on mean frustration levels (rated on a 0-100 scale) when a 10-min audio of a child crying was played immediately after the education session, found some evidence of lower frustration levels in the intervention group (p = 0.10, n = 33).

Frequency or duration of infant crying
This outcome was measured in two studies (Barr et al., 2009b; McRury & Zolotor, 2010). One RCT (Barr et al., 2009b) evaluating effectiveness of PURPLE crying material reported a small increase in average minutes of crying and inconsolable crying per day in the intervention group compared to controls (MD 5.5 min, 95 % CI 2.7 to 8.4 min, n = 1857, and MD 1.9 min, 95 % CI 0.4 to 3.3 min, n = 1857, respectively). A second RCT (McRury & Zolotor, 2010) comparing "Happiest Baby" with a control intervention found no difference in mean daily hours of crying between groups at 4 weeks (p = 0.3, n = 33), 6 weeks (p = 0.4, n = 33) or 12 weeks (p = 0.8, n = 26), but found frequency of crying was higher in the intervention group at 8 weeks (p = 0.04, n = 34).
None of the included RCTs reported outcomes on confidence/self-efficacy, frequency of seeking support, other infant abuse, or infant mortality.

Certainty of evidence
Based on GRADE assessments, there is low certainty evidence from two case-control studies and two CBA studies that educational interventions have no effect on AHT, due to serious bias concerns (downgraded 2 levels for bias). Evidence for self-reported shaking was of very low certainty due to serious bias concerns (downgraded 2 levels for bias) and imprecision. Evidence for parental responses to crying, some parental behaviours (e.g. walking away from infant, picking-up infant and others) and parental frustration ranged from very low to moderate certainty. GRADE assessments are presented in the summary of findings table (Table 5).
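The downgrading logic behind these ratings can be sketched as follows. This is a simplified illustration rather than the GRADE-Pro implementation: the function name is hypothetical, all bodies of evidence are assumed to start at "high" (as described in the methods for both RCTs and NRSs assessed with ROBINS-I), and the one-level downgrade for imprecision in the second example is an assumption, since the text does not state how many levels were applied for imprecision.

```python
# Ordered GRADE certainty levels, lowest first
GRADE_LEVELS = ["very low", "low", "moderate", "high"]

def grade_certainty(downgrades, start="high"):
    """Return the GRADE certainty after applying downgrades.

    downgrades maps a domain name (e.g. "risk of bias") to the number
    of levels downgraded for that domain. Certainty cannot fall below
    "very low".
    """
    index = GRADE_LEVELS.index(start) - sum(downgrades.values())
    return GRADE_LEVELS[max(index, 0)]

# AHT: downgraded 2 levels for bias (as stated in the text)
aht = grade_certainty({"risk of bias": 2})

# Self-reported shaking: 2 levels for bias plus imprecision
# (the 1-level imprecision downgrade is an assumption for illustration)
shaking = grade_certainty({"risk of bias": 2, "imprecision": 1})

print(aht, "/", shaking)
```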

Summary of findings
This systematic review found low certainty evidence (from two case-control studies and two controlled before-after studies) of no effect of educational interventions on the incidence of AHT. Further, we found very low certainty evidence, from a cohort study and a cluster-RCT, of reduced incidence of infant shaking; this self-reported outcome had high potential for bias as parents in the intervention group may underreport its occurrence.
There was moderate certainty evidence of negligible improvements in parental response to inconsolable crying, and low certainty evidence of a possible increase in the 'walking away' behaviour in intervention groups, although this latter result was imprecise. There was no evidence of an effect on three other measures of parental responses to infant crying. In the one study that assessed anxiety there was an improvement in the intervention group, but the study was small. There was no difference between intervention and control groups in infant picking-up behaviours, parental frustration with crying, emotional regulation, or postnatal depression. There was some indication that some evidence was potentially missing for parental mental health outcomes (depression, anxiety and stress).
The mixed evidence from our review is comparable with the findings from a recent review of interventions for managing colic (Lynøe & Eriksson, 2021), which concluded that preventive programs showed little or no effect on incidence of AHT. An older review (Lopes, Eisenstein, & Williams, 2013) reported that the eight studies they identified all reported positive effects of the intervention, and another review by the same author found weak evidence of effectiveness of interventions for reducing infant crying and raising awareness about AHT. However, neither of these reviews considered study design specific issues when assessing bias and quality of the included studies and overall certainty of evidence.

Strengths and limitations
Our review provides a comprehensive synthesis of the effectiveness of educational interventions for AHT prevention. Few other reviews have addressed this topic (Iqbal O'Meara, Sequeira, & Miller, 2020; Lopes et al., 2013; Lynøe & Eriksson, 2021), most of which were not systematic. Our comprehensive search strategy, developed by an experienced information specialist (SD) with input from clinical co-authors (ML, JM) and public health experts (JM, JW), imparts assurance that we identified all relevant studies. Our search and inclusion criteria did not impose any limitations by date or language. Cochrane guidance for conducting systematic reviews was rigorously followed.
There were several limitations. Our initial intention to include studies in which the intervention aim was to prevent AHT was challenging to operationalise in practice due to variations in how this was reported. Instead, we opted to examine the content of intervention packages for messaging regarding AHT to determine eligibility, and were sometimes required to contact authors to determine this.
No RCTs measured our primary outcome of AHT, and self-reported infant shaking was reported in three RCTs. Whilst other behaviour changes are proxies for intervention success, AHT reduction is the ultimate focus. We addressed this by including non-randomised studies that did measure AHT reduction. Most meta-analyses included the same three or four studies from two research groups, and several evaluated the same intervention (PURPLE Crying). Most studies assessed knowledge-based outcomes, which were not of interest for this review as they are process outcomes, early in the pathway to behaviour change. Such process or surrogate outcomes may be misleading as they may not accurately predict clinically important outcomes (AHT in this case) (McKenzie et al., 2022). Several studies included in this review concluded their intervention was effective based solely on these knowledge outcomes, a conclusion that is largely unsupported by the behaviour change outcomes. Further work is required to understand how to turn the gains made in knowledge acquisition into behaviour change.
We identified important gaps in the evidence. Most studies did not include fathers or male partners who are more frequent perpetrators of AHT (Schnitzer & Ewigman, 2005;Starling et al., 1995). Most included studies were from North America and Japan, with only one from Europe (France), and none reported cost-effectiveness outcomes. Only one study evaluated training for healthcare professionals delivering prevention programmes.
Issues with risk of bias contributed to the GRADE judgment of low certainty evidence for AHT, and risk of bias and imprecision led to the judgment of very low certainty evidence for incidence of infant shaking. Certainty of evidence for secondary outcomes varied from very low to moderate. Not all results from completed studies were publicly available, and some studies remain unpublished or inaccessible.

Implications for research and policy
Our review highlights large gaps in the evidence for effectiveness of educational and behavioural interventions which aim to reduce AHT. Such interventions are unlikely to be harmful (Fujiwara et al., 2020a), but there is very little robust evidence on whether they are effective.
Development of robust evidence to support or refute the role of educational interventions to prevent AHT is therefore needed, but any such evaluations are challenging for several reasons. Whilst a cluster RCT which linked patients to hospital records to accurately assess AHT rates may be possible, this would be difficult to conduct due to the rarity of the outcome, the large number of participants required, and challenges with linking safeguarding outcomes across health and social care agencies. Alternatively, large-scale, well-designed and well-conducted comparative NRSs utilising routinely recorded AHT outcome data could be used to assess effectiveness of preventative programmes. Future studies should use their designs to address areas of limited evidence. Specifically, they should assess the ability of the intervention to reach and impact male carers, and ensure that study samples are representative of the population targeted by the intervention.
In summary, we found low certainty evidence that preventative interventions focused on educating parents on how to cope with infant crying do not reduce the incidence of AHT. Very low certainty evidence suggests that prevention programmes might reduce the incidence of infant shaking. There is very low to moderate certainty evidence that some of these interventions might marginally increase some intended parental behaviours (e.g. walking away from a crying infant, picking up a child). As RCTs are often not feasible to assess rare outcomes like AHT, more robust, large-scale, comparative NRSs that make use of routinely collected data and include assessment of cost-effectiveness should be conducted to inform decisions on preventative interventions.

Funding
This study was funded by the NIHR Applied Research Collaboration West (ARC West) at University Hospitals Bristol and Weston NHS Foundation Trust. The views expressed in this article are those of the authors and do not necessarily represent those of the NHS,