An updated evidence synthesis on the Risk-Need-Responsivity (RNR) model: Umbrella review and commentary

Purpose: To conduct an umbrella review of Risk-Need-Responsivity (RNR) principles by synthesizing and appraising the consistency and quality of the underlying evidence base of RNR. Methods: Following PRISMA guidelines, we searched three bibliographic databases, the Cochrane Library, and grey literature from 2002 to 2022 for systematic reviews and meta-analyses on RNR principles. We summarized effect sizes, expressed as odds ratios (ORs) and the Area Under the Curve (AUC) statistic. We evaluated the quality of review evidence by examining risk of bias, excess statistical significance, and between-study heterogeneity, and by calculating prediction intervals for reported effect sizes. Results: We identified 26 unique meta-analyses that examined RNR principles. These meta-analyses indicate inconsistent statistical support for the individual components of RNR. For the risk principle, there were links with recidivism (OR = 1.6, 95% CI [1.1, 2.3]). For the need principle, although there were associations between adherence to intervention programs and recidivism, risk assessment tools reflecting this principle had low predictive accuracy (AUCs 0.62 to 0.64). The general and specific responsivity principles received some support. However, the overall quality of the evidence was poor, as indicated by potential authorship bias, lack of transparency, substandard primary research, limited subgroup analyses, and conflation of prediction with causality. Conclusion: The prevalence of poor quality evidence and the identified biases suggest that higher quality research is needed to determine whether current RNR claims of being evidence-based are justified.


Introduction
The Risk-Need-Responsivity (RNR) model has been one of the world's leading models of offender management, used in many jurisdictions, particularly high-income countries (Polaschek, 2012). The RNR model is constructed around three central principles (Bonta & Andrews, 2017). The first, the risk principle, holds that criminal behaviour can be predicted according to levels of risk and that those at higher re-offending risk should be provided with more intensive treatment and management, whereas those at a lower risk should receive less intensive interventions. This is also known as treatment matching or targeting. Second, the need principle. This aims to translate the risk principle into practice: that the risk level is judged according to criminogenic needs. The starting premise here is that the main aim of offender management and treatment is to decrease recidivism. Thus, the need principle focuses on identifying dynamic needs, those that can be addressed by treatment, thereby decreasing the offender's risk level and likelihood of reoffending. Based on a narrative review, Andrews and Bonta (2010) identified eight criminogenic needs, seven of which are dynamic: (1) criminal history (a static need and therefore, although included in the risk assessment framework, not actually addressed by treatment), (2) anti-social personality pattern, (3) pro-criminal attitudes, (4) pro-criminal associates, (5) marital/family, (6) leisure, (7) school/work and (8) substance abuse.
In practice, these criminogenic needs have been used to construct a series of recidivism risk assessment tools that authors describe as fourth generation (Bonta & Andrews, 2017, p. 202). These tools are stated to build on previous actuarial tools: they produce more accurate predictions than unstructured professional judgment and, by focusing on dynamic factors that are capable of change, they also guide treatment, ultimately aiming to reduce recidivism rather than merely predict it. Of the multiple risk assessment tools available, Bonta and Andrews (2017, p. 196) position their own (namely the various iterations of the LSI-R and LS/CMI) as among the most accurate.
Finally, the responsivity principle underpins the 'how' of treatment, determining how it should be delivered to meet criminogenic needs and thus reduce the risk of recidivism. There are two elements to this. General responsivity holds that rehabilitative treatment is most successful when it uses cognitive-social learning methods to influence behavior. At the same time, according to specific responsivity, treatment should be tailored to individual characteristics in order to maximize its impact. This appears to be the least well-developed of the principles, and the literature is mostly ambiguous on what exactly this entails. However, Bourgon and Bonta (2014) suggest such factors as 'motivation', 'intelligence', 'learning style', 'race', 'ethnicity', and 'gender' are associated with treatment effectiveness. That is, individuals with different characteristics have unique needs and strengths, and the effectiveness of treatment can be associated with these factors. For instance, ethnic minorities have a higher attrition rate in treatment programs. Thus, according to the specific responsivity principle, programs that include fewer sessions and incorporate strategies to promote attendance may enhance treatment impact for these groups.

The evidence base
Much of the popularity of the RNR model derives from statements about its underlying evidence base, which is often contrasted with newer models with less well-developed research in support, such as the Good Lives Model (e.g., Bonta & Andrews, 2017; Ogloff & Davis, 2004; Polaschek, 2012). RNR is reported to have a broad existing literature base, including primary studies, systematic reviews, and meta-analyses across the principles. However, the evidence base is also quite dispersed, making it difficult to assess the model, particularly as many of the systematic reviews in the area present conflicting findings. Existing syntheses, such as Polaschek (2012), Ward, Melser, and Yates (2007), and Ogloff and Davis (2004), tend to focus either on a single principle or to cover the model only conceptually, with little methodological assessment or critique. Further, these reviews are now dated, all being over a decade old, and thus require updating.
Umbrella reviews are increasingly used as a validated, systematic and transparent approach to provide information to researchers and practitioners in areas where there is a large body of evidence of varying quality and displaying mixed results. Umbrella reviews bring together systematic reviews and meta-analyses, provide an overview of underlying research quality and highlight evidence gaps (Ioannidis, 2009). We have conducted an umbrella review of RNR, bringing the existing evidence together so that the principles are covered in one place, and evaluating the quality and consistency of the findings. On the basis of this umbrella review, we discuss the strength and robustness of the model's evidence base and examine its premises and inferences.

Search strategy
We searched three electronic bibliographic databases (PubMed, PsycNet, and Scopus) and the Cochrane Library (of systematic reviews) for the past 20 years (covering 01/01/2002 to 15/12/2022), in addition to forwards and backwards citation chaining and hand-searching the reference lists of included articles. Grey literature was searched using Eldis, Google Scholar, and FindPolicy. To ensure each RNR principle was covered, an individual search strategy was constructed for each principle, using a tailored combination of keywords (see Appendix for the full strategies).

Eligibility criteria
Studies were considered for inclusion if they conducted a meta-analysis (presenting pooled effect sizes such as Area Under the Curve [AUC] or odds ratio [OR], or those that could be converted to AUC or OR), reported on a formal recidivism outcome (re-arrest, reimprisonment, re-conviction, and probation violation outcomes were all included, but self-report measures were excluded), and related to the specific principle outcome in question (e.g., attrition). The following approach was taken for each principle. For risk, studies were eligible if they compared post-treatment recidivism outcomes for high- and low-risk populations. For need, studies were included if they assessed the predictive accuracy of one or more risk assessment tools for recidivism outcomes or if a study directly assessed a treatment program reflecting the need principle. For general responsivity, studies had to compare recidivism outcomes for treatment/intervention adhering to general responsivity with those not adhering to the principle. In order to access the broadest range of literature, we included both studies that explicitly assessed the association between recidivism outcomes and adherence to general responsivity and those that compared recidivism outcomes for treatment based around cognitive behavioral methods (the recommended treatment modality under General Personality and Cognitive Social Learning [GPCSL]) with non-behavioural programmes. Finally, for specific responsivity, studies were included if they examined the association between one of the model's eight specific responsivity factors and either treatment completion rates or recidivism outcomes. This last category included attrition in addition to recidivism because, with specific responsivity being the least well-researched principle, we wanted to access the widest range of possible reviews. No restrictions were applied regarding the language (e.g., English or non-English) or type (e.g., published articles, theses, or grey literature) of included studies.
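The eligibility criteria allow effect sizes that can be converted to an OR or AUC, but the conversions themselves are not specified here. As an illustration only, the sketch below uses two widely cited approximations for a standardized mean difference d: the logistic conversion ln(OR) = d·π/√3, and AUC = Φ(d/√2) under equal-variance normal distributions. The function names are hypothetical, and these formulas are an assumption about how such a conversion could be done, not a description of the procedure used in this review.

```python
import math
from statistics import NormalDist


def d_to_odds_ratio(d: float) -> float:
    """Convert Cohen's d to an odds ratio via the logistic approximation.

    Assumes ln(OR) = d * pi / sqrt(3); an illustrative choice, not the
    review's documented method.
    """
    return math.exp(d * math.pi / math.sqrt(3))


def d_to_auc(d: float) -> float:
    """Convert Cohen's d to an AUC, assuming two normal score
    distributions with equal variance: AUC = Phi(d / sqrt(2))."""
    return NormalDist().cdf(d / math.sqrt(2))


if __name__ == "__main__":
    for d in (0.0, 0.2, 0.5, 0.8):
        print(f"d={d:.1f}  OR={d_to_odds_ratio(d):.2f}  AUC={d_to_auc(d):.3f}")
```

Under these assumptions a null effect (d = 0) maps to OR = 1 and AUC = 0.5, which gives a quick sanity check on any converted value.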

Study selection
This followed a three-stage process, covering a title check (including screening for duplicates), abstract screening, and full-text review. For inaccessible studies, we contacted the author or institution, based on the correspondence address provided.

Data extraction
Data extraction was carried out using a standardized form. Information on the following variables was collected, where accessible: (1) Demographics (population and offense type, setting), (2) Sample (number of independent effect sizes and sample size), (3) Methods (outcome measured, author independence, follow-up length), (4) Effect size and metric, and confidence intervals (upper and lower), and (5) Measures of between-study heterogeneity, referring to variations observed among included studies and measured using Cochran's Q (reported with a χ² value and p value) and the I² statistic. The latter describes the percentage of variation across studies due to heterogeneity rather than chance. I², unlike Q, does not inherently depend on the number of studies considered. When publications reported separate effect sizes for different forms of recidivism, the broadest category was chosen to ensure consistency (in practice, typically any criminal category [i.e., general recidivism]). Where follow-up durations varied, that closest to 5 years was selected (as 5-year follow-up was most commonly reported). If studies reported results from both a combined sample and smaller subsamples, the combined sample was extracted. For instance, if results were reported for the entire sample, and also for men and women separately, the combined sample (i.e., including both men and women) was selected to ensure the largest sample size. Our primary analyses focused on independent reviews (i.e., those not conducted by the developers of RNR, which were 23 of 26 identified meta-analyses).
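To make the relationship between the two heterogeneity measures concrete, the following minimal sketch computes Cochran's Q from study-level effects and standard errors, and derives I² from Q using the standard definition I² = max(0, (Q − df)/Q) × 100 with df = k − 1. The function names and example inputs are illustrative assumptions and are not taken from the included reviews.

```python
from typing import Sequence


def cochran_q(effects: Sequence[float], std_errors: Sequence[float]) -> float:
    """Cochran's Q: weighted sum of squared deviations from the
    inverse-variance pooled effect (effects on an additive scale,
    e.g. log odds ratios)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(q: float, k: int) -> float:
    """I^2 as a percentage, given Q and the number of studies k."""
    if k < 2:
        raise ValueError("I^2 needs at least two studies")
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors for four studies.
    log_ors = [0.10, 0.45, 0.60, 0.95]
    ses = [0.15, 0.20, 0.18, 0.25]
    q = cochran_q(log_ors, ses)
    print(f"Q = {q:.2f}, I^2 = {i_squared(q, len(log_ors)):.1f}%")
```

Because I² rescales Q by its degrees of freedom, the same Q implies less heterogeneity when it comes from more studies, which is the sense in which I² does not inherently depend on the number of studies.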

Quality assessment
Quality was assessed according to a 7-point scale using validated measures, covering 1 point each for: (1) a score on A MeaSurement Tool to Assess systematic Reviews-2 (AMSTAR; Shea et al., 2017), a validated tool for assessing meta-analyses, of 8 or above (out of a total of 16); (2) a score on the Risk of Bias in Systematic Reviews tool (ROBIS; Whiting et al., 2016) of 2 or above (out of a total of 4); (3) excess significance bias (the ratio between a meta-analysis' pooled overall effect size and the effect size of its largest included study) of <1. This is based on the assumption that the largest included study is the most accurate (Lipsey & Wilson, 2001), thus a ratio >1 indicates the presence of statistical excess (Kavvoura et al., 2008); (4) small between-study heterogeneity within the review, defined as I² <50%; (5) sample size of ≥1000; (6) 95% prediction interval not including 1. Prediction intervals that include the null effect indicate potentially nonsignificant findings in a new population (Higgins et al., 2019; Riley, Higgins, & Deeks, 2011); (7) no statistical significance (at the 5% level) on Egger's regression asymmetry test (Egger, Smith, Schneider, & Minder, 1997). Significant results here are considered evidence for publication bias (Sterne et al., 2011). For the need principle, as review outcomes were predictive performance rather than intervention effects, the AMSTAR quality rating was not applicable. The scores were therefore out of 6 (rather than 7). These were then aggregated into an overall quality score, with 0-2 classed as low, 3-4 moderate and 5-7 high based on previous work (Fazel, Smith, Chang, & Geddes, 2018). Missing data were recorded as 0 and the number of present/missing quality items was noted. This approach to quality assessment has been used in other umbrella reviews (Fazel et al., 2018; Fazel, Burghart, Wolf, Whiting, & Yu, 2023; Hailes, Yu, Danese, & Fazel, 2019).
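The scoring rules above can be expressed as a short function. The sketch below is one possible reading of the seven items, in which every missing item scores 0 and the same low/moderate/high bands are applied whether or not AMSTAR is applicable. The class and field names are hypothetical, and the configurable `null_value` parameter is an added assumption (the text specifies a null of 1, as for odds ratios).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ReviewQuality:
    """Quality items for one meta-analysis; None means not reported."""
    amstar: Optional[float] = None                # out of 16
    robis: Optional[float] = None                 # out of 4
    pooled_effect: Optional[float] = None         # pooled effect size
    largest_study_effect: Optional[float] = None  # effect of largest study
    i_squared: Optional[float] = None             # percent
    sample_size: Optional[int] = None
    prediction_interval: Optional[Tuple[float, float]] = None
    egger_p: Optional[float] = None


def quality_score(r: ReviewQuality, amstar_applicable: bool = True,
                  null_value: float = 1.0) -> int:
    """Sum one point per satisfied item; missing items score 0."""
    points = 0
    if amstar_applicable and r.amstar is not None and r.amstar >= 8:
        points += 1
    if r.robis is not None and r.robis >= 2:
        points += 1
    # Excess significance: ratio of pooled effect to largest-study effect < 1.
    if r.pooled_effect is not None and r.largest_study_effect:
        if r.pooled_effect / r.largest_study_effect < 1:
            points += 1
    if r.i_squared is not None and r.i_squared < 50:
        points += 1
    if r.sample_size is not None and r.sample_size >= 1000:
        points += 1
    if r.prediction_interval is not None:
        low, high = r.prediction_interval
        if not (low <= null_value <= high):
            points += 1
    # No significant small-study asymmetry on Egger's test at the 5% level.
    if r.egger_p is not None and r.egger_p >= 0.05:
        points += 1
    return points


def quality_band(score: int) -> str:
    """Map a total score to the bands 0-2 low, 3-4 moderate, 5-7 high."""
    if score <= 2:
        return "low"
    if score <= 4:
        return "moderate"
    return "high"
```

For example, a review reporting only AMSTAR = 9 and ROBIS = 1 would score 1 point and be classed as low, which illustrates how missing items alone can pull a review into the lowest band.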

Results
We identified 26 separate meta-analyses across the model's three core principles (Fig. 1), including 7 for risk, 6 for need, 15 for general responsivity, and 4 for specific responsivity (with 3 reviews contributing to more than one principle). Overall, we found mixed and inconsistent evidence in support of RNR principles. In general, the quality of evidence was poor.

Risk
We identified 7 eligible meta-analyses, with ORs ranging from 1.4 (0.9, 2.1) to 2.8 (1.0, 7.6) across reviews. That is, individuals deemed at high risk who adhered to treatment programmes had a decreased risk of recidivism compared to low-risk persons (who typically received no or minimal services). The majority (5/7) had confidence intervals not crossing 1, suggesting significant differences. Two independent reviews included non-overlapping samples (pooled OR 1.7 [1.3, 2.3]), although one of these had potential authorship bias (OR = 1.8 [1.2, 3.0]). Five other reviews included samples overlapping with each other or with the two reviews with independent samples. Fig. 2 shows effect sizes for all eligible meta-analyses, except those with potential authorship bias. ORs (k = 4) ranged from 1.5 (1.0, 2.2) to 2.8 (1.0, 7.6). The quality across the eligible reviews was poor, with none scoring >2 points on the assessment score out of a possible 7. Poor quality resulted from missing information on key measures, such as estimates of publication bias and heterogeneity between studies (Koehler, Lösel, Akoensi, & Humphreys, 2013). This was also indicated by low scores in methodological quality assessed by tools like AMSTAR and ROBIS (Pearson, Lipton, Cleland, & Yee, 2002).
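As an illustration of how a pooled OR and its confidence interval can be obtained from review-level estimates, the sketch below recovers each standard error from the reported 95% CI on the log scale and applies fixed-effect inverse-variance weighting. The pooling model is an assumption made for this example (the text does not state which model produced the pooled estimates), the function names are hypothetical, and the input values are invented for the arithmetic rather than taken from the included reviews.

```python
import math
from typing import Iterable, Tuple

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_or_and_se(odds_ratio: float, lower: float, upper: float) -> Tuple[float, float]:
    """Recover ln(OR) and its standard error from a reported 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return math.log(odds_ratio), se


def pool_fixed_effect(estimates: Iterable[Tuple[float, float, float]]) -> Tuple[float, float, float]:
    """Inverse-variance pooled OR with a 95% CI (fixed-effect model)."""
    num = 0.0
    den = 0.0
    for odds_ratio, lower, upper in estimates:
        log_or, se = log_or_and_se(odds_ratio, lower, upper)
        weight = 1.0 / se ** 2
        num += weight * log_or
        den += weight
    pooled = num / den
    se_pooled = math.sqrt(1.0 / den)
    return (math.exp(pooled),
            math.exp(pooled - Z95 * se_pooled),
            math.exp(pooled + Z95 * se_pooled))


def ci_excludes_null(lower: float, upper: float, null: float = 1.0) -> bool:
    """True when the interval lies entirely on one side of the null."""
    return lower > null or upper < null


if __name__ == "__main__":
    # Hypothetical (OR, lower, upper) triples for two non-overlapping reviews.
    pooled, lo, hi = pool_fixed_effect([(2.0, 1.0, 4.0), (1.5, 1.1, 2.0)])
    print(f"pooled OR = {pooled:.2f} [{lo:.2f}, {hi:.2f}]")
```

A random-effects model would add a between-review variance term to each weight and generally widen the interval; the fixed-effect version is shown only because it is the simplest correct instance of inverse-variance pooling.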

Need
The need principle predicts risk of recidivism according to the eight criminogenic needs. We identified 6 eligible meta-analyses. Three studies evaluated treatment programmes aiming at meeting the identified need, with ORs ranging from 1.1 (1.0, 1.2) to 2.7 (1.3, 5.7). One independent review reported an OR of 1.6 (1.1, 2.3). Two other studies included samples overlapping with each other or with the independent review, one of which had potential authorship bias (OR = 2.7 [1.3, 5.7]). Fig. 3 shows the effect sizes of included reviews, excluding one with potential authorship bias.
Overall, the quality of reviews was mixed. Reviews directly assessing the need principle were of low quality, with scores ranging from 0 to 2 (out of 7). Scores were based on AMSTAR and ROBIS, as information on other quality measures was missing. Reviews assessing risk assessment tools based on the need principle had moderate to high quality, with quality scores ranging from 3 to 7 (but with most studies reporting high heterogeneity between studies).

General responsivity
General responsivity holds that rehabilitative treatment is most successful when it uses cognitive-social learning methods to influence criminal behavior. We found 15 eligible meta-analyses on this theme, with ORs ranging from 1.0 (0.7, 1.4) to 2.6 (1.3, 5.4) and 5 having confidence intervals crossing 1. The pooled OR was 1.4 (1.2, 1.7), based on the 5 reviews without overlapping samples or potential authorship bias. Fig. 5 shows the effect sizes for all studies included except the one with potential authorship bias (k = 14). The quality of the included reviews varied. Among 15 eligible reviews, more than half (k = 8) were of low quality, two moderate, and five high.

Specific responsivity
According to the principle of specific responsivity, treatment should be tailored to individuals. We identified four eligible meta-analyses reporting on nine outcomes: on characteristics affecting programme attrition (drop-out) (k = 5), whether adapted treatment programmes affect recidivism rates (k = 3), and whether treatment adhering to the specific responsivity principle is associated with lower recidivism (k = 1). Fig. 6 presents the effect sizes. Non-white individuals, ethnic minorities, or aboriginals were associated with higher levels of either attrition or recidivism (ORs ranging from 1.3 [1.2, 1.4] to 1.7 [1.5, 1.9]). In contrast, women and those with higher levels of education and motivation were associated with lower levels of attrition and recidivism.

Fig. 2. Meta-analyses examining the effectiveness of the RNR 'risk' principle on recidivism. Note. Non-independent meta-analyses (k = 3) excluded. Grey studies include overlapping primary investigations. OR (odds ratio) is a measure of association between effectiveness of the examined treatments and (lower) recidivism.

The quality of the studies was low to moderate (scores 0-3/6). Scores were mainly based on AMSTAR and ROBIS, as other quality measures were missing. This was the case for studies of all principles, except for reviews examining risk assessment tools reflecting the need principle.

Discussion
In this umbrella review of the evidence underlying RNR principles, we identified 26 systematic reviews and meta-analyses, published from 2002 to 2023, and based on at least 450 primary studies. Overall, the reviews demonstrated inconsistent support for RNR principles. In evidence syntheses conducted by independent researchers, around half the effect sizes were not significant for the risk principle and the impact of criminogenic needs, a core part of the need principle. In addition, around a third of effect sizes were not significant for the general responsivity principle. For the specific responsivity principle, associations between certain subgroups and poorer outcomes indicated some support for it, including in non-white individuals, ethnic minorities, aboriginal populations, and in other sociodemographic subpopulations (being male and having low education). In addition, one psychological factor, low motivation, was associated with higher levels of attrition and recidivism (ORs ranging from 1.3 [1.2, 1.4] to 1.7 [1.5, 1.9]). An omnibus measure of discriminative accuracy, the AUC, of risk assessment tools based on the need principle, ranged from 0.62 (0.53, 0.70) to 0.70 (0.65, 0.75), indicating at best modest predictive accuracy.
Alongside the inconsistent evidence based on these effect sizes, across the four RNR principles, we found that the underlying systematic reviews were mostly characterized by low quality and large evidence gaps. There were few systematic reviews on risk and specific responsivity, while need and general responsivity had more evidence syntheses addressing them. However, the majority of these were low quality. For the risk principle, reviews directly examining the need principle, and the specific responsivity principle, the quality rating was typically low. More than half the reviews on general responsivity were low quality. One exception was that reviews on risk assessment tools, based on the need principle, were of medium to high quality. Overall, the findings on effect sizes and low quality of the underlying evidence raise important and timely questions regarding the continued application and utility of RNR as a model informing criminal justice services. In particular, a number of serious limitations undermine conclusions drawn in previous RNR reviews, a literature that has been dominated by the model developers. Here, we outline five key challenges to the evidence.

Authorship bias
The primary studies used for the RNR model, which was mainly developed by Andrews and Bonta, draw heavily on research authored by them, their colleagues and students. This pattern of relying on studies by the model's authors is recognized by Bonta and Andrews (2017), as well as by others (Herzog-Evans, 2017; HM Inspectorate of Probation, 2020; Ogloff & Davis, 2004; Polaschek, 2012; Ward et al., 2007). However, Andrews and Bonta have argued that criticisms focusing on authorship are unhelpful, and a generic criticism does not necessarily correlate with the quality of the underlying work. Further, they posit an alternative explanation: that primary studies by authors and developers tend to put additional effort into treatment fidelity and integrity because they are more invested. While authorship allegiance does not necessarily discount a particular piece of work, it cannot be ignored when conducting evaluations of methodological quality and risk of bias. We found that reviews with potential authorship bias were mostly low quality apart from two meta-analyses of risk assessment tools. For evidence syntheses on the four RNR principles, all included reviews with potential authorship bias had the lowest quality score (i.e., 0/7). Specifically, they scored low in AMSTAR and ROBIS ratings, and data on other key aspects of quality were missing.
Authorship bias has been studied more broadly in treatment and prediction research, with research finding larger effects where allegiance exists, including effects of psychotherapy on recidivism (Dragioti, Dimoliatis, Fountoulakis, & Evangelou, 2015), accuracy of violence risk assessment tools (Boccaccini, Marcus, & Murrie, 2017) and mindfulness-based interventions for psychiatric disorders where independent studies showed no effect (Goldberg & Tucker, 2020). Our findings are consistent with these studies in finding a strong authorship effect, with odds ratios and AUCs being higher in reviews authored by RNR developers and colleagues than in those authored by independent groups. For instance, in meta-analyses on the need and general responsivity principles, those co-written by model developers reported the highest ORs, suggesting potential overestimation of the effect sizes.
In light of the documented allegiance effects in intervention and prediction research, it is notable that many of the included reviews did not address or disclose potential conflicts of interest. This is particularly important when there are potential financial conflicts of interest. Andrews, Bonta, and Wormith hold or have held proprietary rights in the tools developed as part of the RNR model, including the LSI-R, and are reported to receive royalties from sales of the tools and associated training materials (Prins & Reich, 2021). However, articles authored by them and their colleagues consistently do not disclose these competing interests or other potential financial and non-financial conflicts of interest. Lack of transparency contributes to low scores on AMSTAR and ROBIS checklists for their reviews. The absence of financial conflicts of interest disclosures is common in treatment studies (Eisner, Humphreys, Wilson, & Gardner, 2015).

Transparency and accessibility
Failure to report financial and non-financial potential conflicts of interest is one marker of wider issues in reporting standards. For example, many included reviews provided incomplete or no information regarding search strategy, sample size and characteristics, treatments given to control groups, or primary study characteristics and results. In some cases, the included primary studies were not listed at all. Additionally, certain reviews referenced previous articles and studies without proper citation, expecting readers to locate this information independently, for instance master's theses and unpublished documents (e.g., Andrews, Dowden, & Gendreau, 1999; Dowden, 1998).
This combination of missing, inaccessible, dispersed, and dated information makes evaluating the evidence base challenging. In particular, many items necessary for quality assessment were unavailable for several reviews, including basic information such as sample size. This was the case for reviews on all RNR principles, except for those on risk assessment tools reflecting the need principle (Burghart et al., 2023; Fazel et al., 2022; Olver, Stockdale, & Wormith, 2009). Meanwhile, as noted above, the tools developed to accompany the RNR model and translate it into practice (including the LSI-R and LS/CMI) are behind a paywall. Thus, they cannot straightforwardly be reviewed, tested, or examined without purchasing them and their related training services, and even then, there is a lack of transparency regarding how the scoring system works and how the risk thresholds were constructed, leading to difficulties in understanding how individual risk factors contribute to the ultimate risk scores (Fazel, Sariaslan, & Fanshawe, 2022).

Poor quality primary studies
Across the RNR principles, a lack of high-quality primary studies is a recurring theme. This is firstly due to a lack of randomized controlled trials (RCTs), which represent the highest-quality design for assessing the principle of responsivity in particular. Several meta-analyses included no eligible RCTs (e.g., Olver, Stockdale, & Wormith, 2011). In other reviews, RCTs comprised a small proportion of included primary studies (e.g., Babcock, Green, & Robie, 2004; Hanson, Bourgon, Helmus, & Hodgson, 2009; Koehler et al., 2013). This limitation is compounded by reporting issues. Some reviews that included a mixture of study designs did not report or test whether effect sizes differed by design (e.g., Andrews & Dowden, 2006; Dowden & Andrews, 2003), making it difficult to evaluate the overall effects. When reviewing the underlying evidence for RNR principles, for example, Koehler et al. (2013) categorized studies based on their adherence levels to a particular principle, distinguishing between low, moderate, and high adherence. However, it is not clear how exactly each study was rated, and the moderator analyses comparing these three categories were not interpreted correctly. The tests for heterogeneity were non-significant, suggesting no clear differences between adherence levels to tested RNR principles. Despite this, the significant effect across high adherence studies was taken as evidence for RNR's effectiveness.
Even when there have been eligible RCTs, there were quality problems, in particular with regard to control group treatment and reporting. For controls, various interventions have been used, including 'treatment as usual', no treatment, and waitlists. However, there has been limited discussion of how these might affect findings. This is problematic as research has indicated that the nature of control treatments can modify effect size (Cuijpers et al., 2013). Waitlist controls, for example, have been shown to inflate treatment effects, as those on the waitlist may feel neglected, unsupported, or resentful, all of which can contribute to recidivism (Beaudry, Yu, Perry, & Fazel, 2021; Flint, Cuijpers, Horder, Koole, & Munafò, 2015). A further problem is poor reporting, with many studies failing to report what treatment, if any, the control group received and several reviews including effect sizes that combined multiple control treatments without discussion of how they differentially influence treatment effects.
Many primary studies are non-randomised controlled trials, which also have methodological limitations. Some meta-analyses included in this umbrella review examined case-control studies that compared the recidivism rates of a control group receiving non-RNR adhering treatment to an active arm receiving RNR-adhering treatment. As this leads to confounding by indication (i.e., the treatment groups have different background characteristics), matching is done in many studies to address the lack of randomisation in group allocation. Some of these specify their matching criteria, such as ensuring that the groups had the same recidivism baseline risk level, while others simply claimed to have matched on all 'key variables' (Koehler et al., 2013). Such an approach, however, is flawed when drawing causal inferences between treatment characteristics and recidivism rates as the RNR model does (Kyriacou & Lewis, 2016). This is because it is not possible to determine whether resulting differences in effects are a result of the treatment or instead due to confounding factors, such as differences between groups that have not been matched or have been inadequately matched. Even then, residual confounding will have to be considered. In particular, non-randomly assigned studies may involve self-selection into the treatment group. This in turn is indicative of a range of factors likely to affect recidivism rates, such as motivation to change and attitudes towards the justice system. Simply matching on risk or demographic variables cannot account for these differences. This means that some individuals are less likely to recidivate even before treatment begins, and any resulting 'treatment effects' may be a product of these differences rather than the treatment itself (Davey Smith & Ebrahim, 2002).

Subgroup analyses
The RNR framework was developed in the context of what works, in what circumstances, and for whom. Therefore, many included meta-analyses compare results between and within study populations, focusing on the effects of treatment programmes adhering to RNR principles compared to those that do not. However, unlike RCTs where confounding is minimized by randomizing treatment and control populations, these meta-analyses first calculate overall effect sizes on recidivism for treatment versus no treatment. They then conduct meta-regression analyses to examine whether study-level covariates related to RNR principles (i.e., coding studies as adhering to or not adhering to the principles of risk, need, general, and/or specific responsivity) reduce recidivism.
This approach has two validity risks. First, as the number of trials is small, meta-regression results will be underpowered to detect study-level characteristics robustly associated with changes in overall treatment effect. Second, this approach makes confounding across included trials likely (Riley et al., 2022). For example, trials adhering to the principles might also be conducted in different countries, settings and populations, or be using varying forms and dosages of treatment. This is relevant to conclusions drawn and means that the causal statements, that adherence to the principles is the cause of the lower recidivism rates, are misleading.
The RNR literature has mostly used recidivism as its sole measurable outcome in a binary way; the prevalence, frequency, severity, and imminence of reoffending could also be examined. In addition, a range of reoffending-related outcomes, such as parole violation, police contact, arrest, warning, conviction, or incarceration (e.g., Beck, 2001; Weisberg, 2014), could be considered. Sensitivity analyses should be conducted to test whether it is possible to combine all outcomes. When narrower definitions (such as rearrest, reconviction, etc.) are used, there are some findings that point to intervention effectiveness for some outcomes but not others (Bouchard & Wong, 2024).

Conflating prediction with causality
The weaknesses discussed above primarily concern the quality of the evidence base, questioning claims about the validity and reliability of the empirical findings supporting the RNR model (HM Inspectorate of Probation, 2020). One issue relates to the concept of risk in the model. The risk principle involves matching treatment intensity to recidivism risk level. However, the primary studies often fail to differentiate between higher and lower risk cases within the same study, thus aggregating approaches to risk measurement. This aggregation method involves categorizing an entire study sample as either high or low risk, based on factors like prior justice system involvement or current correctional supervision. The literature on needs, important for translating the need principle into practical tools, assumes that factors with high predictive power have a causal explanatory role in reoffending. However, predictive power does not imply causation (Ramspek et al., 2021). For instance, age is a strong predictor of heart disease risk in Framingham, QRISK and other such prediction models but it is not causal (Hippisley-Cox, Coupland, & Brindle, 2017). This conflation of prediction with causation is problematic, particularly when applied to individuals from minority ethnic backgrounds and lower socioeconomic status. Such factors may act as proxies for structural issues or familial/residual confounding and may not directly indicate the likelihood of criminality or harm.

Conclusion
This umbrella review of underlying systematic reviews and meta-analyses has examined the evidence in support of the Risk-Need-Responsivity (RNR) model. Despite RNR's widespread use in criminal justice and claims from experts, we found that the evidence base is mostly low quality and inconsistent. We outlined five key limitations underlying this low quality that are primarily based on the reliability and validity of empirical findings testing the model, and the nature of the conclusions drawn. Whether the RNR model has continuing practical utility needs to be more carefully examined, and higher quality research designs are necessary to demonstrate any impact and address theoretical concerns. Without this, introducing RNR into new jurisdictions should not be recommended.

Fig. 3. Meta-analyses examining the effectiveness of the RNR 'need' principle on recidivism. Note. Non-independent meta-analyses (k = 1) excluded. Those highlighted in grey include overlapping primary studies.

Fig. 4. Meta-analyses examining the discriminative accuracy of risk assessment tools based on the RNR 'need' principle on recidivism. Note. Meta-analyses with potential authorship bias (k = 2) are excluded. AUC = area under the curve. AMSTAR and ROBIS are quality checklists.

Fig. 5. Meta-analyses examining the effect of the RNR 'general responsivity' principle on recidivism. Note. The meta-analysis with potential authorship bias (k = 1) is excluded. Those highlighted in grey include overlapping primary studies. OR = odds ratio.

Fig. 6. Meta-analyses examining subgroup effects based on the RNR 'specific responsivity' principle on recidivism. Note. ORs reported are for increased recidivism risk. Meta-analyses highlighted in grey include overlapping primary studies. OR = odds ratio.