Neuropsychological assessment of aggressive offenders: a Delphi consensus study

Objective: This study explores the intricate relationship between cognitive functioning and aggression, with a specific focus on individuals prone to reactive or proactive aggression. The purpose of the study was to identify important neuropsychological constructs and suitable tests for comprehending and addressing aggression.
Methods: An international panel of 32 forensic neuropsychology experts participated in this three-round Delphi study consisting of iterative online questionnaires. The experts rated the importance of constructs based on the Research Domain Criteria (RDoC) framework. Subsequently, they suggested tests that can be used to assess these constructs and rated their suitability.
Results: The panel identified the RDoC domains Negative Valence Systems, Social Processes, Cognitive Systems and Positive Valence Systems as most important in understanding aggression. Notably, the results underscore the significance of Positive Valence Systems in proactive aggression and Negative Valence Systems in reactive aggression. The panel suggested a diverse array of 223 different tests, although they noted that not every RDoC construct can be effectively measured through a neuropsychological test. The added value of a multimodal assessment strategy is discussed.
Conclusions: This research advances our understanding of the RDoC constructs related to aggression and provides valuable insights for assessment strategies. Rather than suggesting a fixed set of tests, our study takes a flexible approach by presenting a top-3 list for each construct. This approach allows for tailored assessment to meet specific clinical or research needs. An important limitation is the predominantly Dutch composition of the expert panel, despite extensive efforts to diversify.


Introduction
Aggressive offenses have far-reaching consequences for individuals and society, including financial strain on health and justice sectors, public safety issues, and reduced quality of life for victims, their relatives, and the offenders (Patel and Taylor, 2012; Langton et al., 2014; Rivara et al., 2019). Neuropsychological profiling is an underused clinical tool to assess the complex web of risk factors for aggressive behavior. This is surprising, as the intricate relationship between cognitive functioning and reactive and proactive aggression has been widely studied. Although empirical studies and systematic reviews have uncovered neurocognitive mechanisms underlying reactive and proactive aggression (Alcázar-Córcoles et al., 2010; Kuin et al., 2015; Van De Kant et al., 2020), expert knowledge on individual neuropsychological assessment has not been integrated into the current body of research. An overarching framework that bridges the gap between fundamental research and clinical experience is therefore much needed. In the current study, a panel of experts is asked (i) which Research Domain Criteria (RDoC) domains (Insel et al., 2010; further explained below) are important in explaining reactive vs. proactive aggression and (ii) which neuropsychological tasks are suitable for assessing those domains.
A common definition of aggression is "behavior that is intended to harm another person who is motivated to avoid that harm" (Allen and Anderson, 2017). Notably, aggressive offenders make up a substantial proportion (up to 70%) of the populations of prisons, forensic hospitals and outpatient mental health facilities (McMurran et al., 2000; Völlm et al., 2018). There is a great need for research into the risk factors of aggressive behavior to help reduce recidivism (Smeijers, 2017; Wigham et al., 2022). One of the potential risk factors for aggression that warrants exploration is cognitive functioning, particularly through neuropsychological assessments.
Cognitive limitations are more prevalent among offenders than in the general population (Ogilvie et al., 2011), particularly among aggressive offenders (Cruz et al., 2020). For example, research found a prevalence of clinically significant executive deficits (a subset of cognitive functions) in an offender population ranging from 5.2% to 27.2% (correctional offenders) and from 9.5% to 35.7% (forensic psychiatric patients), compared to 2.5% in the general population (Shumlich et al., 2019). Furthermore, multiple factors can be at the root of cognitive limitations, including traumatic brain injury, substance abuse, and attention deficit hyperactivity disorder, all of which are more prevalent among offenders (Harris, 2006; Ginsberg et al., 2010; Farrer and Hedges, 2011; Frost et al., 2013; Fayyad et al., 2017; Hellenbach et al., 2017; Muñoz García-Largo et al., 2020; Matheson et al., 2022). As such, it is necessary to further highlight the role of cognitive factors in the context of offending behavior, and this study aims to do so by improving knowledge about neuropsychological assessment in forensic populations.

Reactive and proactive aggression
The term "aggression" refers to a spectrum of acts that range from shouting or pushing to aggravated assault or homicide. By the definition stated above, rape, sexual assault, and robbery would also be classified as aggressive offenses. As the literature shows, most offenders are generalists, meaning they commit more than one type of crime in their lives (Simon, 1997; Soothill et al., 2000; Sullivan et al., 2006). Therefore, we chose to include aggressive sexual or property crimes, while nonaggressive crimes such as fraud were outside the scope of this study. Understanding the different determinants of aggression has been a subject of interest in various fields such as psychology, criminology, and neuroscience since the mid-20th century. Several taxonomies have been proposed in the literature (Parrott and Giancola, 2007; Krahé, 2013), but there is no consensus yet about which categorization is most appropriate. The most well-known distinction is the reactive-proactive dichotomy, sometimes referred to as hostile-instrumental (Buss, 1961). Reactive aggression occurs in reaction to a provocation or frustration and is impulsive in nature. Proactive aggression, on the other hand, is generally goal-directed and premeditated. Both types of aggression can occur within an individual, and thus the strict classification into one of these two categories has been disputed in the literature (Bushman and Anderson, 2001). Currently, a dimensional view of aggression is favored, acknowledging that individuals often exhibit varying degrees of both reactive and proactive aggression rather than rigidly categorizing them into distinct types. Interestingly, research on factors associated with reactive and proactive aggression provides empirical support for the usefulness of the distinction. For example, reactive aggression has been linked to heightened emotional reactivity, impulsivity, verbal impairments, impairments in executive functioning, and hostile attribution bias. Proactive aggression, on the other hand, is linked to a lack of moral emotions, callous and unemotional traits, and low physiological arousal (Cima and Raine, 2009). To summarize, individuals can exhibit both types of aggression, with a tendency toward one type, reflecting a predominant behavioral disposition. As both types of aggression appear to be related to different constructs, the current study considers both types of aggression separately.

Research Domain Criteria (RDoC)
The National Institute of Mental Health (NIMH) developed the RDoC (Insel et al., 2010) to investigate core dimensions of functioning that underlie various mental health conditions, as opposed to traditional categorical diagnostic systems such as the Diagnostic and Statistical Manual of Mental Disorders (DSM; American Psychiatric Association, 2022). In addition, as aggression can occur within various mental health conditions such as personality disorders, intermittent explosive disorder and conduct disorder, the RDoC framework provides a transdiagnostic perspective to uncover shared mechanisms that contribute to aggression across these diverse disorders. The RDoC describes six domains: (1) Negative Valence Systems, responsible for responses to aversive situations or contexts, such as fear, anxiety, and loss; (2) Positive Valence Systems, responsible for responses to positive motivational situations or contexts, such as reward seeking, consummatory behavior, and reward/habit learning; (3) Cognitive Systems, responsible for various cognitive processes; (4) Social Processes, which mediate responses to interpersonal settings of various types, including perception and interpretation of others' actions; (5) Sensorimotor Systems, responsible for the control and execution of motor behaviors, and their refinement during learning and development; and (6) Arousal/Regulatory Systems, responsible for generating activation of neural systems as appropriate for various contexts, and providing appropriate homeostatic regulation of such systems as energy balance and sleep (see Figure 1).

Aggression and cognitive domains
In this section, the existing knowledge regarding the interplay between aggression and the domains outlined in the RDoC framework is briefly elucidated. If available, we refer to systematic reviews and/or meta-analyses. Neuropsychological studies have revealed differences between (aggressive) offenders and nonoffending controls in different RDoC domains, such as Cognitive Systems (including executive functions, attention, and language) (Cohen et al., 2003; Ogilvie et al., 2011; Anderson et al., 2016; Burgess, 2020; Chow et al., 2022), Social Processes (Marsh and Blair, 2008; Karoglu et al., 2022), and Positive/Negative Valence Systems (Estrada et al., 2019; Manning, 2020; mainly reward and threat processing). In our recent multi-level meta-analysis, we studied all domains of cognitive functioning in relation to offending behavior (Hutten et al., preprint). Overall, offenders performed worse on neuropsychological tests than non-offending controls, and this was the case for all of the cognitive domains studied. A notable observation from this meta-analysis was the substantial variation in tests (146 different tests), and the lack of studies from non-Western countries. Through the Delphi method, we aim to gather insights from forensic neuropsychology experts across the world to obtain consensus on the most suitable tests to measure neuropsychological functioning in aggressive offenders. With this, we aim to expand on this empirical knowledge by connecting research findings and their translational application in forensic practice.
The primary goal of offender rehabilitation is reducing recidivism. A recent global systematic review found 2-year recidivism rates of 18-55% after incarceration and 10-47% after community sentences (Yukhnenko et al., 2023). Psychological treatment has a small but positive effect on recidivism in violent offenders, with a 10.2% difference in recidivism between treated vs. non-treated offenders (Papalia et al., 2019). Despite these findings, there remains a need for further enhancements in intervention strategies to reduce recidivism more effectively. More knowledge on the relation between the RDoC domains and aggression could enhance offender rehabilitation in several ways. Studies have found worse executive functioning in recidivists compared to first-time offenders (Ross and Hoaken, 2011; Sánchez de Ribera et al., 2022). Conventional risk assessment tools appear to have reached a ceiling, achieving a moderate area under the curve of 0.70 (Monahan and Skeem, 2014; Ogonah et al., 2023). Risk assessment tools often measure cognitive factors like impulsivity and self-control through less objective methods such as observer ratings and self-reports. Neuropsychological tasks are considered more objective, excluding the impact of compromised self-insight (Steward and Kretzmer, 2022). Accordingly, expanding risk assessment to include neuropsychological and neurobiological factors alongside the existing psychosocial risk factors may enhance the accuracy of recidivism predictions (Aharoni et al., 2013, 2014; Haarsma et al., 2020; Zijlmans et al., 2021; Nauta-Jansen, 2022). In addition to predicting recidivism, cognitive functioning (in particular inhibitory and cognitive flexibility difficulties) also appears to predict treatment dropout and treatment success (Fishbein et al., 2009; Cornet et al., 2014). Identifying the specific cognitive domains that are impaired in offenders and related to aggression is crucial to providing targeted interventions and reducing the risk of criminal behavior. For example, an aggression regulation training could be suitable for individuals with aggression arising from inhibitory problems, while people with difficulties in emotion recognition might benefit more from an emotion recognition training (Li et al., 2023). Hence, misidentification of the determinants of the aggression may lead to suboptimal treatment.
Although a clear link has been demonstrated between cognitive limitations and aggression, the use of neuropsychology in forensic settings has not reached its full potential. For example, incorporation of neurobiological information in Dutch pretrial forensic reports was low and did not rise significantly from 2005 to 2015 (Kempes et al., 2019). Additionally, even when neurobiological factors were acknowledged in relation to the offense, they were often overlooked in discussions about future risk assessment and management. There are three explanations for this observation, which are not mutually exclusive. First, clinicians are likely to struggle to identify the most suitable instruments, as there is a plethora of neuropsychological tests available. A systematic review on neuropsychological assessment practices in forensic settings found a notable diversity in assessment tools, with 140 different types of tests. The authors conclude that a wide range of neuropsychological functions are being measured by a large number of instruments (Venturi Da Silva and Cavalheiro Hamdan, 2022). Related to this, many tests have multiple outcomes (often measuring different cognitive functions) or multiple ways to calculate the outcomes. This heterogeneity may compromise the reliability of test results and raises questions about how information is understood by clinicians and legal practitioners (Serafim et al., 2015). Second, for most neuropsychological tests, normative data are collected from general population samples and have not been validated for the offender population. Possibly, the use of default norm scores leads to insufficient differentiation among individuals in the offender setting (Cornet et al., 2016). As such, it remains unclear which tests are most sensitive and suitable for the aggressive offender population. Third, offender populations present unique challenges in conducting neuropsychological assessments, such as high rates of noncompliance, low motivation (for treatment and/or assessment), and limited education and literacy levels (Hetland et al., 2007; Tuominen et al., 2014). Cultural and linguistic differences may also need to be considered when conducting neuropsychological assessments with offender populations. Considering these challenges, further research and tailored approaches are required to address the selection of suitable tests and norms for the aggressive offender population, to ensure accurate and reliable assessments.

Study objectives
This study aims to identify the most suitable neuropsychological tests for cognitive assessment within the aggressive offender population, distinguishing between predominantly reactive vs. proactive aggressive offenders. With this, our research contributes to the advancement of forensic neuropsychology. By pinpointing the specific cognitive domains associated with both reactive and proactive aggression, we aim to pave the way for more targeted assessments and interventions in aggressive offender populations. To achieve this goal, we need to bridge the gap between research and clinical practice and strive toward consensus among an international panel of experts from the field of forensic neuropsychology. In the current study, we apply the Delphi methodology for this purpose. Our objectives encompass two categories of questions posed to the expert panel: first, we seek theoretical insights into the constructs commonly associated with aggression, emphasizing their significance in the evaluation of aggressive offenders; second, we aim to pinpoint the most suitable tests for this evaluation, thereby facilitating future test selection in forensic contexts.

Materials and methods
This study was preregistered at AsPredicted (#103758) and has been approved by the Ethics Review Board (ERB) of the University of Amsterdam (ERB number: 2022-BC-15289).

The Delphi methodology
We conducted a Delphi consensus study to obtain consensus among an international panel of experts in forensic neuropsychology. While meta-analyses and reviews allow us to have an overview of the current scientific knowledge, the Delphi method allows us to obtain insight into the existing clinical expertise. The Delphi method is a technique used to achieve consensus among a group of experts by soliciting their opinions through a series of questionnaires and providing them with controlled feedback (Dalkey and Helmer, 1963). It is based on the concept of collective wisdom, which assumes that the combined opinion of multiple people is closer to the truth than a single individual's perspective (Habibi et al., 2014). Recently, researchers have been striving to achieve consensus on various neuropsychological topics, such as the definition of the term 'impairment' or inconsistent use of test score labels (e.g., Guilmette et al., 2020). Our study aligns with these developments. The Delphi methodology, with its collaborative and iterative nature, serves as an effective tool within this context, facilitating the establishment of a shared foundation for understanding and addressing diverse neuropsychological considerations in the field. This is carried out by aggregating the results of online, anonymous questionnaires in a systematic way. The current study consisted of three rounds, which are described in 2.4 Procedure and data analysis.

Expert panel selection
Both researchers and clinicians employed in the field of forensic neuropsychology were invited to participate in the panel. Potential researchers were identified based on the articles that emerged from our literature review, which is under review (Hutten et al., preprint). Researchers with a minimum of two publications on the topic of forensic neuropsychology, at least one of which appeared in the last five years (to confirm that they were still actively engaged in the field), were approached to participate in the Delphi study. For clinicians, the inclusion criterion was at least 4 years of experience as a (clinical) neuropsychologist in the forensic setting. Recruitment took place through the authors' networks, (international) societies or networks for neuropsychology/forensic psychology, social media, and through the "snowballing technique" (Iqbal and Pipon-Young, 2009). Panels with 10 to 50 members are recommended for Delphi studies (Turoff, 1975). In total, 127 potential experts were invited personally by email. Sixty-three potential experts started the questionnaire and provided digital informed consent. Thirty potential experts responded that they could not participate (no time: 13, questioned their own expertise: 15, no reason: 2). Finally, 32 experts completed the first-round questionnaire.

Research Domain Criteria (RDoC)
This Delphi study was based on the RDoC framework (Insel et al., 2010). The RDoC model is a research framework that approaches mental health and psychopathology by examining major domains of basic human neurobehavioral functioning, rather than relying on traditional diagnostic categories. The model consists of six major functional domains (see Figure 1), and each domain is studied by exploring different aspects using constructs that are examined across a range of functioning from normal to abnormal.

Procedure and data analysis
In three consecutive rounds of online questionnaires (compiled through Qualtrics, 2023), experts rated the importance of a predetermined list of the RDoC constructs on a 5-point scale from 1 "not important" to 5 "essential", with a non-neutral midpoint of 3 (moderately important). Using a non-neutral midpoint forces panelists to deliberate and to decide about the importance of the constructs. If they felt incompetent to answer a question, a "don't know" option was available (Linstone and Turoff, 1975). Subsequently, the panel members provided suggestions for tests that can be used to measure the constructs they rated at least moderately important (rating 3 or higher). In addition, they rated each other's test suggestions as suitable or not suitable for aggressive offenders. Throughout the questionnaires, panel members could provide explanations or reasoning. Before distributing the questionnaire for the first round, two clinical neuropsychologists filled in the questionnaire to provide feedback and ensure clarity of the questions.
After each round, the constructs that did not achieve consensus (about their importance) moved into the subsequent round for rerating. Our operationalization of consensus is interquartile range (IQR) ≤ 1. For a four- to five-point Likert scale, an IQR of 1 or less is considered a high level of consensus (Raskin, 1994; Rayens and Hahn, 2000).
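The consensus rule can be illustrated with a minimal Python sketch. The encoding of "don't know" as `None`, the function names, and the construct labels and ratings in the test data are hypothetical. The quartile convention (`method="inclusive"`) is also an assumption, since the text does not state how the IQR was computed, and other conventions can give slightly different values on small panels.

```python
from statistics import quantiles

DONT_KNOW = None  # hypothetical encoding of the "don't know" option


def interquartile_range(ratings):
    """Return Q3 - Q1 of the numeric ratings, ignoring "don't know" answers."""
    values = sorted(r for r in ratings if r is not DONT_KNOW)
    # Assumed quartile convention; the study does not specify one.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def has_consensus(ratings, threshold=1.0):
    """Consensus as operationalized in the text: IQR <= 1."""
    return interquartile_range(ratings) <= threshold


def constructs_for_next_round(ratings_by_construct):
    """Names of constructs without consensus, to be rerated in the next round."""
    return [
        name
        for name, ratings in ratings_by_construct.items()
        if not has_consensus(ratings)
    ]
```

With ratings of 3, 4, 4, 4, 4, 5, 5, 5 the quartiles are 4 and 5, so the IQR is 1 and the construct counts as having reached consensus; a widely spread set of ratings such as 1, 1, 2, 3, 3, 4, 5, 5 would be carried into the next round.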
For the importance ratings of the RDoC constructs, means and standard deviations are reported. We conducted Mann-Whitney U tests to analyze the difference in importance scores between reactive and proactive aggression, primarily due to the ordinal nature of the data. For the suitability of tests, we reported the percentage of the panel that rated the test as suitable.
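As an illustration of the rank-based comparison, the following Python sketch computes the Mann-Whitney U statistic with midranks for ties and a two-sided p-value from the normal approximation. It is not the authors' analysis code: the text does not say which software or which p-value method (exact or asymptotic) was used, and the approximation here omits tie and continuity corrections, so for small samples an exact test would be preferable.

```python
from math import erfc, sqrt


def midranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) where U is the smaller of the two U statistics.

    The p-value uses the plain normal approximation (an assumption of this
    sketch), without tie or continuity correction.
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p
```

For example, `mann_whitney_u([4.50, 4.31, 4.12], [3.24, 3.44, 3.68])` compares the three threat-related mean ratings for reactive and proactive aggression reported in the Results. Every reactive value exceeds every proactive value, so U is 0. This three-construct subset is only illustrative and does not reproduce the domain-level test reported there.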

Round 1
The objectives of the first round were (1) to identify the most important RDoC constructs that should be included in the assessment of aggressive offenders, and (2) to collect suggestions for tests that are recommended to assess these constructs. Before the experts started with the main questions, they were asked to fill in some information about their age, gender, profession, current workplace, and academic degree.
Then, the panel members were asked to rate the importance of the RDoC constructs. For the constructs they rated as at least moderately important, they were also asked to rate the importance of the underlying subconstructs. The experts were able to suggest additional constructs not delineated in the RDoC. The research team (JH, JvH, SH, TZ, and HG) evaluated these suggestions to confirm that they were not already covered in the RDoC, were clearly described, and were within the scope of the RDoC [as suggested by Jorm (2015)]. These additional constructs were then added to subsequent rounds.
Next, for the constructs that they rated as at least moderately important, the experts gave suggestions for tests that they recommend for the assessment of this construct. They could give several suggestions per construct.

Round 2
The 32 panel members who completed round 1 were invited to participate in round 2 of the study, which 26 of them did. (Sub)constructs that did not reach consensus in round 1 were rated again. These constructs were presented along with feedback outlining the average panel rating, each expert's own previous response, and a synopsis of comments that were offered by experts in support of their opinion. In addition, the constructs added by the panelists in round 1 were rated for importance. Then, the experts scored the suitability of the recommended tests suggested in round 1 (suitable/not suitable/don't know). If the round 1 test suggestions were not specific enough (e.g., a test category such as "gambling tests" or a measurement goal such as "verbal comprehension tests"), the panel was asked to specify in this round.
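The controlled feedback described above can be sketched as a small data structure. This is a hypothetical illustration only: the nested-dictionary layout, the expert identifiers, and the function name are assumptions, and the synopsis of comments (which requires manual summarizing) is left out.

```python
from statistics import mean


def build_feedback(round1_ratings, expert_id):
    """Assemble per-construct feedback for one expert.

    round1_ratings maps construct name -> {expert_id: rating or None},
    where None stands for a "don't know" answer (assumed encoding).
    Each construct is assumed to have at least one numeric rating.
    """
    feedback = {}
    for construct, by_expert in round1_ratings.items():
        numeric = [r for r in by_expert.values() if r is not None]
        feedback[construct] = {
            "panel_mean": round(mean(numeric), 2),
            "own_previous_rating": by_expert.get(expert_id),
        }
    return feedback
```

Only constructs that did not reach consensus would be passed to such a function, so that each expert sees the panel average next to their own earlier rating before rerating.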

Round 3
The 26 panel members who completed round 2 were invited to participate in round 3 of the study. Round 3 was completed by 24 panel members. This round was mostly similar to round 2. In addition, the top-3 tests that were rated most frequently as suitable and were known by at least half of the panel were presented to the panel members. They were asked to rank these tests from most to least suitable.

Results
Thirty-two experts completed round 1 of the study (mean age = 43.44, SD = 11.20, 15 males, 17 females). Characteristics of the expert panel are displayed in Table 1. Despite repeated attempts (see paragraph 2.2) to gather an international expert panel, most experts were currently working/living in the Netherlands (n = 17). Seven of the experts were researchers, nine were clinicians, fifteen professionals integrated their therapeutic work with scientific research, and one was currently employed as a manager. Of the original panel, twenty-four completed all three rounds of questionnaires and were included in the consortium. In round 1, for each construct a panel member rated as at least moderately important (rating 3 or higher), the panel member was asked to suggest one or more tests to measure this construct. In total, 223 different tests were suggested by the panel.
In round 2, the panel rated these tests as "suitable", "not suitable" or "don't know". In round 3, we presented the panel with the three most suitable tests (that were known by at least half of the panel) per construct, and we asked them whether they agreed with this top three. However, many did not fill in these questions. One explanation is that they did not know one or more of the tests, making it impossible to rank them. Another possibility is a decrease in motivation, as the questionnaires were quite extensive and time-consuming. Because of this, we based the top-3 tests in Table 3 on the suitability ratings from round 2. For certain constructs, fewer than three tests were familiar to at least half of the panel, resulting in fewer than three test suggestions (or even zero) being included in the overview.
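The selection rule for the top-3 can be sketched in Python as follows. The rating labels, test names, and function name are hypothetical, and two details are assumptions because the text does not spell them out: the suitability percentage is taken over all panel members who answered for that test (including "don't know"), and ties keep the order in which tests were entered.

```python
from collections import Counter


def top_tests(ratings_by_test, k=3, min_known=0.5):
    """Return up to k (test, percent_suitable) pairs, highest percentage first.

    A test is eligible only if at least `min_known` of the respondents knew it,
    i.e. answered something other than "don't know".
    """
    eligible = []
    for test, ratings in ratings_by_test.items():
        n = len(ratings)
        if n == 0:
            continue
        counts = Counter(ratings)
        known_fraction = (n - counts["don't know"]) / n
        # Assumed denominator: all respondents for this test.
        percent_suitable = 100 * counts["suitable"] / n
        if known_fraction >= min_known:
            eligible.append((test, percent_suitable))
    # sorted() is stable, so tied tests keep their input order.
    return sorted(eligible, key=lambda item: item[1], reverse=True)[:k]
```

If only two tests for a construct pass the familiarity threshold, the function returns two entries, which mirrors the constructs for which fewer than three tests appear in the overview.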
To aid clinicians in their test selection, we included some practical information about the administration time, age range, manual, and psychometric properties of the tests. We derived this information from test manuals, systematic reviews, and books. If these were not available, we reported on single studies with a sample that was most similar to the aggressive offender population. Our goal was not to create an exhaustive and comprehensive overview, as this falls beyond the scope of this study. Therefore, we refer readers to the British Psychological Society test reviews using the EFPA review model (British Psychological Society, n.d.), the Buros Center for Testing (Buros Center for Testing, n.d.), or for Dutch readers the COTAN (NIP, n.d.) for more information about the psychometric properties of tests.
Below, we discuss the results per domain, sorted by importance rating (see Table 2 and Figure 2). First, the importance ratings are discussed, including the reasoning provided by the panel members. Then, the test suggestions are discussed.

Negative valence systems
Negative Valence Systems were rated as the most important domain (M = 3.81, SD = 0.14), with a significant difference between reactive (M = 4.08, SD = 0.32) and proactive (M = 3.54, SD = 0.19) aggression (U = 2, p = 0.009). The ability to learn from one's own errors was suggested as an addition to this domain in round 1. Reaction to threat was rated as more important for reactive than for proactive aggression (acute: M = 4.50 vs. 3.24, potential: M = 4.31 vs. 3.44, sustained: M = 4.12 vs. 3.68). The panel reasoned that as reactive aggression is driven by an immediate emotional reaction to a perceived threat or provocation, these constructs are more relevant in reactive aggression. Anxiety might make individuals more sensitive to perceived provocations, increasing the likelihood of aggression. Prolonged exposure to threat (sustained threat) might result in chronic stress and might cause individuals to use aggression to end the threat. Loss and being unable to achieve goals or experience rewards (frustrative non-reward) can lead to feelings of anger, sadness and disappointment. Aggression might be a way to cope with these feelings.
[Table 3. Top-3 tests per RDoC construct.]
In total, 22 neuropsychological tests were suggested by the panel to assess Negative Valence Systems. The panel commented that it might be better to assess this domain by including biological measures (heart rate, eye tracking/pupil size, skin conductance) or self-report. For frustration, observation during potentially frustrating tests was also suggested; however, it was noted that test observations should not be confused with objective test results. Frustration from not performing the test correctly is not the intended measurement of the test and is therefore not an objective test result. For some Negative Valence constructs, it was impossible to validly assemble a top-3, as many of the suggested tests were unknown to more than half of the panel. Therefore, Acute Threat and Loss have only two tests in the overview and Sustained Threat only one.

Social processes
The domain Social Processes was rated as the second most important overall (M = 3.80, SD = 0.33), with a nonsignificant difference between reactive (M = 3.90, SD = 0.35) and proactive (M = 3.70, SD = 0.37) aggression (U = 190.5, p = 0.097). Four additional constructs were added to this domain: sympathy, moral reasoning, ability to correctly understand the authenticity of others' emotions, and emotional contagion. The panel commented that the ability to interpret social cues (including other people's emotions and social ambiguity) is crucial in understanding and preventing aggression. Misinterpretations can increase the risk of both reactive and proactive aggression. However, proactive aggression may be less influenced by this, as people with high callous-unemotional traits tend to be less concerned with other people's emotions. Another important aspect within this domain is empathy. High callous-unemotional traits in individuals displaying proactive aggression often involve cognitive empathy without affective empathy, enabling manipulative behavior.
It was noted by the panel that it might be more feasible to assess Social Processes with interviews, questionnaires and observations instead of neuropsychological tests. In total, 34 tests were suggested for this domain, resulting in a top-3 of tests for each construct.

Cognitive systems
Working memory is needed to process and react to triggers (reactive aggression), but on the other hand, working memory is required to plan proactive aggressive behaviors. Attention was deemed important in reactive aggression, as it can be biased toward potential threats, while ignoring neutral or friendly information. Cognitive control is linked to inhibition, which can help prevent future (especially reactive) aggression. In addition, cognitive control is important to be able to find non-aggressive solutions to problems, and to apply lessons from therapy in daily life (also related to declarative memory). Language was considered important in assessing aggression because poor verbal skills can hinder the ability to find non-aggressive solutions in conflicts, potentially leading to misunderstandings and frustration. The role of perception was somewhat unclear among the panel members. Counterfactual reasoning and information processing speed were added to this domain.
For Cognitive Systems, a large number (154) of different tests was suggested. The top-3 tests per construct are displayed in Table 3.

. Positive valence systems
Positive Valence Systems were rated with M = 3.54 (SD = 0.11). Interestingly, this was the only domain that was deemed significantly more important for proactive (M = 3.76, SD = 0.11) than for reactive aggression (M = 3.32, SD = 0.14; U = 144, p < 0.001). The panel members reasoned that individuals with high reward responsiveness may be more motivated to engage in proactive aggression, driven by the pursuit of rewards and experiencing greater pleasure and motivation when such rewards are at stake. Reward learning is important for proactive aggression, as individuals who have learned that aggressive behavior leads to desired outcomes are more likely to repeat such behavior to achieve their goals. Lastly, the value of potential rewards shapes proactive aggression: those who highly value rewards associated with aggression, such as financial gain or social status, are more inclined to engage in this form of aggression. Reactive aggression is more indirectly related to reward, as the alleviation of distress or protection can be considered the reward in this context.
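The reactive vs. proactive comparisons reported in this section rely on the Mann-Whitney U statistic. The following is a minimal sketch of how such a statistic can be computed from two sets of ratings. The ratings are hypothetical and are not the study's data, the function names are illustrative, and reporting the smaller of the two U values is one common convention that is assumed here rather than taken from the study. No p-value is computed.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return the smaller of the two U statistics for samples x and y."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)


# Hypothetical importance ratings (assumption, not the study's data).
reactive = [3.1, 3.4, 3.2, 3.5]
proactive = [3.6, 3.8, 3.7, 3.9]
print(mann_whitney_u(reactive, proactive))
```

In this illustrative case every proactive rating exceeds every reactive rating, so the statistic takes its minimum possible value; identical samples would instead yield a value of half the product of the sample sizes.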
The experts suggested 14 different tests to assess Positive Valence Systems. The top-3 tests are displayed in Table 3. For Reward Valuation, there were only two tests known by half of the panel. It was noted by the panel that many of these tests do not directly measure reactions to rewards, but this can be inferred through observation.

. Arousal/regulatory systems
Arousal/Regulatory Systems were rated with M = 3.44 (SD = 0.50) importance overall, with a non-significant difference between reactive (M = 3.66, SD = 0.63) and proactive (M = 3.21, SD = 0.38) aggression (U = 2, p = 0.400). The construct Arousal was considered very important in reactive aggression (M = 4.38), where impulsive and emotionally charged responses are common. Conversely, in proactive aggression (M = 3.63), the issue often revolves around the absence of arousal or under-arousal, suggesting a potential opposite relationship. It was highlighted that arousal is a state rather than a trait and is subject to rapid fluctuations influenced by environmental factors that can be challenging to measure. Disturbances in sleep and circadian rhythms could have consequences for daily mood patterns, possibly affecting emotional regulation and impulsivity.
The panel noted the absence of tests to measure arousal. Instead, they proposed physiological measures (such as EEG, heart rate variability, skin conductance, pupil dilation), behavioral/observational methods (such as wearables, questionnaires, or diaries), and neuroimaging. The seven tests that were suggested by the panel were often developed to measure different constructs such as motor skills, attention, and inhibition, and were all, except for the go/no-go task, unknown to half of the panel. Therefore, no top-3 could be validly constructed.
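The selection logic described above, in which a test only enters a top-3 when enough panel members know it, can be sketched as follows. This is an illustration under stated assumptions: the eligibility rule (known to at least half of the panel), the test names, the counts, the suitability means, and the panel size of 24 are all hypothetical and do not reproduce the study's data or exact procedure.

```python
def top_three(tests, panel_size):
    """Return up to three eligible test names, ranked by mean suitability.

    tests: list of (name, n_familiar, mean_suitability) tuples.
    A test is treated as eligible when at least half of the panel knows it
    (assumed rule for this sketch).
    """
    eligible = [t for t in tests if t[1] >= panel_size / 2]
    ranked = sorted(eligible, key=lambda t: t[2], reverse=True)
    return [name for name, _, _ in ranked[:3]]


# Hypothetical entries: (name, panel members familiar with it, mean suitability).
candidates = [
    ("Test A", 20, 4.1),
    ("Test B", 5, 4.8),   # rated highly, but known to too few panel members
    ("Test C", 14, 3.9),
    ("Test D", 12, 4.4),
    ("Test E", 18, 3.2),
]
print(top_three(candidates, panel_size=24))
```

A result with fewer than three names corresponds to the situation in which no full top-3 can be constructed for a construct.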

. Sensorimotor systems
The domain with the lowest importance rating was Sensorimotor Systems (M = 3.18, SD = 0.48), with a nonsignificant difference between reactive (M = 3.25, SD = 0.48) and proactive aggression (M = 3.10, SD = 0.51; U = 45, p = 0.739). The construct "sensorimotor integration" was suggested as an addition to this domain. The panel reasoned that sensorimotor systems might be relevant in understanding reactive aggression, which can be impulsive and driven by limbic responses, particularly in individuals with trauma or dissociation. These motor reactions can lead to a loss of agency and ownership over actions, potentially becoming self-fulfilling. Automatic aggressive behaviors learned from early experiences may be tied to sensorimotor patterns, particularly in reactive aggression. However, there is debate over whether these constructs can be clinically measured and whether they directly correlate with quantifiable aggression.
Nevertheless, the panel suggested 39 different tests to measure Sensorimotor Systems. These tests encompass a wide range from executive functioning/planning tests (e.g., tower tests) to tests that more directly measure motor skills and coordination. A panel member proposed the idea of using advanced technology like movement sensors and virtual reality to understand how people physically react to challenging situations.

. Additional suggestions
Lastly, the panel suggested four constructs that do not fit within the RDoC domains but might be worth considering when assessing aggressive offenders. For intelligence, the panel agreed (IQr = 1) that this is moderately to very important to include (reactive: 3.54, proactive: 3.75). It was questioned whether general intelligence provides additional information beyond the specific cognitive functions already encompassed within the model, or whether these specific functions might completely explain the association between intelligence and aggression. Secondly, cognitive distortions, which are biased or irrational patterns of thoughts and perception that can influence a person's beliefs, attitudes, and behaviors, were deemed moderately to very important (IQr = 1, reactive: 3.78, proactive: 3.87). A panel member noted that cognitive distortions are influenced by inner psychological patterns or past traumas and can cause a person to misinterpret what is going on, making them more likely to engage in violent behavior. Third, emotion regulation was rated as essential for reactive aggression (4.39, IQr = 1), but there was no consensus for proactive aggression (3.74, IQr = 2). Lastly, symptom/performance validity was added as a suggestion, but the panel did not reach consensus on this construct (reactive: 2.90, proactive: 3.14, IQr = 2). The panel members commented that the addition of symptom/performance validity tests is valuable for detecting feigned or exaggerated symptoms and can help to ensure that decisions about risk assessment/management and legal decisions are based on accurate information. However, these types of tests are less directly related to understanding the origins of aggressive/offending behavior per se.
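The consensus judgments above are expressed through the interquartile range (IQr) of the panel's ratings. A minimal sketch of such a check is given below. It assumes a 5-point importance scale, a consensus threshold of IQr ≤ 1 (consistent with the values reported above, but not confirmed as the study's exact rule), and the "inclusive" quartile method of Python's standard library. The ratings are hypothetical.

```python
from statistics import quantiles


def iqr(ratings):
    """Interquartile range using inclusive quartiles (assumed method)."""
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return q3 - q1


def has_consensus(ratings, threshold=1.0):
    """Treat an IQR at or below the threshold as consensus (assumed rule)."""
    return iqr(ratings) <= threshold


# Hypothetical ratings from eight panel members on a 1-5 scale.
tight = [4, 4, 4, 5, 5, 4, 5, 4]
spread = [2, 3, 5, 5, 2, 4, 5, 3]

print(iqr(tight), has_consensus(tight))
print(iqr(spread), has_consensus(spread))
```

Different quartile conventions can give slightly different IQr values for small panels, so the chosen method should be stated whenever such a threshold is applied.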

Discussion
In this Delphi study, we investigated two questions by surveying an international expert panel. Firstly, we sought theoretical insights into the constructs commonly associated with aggression, emphasizing their importance in the evaluation of predominantly reactive vs. predominantly proactive aggressive offenders. Secondly, we aimed to pinpoint the most suitable tests for this assessment, thereby facilitating future test selection in forensic contexts.

. RDoC constructs
Overall, all RDoC domains were considered at least moderately important (>3) by the expert panel for the neuropsychological assessment of aggressive offenders. Taken together, Social Processes and Negative Valence Systems were rated as the most important in understanding aggression, while Sensorimotor Systems were considered least important. These findings are in line with studies that found a relation between aggression and executive functions and attention (Bergvall et al., 2001; Ogilvie et al., 2011; Burgess, 2020; Cruz et al., 2020), language (Cohen et al., 2003; Anderson et al., 2016; Chow et al., 2022), social cognition (Karoglu et al., 2022), and reward and threat processing (Estrada et al., 2019; Manning, 2020). Below, we will further explore the importance of the RDoC constructs considering the distinction between reactive and proactive aggression.

. Reactive vs. proactive aggression
The extent to which experts differed in their opinion about the theoretical importance of the RDoC constructs for understanding reactive aggression compared to proactive aggression was rather small for most domains. The most pronounced difference was that Positive Valence Systems were deemed more important to understand proactive aggression, whereas Negative Valence Systems were considered most relevant for understanding reactive aggression. Both come as no surprise based on previous research. Differences in reward processing are found in children and adults with conduct disorder, callous-unemotional traits, antisocial personality disorder and psychopathy (Estrada et al., 2019). As these diagnoses are generally related to proactive aggression (Merk et al., 2005; Cima and Raine, 2009; White et al., 2015; Zhang et al., 2017), this outcome fits well into what we know. In addition, studies have shown that in people with impulsive-antisocial traits linked to psychopathy, more dopamine is released in the nucleus accumbens upon exposure to rewards, suggesting a hyperreactivity to rewards (Buckholtz et al., 2010). This highlights the relevance of trying to unravel the antecedents of aggression for assessment and treatment. Reactive aggression, on the other hand, is a primary reaction to perceived threat.
For the other domains, the differences in perceived importance between reactive and proactive aggression were rather small. This may be explained by the fact that some RDoC constructs, such as arousal, are quite broad. It has been reported in empirical studies that reactive aggression involves high affective-physiological arousal while proactive aggression is characterized by minimal autonomic arousal (Chase et al., 2001; Blair, 2003). In other words, arousal might be important in both types of aggression, albeit in different ways. Another example: compromised working memory might be associated with increased reactive aggression, as it is needed to process and react to triggers, while in proactive aggression, working memory is required for planning acts of violence, making it equally important but in a different manner. Thus, while RDoC constructs are important to evaluate to gain insights into the determinants of both forms of aggression, they may play different roles in the two types of aggression.

. Expert recommendations for neuropsychological test usage
In total, 223 different tests were suggested by the panel. This indicates that a large number of neuropsychological tests have been developed in the past decades and attests to the field's rapid development. It also presents a challenge for clinicians in choosing the most suitable tests. In addition, aggression is a multifaceted construct that cannot be measured through a single test. The distinction between reactive and proactive aggression adds another layer of complexity. In response to these challenges, we constructed a guide for clinicians and researchers: a curated selection of the three most favored tests as assessed by our panel of experts.
It must be noted that our aim was to provide an overview that offers a selection of the most suitable tests to measure the RDoC constructs, rather than constructing a fixed battery of tests. By presenting an overview of the most important neuropsychological constructs along with the most suitable tests to measure them, clinicians and researchers can select specific constructs that are most relevant to their case. However, for certain subgroups, particularly when assessing patients with intellectual disabilities or patients who are illiterate, the tests suggested in our study might not be suitable. In those cases, clinicians are encouraged to seek alternative tests. In the case of intellectual disabilities, it is proposed to use adapted versions of the original tests (such as the children's version) (Willner et al., 2010). In the case of illiteracy, the suggestion is to modify tests to resemble real-life situations instead of school-based procedures (Kosmidis, 2018). It is noteworthy that both intellectual disabilities and illiteracy are more prevalent in forensic populations than in the general population (Harris, 2006; Tuominen et al., 2014; Hellenbach et al., 2017; Muñoz García-Largo et al., 2020), underscoring the importance of considering these factors in the selection of appropriate assessment tools. In addition, it is important to note that some of the tests that emerged from our study are subject to criticism, often in the absence of better alternatives (e.g., the Thematic Apperception Test, see Lilienfeld et al., 2000). It is beyond the scope of this study to address tests individually.
Assessing these constructs may help to explain the determinants of the aggressive behavior, which can provide valuable input for tailored treatment planning. Another important outcome is that the panel indicated that not every RDoC construct is appropriate to be measured by neuropsychological testing. For example, it was noted that the construct affiliation/attachment can be more effectively assessed through a structured interview, and arousal through observation or physiological measures. The RDoC matrix provides numerous examples of self-report and physiological measures for assessing its constructs (National Institute of Mental Health, 2023). Hence, a combination of neuropsychological tests, interviews, self-report, observation, and physiological measures might be needed to optimally measure the RDoC constructs.

. Limitations
The findings of this Delphi study need to be considered in the context of a few limitations. Firstly, despite repeated and extensive attempts to include a representative global panel, half of the panel consisted of people from the Netherlands. The continents of South America and Africa were not represented at all and other continents were underrepresented (especially taking the number of inhabitants into account). Since neuropsychological practices are affected by, for example, the country's health care system, legal framework, and cultural norms, this is likely to have influenced the results of the study (Kasten et al., 2021). This may have also limited generalizability, as certain recommendations might be more tailored to the Netherlands. Furthermore, while every effort was made to ensure conciseness of the questionnaires, it is essential to recognize that participant motivation can influence the quality and consistency of expert input in iterative research endeavors like the Delphi method. Eight panel members (25% of the original panel: three from the Netherlands, two from the USA, and one each from Italy, Sweden, and Australia) did not complete all three rounds. This may have had implications for representativeness of the panel, as they might have had different perspectives than the remaining 24 experts. Fortunately, most information was gathered in round 1, where experts rated all RDoC constructs and provided their test suggestions.
The panel generated a large number (223) of neuropsychological tests that can be used to measure the RDoC constructs. The panel was unfamiliar with many of the tests (56% of the tests were unknown to more than half of the panel), which prevented them from forming an opinion about their suitability. As a consequence, we could not validly construct a top-3 for every RDoC construct. For the constructs without a top-3, we refer readers to the RDoC matrix for other assessment suggestions (National Institute of Mental Health, 2023).
Other limitations stem from the Delphi methodology. The approach toward consensus may exclude different but possibly important perspectives of individual panel members. The results of a Delphi study represent the ratings with the most overlap between the panel members, but this is not necessarily the "objective truth". Our study was not designed to uncover objective truths; instead, we aimed to identify best practices. Moreover, the Delphi procedure precludes direct contact between panel members to avoid group pressure toward conformity and possible effects of authority. However, a discussion can often lead to valuable insights. To address this, the panel members could read each other's comments and reasoning anonymously in rounds 2 and 3. This could help them in understanding the source of potential discrepancies in ratings and possibly change their opinion. We highlighted that they were not obliged to change their ratings if their opinion had not changed.

. Implications and future directions
While this study represents a significant step forward in the endeavor to achieve adequate neuropsychological assessment of aggressive offenders, it is essential to acknowledge that our understanding of the relationship between the RDoC domains and aggression remains complex. Studying the interrelations between the constructs might provide more insights into aggressive behavior. For example, a lack of attention might lead to misinterpretation of social cues and a compromised working memory can lead to difficulties in emotion regulation.
Another prominent challenge that emerges from our study is the validation of the neuropsychological tests proposed by the expert panel. Moreover, to ensure that neuropsychological assessments are meaningful and sensitive to the unique characteristics of aggressive offenders, the field should focus on collecting more appropriate normative data.
Furthermore, the possible incorporation of neuropsychological test findings into risk assessment and management should be studied more thoroughly. This approach aligns with the Risk-Need-Responsivity (RNR) model (Bonta and Andrews, 2023), a leading framework in the forensic field. The findings have two connections to the RNR model. First, previous studies have indicated the added value of including biopsychosocial factors for the prediction of recidivism (Aharoni et al., 2013, 2014; Haarsma et al., 2020; Zijlmans et al., 2021). This aligns with the 'Need' principle of the RNR model, which emphasizes the importance of targeting criminogenic needs that are associated with an individual's likelihood of reoffending. Second, beyond understanding the cognitive limitations associated with aggression, future research should explore how this knowledge can be translated into effective intervention strategies. Specifically, cognitive limitations may play a crucial role in an individual's responsiveness to treatment, in line with the 'Responsivity' principle of the RNR model. One might expect that attentional difficulties or memory problems in offenders will influence treatment effectiveness. Longitudinal studies can help to understand how changes in neuropsychological function are related to changes in aggression and recidivism, further strengthening the connection between the RNR principles and the incorporation of neuropsychological assessments in risk assessment and management strategies.

. Conclusion
This Delphi consensus study shed light on the role of the RDoC framework in understanding and assessing aggression in offenders. The experts' ratings underline the multidimensional nature of aggression, calling for a holistic approach when assessing and addressing aggression. Furthermore, distinguishing between reactive and proactive aggression provides useful insights into the mechanisms involved in aggressive behavior.
The extensive list of proposed neuropsychological tests, as well as the construction of a top-3 list for each construct, provide clinicians and researchers with a useful resource when it comes to selecting suitable tests. This overview allows for a flexible approach by tailoring assessments to specific clinical or research requirements. Furthermore, the acknowledgment that certain constructs may be better examined through interviews, observations, or physiological measures emphasizes the added value of a multimodal assessment strategy.
Future research should focus on test validation, normative data collection, and the integration of neuropsychological findings into risk assessment and intervention as our understanding of the complex relationship between RDoC domains and aggression advances. Our Delphi consensus study not only enhances our comprehension of aggression in offenders through the application of the RDoC framework but also provides a comprehensive guide for clinicians and researchers in the selection of neuropsychological tests. The findings of this Delphi study offer a stepping stone for advancing the field of neuropsychological assessment in understanding and addressing aggressive behavior.

FIGURE Overview of the Research Domain Criteria (RDoC) domains, including constructs (bold) and subconstructs (regular text).

a Panel members could give multiple answers to this question.
b This question was only answered for the panel members who were employed as researcher.
c This question was only answered for the panel members who were employed as clinician.

FIGURE Boxplots of importance scores of the RDoC domains for reactive (R) and proactive (P) aggression. SP, Social Processes; SM, Sensorimotor Systems; PV, Positive Valence Systems; NV, Negative Valence Systems; CS, Cognitive Systems; AR, Arousal/Regulatory Systems.
Next, Cognitive Systems were rated as M = 3.65 (SD = 0.44), with a non-significant difference between reactive (M = 3.72, SD = 0.48) and proactive aggression (M = 3.58, SD = 0.44; U = 134.5, p = 0.389). The panel reflected on why the Cognitive Systems are (not) important to consider in aggressive patients.
TABLE Final ratings of the RDoC constructs.
TABLE Characteristics of the top-most suitable tests per construct, sorted by highest importance rating (N = ).
* These constructs are not part of the RDoC, but were suggested as additions in round 1.