No short-term treatment effect of prism adaptation for spatial neglect: An inclusive meta-analysis

Despite 25 years of research on the topic


Background
Spatial neglect is a common cognitive consequence of stroke, estimated to affect around one-third of stroke survivors in the acute phase post-injury (Esposito et al., 2021). People with neglect have difficulty attending and orienting towards the contralesional side of space within arm's reach (peripersonal neglect), beyond arm's reach (extrapersonal neglect), and/or to the contralesional side of their own body (personal neglect; Bisiach et al., 1986; Halligan and Marshall, 1991). Compared to patients without neglect, they show a slower recovery pattern, worse rehabilitation outcome, and less functional independence in daily activities such as self-care, navigation, reading, and writing (Gialanella and Ferlucci, 2010; Gillen et al., 2005; Katz et al., 1999). Various cognitive rehabilitation techniques to reduce neglect severity have been proposed and investigated. However, the benefits of these are yet to be confirmed. First, when therapeutic effects are demonstrated, there is little support for these to generalize to activities of daily living (ADL) (Bowen et al., 2013; Luaute et al., 2006; Yang et al., 2013). Second, the quality of the evidence was judged to be very low by a recent Cochrane review (Longley et al., 2021). This assessment was based on the GRADE approach (Grading of Recommendations, Assessment, Development, and Evaluations), which is arguably one of the most thorough and systematic assessments of research quality (NICE Impact Stroke, 2019). The review concluded that no rehabilitation approach to ameliorate neglect can be supported or refuted based on current evidence from randomised controlled trials (RCTs).
One of the most well-studied cognitive rehabilitation methods for neglect is prism adaptation, first used in people with neglect by Rossetti et al. (1998). The general procedure of prism adaptation has three phases. First, during the pre-exposure phase, patients' initial ability to point accurately towards a visual target is measured. Second, during the exposure phase, patients wear glasses fitted with prism lenses shifting their visual field towards their ipsilesional side of space (i.e., to the right side in patients with left neglect following right hemisphere damage). Typically, prism lenses are designed to shift the visual field between 5 and 12 degrees of visual angle (equivalent to around 9-22 prism dioptres). While wearing the prism glasses, patients are asked to make repetitive pointing movements towards visual targets, usually with their right (ipsilesional) hand. During the first part of the pointing movement, the hand is not visible, creating a mismatch between the visually-perceived position of the target and the proprioceptively-felt position of the hand (Newport and Schenk, 2012). The target appears to be shifted in the direction of the prismatic shift, causing the patient to make a pointing error in this direction (i.e., towards the right). The hand is visible either during the second half of the reaching movement (concurrent feedback) or during the last 1-5 cm of the movement (terminal feedback), providing feedback on the errors made. Across trials, there is a gradual error reduction, with performance on the pointing task eventually becoming equivalent to the pre-exposure level. Finally, in the post-exposure phase, the prisms are removed and the after-effect of prisms on pointing accuracy is measured. The visuomotor adaptation results in pointing errors in the direction opposite to the prismatic shift, which would be towards the left (i.e., the neglected side) if rightward prisms had been used. It is generally agreed that a spatial remapping or recalibration of hand-eye coordination results in the after-effect of prism adaptation (Redding and Wallace, 2006, 2010; Saevarsson and Kristjansson, 2013).
Prism adaptation has been suggested to reduce neglect, as measured with neuropsychological paper-and-pencil tests (Frassinetti et al., 2002; Làdavas et al., 2011; Rossetti et al., 1998) and tests resembling ADL, such as wheelchair navigation, reading, and writing (Angeli et al., 2004a,b; Jacquin-Courtois et al., 2008; Rode et al., 2006; Watanabe and Amimoto, 2010). The literature on the effectiveness of prism adaptation as a treatment for neglect is extensive, with around 70 empirical trials to date that include a control group, and around 20 RCTs or controlled quasi-random studies (Fig. 1; Table 1). Although most published studies report improvement in some aspect of neglect behaviour (see review: Gammeri et al., 2020), at least six RCTs have failed to replicate the beneficial effect of prism adaptation on neglect (e.g., Hauer and Quirbach, 2007; Mancuso et al., 2012; 2016; Ten Brink et al., 2017; Turton et al., 2010; Vilimovsky et al., 2021).
The mixed results from individual studies suggest a need for a quantitative synthesis of the available evidence. This has been attempted in three recent reviews and/or meta-analyses. A Cochrane review by Longley et al. (2021), which provides an update of an earlier review by Bowen et al. (2013), concluded that there is no evidence for immediate or persisting benefit of prism adaptation on paper-and-pencil neglect assessments or measures of functional ability in ADL. Of two more specific meta-analyses (Li et al., 2020; Qiu et al., 2021), one suggested a net benefit of prism adaptation for short-term clinical measures of neglect (Li et al., 2020), but neither reported significant long-term effects or any effects on ADL-like measures (Catherine Bergego Scale, CBS; Azouvi, 1996; Azouvi et al., 2006).
These three reviews, however, might give a restricted overview of the relevant evidence. The Cochrane review (Longley et al., 2021) used strict inclusion criteria, such as restricting the study to RCTs, which excluded large numbers of potentially relevant studies, and limited the meta-analytic models to a maximum of five studies. This may limit the generalisability of the overall effect size estimate. It is also worth noting that the meta-analyses within this Cochrane review compared prism adaptation to any control condition, which could include other experimental treatments for spatial neglect, such as functional electrical stimulation (Choi et al., 2019). Therefore, the overall effect size estimate is not a comparison of prism adaptation to sham treatment or treatment-as-usual, which would be the more appropriate comparison to evaluate whether prism adaptation is more beneficial than the standard rehabilitation approaches that patients often receive. Lastly, in calculating the standardised mean difference between prism and control groups, only post-treatment scores were included instead of the change-from-baseline to post-treatment scores. In theory, this is not a problem when analysing large RCTs, where it can be assumed that randomisation balances out random between-subject variation in pre-treatment measures so that any differences between groups represent treatment effect. However, for the small sample sizes that characterise the prism adaptation literature, there could be potentially large variation in baseline (pre-prism) severity of symptoms, which could unduly influence any outcome measure that does not take the baseline symptoms into account.
The two recent meta-analyses (Li et al., 2020; Qiu et al., 2021) restricted their study selection criteria to RCTs using only sham adaptation or treatment-as-usual as control conditions. For both meta-analyses, this meant that a maximum of three studies were available per meta-analytic model. The small set of studies arguably limits the reliability of the results as well as the potential benefit of quantitative meta-analysis over a simple narrative review. Some further limitations of these meta-analyses were the inconsistent selection criteria (e.g., including some studies with pseudorandom allocation and not others, or unexplained exclusion of some studies that met the stated selection criteria) and a lack of description of data extraction and combination procedures. Furthermore, Qiu et al. (2021) incorrectly considered studies by Goedert et al. (2020), Rode et al. (2015), and Turton et al. (2010) as having used the full Behavioural Inattention Test (BIT; Wilson et al., 1987), including conventional and behavioural sub-tests, where in fact they had only used the conventional sub-tests (BIT-C). Additionally, Li et al. (2020) conducted a fixed-effect meta-analysis, which assumes that the included studies estimated the same underlying treatment effect so that variations in effect sizes would be due to random sampling error alone. This assumption seems implausible, considering that the included studies had different inclusion criteria (time since stroke, neglect severity, lesion site) and treatment protocols (number and length of prism adaptation sessions). To reach a clearer estimate of the treatment effect, there is potential value in a meta-analysis that attempts to consider more of the available evidence, and which uses estimates of treatment effect sizes that account for differences in baseline symptoms between treatment and control groups.

Objectives
The contrasting results of individual RCTs and pseudo-RCT studies of prism adaptation, and the limitations of previous systematic reviews and meta-analyses, mean there is still no consensus on whether prism adaptation is an effective treatment for spatial neglect. The goal of the current meta-analysis is to estimate the effectiveness of prism adaptation measured by the most standard paper-and-pencil tests, with the additional aim of assessing the influence on neglect in everyday life situations. This is especially relevant in the light of the widespread use of prism adaptation, and the continued popularity of research in this area (see the Virtual Special Issue of Cortex to mark 20 years of the field [Rossetti et al., 2019]).
For the current meta-analysis, prism adaptation treatment was compared to non-experimental conditions of sham adaptation or treatment-as-usual, in right hemisphere stroke patients with left spatial neglect (see 2.1 for a more detailed account of inclusion criteria). We focussed on the question of whether reliable benefits of prism adaptation are found in the short-term (from immediate to one week) post-treatment period. Unless there is clear evidence for a short-term benefit to clinical and/or behavioural signs of neglect, there is no realistic prospect of any lasting benefits, which would ultimately be required to make this treatment worthwhile. The current analysis was more inclusive than previous meta-analyses, in the sense that controlled trials were included regardless of the randomisation procedure. This is relevant, as the randomisation process for most prism adaptation RCTs was either not truly random or not described. This addition is unlikely to have a substantial impact on the quality of the review but will increase the amount of data available for analysis.
There are many studies that use specific, and often bespoke, tasks to assess the influence of prism adaptation, but there is no secure basis for combining outcomes from these disparate measures. This provided the rationale for focusing on specific outcome measures that are most commonly used, namely the BIT-C and cancellation tasks as clinical measures of neglect, and the CBS, which measures neglect severity in everyday life situations. The BIT-C is the most widely used standard clinical battery for neglect. This battery includes six subtests: line crossing, letter cancellation, star cancellation, figure and shape copying, line bisection, and representational drawing (Wilson et al., 1987). In the scoring system of the BIT-C, cancellation tasks (including line crossing) are heavily weighted, contributing 130/146 (89%) of the total score. Since the cut-off for neglect on this test is 129, the only way to receive a diagnosis of neglect is if at least some cancellation omissions are made. We therefore combined evidence from papers using BIT-C with those of papers using cancellation-only outcomes. Cancellation tasks are widely regarded as the single best clinical measure for neglect (Ferber and Karnath, 2001; Moore et al., 2022), so at least one cancellation task is usually included in any study of neglect, even if the main focus is on other behaviours. Focusing on cancellation tasks, in combination with BIT-C results, enabled us to include many more studies than any previous meta-analysis, potentially allowing a more reliable overall estimate of prism treatment effects.
Due to the lack of clear understanding of the differences between left- and right-sided neglect (Chen et al., 2015; Ringman et al., 2004) and the limited amount of research on the effects of prism adaptation on right-sided neglect, this meta-analysis exclusively focussed on right hemisphere stroke patients with left-sided neglect. Studies for which no separate data were available for this patient group were not included. To attempt to reduce the influence of between-subject variation in baseline symptoms, we focused on the standardised difference between treatment and control groups in the pre-post treatment change scores.

Inclusion criteria
Study inclusion criteria and all other steps of this meta-analysis were based on a study protocol, which is archived at the Open Science Framework (https://osf.io/hzdcq/) alongside a document explaining any changes to this protocol and the reasons behind those changes. The inclusion criteria, based on the a priori established protocol, were built with the aims of increasing quality and reducing heterogeneity of the included trials so they could be reliably integrated into one meta-analytic model.
Studies conducted in 1998 or thereafter were considered, and could be in English, German, Spanish, or Dutch. The lower-bound date of 1998 is the year of publication of Rossetti and colleagues' original report of the application of prism adaptation to the treatment of neglect. Published and unpublished randomized and non-randomized controlled trials were considered for inclusion. Patients had to be adults over 18 years of age, presenting with symptoms of left neglect after a right hemisphere stroke. This had to be defined at the start of the study by performance on tests such as the conventional sub-tests of the BIT-C, or a sub-component procedure (cancellation, drawing, copying, bisection), and studies could include patients that showed neglect on some or all tasks used. Patients in the prism adaptation group had to have been treated with visuomotor adaptation to rightward displacing prisms.
Studies were included only if the control group received a true control 'treatment' such as sham adaptation (repetitive pointing without prism lenses), or treatment-as-usual. In this context, a treatment-as-usual controlled study meant that, aside from prism adaptation, there was no deviation from the regular neglect treatment protocol in either group. Trials comparing prism adaptation to another experimental treatment or no treatment at all were not included (although there was no instance of the latter). Potentially ambiguous cases were resolved by team discussion, prior to examining the results of the studies. For instance, there were two studies in which the control group received additional general cognitive stimulation that the prism adaptation group did not (Serino et al., 2006; Vangkilde and Habekost, 2010). Because these additional treatments were intended to control for general stimulating effects of prism adaptation, and were not neglect-specific, we included these studies. Another study noted that both the prism and control groups received visual scanning training for around 1 h per week as part of standard rehabilitation for neglect (Ten Brink et al., 2017). Because visual scanning training reflected the baseline of standard care for both groups in this setting, we included this study. By contrast, we excluded a study by Spaccavento et al. (2016), in which the comparison group received visual scanning as an experimental treatment for neglect that was not standard care in that setting, and was not received by the prism group.
Once inclusion decisions were made in principle, sufficient data had to be available to allow the calculation of the pre-post change for the dependent measures of interest (see 2.6.1. and 2.6.2.) for the patient and control groups, and the standard deviation of the change for each group. If the pre-post change but not its standard deviation could be calculated, then the study was still considered for inclusion using an imputation strategy for the missing standard deviations (see 2.5.).

Information sources
Published studies were identified from three electronic databases: PsycInfo (Ovid), Web of Science-Core Collection, and PubMed. Unpublished trials were identified from two electronic registries: ISRCTN and clinicaltrials.gov. The last search dates for the databases and the electronic registries were June 31, 2021 and June 14, 2021, respectively. To identify further potentially eligible studies, the reference lists of previous related reviews (Bowen et al., 2013; Champod et al., 2014; Dintén-Fernández et al., 2019; Li et al., 2020; Liu et al., 2019; Longley et al., 2021; Qiu et al., 2021) were checked.

Search strategy
For the database search, the two key components of the systematic review, defined as 'prism' and 'neglect', were used. To identify all relevant studies, the Boolean operator 'AND' and the truncation symbol '*' were used in the search string 'neglect AND prism*' to cover all related terms, e.g., 'prismatic'. The automatic/suggested strategies for each database (keywords for PsycInfo, topic for Web of Science, and all fields for PubMed) were used, and the year range was set to studies published from 1998 onwards. On PubMed, the Article type was set to Journal Article; on Web of Science, the document type was set to Article or Early Access. On the ISRCTN website, a text search was performed with the string 'prism* AND neglect'. On clinicaltrials.gov, the search was set to All studies with condition or disease: 'neglect', other terms: 'prism'. The search on the electronic registries was done by two independent reviewers, and there were no disputes between them to be resolved by discussion.
performance were found at post-test for both the visual scanning training and prism adaptation groups (n = 10 per group), with the improvement being non-significantly larger for the visual scanning training group.

Study screening and selection
All results returned from the three databases had their title and abstract screened according to the inclusion criteria, and reference management software was used to categorize and store all the results returned. After removing duplicates, the results were divided into four equal sub-lists. The first author (OS) screened all studies, and each of the four sub-lists was screened independently by one other member of the research team. Therefore, every item was screened independently by OS and one other reviewer. Any disagreement between reviewers was resolved by a discussion involving the full research team.
Following this initial screening, the full text of all articles deemed potentially eligible was reviewed independently by all five reviewers. The articles were categorised into three groups: 1) to be excluded; 2) to be included; and 3) not sure/more information needed. Any differences between the reviewers' categorisations were resolved via group discussion. Study authors were asked for additional information where necessary to resolve questions about eligibility (e.g., whether data were available separately for the patient group of interest or if data would be available from only the first half of a cross-over trial). See Fig. 1 for the number of excluded/included studies after each stage of the selection process. The list of studies excluded after full-text screening and the reasons for exclusion are provided in Supplementary materials (Table S1).

Data extraction
Data were extracted using a structured protocol (see Appendix B for the data items extracted). All items were extracted from each study by the first author (OS) and were checked by one other reviewer per study. The authors of the unpublished registered trials were contacted with a data request.

Primary outcomes: BIT-C and cancellation tasks
The primary outcome measure was visuospatial neglect as measured by the BIT-C or by one or more target cancellation tasks.
The BIT-C total score was treated as the preferred measure when available. The full score of BIT-C is dominated by three cancellation tasks, reflecting the number of targets cancelled across line crossing, letter cancellation, and star cancellation (36, 40, and 54 targets, respectively), and the remaining 16 points are contributed by other tasks (4 points for figure copying, 9 for line bisection, and 3 for representational drawing). Therefore, cancellation tasks contribute 130/146 (89%) of possible points overall on the BIT-C.
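The arithmetic behind the 130/146 figure can be checked with a short sketch. The subtest maxima are taken from the paragraph above; the variable names are illustrative only.

```python
# Maximum points per BIT-C subtest, as listed in the text above.
BIT_C_MAX_POINTS = {
    "line crossing": 36,
    "letter cancellation": 40,
    "star cancellation": 54,
    "figure copying": 4,
    "line bisection": 9,
    "representational drawing": 3,
}
CANCELLATION_SUBTESTS = ("line crossing", "letter cancellation", "star cancellation")

total_points = sum(BIT_C_MAX_POINTS.values())
cancellation_points = sum(BIT_C_MAX_POINTS[name] for name in CANCELLATION_SUBTESTS)
cancellation_share = cancellation_points / total_points

# Prints "130/146 = 89%"
print(f"{cancellation_points}/{total_points} = {cancellation_share:.0%}")
```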
Consequently, we considered performance on any cancellation tasks to be a sufficiently similar outcome measure for studies that did not employ the BIT-C. The preferred metric of cancellation performance was the number of targets cancelled, summed across all cancellation tasks used, to match the scoring system of the BIT-C. If the number of targets cancelled could not be recovered from the paper, data requests were sent to the corresponding first and/or senior authors. If the number of cancellations was not available, then Centre of Cancellation scores (Binder et al., 1992) were considered sufficiently similar. Centre of Cancellation scores have previously been found to correlate nearly perfectly with the number of targets cancelled for line crossing (r = 1.0) and star cancellation (r = 0.95) in a sample of 50 patients with right hemisphere damage (McIntosh et al., 2017). However, in some studies, only left-right asymmetry scores were available for cancellation tasks (e.g., Vangkilde and Habekost, 2010). These studies were excluded because asymmetry measures do not change monotonically with neglect severity (in severe neglect, if cancellations are restricted to one side of the sheet, the asymmetry score will get smaller if fewer targets are cancelled). To obtain as homogeneous a dataset as possible, when both BIT-C and cancellation tests were reported separately, only the BIT-C data were used.
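The non-monotonic behaviour of asymmetry scores can be illustrated with a toy example. The sketch below assumes a simple asymmetry definition (right-side hits minus left-side hits); the individual studies may have used other formulas, and the three patient profiles are hypothetical values chosen for illustration, not data from any study.

```python
def total_cancelled(left_hits: int, right_hits: int) -> int:
    """Number of targets cancelled, the metric preferred in this meta-analysis."""
    return left_hits + right_hits


def asymmetry(left_hits: int, right_hits: int) -> int:
    """Assumed asymmetry score: right-side hits minus left-side hits."""
    return right_hits - left_hits


# Hypothetical (left_hits, right_hits) profiles, ordered from mildest to most severe.
patients = {
    "milder": (10, 27),
    "severe": (0, 20),
    "very severe": (0, 10),
}

for label, (left, right) in patients.items():
    print(label, total_cancelled(left, right), asymmetry(left, right))
```

In this toy example the total number of targets cancelled falls steadily with severity (37, 20, 10), whereas the asymmetry score rises and then falls (17, 20, 10), so the most severe profile has the smallest asymmetry.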

Secondary outcome: Catherine Bergego Scale
The secondary outcome measure was the CBS, an observation scale that is currently the most-used standardised test of neglect that measures functional abilities in everyday life situations. The patient is scored on a 0 to 3 scale ('no neglect' to 'severe neglect') during performance of self-care activities on items such as 'Experiences difficulty finding his/her personal belongings in the room or bathroom when they are on the left side' and 'Forgets to eat food on the left side of his/her plate'. The final score is then given out of 30, with higher scores indicating more severe neglect. When the standard CBS scores were not available, CBS scores registered via the Kessler Foundation Neglect Assessment Process (KF-NAP) were also accepted (Chen et al., 2015). This maximised the available data, as the KF-NAP scoring system, unlike the standard scoring system, allows for scoring even if some items are not assessed. The final score is calculated with the formula: (sum score/number of scored items) × 10 (Chen et al., 2012, 2015).
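The KF-NAP formula can be written as a minimal sketch. The function name and the use of `None` to mark an unassessed item are illustrative assumptions; only the formula itself comes from the text above.

```python
from typing import Optional, Sequence


def kf_nap_cbs_score(item_scores: Sequence[Optional[int]]) -> float:
    """Apply (sum score / number of scored items) x 10.

    Each assessed item is rated 0 (no neglect) to 3 (severe neglect);
    None marks an item that could not be assessed (an assumed convention).
    """
    scored = [score for score in item_scores if score is not None]
    if not scored:
        raise ValueError("at least one item must be scored")
    if any(score < 0 or score > 3 for score in scored):
        raise ValueError("item scores must lie between 0 and 3")
    return sum(scored) / len(scored) * 10
```

When all ten items are scored, the result equals the plain sum out of 30, so the two scoring systems agree for complete assessments.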

Risk-of-bias assessment
The Revised Cochrane risk-of-bias tool for randomized trials (Sterne et al., 2019) was used to evaluate potential biases. This tool assesses the risk-of-bias of the results in five domains: randomisation process; deviations from the intended interventions; missing outcome data; measurement of the outcome; and selection of the reported results. Based on the responses to signalling questions, an algorithm proposes a judgement about the risk-of-bias in each assessed domain. Judgements about risk-of-bias can be 'Low', 'Some concerns', or 'High', and the overall risk-of-bias score of a study is based on the least favourable judgement. For instance, a study with high risk-of-bias in just one domain ends up with an overall judgement of high risk-of-bias. Each included study was rated for risk-of-bias by the first author (OS) and independently by another member of the research team (AFTB and AGM). This assessment was done at study level, not separately for each outcome measure. Any disagreements between the reviewers were resolved by discussion. As suggested by the tool, when there were sufficient grounds and agreement for such a decision, the reviewers could override the projected domain-level and overall risk-of-bias judgements. This happened, for instance, when the main reported analysis was not suitable for the research question within the study (related to domain 5), which would normally be considered as a high risk-of-bias, but the raw data were available or there were enough data to calculate the standardised effect size for our required outcome, resulting in a low risk-of-bias for this domain. Consequently, in these cases, domain 5 was not considered in the overall judgement of a study. See Table 2 for the component-specific and overall risk-of-bias judgements of the included studies.
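The "least favourable judgement" rule described above can be sketched as follows. This is a simplified illustration of the rule as stated in this section, not the full signalling-question algorithm of the tool; the function name, the dictionary layout, and the `ignored_domains` argument (used to mimic the domain 5 override) are assumptions.

```python
# Ordering of judgements from most to least favourable.
RATING_ORDER = {"Low": 0, "Some concerns": 1, "High": 2}


def overall_risk_of_bias(domain_ratings, ignored_domains=()):
    """Return the least favourable rating among the domains considered."""
    considered = {
        domain: rating
        for domain, rating in domain_ratings.items()
        if domain not in ignored_domains
    }
    if not considered:
        raise ValueError("at least one domain must be considered")
    return max(considered.values(), key=RATING_ORDER.__getitem__)


# Hypothetical study, for illustration only.
example_study = {
    "randomisation process": "Some concerns",
    "deviations from the intended interventions": "Low",
    "missing outcome data": "Low",
    "measurement of the outcome": "Low",
    "selection of the reported results": "High",
}
print(overall_risk_of_bias(example_study))
print(overall_risk_of_bias(example_study, ignored_domains=("selection of the reported results",)))
```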

Effect size measures
The effect size estimate is the standardised mean difference between groups in the post-treatment change from baseline, where a positive value represents a greater improvement in the experimental over the control group. Hedges' g was chosen as the measure of effect size for the meta-analytic models. Hedges' g and Cohen's d are very similar, as both assume equal variances between groups and have a slight positive bias in the results (up to about 4%). However, Hedges' g pools the variance using n - 1 for each sample instead of n, which provides a better estimate for small sample sizes (<20). In this way, this measure adjusts for some (but not all) of the over-estimation of the effect (Borenstein et al., 2007).
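As a minimal sketch of this effect size, the function below computes a standardised mean difference of change scores from group-level summary statistics, pooling the variance with n - 1 per group. It additionally applies the commonly used small-sample correction factor J = 1 - 3/(4df - 1) and a standard large-sample approximation for the sampling variance; both are assumptions about the exact computation, which is not spelled out in this section. The function and argument names are illustrative.

```python
import math


def hedges_g_from_changes(mean_change_prism, sd_change_prism, n_prism,
                          mean_change_control, sd_change_control, n_control):
    """Return (g, sampling variance) for the difference in pre-post change.

    Changes should be coded so that larger values mean more improvement;
    for scales where higher scores mean worse neglect (e.g., the CBS),
    the sign of the change needs to be flipped first.
    """
    df = n_prism + n_control - 2
    # Pooled SD with n - 1 weighting for each group.
    pooled_sd = math.sqrt(
        ((n_prism - 1) * sd_change_prism ** 2
         + (n_control - 1) * sd_change_control ** 2) / df
    )
    d = (mean_change_prism - mean_change_control) / pooled_sd
    # Small-sample correction factor (assumed standard approximation).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    n_total = n_prism + n_control
    var_d = n_total / (n_prism * n_control) + d ** 2 / (2 * n_total)
    return g, j ** 2 * var_d
```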
Studies reporting BIT-C/cancellation scores and those reporting CBS scores were grouped separately. There were 16 studies in the BIT-C/cancellation group and 8 studies in the CBS group. All the studies in the CBS group were also in the BIT-C/cancellation group.
Data requests were sent to the corresponding authors of each of the 20 eligible trials after full text screening (Fig. 1) to be able to include an analysis of patient-level data or to obtain the statistical parameters required for the calculation of effect size measures when not reported for the variable of interest, or when only a subset of patients or a part of the study (first part in case of cross-over trials) met the inclusion criteria. Raw or group-level summary data were obtained from 10 studies. When there were not enough data available in the original text, or from the authors, an open-source plot digitizer was used to recover data from the figures (Rohatgi, 2021), which is a valid and highly reliable tool for graphical data extraction (Drevon et al., 2017). Detailed descriptions of effect size extraction for each study for the BIT-C and CBS models are provided in Supplementary materials (Tables S2 and S3, respectively).
As studies reporting BIT-C and cancellation tasks were included in the same model, to combine the scores from different scales, the bias-corrected standardised mean difference (Hedges' g) and the corresponding 95% confidence intervals were used in the meta-analysis. When only the group size and the p-value from the group (prism vs. control) by time (pre- vs. post-treatment) interaction were available, the p-value was converted to the corresponding t-value, which was then converted to the corresponding Cohen's d, which in turn was used to obtain Hedges' g and the corresponding sampling variance.
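The conversion chain from p-value to Hedges' g can be sketched with the standard library alone. The sketch assumes a two-tailed p-value from an interaction test with n1 + n2 - 2 degrees of freedom, the usual relation d = t × sqrt(1/n1 + 1/n2), and the correction factor J = 1 - 3/(4df - 1); none of these details is stated in the text, and all function names are illustrative. Because a p-value carries no sign, the direction of the effect has to be supplied from the study report.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=4000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return max(0.0, 1 - 2 * area)


def t_from_p(p, df):
    """Invert the two-tailed p-value by bisection (sketch; p >= 1e-4 only)."""
    if not 1e-4 <= p < 1:
        raise ValueError("this sketch supports 1e-4 <= p < 1")
    lo, hi = 0.0, 1.0
    while two_tailed_p(hi, df) > p:
        hi *= 2
        if hi > 1024:
            raise ValueError("p-value too small for this sketch")
    for _ in range(100):
        mid = (lo + hi) / 2
        if two_tailed_p(mid, df) > p:
            lo = mid  # p still too large, so |t| must be larger
        else:
            hi = mid
    return (lo + hi) / 2


def g_from_interaction_p(p, n_prism, n_control, favours_prism=True):
    """Return (Hedges' g, sampling variance) from a p-value and group sizes."""
    df = n_prism + n_control - 2
    t = t_from_p(p, df)
    d = t * math.sqrt(1 / n_prism + 1 / n_control)
    if not favours_prism:
        d = -d
    j = 1 - 3 / (4 * df - 1)
    n_total = n_prism + n_control
    var_d = n_total / (n_prism * n_control) + d ** 2 / (2 * n_total)
    return j * d, j ** 2 * var_d
```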
Another common effect size measure for meta-analyses of clinical trials is Glass' delta, which uses the control SD in the denominator of the effect size formula. Hedges' g was chosen over Glass' delta because Hedges' g uses the pooled SD of the two groups, which is more stable than the SD of the control group considered alone, especially given the small group sizes of studies on this topic (Table 1, column 3). However, there was no substantive difference in outcomes when Glass' delta was used instead (see Supplementary materials [Fig. S1]).

Synthesis methods
The short-term effect was regarded as the difference between the latest pre-treatment baseline and earliest post-treatment measure (which ranged from immediately after treatment to one week later). For studies where there were two experimental groups (Làdavas et al., 2011), the average of the two experimental groups was used both in terms of the results and the demographic information of patients, as suggested by the Cochrane Handbook (Higgins and Green, 2011).
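One way to merge two experimental groups into a single group is the standard formula for combining sample sizes, means, and standard deviations. The sketch below implements that formula; whether it matches the exact averaging applied here is an assumption, and the function name is illustrative.

```python
import math


def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Merge two groups into one; returns (n, mean, sd) of the combined group."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    variance = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (mean1 - mean2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(variance)
```

The result is identical to the mean and SD that would be obtained by pooling the individual observations of both groups, which is why the between-group difference in means enters the variance term.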
As the included studies had different inclusion criteria (variation in time since stroke, neglect severity, lesion site) and treatment protocols (variation in number and length of prism adaptation sessions), a random-effects meta-analysis was conducted both on BIT-C/cancellation and CBS data, using the metafor package in R (Viechtbauer, 2010).
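The logic of a random-effects model can be illustrated with a short sketch. Note that this is not the metafor computation: it substitutes the simpler DerSimonian-Laird estimator of between-study variance (metafor's `rma` function defaults to restricted maximum likelihood), and the function name, the returned keys, and the normal-approximation confidence interval are assumptions made for illustration.

```python
import math


def random_effects_dl(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return {
        "estimate": pooled,
        "se": se,
        "ci_95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "Q": q,
    }
```

When the estimated between-study variance is zero, the result reduces to the fixed-effect estimate; otherwise small studies receive relatively more weight than they would under a fixed-effect model.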

Search, selection and data extraction results
Out of the 50 articles that were fully retrieved, 31 were excluded due to not meeting the a priori selection criteria (see Table S1). Nineteen studies were deemed eligible (reported in Table 1) and data extraction from these studies was attempted to allow for calculation of effect sizes of interest. For 15 studies, sufficient data were obtained from the paper or directly from the authors. Data were also available from one unpublished, registered trial, which has since been published (Longley et al., 2023). Hence, the cited publication date of this study falls later than the last search date for databases and electronic registers for our meta-analysis. For four studies, sufficient data were not available in the paper, and at least three email enquiries were made to the authors' last published email address and/or other email addresses found online with no response, or the authors informed us that the data were not available. These four studies could not be included. Overall, 16 studies (430 patients, across treatment and control groups) had sufficient data to be included in the main analysis, to assess the short-term effect of treatment on neglect as measured by BIT-C/cancellation tasks. Results of the search and selection process are summarised in Fig. 1.

Risk-of-bias assessment results
The 16 studies included in our meta-analysis (Table 2) were rated for risk-of-bias using the Revised Cochrane risk-of-bias tool for randomized trials (Sterne et al., 2019). Out of the included studies, only one was judged as having a low risk-of-bias, four were judged to have some concerns, and eleven were judged as having a high risk-of-bias.

Effect size
There was a near-perfect correlation between the raw effect sizes in percentage and the Hedges' g standardised effect sizes used in the random-effects models (r = 0.98). This provides further grounds for the use of this measure and the possibility to translate the overall effect size estimate into the measure of the clinical neglect scales used. The best-fitting linear equation predicts that one unit of effect size (Hedges' g) would correspond to a change in BIT-C or cancellation score of 19.4 percentage points (e.g., 28 points on the BIT-C).
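The translation from standardised to raw units is simple arithmetic, sketched below. Only the slope (19.4 percentage points per unit of g) and the BIT-C maximum of 146 come from the text; the sketch assumes a zero intercept, since the intercept of the fitted line is not reported here, and the names are illustrative.

```python
PERCENTAGE_POINTS_PER_G = 19.4  # slope of the best-fitting line, from the text
BIT_C_MAX_SCORE = 146           # maximum BIT-C score


def g_to_bit_c_points(g: float) -> float:
    """Convert Hedges' g to BIT-C points, assuming a zero intercept."""
    return g * PERCENTAGE_POINTS_PER_G / 100 * BIT_C_MAX_SCORE


# One unit of g corresponds to roughly 28 BIT-C points.
print(round(g_to_bit_c_points(1.0)))
```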

Examination of potential moderator variables
Heterogeneity in methods and reporting, the limited data available, and the relatively small number of total studies used in the meta-analysis (n = 16), all presented problems for any attempt to include moderator variables in the random-effects models. We initially defined 41 variables of potential interest to be extracted from each study (Appendix B). Most of these variables (N = 37) were not available for most studies, and only four were available for all studies. These moderator variables were the strength of the prisms used, the number of prism adaptation sessions, the number of pointing movements made per adaptation session (prism adaptation variables), and the mean time post-stroke (clinical variable) (see Table 1 for information on which studies had which variables available). Like other clinical variables of potential relevance (e.g., location of injury, neglect severity, and profile of neglect impairment), the time post-stroke would ideally be encoded at the individual patient level, but insufficient patient-level data were available to do this for the current meta-analysis. There was considerable heterogeneity within groups in time post-stroke: the range of days post-stroke within a group could be as large as 2790 days (Serino et al., 2006), and it is unclear whether it is meaningful to represent the overall group by a mean value (395 days). Nonetheless, the mean was the only universally available value that could be used to represent time post-stroke.
We first visualised the relationships of the available potential moderator variables with the treatment effect size. The scattergrams are shown in Fig. 2, along with Spearman correlation coefficients (to limit the influence of outliers) and the Pearson correlation coefficient when excluding an outlier case (Hreha et al., 2018). Note that for the number of studies included (n = 16), a minimum correlation of 0.50 would be required for a significant relationship at p < .05 (uncorrected for the five multiple comparisons), and a correlation of 0.58 would be required for significance at an alpha-corrected level of p < .001. There was no suggestion of a dependence of the treatment effect upon the characteristics of the adaptation treatment (Fig. 2, panels a-c). It is possible that some interaction of these three variables might be a more relevant moderator, but the available dataset could not support this level of secondary exploration. Nor was there any clear association of the treatment effect size with mean days post-stroke, considered as a continuous variable (Fig. 2d). Due to the heterogeneity of time since injury within each study, encoding days post-stroke as a continuous variable might be unrealistic. To turn this predictor into a coarser categorical variable, a natural place to divide would be at 90 days, forming a (sub)acute group (<90 days) and a chronic group (>90 days). Looking at the distribution of mean values, this division did not seem to relate to treatment outcome.
Overall, there were no sufficiently compelling relationships, or a priori considerations, to merit the inclusion of any available moderators in the meta-analysis, particularly given the relatively small number of studies overall. The strongest relationship was between Hedges' g and days post-stroke, which would suggest that prism adaptation is more effective for more chronic patients. This was not a compelling relationship, however, and we did not include this moderator in the meta-analysis.

Random effect meta-analysis on the short-term effect measured by the BIT-C and cancellation tasks combined: heterogeneity and outlier identification
The heterogeneity of effect sizes can be visually examined in Fig. 3. If the literature assessed is unbiased, it is expected that trials with higher precision near the top of the plot will converge on the average (middle vertical line), and trials with lower precision nearer the bottom will be spread evenly on both sides of the average. Deviation from this expected symmetrical shape is suggestive of bias within the literature (Egger et al., 1997). Egger's test of asymmetry was not significant (z = 1.10, p = .27), suggesting no substantial bias. The main source of asymmetry within this plot is the presence of an extreme outlier, marked with the unfilled circle at the lower right, with a very large treatment effect size and a small study size (n = 13 per group; Hreha et al., 2018). An outlying effect size estimate may exert an undue influence on the overall results. A study may be deemed influential if its omission from the analysis results in significant modifications to the fitted model. To identify such studies, case deletion diagnostics (e.g., Belsley, 1980; Cook and Weisberg, 1982) were applied (see Supplementary materials, Fig. S2). These showed that the study by Hreha et al. (2018) has a strong influence on the results (as reflected, e.g., in Cook's distance). The Baujat diagnostic plot (Baujat et al., 2002) confirmed this study as the main source of heterogeneity in the meta-analysis (see Supplementary materials, Fig. S3). Removal of this study would reduce the amount of heterogeneity and increase the precision of the overall effect estimation of the random-effects model. This study was included in the initial random effect model for completeness, but it is important to note that its presence there is likely to bias the overall effect size estimate upwards.
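The logic of case deletion can be illustrated with a leave-one-out refit of a random-effects model. The sketch below is not the analysis reported here: it uses the DerSimonian-Laird estimator as a simple stand-in (the estimator used for the reported models is not stated in this section), it does not compute Cook's distance or the Baujat statistics, and the five effect sizes and variances are hypothetical values chosen so that study "E" is an obvious outlier.

```python
import math


def pool_dl(effects, variances):
    """DerSimonian-Laird random-effects pooling.

    Returns the pooled estimate, its standard error, tau^2, and I^2 (%).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    est = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"estimate": est, "se": se, "tau2": tau2, "i2": i2}


def leave_one_out(labels, effects, variances):
    """Refit the model once per study, each time omitting that study."""
    results = {}
    for i, label in enumerate(labels):
        e = effects[:i] + effects[i + 1:]
        v = variances[:i] + variances[i + 1:]
        results[label] = pool_dl(e, v)
    return results


# Hypothetical data: four small effects and one extreme, imprecise study "E".
labels = ["A", "B", "C", "D", "E"]
effects = [0.10, -0.05, 0.20, 0.00, 2.00]
variances = [0.05, 0.08, 0.06, 0.04, 0.15]

full = pool_dl(effects, variances)
loo = leave_one_out(labels, effects, variances)

print(f"all studies: g = {full['estimate']:.2f}, tau^2 = {full['tau2']:.2f}")
for label in labels:
    r = loo[label]
    print(f"without {label}: g = {r['estimate']:.2f}, tau^2 = {r['tau2']:.2f}")
```

In this toy dataset, omitting "E" lowers both the pooled estimate and the between-study variance, which is the pattern described above for the outlying study.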

Random effect meta-analysis on the short-term effect measured by the BIT-C and cancellation tasks combined: main results
On the BIT-C/cancellation data, we conducted three random effect models, all of which suggested a null effect overall. The initial random effect model, summarised as a forest plot in Fig. 4a, showed substantial heterogeneity between studies (I^2 = 71.78%, Tau^2 = 0.43), and estimated an effect size of 0.24 (95% CI: −0.14, 0.63), which did not depart significantly from zero. To explore the influence of an outlier case, a second random effect model was conducted on the remaining 15 studies, excluding the study by Hreha et al. (2018) (Fig. 4b). This decreased the estimated effect of the treatment (dz = 0.10, 95% CI: −0.19, 0.38) and reduced the heterogeneity between studies (I^2 = 46.53%, Tau^2 = 0.14). It is important to note that this outlying estimate was one in which imputation had to be made during the effect size extraction process (Table S2). In addition, to obtain a higher quality estimate of the effect size, a subset meta-analysis was conducted, excluding those trials that were judged to be at high risk-of-bias (Fig. 4c). This model again estimated no significant effect of treatment (dz = −0.18, 95% CI: −0.64, 0.28; I^2 = 41.45%, Tau^2 = 0.12).

Table 1 notes: The studies are ordered by time of publication. The first 16 rows list the studies included in the main random effects model. Studies in the last 4 rows were eligible according to their design but had insufficient data available for effect size calculations. Abbreviations: PA: Prism adaptation; Na: No data available; TAU: treatment as usual. 'N in cancellation/BIT-C model (PA/Control)' refers to the number of relevant patients (right hemisphere stroke with left neglect) for the main meta-analysis (BIT-C/cancellation model) and 'N in CBS model (PA/Control)' refers to the number of relevant patients for the CBS model, which is not necessarily equivalent to the number of patients in the original study. 'Single blind' indicates that patients were not explicitly informed about the group they were allocated to, which does not necessarily mean they were unaware of the assignment. 'Double blind' indicates that neither the patient nor the assessor was explicitly told about the treatment assignment. Where possible, the days post-stroke values were extracted only for the patients relevant to our main model. If time post-stroke was reported originally in months, a month was considered to be 30 days. When the number of dropouts was reported before the characteristics of included patients, it was unclear how many of the dropped-out patients were relevant to the analysis. Consequently, due to the lack of precise data, dropouts were regarded as the total number of dropouts from the relevant time points (between the start and the end of treatment, not considering any follow-up time points) regardless of whether the patients dropping out were relevant to the analysis. The number of days between the end of treatment and post-treatment measures refers to the first post-treatment measure taken regardless of how many further measures were taken afterwards; the 'Na' values indicate that the time passed after treatment was not mentioned, which likely indicates that measures were collected immediately after treatment.

Random effect meta-analysis on the short-term effect measured by the Catherine Bergego Scale
Eight studies reported sufficient data to be included in the model assessing the short-term effect of the treatment using the CBS (Fig. 5a), which was too few to explore outliers, heterogeneity, or the effect of potential moderator variables adequately. Hedges' g was used as the measure of effect size, and no imputations were required during effect size extraction for this model. The correlation between raw effect size in percentage and the standardised effect size was strong enough (r = 0.97) to allow the overall effect size estimate to be translated into the units of the clinical neglect scale used. The best-fitting linear equation predicts that one unit of effect size (Hedges' g) would correspond to 5.3 points on the CBS.
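For readers unfamiliar with the measure, Hedges' g can be sketched as Cohen's d multiplied by a small-sample correction factor. The example below uses the standard textbook formulas with a pooled standard deviation; whether the effect sizes in this meta-analysis were computed from change scores or post-treatment scores, and with which variance formula, is not specified in this section, so the function and the input values are illustrative assumptions.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return Hedges' g and its approximate sampling variance.

    mean_t, sd_t, n_t: treatment group summary statistics.
    mean_c, sd_c, n_c: control group summary statistics.
    """
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd  # Cohen's d
    j = 1.0 - 3.0 / (4.0 * df - 1.0)   # small-sample correction factor
    g = j * d
    # Large-sample approximation to the variance of d, scaled by j^2
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2.0 * (n_t + n_c))
    return g, j ** 2 * var_d


# Hypothetical summary statistics for two groups of 13 patients each.
g, var_g = hedges_g(mean_t=10.0, mean_c=8.0, sd_t=4.0, sd_c=4.0, n_t=13, n_c=13)
print(f"g = {g:.3f}, variance = {var_g:.3f}")
```

With groups this small, the correction shrinks d = 0.50 to g of about 0.48, and the sampling variance is large relative to the effect.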
The overall effect size estimate was not significantly different from zero (dz = 0.26, 95% CI: −0.04, 0.56), and there was low heterogeneity between studies (I^2 = 26.12%, Tau^2 = 0.05). For a higher quality estimate of the effect size, a subset meta-analysis was conducted, excluding those trials that were judged to be at high risk-of-bias (Fig. 5b). This model again estimated no significant effect of treatment (dz = −0.03, 95% CI: −0.55, 0.49; I^2 = 46.6%, Tau^2 = 0.10).

Main outcomes
The present meta-analysis has found no evidence for positive short-term effects of prism adaptation treatment on spatial neglect, as measured by conventional neuropsychological assessments (BIT-C and cancellation tasks). Nor did we find evidence for therapeutic effects as measured by a standard assessment of neglect in everyday activities (CBS), albeit half as many studies were available for this subsidiary analysis. Null effects were consistent after the removal of influential outliers, when studies with high risk-of-bias were excluded, and when an alternative measure of effect size was considered (Glass' delta versus Hedges' g). We included a more homogeneous patient sample than previous meta-analyses, considering only stroke patients with right hemisphere damage and left-sided neglect, the canonical target group for prism therapy. Even so, by taking advantage of the fact that the conventional subtests of the BIT are mostly cancellation tasks, we were able to include more studies than any previous meta-analysis on the topic (Longley et al., 2021; Qiu et al., 2021). This result is not encouraging, but it is also not definitive, particularly considering the challenges in quantitatively combining evidence across the literature. The general lack of standardisation of study designs, treatment protocols, and outcome measures meant that several compromises had to be made, for instance combining results from the BIT-C with those of various cancellation procedures. We believe this critical step was justifiable given that 89% of the BIT-C score is determined by cancellation tasks, and it allowed us to combine evidence from a much larger set of studies than previous meta-analyses. For instance, Li and colleagues' (2020) meta-analysis did suggest a short-term effect of prism adaptation on neglect as assessed with paper-and-pencil tests, but the number of studies included was extremely low (a maximum of three per outcome measure). The present approach allowed us to include 16 studies in our full model, with five studies in the most reduced model, and seems likely to provide a more realistic estimate of true treatment effects.

Table 2
Components and the Overall Risk-of-bias Judgment for the 16 Studies Included in the Main Model.
The studies are ordered by time of publication. Abbreviations: H: High risk-of-bias (red cells); SC: Some concern about bias (yellow cells); L: Low risk-of-bias (green cells). In domain 5, L* indicates the studies that were not judged to be at low risk for this domain based on the information available in the paper, but for which the judgment was overridden when data were shared by the authors.

Overcoming potential biases in source literature
Another strength of our focus on cancellation behaviour is that it may help to neutralise potential biases within the source literature. A qualitative reading of the 16 articles included in our meta-analysis would find a suggestion of positive treatment effects in most cases. However, all but two of these same articles (Hreha et al., 2018; Serino et al., 2006) showed a non-significant effect within our meta-analysis at an alpha level of .05 (see Fig. 4a). Our decision to focus on cancellation measures (including BIT-C) meant that we extracted a measure of effect size from each study which, whilst of primary relevance to the clinical symptoms of neglect, was rarely the main focus of the study as reported. That is, we selected this standard outcome measure for all studies regardless of how important it was considered in the original papers. The overview we thereby obtained may give a very different impression than would be gained from a qualitative reading of the source material, which usually tends to emphasise the most encouraging outcomes from amongst the various measures that may have been included (e.g., see Gammeri et al., 2020 for a qualitative review). This tendency could only be assessed based on the measures reported, as out of the 16 included trials, only three had been pre-registered or had a published protocol with outcome measures included (Longley et al., 2023; Goedert et al., 2020; and Ten Brink et al., 2017). The lack of bias in the effect size estimates that we extracted was suggested by the funnel plot visualisation in Fig. 3 (and by Egger's test of asymmetry).
This neutralising of potential bias may also be important in the context of the finding that the risk-of-bias assessment noted some cause for concern across almost all of the studies considered, with the majority of studies at high risk-of-bias, and only one study at low risk across the board (see Table 2). This overall high risk-of-bias in the literature likely reflects the difficulty of doing high-quality RCTs for a complex clinical condition such as neglect, but nonetheless demonstrates the need for higher quality evidence on critical questions of rehabilitation.
It is also worth noting that some controlled studies that reported significant effects of prism treatment on neglect made no direct statistical comparison between treatment and control groups (e.g., Frassinetti et al., 2002), which can lead to unwarranted conclusions (see Nieuwenhuis et al., 2011).In other cases, significant differences may have emerged only after splitting patients into post-hoc sub-groups, for instance, based on neglect severity (Facchin et al., 2019;Mizuno et al., 2011) or lesion location (Goedert et al., 2020), or when restricting the analysis to specific sub-sets of conditions (e.g., Làdavas et al., 2011).Post-hoc sub-group analyses may be useful for the exploration of potential moderators and can be invaluable in generating testable hypotheses about the factors influencing therapeutic response, but they do not in themselves allow strong conclusions on the effects of prism adaptation on neglect recovery.Especially with the typically small sample sizes involved, the possibility of false positive results from post-hoc analyses is high.

Potential moderators of therapeutic effects
Our attempts to investigate potential treatment and patient variables influencing therapeutic responses found no clear candidate moderators. As already noted, the strength of prisms and the number of treatment sessions did not modulate the effect size. In terms of prism strength, one explanation could be that the range of strengths used was small (SD = 2.5 degrees of visual angle), so there might not have been enough variance within the data for it to show a relationship with neglect scores. The number of sessions, however, ranged between 1 and 20 and was relatively well-distributed. Similarly, the type of pointing movements made during an adaptation session was quite standard across studies,² but their number varied between 60 and 100 (M = 86.7, SD = 11.75). The lack of moderator effects here seems more likely to suggest that they do not influence therapeutic effects because these effects themselves are subtle or absent. Of course, the sample of available studies was small and studies typically varied on multiple dimensions, so it is also possible that the data were simply too insensitive to detect real moderator effects.
In addition to potential moderators related to the treatment, which tend to vary at the study level, the class of potential moderators related to individual patient factors is also critical. Unfortunately, our ability to estimate these factors was hampered by the fact that the relevant variation was at the patient level, but only group-aggregate data were generally available. For instance, patients may vary in the degree of sensorimotor adaptation that they show to the prisms, and this could influence the therapeutic response. However, although all but two of the included studies (Mancuso et al., 2012; Mizuno et al., 2011) stated that sensorimotor adaptation was confirmed after each treatment session, many studies did not describe how this was measured and most reported no quantitative measures of the sensorimotor aftereffect or patient-level data. A few studies did assess the correlation between sensorimotor aftereffects and reduction of spatial neglect following prism adaptation, but no significant relationships were found (Goedert et al., 2020; Nys, 2008; Serino et al., 2006, 2007). This is in line with the lack of correlation between sensorimotor aftereffects and higher-level 'cognitive' aftereffects of prism adaptation in healthy individuals (Michel, 2016; McIntosh et al., 2023).
Although the sensorimotor aftereffect of prism adaptation has never been found to predict treatment effects of prism adaptation, Serino and colleagues have claimed that the degree of direct error reduction (i.e., the online correction of pointing movements during the adaptation procedure, for instance, measured as the difference in pointing error between the start and end of prism exposure) does predict the treatment effect (Serino et al., 2006, 2007, 2009). Serino et al. (2006) reported a relation between error reduction and BIT scores in patients with neglect during the first week of the treatment, whereas this relation was not found when the entire treatment period was considered. Within a separate study, Serino et al. (2007) divided people with neglect into two groups based on the level of error reduction during the first week of treatment and found that the group with stronger error reduction in the first week showed more improvement on the BIT from pre- to post-treatment. However, these groups were unbalanced, with only five patients in the low error-reduction ("not adapting") group versus 15 in the high error-reduction ("adapting") group, which calls into question the reliability of the observed difference.
In addition, differences between adapting versus non-adapting patients could be the result of the selection of a certain patient characteristic (i.e., being able to adapt to prisms, possibly indicating a certain degree of brain plasticity) that might predict outcome (i.e., spontaneous neglect recovery), independent of the effect of prism adaptation itself.The same holds for dividing patients based upon the presence or strength of the sensorimotor aftereffect (Serino et al., 2007), if not compared with an appropriate control group.It may in fact be impossible to define a comparable control group in which no prism adaptation is experienced because the categorization (adapting versus non-adapting) is based on the response to prism adaptation.A solution could be to provide a single session of prism adaptation to all patients and based upon the response in this session categorize patients as adapters or non-adapters.These subgroups could then be equally assigned to a control group or experimental group, after which effects of 10-20 additional sessions of sham versus prism adaptation on neglect recovery could be compared.
In the present meta-analysis, only one variable showed a suggestive correlation with treatment effect size: patient groups who were on average tested at a later time post-stroke showed more reduction of neglect following prism adaptation. The evidence for this relationship is only at the group aggregate level and after the removal of one outlying study (Fig. 2d). A possible explanation is that differential effects of prism adaptation between experimental and control groups may be more difficult to observe in the early stages post-stroke due to spontaneous recovery (Nijboer et al., 2013; Ringman et al., 2004) but become apparent in the chronic stage, when neglect is otherwise more stable. Despite this suggestive pattern, we did not include time post-stroke as a moderator in our meta-analysis for two main reasons other than the lack of patient-level data. First, there was generally large variation in the time post-stroke within studies, so group averages are not necessarily representative. Second, in several studies, the time post-stroke differed between patients in the control versus experimental groups, which further undermined the possibility of taking an overall mean from each study (Goedert et al., 2020; Làdavas et al., 2011; Serino et al., 2006, 2009; Vangkilde and Habekost, 2010). Nonetheless, given the slight suggestion that prism adaptation effects may be more visible in the chronic stage, it may be beneficial for future research to focus on patients with chronic neglect (or at least to have a well-defined chronic subgroup). Moreover, to provide the fullest potential to assess the role of patient-level variables, including not only time post-stroke but also (for instance) neglect severity and lesion location (Goedert et al., 2020; Mizuno et al., 2011), studies should report these data at the patient level, and where possible share the patient-level information to facilitate more informative meta-analyses.

Sufficient sample size for testing the prism effect
A novel controlled trial to detect the average effect size estimated from our main meta-analysis (0.40), at a power of .8 (two-tailed, alpha .05), would require 100 patients in each of the treatment and control groups (200 patients total). This would be more than four times the scale of the largest RCT yet conducted (Ten Brink et al., 2017). However, it is arguable that this effect size would be too small to be worth investigating, and that a larger effect size would be required for the treatment effect to be clinically important. The cut-off for a minimal level of neglect on the BIT-C is a score of 129 from 146, representing a decrement of 17 points, or 11.6% of the total score (Wilson et al., 1987). If we were to treat this as a minimally clinically relevant level of neglect, then 11.6% could also be regarded as the minimal treatment effect of clinical relevance. Based on the best-fitting linear association between the raw and standardised mean difference effect sizes (see 3.3), an 11.6% change would correspond to a standardised mean difference of 0.61. A novel trial to detect this minimally clinically-relevant effect, at a power of .8 (two-tailed, alpha .05), would require 44 patients in each of the treatment and control groups (88 patients total), around twice the scale of the largest extant RCT.
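The sample sizes above can be approximated with the usual normal-approximation formula for comparing two independent group means, n per group = 2(z_{1-alpha/2} + z_{power})^2 / d^2. The sketch below is a rough check only and is not the calculation used for the reported figures; it assumes equal group sizes, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate patients needed per group for a two-tailed, two-sample test.

    Uses the normal approximation, which slightly underestimates the sample
    size obtained from an exact calculation based on the t distribution.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_power) ** 2 / d ** 2)


for d in (0.40, 0.61):
    n = n_per_group(d)
    print(f"d = {d:.2f}: about {n} per group ({2 * n} total)")
```

The approximation gives 99 and 43 patients per group, one fewer in each case than the figures of 100 and 44 stated above, a difference of the size expected when an exact t-based calculation is used instead.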

Conclusion
This meta-analysis did not find support for the routine use of prism adaptation as therapy for spatial neglect. We found no clear evidence for short-term therapeutic benefits, which makes it highly unlikely that longer-term benefits exist and therefore that prism adaptation is an effective treatment. The clarity of the conclusions that we can reach is necessarily limited by the quality and coherence of the available evidence, and a null result from a relatively small number of often small studies cannot be definitive. It remains possible that prism adaptation does provide genuine therapeutic benefits for at least some patients; on the other hand, if these effects were strong or general, we would not expect the question to remain so open after nearly 25 years of research on the topic. Given this state of affairs, it could legitimately be debated whether it is worthwhile to commit further research resources in this direction. However, the difficulty in quantitatively combining the results of the existing literature highlights the need for a more standardised approach if any future work of this kind were to be done. Well-controlled trials, sufficient sample sizes to detect a minimal treatment effect size of interest, and patient-level data sharing would be required to answer some of the main questions (e.g., the possibility of larger effects at chronic stages). Formal pre-registration of study design would also help to clarify the distinction between confirmatory and exploratory aspects of studies, to limit the scope for selective reporting, and to generate an unbiased database on this important topic.

² Repetitive fast pointing movements were typically made towards targets to the left, centre, and right, with one exception, in which patients performed 60 trials of line bisection and circle crossing as the adaptation routine whilst wearing prisms (Goedert et al., 2020).

Fig. 1. Flow diagram of the literature search and selection process. Data from one registered trial were available but, as this was later published, we were able to assess the full text and this study was counted as one of the 51 full-text reports sought for retrieval.

Fig. 2. The relationship between potential moderator variables and the standardised effect size. Panel A: prism strength. Panel B: number of sessions. Panel C: number of pointing movements. Panel D: number of days since stroke (mean). The dotted line marks 90 days. All Pearson's correlations (r) are reported with the outlier study (Hreha et al., 2018; depicted by the unfilled circle in the figures) removed. Spearman correlation coefficients (ρ) are reported with all available data points included. See Table 1 for details on data availability.

Fig. 3. Funnel Plot of Standardised Effect Sizes of Studies Included in the Meta-Analysis Against Their Standard Error. The inward sloping lines define the region within which 95% of points are expected to lie in the absence of publication bias.

Fig. 4. Forest Plots of the Initial and Sub-group Random Effect Models on the Short-term Scores of Conventional Behavioural Inattention Test/Cancellation Tasks. Panel A: Initial model including 16 studies and 430 patients across treatment and control groups; Panel B: excluding an outlier study (Hreha et al., 2018), including 15 studies and 417 patients across treatment and control groups; and Panel C: excluding studies with high risk-of-bias, including 5 studies and 189 patients across groups. The size of the boxes represents the weight of each study in the meta-analysis, error bars indicate confidence intervals (CI), and the diamond shape is the overall estimated effect. A positive value indicates that prism adaptation elicited a larger reduction of neglect symptoms than the control treatment, and a negative value indicates the opposite relationship.

Fig. 5. Forest Plots of the Initial and Lower Risk-Of-Bias Sub-Group Random Effect Models on Catherine Bergego Scale Scores. Panel A: Forest plot of the short-term CBS random effects model, including 8 studies and altogether 250 patients across treatment and control groups. Panel B: CBS model excluding studies with high risk-of-bias, including three studies and 151 patients across groups. The size of the boxes represents the weight of each study in the meta-analysis, error bars indicate confidence intervals (CI), and the diamond shape is the overall estimated effect. A positive value indicates that prism adaptation elicited a larger reduction of neglect symptoms than the control treatment, and a negative value indicates the opposite relationship.

Table 1
Characteristics of the 20 eligible studies.