Validation of the PHQ-9 in adults with dissociative seizures
Journal of Psychosomatic Research

Background: The PHQ-9 is a self-administered depression screening instrument. Little is known about its utility and accuracy in detecting depression in adults with dissociative seizures (DS).
Objectives: Using the Mini-International Neuropsychiatric Interview as a reference, we evaluated the diagnostic accuracy of the PHQ-9 in adults with DS, and examined its convergent and discriminant validity and uniformity.
Methods: Our sample comprised 368 people with DS who completed the pre-randomisation assessment of the CODES trial. The uniformity of the PHQ-9 was determined using factor analysis for categorical data. Optimal cut-offs were determined using the area under the curve (AUC), Youden Index, and diagnostic odds ratio (DOR). Convergent and discriminant validity were assessed against pre-randomisation measures.
Results: Internal consistency of the PHQ-9 was high (α = 0.87). While the diagnostic odds ratio suggested that a cut-off of ≥ 10 had the best predictive performance (DOR = 14.7), specificity at this cut-off was only 0.49. AUC (0.74) and Youden Index (0.48) suggested a ≥ 13 cut-off would yield an optimal sensitivity (0.81) and specificity (0.67) balance. However, a cut-off score of ≥ 20 would be required to match the specificity resulting from a cut-off of ≥ 13 in other medical conditions. We found good convergent and discriminant validity and one main factor for the PHQ-9.
Conclusions: In terms of internal consistency and structure, our findings were consistent with previous validation studies but indicated that a higher cut-off would be required to identify DS patients with depression with a specificity similar to that achieved with PHQ-9 screening in different clinical and non-clinical populations.


Introduction
Dissociative seizures (DS) superficially resemble epileptic seizures but are not caused by abnormal electrical discharges in the brain. DS are interpreted as an experiential and behavioural dissociative response to arousal and perceived as non-volitional [1]. Together with epilepsy and syncope, DS account for 90% of clinical presentations with transient loss of consciousness. The prevalence of DS in the general population has been estimated as 50/100,000 [2]. Depression is one of the commonest comorbid disorders in patients with DS [3][4][5]. Prevalence rates of depression in adults with DS range from 21% to 60% and are higher than in the general population and in patients with epilepsy [6]. Health-related quality of life (HRQoL) is lower in patients with DS than in those with epilepsy [7] and is closely correlated with depression [8]. A recent systematic review found that depression correlated more closely with HRQoL than with seizure-related factors, and lower depression scores were the only determinant of higher HRQoL [9], suggesting that successful treatment of depression may improve clinical outcomes in patients with DS [10].
Unanswered questions remain about how to optimally identify clinically relevant depression, and what contribution self-report tools can make to this. A recent meta-analysis compared studies based on self-report methods and clinical diagnoses and highlighted clear disparities in the reported prevalence of depression in adults with DS depending on the case ascertainment method [5]. A number of factors could adversely affect the accuracy of self-report measures of depression in this population; there is also the pragmatic issue of availability of expertise. For many patients, assessment by an experienced consultant neuropsychiatrist is simply not available and diagnostic support tools are a necessary part of service provision. Among other factors that may affect accuracy of assessment, psychological aspects of DS are often stigmatised [11,12], and patients may find it difficult to admit to negative emotions. Depressive symptoms may also present atypically in this population and, compared to patients with epilepsy, DS patients may be more likely to endorse somatic rather than affective symptoms of depression [5]. These characteristics and the high levels of somatic symptoms reported by many patients in addition to their DS could affect the specificity of screening tools for depression [13].
The Patient Health Questionnaire-9 (PHQ-9) [14] is a brief, self-administered tool that is widely used to screen for depressive symptoms in clinical and research settings. For instance, in the United Kingdom the PHQ-9 is not only employed by providers of "Improving Access to Psychological Therapies" (IAPT) to assess patients' suitability for particular services but also as an outcome and performance measure of IAPT services. Although it has been widely used in clinical and non-clinical populations, current evidence regarding optimal PHQ-9 cut-offs for depression screening accuracy remains inconclusive. Due to the high specificity and sensitivity reported in the original validation study, a cut-off score of ≥10 has become the recommended threshold for the identification of individuals likely to have clinically significant depression [14]; however, inflexible adherence to a specific cut-off has been criticised by the same authors [15]. Two individual patient data meta-analyses across different patient populations have generated different values as optimal cut-offs: a PHQ-9 ≥ 10 [16], or PHQ-9 ≥ 14 [17].
To our knowledge, no study has yet sought to validate the use of the PHQ-9 and determine optimal cut-off scores in a DS population. Thus, its utility as a screening instrument and sensitive treatment outcome measure remains uncertain in this patient group. The primary aim of this study was, therefore, to measure the sensitivity and specificity of the PHQ-9 in detecting depression in patients with DS, using the Mini-International Neuropsychiatric Interview (M.I.N.I.) [18] as a "gold standard" diagnostic reference.
Previous studies indicate that the PHQ-9's cognitive/affective (PHQ-9/CA) and somatic (PHQ-9/S) subscales may differentially contribute to depression screening accuracy in patients with epilepsy [19,20]. Therefore, secondary aims of our study included an examination of the uniformity of the PHQ-9 scale through exploratory and confirmatory factor analysis, as well as an examination of the convergent and discriminant (construct) validity of the PHQ-9 when used in people with DS.

Participants
The patients who contributed self-report and interview data for this analysis were participating in the CODES trial, a multi-centre randomised controlled trial (RCT) comparing the clinical and cost effectiveness of Cognitive Behavioural Therapy (CBT) plus standardized medical care (SMC) and SMC alone in adults with DS. Patient data were captured at the point of recruitment into the intervention phase of this study between January 2015 and May 2017 (see previous publications for details of the trial and recruitment procedure) [21][22][23][24]. Inclusion and exclusion criteria are given in Supplementary Material 1. Written informed consent was obtained from all participants at each phase of the CODES study and ethical approval was granted by the NRES London-Camberwell St Giles Ethics Committee (13/LO/1595).

Measures
The Mini-International Neuropsychiatric Interview (M.I.N.I.; version 6) [18] was used. It is a structured interview for psychiatric diagnoses, based on DSM-IV criteria and the International Classification of Diseases-10 (ICD-10) [25]. It comprises 17 modules, each with 8-10 questions, which measure the symptoms of common psychiatric disorders including major depression. Its reliability and validity are well established [26]. Interviewers were postgraduate research assistants with different training and professional backgrounds. All interviewers had previous direct experience with patients and received a full day of training to administer the M.I.N.I. in compliance with the administration procedures laid out in the M.I.N.I. manual. A similar approach to validating the PHQ-9 against the M.I.N.I. has previously been undertaken in patients with epilepsy [20]. The M.I.N.I. was chosen for the main CODES trial, and therefore for this study, due to its ease of use and previously documented validity in comparison studies with more elaborate diagnostic instruments [27]. In view of our aim to collect a considerable amount of outcome data, we were keen to minimise the assessment load for study participants as much as possible. Although all patients were also assessed by psychiatrists, this assessment was not on the same occasion as the completion of the M.I.N.I. and PHQ-9. Furthermore, the protocol for this pragmatic trial did not require psychiatrists to complete a structured diagnostic assessment.
The PHQ-9 [14] has been widely used to screen for major depression and grade depressive symptom severity. It includes questions about each of the nine DSM-IV criteria for depression: anhedonia, low mood, fatigue, poor appetite, poor concentration, low self-esteem, hyper or hypoactive behaviours, sleep disturbances and suicidality. Responses range from 0 (not at all) to 3 (nearly every day), with total scores ≥10 typically taken to indicate clinical depression. The PHQ-9 is perceived as reliable, with good sensitivity and specificity across a range of medical settings and populations [28], including patients with epilepsy [20,29]. It has been shown to have good construct validity and can be useful as a categorical or continuous measure of depression severity [14,30,31]. While there is evidence that the PHQ-2 (comprising only the two first items of PHQ-9) could be of similar value as the PHQ-9 as a screening tool for depression [32], we chose the PHQ-9 for this study because it is a more widely used instrument, captures a broader range of (potentially relevant) symptoms and contains item 9 which is appreciated by some services as an independent "risk" item ("Thoughts that you would be better off dead, or hurting yourself").
The Generalised Anxiety Disorder Assessment-7 (GAD-7) [33] is a reliable and valid seven-item scale that can be used as a screening tool and severity measure for generalised anxiety disorder.
The Clinical Outcomes in Routine Evaluation-10 (CORE-10) [34] is a 10-item scale used to measure psychological distress and screen for mental illness. It has high internal reliability and validity [34].
A modified version of the Patient Health Questionnaire-15 (PHQ-15) [35] was used to identify somatic symptoms. This modification comprises 15 items referring to common somatic symptoms (excluding upper respiratory infections), 10 relating to neurological symptoms [36] and five referring to psychological symptoms derived from the Prime MD Questionnaire [37]. This is a dichotomous version that asked patients if they were 'bothered a lot' by each of the symptoms over the past month (0 = No, 1 = Yes).
The Beliefs About Emotions Scale (BES) [38] was used to measure beliefs about the unacceptability of experiencing or expressing negative emotions. The scale has good validity and internal consistency [38].
The Work and Social Adjustment Scale (WSAS) [39] assesses the ability to perform day-to-day tasks including work, home management, relationship interaction, and social and private leisure activities. High validity and test-retest reliability have been reported for the WSAS [39].
The Short Form 12-item Health Survey version 2 (SF-12v2) [40] is a widely-used measure of health-related quality of life (HRQoL). It comprises eight subscales assessing mental and physical health domains and gives rise to two summary scores: the Mental Component Summary (MCS) and the Physical Component Summary (PCS). The SF-12v2 has good reliability and validity in measuring health status [41].

Statistical analyses
Before tackling the primary aim of this study, we explored the factor structure of the PHQ-9 in our DS population. To this end, the sample was randomly divided into two parts for exploratory and confirmatory factor analyses (EFA and CFA). In both cases, item factor analysis for categorical data was used, with the weighted least squares estimator (WLSMV [45]). Both split samples were sufficiently large (N = 169 and N = 199) for factor analyses since they captured over 100 responses, with Ns 18 or 22 times greater than the number of items [46].
Measures of fit that are reported include: relative chi-square (relative  [48]). Analyses were conducted in MPLUS [50]. The factor number in the EFA was explored using the Guttman-Kaiser criterion [51,52], parallel analysis [53] for categorical data using the R package 'random.polychor.pa' [54], Cattell's scree plot [55] and goodness of fit indices.
In the CFA sample we fitted the unidimensional model, a two-factor model suggested by EFA and a bifactor model combining the two. Following the procedure described by Reise et al. [56], we fitted the bifactor model to resolve the dimensionality problem in our data. A multiple indicators multiple causes model (MIMIC; [57]) was fitted to the complete data to explore potential measurement non-invariance (biased measurements) with respect to age and gender (one adjusting for the other).
In order to address the primary aim of our study, the diagnostic accuracy of the PHQ-9 total score was assessed against Current Major Depressive Disorder diagnoses from the M.I.N.I.
The area under the receiver operating characteristics (ROC) curve (AUC) [58] was used to test the performance of the PHQ-9 at various cut-off points using Stata V16.0 (Stata, Texas).
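As an illustration of what the AUC summarises, the sketch below computes it as the probability that a randomly chosen depressed case scores higher on the screening measure than a randomly chosen non-depressed case (ties count as one half). This is a minimal pure-Python sketch, not the Stata routine used in the study; the function name `roc_auc` and the toy scores and labels are hypothetical and are not trial data.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: P(score of a positive case > score of a negative case).

    scores: screening totals (e.g. PHQ-9 totals)
    labels: 1 = depressed on the reference interview, 0 = not depressed
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only (not from the CODES trial).
toy_scores = [4, 9, 12, 15, 18, 22, 7, 13, 20, 11]
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(round(toy_auc, 2))  # 0.92
```

The pairwise form is quadratic in sample size, which is acceptable for a few hundred participants and keeps the definition visible.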
There are many ways to determine optimal cut-off points. To help balance sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV), we calculated the following summary statistics: PROC01, the distance between the point on the ROC curve and (0,1), which should be minimised; the Youden index (sensitivity + specificity − 1), which should be as close to 1 as possible; and the diagnostic odds ratio (DOR), which should be maximised as it indicates the predictive performance of the PHQ-9 cut-off point. More specifically, the DOR compares the odds of being classed 'depressed' on the PHQ-9 for a patient categorised as 'depressed' in the M.I.N.I. with the odds for a patient not found to be depressed.
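The summary statistics described above can be sketched from a single sensitivity/specificity pair. The function names are hypothetical, and two definitions are assumptions rather than the authors' stated formulas: the distance to (0,1) is taken as Euclidean, and the AUC for a single cut-off is taken as the area under the two-segment ROC curve through that point, which equals (sensitivity + specificity) / 2. The usage lines plug in the sensitivity (0.81) and specificity (0.67) reported for the ≥13 cut-off.

```python
import math


def youden_index(sens, spec):
    # Youden index: sensitivity + specificity - 1 (closer to 1 is better).
    return sens + spec - 1.0


def distance_to_corner(sens, spec):
    # Assumed Euclidean distance from the ROC point (1 - spec, sens)
    # to the ideal corner (0, 1); smaller is better.
    return math.sqrt((1.0 - sens) ** 2 + (1.0 - spec) ** 2)


def diagnostic_odds_ratio(sens, spec):
    # Odds of a positive screen among reference-positive cases divided by
    # the odds of a positive screen among reference-negative cases.
    # Undefined when sens == 1 or spec == 1 (division by zero).
    return (sens / (1.0 - sens)) / ((1.0 - spec) / spec)


def single_cutoff_auc(sens, spec):
    # Area under the two-segment ROC curve through one cut-off point.
    return (sens + spec) / 2.0


# Values reported in this paper for the >= 13 cut-off.
print(round(youden_index(0.81, 0.67), 2))       # 0.48
print(round(single_cutoff_auc(0.81, 0.67), 2))  # 0.74
```

Both printed values agree with the Youden index and AUC reported for this cut-off, which suggests, but does not confirm, that the single-cut-off AUC was computed in this way.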
Reliability of the PHQ-9 was assessed using Cronbach's alpha [59] and item-total correlations. Concurrent (convergent or discriminant) validity was assessed using parametric or non-parametric correlations as appropriate.

Results
Table 1 shows basic demographic information for our sample. Further information can be found elsewhere [22]. Table 2 presents scores on measures of psychological distress and somatic symptoms (PHQ-9, GAD-7, CORE-10, Modified PHQ-15), Beliefs about Emotions, psychosocial functioning (WSAS) and health-related quality of life (SF-12v2 and EQ-5D-5L VAS). Further classifications of responses on the EQ-5D-5L are summarised in Supplementary Material 2.

Dimensionality and internal consistency
Parallel analysis indicated that the 1-factor solution was suitable for our data (see scree plot in Supplementary Material 3). One eigenvalue was larger than 1 (5.22), but the second was also close to 1 (0.96). Therefore, we fitted both the 1- and the 2-factor models. As presented in Table 3, a closer fit was achieved for the 2-factor model, but the fit of the 1-factor model was adequate. These results were replicated in CFA, where both models fitted well. However, as presented in Table 4, the item loadings on the 1-factor model were similar to the loadings of the items on the general factor of the bifactor model. According to Reise et al. [56], this indicates that there is no loss of information if a 1-factor model is applied. Secondly, the loadings of the items on the general factor of the bifactor model were substantially larger than their loadings on the specific factors they were assigned to. According to Reise et al. [56], this indicates that specific factors reflected a general trait. Finally, the loadings of the items on their designated factors in the 2-factor CFA model (where there was no general factor) were much larger than their loadings on their designated factors in the bifactor model (where there was a general factor). These results suggest that the general factor is the prominent factor and the specific factors essentially reflect the general factor. The PHQ-9 therefore emerges as a unidimensional tool in patients with DS.
In the MIMIC model, two significant direct effects were identified, namely the direct effect of age on item 7 (d.e. = 0.013, p < 0.001) and the direct effect of gender on item 5 (d.e. = 0.3, p = 0.026). As only one item per covariate was found to be non-invariant, with small effect sizes, we conclude that the PHQ-9 is measurement invariant with respect to gender and age.
The internal consistency of the PHQ-9 was satisfactory (α = 0.87), with no increase in alpha if any of the items were deleted and satisfactory individual item-total correlations ranging between 0.44 and 0.70.
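For readers unfamiliar with the statistic, Cronbach's alpha can be sketched as below using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). This is a minimal pure-Python sketch, not the software routine used in the study; the function name `cronbach_alpha` and the three-item toy responses are hypothetical and are not trial data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table.

    rows: list of respondents, each a list of k item scores
          (for the PHQ-9, k = 9 and each score is 0-3).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the total score across respondents.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical toy responses (4 respondents, 3 items), illustration only.
toy_rows = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 3, 3],
]
toy_alpha = cronbach_alpha(toy_rows)
print(round(toy_alpha, 3))  # 0.939
```

The "alpha if item deleted" check mentioned above corresponds to re-running the same function with one item column removed at a time.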

Diagnostic accuracy
Using Current Major Depressive Disorder from the M.I.N.I. as the gold-standard, the overall area under the ROC curve (AUC) of the PHQ-9 in our sample was 0.80 (Fig. 1). Table 5 indicates that, in our sample, using a PHQ-9 cut-off of ≥10 for detecting depression cases has the best predictive performance; i.e. depressed patients have the greatest odds (DOR = 14.7) of a positive result compared to non-depressed patients. This is consistent with the current clinical cut-off. However, this cut-off achieves high sensitivity at the cost of low specificity, whereby only half of the individuals above this threshold will have clinical depression.
Using the AUC, a more global measure, a cut-off of ≥13 has the best performance, with an AUC of 0.74 and a Youden index of 0.48. This may be considered an optimal balance of sensitivity and specificity. The distance between the point on the ROC curve and (0,1) is also better (smaller) if ≥13 is used as the cut-off compared to ≥10 (0.24 vs. 0.30), as is the percentage of agreement (71.4% vs. 62.9%).
Table 6 shows evidence of convergent validity, reflected by high correlations with the GAD-7 (0.779), the CORE-10 (0.795) and the MCS score of the SF-12v2 (−0.711). Discriminant validity is indicated by moderate to low correlations with Beliefs about Emotions, psychosocial functioning, health-related quality of life and the PCS score of the SF-12v2.

Brief summary of diagnostic accuracy findings
To our knowledge, this is the first study assessing the validity of the PHQ-9 for depression screening in adults with DS, comparing it to the M.I.N.I., a structured diagnostic tool. In our sample of DS patients, the PHQ-9 had very good internal consistency (Cronbach's α = 0.87) and area under the curve (AUC = 0.80) values, indicating high overall diagnostic accuracy. While the PHQ-9 should never, in isolation, be used as a diagnostic tool for clinical depression, these findings (and our confirmation of the questionnaire as a unidimensional instrument) suggest that it is a useful screening method for depressive symptoms, allowing users to describe the likelihood of a diagnosis of clinical depression at particular cut-off scores. The 'optimal' PHQ-9 cut-off is likely to be highly dependent on the intended use. If employed as a clinical screening tool to identify depressed patients, a lower cut-off would ensure that few cases are missed, although expert assessments would reveal that many patients scoring above this threshold are not depressed. If used in research with the aim of selecting patients highly likely to be suffering from comorbid major depression, a higher cut-off would be appropriate. With this proviso, the present study found that using a PHQ-9 cut-off of ≥10 for detecting depression cases had the best predictive performance in terms of the diagnostic odds ratio (DOR = 14.7). This is consistent with the widely-used clinical cut-off [14]. However, a cut-off of ≥10 favoured sensitivity over specificity and greatly overestimated the presence of major depression. Using the AUC (a more global measure), we found that a cut-off of ≥13 had the best performance, with an AUC of 0.74 and a Youden index of 0.48, and provided an optimal balance of sensitivity (0.81) and specificity (0.67). At this score, over two-thirds of individuals with DS could be expected to have current major depression.
However, users will need to be aware that, at this cut-off, one in five individuals with major depression is likely to be missed and one in three of those identified as depressed may not be found to be so on further examination. Nevertheless, the specificity of the instrument at a cut-off of ≥13 compares favourably with the poor specificity value of 0.49 at the 'standard' cut-off score of ≥10. Users should also be aware that the PHQ-9 identifies depressive symptoms, as indeed does the M.I.N.I. when used as per our protocol, rather than specifically diagnosing major depressive disorder. We would not, for example, suggest commencing pharmacotherapy on this basis, but these tools may have a role to play in triaging referral to an appropriate clinician.

Findings in the context of previous literature
Although this study benefitted from a large sample size, it must be acknowledged that participants were captured at a particular time in their illness journey and may not have been typical of the DS population at large. Given that sensitivity and specificity will depend very much on the prevalence of an observation, it is therefore important to note that, in terms of the distribution of PHQ-9 scores [5], but also in terms of age and gender distribution, comorbidity and social profile, our findings are consistent with those among other unselected DS cohorts [3,6]. Our findings show that a cut-off score of ≥10 overestimates depression diagnoses in patients with DS even more than in other medical conditions [16]. To match the level of specificity of the PHQ-9 in patients with other medical conditions at a cut-off of ≥13, as indicated by Levis et al. [16], we would need a cut-off score of ≥20 in our dataset. Similar findings by the same authors [17] suggested that a higher cut-off score of ≥14 estimated prevalence closest to SCID-based diagnoses, and thus was the most accurate depression estimate.
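The idea of matching a specificity achieved elsewhere can be sketched as a search for the lowest cut-off whose specificity reaches a target value. This is an illustrative sketch only, not the procedure reported by the authors; the function names, the target of 0.8 and the toy scores are hypothetical and are not trial data. The maximum total of 27 follows from nine items each scored 0 to 3.

```python
def specificity_at(scores, labels, cutoff):
    """Proportion of reference-negative cases scoring below the cut-off
    (a score >= cutoff counts as a positive screen)."""
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not neg:
        raise ValueError("need at least one reference-negative case")
    return sum(1 for s in neg if s < cutoff) / len(neg)


def lowest_cutoff_for_specificity(scores, labels, target, max_score=27):
    """Smallest cut-off (1..max_score + 1) whose specificity >= target.

    A cut-off of max_score + 1 classifies everyone as negative, so a
    result is always found for any target <= 1.
    """
    for cutoff in range(1, max_score + 2):
        if specificity_at(scores, labels, cutoff) >= target:
            return cutoff
    return None


# Hypothetical toy data for illustration only (not from the CODES trial).
non_depressed = [3, 6, 8, 11, 12, 14, 16, 19, 21, 24]
depressed = [10, 15, 18, 22, 25]
toy_scores = non_depressed + depressed
toy_labels = [0] * len(non_depressed) + [1] * len(depressed)

print(lowest_cutoff_for_specificity(toy_scores, toy_labels, target=0.8))  # 20
```

When many non-depressed respondents score highly, as in this toy sample, the required cut-off shifts upwards, which is the pattern described above.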
A possible explanation for the much higher cut-offs needed to achieve high specificity of the PHQ-9 in our DS sample is physical symptom confounding. The high scores on the somatic symptom scale (Modified PHQ-15) and the high level of disabilities reported on the PCS score by patients in our study are in keeping with evidence from other large studies of patients with DS demonstrating the high level of physical comorbidities and medication use in this patient group [60]. It is likely that patients diagnosed with DS who have physical comorbidity will need higher cut-off scores compared to patients screened for depression without such comorbidity. Notably, the symptoms addressed in the PHQ-9 (anhedonia, low mood, sleeping difficulties, tiredness, lack of appetite, low self-esteem, poor concentration and moving slowly or being fidgety) can be associated with physical comorbidity, disability or medication [61][62][63]. Support for this comes from previous findings indicating strong correlations between depression scores and somatic symptoms, even where a different version of the PHQ-15 from that used here was employed and in different populations [64,65]. Indeed, we previously indicated [22] that our sample of DS patients most commonly reported the following co-morbid symptoms on the Modified PHQ-15: tiredness and low energy, headaches, sleeping difficulties, memory and concentration problems, worry, pain in the extremities, dizziness, anxiety and depression, in addition to other symptoms. Other factors, including stigmatisation of mental health complaints and response bias, appear prevalent in the DS population [5,11,12] and may explain the low combined sensitivity and specificity of the PHQ-9 in our sample compared to other medical conditions [16]. Indeed, discrepancies between self-report and clinically assessed depression have been previously reported in the DS population [5]. This may be due to DS patients having difficulty accepting the psychological nature of their diagnosis and disclosing negative emotions [11,12]. To a degree, emotional dysregulation and high levels of alexithymia, commonly reported in DS [66], may also compromise the sensitivity of self-report measures.
Table 4. Item factor analysis loadings per model.

Implications of clinical correlates with the PHQ-9
Intercorrelations between the PHQ-9 and other baseline measures provided good evidence of construct validity when considering its use in patients with DS; high correlations were found between the PHQ-9 and the GAD-7 and CORE-10 measures (0.779 and 0.795, respectively) and with the MCS score of the SF-12v2 (−0.711). Of relevance here is the wide overlap between depression, anxiety and somatic symptoms observed more generally in the population [61]. Discriminant validity was indicated by lower correlations between the PHQ-9 and Beliefs about Emotions, WSAS, EQ-5D-5L VAS, and the PCS score of the SF-12v2.

Factor structure
Factor analysis showed that for this population the PHQ-9 scale is unidimensional. To date, studies assessing the factor structure of the PHQ-9 have yielded inconclusive findings. While some studies have reported a two-factor structure [67] and found greater diagnostic accuracy using the PHQ-9/CA subscale compared to the PHQ-9/S [20] (although the diagnostic accuracy of the PHQ-9/CA was similar to that of the entire PHQ-9), several others suggest a unidimensional factor structure [68][69][70]. Since we found one main factor, our results do not support considering a differential role of the cognitive/affective (PHQ-9/CA) and somatic (PHQ-9/S) subscales in diagnosing depression in this population.

Implications of a higher cut-off score
More generally, higher cut-offs ranging from 13 to 15 have been previously reported as 'optimal' across a range of conditions [17,20,67,71,72]. With respect to DS, the use of higher PHQ-9 cut-off scores may help reduce the overestimation of depression diagnoses in DS [5]. Determining the optimal diagnostic accuracy of the PHQ-9 will have important implications for depression screening in primary care services, which in turn, will influence management and treatment outcomes. Nonetheless, whether raising the PHQ-9 cut-off score to ≥13 will maximise accurate screening and mental health outcomes and minimise resource costs and adverse events compared to the traditional cut-off at ≥10 remains unclear.

Strengths and limitations
Our sample, derived from a high number of participating centres, allowed a detailed study of the validity of the PHQ-9 in a large, well-characterised cohort. We also used a cut-off score approach for the PHQ-9, which has been recently shown to be better than alternative methods, such as algorithms, at detecting clinical depression [73]. However, potential limitations of this study also need to be considered. Our eligibility criteria may have led to the exclusion of people with more marked psychopathology along with those who did not want to be randomised. Additionally, we validated self-report measures against the M.I.N.I. and these self-report measures can be susceptible to response bias. While the M.I.N.I. is a well-validated instrument for the diagnosis of depression which has been widely used to validate self-report measures, structured or semi-structured diagnostic instruments administered by experienced psychiatrists may have produced different results; however, the costs of doing so may have impacted on sample sizes [74]. For instance, previous studies indicate that, compared to the SCID, the M.I.N.I. may also overestimate the number of participants having clinical depression across a range of conditions [75,76], which might require the use of an even higher PHQ-9 cut-off score.
When interpreting our study, it is important to be aware that the PHQ-9 and the M.I.N.I. measure somewhat different constructs of depression. The PHQ-9 is a continuous measure of depressive symptomatology, for which scores are interpreted as suggesting the likelihood of a depressive disorder and consequent treatment implications. The M.I.N.I., by contrast, offers a categorical definition of major depressive disorder. This is somewhat akin to the difference between reporting blood pressure as systolic/diastolic values and reporting a group of patients as suffering from hypertension. There are advantages and disadvantages to both constructs. Kroenke and colleagues [14], in their original validation paper, found some problems with mapping one construct onto the other and found a trade-off between sensitivity and specificity when mapping PHQ-9 scores to categorical diagnoses. Importantly, a recent meta-analysis demonstrated that both of these modes of assessment only showed low to moderate correlations with clinical mental health diagnosis (including depression) based on the assessment of an experienced clinician capable of taking account of a much wider range of personal and contextual circumstances [77].
In conclusion, our findings are not inconsistent with other PHQ-9 validation studies, which suggests generalisability of our results, and they indicate that use of the PHQ-9 in assessing depression in patients with DS should be considered, albeit with a higher than normally recommended cut-off score. While the PHQ-9 can be an effective screening instrument in patients with DS, a diagnosis of depression must involve assessment by an appropriately trained clinician.

Funding statement
This paper describes independent research funded by the National Institute for Health Research (Health Technology Assessment programme, 12/26/01, COgnitive behavioural therapy vs standardised medical care for adults with Dissociative non-Epileptic Seizures: A multicentre randomised controlled trial (CODES)). This paper also describes independent research part-funded (LHG, SV, TC) by the National

Declaration of Competing Interest
Markus Reuber is the paid Editor-in-Chief of Seizure-European Journal of Epilepsy and receives authorship fees from Oxford University Press in relation to a number of books about dissociative seizures. Jon Stone reports independent expert testimony work for personal injury and medical negligence claims, royalties from UpToDate for articles on functional neurological disorder and runs a free non-profit self-help website, www.neurosymptoms.org. Alan Carson reports being a paid editor of the Journal of Neurology, Neurosurgery and Psychiatry, and is the director of a research programme on functional neurological disorders; he gives independent testimony in Court on a range of neuropsychiatric topics (50% pursuer, 50% defender). The remaining authors have no conflict of interest to declare.