Comparison of symptom-based versus self-reported diagnostic measures of anxiety and depression disorders in the GLAD and COPING cohorts

Background: Understanding and improving outcomes for people with anxiety or depression often requires large sample sizes. To increase participation and reduce costs, such research is typically unable to utilise "gold-standard" methods to ascertain diagnoses, instead relying on remote, self-report measures. Aims: Assess the comparability of remote diagnostic methods for anxiety and depression disorders commonly used in research. Method: Participants from the UK-based GLAD and COPING NBR cohorts (N = 58,400) completed an online questionnaire between 2018 and 2020. Responses to detailed symptom reports were compared to DSM-5 criteria to generate symptom-based diagnoses of major depressive disorder (MDD), generalised anxiety disorder (GAD), specific phobia, social anxiety disorder, panic disorder, and agoraphobia. Participants also self-reported any prior diagnoses from health professionals, termed self-reported diagnoses. "Any anxiety" included participants with at least one anxiety disorder. Agreement was assessed by calculating accuracy, Cohen's kappa, McNemar's chi-squared, sensitivity, and specificity. Results: Agreement between diagnoses was moderate for MDD, any anxiety, and GAD, but varied by cohort. Agreement was slight to fair for the phobic disorders. Many participants with self-reported GAD did not receive a symptom-based diagnosis. In contrast, symptom-based diagnoses of the phobic disorders were more common than self-reported diagnoses. Conclusions: Agreement for MDD, any anxiety, and GAD was higher for cases in the case-enriched GLAD cohort and for controls in the general population COPING NBR cohort. For anxiety disorders, self-reported diagnoses classified most participants as having GAD, whereas symptom-based diagnoses distributed participants more evenly across the anxiety disorders. Further validation against gold standard measures is required.


Introduction
Anxiety and depressive disorders are common and debilitating, impacting approximately 30% of the population during their lifetime (Bandelow & Michaelis, 2015; Kessler et al., 2005), and accounting for 10% of years lived with disability (World Health Organization, 2017). This highlights the importance of understanding disorder-related risk factors and outcomes. In order to undertake research or treatment of these conditions, a vital step is identifying participants with or without the disorder of interest. The "gold standard" for ascertaining disorder diagnoses in psychiatric research is a structured or semi-structured diagnostic interview conducted in person or over the phone by a trained interviewer, such as the Composite International Diagnostic Interview (CIDI) (World Health Organization, 1990) or Structured Clinical Interview for DSM-5 (SCID) (First, Williams, Karg, & Spitzer, 2015). However, conducting interviews is time-consuming and costly. Due to the heterogeneous and complex aetiology of anxiety and depression, studies often require extremely large samples to reach sufficient statistical power. This renders diagnostic interviews impractical, and large-scale studies increasingly use online, self-report questionnaires to ascertain the anxiety and depressive disorder diagnostic status of participants.
There are two common methods to ascertain a diagnosis when using online questionnaires. Symptom-based diagnoses involve a questionnaire which asks participants to self-report specific symptoms (Davis et al., 2019). The questionnaire responses are then compared to diagnostic criteria, such as those of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association, 2013), to assess whether the participant meets criteria for a diagnosis. This has also been referred to as strictly-defined or detailed diagnosis (Cai et al., 2020; Davis et al., 2019). Self-reported diagnoses take a contrasting approach and utilise a single question in which participants are asked whether they have received a clinical diagnosis from a health professional for a psychiatric disorder during their lifetime (Davis et al., 2019). This is also known as minimal, broad, or light-touch diagnosis (Cai et al., 2020; Hyde et al., 2016). Both symptom-based and self-reported diagnostic methods are in widespread use in anxiety and depression research; however, it is unclear whether they identify the same individuals.
Most work comparing the two methods has been conducted on major depressive disorder (MDD). Participants ascertained using self-reported diagnoses have high genetic overlap with symptom-based or clinically-ascertained MDD samples (Howard et al., 2019; Wray et al., 2018), suggesting comparability between individuals ascertained with the two measures. However, symptom-based MDD has significantly higher heritability than self-reported MDD. Higher heritability means more power to detect significant genetic effects. The higher heritability of symptom-based MDD suggests that there are differences between symptom-based and self-reported diagnoses, and also implies that utilising the self-reported measure could decrease the power to detect genetic effects (Cai et al., 2020; Glanville et al., 2020). If the two methods are not comparable, then measure selection or meta-analyses across cohorts with different ascertainment methods may impact the detection of genetic, as well as other (e.g., demographic, environmental, social), risk factors and outcomes. If instead the two methods are comparable, not only does this support meta-analyses across datasets using these two approaches, but it also reduces the burden on future participants, researchers, and clinicians in ascertaining diagnoses. Understanding the agreement between these measures is an important goal with clear implications across research and clinical fields.
In this study, we compared symptom-based and self-reported lifetime diagnoses for MDD and the five core anxiety disorders (generalised anxiety disorder [GAD], specific phobia, social anxiety disorder, panic disorder, and agoraphobia). Our aim was to assess agreement between these two diagnostic methods to determine whether they can be used interchangeably in research.

Sample
Data were examined from the National Institute for Health Research (NIHR) BioResource cohort (N = 59,161).
This included 41,708 participants who had been recruited as part of the Genetic Links to Anxiety and Depression (GLAD) Study (https://gladstudy.org.uk). The GLAD Study is an online research platform to recruit individuals with a lifetime experience of anxiety and/or depression. Recruitment began in September 2018 and was conducted via traditional and social media campaigns or participating NHS sites.
The remaining 17,453 participants were NIHR BioResource members who had taken part in the COVID-19 Psychiatry and Neurological Genetics study (COPING NBR). This included members of the Inflammatory Bowel Disease cohort (IBD; N = 3,313) and general population cohorts (N = 14,140). There were several methods of initial recruitment to the NIHR BioResource, including through blood donation centres.
Both studies were conducted entirely online. Eligibility was limited to those aged 16 and over who lived in the UK. Eligibility for the GLAD Study also required a lifetime experience of an anxiety or depressive disorder. Assessment occurred at a single stage in which all participants responded to online, self-report questionnaires that included two methods for ascertaining likely anxiety and depressive disorder diagnoses: symptom-based and self-reported. The analyses presented in this paper include data from all participants who completed the GLAD or COPING survey before 10th December 2020. Additional details of the design and implementation of the GLAD Study are described elsewhere (Davies et al., 2019).

Symptom-based diagnoses
Symptom-based diagnoses were evaluated using the MDD, GAD, specific phobia, social anxiety disorder, panic disorder, and agoraphobia modules from an adapted version of the short form Composite International Diagnostic Interview (CIDI-SF) (Kessler, Andrews, Mroczek, Ustun, & Wittchen, 1998), as used in the UK Biobank (Davis et al., 2020) and Australian Genetics of Depression study (Byrne et al., 2020). The CIDI-SF is based on the DSM-5 criteria for the disorders. Some validation studies of the online, self-administered version of the CIDI-SF for MDD have shown comparable agreement between symptom-based MDD and diagnostic interviews (Levinson et al., 2017; Patten, 1997). However, another study found low agreement between the online CIDI-SF and structured interviews for all disorders (MDD, GAD, specific phobia, social anxiety disorder, panic disorder, and agoraphobia) (Carlbring et al., 2002). Algorithms were developed to categorise participants as having a lifetime symptom-based diagnosis for a disorder if their responses on the CIDI-SF corresponded closely to DSM-5 criteria (see Appendix 1 in Supplementary Materials).

Self-reported diagnoses
Self-reported diagnoses were assessed by the question: "Have you ever been diagnosed with one or more of the following mental health problems by a professional, even if you don't have it currently?" Participants were prompted to select all diagnoses that applied or indicate "None of the above". Participants who did not respond to the self-reported measure therefore had missing data for all self-reported diagnoses. Participants were categorised as having a self-reported diagnosis if they selected the most comparable option to the relevant diagnosis (e.g., "Depression" for MDD). Phrasing for each of these items can be found in Appendix 2 in Supplementary Materials. These self-reported diagnoses reflect self-reports of a previous medically-provided diagnosis and were not validated against electronic health records (EHR). Validation studies comparing self-reported diagnoses with structured interviews have found moderate agreement for self-reported MDD (Sanchez-Villegas et al., 2008; Stuart et al., 2014) but poor agreement for self-reported anxiety disorders (McManus, Bebbington, Jenkins, & Brugha, 2016).
For the GLAD cohort, self-reported panic disorder was added partway through data collection and is only included for GLAD participants who completed the questionnaire after 5th November 2018. Participants who signed up before that date were excluded from all agreement analyses for panic disorder. We included self-reported "panic attacks" as well as "panic disorder", and separately compared both to symptom-based panic disorder. We are mindful that panic attacks are transdiagnostic and not specific to panic disorder. Research has shown that patients who have a panic attack are more likely to seek help from physical health professionals (e.g., in hospitals) than mental health services (Katerndahl & Realini, 1995; Wang et al., 2005). However, recognition and diagnosis of panic disorder by physical health professionals is low (Fleet et al., 1996; Lynch & Galbraith, 2003). Given the higher recognition of panic attacks compared to panic disorder, we were interested in comparing agreement between self-reported diagnoses of panic attacks and panic disorder with symptom-based panic disorder.

"Any anxiety" diagnosis
It is common in research to combine the anxiety disorder subtypes into a single category, given that the risk factors and outcomes overlap considerably between them (e.g., Purves et al., 2020). We were interested in assessing agreement of symptom-based and self-reported lifetime diagnoses of "any anxiety" alongside that of the individual anxiety disorders. Symptom-based "any anxiety disorder" was defined as participants with a symptom-based diagnosis for at least one of the individual anxiety disorders (i.e., GAD, specific phobia, social anxiety disorder, panic disorder, or agoraphobia). Self-reported diagnosis of "any anxiety disorder" included participants who self-reported receiving at least one anxiety disorder diagnosis from a health professional.
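As a minimal sketch of this "at least one" rule, the Python function below combines per-disorder indicators into an "any anxiety" indicator. It is an illustration only and not the authors' algorithm (which was written in R and is described in the Supplementary Materials). The function name `any_anxiety` and the use of `None` for missing data are hypothetical, and the handling of missing values (a participant with no positive diagnosis but at least one missing module is treated as missing rather than negative) is an assumption modelled on the treatment of missing data described in the Analysis section.

```python
from typing import Optional, Sequence


def any_anxiety(diagnoses: Sequence[Optional[bool]]) -> Optional[bool]:
    """Combine per-disorder indicators into an 'any anxiety' indicator.

    Each element is True (diagnosis), False (no diagnosis) or None (missing).
    Assumption for illustration: a single positive diagnosis is enough even
    when other modules are missing, whereas a negative result requires every
    module to be observed.
    """
    if any(d is True for d in diagnoses):
        return True   # at least one anxiety disorder present
    if any(d is None for d in diagnoses):
        return None   # cannot rule out a diagnosis on the missing modules
    return False      # all modules observed and all negative


# Order of indicators (hypothetical): GAD, specific phobia, social anxiety
# disorder, panic disorder, agoraphobia.
example = any_anxiety([False, None, True, False, False])
```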

Analysis
We calculated the number of participants with zero, one, two, and three or more symptom-based and self-reported diagnoses. Participants with at least one missing value for a symptom-based diagnosis were included in the frequencies for one, two, or three or more symptom-based diagnoses. However, they were excluded when calculating the number of participants with zero symptom-based diagnoses. Self-reported panic disorder was added partway through GLAD data collection, resulting in 14,858 GLAD participants with missing data on this item. These participants with missing data on self-reported panic disorder were included in all self-reported disorder frequencies. Participants with missing data on the remaining self-reported diagnoses were excluded.
We also assessed the frequency of symptom-based and self-reported diagnoses for each disorder as percentages of the whole sample, excluding participants with missing data on one of the measures for the disorder in question (e.g., a participant with self-reported GAD but missing data for symptom-based GAD was excluded from GAD frequencies for both measures).
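The pairwise exclusion rule described above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' code: the function name `paired_frequencies`, the returned keys, and the use of `None` for missing data are all hypothetical.

```python
from typing import Dict, Optional, Sequence


def paired_frequencies(
    symptom_based: Sequence[Optional[bool]],
    self_reported: Sequence[Optional[bool]],
) -> Dict[str, float]:
    """Percentage with each diagnosis among participants observed on BOTH measures.

    A participant missing either measure for the disorder is dropped from the
    denominator of both percentages, so the two figures are directly comparable.
    """
    complete = [
        (s, r)
        for s, r in zip(symptom_based, self_reported)
        if s is not None and r is not None
    ]
    n = len(complete)
    if n == 0:
        raise ValueError("no participants with data on both measures")
    return {
        "n": n,
        "symptom_based_pct": 100 * sum(s for s, _ in complete) / n,
        "self_reported_pct": 100 * sum(r for _, r in complete) / n,
    }


# Toy data: participants 2 and 3 are each missing one measure and are dropped.
toy = paired_frequencies([True, None, False, True], [True, True, None, False])
```

Using a shared denominator means that a difference between the two percentages reflects disagreement between the measures rather than differences in who answered each one.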
Linear regression models were built to assess associations between demographic variables and missing data for symptom-based diagnoses, and logistic regressions were used to assess associations with self-reported diagnoses.
Agreement and disagreement levels between these two diagnostic methods were assessed by calculating: accuracy (the proportion [%] agreement), Cohen's kappa (Cohen, 1960), McNemar's within-subjects chi-squared test (McNemar, 1947), sensitivity, and specificity. Cohen's kappa quantified reliability between the two methods after accounting for agreement expected by chance. A value of zero indicates agreement no better than chance and a value of one indicates perfect agreement, with higher values indicating greater reliability (Landis & Koch, 1977). McNemar's test assessed whether differences in the predictive accuracy of self-reported and symptom-based diagnoses were statistically significant (α = 0.05). Sensitivity is the proportion of individuals with a disorder that the measure correctly classifies as having a diagnosis (proportion of true positives). In contrast, specificity is the proportion of individuals without a disorder that are correctly classified as not having a diagnosis (proportion of true negatives). Since we lacked a "gold standard" reference in this sample, sensitivity and specificity analyses were conducted in both directions, treating each measure in turn as the reference. We interpreted results as "agreed positives/negatives" and "disagreed positives/negatives".
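To make these statistics concrete, the sketch below computes them from the four cells of a 2 × 2 agreement table. It is a standard-library Python illustration rather than the authors' implementation, which used R and the caret package. The function name `agreement_stats`, the argument names, and the toy counts are hypothetical. McNemar's statistic is computed here without a continuity correction, which is an assumption; the source does not state which variant was used.

```python
import math
from typing import Dict


def agreement_stats(both_pos: int, ref_only: int, test_only: int, both_neg: int) -> Dict[str, float]:
    """Agreement between two binary measures, with one treated as the reference.

    both_pos  : positive on both measures (agreed positives)
    ref_only  : positive on the reference measure only
    test_only : positive on the test measure only
    both_neg  : negative on both measures (agreed negatives)
    Assumes the table is non-empty and both classes occur on the reference.
    """
    n = both_pos + ref_only + test_only + both_neg
    ref_pos, ref_neg = both_pos + ref_only, test_only + both_neg
    test_pos, test_neg = both_pos + test_only, ref_only + both_neg

    accuracy = (both_pos + both_neg) / n
    # Agreement expected by chance from the marginal totals
    expected = (ref_pos * test_pos + ref_neg * test_neg) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)

    # McNemar's test uses only the discordant cells (no continuity correction)
    discordant = ref_only + test_only
    chi2 = (ref_only - test_only) ** 2 / discordant if discordant else 0.0
    # Upper tail of a chi-squared distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": both_pos / ref_pos,  # agreed positives among reference positives
        "specificity": both_neg / ref_neg,  # agreed negatives among reference negatives
        "mcnemar_chi2": chi2,
        "mcnemar_p": p_value,
    }


# Toy counts (not study data). Swapping the two discordant cells reverses the
# direction of comparison, i.e. which measure is treated as the reference.
forward = agreement_stats(40, 10, 20, 30)
reverse = agreement_stats(40, 20, 10, 30)
```

Accuracy, kappa, and McNemar's statistic are symmetric in the two measures, so only sensitivity and specificity change when the direction of comparison is reversed.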
The proportions of agreement and disagreement between these measures were also examined post-hoc by sex and compared using chi-squared analyses.

Code availability
All data cleaning and analyses were conducted using R version 3.5.3 (R Core Team, 2018) with the tidyverse (Wickham et al., 2019) and caret (Kuhn, 2021) packages. The full code for the diagnostic algorithms and analyses included in this paper is available at https://github.com/mollyrdavies/GLAD-Diagnostic-algorithms.

Data availability
GLAD and COPING study data are available via a data request application to the NIHR BioResource (https://bioresource.nihr.ac.uk/using-our-bioresource/academic-and-clinical-researchers/apply-for-bioresource-data/). The data are not publicly available due to restrictions outlined in the study protocol and specified to participants during the consent process. A specific data freeze is available including the variables for the analyses described in this paper; email gladstudy@kcl.ac.uk for details.

Sample characteristics
Participants with missing data for sex (N = 754 GLAD only) or age (N = 31 GLAD; N = 6 COPING NBR) were excluded from analyses. The remaining sample included 58,400 participants. Table 1 displays the sample descriptives by cohort. The average age of participants was 43 years, 73% were female, the majority self-defined as white (95%), and a large proportion had a university degree (54%). Characteristics between the cohorts were compared with t-test and chi-squared analyses. Given the large sample sizes, all characteristics were significantly different between the cohorts. The only differences that were clinically meaningful were age and sex, with the GLAD sample having a younger mean age and a higher proportion of female participants.

Frequencies
The frequency of the number of anxiety or major depressive disorder diagnoses was examined (Table 2). Since panic attacks are not a disorder, they were excluded from self-reported disorder frequencies.
Table 1 note (N = 58,400). Characteristics between the cohorts were compared using t-test and chi-squared analyses. Significant differences were indicated in the columns using the symbols below. Note that significant differences for ethnicity and highest education were not assessed for each factor level but represent overall differences. *** p < 0.001.
For symptom-based diagnoses, 21,779 participants (13,486 GLAD and 8,313 COPING NBR) had missing data for at least one disorder. Participants with zero self-reported or symptom-based diagnoses who had missing data for at least one disorder on the respective measure (excluding self-reported panic disorder, which was added partway through data collection) are displayed in Table 2 in the appropriate "NA" column.
Being male, self-identifying as Mixed or Asian/Asian British, and having a lower level of educational attainment were significantly associated with more missing data on symptom-based diagnoses. No characteristics were meaningfully associated with missing data on self-reported diagnoses. Full details of these analyses and a summary of the results can be found in Appendix 3 in Supplementary Materials, along with a table of missing data by diagnosis and frequencies of the number of missing symptom-based diagnoses.
Frequency of diagnosis varied by cohort. As shown in Table 2, only 3% of GLAD participants did not receive a symptom-based diagnosis and 4% did not report a self-reported diagnosis for any disorder. However, for the COPING NBR sample these proportions were greater, with 36% without a symptom-based diagnosis and 72% without a self-reported diagnosis. Of note, the proportion of COPING NBR participants without a symptom-based diagnosis was lower due to the high percentage (34%) in the missing data (NA) column, which included participants with no symptom-based diagnosis and missing data on the algorithm for at least one disorder. This difference in proportions between the cohorts is unsurprising, since GLAD participants were recruited for and therefore self-identified as having had an anxiety and/or depressive disorder diagnosis at some point in their lives, whereas COPING NBR participants were recruited from the general population or for a physical health condition. Overall, 42,487 (73%) participants indicated a self-reported diagnosis of a major depressive or anxiety disorder, whereas 41,752 (71%) participants were identified as having at least one of the symptom-based diagnoses.
Fig. 1 displays the frequencies of symptom-based and self-reported diagnoses for the full sample and by cohort for each of the disorders. The bars for each diagnosis only include participants without missing data for either measure on the specified disorder. For instance, the proportion of participants with an MDD diagnosis was calculated as a percentage of those with no missing data for symptom-based and self-reported MDD.

Agreement
We examined the agreement between symptom-based and self-reported diagnoses. Fig. 2 displays the agreement and disagreement for each disorder. Accuracy, sensitivity, specificity, Cohen's kappa, and McNemar's test p-values are presented in Table 3.
In the full sample, we see a fairly similar pattern of effects for MDD, any anxiety, and GAD. The accuracy or proportion (%) agreement between symptom-based and self-reported diagnoses for these disorders was high (72-84%). Similarly, self-reported diagnoses had high sensitivity (0.83-0.87) and moderate specificity (0.63-0.75) for the respective symptom-based measure. This indicates that these self-reported diagnoses of MDD, any anxiety, and GAD had high proportions of agreed positives and slightly lower proportions of agreed negatives with the symptom-based measure. Notably, sensitivity and specificity of symptom-based MDD and any anxiety for self-reported diagnoses were similar, meaning that proportions of agreed positives and agreed negatives for these disorders were comparable between the measures, regardless of the direction of comparison. In contrast, symptom-based GAD had moderate sensitivity and high specificity for the self-reported measure.
Table 2 note. The table displays the number and percentage of the full sample, GLAD, and COPING NBR participants (rows) with 0, 1, 2, or 3+ symptom-based or self-reported diagnoses (columns). The mean number of symptom-based and self-reported diagnoses are reported. We also report the frequencies with 0, 1, 2, or 3+ symptom-based or self-reported diagnoses for participants with one or more diagnoses for the opposing measure (e.g., the number of participants with one or more symptom-based diagnoses who have 0, 1, 2, or 3+ self-reported diagnoses). Abbreviations: GLAD, Genetics Links to Anxiety and Depression; COPING NBR, COVID-19 Psychiatry and Neurological Genetics NIHR BioResource.
a For symptom-based diagnoses, 21,779 participants (13,486 GLAD and 8,313 COPING NBR) had missing data for at least 1 disorder. Participants with at least 1 missing value for a symptom-based diagnosis were excluded when calculating the number of participants with 0 symptom-based diagnoses, and are instead included in the "NA" column under the "symptom-based diagnoses" header. Participants with missing data were included in the remaining frequencies for 1, 2, or 3+ symptom-based diagnoses.
b For self-reported diagnoses, the panic disorder option was added partway through data collection. Participants with missing data on any self-reported diagnosis except panic disorder were excluded from these frequencies, but those with missing data for panic disorder were included. The number of excluded participants is presented in the "NA" column under the "self-reported diagnoses" header. Panic attacks were not included in these figures.
Self-reported GAD in the full sample therefore had higher sensitivity (i.e., proportion of agreed positives) for the symptom-based diagnosis than symptom-based GAD had for the self-reported diagnosis. These sensitivity results correspond with Fig. 2, which demonstrated that the largest proportion of disagreement for GAD (21%) were participants with a self-reported but not symptom-based diagnosis. Instead, MDD and any anxiety had equal proportions of the two types of disagreement, hence sensitivity and specificity results for these measures were similar in both directions. Despite the reasonable sensitivity and specificity values for MDD, any anxiety, and GAD in the full sample, Cohen's kappa indicated only moderate agreement for these measures (Landis & Koch, 1977).
When results for these three disorder categories are broken down by cohort, the proportion of agreement for MDD and any anxiety remained high for both GLAD and COPING NBR (81-84%), while the proportion of agreement for GAD was lower in GLAD (65%) than COPING NBR (87%). However, sensitivity and specificity results for all three disorders varied by cohort. In the GLAD cohort, self-reported MDD, any anxiety, and GAD had high sensitivity (0.86-0.91) and low specificity (0.33-0.35) for the symptom-based measures, whereas these self-reported diagnoses in the COPING NBR cohort had low to moderate sensitivity (0.44-0.56) and high specificity (0.92-0.94). For example, self-reported MDD had sensitivity of 0.91 and specificity of 0.33 in GLAD but sensitivity of 0.56 and specificity of 0.92 in COPING NBR. Self-reported MDD, any anxiety, and GAD therefore had high proportions of agreed positives in GLAD, and high proportions of agreed negatives in COPING NBR. Differences were observed in sensitivity and specificity results of the symptom-based diagnoses for the self-reported measures as well, with GLAD having higher sensitivity and lower specificity on each disorder than COPING NBR. For instance, symptom-based MDD had sensitivity of 0.91 and specificity of 0.33 in GLAD and sensitivity of 0.75 and specificity of 0.84 in COPING NBR for self-reported MDD.
Figure note. * Any anxiety includes participants with at least one anxiety disorder (GAD, specific phobia, social anxiety disorder, panic disorder, and/or agoraphobia) on the indicated method (symptom-based vs self-reported). ** For panic attacks, symptom-based panic disorder is displayed and compared to self-reported panic attacks. Abbreviations: GLAD, Genetics Links to Anxiety and Depression; COPING NBR, COVID-19 Psychiatry and Neurological Genetics NIHR BioResource; MDD, major depressive disorder; GAD, generalised anxiety disorder.
Self-reported diagnoses of the phobic disorders (specific phobia, social anxiety disorder, panic disorder, and agoraphobia) showed a consistent pattern of results in the full sample and by cohort, displaying low sensitivity (0.08-0.43) and high specificity (0.86-1.00) for the symptom-based measures. Referring back to Fig. 2, the highest proportion of disagreement between the phobic disorder diagnoses was observed for participants with a symptom-based but not self-reported diagnosis. These results were more pronounced for the COPING NBR cohort, for which the self-reported diagnoses of the phobic disorders had sensitivity values below 0.17 and specificity around 1.00 for the symptom-based measure. Although the proportion of agreement for these disorders ranged between 61% (panic disorder in GLAD) and 98% (agoraphobia in COPING NBR), Cohen's kappa values indicated slight agreement for specific phobia, panic disorder, and agoraphobia diagnoses and fair agreement for social anxiety disorder in the full sample and by cohort.
Table footnote a: Analyses for self-reported panic attacks were conducted with symptom-based panic disorder.
The self-reported measure of panic attacks had higher sensitivity and lower specificity than self-reported panic disorder for symptom-based panic disorder in the full sample and by cohort. For example, in the full sample self-reported panic attacks had sensitivity of 0.61 and specificity of 0.86 while self-reported panic disorder had sensitivity of 0.15 and specificity of 0.98 for symptom-based panic disorder. Self-reported panic attacks therefore had a higher proportion of agreed positives but lower proportion of agreed negatives than self-reported panic disorder for the symptom-based measure, also observable in Fig. 2. Symptom-based panic disorder showed a comparable proportion of agreement with self-reported panic attacks and panic disorder, but Cohen's kappa indicated better agreement with self-reported panic attacks.

Sex differences in agreement
There was little variation by sex in proportions of agreement and disagreement between the measures (Appendix 4 in Supplementary Materials).
Fig. 4. Alternate self-reported diagnoses for participants with symptom-based but not self-reported specific phobia, social anxiety disorder, panic disorder, or agoraphobia. Each graph includes participants with a symptom-based but not self-reported diagnosis of specific phobia, social anxiety disorder, panic disorder, or agoraphobia. The titles of the plots indicate the disorder, and subtitles display the N of participants from the full sample and each cohort with a symptom-based and no self-reported diagnosis for the referenced disorder. The bars display the proportions (%) of these participants with self-reported diagnoses for each of the other disorders, for any other self-reported anxiety disorder, or without any self-reported diagnosis (no diagnosis). The bars for the disorders are not exclusive as participants may have more than one self-reported diagnosis. Proportions exclude participants with missing data for the respective self-reported diagnosis, and the proportion for "no diagnosis" excludes participants with missing data on any of the disorders. Abbreviations: GLAD, Genetics Links to Anxiety and Depression; COPING NBR, COVID-19 Psychiatry and Neurological Genetics NIHR BioResource; MDD, major depressive disorder; GAD, generalised anxiety disorder.

Alternate diagnoses
An unusually large proportion, relative to the other disorders, of the full sample with a self-reported diagnosis of GAD did not receive a symptom-based diagnosis (21%). We conducted a post-hoc analysis to explore whether these participants received any other symptom-based diagnosis. Fig. 3 displays the proportions (%) of participants with self-reported but not symptom-based GAD who had other symptom-based diagnoses, as well as the proportion without any symptom-based diagnosis. Proportions were calculated excluding participants with missing data for that symptom-based diagnosis. The proportion of participants with no symptom-based diagnosis excluded participants with missing data on any of the disorders.
The results showed that the largest proportion of participants with self-reported but not symptom-based GAD for both cohorts had symptom-based MDD (GLAD: 84%; COPING NBR: 59%). Of the anxiety disorders, symptom-based panic disorder was the most common for these participants (GLAD: 37%; COPING NBR: 15%). Over half of GLAD participants with self-reported but not symptom-based GAD had a different symptom-based anxiety disorder (58%), with only 8% having no symptom-based diagnosis. For COPING NBR, approximately one quarter of these participants had a different symptom-based anxiety disorder (27%), but over one-third did not have any symptom-based diagnosis (34%).
The phobic disorders (specific phobia, social anxiety disorder, panic disorder, and agoraphobia) displayed relatively high proportions of participants with symptom-based but not self-reported diagnoses. For these disorders, we were interested in exploring whether these participants reported other self-reported diagnoses, different from the symptom-based diagnosis they received. We hypothesised that a large number of these participants would report a self-reported GAD diagnosis. Fig. 4 displays the proportions of participants with symptom-based but not self-reported diagnoses for specific phobia (N = 7,817), social anxiety disorder (N = 8,191), panic disorder (N = 9,215), and agoraphobia (N = 6,415) who indicated a self-reported diagnosis for another disorder.
Participants with a symptom-based but not self-reported diagnosis of the phobic disorders displayed a somewhat similar pattern of alternate diagnoses. Specifically, MDD and GAD were the most common alternate self-reported diagnoses for each phobic disorder, reported by over three-quarters of the GLAD cohort and between one-third and half of the COPING NBR cohort. Self-reported social anxiety disorder was also reported by over one-third of the GLAD cohort. For COPING NBR, in those with symptom-based but not self-reported agoraphobia, only social anxiety disorder was reported at a relatively higher rate. Alternate self-reported diagnoses of the other phobic disorders were rarely reported in either cohort. Notably, between one-quarter and half of the COPING NBR cohort with a symptom-based but not self-reported diagnosis of one of the phobic disorders did not have a self-reported diagnosis of any disorder, which was highly uncommon in the GLAD cohort (< 4%).

Overview
In this study, we examined the agreement and disagreement between symptom-based and self-reported lifetime diagnoses of MDD, any anxiety, and the five core anxiety disorders: GAD, specific phobia, social anxiety disorder, panic disorder, and agoraphobia. These approaches are often utilised as the sole diagnostic methods in large-scale research studies and are frequently combined in meta-analyses despite limited evidence of comparability.
Symptom-based and self-reported diagnoses for MDD and any anxiety were reasonably comparable, demonstrating high accuracy (81-84%) and moderate agreement (κ = 0.56-0.61). Particularly high agreement was found for participants with a diagnosis (sensitivity 0.86-0.87). Although accuracy (72%) and Cohen's kappa (κ = 0.44) indicated moderate agreement for the GAD measures, a large proportion (21%) of the sample with self-reported GAD did not receive a symptom-based diagnosis, a finding not seen for MDD or any anxiety. This parallels findings from 150,000 individuals recruited from the general population into the UK Biobank, which also found moderate agreement for MDD (κ = 0.46) but much lower agreement for GAD (κ = 0.28) (Davis et al., 2019). The self-reported measures for MDD, any anxiety, and GAD performed well at identifying symptom-based cases in the GLAD case cohort (sensitivity 0.86-0.90), but poorly in COPING NBR (sensitivity 0.44-0.56). In contrast, this approach performed well at identifying symptom-based controls in the COPING NBR general population cohort (specificity 0.92-0.94) yet was poor at identifying those without a diagnosis in GLAD (specificity 0.33-0.35). This suggests that the enrichment of cases (in GLAD) and controls (in COPING NBR) had a large impact on the performance of the two approaches.
In contrast, for the phobic disorders (specific phobia, social anxiety disorder, panic disorder, and agoraphobia) we found slight to fair agreement between measures (κ = 0.17-0.36). Sensitivity and specificity results were also notably different for self-reported diagnoses of the phobic disorders compared to MDD, any anxiety, and GAD. Self-reported phobic disorder displayed high agreement with the symptom-based measure on participants without a diagnosis (specificity 0.90-0.99), but poor agreement on participants with a diagnosis (sensitivity 0.13-0.42). Post-hoc analyses found that a relatively large proportion (12-26%) of the participants in our sample who received a symptom-based diagnosis of a phobic disorder did not self-report the same diagnosis; instead, many self-reported GAD or MDD.
Self-reported panic attacks had higher sensitivity and only slightly lower specificity than self-reported panic disorder for symptom-based panic disorder. These findings are contrary to what might be expected, since panic attacks are a symptom that can manifest in isolation (Kessler et al., 2006) and are not specific to panic disorder. However, self-reported panic attacks captured a higher proportion of participants with symptom-based panic disorder than self-reported panic disorder.

Limitations
A strength of the GLAD and COPING NBR studies is the successful recruitment of several thousand participants to complete detailed phenotyping measures. This enabled researchers to compare self-reported and symptom-based measures of depressive and anxiety disorders. However, as with any study, there are limitations that should be considered. Both cohorts are disproportionately female, white, and highly educated compared to the UK population. Exploration of measure agreement in more representative samples would establish the generalisability of our findings.
As mentioned previously, the symptom-based and self-reported diagnoses have not been compared to a "gold standard" clinical interview in this study, and prior evidence is conflicting (Carlbring et al., 2002; Levinson et al., 2017; McManus et al., 2016; Patten, 1997; Sanchez-Villegas et al., 2008; Stuart et al., 2014). As a result, we cannot draw any conclusions about which diagnostic method is more accurate from the analyses conducted here. Further research is therefore required to validate these measures against "gold standard" clinical interviews. Validation is key to ensuring that research findings are relevant to clinical practice. Nonetheless, it is worth noting that some researchers have argued that "gold standard" diagnoses do not exist; even structured and semi-structured interviews may result in different classifications of diagnosis and estimates of population prevalence (Brugha, Bebbington, & Jenkins, 1999). Other validation methods for these measures are worth exploring, such as investigating the genetic overlap with clinically-ascertained cohorts or comparing against clinical outcome measures such as functional impairment or treatment response.
At this point we could not assess whether participants' self-report of a clinical diagnosis matched their clinical data nor which health professional provided the diagnosis (e.g., general practitioner [GP] or psychiatrist). In the context of genetics, studies that have utilised self-reported diagnoses have similarly done so without medical record validation (Howard et al., 2019; Purves et al., 2020; Wray et al., 2018). Furthermore, since individuals with anxiety and depressive disorders often do not present to a medical professional or receive a diagnosis (Kessler, Bennewith, Lewis, & Sharp, 2002; McManus et al., 2016; Rayner et al., 2019), reliance on health records alone is not a substitute for asking the participant. However, all GLAD and COPING NBR participants have consented to providing medical record access and an application for clinical data is underway, so this comparison could be conducted in future analyses.

Implications
We observed an asymmetry between cohorts in agreement results for MDD, any anxiety, and GAD, with agreement being stronger for cases in the case-enriched GLAD cohort and stronger for controls in the general population COPING NBR cohort. Consistent results across the cohorts were found for symptom-based and self-reported diagnoses of the phobic disorders (specific phobia, social anxiety disorder, panic disorder, and agoraphobia). The phobic disorder measures had high agreement for classification of participants without a diagnosis but differed substantially when classifying cases. Taken together, these findings suggest that studies on anxiety disorders applying self-reported diagnostic methods would tend to categorise the majority of participants as having GAD, whereas those utilising symptom-based measures would find more of a distribution across the anxiety subtypes.
These findings have important implications for large-scale studies investigating disorder-specific factors or outcomes, as ascertained diagnoses would differ depending on the selected measure. Although some factors are largely shared between anxiety and major depressive disorders (e.g., genetic factors), others show more specificity. For example, aspects of the environment show differential associations with anxiety and depression (Finlay-Jones & Brown, 1981; Hettema, Prescott, Myers, Neale, & Kendler, 2005; Waszczuk, Zavos, Gregory, & Eley, 2014), and some suggest that incorporating disorder-specific approaches to psychological treatment can improve outcomes (Clark & Beck, 2011). Studies focused on expanding sample sizes may find that meta-analyses combining data from cohorts ascertained with self-reported or symptom-based diagnoses are sufficient to identify effects that are shared between anxiety and major depressive disorders (Hettema et al., 2005; Morneau-Vaillancourt et al., 2020). However, in order to understand disorder-specific risk factors or investigate treatment approaches for these disorders, particularly the anxiety subtypes, findings may vary depending on the ascertainment method used in the study. The population of interest should therefore be considered when selecting measures for future studies. Both methods may have an important role in future research depending on the aims of individual studies. Those focused on increasing participation and reducing the time burden for participants and researchers may consider the use of self-reported measures, taking into account the differences by disorder in sensitivity and specificity for symptom-based diagnoses. For instance, a study recruiting from clinical populations may use self-reported MDD to ascertain MDD diagnosis. In contrast, those particularly interested in identifying cases with specific anxiety disorder subtypes are likely to benefit from use of the symptom-based approach.
These results can further be considered in the context of the efficacy of self-reported, broad diagnostic measures, which have been explored the most thus far with regard to depression. There is a diversity of opinions concerning the value and utility of these brief measures in the field, especially in large-scale research such as psychiatric genetics. For example, researchers have found that "broad" depression (e.g., depression defined using self-reported diagnoses or self-reports of treatment seeking) is non-specific to MDD and has lower heritability estimates than symptom-based MDD (Cai et al., 2020; Glanville et al., 2020). Misclassification dilutes the power of case-control analyses to detect differences between the samples (Manchia et al., 2013; Schork et al., 2018). As such, the lower heritability estimates of self-reported measures of MDD may indicate that this approach has a higher rate of misclassification for true cases and controls than the symptom-based diagnoses. Indeed, self-reported MDD, any anxiety, and GAD in this study had high agreement with symptom-based diagnoses when identifying cases, but differed in the classification of those without a diagnosis. It has been argued that symptom-based measures are preferable over their self-reported counterparts, and that studies utilising self-reported diagnoses for the purpose of case-control comparisons are more likely to identify effects that are non-specific, complicating efforts to disentangle disorder-specific factors and treatments (Cai et al., 2020; Phillips & Kendler, 2021). For studies that are unable to administer symptom-based assessments or existing studies that did not include these measures, combining multiple broad diagnostic measures (e.g., self-reported diagnoses, self-reported help-seeking questions, and self-reported antidepressant usage) has been shown to reduce misclassification and increase heritability of MDD cases to equal or exceed heritability estimates of symptom-based MDD (Glanville et al., 2020).
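The dilution effect of misclassification noted above can be illustrated with a minimal sketch. It assumes a simplified setting that is not taken from this study: a continuous variable whose mean differs between true cases and true controls, and non-differential misclassification, so that mislabelled individuals resemble the rest of their true group. Under those assumptions the observed mean difference shrinks by a factor of (PPV + NPV - 1). The function name and the example values are hypothetical.

```python
def observed_mean_difference(true_diff: float, ppv: float, npv: float) -> float:
    """Expected case-control mean difference after misclassification.

    Assumptions (illustrative only, not from the study):
      - ppv: proportion of labelled cases who are true cases
      - npv: proportion of labelled controls who are true controls
      - mislabelled individuals have the same mean as their true group

    Labelled-case mean    = ppv * mu_case + (1 - ppv) * mu_control
    Labelled-control mean = npv * mu_control + (1 - npv) * mu_case
    Their difference simplifies to true_diff * (ppv + npv - 1).
    """
    if not (0.0 <= ppv <= 1.0 and 0.0 <= npv <= 1.0):
        raise ValueError("ppv and npv must lie in [0, 1]")
    return true_diff * (ppv + npv - 1.0)


# Hypothetical values: a true difference of 1.0 unit, with 80% of labelled
# cases and 90% of labelled controls correctly classified.
attenuated = observed_mean_difference(true_diff=1.0, ppv=0.8, npv=0.9)
print(f"observed difference: {attenuated:.2f}")
```

With perfect classification (PPV = NPV = 1) the full difference is recovered; as either value falls, the detectable difference shrinks, which is one way to see why broader, less specific case definitions may require larger samples.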
In terms of the differences observed in the categorisation of the anxiety disorders, the lower proportion of self-reported diagnoses of the anxiety disorders (aside from GAD) could be due to a lack of treatment-seeking or recognition. Many individuals with symptoms do not seek treatment for mental health or related problems (McManus et al., 2016; Rayner et al., 2019) and those that do more commonly discuss their problems with a GP rather than a mental health professional (McManus et al., 2016). Research has shown that there is an under-recognition of anxiety disorders, particularly by GPs (Arikian & Gorman, 2001; Fernández et al., 2012; Tylee & Walters, 2007; Vermani, Marcus, & Katzman, 2011). GPs have limited amounts of time and resources and lack specialised training to conduct comprehensive assessments of anxiety symptoms (Baird, Charles, Honeyman, Maguire, & Das, 2016). It is therefore possible that GPs encountering distressed patients may identify symptoms as "anxiety" without specifying a disorder. Notably, in the GLAD and COPING NBR studies, the phrasing of the self-reported GAD item encapsulates general nerves or anxiety to account for this, which may have resulted in an overestimate of the number of participants given a GAD diagnosis.
Our finding that self-reported panic attacks had higher agreement with symptom-based panic disorder than self-reported panic disorder did could be further indication of this under-recognition. Although panic attacks are not specific to panic disorder, they are more recognisable and straightforward to diagnose than panic disorder. However, studies have found that the majority of individuals who experience a panic attack do not have panic disorder (Kessler et al., 2006). Consequently, this finding could instead indicate a lack of specificity of the symptom-based panic disorder measure.

Conclusion
Large-scale research projects that lack the resources to conduct "gold standard" clinical interviewing commonly utilise questionnaires applying symptom-based or self-reported diagnostic methods. We compared these two approaches and found good comparability between symptom-based and self-reported MDD and "any anxiety" disorder for categorisation of participants with a diagnosis, although performance varied between the case and general population cohorts in this study. Ascertainment of participants with diagnoses for the individual anxiety disorders differed substantially depending on which phenotyping measure was applied. Taking our results together with previous studies, we suggest that self-reported diagnoses may be sufficient depending on the aims of the research and the population under study, but may not be suitable for case-control studies investigating disorder-specific risk factors or outcomes. Notably, prior research provides little insight regarding the validity of self-reported or symptom-based diagnoses against clinical interviews. The differences observed in this study highlight the need for further validation of these diagnoses against clinical interviews to advise measure selection and ensure translatability of research incorporating these measures.

Fig. 1. Frequencies of symptom-based and self-reported diagnoses of major depressive disorder, any anxiety, or an anxiety disorder in the GLAD and COPING cohorts. The bars represent the proportion (%) of either the full sample (N = 58,400), GLAD (N = 40,953), or COPING NBR (N = 17,447) with a symptom-based (blue) or self-reported diagnosis (yellow) for each disorder. Proportions exclude participants with missing data on either measure for the specified diagnosis. *Any anxiety includes participants with at least one anxiety disorder (GAD, specific phobia, social anxiety disorder, panic disorder, and/or agoraphobia) on the indicated method (symptom-based vs self-reported). **For panic attacks, symptom-based panic disorder is displayed and compared to self-reported panic attacks. Abbreviations: GLAD, Genetic Links to Anxiety and Depression; COPING NBR, COVID-19 Psychiatry and Neurological Genetics NIHR BioResource; MDD, major depressive disorder; GAD, generalised anxiety disorder.

Fig. 2. All comparisons of agreement and disagreement on symptom-based vs self-reported diagnoses. Each bar displays the proportions (%) of the full sample (N = 58,400) and each cohort (GLAD: N = 40,953; COPING NBR: N = 17,447) with agreement or disagreement between the two measures for each disorder. Agreements are represented in blue (dark blue = agreement on diagnosis, light blue = agreement on no diagnosis) while disagreements are in yellow (dark yellow = symptom-based but not self-reported diagnosis, light yellow = self-reported but not symptom-based diagnosis). *The panic attacks column displays the agreement between symptom-based panic disorder and self-reported panic attacks. Abbreviations: GLAD, Genetic Links to Anxiety and Depression; COPING NBR, COVID-19 Psychiatry and Neurological Genetics NIHR BioResource; MDD, major depressive disorder; GAD, generalised anxiety disorder.

Fig. 3. Alternate symptom-based diagnoses for participants with self-reported but not symptom-based generalised anxiety disorder. Each bar displays the proportions (%) of participants with self-reported but not symptom-based GAD from the full sample (N = 9,197) and each cohort (GLAD: N = 8,128; COPING NBR: N = 1,069) with: symptom-based diagnoses for each of the other disorders, for any other symptom-based anxiety disorder, or without any symptom-based diagnosis (no diagnosis). The bars for the disorders are not exclusive as participants may have more than one symptom-based diagnosis. Proportions exclude participants with missing data for the indicated symptom-based diagnosis, and the proportion of "no diagnosis" excludes participants with missing data on any of the disorders. Abbreviations: GLAD, Genetic Links to Anxiety and Depression; COPING NBR, COVID-19 Psychiatry and Neurological Genetics NIHR BioResource; MDD, major depressive disorder; GAD, generalised anxiety disorder.

Table 1
Sample characteristics.
Table 1 displays the sample characteristics for the Genetic Links to Anxiety and Depression (GLAD; N = 40,953) and COVID-19 Psychiatry and Neurological Genetics NIHR BioResource (COPING NBR; N = 17,447) cohorts, as well as the full sample (N

Table 2
Frequencies of symptom-based and self-reported diagnoses from the full sample (N = 58,400) and by cohort.

Table 3
Agreement between symptom-based and self-reported diagnoses for the full sample and by cohort.
Cross tabulations are presented for each disorder for the full sample and by cohort, with symptom-based (yes/no) in columns and self-reported (yes/no) in rows. Agreements between symptom-based and self-reported diagnoses are in bold. Accuracy (%) and sensitivity and specificity of self-reported for symptom-based (SR -> SB) and of symptom-based for self-reported (SB -> SR), Cohen's kappa, and McNemar's test p-value results are presented. Accuracy, sensitivity, and specificity are reported with 95% confidence intervals in parentheses. Abbreviations: GLAD, Genetic Links to Anxiety and Depression; COPING NBR, COVID-19 Psychiatry and Neurological Genetics NIHR BioResource; SB, symptom-based; SR, self-reported; MDD, major depressive disorder; GAD, generalised anxiety disorder.
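
As an illustration of how the agreement statistics listed above follow from a single 2x2 cross tabulation, the sketch below computes accuracy, sensitivity and specificity of self-reported for symptom-based diagnoses (SR -> SB), Cohen's kappa, and the McNemar chi-squared statistic. It is not the study's analysis code: the function name and the cell counts are hypothetical, confidence intervals are omitted, and the McNemar statistic is shown without a continuity correction, which may differ from the variant used in the study.

```python
def agreement_metrics(both_yes: int, sr_only: int, sb_only: int, both_no: int) -> dict:
    """Agreement statistics for one 2x2 table (SR -> SB direction).

    both_yes: self-reported yes, symptom-based yes
    sr_only:  self-reported yes, symptom-based no
    sb_only:  self-reported no,  symptom-based yes
    both_no:  self-reported no,  symptom-based no
    """
    n = both_yes + sr_only + sb_only + both_no

    # Observed agreement (accuracy): proportion of concordant classifications.
    accuracy = (both_yes + both_no) / n

    # Symptom-based diagnosis is treated as the reference in this direction.
    sensitivity = both_yes / (both_yes + sb_only)
    specificity = both_no / (both_no + sr_only)

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement is derived from the marginal proportions.
    p_sr_yes = (both_yes + sr_only) / n
    p_sb_yes = (both_yes + sb_only) / n
    expected = p_sr_yes * p_sb_yes + (1 - p_sr_yes) * (1 - p_sb_yes)
    kappa = (accuracy - expected) / (1 - expected)

    # McNemar's chi-squared (1 df, no continuity correction): tests whether
    # the two kinds of disagreement are equally frequent.
    discordant = sr_only + sb_only
    mcnemar = (sr_only - sb_only) ** 2 / discordant if discordant else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
        "mcnemar_chi2": mcnemar,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
for name, value in agreement_metrics(40, 10, 20, 30).items():
    print(f"{name}: {value:.3f}")
```

Swapping the `sr_only` and `sb_only` arguments gives sensitivity and specificity in the opposite direction (SB -> SR), while accuracy, kappa, and the McNemar statistic are unchanged.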