DSM-5 Autism Spectrum Disorder: In search of essential behaviours for diagnosis

The objective of this study was to identify a set of ‘essential’ behaviours sufficient for diagnosis of DSM-5 Autism Spectrum Disorder (ASD). Highly discriminating, ‘essential’ behaviours were identified from the published DSM-5 algorithm developed for the Diagnostic Interview for Social and Communication Disorders (DISCO). Study 1 identified a reduced item set (48 items) with good predictive validity (as measured using receiver operating characteristic curves) that represented all symptom sub-domains described in the DSM-5 ASD criteria but lacked sensitivity for individuals with higher ability. An adjusted essential item set (54 items; Study 2) had good sensitivity when applied to individuals with higher ability and performance was comparable to the published full DISCO DSM-5 algorithm. Investigation at the item level revealed that the most highly discriminating items predominantly measured social-communication behaviours. This work represents a first attempt to derive a reduced set of behaviours for DSM-5 directly from an existing standardised ASD developmental history interview and has implications for the use of DSM-5 criteria for clinical and research practice.

& Volkmar, 2012; Wilson et al., 2013; Worley & Matson, 2012; Young & Rodi, 2013), particularly for individuals with higher cognitive ability or Asperger-like presentations (e.g., McPartland et al., 2012). These studies have typically reported reduced sensitivity in the context of excellent specificity (between .94 and 1.0), suggesting that the proposed DSM-5 criteria may be overly stringent. In contrast, Huerta et al. (2012) reported that although the DSM-5 criteria missed only nine percent of cases meeting criteria for DSM-IV-TR PDD according to parent-report data, the specificity of the DSM-5 criteria was unacceptably low (.53). Moreover, a recent meta-analysis reported that although DSM-5 may lack sensitivity for autistic disorder and PDD not otherwise specified (PDD-NOS), sensitivity was not significantly reduced for individuals with a DSM-IV-TR diagnosis of Asperger Syndrome (Kulage, Smaldone, & Cohn, 2014). However, the majority of studies have reported that DSM-5 may under-diagnose individuals with ASD and, importantly, several studies have also reported that symptom severity for individuals missed by DSM-5 was significantly higher than for both clinical and non-clinical controls (Matson, Belva et al., 2012; Mayes et al., 2013; Worley & Matson, 2012).
In order to meet criteria for DSM-5 ASD, an individual must have symptoms in all three of the social-communication sub-domains and at least two of the four restricted and repetitive sub-domains (see Table 1). However, adjustments have been proposed to 'relax' the number of DSM-5 sub-domains required to meet diagnostic criteria. Reducing the number of sub-domains required to meet criteria was reported to improve sensitivity (Frazier et al., 2012; Huerta et al., 2012; Mayes et al., 2013; Wilson et al., 2013). Although this was accompanied by decreased specificity, several studies reported that this effect was minimal (Mayes et al., 2013) and consideration of the use of 'relaxed' rules was recommended (Frazier et al., 2012).
In contrast to the findings described above, a recent study using the Diagnostic Interview for Social and Communication Disorders (DISCO; Leekam, Libby, Wing, Gould, & Taylor, 2002; Wing, Leekam, Libby, Gould, & Larcombe, 2002) produced a diagnostic algorithm for the draft descriptions of DSM-5 ASD (APA, 2011) with high levels of both sensitivity and specificity (Kent, Carrington et al., 2013). The DISCO is a 320-item interview that is completed with a parent or carer of individuals of varying age and ability levels to provide a detailed developmental history. The interview has good inter-rater reliability and criterion validity (Maljaars, Noens, Scholte, & van Berckelaer-Onnes, 2012; Nygren et al., 2009) and good agreement with output on both the Autism Diagnostic Interview-Revised (ADI-R; Lord, Rutter, & Le Couteur, 1994) and Autism Diagnostic Observation Schedule (ADOS; Lord et al., 2000) according to ICD-10/DSM-IV-TR criteria (Maljaars et al., 2012; Nygren et al., 2009). The DISCO DSM-5 algorithm item set comprises 85 DISCO items, selected to map onto the DSM-5 descriptors, for the two domains of behaviour (social-communication and restricted and repetitive behaviours), each of which included multiple 'categories' or sub-domains of behaviour (Table 1). Items were mapped in a three-stage process. DISCO items were initially mapped by two researchers, both of whom had experience of ASD (SJC and RGK). Item selection and placement was then reviewed by one clinician (JG) and one researcher (SRL) with extensive knowledge of ASD and the DISCO. Finally, the proposed assignment of all items was independently reviewed by three clinicians experienced in the use of the DISCO, none of whom had been involved in the study's design or implementation; all independently agreed on placement of all items.
Once items were mapped to the DSM-5 descriptions, thresholds were set for each sub-domain specifying the number of items on which an individual must 'score' (indicating that they have the symptom being measured) in order to reach criterion for the behaviour described by that sub-domain. Three different methods of threshold definition were compared, with the best overall sensitivity and specificity for the algorithm achieved by setting sub-domain thresholds that minimised false positives while maximising sensitivity. 'Relaxing' the DSM-5 rules such that only two rather than all three social-communication sub-domains were required decreased the specificity of the algorithm and did not significantly improve sensitivity. These results suggest that the capacity of the DSM-5 criteria to provide high levels of sensitivity and specificity comparable with DSM-IV-TR relies on the careful selection of appropriate items from diagnostic instruments that map onto the DSM-5 descriptions.
The aim of the current study was to further investigate the DSM-5 diagnostic criteria, and identify those behaviours that best discriminated between individuals with ASD and non-ASD clinical diagnoses and which could therefore be considered 'essential' for the diagnosis of DSM-5 ASD. Due to the wide range of items measured by the DISCO, the DISCO DSM-5 algorithm includes multiple items for many of the behaviours described in DSM-5. The inclusion of these different examples of behaviours means that individual variations in the symptom profile (for example, in terms of different types of sensory sensitivity) can be captured within the DSM-5 algorithm. In the original study, 72% of items included in the DISCO DSM-5 algorithm were found to have comparable frequencies for children, adolescents, and adults and for individuals of high and low ability (Kent, Carrington et al., 2013). Indeed, three 'global' items ('sharing interests limited or absent', 'lack of friendships with age peers', 'lack of awareness of others' feelings') were found to be present in more than 90% of cases across ability level and in both children and adults. There were also 19 algorithm items that were significantly more frequent in particular age or ability groups. The authors suggested that this combination of both 'global' and specific items facilitated the high level of correspondence between the DISCO DSM-5 algorithm output and clinical diagnosis according to DSM-IV-TR.
Although the algorithm had excellent psychometric properties and very little statistical redundancy (Kent, Carrington et al., 2013), it is nevertheless possible that the performance of the algorithm in the majority of cases may have been largely dependent on a sub-set of items; that is, not all of the examples of behaviour included were essential. In the current study, the effect of reducing the number of items in the algorithm was investigated using the same datasets used in the development of the DISCO DSM-5 algorithm, in order to identify a subset of highly discriminating behaviours essential for diagnosis of DSM-5 ASD. It would be advantageous to identify the most salient clinical items for DSM-5 as this process could guide the development of more efficient clinical approaches to diagnosis. Given demands on clinicians' and researchers' time and the limited resources available to complete a developmental history interview, it would be helpful to streamline accurate and reliable diagnosis in relatively straightforward cases of ASD in a consistent and standardised way. For more complex cases, an essential subset of items could also help to guide the content of a more detailed follow-up developmental history interview or provide an essential subset of interview questions that could be supplemented by pre-assessment parent-completed schedules. The need for shorter, standardised diagnostic interviews for ASD has been recognised (e.g., Matson, Nebel-Schwalm, & Matson, 2007). The development of a short version of the Developmental, Dimensional and Diagnostic Interview (3di), for example, has been validated relative to the ADI-R for DSM-IV-TR (Santosh et al., 2009). To our knowledge, however, the current study is the first to identify behaviours essential for diagnosis of DSM-5 ASD. The identification of a subset of essential items has implications beyond the DISCO and will be relevant for clinicians and researchers using a range of developmental history interviews.
Using the DISCO DSM-5 datasets we searched for essential items by first identifying which items best discriminated between individuals with ASD and non-ASD clinical diagnoses. Given that the diagnosis of ASD is based on the presence of a profile of behaviours, that is, a pattern of behaviours that can be considered relatively unique to ASD, we ensured distribution of items across the three social-communication sub-domains and the four restricted and repetitive activities sub-domains specified by DSM-5.

Participants
All analyses were conducted on three datasets used for the development of the DISCO DSM-5 algorithm (Kent, Carrington et al., 2013). Full clinical and demographic details can be found in previous reports for Sample 1 (Wing et al., 2002), Sample 2 (Maljaars et al., 2012), and Sample 3 (Leekam, Libby, Wing, Gould, & Gillberg, 2000; Leekam, Nieto, Libby, Wing, & Gould, 2007). The original recruitment of samples had ethical approval from relevant regional ethics committees, with the resulting datasets anonymised upon study completion. Use of these datasets in the current analyses was approved by Cardiff University's School of Psychology Research Ethics Committee.

Sample 1: development
Sample 1 comprised 82 children (34-140 months) originally recruited through clinics and special schools in the UK (Wing et al., 2002). Thirty-six children (31 male) had a clinical diagnosis of ICD-10 Childhood Autism or DSM-IV-TR Autistic Disorder (18 high ability; 18 low ability). The non-ASD clinical control group comprised 31 children (19 male), 17 with low ability (intellectual disability; ID) and 14 with high ability (language impairment; LI). Fifteen typically developing children (9 male) were also included. Children in the clinical groups were all recruited through clinical services and special schools. Diagnoses were made by responsible, specialised clinicians who were independent of the research studies to which the participants were recruited and without reference to the DISCO. Moreover, children were followed up two years after data collection to determine whether the diagnosis had changed (Wing et al., 2002). The ASD and control samples were matched on both chronological age and non-verbal IQ. However, there were more males in the ASD group than the control group (χ²(1) = 6.38, p < .05). The grouping of higher and lower ability at the time of recruitment (IQ above or below 70, respectively) was confirmed using either the Leiter International Performance Scale (Leiter, 1979) or the Bayley Scale for Infant Development (Bayley, 1993); composite performance mental age scores on the Bayley Scale were converted to IQ scores. Items essential for diagnosis of DSM-5 ASD were identified through analysis of data collected with this sample.

Sample 2: validation
Sample 2 comprised children recruited from clinics and special schools in the Netherlands (Maljaars et al., 2012). There were 52 (17 high ability; 35 low ability) children with ASD (DSM-IV-TR PDD; 43 male, 34-137 months), and a non-ASD clinical control group of 26 children with ID (16 male, 48-134 months). The ASD and non-ASD clinical control groups were matched for chronological age. Clinical diagnoses were made by an independent clinician without reference to the DISCO. Thirty-seven typically developing children (15 male, 24-49 months) were also included. There were more males in the ASD group compared with the non-ASD group (χ²(1) = 13.92, p < .05). Ability was measured using a Dutch test for non-verbal intelligence (Tellegen, Winkel, Wijnberg-Williams, & Laros, 1998). The ASD and control groups were matched for non-verbal mental age. This sample provided independent validation of the essential item sets.

Sample 3
Sample 3 included 190 individuals drawn from a sample of 200 participants reported in two previous studies (Leekam et al., 2000; Leekam et al., 2007). All were assessed using the DISCO in a specialist tertiary clinic by the clinicians who designed and developed the interview, and all received DISCO ICD-10 algorithm diagnoses of Childhood (n = 180) or Atypical Autism (n = 10). The sample was divided into three age groups: 112 children (95 male; 32-143 months), 33 adolescents (27 male; 144-215 months), and 45 adults (36 male; 216-456 months). IQ was primarily measured using age-appropriate Wechsler Intelligence Scales; participants were divided into high and low-ability groups (above and below IQ of 70; Leekam et al., 2007). Sample 3, therefore, allowed exploration of how items identified from Sample 1 contributed to diagnosis across both age and ability level.

Measures and item selection
Full details of the DISCO and development of the DSM-5 algorithm (85 items) can be found in the original publications (Kent, Carrington et al., 2013; Leekam et al., 2002; Wing et al., 2002). In the DISCO, each item is coded according to level of impairment. The most relevant codes for DISCO items were selected based on the DSM-5 behaviour descriptions; most items were scored as present only if there was a 'marked' (severe) impairment. Although the DISCO includes 'current' and 'ever' (lifetime) scores for each item, only 'ever' scores were used for these analyses, as is common practice in the development of lifetime diagnostic algorithms.
Items for the full DISCO DSM-5 algorithm were mapped to the draft descriptions in a three-stage process which included consultation with independent clinicians, as described in Kent, Carrington et al. (2013). As this previous research was based on the draft proposal (APA, 2011), the final published DSM-5 guidelines (APA, 2013) were consulted before beginning the search for essential items for this study. The published DSM-5 ASD diagnostic criteria include only one additional example of behaviour: 'rituals when greeting others' (sub-domain B2; see Table 1). No DISCO item was identified that could adequately capture this behavioural description. Consequently, only items in the published DISCO DSM-5 algorithm were considered in the identification of a reduced item set. The item set reported here reflects the final, published DSM-5 criteria.
Items were selected for inclusion in the reduced item set based on their predictive validity, calculated from data in Sample 1. The item reduction process followed the procedure used for the development of the Social Communication Questionnaire (SCQ), a parent-report questionnaire (Rutter, Bailey, & Lord, 2003). In the case of the SCQ, items were first selected from the ADI-R based on clinical judgement and chi-square analyses were used to evaluate the resultant item set. In the current study, only items in the original DISCO DSM-5 algorithm that significantly discriminated between the ASD and non-ASD clinical control groups in Sample 1 were included in the reduced item set. This follows a similar approach to measurement development used in other areas of health and medicine (e.g., pre-psychotic state; Liu et al., 2013). The internal consistency of the reduced item set was assessed using Cronbach's alpha and inter-item correlations were calculated to measure redundancy. Although Sample 1 included typically developing children, these children were not included in the chi-square analyses to ensure the strictest test of discrimination.
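The item-level selection step can be illustrated with a short sketch. The code below is not the authors' SPSS procedure; it assumes a Pearson chi-square on a 2x2 table (item present or absent by ASD or clinical control) without continuity correction, and the item names and counts are made up for illustration. Only the group sizes (36 ASD, 31 clinical control) follow Sample 1.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (df = 1, no continuity correction) for a 2x2 table.

    a: ASD, item present      b: ASD, item absent
    c: control, item present  d: control, item absent
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For df = 1 the chi-square survival function reduces to erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

def select_items(counts, alpha=0.05):
    """Return the names of items whose chi-square p-value is below alpha."""
    return [name for name, table in counts.items()
            if chi_square_2x2(*table)[1] < alpha]

# Hypothetical counts for two hypothetical items (36 ASD, 31 control children).
example_counts = {
    "item_a": (30, 6, 5, 26),    # far more frequent in the ASD group
    "item_b": (18, 18, 15, 16),  # similar frequency in both groups
}
selected = select_items(example_counts)
```

Changing `alpha` from 0.05 to 0.001 gives the stricter selection rule. The sketch tests association only; a real selection would also confirm that the item is more frequent in the ASD group.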

Algorithm thresholds
The DSM-5 description of ASD specifies that individuals must have symptoms in all of the three sub-domains of social communication behaviours, and at least two of the four sub-domains of restricted and repetitive behaviours. In the development of the original DISCO DSM-5 algorithm additional rules were defined regarding the number of items on which an individual must 'score' in order to reach criterion for a sub-domain. This 'score' is referred to as the sub-domain threshold, and was calculated based on sensitivity and specificity values calculated from receiver operating characteristic (ROC) curve analyses. Algorithm thresholds were re-set for the reduced item sets using the same approach applied in the original DISCO DSM-5 algorithm (see Kent, Carrington et al., 2013). Full details of the resetting of thresholds are shown in Appendix 1.
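The decision rule described above can be sketched as a small function. This is an illustration only: the sub-domain labels follow Table 1 (A1 to A3, B1 to B4), but the threshold values and item counts are hypothetical (the actual thresholds are given in Appendix 1), and the age-of-onset items are left out for brevity.

```python
A_SUBDOMAINS = ("A1", "A2", "A3")        # social-communication
B_SUBDOMAINS = ("B1", "B2", "B3", "B4")  # restricted and repetitive behaviours

def subdomains_met(item_counts, thresholds):
    """Return the sub-domains in which the number of scored items reaches threshold."""
    return {sd for sd, t in thresholds.items() if item_counts.get(sd, 0) >= t}

def meets_dsm5_asd(item_counts, thresholds, min_a=3, min_b=2):
    """Apply the rule: all three A sub-domains and at least two of four B sub-domains."""
    met = subdomains_met(item_counts, thresholds)
    a_met = sum(sd in met for sd in A_SUBDOMAINS)
    b_met = sum(sd in met for sd in B_SUBDOMAINS)
    return a_met >= min_a and b_met >= min_b

# Hypothetical thresholds and one hypothetical individual's item counts.
example_thresholds = {"A1": 2, "A2": 1, "A3": 1, "B1": 1, "B2": 1, "B3": 1, "B4": 1}
example_counts = {"A1": 3, "A2": 1, "A3": 0, "B1": 2, "B2": 1}

strict = meets_dsm5_asd(example_counts, example_thresholds)            # A3 not met
relaxed = meets_dsm5_asd(example_counts, example_thresholds, min_a=2)  # 'relaxed' rule
```

Setting `min_a=2` corresponds to the 'relaxed' rule discussed in the introduction, in which only two of the three social-communication sub-domains are required.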

Testing the algorithm
The discriminative power of the DISCO DSM-5 algorithm for ASD when applied to the reduced item sets was tested using ROC curve analyses comparing the ASD and non-ASD clinical control groups. ROC curves plot sensitivity against 1 − specificity and the area under the curve (AUC) quantifies the power of the algorithm to correctly classify individuals as belonging to the ASD or non-ASD clinical control group. AUCs of .7 and above are considered acceptable, whereas AUCs of .8 and above are excellent and AUCs of .9 and above outstanding (Hosmer & Lemeshow, 2000). ROC curve statistics were calculated both in the development sample (Sample 1) and in the independent validation sample (Sample 2). Outcome on the algorithm for typically developing individuals is presented in Table 3 for comparison; these data were not included in the analyses to ensure the most stringent test of discrimination. Sample 3 included only individuals with a clinical diagnosis of DSM-IV-TR/ICD-10 Childhood or Atypical Autism and no comparison group; consequently, only sensitivity was calculated for this sample. Chi-square analyses tested whether the sensitivity of the algorithm varied across age or ability sub-groups. Finally, the ASD groups in Samples 1 and 2 and the sub-group of children in Sample 3 were combined in a larger sample (n = 200; 31 female) to enable investigation of the sensitivity of the algorithm by gender using chi-square analyses.
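As a rough illustration of these quantities, the sketch below computes sensitivity, specificity, and AUC for a binary algorithm outcome. It is not the SPSS analysis used in the study: the AUC is computed with the pairwise (Mann-Whitney) formulation, and the outcomes are invented. The labels in `describe_auc` follow the Hosmer and Lemeshow (2000) bands quoted above.

```python
def sensitivity_specificity(predicted, actual):
    """predicted, actual: equal-length sequences of booleans (True = ASD)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    return tp / (tp + fn), tn / (tn + fp)

def auc(pos_scores, neg_scores):
    """Probability that a random ASD case scores above a random control (ties count 0.5)."""
    wins = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def describe_auc(value):
    """Label an AUC using the bands quoted from Hosmer and Lemeshow (2000)."""
    if value >= 0.9:
        return "outstanding"
    if value >= 0.8:
        return "excellent"
    if value >= 0.7:
        return "acceptable"
    return "below acceptable"

# Invented outcomes: 4 ASD cases followed by 5 clinical controls.
actual = [True] * 4 + [False] * 5
predicted = [True, True, True, False, False, False, True, False, False]

sens, spec = sensitivity_specificity(predicted, actual)
area = auc([int(p) for p in predicted[:4]], [int(p) for p in predicted[4:]])
label = describe_auc(area)
```

For a binary outcome such as 'meets criteria' versus 'does not meet criteria', this AUC equals the mean of sensitivity and specificity, which is why the example gives (0.75 + 0.80) / 2.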
The development of an effective abbreviated form of a clinical assessment is critically dependent on ensuring (a) that the range of content covered in the original assessment is preserved in the abbreviated form; and (b) that there is adequate overlap in the variance accounted for by the full and abbreviated forms (Smith, McCarthy, & Anderson, 2000). Following Smith et al.'s recommendations, the range of content included in both the reduced and original DISCO DSM-5 item sets was assessed quantitatively through calculation of mean inter-item correlations, with significantly higher inter-item correlations in the reduced item set compared with the full item set indicating significantly restricted coverage of content. The overlap in variance for the reduced and full item sets was estimated by calculating the reliability of internal consistency. Reliability was estimated using the following equation, taken from Nunnally and Bernstein (1994) and reported in Smith et al. (2000): $\text{reliability} = \dfrac{n \, r_{ij}}{1 + (n - 1) \, r_{ij}}$, where $n$ is the number of items in the set and $r_{ij}$ is the mean inter-item correlation of the set. Correlations between the reliability of the full item set in Sample 1 and the reduced item sets in Samples 2 and 3 were calculated as a further test of the validity of the reduced item sets. All statistical analyses were conducted in SPSS.
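The reliability equation can be worked through in a few lines. The sketch below implements the formula as stated, with a plain Pearson correlation to obtain the mean inter-item correlation; the function names and the example values (10 items, mean correlation .5) are illustrative assumptions and are not taken from the study data.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_inter_item_correlation(items):
    """Average the correlation over every pair of items (each item = scores across people)."""
    pairs = list(combinations(items, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)

def reliability(n, r_mean):
    """reliability = n * r / (1 + (n - 1) * r), with r the mean inter-item correlation."""
    return n * r_mean / (1 + (n - 1) * r_mean)

# Illustrative values only: 10 items with a mean inter-item correlation of .5.
example = reliability(10, 0.5)  # 5 / 5.5, approximately .91
```

With the mean inter-item correlation held fixed, the formula shows that reliability falls as `n` decreases, which is why a shortened item set needs to be checked against the full set.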

Study 1
The first step in the search for behaviours essential for the diagnosis of DSM-5 ASD was to explore the behaviours that best discriminated between individuals with ASD and those with other, non-ASD clinical diagnoses. As an item set intended for diagnosis should contain sufficient items to describe the full ASD profile, the subsequent step was to identify a minimum item set that fully represented the specified DSM-5 sub-domains across the two domains of social-communication impairments and restricted, repetitive behaviours and to test the predictive validity of this set in an independent sample. Items were selected based on their predictive validity. DISCO DSM-5 algorithm rules were then applied to the minimum item set in order to measure the predictive validity of the reduced item set as a whole, and to allow comparison with the published DISCO DSM-5 ASD diagnostic algorithm (Kent, Carrington et al., 2013).

Data analysis
Items were selected if they discriminated between children with a clinical diagnosis of Childhood Autism or Autistic Disorder compared with clinical controls at the p < .001 level, based on the chi-square statistic. Following the approach adopted in previous studies of the DSM-5 criteria (e.g., McPartland et al., 2012), all seven DISCO items assessing the presence of symptoms in early childhood (as in ICD-10) were included to provide a range of measures of early symptoms (age of onset). New thresholds were calculated for each of the sub-domains as described in Appendix 1. The reduced DISCO DSM-5 item set and thresholds were identified using Sample 1. The predictive validity of the algorithm when applied to the reduced item set was calculated relative to clinical diagnosis in both Sample 1 and the independent validation sample, Sample 2. The algorithm was tested on Sample 3 to establish sensitivity across different age and ability groups and across the combined ASD groups of children from Samples 1, 2 and 3 to compare sensitivity across gender. Finally, agreement between the proportion of individuals identified using the reduced item set and those identified using the full published DISCO DSM-5 ASD algorithm (Kent, Carrington et al., 2013) was assessed using McNemar's test (for example, Huerta et al., 2012).
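The comparison of paired outcomes from the reduced and full item sets can be sketched as follows. This is an assumption-laden illustration of McNemar's test in its chi-square form (df = 1); the source does not state whether a continuity correction was applied, so it is exposed here as an optional argument, and the outcomes are invented.

```python
import math

def mcnemar(full_positive, reduced_positive, correction=False):
    """McNemar's chi-square on paired binary outcomes for the same individuals.

    Only discordant pairs contribute: b = identified by the full set only,
    c = identified by the reduced set only.
    """
    pairs = list(zip(full_positive, reduced_positive))
    b = sum(1 for f, r in pairs if f and not r)
    c = sum(1 for f, r in pairs if r and not f)
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, so no evidence of a difference
    diff = abs(b - c) - (1 if correction else 0)
    chi2 = max(diff, 0) ** 2 / (b + c)
    # For df = 1 the chi-square survival function reduces to erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Invented outcomes for 10 individuals: the reduced set misses 4 cases the full set identifies.
full_outcome = [True] * 10
reduced_outcome = [True] * 6 + [False] * 4
chi2, p = mcnemar(full_outcome, reduced_outcome)
```

With small numbers of discordant pairs, an exact binomial version of the test is often preferred; the chi-square form is used here only because the results are reported as chi-square statistics.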
As a preliminary test of the validity of the reduced item set, mean inter-item correlations were calculated for each individual sub-domain, to determine whether the range of content coverage of the reduced item set was comparable to the original DISCO DSM-5 item set. The sub-domain relating to age of onset was not included in this analysis as it was not reduced. In addition, the reliability of the internal consistency of the reduced and full item sets was compared to investigate the overlap in the variance for full and revised sets.

Results and discussion
Fourteen items were identified that significantly discriminated between the ASD and non-ASD clinical control groups in Sample 1 at a stringent alpha level (p < .001). These items are marked (**) in Table 1. The three 'global' items previously identified in more than 90% of children, adolescents and adults in the research for the full DSM-5 algorithm ('lack of awareness of others' feelings', 'sharing interests limited or absent', 'lack of friendships with age peers'; Kent, Carrington et al., 2013) were included in this set. The set of 14 items, however, did not fully reflect the domains and sub-domains of behaviour for ASD described in DSM-5. The majority of items were from the social-communication domain (11/14), and more specifically, the socio-emotional reciprocity sub-domain (7/14). In order to better represent the full ASD profile, the threshold for inclusion was lowered (p < .05) to the same level used in other published studies (e.g., Liu et al., 2013; Rutter et al., 2003).
Forty-eight items were identified across the social-communication (A) and restricted, repetitive behaviour (B) domains of the DISCO DSM-5 algorithm, in addition to the seven items assessing symptoms in early childhood (Table 1). The internal consistency of the item set was excellent (alpha = .95) and inter-item correlations within each sub-domain in the algorithm revealed very little redundancy; 'does not give comfort to others' was highly correlated with 'no emotional response to age peers' (r = .80) and 'lack of descriptive gestures' with 'lack of instrumental gestures' (r = .71). As removal of any item reduced the internal consistency of the sub-domain (A1 and A2 respectively) and the overall algorithm, all items were retained.
For each sub-domain, there were at least four items that met the criterion for inclusion (p < .05) and the proportion of items from the sub-domains of the published full DISCO DSM-5 item set included in the reduced set varied between 46% (B1) and 90% (A1). Despite the decreased number of items included in the reduced set, the reliability of the full item set in Sample 1 (.95) and reduced item set in Sample 2 (.96) and Sample 3 (.92) were highly correlated with each other (.92 and .87 respectively), indicating clear overlap in the variance accounted for by the original and reduced forms. The mean inter-item correlation in the reduced set was typically increased relative to the full item set in all three samples (Table 2). Comparison of the full and reduced version of each sub-domain, however, revealed that this effect was only significant in one of the seven sub-domains in Sample 1 (A3) and two in both Sample 2 and Sample 3 (A2 and A3), indicating reasonably good coverage of the content (see Table 2 for details).
The sensitivity, specificity, and AUC of the algorithm run on the reduced item set relative to clinical diagnosis are reported in Tables 3 and 4. Chi-square analyses indicated that sensitivity of the algorithm was comparable across the age and ability sub-groups in Sample 3. Moreover, chi-square analyses revealed that sensitivity of the algorithm was comparable for male and female children in the ASD groups of Samples 1, 2 and 3. These results reflect excellent performance of the algorithm relative to clinical diagnosis and are comparable to results achieved with the original, full DISCO DSM-5 item set.
Direct comparison with outcome on the original full DISCO DSM-5 ASD item set revealed that the sensitivity and specificity of the reduced item set in Samples 1 and 2 were not significantly altered; however, in Sample 3, which included a wider age range and ability level, reduction of the item set significantly decreased sensitivity of the algorithm (χ²(1) = 8.03, p < .01). Post hoc analyses found that although sensitivity of the reduced item set was decreased in each age group in Sample 3, this was only significant for the group of children (χ²(1) = 4.05, p < .05); moreover, all of the children missed were in the higher ability group. A second set of post hoc analyses indicated that reduction of the item set did not significantly affect the sensitivity of the algorithm for the lower ability sub-group of Sample 3 (n = 70), but sensitivity was significantly decreased compared with the full item set for the higher ability group (n = 120; χ²(1) = 7.03, p < .01). These findings are consistent with concerns raised following the publication of the draft DSM-5 criteria that the new guidelines may lack sensitivity to individuals with higher functioning manifestations of ASD (McPartland et al., 2012). Given that reduced sensitivity for higher functioning manifestations of ASD was not apparent with the original DISCO DSM-5 algorithm, the finding in the current study supports the view that sufficiently detailed mapping of the DSM-5 descriptions is necessary to ensure sensitivity across age and ability. In Study 2, the search for items was therefore extended to examine whether including additional items from the original DISCO DSM-5 item set (Kent, Carrington et al., 2013) could improve sensitivity, particularly for this higher ability group, without reducing specificity.

Study 2
Study 2 included a subset of additional items taken from the full published DISCO DSM-5 item set in order to improve sensitivity. In the original study, results identified nine particular items that were significantly more frequent in the higher ability compared with lower ability groups. Three of these items were already included in the reduced item set of Study 1 ('insists on sameness in routines', 'talks about a repetitive theme', and 'repetitive activities related to special skills'). The additional six items ('communication is one-sided', 'interrupts conversations', 'anger towards parents', 'long-winded and pedantic speech', 'insistence on perfection', 'collects facts on specific subjects') previously identified in Kent, Carrington et al. (2013) were included in Study 2.

Results and discussion
The sensitivity, specificity, and AUC for the DSM-5 algorithm run on the revised item set relative to clinical diagnosis are reported in Tables 3 and 4. Sensitivity did not vary significantly across the age or ability sub-groups in Sample 3. In comparison with Study 1, the inclusion of the six additional items resulted in the incorrect classification of one additional individual in the control group in Sample 1. However, the specificity of the algorithm run on the revised item set was not significantly different to results for the published DISCO DSM-5 item set. The inclusion of additional items improved sensitivity in Samples 2 and 3 relative to Study 1 (sensitivity was at ceiling for Sample 1). In Sample 3, the inclusion of the six items improved sensitivity by identifying additional individuals in the higher ability sub-group. Moreover, sensitivity of the algorithm run on the revised item set was no longer significantly different to the sensitivity of the algorithm run on the full item set. The revised item set identified one additional female and four males across the three samples of children; however, chi-square analyses revealed that the sensitivity of the revised item set was comparable for males and females. As in Study 1, the reliability of the full item set in Sample 1 (.95) and revised item set in Sample 2 (.96) and Sample 3 (.91) were highly correlated with each other (.91 and .86 respectively), indicating clear overlap in the variance accounted for by the original and reduced forms. Finally, comparison of the mean inter-item correlations calculated for the full and revised item sets within each sub-domain revealed comparable results in all but one sub-domain in each sample (A3 in Sample 1, A2 in Samples 2 and 3). These results show improved content coverage compared with Study 1 (see Table 2).
Overall, these results indicate that consideration of the six additional items included in this study may be beneficial during diagnostic assessment of higher functioning individuals.

General discussion
The goal of the study was to search for 'essential' items for the diagnosis of DSM-5 ASD. The process of identifying items essential to the diagnosis of ASD is an important step in disentangling symptoms that are common across neurodevelopmental disorders from those more specifically associated with ASD. This point is particularly relevant given growing recognition of a high degree of comorbidity across the symptoms of neurodevelopmental disorder (e.g., Gillberg, 2010). Thus the identification of essential items for a diagnosis of ASD could contribute to the development of more streamlined diagnostic practice for straightforward cases, and be used with supplementary information for complex cases. However, the clinical and research utility of an algorithm based on identified essential items will need replication and further investigation in independent prospective samples.
Despite reducing the number of items included within each sub-domain, quantitative investigation of the range of content included within the sub-domains found that in the majority of cases, there was no significant effect of reducing the number of items. Study 1 showed that the 14 most highly discriminating items within the DISCO DSM-5 item set predominantly measured social-communication behaviours, particularly those related to socio-emotional reciprocity, suggesting that these behaviours may be essential for the diagnosis of DSM-5 ASD. However, restricted and repetitive behaviours represent the other domain of the DSM-5 dyad, and it is therefore also important to explore which of these behaviours contribute most to the diagnosis of DSM-5 ASD. A more inclusive selection criterion ensured a balanced representation of the DSM-5 ASD description, which was further refined in Study 2 to include items to better identify higher ability cases. The proportion of items from the full DISCO DSM-5 item set that were included in the reduced set in Study 1 varied across the four repetitive and restricted behaviour sub-domains (Table 2). While only 46% of items related to stereotyped or repetitive speech, motor movements, or use of objects met criterion for inclusion in the revised set, 70% of items relating to sensory symptoms were included. The formal recognition of sensory sensitivities in ASD is one of the most marked changes in DSM-5 relative to DSM-IV-TR/ICD-10 and reflects a growing research literature highlighting differences in sensory processing in ASD (Baranek et al., 2013; Ben-Sasson et al., 2009). These findings support the inclusion of sensory sensitivities in the DSM-5 description of ASD and suggest they may play a central role in distinguishing ASD from other clinical conditions. However, no one behaviour or category of behaviours is diagnostic of ASD, and instead it is the pattern or profile of symptoms that defines the condition.
Thus, while social-communication behaviours and sensory sensitivities may be the most discriminating at an individual item level, and could therefore be considered essential to the diagnosis of DSM-5 ASD, in this paper we have identified discriminating items associated with each DSM-5 sub-domain that, when used in combination, may assist clinicians and researchers in obtaining a more efficient, focused developmental history as part of the ASD diagnostic process.
The reduced sensitivity in Study 1 to individuals with ASD with higher ability initially appears consistent with concerns that DSM-5 may underdiagnose individuals with higher functioning ASD (e.g., McPartland et al., 2012). The reduced sensitivity in this study, however, was likely a function of the way in which items were selected for inclusion in the reduced set and therefore reflects a limitation of the item set identified in Study 1. In the original publication of the DISCO DSM-5 algorithm, it was argued that the sensitivity of the algorithm across age and ability was dependent on the inclusion of items more specifically associated with individuals of higher ability, in addition to 'global' items relevant across the autism spectrum. However, items more specifically associated with individuals with higher ability were endorsed by a relatively small proportion of the whole ASD sample, and were therefore less likely to differentiate between the ASD and clinical comparison groups. Indeed, only three such items originally identified by Kent, Carrington et al. (2013) were included in the reduced item set in Study 1. Although the inclusion of the additional six items in Study 2 improved sensitivity for higher ability individuals with ASD such that sensitivity of the revised item set was comparable across ability level, these results do highlight a vulnerability of the new DSM-5 criteria. While previous studies proposed modification to the DSM-5 rules to improve the sensitivity of the criteria (e.g., Huerta et al., 2012; Mayes et al., 2013), the current studies suggest that sufficiently detailed mapping of the DSM-5 descriptions, and particularly of behaviours more common among individuals with higher ability and higher language levels, is essential if the criteria are to accurately identify individuals with ASD across the entire spectrum.
Essential items were identified based on their predictive validity, calculated using chi-square analysis of individuals with ASD compared with individuals with non-ASD clinical diagnoses. This model was based on the approach used in the development of the SCQ and screening tools currently being used in other areas of health and medicine. A potential limitation of this approach is the relatively small size of Sample 1 (the development sample), which may have affected the accuracy and precision with which items met the inclusion criteria (p < .05). An alternative statistical approach to the selection of items for inclusion in the reduced item set is item response theory (IRT), a technique that has been successfully used both to abbreviate and test the psychometric properties of instruments within educational and psychiatric settings. IRT involves estimating the relative effectiveness of individual items to assess a dimensional trait, such as scores on a test of mental arithmetic or a personality trait such as anxiety. More specifically, the analysis estimates how much discrimination each item offers across individual differences on the entire continuum of the dimensional trait being measured. Although this approach may be appropriate for the abbreviation of assessments such as the Autism Spectrum Quotient (AQ; Baron-Cohen et al., 2001), in which a summative score is calculated to estimate symptom severity, this approach could not be adopted in the current study due to the nature of the algorithm. Outcome on the DISCO DSM-5 algorithm is not dependent only on the number of items on which an individual scores; the algorithm rules also specify a particular pattern of symptoms, namely symptoms in all three of the social-communication sub-domains and at least two of the four repetitive behaviour sub-domains.
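The contrast between a summative score and a pattern-based rule can be illustrated with a minimal Python sketch. The sub-domain labels B1 to B4, the function names, and the threshold of one endorsed item per sub-domain are assumptions made for illustration only; the published DISCO DSM-5 algorithm applies its own sub-domain thresholds, which are not reproduced here.

```python
# Sub-domain labels: A1-A3 are the social-communication sub-domains;
# B1-B4 are assumed labels for the four restricted and repetitive
# behaviour sub-domains.
SOCIAL_SUBDOMAINS = ("A1", "A2", "A3")
RRB_SUBDOMAINS = ("B1", "B2", "B3", "B4")

# Assumption for illustration: one endorsed item is enough for a
# sub-domain to count as met. The real algorithm's thresholds may differ.
ITEMS_REQUIRED = 1


def meets_dsm5_pattern(endorsed_items):
    """Pattern rule: all three social-communication sub-domains
    and at least two of the four repetitive behaviour sub-domains."""
    social_met = all(
        endorsed_items.get(s, 0) >= ITEMS_REQUIRED for s in SOCIAL_SUBDOMAINS
    )
    rrb_met = sum(
        1 for s in RRB_SUBDOMAINS if endorsed_items.get(s, 0) >= ITEMS_REQUIRED
    )
    return social_met and rrb_met >= 2


def summative_score(endorsed_items):
    """Total number of endorsed items, ignoring their distribution."""
    return sum(endorsed_items.values())


# Two hypothetical profiles (number of endorsed items per sub-domain).
profile_a = {"A1": 2, "A2": 2, "A3": 0, "B1": 1, "B2": 1}
profile_b = {"A1": 1, "A2": 1, "A3": 1, "B1": 1, "B2": 2}
```

Both hypothetical profiles have the same summative score, yet only `profile_b` satisfies the pattern rule, because `profile_a` has no endorsed items in A3. This is the sense in which outcome on a rule-based algorithm cannot be captured by a single dimensional score.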
An important limitation of the study is that all discrimination analyses reported here were based on samples selected with an ASD (Childhood Autism or Autistic Disorder) or non-ASD clinical diagnosis. Although this approach is traditionally adopted for the development of diagnostic tools (e.g., ADI-R; Lord et al., 1994), the focus on relatively 'classical' presentations of autism may have inflated the sensitivity of the item sets. Although Sample 3 included a range of age and ability, the size of the samples in this study was relatively limited. The utility of the proposed measure (in this case, a set of items for an abbreviated framework for an ASD developmental history interview) will therefore need to be evaluated and replicated in both community-based settings and well characterised research samples. More specifically, prospective studies are required in which clinicians generate diagnoses based on the DSM-5 criteria and also using the abbreviated DISCO DSM-5 criteria, so that outcome on the two measures can be compared. Future work should include a broader range of ASD cases, including individuals with a DSM-IV-TR/ICD-10 diagnosis of Asperger Syndrome, as well as individuals across the age span. Further investigation of the sensitivity of DSM-5 to females will also be important as current descriptions of ASD from which the item set was derived are biased towards the traditionally 'male' presentation of the condition. The ASD groups included in Samples 1 and 2 in the current study included significantly more males than the clinical control groups, a pattern that has been reported in previous studies of the DSM-5 criteria and likely reflects the male-to-female ratio of ASD (e.g., Frazier et al., 2012). It is therefore possible that the good predictive validity reported here was partially attributable to differential sensitivity of the item sets for males and females.
Although previous studies have reported comparable sensitivity for both males and females (e.g., Huerta et al., 2012), and the exploratory analyses of these datasets are consistent with this, it will be important to further explore the sensitivity of the DSM-5 criteria for females as our understanding of the female profile grows. Finally, it will also be important to draw comparisons with a more varied clinical comparison group, including individuals with diagnoses such as ADHD and disruptive behaviour disorders, in order to fully investigate the diagnostic potential of these item sets.
The latter point is relevant to a wider issue about the application of diagnostic algorithms to produce a categorical outcome of 'ASD or not ASD'. Although such binary outcomes may assist in diagnostic decisions, the outcome alone does not provide a clinical description of an individual's profile and should always be considered in conjunction with the other components of a multidisciplinary (often multiagency) clinical assessment. Moreover, one of the primary strengths of the DISCO and other detailed developmental history interviews is the wealth of information that can be acquired, which can be invaluable in addressing the strengths and difficulties of an individual, identifying any other relevant co-morbidities and facilitating access to appropriate support services for the individual and their family. As mentioned in the introduction, the wide range of items was considered to be an advantage in the development of the original DISCO DSM-5 algorithm, in that it enabled detailed mapping of the DSM-5 criteria for ASD. Although the revised item set maintained good levels of sensitivity and specificity, the use of reduced item sets limits the information available to clinicians, including a more limited range of behaviours associated with each sub-domain of symptoms described in DSM-5. Thus, despite a significant reduction in the time taken to administer such a reduced item interview, this advantage should be balanced against the potential cost in terms of the reduced breadth of information gained.

Conclusion
This study highlights items essential for the diagnosis of DSM-5 ASD based on analysis of the DISCO DSM-5 algorithm. The results highlight that social-communication behaviours highly discriminate between ASD and other, non-ASD clinical diagnoses. Moreover, items measuring sensory behaviours were among the most highly discriminating items in the restricted and repetitive behaviour domains. While most items in the reduced item set were relevant across age and ability, Study 2 highlighted that consideration of a few additional items (revised item set) may be relevant for the diagnosis of higher functioning individuals. The good psychometric properties of the reported item sets suggest that the search for items essential for the diagnosis of DSM-5 ASD may have identified item sets that are potentially of use for clinicians and researchers in the development of efficient and focused ASD diagnostic processes. Further work involving existing ASD diagnostic tools (including the ADI-R, DISCO and 3di) will be required to further validate the clinical and research use of these item sets.