Mapping established psychopathology scales onto the Hierarchical Taxonomy of Psychopathology (HiTOP)

The Hierarchical Taxonomy of Psychopathology (HiTOP) organizes phenotypes of mental disorder based on empirical covariation, offering a comprehensive organizational framework from narrow symptoms to broader patterns of psychopathology. We argue that established self-report measures of psychopathology from the pre-HiTOP era should be systematically integrated into HiTOP to foster cumulative research and further the understanding of psychopathology structure. Hence, in this study, we mapped 92 established psychopathology (sub)scales onto the current HiTOP working model using data from an extensive battery of self-report assessments that was completed by community participants and outpatients (


INTRODUCTION
Traditional psychopathology taxonomies (e.g., diagnostic categories as suggested in the Diagnostic and Statistical Manual of Mental Disorders; American Psychiatric Association [APA], 2013) have severe drawbacks such as artificial comorbidities, arbitrary diagnostic cutoffs, diagnostic instability, and phenotypic heterogeneity that limit their usefulness and practical applicability (e.g., Kotov et al., 2018; Krueger et al., 2018). The Hierarchical Taxonomy of Psychopathology (HiTOP) provides an alternative classification system of signs and symptoms of mental disorder as well as maladaptive traits that is built on factor analytic studies of empirical covariation (Kotov et al., 2017). By this means, HiTOP aims to provide a more efficient as well as fine-grained diagnostic conceptualization of mental health problems, following a dimensional rather than a categorical approach to classification (e.g., Markon et al., 2011). Initial results emphasize its potential for providing a better understanding of the nature, scope, and etiology of mental disorders (e.g., Kotov et al., 2020; Krueger et al., 2021; Waszczuk et al., 2020; Watson, Levin-Aspenson, et al., 2022; but see Haeffel et al., 2021), thereby revitalizing clinical psychology research and practice (e.g., Conway et al., 2019; Hopwood et al., 2020; Ruggero et al., 2019).
The hallmark feature of HiTOP is to provide a framework for considering the full breadth and depth of psychopathology in a hierarchical order. HiTOP thus enables the specificity and generality of mental health problems to be located at multiple levels of granularity (e.g., Conway et al., 2019). A general factor of psychopathology, the p factor, at the apex of the hierarchy embodies sizeable covariance between most, if not all, indicators of psychological distress (e.g., Caspi & Moffitt, 2018; Constantinou & Fonagy, 2019). The p factor is longitudinally stable, moderately heritable, and associated with important functional outcomes (for a review, see Lahey et al., 2021). Yet the question of whether the p factor phenomenon is a mere statistical abstraction (e.g., Fried, 2020; Levin-Aspenson et al., 2021; Watts et al., 2020), captures a substantive construct (e.g., Lahey et al., 2021), or is a mix of both (e.g., Watts et al., 2022) remains controversial. First, as per interpretations of the p factor as a common cause, it has been suggested that it reflects general liabilities towards psychopathology such as compromised brain function (e.g., Caspi et al., 2020), impairments in self- and interpersonal functioning (Widiger et al., 2019), or, more specifically, impairments in social learning in terms of problems with mentalizing and epistemic trust (e.g., Fonagy et al., 2021; Fonagy & Campbell, 2021). Other researchers have pointed out that the p factor can also be explained, at least in part, by method-specific causes, such as evaluative biases (e.g., Leising et al., 2020; Pettersson et al., 2014; Smith et al., 2020). Second, some researchers think of the p factor as an index of symptomatic distress (Fried et al., 2021) or an index of impairment (McCabe et al., 2022) that is secondary to the disorders themselves. Third, network theorists consider that the p factor may not reflect common causes but rather direct causal paths between mutually reinforcing symptoms (e.g., Borsboom & Cramer, 2013; Bringmann et al., 2021). One level below the p factor are the superspectra of emotional dysfunction, psychosis, and externalizing (e.g., Kotov et al., 2020; Krueger et al., 2021; Watson, Levin-Aspenson, et al., 2022). Beneath these, the current working model of HiTOP specifies six dimensions on an intermediate level of the hierarchy: the spectra of internalizing, antagonistic externalizing (hereinafter simply referred to as antagonism), disinhibited externalizing (disinhibition), thought disorder, detachment, and somatoform (Kotov et al., 2017). Lower levels of the hierarchy may consist of finer grained symptom clusters, but fewer studies have investigated their number, nature, or structure (e.g., Cicero et al., 2022; Forbes et al., 2021; Mullins-Sweatt et al., 2022; Sellbom et al., 2022; Watson, Forbes, et al., 2022; Zimmermann et al., 2022).
HiTOP thus offers a comprehensive taxonomy that allows a multidimensional classification of mental health problems and is becoming an increasingly influential alternative to traditional categorical nosologies (Kotov et al., 2021). Conversely, this also means that existing measures of psychopathology that have been used in clinical research and practice for a long time (i.e., in the pre-HiTOP era) should also be mapped to this taxonomy. However, few studies have attempted to locate established self-report questionnaires of psychopathology within the HiTOP model (e.g., Sellbom et al., 2020, 2021; Wright & Simms, 2015). Established psychopathology scales tend to follow more traditional clinical conceptualizations that are often tied to specific diagnostic concepts (e.g., symptoms of depression as measured by the Brief Symptom Inventory [BSI], Derogatis & Spencer, 1993) or are more narrowly circumscribed (e.g., dissociation as measured by the Dissociative Experiences Scale [DES], Bernstein & Putnam, 1986). Self-report measures like these are generally used to provide an economic assessment of psychopathologies for clinical research or to screen for mental disorders in clinical practice. To date, the plethora of tests have been studied separately and often have unclear conceptual boundaries (Fried, 2017), thus lacking integration into an overarching conceptualization of psychopathology, which is offered by HiTOP. With accumulating evidence indicating the relevance and utility of higher-level HiTOP dimensions (e.g., Kotov et al., 2020; Krueger et al., 2021; Smith et al., 2020; Watson, Levin-Aspenson, et al., 2022), it is likely that these dimensions overlap with the more unique information that is conveyed by established psychopathology scales. Established scales therefore need to be re-examined regarding their distinctiveness above and beyond these higher-level dimensions (see Müller et al., 2022, for an example).
This issue is also relevant when investigating nomological networks. Indeed, it is well documented that many risk variables (e.g., adverse childhood experiences and low cognitive functioning) and outcome variables (e.g., self-harm and incarceration) are similarly related to psychopathology (e.g., Kotov et al., 2020; Krueger et al., 2021; Smith et al., 2020; Watson, Levin-Aspenson, et al., 2022). Statistical associations may thus not be specific to the construct that a test purports to measure. For example, a significant association between concurrent symptoms of dissociation and past childhood experiences (e.g., as reported in Van IJzendoorn & Schuengel, 1996) could just as well be ascribed to the statistical influence of higher-level dimensions, so that such associations likely generalize to many other psychopathological phenomena (e.g., Lahey et al., 2021). Importantly, given that everything is somehow related to everything else (also: crud factor; for a recent discussion, see Orben & Lakens, 2020) and given that this seems to be particularly true for psychopathology constructs, deeper insights can only be obtained when focusing on the magnitude and specificity of effects. Thus, it may only be feasible to determine whether these associations are truly unique if higher-level dimensions are assessed with high fidelity and accounted for statistically.
The aforementioned issues raise questions about isolated interpretations of traditional self-report measures of psychopathology. Given that HiTOP can provide a comprehensive taxonomy to organize psychopathology constructs in an integrated and connected manner, we argue that established psychopathology scales could also be mapped to HiTOP, as has been recently done, for example, for the Minnesota Multiphasic Personality Inventory-3 (Sellbom et al., 2021). In this way, it is possible to examine the measured constructs more thoroughly, to highlight issues of discriminant validity and specificity (Conway et al., 2019; Stanton et al., 2020), to expose similarities and differences between measures, thereby identifying and preventing jingle-jangle fallacies (e.g., Kelley, 1927; Lawson & Robins, 2021), and to foster cumulative research integration. In a similar vein, personality researchers renewed their call for a (more stringent) use of Big Five dimensions to provide a unifying framework for organizing psychological trait dimensions (Bainbridge et al., 2022). Furthermore, existing research on the structure of psychopathology that has informed the current HiTOP working model is subject to some limitations, as has been pointed out repeatedly (e.g., Kotov et al., 2017). Among these are the reliance on (a) diagnostic categories (e.g., Ringwald et al., 2021) that neglect symptom-level information or on (b) single symptom indicators (e.g., Forbes et al., 2021) that preclude the detection of multidimensionality at lower levels of the hierarchy (e.g., Bollen & Lennox, 1991; Watts et al., 2021). Such considerations have led the HiTOP consortium to start developing a HiTOP measure (Simms et al., 2022) that holds promise in addressing these limitations by realizing a psychometrically optimized multiple indicator assessment. Given that paradigm shifts are implemented only slowly, established psychopathology scales will continue to play an important role in spurring additional insights into psychopathology and its structure, due also to their diversity in terms of relying on different clinical conceptualizations and traditions.
Using clinical and community data from an extensive assessment of self-reported psychopathology (i.e., 685 items) including measures of self- and interpersonal functioning, we aimed to map 92 established psychopathology scales from 21 questionnaires onto the current HiTOP working model (Kotov et al., 2017). In a content-based approach to assess HiTOP spectra, we selected indicators from the item pool using expert ratings of item content (e.g., see Colquitt et al., 2019) in a first step, further purified this selection by factor analysis to ensure (essential) unidimensionality in a second step, and realized measurement models in terms of a bifactor-(S-1) model and a correlated factors model in a third step. To test the convergent and discriminant validity of our newly derived HiTOP scales, we evaluated associations with personality disorder (PD) diagnoses following the Structured Clinical Interview for DSM-IV Axis 2 Disorders (SCID-II). Clinical diagnoses as assessed by standardized interviews provide a useful validation criterion because (1) they are heteromethod and (2) meta-analytic findings of the associations between HiTOP spectra and DSM diagnostic categories are available (Ringwald et al., 2021). For our main analysis, we estimated the extent to which established scales reflect higher-level HiTOP dimensions (i.e., p factor and HiTOP spectra) to shed more light on which established scales are pure markers or reflect blends of HiTOP dimensions. To this end, we applied bifactor modeling (e.g., Eid et al., 2017) to model the p factor and HiTOP spectra jointly, reflecting the two most prominent upper levels of the psychopathology hierarchy. By mapping many established scales onto HiTOP, we aimed to gain additional insights for understanding psychopathology structure and its measurement.

Samples
Participants were recruited in Greater London via the Personality and Mood Disorder Research Consortium. The sample consisted of 260 healthy community participants and 649 outpatients (N = 909; 66% female; mean age = 30.7, SD = 10.4, range = 16-65) from National Health Service Improving Access to Psychological Therapies (NHS IAPT) services for Mood Disorders and secondary or tertiary specialist services for PDs referred from National Health Service specialist PD clinical services. In the total sample, a large number of participants met diagnostic criteria for current Borderline PD (59%), Paranoid PD (27%), Antisocial PD (23%), Narcissistic PD (4%), Schizotypal PD (4%), and Histrionic PD (1%) according to the DSM-5 (APA, 2013). Participants with PDs were oversampled because referrals primarily comprised patients with severe mental health problems who are considered too complex for standard care due to multimorbidity or risk to self or others. The data have been previously used to study various research questions distinct from the current research (Euler et al., 2019; Huang et al., 2020; Rifkin-Zybutz et al., 2021; Wendt et al., 2019).

Established psychopathology scales
Participants completed a battery of established self-report questionnaires, indicating their agreement to statements about themselves on rating scales. We included a plethora of measures designed to assess current or persistent signs, symptoms, and characteristic traits of mental disorders, including maladaptive personality traits and measures of personality functioning (see DeYoung et al., 2020, for how maladaptive personality traits are linked to HiTOP). The measures were the Autonomous Functioning Index (AFI; Weinstein et al., 2012), Antisocial Process Screening Device (APSD; Frick & Hare, 2001), Barratt Impulsiveness Scale (BIS-11; Patton et al., 1995), Brief Symptom Inventory (BSI; Derogatis & Spencer, 1993), Drugs, Alcohol, and Self-Injury Questionnaire (DASI; Wilkinson et al., 2018), Difficulties in Emotion Regulation Scale (DERS; Gratz & Roemer, 2004), Dissociative Experiences Scale (DES; Bernstein & Putnam, 1986), Experiences in Close Relationships-Revised (ECR-R; Fraley et al., 2000), Empathy Quotient (EQ; Baron-Cohen & Wheelwright, 2004), Green et al. Paranoid Thoughts Scale (GPTS; Green et al., 2008), Inventory of Interpersonal Problems (IIP-32; Horowitz et al., 2000), Life History of Aggression (LHA; Coccaro et al., 1997), Other as Shamer Scale (OAS; Goss et al., 1994) Blanchard et al., 1996), Reflective Functioning Questionnaire - Extended 18-Item Version (RFQ-18; Rogoff et al., 2021), Standardized Assessment of Personality: Abbreviated Scale (SAPAS; Moran et al., 2003), Schizotypal Personality Questionnaire (SPQ; Raine, 1991), and Levenson Self-Report Psychopathy Scale (SRPS; Levenson et al., 1995). For this study, scales were inverted when necessary, so that higher values pointed towards the maladaptive pole of a trait dimension, indicating greater severity or impairment in the respective domain. For more detailed information including the number of items and scales, response categories, and internal consistency estimates, see Table S1. The included questionnaires comprise 92 (presumably unidimensional) scales. To use scale scores in the subsequent latent variable analysis, we tested unidimensional measurement models (Little et al., 2013), except for DASI Drugs and alcohol, for which we relied on a formative measurement model and used the manifest sum score. We report fit statistics of the unidimensional models for scale scores in Table S2.

Structured Clinical Interview for DSM-IV Axis 2 Disorders (SCID-II)
To assess current symptoms of PD according to DSM-IV (i.e., Paranoid PD, Schizoid PD, Schizotypal PD, Antisocial PD, Borderline PD, Histrionic PD, Narcissistic PD, Avoidant PD, Dependent PD, and Obsessive-compulsive PD), structured interviews were conducted using the SCID-II (First & Gibbon, 2004). The interviews were administered by mental health professionals. For this study, we considered symptom counts for each PD diagnosis, that is, the number of endorsed symptoms in each diagnostic category.

Content-based HiTOP scales
Drawing from the item content of the questionnaires described above, we derived a measurement of HiTOP spectra (i.e., internalizing, antagonism, disinhibition, detachment, thought disorder, and somatoform) and the p factor. We assumed that the item pool provides a sufficiently broad representation of higher-level psychopathology dimensions (i.e., 685 items in total). HiTOP dimensions are commonly identified in a data-driven way using factor analytic methods. However, in this study, we relied on a content-based approach with expert ratings of the item pool (e.g., Colquitt et al., 2019) in a first step and factor analysis to ensure (essentially) unidimensional scales in a second step. In a third step, we realized a bifactor-(S-1) model and a correlated factors model that were used for estimating the associations between HiTOP dimensions and other variables in the main statistical analyses.
Step 1: Expert ratings
Eight raters (i.e., three of the authors and five trained psychology undergraduate students) were presented with all items in randomized order and were asked to evaluate to what extent items are characteristic of each of the HiTOP spectra. The raters familiarized themselves with the original HiTOP publication (Kotov et al., 2017), and it was ensured that raters were knowledgeable about the common definitions of the signs, symptoms, and characteristic traits of mental health problems that are considered by the model. Ratings were provided on a 3-point scale (0 = not characteristic, 1 = possibly characteristic, 2 = definitely characteristic). The interrater reliability for the average of eight judges was acceptable, with intraclass correlations (ICC[2, 8]; Fleiss & Shrout, 1978) ranging between 0.84 (internalizing) and 0.92 (thought disorder). Items were deemed to be characteristic when the mean rating was >1.2 for one spectrum and <0.8 for all other spectra. Overall, we retained a large number of indicators that were evaluated as indicative of HiTOP spectra. However, due to insufficient representation of the somatoform spectrum (i.e., only four items were selected by raters), it was not included in subsequent analyses.
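The selection rule described above can be illustrated with a minimal sketch. It assumes that each item comes with one list of rater scores per spectrum; the function name, the data layout, and the example ratings are hypothetical and are not taken from the authors' materials.

```python
from statistics import mean

def assign_spectrum(ratings, high=1.2, low=0.8):
    """Return the spectrum an item is characteristic of, or None.

    ratings: dict mapping spectrum name -> list of rater scores (0, 1, or 2).
    An item is assigned only if exactly one spectrum has a mean rating
    above `high` and every other spectrum has a mean rating below `low`.
    """
    means = {spectrum: mean(scores) for spectrum, scores in ratings.items()}
    hits = [spectrum for spectrum, m in means.items() if m > high]
    if len(hits) != 1:
        return None
    target = hits[0]
    if all(m < low for spectrum, m in means.items() if spectrum != target):
        return target
    return None

# Hypothetical ratings from eight raters (not real study data).
item_a = {
    "internalizing": [2, 2, 2, 1, 2, 2, 1, 2],   # mean 1.75
    "detachment":    [0, 1, 0, 0, 1, 0, 0, 0],   # mean 0.25
    "antagonism":    [0, 0, 0, 0, 0, 0, 0, 0],   # mean 0.00
}
item_b = {
    "internalizing": [2, 2, 1, 2, 2, 1, 2, 2],   # mean 1.75
    "detachment":    [1, 1, 1, 1, 1, 0, 1, 1],   # mean 0.875, too high
    "antagonism":    [0, 0, 0, 0, 0, 0, 0, 0],
}

print(assign_spectrum(item_a))
print(assign_spectrum(item_b))
```

In this sketch, the second item is dropped because a secondary spectrum is rated as possibly characteristic by most raters, which is the kind of ambiguous content the two-sided cutoff is meant to exclude.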
Step 2: Exploratory factor analysis
In the next step, we ensured that the selected items of each spectrum loaded on a common general factor. To this end, we conducted exploratory factor analysis (EFA) on the item pools derived in the previous step. The number of factors was determined based on model fit and the emergence of well-defined factors using geomin rotation.¹ To extract a common general factor, we used orthogonal bifactor rotation (Mansolf & Reise, 2016) and discarded items with weak loadings (<0.40) on the general factor of each spectrum. The final number of retained indicators was 76 for internalizing (with 3 items removed due to weak loadings on the general factor), 37 for antagonism (16 items removed), 35 for disinhibition (9 items removed), 39 for detachment (3 items removed), and 49 for thought disorder (3 items removed). These items represented the content-based HiTOP scales that formed the basis for the measurement models (i.e., bifactor-(S-1) model and correlated factors model).
Step 3: Bifactor-(S-1) and correlated factors model
Two measurement models were realized to operationalize HiTOP dimensions in a latent variable framework. On the one hand, we used the correlated factors model to estimate associations between HiTOP spectra and PD diagnoses because this facilitates comparison with the results reported by Ringwald et al. (2021). On the other hand, we used bifactor modeling (e.g., Rodriguez et al., 2016) to separate and jointly consider two levels of the HiTOP hierarchy (in terms of the p factor and HiTOP spectra) when mapping established psychopathology scales onto HiTOP. Bifactor modeling is useful for studying external relations of hierarchical constructs (e.g., Bornovalova et al., 2020) because the variance of the indicators can be clearly partitioned into variance common to all indicators (i.e., modeled by a general factor; here: p factor), variance specific to a set of indicators in a given content domain (i.e., modeled by specific factors; here: HiTOP spectra, which are orthogonal to the general trait), and variance not explained by latent factors (i.e., modeled as indicator-specific residual variances). To date, studies have mostly used traditional bifactor models to operationalize HiTOP spectra and the p factor (e.g., Forbes et al., 2021; Lahey et al., 2021). However, traditional bifactor models are prone to estimation problems, such as vanishing specific factors, Heywood cases, or other implausible estimates (Eid et al., 2017). We thus implemented an orthogonal bifactor-(S-1) model (Eid et al., 2017) that has particularly beneficial properties for studying the external relations of hierarchical constructs (Moshagen, 2021; Zhang et al., 2021). In this model, one specific factor is removed so that items of that factor only load on the general factor and thus serve as a reference domain for the general factor.
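The variance partition described above can be written compactly. The following is a sketch in our own notation (the symbols are assumptions introduced for illustration and do not appear in the original text), for indicator i of spectrum j, with the general factor p and the specific factor S_j standardized and mutually orthogonal:

```latex
Y_{ij} = \lambda^{(p)}_{ij}\, p + \lambda^{(s)}_{ij}\, S_j + \varepsilon_{ij},
\qquad
\operatorname{Var}(Y_{ij})
  = \bigl(\lambda^{(p)}_{ij}\bigr)^{2}
  + \bigl(\lambda^{(s)}_{ij}\bigr)^{2}
  + \theta_{ij},
\qquad
\operatorname{Cov}(p, S_j) = 0 .
```

Here \theta_{ij} denotes the residual variance of the indicator. For indicators of the reference domain in a bifactor-(S-1) model, the specific loading \lambda^{(s)}_{ij} is fixed to zero, so their common variance is attributed entirely to the general factor.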
With respect to the indicators for realizing the measurement models, we relied on a homogeneous parceling approach (i.e., opting for item-to-construct balance; Little et al., 2002) to minimize undesirable sources of multidimensionality (e.g., Little et al., 2013; Rhemtulla, 2016). Considering that the included questionnaires differ in the number of response categories, we rescaled item responses to range from 0 to 100 before creating the parcels (i.e., percent of maximum possible; Cohen et al., 1999). We created three parcels for each HiTOP spectrum using the items that were retained in the previous step. For the correlated factors model, the parcels of each HiTOP spectrum loaded on a corresponding factor. For the bifactor-(S-1) model, one quarter of the items in each HiTOP spectrum were withheld to create statistically independent parcels for the p factor because, as mentioned earlier, separate indicators are needed that load exclusively on the general factor but not on any specific factor. The remaining items were used to create parcels that loaded on both the general factor and a corresponding specific factor. By using p factor parcels that aggregate across HiTOP spectra, a shortcoming of bifactor-(S-1) models can be circumvented (i.e., equating the p factor with one of the HiTOP spectra) while still facilitating model estimation. Indeed, our model aligns with an operational definition of the p factor in which the p factor simply reflects sum scores of psychopathology indicators (e.g., Fried et al., 2021). The factors of our parcel-based bifactor-(S-1) model have a clear meaning in a descriptive sense: Whereas the p factor reflects the total symptomatic distress (irrespective of content), the specific factors indicate whether symptoms in a HiTOP spectrum are relatively more pronounced or less pronounced than what would be expected given the standing on the p factor.
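The rescaling and parceling steps can be sketched as follows. This is an illustration under stated assumptions: the percent-of-maximum-possible formula is the standard one, whereas the serpentine allocation is one common way to realize item-to-construct balance and is not necessarily the authors' exact procedure (their R code is available at the OSF repository). All item names, loadings, and responses below are hypothetical.

```python
from statistics import mean

def pomp(score, minimum, maximum):
    """Percent of maximum possible: map a response onto a 0-100 metric."""
    return 100.0 * (score - minimum) / (maximum - minimum)

def balanced_parcels(loadings, n_parcels=3):
    """Allocate items to parcels by item-to-construct balance.

    loadings: dict mapping item name -> loading on the general factor.
    Items are sorted by loading (descending) and dealt out in serpentine
    order (1-2-3, 3-2-1, ...), so that parcels have similar mean loadings.
    """
    ordered = sorted(loadings, key=loadings.get, reverse=True)
    parcels = [[] for _ in range(n_parcels)]
    for i, item in enumerate(ordered):
        round_no, pos = divmod(i, n_parcels)
        idx = pos if round_no % 2 == 0 else n_parcels - 1 - pos
        parcels[idx].append(item)
    return parcels

def parcel_score(responses, ranges, parcel):
    """Mean POMP score of one participant across the items of a parcel."""
    return mean(pomp(responses[item], *ranges[item]) for item in parcel)

# Hypothetical loadings for six items of one spectrum.
loadings = {"i1": 0.80, "i2": 0.75, "i3": 0.70,
            "i4": 0.65, "i5": 0.60, "i6": 0.55}
parcels = balanced_parcels(loadings)
print(parcels)  # [['i1', 'i6'], ['i2', 'i5'], ['i3', 'i4']]

# Items i1 and i6 use different response formats (1-5 vs. 0-4).
responses = {"i1": 3, "i6": 4}
ranges = {"i1": (1, 5), "i6": (0, 4)}
print(parcel_score(responses, ranges, parcels[0]))  # 75.0
```

Rescaling before averaging matters here because a raw score of 3 on a 1-5 item and a raw score of 4 on a 0-4 item would otherwise contribute on incomparable metrics.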
Fit indices for the correlated factors model (see Figure S1) were acceptable, scaled χ²(160) = 978.6, Comparative Fit Index (CFI) = 0.96, Root Mean Square Error of Approximation (RMSEA) = 0.08, Standardized Root Mean Square Residual (SRMR) = 0.05. Factor loadings ranged from 0.81 to 0.97 and factor correlations ranged from 0.43 to 0.78. The bifactor-(S-1) model (see Figure S2) had acceptable fit, scaled χ²(120) = 763.1, CFI = 0.96, RMSEA = 0.07, SRMR = 0.05. The factor loadings of the bifactor-(S-1) model were all positive and model parameters were plausible. Factor loadings on the p factor were highest for indicators of the p factor (ranging from 0.92 to 0.96), followed by internalizing (from 0.87 to 0.89), thought disorder (from 0.76 to 0.85), disinhibition (from 0.71 to 0.76), detachment (from 0.62 to 0.70), and antagonism (from 0.43 to 0.64). The size of factor loadings on the specific factors was in opposite order, with antagonism indicators having the strongest loadings on their corresponding specific factor (from 0.58 to 0.75), followed by indicators of detachment (from 0.52 to 0.63), disinhibition (from 0.43 to 0.62), thought disorder (from 0.29 to 0.51), and internalizing (from 0.37 to 0.39). This shows that, in the current study, the p factor was most strongly indicated by internalizing content and less so by antagonism content.

Convergent and discriminant validity of content-based HiTOP scales
To test the convergent and discriminant validity of our newly derived content-based HiTOP scales (using the correlated factors model), we evaluated their associations with PD diagnoses by comparing them against meta-analytic estimates as reported in Ringwald et al. (2021). Specifically, if our content-based HiTOP scales offered an adequate approximation of HiTOP spectra, the correlation patterns to PD diagnoses in the current study should be similar to the meta-analytically derived factor loading patterns of PD diagnoses. Despite methodological differences between the two approaches, they can be compared because they address the same question (i.e., statistical association between PD diagnoses and HiTOP spectra).² Ringwald et al. (2021) regarded PD diagnoses to be markers of HiTOP spectra when the absolute value of a factor loading was 0.30 or larger. They reported PD diagnoses to be markers of internalizing (i.e., Avoidant PD and Borderline PD), antagonism (i.e., Antisocial PD, Borderline PD, Histrionic PD, Narcissistic PD, Obsessive-compulsive PD, and Paranoid PD), disinhibition (i.e., Antisocial PD), thought disorder (i.e., Paranoid PD, Schizotypal PD, and Schizoid PD), and detachment (i.e., Avoidant PD, Obsessive-compulsive PD, Schizotypal PD, and low Histrionic PD). A schematic model of this analysis is depicted in Figure S3.

Mapping established psychopathology scales onto HiTOP
To map established scales onto HiTOP, we conducted structural equation modeling to regress the factors of established scales on the factors of the bifactor-(S-1) model. An illustration of this model is presented in Figure 1. Separate regression models were used to predict each of the scales (i.e., 92 model estimations in total). Given that content-based HiTOP scales draw from the same item content as the established scales, we needed to prevent unmodeled correlated residual variances from inflating the estimates of association (i.e., criterion contamination). Hence, we excluded items from the indicators of the HiTOP factors when they were part of the criterion scale and reassembled the item parcels for each of the 92 models, thereby ensuring that the same items were not considered in both the criterion and the predictor variables.³ To guide interpretation of the regression models, we regarded standardized regression coefficients of HiTOP spectra (i.e., β₁₋₅) with an absolute value of 0.20 or larger as indicating that an established scale reflected a HiTOP spectrum markedly, as this is an effect size typically observed in psychological research (Gignac & Szodorai, 2016). When only one regression coefficient of HiTOP spectra (β₁₋₅) was above the cutoff, we considered an established scale to be a pure marker of a HiTOP spectrum. When multiple regression coefficients of HiTOP spectra (β₁₋₅) were above the cutoff, we deemed a scale to reflect a blend of HiTOP spectra. Also, when all regression coefficients of HiTOP spectra (β₁₋₅) were below the cutoff but the regression coefficient of the p factor (β₆) was above the cutoff, we deemed a scale to be a pure marker of the p factor. Finally, if neither HiTOP spectra nor the p factor yielded a regression coefficient above the cutoff, we concluded that a scale was not captured by HiTOP at all.
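The decision rules above amount to a small classification function. The sketch below is illustrative only: the function name, labels, and coefficients are hypothetical, and it treats a coefficient as marked when its absolute value is 0.20 or larger, following the wording of the cutoff definition.

```python
def classify_scale(spectrum_betas, p_beta, cutoff=0.20):
    """Classify an established scale from standardized regression coefficients.

    spectrum_betas: dict mapping HiTOP spectrum -> standardized coefficient.
    p_beta: standardized coefficient of the p factor.
    A coefficient counts as marked when |beta| >= cutoff.
    """
    marked = sorted(name for name, beta in spectrum_betas.items()
                    if abs(beta) >= cutoff)
    if len(marked) == 1:
        return "pure marker of " + marked[0]
    if len(marked) > 1:
        return "blend of " + ", ".join(marked)
    if abs(p_beta) >= cutoff:
        return "pure marker of p factor"
    return "not captured by HiTOP"

# Hypothetical coefficients for three scales (not results from the study).
scale_1 = {"internalizing": 0.45, "antagonism": 0.05, "disinhibition": -0.02,
           "detachment": 0.10, "thought disorder": 0.03}
scale_2 = {"internalizing": -0.25, "antagonism": 0.30, "disinhibition": 0.12,
           "detachment": 0.00, "thought disorder": 0.04}
scale_3 = {"internalizing": 0.10, "antagonism": 0.05, "disinhibition": 0.08,
           "detachment": -0.11, "thought disorder": 0.15}

print(classify_scale(scale_1, p_beta=0.70))  # pure marker of internalizing
print(classify_scale(scale_2, p_beta=0.40))  # blend of antagonism, internalizing
print(classify_scale(scale_3, p_beta=0.80))  # pure marker of p factor
```

Note that negative coefficients count towards a blend as well, which is how a pattern such as high antagonism combined with low internalizing would be detected.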

Model estimation and software packages
All analyses were conducted in R Version 4.1.1 (R Core Team, 2021) unless stated otherwise. Models were estimated using robust maximum likelihood (MLR) or weighted least squares mean and variance adjusted (WLSMV). Items with five or more ordinal responses as well as parcels were considered as continuous indicators and items with four or fewer ordinal responses were considered as ordered indicators (Rhemtulla et al., 2012).
Structural equation models and confirmatory factor analysis were estimated with the R package lavaan Version 0.6.9 (Rosseel, 2012), and bifactor-rotated EFA was conducted with Mplus Version 8.4 (Muthén & Muthén, 2017). R code for reproducing the analyses can be accessed at https://osf.io/hkav3/. The data are available from the corresponding author on reasonable request.

Convergent and discriminant validity of content-based HiTOP scales
The pattern of associations between interview-based PD diagnoses and self-reported HiTOP spectra appeared to be similar to the pattern of meta-analytic estimates of factor loadings reported in Ringwald et al. (2021), which supported the validity of the newly derived content-based HiTOP scales. The latent correlations are displayed in Table 1. First, we will refer to PDs for which correlation patterns were consistent with Ringwald et al. (2021). Narcissistic PD was positively associated with antagonism (r = 0.27). Histrionic PD was related to antagonism (r = 0.24) as well as to low detachment (r = −0.13). Antisocial PD was most strongly related to antagonism (r = 0.43) and disinhibition (r = 0.45). Schizotypal PD was most strongly related to detachment (r = 0.48). Avoidant PD was most strongly related to internalizing (r = 0.47) and detachment (r = 0.44). Second, we will point to results that were not fully consistent with Ringwald et al. (2021), or at least not in every regard. Although, as expected, Borderline PD was strongly related to both internalizing (r = 0.47) and antagonism (r = 0.35), there were unexpected associations of similar magnitude with disinhibition (r = 0.40) and thought disorder (r = 0.55). In line with the results of Ringwald et al. (2021), Paranoid PD was indeed strongly associated with antagonism (r = 0.39) and thought disorder (r = 0.40), but it was also strongly related to the other HiTOP spectra, with correlation coefficients between 0.32 and 0.43. In line with expectations, Schizoid PD was strongly related to detachment (r = 0.38), though against expectations, it was not significantly related to thought disorder (r = 0.07). As in Ringwald et al. (2021), Obsessive-compulsive PD was associated with antagonism (r = 0.10) and detachment (r = 0.10), but another significant association was found with internalizing (r = 0.19). There were no expectations regarding Dependent PD, as no results were reported in the meta-analysis by Ringwald et al. (2021). In sum, our results aligned well with the associations reported by Ringwald et al. (2021), considering that methodological differences between studies can likely account for moderate deviations (e.g., sample characteristics, methods, and indicators used to operationalize HiTOP spectra).

Mapping established psychopathology scales onto HiTOP
The bifactor-(S-1) models converged normally and the fit was acceptable (see Table S4). The complete list of standardized regression coefficients is displayed in Table S5. To better visualize the results, we used variance decompositions that depict the extent to which HiTOP dimensions are reflected in the established scales (Figure 2). To this end, standardized regression coefficients were squared to indicate the variance explained by each predictor.
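As a minimal sketch of this decomposition: when the predictors are mutually uncorrelated, as assumed here for the factors of an orthogonal bifactor-(S-1) model, each squared standardized coefficient equals the share of criterion variance explained by that predictor, and the shares sum to the total explained variance. The function name and the coefficients below are hypothetical.

```python
def variance_decomposition(betas):
    """Split the variance of a standardized criterion across predictors.

    betas: dict mapping predictor name -> standardized regression coefficient.
    Assumes mutually orthogonal predictors; only then do the squared
    coefficients add up to the explained variance (R squared).
    """
    shares = {name: beta ** 2 for name, beta in betas.items()}
    shares["unexplained"] = 1.0 - sum(shares.values())
    return shares

# Hypothetical coefficients for one established scale (not study results).
betas = {"p factor": 0.70, "internalizing": 0.40, "antagonism": -0.10}
shares = variance_decomposition(betas)
for name, share in shares.items():
    print(f"{name}: {share:.2f}")
```

With these made-up values, the p factor accounts for 49% of the variance, internalizing for 16%, antagonism for 1%, and 34% remains unexplained. If the predictors were correlated, the squared coefficients would no longer partition the explained variance in this simple way.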
Most of the variance in the established scales was explained by HiTOP factors, with the p factor explaining an average of 54% and HiTOP spectra explaining an additional 14% (i.e., 69% in total). With the decision to interpret standardized regression coefficients > |0.20| as marked associations (as described in the Methods section), most scales could be considered pure markers of a single HiTOP spectrum (i.e., 54 scales), whereas fewer scales (i.e., 23) represented blends of HiTOP spectra. This indicates that most scales could be allocated relatively unambiguously to a spectrum when the p factor was taken into account. Among the established scales included in this study, we found 27 scales that were pure markers of internalizing, 5 for thought disorder, 6 for detachment, 9 for disinhibition, and 7 for antagonism. Three specific blends were found most frequently: high-externalizing-low-internalizing (as represented by seven scales), high-detachment-high-internalizing (i.e., five scales), and high-antagonism-high-disinhibition (i.e., four scales). In addition, 12 scales were found to exclusively represent the p factor. Finally, it should be noted that limited explained variance was found for only two scales (i.e., OPD-SQ Use of phantasy and AFI Interest-taking).
For readers who are particularly interested in how specific questionnaires are related to HiTOP dimensions, we provide Figure S4, in which the results are visually arranged according to the alphabetical order of the questionnaires. In the following, we will provide some examples for illustrative purposes. For some questionnaires, all scales incorporated therein were mapped to a single HiTOP spectrum. These were the BIS-11 scales (reflecting disinhibition), the DES scales (thought disorder), and the PCL-C scales (internalizing). Other questionnaires had scales predominantly tapping into the internalizing HiTOP spectrum (i.e., BSI, DERS, and OAS). However, most questionnaires had scales tapping multiple HiTOP spectra. To name a few examples, the EQ tapped into both antagonism (e.g., low EQ Emotional reactivity) and detachment (e.g., low EQ Social skills). Similarly, the IIP-32 reflected antagonism (e.g., IIP-32 Domineering) and detachment (e.g., IIP-32 Cold). The APSD tapped into disinhibition (e.g., APSD Impulsiveness) and antagonism (e.g., APSD Callousness).

DISCUSSION
In this study, we mapped 92 established psychopathology scales (including signs and symptoms of mental disorder as well as maladaptive traits and indicators of personality functioning) onto the current working model of HiTOP. To this end, we derived content-based scales of HiTOP dimensions, tested their validity, used a bifactor-(S-1) model to separate p factor and spectra statistically, and calculated their associations with established scales in order to estimate the location of scales within the HiTOP framework. The scales tended to be covered well by higher-level HiTOP dimensions and their estimated locations corresponded closely with their current placement in the HiTOP model. These findings underline the capacity of HiTOP to efficiently organize and summarize self-reported psychopathology, and they strengthen the notion that established psychopathology measures could and should be integrated into HiTOP.
In previous studies, p factors tended to be saturated with content of the internalizing domain, although considerable inconsistencies were documented between studies that may be related to characteristics of the sample and the indicators used (e.g., Levin-Aspenson et al., 2021; Watts et al., 2020). While we also find strong empirical overlap between internalizing and the p factor in this study, we find them to have unique prototypical markers among the included scales that signal their distinctiveness. Our results further demonstrate that, after the p factor is taken out, psychopathology constructs can be linked to single HiTOP spectra or specific blends with clarity and consistency. These findings highlight the utility of the bifactor-(S-1) modeling approach to separate out the general disposition to mental health problems from the specific indications associated with more narrow symptoms of mental disorder, maladaptive personality traits, or indicators of personality functioning.

HiTOP structure
In the following, we will discuss how our findings may further the understanding of psychopathology structure. As pointed out previously, the estimated location of established scales tended to match the current placement of constructs in HiTOP. However, there were some noteworthy deviations that we will also discuss.
The pure markers of internalizing were (1) scales that assess intensely aversive states of negative emotionality (OPD-SQ Affect tolerance and OPD Affect differentiation) that are experienced as uncontrollable (DERS emotion regulation strategies and PAI-BOR Mood instability), including anxiety, phobia, depression (BSI Anxiety, BSI Phobia, and BSI Depression), posttraumatic stress (PCL-C Re-experiencing), and separation anxiety (OPD-SQ Detaching relations); (2) scales that assess adverse physiological or behavioral aspects of intense negative emotionality, such as arousal (PCL-C Hyperarousal), concentration problems (DERS Goal-directed behavior), avoidance (PCL-C Avoidance), and self-harm (DASI Self-harm and LHA Self-harm); and (3) scales that assess unstable or diffuse self-image (OPD-SQ Sense of identity, PAI-BOR Identity problems, and OPD-SQ Self-perception), as well as negative self-evaluation (OPD-SQ Regulation of self-esteem, OAS Feeling of inferiority, BSI Interpersonal sensitivity, OPD-SQ Use of introjects, OAS Feeling of emptiness, and OPD-SQ Bodily self). This pattern is consistent with the current HiTOP working model of the internalizing domain (Watson, Levin-Aspenson, et al., 2022).
Whereas previous studies regularly indicate what features of psychopathology tend to be most strongly related to the p factor, our study is the first to investigate pure markers of the p factor. We find pure markers of the p factor to be (1) scales that assess mentalizing impairments regarding one's own mental states in general (RFQ-18 Mentalizing of self) and with respect to one's own feelings and emotions in particular (DERS Emotional clarity and DERS Emotional awareness) and (2) scales that assess suspiciousness and mistrust towards others in terms of feeling negatively evaluated by others (PAI-BOR Negative relationships, GPTS Thoughts of social referencing, and SPQ Mistrust), feeling estranged (SPQ Oddity), feeling unfairly treated or let down (BSI Paranoid Ideation and OAS Reactions of others), or expecting this to happen (OPD-SQ Internalization and ECR-R Attachment anxiety). These results are consistent with views that consider mentalizing impairments and epistemic mistrust as defining features of the p factor (Fonagy et al., 2021; Fonagy & Campbell, 2021) and that place self- and interpersonal functioning at the core of psychopathology (Widiger et al., 2019; Wright et al., in press). In fact, whereas we found evidence for suspiciousness to be a pure marker of the p factor, it has previously been placed in the spectra of detachment (Zimmermann et al., 2022), antagonism (Krueger et al., 2021; Mullins-Sweatt et al., 2022), or thought disorder (Cicero et al., 2022). Yet consistent with our results, studies using the Personality Inventory for DSM-5 (APA, 2013) have indicated that suspiciousness exhibits strong associations with a general PD factor but low domain-specificity (e.g., Somma et al., 2019; Williams et al., 2018).
With respect to the thought disorder spectrum, we found pure markers to be scales assessing unusual or odd beliefs and experiences or perceptual irregularities such as supernatural phenomena (i.e., SPQ Unusual beliefs), dissociative or psychotic experiences (DES Depersonalization, DES Amnestic dissociation, and DES Absorption), and feeling persecuted or conspired against by others (GPTS Feelings of persecution). Markers of detachment were measures pertaining to avoiding social contacts and intimacy (i.e., IIP-32 Cold and ECR-R Attachment avoidance), having limited social skills (EQ Social skills), feeling uncomfortable and nervous in social interactions (SPQ Social anxiety), or not feeling rewarded by them (SPQ Social anhedonia). Interestingly, scales that pertain to problems with shyness (OPD-SQ Establishing contact and IIP-32 Socially inhibited) or to making use of social contacts (OPD-SQ Accepting help) appeared to be an interstitial feature between detachment and internalizing. In a similar vein, Ringwald et al. (2021) found that avoidant PD and social phobia precisely reflected this blend, which fits well with our placement of shyness scales, as well as the placement of the shyness scale of the MMPI-3 in HiTOP (Sellbom et al., 2021).
Regarding the disinhibition spectrum of HiTOP, we found pure markers to be scales of impulsiveness (e.g., BIS-11 Non-planning impulsiveness, BIS-11 Motor impulsiveness, BIS-11 Attentional impulsiveness, OPD-SQ Impulse control, APSD Impulsiveness, and AFI Authorship), substance use (e.g., DASI Drugs and alcohol), impulsive self-directed and other-directed aggression (PAI-BOR Self-harm, LHA Aggression, and OPD-SQ Impulse control), and increased willingness to take risks (APSD Impulsiveness). We found two blends that characterized a combination of disinhibition and internalizing; however, these were only represented by one scale each. A measure of acting impulsively under the influence of negative emotions (also: negative urgency; DERS Impulse control) was specifically related to high disinhibition and high internalizing, whereas sensation-seeking was indicative of high disinhibition and low internalizing (PAI-ANT Sensation-seeking).
The most complex pattern of results was observed for the antagonism domain and its various blends. Pure markers of antagonism tapped into willfully ignoring others' feelings and needs (APSD Callousness, SRPS Callousness, IIP-32 Vindictive, and IIP-32 Overly accommodating reversed), caring a lot about oneself instead (APSD Narcissism and IIP-32 Domineering), and having a hostile attitude towards others (BSI Hostility). By contrast, conduct problems such as illegal activities or getting into trouble at work or in school were indicative of interstitial antagonism-disinhibition (PAI-ANT Antisociality, LHA Antisocial behavior, OPD-SQ Balancing interests, and SRPS Antisocial), and scales of cognitive empathy (EQ Cognitive empathy and RFQ-18 Mentalizing others) were placed between detachment and antagonism. The blend of high antagonism and low internalizing was represented by scales of affective empathy (EQ Emotional reactivity and OPD-SQ Empathy) as well as various scales that tap into being egocentric (PAI-ANT Egocentricity, SRPS Egocentricity, and IIP-32 Self-sacrificing reversed). However, what distinguishes pure markers of antagonism from interstitial markers of low-internalizing-high-antagonism seems hard to grasp. We suggest that the latter scales might tap into what the literature on psychopathy refers to as boldness/fearless dominance (for a meta-analysis, see Sleep et al., 2019), which is a construct related to narcissism and dominance-seeking (high antagonism) but also emotional stability (low internalizing).

Limitations
Some limitations of the current study should be considered. First, even though the item pool used is arguably among the more extensive collections of self-reports on psychopathology, some aspects were underrepresented (e.g., somatoform and obsessive-compulsive) or were not assessed at all (e.g., mania, eating pathology, and sexual problems). Second, the use of extreme groups in sampling (e.g., including both community participants and outpatients) likely bloats the saturation of the p factor in terms of inflating the magnitude of its associations (Fisher et al., 2020), yet we have no reason to believe that it influences their pattern (i.e., sizes of the effects relative to each other). Third, further characteristics of the sample (i.e., oversampling of individuals with pronounced personality pathology) may hamper generalization to other samples. Fourth, although we made the HiTOP factors statistically independent from the predicted scales to avoid inflated associations, there might be additional sources of criterion contamination that we could not control given limitations of the study design (e.g., common method bias). Fifth, when this study was conducted, we relied on the then-current version of the HiTOP working model as outlined in Kotov et al. (2017), but the model is subject to ongoing revisions (e.g., Kotov et al., 2020; Krueger et al., 2021; Watson, Levin-Aspenson, et al., 2022). Future studies should replicate our analysis using the official HiTOP measure (e.g., see Simms et al., 2022) once it becomes available. This would allow the analysis to be performed with truly separate scales, which would further reduce the risk of criterion contamination. Sixth, we assumed unidimensional measurement models for all established (sub)scales and tested model fit, but we did not further explore misspecifications.

Future directions and practical recommendations
Our study has several implications. First, whenever the aim is to study narrow clinical constructs, we advise researchers to conduct a broad assessment of psychopathology that taps into different hierarchical levels of HiTOP. Using this approach, the meaning and validity of constructs can be better established, specific associations can be studied (i.e., beyond higher-level psychopathology dimensions), jingle-jangle fallacies can be better identified (Lawson & Robins, 2021), and, finally, the treatment utility of clinical assessments may be enhanced (Kamphuis et al., 2021). Currently, an omnibus measure of the HiTOP model is under development (Simms et al., 2022) with initial results being published for preliminary scales of HiTOP spectra (Cicero et al., 2022; Mullins-Sweatt et al., 2022; Sellbom et al., 2022; Watson, Forbes, et al., 2022; Zimmermann et al., 2022). Until the instrument becomes available (and most likely beyond), researchers will need to rely on existing measures that capture psychopathology broadly and are compatible with HiTOP. Alternatively, from the scales included here, some are pure markers that could be used as proxies to operationalize HiTOP spectra. For example, the SPQ offers multiple pure marker scales of HiTOP dimensions: the SPQ Social anhedonia scale could be used as a proxy to assess the detachment spectrum, SPQ Unusual beliefs for thought disorder, and SPQ Mistrust for the p factor. Slightly better, however, would be to approximate HiTOP spectra with multiple proxy scales. Yet in the absence of a truly comprehensive HiTOP measure that should offer higher fidelity in assessing higher-level psychopathology dimensions, inferences with improvised HiTOP measures will be limited but necessary.
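To illustrate what approximating a spectrum with multiple proxy scales might look like in practice, the following Python sketch averages z-standardized scores across three scales that emerged as pure markers of detachment in this study. The unit-weighted averaging is an assumption chosen for simplicity rather than a procedure used or tested here, and the raw scores and the names `z_standardize` and `proxy_composite` are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical raw scores of four respondents on three pure markers of
# detachment (values are made up for illustration).
scores = {
    "SPQ Social anhedonia": [2.0, 5.0, 3.0, 6.0],
    "IIP-32 Cold": [4.0, 9.0, 5.0, 10.0],
    "ECR-R Attachment avoidance": [1.5, 3.5, 2.5, 4.5],
}


def z_standardize(values):
    """Center on the sample mean and scale by the sample standard deviation."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]


def proxy_composite(scores):
    """Unit-weighted mean of z-standardized scales, one value per respondent."""
    standardized = [z_standardize(values) for values in scores.values()]
    # zip(*...) regroups the scores so that each tuple belongs to one respondent.
    return [mean(person) for person in zip(*standardized)]


detachment_proxy = proxy_composite(scores)
print([round(value, 2) for value in detachment_proxy])
```

A composite of this kind still contains variance shared with the p factor, so it should be read as a rough proxy for the spectrum rather than as an estimate of a specific factor from a bifactor-(S-1) model.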
The bifactor-(S-1) model offers advantages for modeling multiple levels of the psychopathology hierarchy (Eid et al., 2017; Heinrich et al., 2021), but it requires the specification of reference indicators that instantiate the p factor a priori. Unfortunately, there is little consensus about the meaning of the p factor. In this study, we circumvented this issue by parceling across HiTOP spectra to define the p factor, making use of the sheer mass of items included in this study. However, this is not a parsimonious solution for defining the p factor in future studies. Our results provide some support for the hypothesis that scales assessing impairments in mentalizing and epistemic trust may be pure markers of the p factor (Fonagy et al., 2021; Fonagy & Campbell, 2021), whereas other candidate constructs that have been proposed (e.g., emotional dysregulation and negative self-evaluation; Smith et al., 2020) were specific to spectra (e.g., internalizing). Although more evidence about the generalizability will be needed to corroborate these results, this raises some optimism that the p factor can be separately identified with selected transdiagnostic constructs. If pure markers (rather than just strong markers) of the p factor could thus be repeatedly identified with reasonable consistency and across different samples, these could be used to define the p factor in bifactor-(S-1) models.

CONCLUSION
Research has documented how symptoms of psychopathology tend to co-occur between individuals. As a result of synthesizing this literature, the HiTOP model proposes a hierarchical system of psychopathology including the p factor and several spectra (i.e., internalizing, thought disorder, detachment, antagonism, and disinhibition) that have exhibited strong validity (e.g., Kotov et al., 2020; Krueger et al., 2021; Watson, Levin-Aspenson, et al., 2022). Herein, we have reported results that help to understand which (sub)scales of established psychopathology questionnaires (a) are pure markers of HiTOP spectra, (b) are pure markers of the p factor, (c) reflect blends of HiTOP spectra, or (d), in contrast, do not map onto HiTOP at all. This can enable researchers to form richer and more distinct interpretations of the constructs measured, and it facilitates the cumulative integration of various clinical traditions that rely on different conceptualizations and assessments of psychopathology (e.g., OPD-SQ originating from psychodynamic theory) but can be traced into HiTOP as an organizing framework.
recently recommended for dimensionality analysis (i.e., Hull Method, Empirical Kaiser Criterion, traditional parallel analysis, and sequential χ² model tests; Auerswald & Moshagen, 2019). However, the methods did not converge on an optimal number of factors (see Table S3 for details), so we considered them to a lesser extent and relied more heavily on substantive considerations.
2 Standardized factor loadings denote associations between indicators (e.g., PD diagnoses) and extracted factors (e.g., HiTOP spectra) that are usually rotated towards simple structure (e.g., geomin rotation). By contrast, in our study, PD diagnoses and HiTOP spectra are each measured independently, so that their association is estimated using the correlation coefficient.
3 Of note, there is the possibility of using another analytic approach. When predicting a scale, all items of the questionnaire from which the criterion scale is taken can be excluded from the HiTOP factors (i.e., not only the items of the criterion scale). This approach could be considered even more conservative in avoiding inflated associations by controlling method variance associated with the specific characteristics of a questionnaire (e.g., number or labels of response options). However, it also has significant shortcomings (i.e., reduced construct coverage), which is why we report this analysis in the supplement (see Note S1) and do not consider it further in the remainder of this article.

FIGURE 1 Schematic display of the latent regression model used to map established psychopathology scales onto the Hierarchical Taxonomy of Psychopathology (HiTOP). Note: Indicator residual variances and latent factor variances are not displayed. HiTOP factors (i.e., INT-DET and P) were modeled using an orthogonal bifactor-(S-1) approach. Target scales (SCALE) were modeled as a unidimensional simple structure (except for DASI Drugs and alcohol). ANT = antagonism; DET = detachment; DIS = disinhibition; INT = internalizing; P = p factor; SCALE = target scale; THO = thought disorder; X16-X18 = parcel indicators of the general factor (i.e., P); X1-X15 = parcel indicators of HiTOP spectra (i.e., INT-DET); Y1-Yk = item indicators of the target scale; β = regression path; ε = residual variance of the dependent latent factor; η = dependent latent factor; λ = factor loading; ξ = independent latent factor.
patterns seemed fully consistent with Ringwald et al.

FIGURE 2 Mapping established psychopathology scales onto the Hierarchical Taxonomy of Psychopathology (HiTOP). Note: Variance explained by the p factor (modeled as a general factor) and HiTOP spectra (modeled as specific factors) was calculated by taking the square of standardized regression coefficients (noted as β1-β6 in Figure 1). The order of the scales indicates their estimated location within the HiTOP model as based on our results. ANT = antagonism; DET = detachment; DIS = disinhibition; INT = internalizing; THO = thought disorder; Total Variance = total variance of the latent factor.