Differentiating weight-restored anorexia nervosa and body dysmorphic disorder using neuroimaging and psychometric markers

Anorexia nervosa (AN) and body dysmorphic disorder (BDD) are potentially life-threatening conditions whose partially overlapping phenomenology (distorted perception of appearance, obsessions/compulsions, and limited insight) can make diagnostic distinction difficult in some cases. Accurate diagnosis is crucial, as the effective treatments for AN and BDD differ. To improve diagnostic accuracy and clarify the contributions of each of the multiple underlying factors, we developed a two-stage machine learning model that uses multimodal, neurobiology-based, and symptom-based quantitative data as features: task-based functional magnetic resonance imaging data using body visual stimuli, graph theory metrics of white matter connectivity from diffusion tensor imaging, and anxiety, depression, and insight psychometric scores. In a sample of unmedicated adults with BDD (n = 29), unmedicated adults with weight-restored AN (n = 24), and healthy controls (n = 31), the resulting model labeled individuals with an accuracy of 76%, significantly better than the chance accuracy of 35% (p̂ < 10⁻⁴). In the multivariate model, reduced white matter global efficiency and better insight were associated more with AN than with BDD. These results improve our understanding of the relative contributions of the neurobiological characteristics and symptoms of these disorders. Moreover, this approach has the potential to aid clinicians in diagnosis, thereby leading to more tailored therapy.

In stage 1, MELODIC was used to identify large-scale connectivity patterns in all participants. A group-level ICA was performed with the de-noised single-participant data as inputs, resulting in a decomposition of our data set into 20 independent components. These group ICA components, specific to the condition in which participants viewed high spatial-frequency bodies, were then correlated with canonical ICA networks as previously described (Smith et al., 2009). For each network of interest, the component with the highest correlation with the corresponding canonical network was used to define the network-specific mask.
In stage 2, we identified subject-specific temporal dynamics and associated spatial maps for each subject's fMRI data. This step used the full set of group-ICA spatial maps in a spatial regression for a linear fit against the separate fMRI data sets. The resulting time-course matrices (comprising the temporal dynamics for each component and each subject) were then used in a temporal regression for a linear model fit against the fMRI data to estimate subject-specific spatial maps. The spatial maps from stage 2 were used to calculate coherence values for each subject within each network mask.
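The two regressions in stage 2 can be illustrated with a minimal NumPy sketch. This is not the pipeline used in the study: the function name `dual_regression`, the array shapes, and the synthetic data are assumptions made for illustration, and preprocessing steps such as demeaning or variance normalization are omitted.

```python
import numpy as np

def dual_regression(group_maps, subject_data):
    """Illustrative two-step regression (hypothetical helper).

    group_maps:   (n_components, n_voxels) group-ICA spatial maps
    subject_data: (n_timepoints, n_voxels) one subject's fMRI data
    """
    # Spatial regression: for every time point, regress the volume on the
    # group maps. Solves group_maps.T @ B = subject_data.T for B.
    coeffs, *_ = np.linalg.lstsq(group_maps.T, subject_data.T, rcond=None)
    timecourses = coeffs.T  # (n_timepoints, n_components)

    # Temporal regression: for every voxel, regress its time series on the
    # subject-specific time courses to obtain subject-specific maps.
    subject_maps, *_ = np.linalg.lstsq(timecourses, subject_data, rcond=None)
    return timecourses, subject_maps  # maps: (n_components, n_voxels)

# Synthetic, noise-free example (all values are made up).
rng = np.random.default_rng(0)
group_maps = rng.normal(size=(3, 50))   # 3 components, 50 voxels
true_tc = rng.normal(size=(40, 3))      # 40 time points
data = true_tc @ group_maps

tc, maps = dual_regression(group_maps, data)
```

In this noise-free toy case the estimated time courses and maps reproduce the inputs; with real data they would differ between subjects, which is what makes the subject-specific maps informative.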

Handling of missing data, continued:
We accomplished multiple imputation by creating 20 independent filled-in (imputed) datasets, and then analyzing each dataset separately. This technique assumes that data are missing completely at random, meaning that there is no relation between the probability of missingness and the value of the missing data point. The results from each dataset were combined according to the between- and within-imputation variation (Rubin, 2004). We initialized each dataset with cold-deck imputation: the missing values were filled in randomly, with replacement, from the other observed values. To improve upon these uninformed estimates, we trained a linear regression model to predict the mean and variance of each missing value based on other observed variables (e.g., MH and NPL), not considering the diagnosis of any participants. The missing values were filled in stochastically, based on the predicted mean and variance. A coordinate descent method was used to impute multiple variables: when imputing missing BABS values, the most recent iteration of imputed coherence values was used to train the imputation model, and vice versa when imputing coherence based on BABS. This was repeated for over 1,000 iterations. For the first 100 cycles, we artificially inflated the variance by an exponentially decaying factor, such that by cycle 100 the variance was increased by only 1%; this provided a more complete exploration of the likelihood space of imputation models. After roughly 150 cycles, the deviance of the imputation models had reached an equilibrium. We used residual deviance from prediction as our goodness-of-fit measure because, for least-squares residuals, minimizing the deviance yields the maximum likelihood estimates.
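Combining results across imputed datasets by between- and within-imputation variation is commonly done with Rubin's rules. The sketch below shows that pooling step for a single scalar statistic; the function name `pool_rubin` and the example numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool one statistic across m imputed datasets (hypothetical helper).

    estimates: the point estimate obtained from each imputed dataset
    variances: the squared standard error from each imputed dataset
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)

    pooled = estimates.mean()           # pooled point estimate
    within = variances.mean()           # within-imputation variance
    between = estimates.var(ddof=1)     # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, total

# Made-up values for three imputed datasets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term carries the extra uncertainty due to the missing values, so the pooled variance is never smaller than the average within-imputation variance.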

Diagnostic Statistical Modeling, continued:
In leave-one-out training, the predictive model is trained excluding one participant, then validated using the excluded participant. The identity of that excluded participant is then changed so that each participant is the validation participant once and only once.
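The leave-one-out procedure can be sketched as follows. The nearest-centroid classifier is only a stand-in chosen to keep the example self-contained; it is not the diagnostic model described here, and the function names and toy data are assumptions.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, x_new):
    """Stand-in classifier: assign x_new to the class with the closest mean."""
    labels = np.unique(y_train)
    centroids = np.array([X_train[y_train == k].mean(axis=0) for k in labels])
    distances = np.linalg.norm(centroids - x_new, axis=1)
    return labels[np.argmin(distances)]

def leave_one_out_accuracy(X, y):
    """Hold out each participant exactly once and score the prediction."""
    n = len(y)
    correct = 0
    for i in range(n):
        keep = np.arange(n) != i            # every participant except i
        pred = nearest_centroid_predict(X[keep], y[keep], X[i])
        correct += int(pred == y[i])
    return correct / n

# Toy data: two well-separated groups of three participants each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y = np.array([0, 0, 0, 1, 1, 1])
accuracy = leave_one_out_accuracy(X, y)
```

Because the held-out participant never contributes to the fitted centroids, the resulting accuracy estimates performance on unseen individuals.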
We evaluated model performance by comparing our observed results to those of identical models trained on 10,000 independent datasets in which the participant diagnoses were shuffled without replacement. The permuted results were used to create empirical probability distributions of each reported statistic, yielding an empirical, two-tailed p-value, denoted p̂. Permutation tests were used to avoid distributional assumptions about the reported statistics that could have been made inaccurate by multiple imputation and other factors.
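A label-permutation test of this kind can be sketched as below. Several details are assumptions rather than the procedure used in the study: the two-tailed rule (distance from the mean of the permuted statistics), the add-one correction in the p-value, the difference-in-means statistic, and the reduced number of permutations used to keep the example fast.

```python
import numpy as np

def permutation_p_value(statistic, X, y, n_permutations=10_000, seed=0):
    """Empirical two-tailed p-value from shuffled labels (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    observed = statistic(X, y)

    # Shuffle labels without replacement and recompute the statistic.
    null = np.array([statistic(X, rng.permutation(y))
                     for _ in range(n_permutations)])

    # Assumed two-tailed rule: count permuted statistics at least as far
    # from the centre of the null distribution as the observed statistic.
    center = null.mean()
    n_extreme = np.sum(np.abs(null - center) >= np.abs(observed - center))

    # Assumed add-one correction so the estimate is never exactly zero.
    p_hat = (n_extreme + 1) / (n_permutations + 1)
    return observed, p_hat

def mean_difference(X, y):
    """Toy statistic: difference between the two group means."""
    return X[y == 1].mean() - X[y == 0].mean()

# Toy data: two clearly separated groups of ten values each.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0.0, 0.1, 10), rng.normal(5.0, 0.1, 10)])
y = np.array([0] * 10 + [1] * 10)
observed, p_hat = permutation_p_value(mean_difference, X, y, n_permutations=2000)
```

In practice the statistic passed in would be the full train-and-validate routine, so that each shuffled dataset is modeled identically to the observed one.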