Item Reduction, Psychometric and Biometric Properties of the Italian Version of the Body Perception Questionnaire—Short Form (BPQ-SF): The BPQ-22

Body awareness disorders and reactivity are mentioned across a range of clinical problems. Constitutional differences in the control of the bodily state are thought to generate a vulnerability to psychological symptoms. Autonomic nervous system dysfunctions have been associated with anxiety, depression, and post-traumatic stress. Though interoception may be a transdiagnostic mechanism promoting the improvement of clinical symptomatology, few psychometrically sound, symptom-independent, self-report measures, informed by brain–body circuits, are available for research and clinical use. We validated the Italian version of the body perception questionnaire (BPQ)—short form and found that response categories could be collapsed from five to three and that the questionnaire retained a three-factor structure with items reduced from 46 to 22 (BPQ-22). The first factor was loaded by body awareness items; the second factor comprised some items from the body awareness scale and some from the subdiaphragmatic reactivity scale (but all related to bloating and digestive issues), and the third factor by supradiaphragmatic reactivity items. The BPQ-22 had sound psychometric properties, good convergent and discriminant validity and test–retest reliability and could be used in clinical and research settings in which the body perception assessment is of interest. Psychometric findings in light of the polyvagal theory are discussed.


Introduction
Body awareness disorders and reactivity are mentioned across a range of clinical problems, and recently, it has been proposed that vulnerability to psychological symptoms, particularly anxiety, may originate in constitutional differences in the control of the bodily state [1]. Joint hypermobility syndrome, or Ehlers-Danlos syndrome hypermobile type social engagement system. The outputs of the social engagement system consist of motor pathways regulating striated muscles of the face and head (i.e., somatomotor) and smooth and cardiac muscles of the heart and bronchi (i.e., visceromotor). The somatomotor component involves special visceral efferent pathways that regulate the striated muscles of the face and head. The visceromotor component involves the myelinated supradiaphragmatic vagal pathway that regulates the heart and bronchi. Functionally, the social engagement system emerges from a face-heart connection that coordinates the heart with the muscles of the face and head. The initial function of the system is to coordinate sucking, swallowing, breathing, and vocalizing. Atypical coordination of this system early in life is an indicator of subsequent difficulties in social behavior and emotional regulation. The preferential recruitability of the social engagement system, or the progressive hierarchical recruitment of the SNS or the DVC, depends on the neural evaluation of environmental risk. According to the polyvagal theory, the neural evaluation of risk does not require conscious awareness and is achieved through neuroception, a neural reflexive mechanism capable of instantaneously shifting physiological state, distinct from perception, and capable of distinguishing environmental and visceral features that are safe, dangerous, or life-threatening. 
In safe environments, a neuroception of safety promotes the social engagement system and the autonomic state is adaptively regulated to dampen SNS activation and to protect the oxygen-dependent central nervous system, especially the cortex, from the metabolically conservative reactions of the DVC (e.g., fainting). Conversely, a neuroception of danger, or life threat, promotes SNS, or DVC, activation, respectively [25,26,29]. The organization of these individual circuits, along with the sympathetic nervous system, can affect subjective experiences of body awareness by modulation of signals that arise from the body by top-down postprocessing, including cortical areas informed by the information traveling through the body-integrative circuits of the brain [19].
Several tools have been developed to assess interoception but, following the review of Mehling and colleagues [21], they do not address important domains of the construct of body awareness and exhibit important psychometric limitations. To address these limitations, new psychometrically sound tools have been developed (e.g., [30][31][32]). Unfortunately, the self-report measures that show sound psychometric properties have not been rooted in the organization of peripheral neural pathways.
The body perception questionnaire (BPQ; [33]) was developed to evaluate the subjective experience of the function and reactivity of bodily organs and structures that are innervated by the ANS. The first version of the BPQ, consisting of 122 items, assessed body awareness, ANS reactivity, cognitive-emotional-somatic stress response, body and cognitive stress response styles, and health history. Though the original BPQ has been used in several peer-reviewed publications and translated into several languages, its lack of psychometric testing and its extensive length have limited its broader application. Interest in the BPQ has mainly concerned the body awareness and autonomic reactivity subscales, and past research has mostly used only these subscales (e.g., [34,35]). To overcome these limitations, Cabrera and colleagues [36] validated the BPQ-short form (BPQ-SF) and found that body awareness was described by a single factor, whereas autonomic reactivity reflected distinct factors for organs above and below the diaphragm. The subscales showed strong reliability and converged with validation measures.
Few self-report measures of body awareness and somatic reactivity have been validated in Italian. The somatosensory amplification scale (SSAS; [37]; Italian version in Bernini et al. [38]) was originally developed to assess somatosensory amplification (SA), defined as the individual's tendency to experience somatic and visceral sensations as unusually intense, noxious, and disturbing. SA has been proposed to be a risk and/or maintenance factor for hypochondriasis, somatization, and, in general, physical symptom reports. The modified somatic perception questionnaire (MSPQ; [39,40]) was originally developed for investigating chronic backache or other forms of chronic pain, stroke and cardiovascular diseases, tinnitus and Meniere's disease, and patients undergoing surgery. It has also been used to measure somatization in nonpainful conditions. Furthermore, the Italian version of the MSPQ [41] has been used with immigrant populations [42]. Though informative, the SSAS and the MSPQ are not rooted in a reliable, neurophysiologically informed framework. Considering the lack of a psychometrically sound tool, validated in the Italian population, that assesses body awareness and supradiaphragmatic and subdiaphragmatic autonomic reactivity in accordance with the polyvagal theory, this paper aims to examine the psychometric properties of the BPQ-SF [36] and to validate it, in order to provide a useful tool that can be employed in both clinical and research fields. In particular, the present study aims to: (a) examine the factor structure and psychometric properties of the BPQ-SF in a non-clinical sample; (b) examine the internal consistency and test-retest reliability of the BPQ-SF; (c) demonstrate convergent validity with the SSAS, the MSPQ, and the depression anxiety stress scales-21 (DASS-21; [43][44][45]; Italian version in Bottesi et al. [46]); and (d) demonstrate discriminant validity with the obsessive beliefs questionnaire-20 (OBQ-20; [47]; Italian version in Melli et al. [48]).

Participants
We collected two samples of participants. The first sample consisted of 1,361 community participants (80.9% female; M = 37.29 years, SD = 9.94, range 18-81), who responded to an email advertisement requesting volunteers to complete psychological questionnaires. In terms of education, 3.97% of the participants had a medium level of education (high school degree), 40.41% had a higher-level degree (bachelor's degree or master's degree), and the remaining 55.62% had the highest level of education (Ph.D. or specialization). Most were employed (89.93%), 3.53% were undergraduate university students, and the remaining 6.54% were homemakers, unemployed, or retired. Regarding marital status, 47.98% were single, 46.51% were married or cohabiting, 4.78% were divorced, and 0.73% were widows or widowers. These participants completed a battery of measures described in the next section.
A second sample of 97 participants (84.7% female; mean age 37.08 years, SD = 10.17, range 24-67), who completed the BPQ-SF twice at a 3-week interval, was recruited with the same strategy as the first. They had a master's degree or higher, and the majority were employed (90.82%). Sixty percent of them were single or divorced, and the others were in a stable relationship.
To be included in the study, participants had to be 18 or older and report no history of psychiatric or psychological disorders.

Measures
Body Perception Questionnaire-Short Form (BPQ-SF; [36]). The BPQ is a self-report measure of body awareness and autonomic reactivity originally developed by Porges [33] and then refined by Cabrera et al. [36]. Although the latter authors introduced a very brief, 12-item version, we focused here on the 46-item version reported in the BPQ manual [49]. Items ask participants to rate on a 5-point scale (from 1 = never to 5 = always) the frequency with which they feel aware of bodily sensations (body awareness, e.g., "My mouth being dry") and the frequency with which they experience supradiaphragmatic reactivity (e.g., "I feel shortness of breath"), and subdiaphragmatic reactivity (e.g., "I have indigestion").
The Italian translation of the BPQ-SF was obtained through a mixed forward- and back-translation procedure [50]. The authors and one bilingual Italian-English psychologist independently translated the English version of the scale into Italian. After consensus among translators was achieved, an Italian-English researcher blind to the original version translated this preliminary version back into English. Discrepancies emerging from this back-translation were discussed with the original authors of the scale. Before being used in this study, the newly developed Italian version of the BPQ-SF was administered to ten participants (not included in the present study) in order to check the understandability of the items, which were all found to be easy to understand and to rate.
Somatosensory Amplification Scale (SSAS; [37]). The SSAS is a measure of "somatosensory amplification", i.e., the individual's tendency to experience somatic and visceral sensations as unusually intense, noxious, and disturbing. It comprises 10 items that ask participants to report how much they are bothered by various uncomfortable visceral and somatic sensations that, however, are not pathological symptoms of serious diseases. In this study, we used the Italian version of the SSAS developed by Bernini et al. [38].
Modified Somatic Perception Questionnaire (MSPQ; [39,40]). The MSPQ is a list of 22 symptoms of heightened somatic awareness (e.g., "feeling hot all over", "blurring of vision"). Items are rated on a 4-point scale of severity (1 = not at all, 4 = "extremely, could not have been worse"), and the total score is derived from the sum of the original 13 items introduced by [39]. In this study, we used the Italian version by [41].
Depression Anxiety Stress Scales-21 (DASS-21; [43][44][45]). The DASS-21 is a 21-item self-report questionnaire that assesses the core symptoms of depression (including lack of incentive, low self-esteem, and dysphoria, e.g., "I couldn't seem to experience any positive feeling at all"), anxiety (including somatic and subjective symptoms of anxiety, as well as acute responses of fear, e.g., "I felt scared without any good reason"), and stress (including irritability, nervous tension, difficulty relaxing, and agitation, e.g., "I tended to over-react to situations"). Participants are asked to rate the severity of the symptoms over the past week on a 4-point scale (1 = "Did not apply to me at all", 4 = "Applied to me very much or most of the time"). In this study, we used the Italian version by [46].
Obsessive Beliefs Questionnaire-20 (OBQ-20; [47]). The OBQ-20 is a 20-item version of the original 87-item questionnaire [51] and of the subsequent 44-item version [52]. The purpose of this scale is the assessment of four dysfunctional belief domains that can lead to a misappraisal of intrusive thoughts: (i) threat (e.g., "If I do not take extra precautions, I am more likely than others to have or cause a serious disaster"); (ii) inflated responsibility (e.g., "If I don't act when I foresee danger, then I am to blame for consequences"); (iii) importance and control of thoughts (also assesses need to control thoughts; e.g., "For me, having bad urges is as bad as actually carrying them out"); and (iv) perfectionism (also assesses intolerance of uncertainty; e.g., "I must keep working until it's done exactly right"). Items are rated on a 7-point agreement scale (1 = "disagree very much", 7 = "agree very much"). The Italian version we used here is the one by [48].

Procedure
The questionnaires were made available online using a secure web-based survey program (SurveyMonkey). Questionnaires were administered in a counterbalanced fashion to control for order and sequence effects, and batteries took between 15 and 25 min to complete. All participants volunteered to take part in the study after being presented with a detailed description of the procedure and were treated in accordance with the Ethical Principles of Psychologists and Code of Conduct [53]. No external incentives were offered for participating in this study.

Data Analysis
As a first step in the analyses, we examined the frequency distribution of the scores for each item, namely, whether all the values of the answer scale had been endorsed at least once, and we also assessed the extent of the missing data. The Shapiro-Wilk test was performed to verify the normality of the distributions.
We then used confirmatory factor analysis (CFA) on the total sample, with the weighted least squares means and variance adjusted (WLSMV) estimator (theta parameterization), to test whether the hypothesized 3-factor structure (body awareness, supradiaphragmatic reactivity, and subdiaphragmatic reactivity, taking into account that item 41 ("I feel like vomiting") should load on both of the last two factors) was supported by the data at hand. The goodness-of-fit was evaluated using the comparative fit index (CFI), the Tucker-Lewis index (TLI), and the root-mean-square error of approximation (RMSEA), with its 90% confidence interval (CI). We used the following criteria for model fit [54]: for TLI and CFI, values ≥ 0.90 indicated acceptable fit and values ≥ 0.95 indicated excellent fit; for RMSEA, values ≤ 0.08 indicated acceptable fit and values ≤ 0.06 indicated excellent fit. Missing values were handled by the full information method implemented in Mplus 7 [55], with which we performed the analyses.
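The cutoffs above can be written down as a small decision rule. The following is a minimal sketch, not part of the original analysis: the function names and the label "poor" for values that miss the acceptable cutoff are assumptions made for illustration, and the input values in the usage line are illustrative.

```python
def classify_incremental_index(value):
    """Classify a CFI or TLI value using the cutoffs stated in the text."""
    if value >= 0.95:
        return "excellent"
    if value >= 0.90:
        return "acceptable"
    return "poor"  # label assumed here; the text only names the two cutoffs


def classify_rmsea(value):
    """Classify an RMSEA value using the cutoffs stated in the text."""
    if value <= 0.06:
        return "excellent"
    if value <= 0.08:
        return "acceptable"
    return "poor"  # label assumed here


def summarize_fit(cfi, tli, rmsea):
    """Return one label per index so that mixed verdicts stay visible."""
    return {
        "CFI": classify_incremental_index(cfi),
        "TLI": classify_incremental_index(tli),
        "RMSEA": classify_rmsea(rmsea),
    }


if __name__ == "__main__":
    # Illustrative values only.
    print(summarize_fit(cfi=0.94, tli=0.93, rmsea=0.035))
```

Keeping one label per index, rather than a single overall verdict, makes it visible when the indices disagree, for instance when RMSEA is acceptable while CFI and TLI are not.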
If this model fit inadequately, we would have to investigate the most suitable measurement model for the Italian BPQ-SF without the support of prior knowledge. For that case, we decided on a cross-validation approach: an exploratory factor analysis (EFA) on one random split of the sample, in order to find a factor structure that could meet the requirements of an approximate simple structure ([56]; i.e., each item should load substantively (>|0.32|; [57]) on one factor and negligibly on the others), followed by a CFA on the other random split.
Before performing these analyses, however, we searched for redundancies and for items with low squared multiple correlations (SMC) using the total dataset. The search for redundancies looks for pairs of items whose intercorrelation is too strong. In factor analysis, these items are likely to yield the so-called "bloated specifics" ([58], p. 288), i.e., factors of little substantive interest that result from very highly correlated items that usually share very similar content and/or wording. We considered as redundant those items whose intercorrelation was larger than |0.707| (i.e., more than 50% of shared variance). We then computed the SMC for all the remaining items. The SMC is the proportion of variance shared by an item with all the others, and it is routinely used by EFA software as an estimate of initial communality, i.e., an estimate of the proportion of variance of an item accounted for by the common factors. Items with an SMC smaller than 0.10 are unlikely to contribute substantially to the measurement model and can be removed from the item pool [57].
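Both screening steps operate on the item correlation matrix. The sketch below is an illustration under stated assumptions rather than the procedure actually run for this study: it takes a correlation matrix that has already been estimated (the text does not specify the type of coefficient), computes the SMC of item i as 1 − 1/(R⁻¹)ᵢᵢ, and uses a hypothetical three-item matrix. All function names are assumptions.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def redundant_pairs(corr, threshold=0.707):
    """Return (i, j, r) for item pairs whose |r| exceeds the redundancy threshold."""
    n = len(corr)
    return [
        (i, j, corr[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if abs(corr[i][j]) > threshold
    ]


def squared_multiple_correlations(corr):
    """SMC of each item with all the others: 1 - 1 / diag(inverse of R)."""
    inverse = invert(corr)
    return [1.0 - 1.0 / inverse[i][i] for i in range(len(corr))]


if __name__ == "__main__":
    # Hypothetical correlation matrix: items 0 and 1 overlap, item 2 is isolated.
    corr = [
        [1.0, 0.8, 0.0],
        [0.8, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
    print(redundant_pairs(corr))
    smc = squared_multiple_correlations(corr)
    print([i for i, value in enumerate(smc) if value < 0.10])
```

In the procedure described above, one item of each redundant pair is dropped first and the SMC is then computed on the remaining items; the sketch keeps the two functions separate so that they can be chained in that order.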
In order to perform EFA on the first random subsample, we first investigated the optimal number of factors to be extracted through dimensionality analyses, i.e., the scree test [59], parallel analysis (PA; [60]), and the minimum average partial (MAP) correlation statistic [61]. The scree test [59] suggests that the optimal number of factors corresponds to the number of factors before the point at which the downward curve of the eigenvalues flattens out. Parallel analysis [60] compares the observed eigenvalues to the eigenvalues generated from a simulated matrix of random data of the same size. Based on the recommendations of Buja & Eyuboglu [62], we performed PA on 1000 random correlation matrices obtained through permutation of the raw data, and following Longman and colleagues [63], we considered the 95th percentile of the randomly generated eigenvalues as the threshold values. Velicer [61] proposed that the optimal number of factors is the one at which the average partial correlation of the variables (i.e., the MAP statistic) reaches its minimum after partialling out the factors.
Once the optimal number of factors had been determined, we performed the exploratory analyses, again on the first random subsample. We used exploratory structural equation modeling (ESEM, [64]) with WLSMV estimation, theta parameterization, and GEOMIN rotation. ESEM allows for the estimation of all factor loadings (subject to the constraints necessary for identification) and, in general, for an exploration of complex factor structures (similarly to EFA) while allowing access to parameter estimates, standard errors, goodness-of-fit (GOF) statistics, and modeling flexibility (e.g., correlating error variances, obtaining factor scores corrected for measurement error, etc.); all of these features are otherwise commonly associated with CFA. The choice of the final model relied on the GOF indices (using the same criteria described above for the CFA) and the best approximation of a simple structure. As ESEM allows the estimation of the standard errors of loadings, we considered as substantial those loadings whose 95% confidence interval was entirely over the |0.32| threshold.
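The loading criterion can be made concrete as follows. This is a minimal sketch under an assumption that is not stated in the text: the 95% confidence interval is taken to be the symmetric interval estimate ± 1.96 × standard error. The function names and the loadings in the usage example are hypothetical.

```python
def wald_interval(estimate, std_error, z=1.96):
    """Symmetric confidence interval (assumed form: estimate +/- z * SE)."""
    return (estimate - z * std_error, estimate + z * std_error)


def is_substantial(estimate, std_error, threshold=0.32, z=1.96):
    """True when the whole interval lies beyond the threshold in absolute value."""
    lower, upper = wald_interval(estimate, std_error, z)
    return lower > threshold or upper < -threshold


def has_single_substantial_loading(loadings, threshold=0.32):
    """loadings: one (estimate, std_error) pair per factor for a single item."""
    hits = sum(is_substantial(est, se, threshold) for est, se in loadings)
    return hits == 1


if __name__ == "__main__":
    # Hypothetical item with one clear loading and two negligible ones.
    item = [(0.60, 0.05), (0.10, 0.04), (0.05, 0.04)]
    print(has_single_substantial_loading(item))
```

Note that a point estimate above 0.32 is not sufficient under this rule: a loading of 0.40 with a standard error of 0.05 has a lower bound of about 0.30 and would not count as substantial.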
Once a measurement model had been determined through ESEM, we used the data from the other random subsample to test its fit using CFA. Together with the obtained factor model, we also tested alternative models: two more parsimonious models (a single-factor model and an independent-factor model) and a bifactor model, i.e., a model in which the items loaded on a general body awareness and reactivity factor and on specific factors. These models allowed us to examine the reliability of the total score of the BPQ-22. Besides Cronbach's alpha, we computed the indices suggested by Rodriguez and colleagues [65] to test whether the single factor score could be considered sufficiently reliable to be used along with the subscale scores. We thus calculated the omega hierarchical coefficient, the explained common variance (ECV), the proportion of items with a relative bias (i.e., the absolute difference between an item's loading in the unidimensional solution and its general factor loading in the bifactor model, divided by the general factor loading), and the percentage of uncontaminated correlations (PUC, i.e., the number of correlations between items from different group factors divided by the total number of correlations). The use of the total score despite the presence of a multidimensional factor structure is supported if a threshold of 0.80 for the omega hierarchical and of 0.70 for the ECV and the PUC is met [66], and if the proportion of items with a relative bias does not exceed 15% [67].
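As an illustration of how these indices follow from a bifactor solution, the sketch below computes them from standardized loadings. It assumes orthogonal factors and that every item loads on the general factor and on exactly one specific factor; the formulas are the commonly used ones for these indices, and the function names and the four-item example are hypothetical rather than taken from the study.

```python
from itertools import combinations


def explained_common_variance(general, specific):
    """ECV: share of common variance attributable to the general factor."""
    general_var = sum(loading ** 2 for loading in general)
    specific_var = sum(loading ** 2 for loading in specific)
    return general_var / (general_var + specific_var)


def omega_hierarchical(general, specific, groups):
    """Omega hierarchical for standardized loadings and orthogonal factors."""
    general_part = sum(general) ** 2
    group_sums = {}
    for loading, group in zip(specific, groups):
        group_sums[group] = group_sums.get(group, 0.0) + loading
    specific_part = sum(total ** 2 for total in group_sums.values())
    uniqueness = sum(1.0 - g ** 2 - s ** 2 for g, s in zip(general, specific))
    return general_part / (general_part + specific_part + uniqueness)


def percent_uncontaminated_correlations(groups):
    """PUC: item pairs from different group factors over all item pairs."""
    pairs = list(combinations(groups, 2))
    between = sum(1 for a, b in pairs if a != b)
    return between / len(pairs)


def relative_bias(unidimensional, general):
    """Per-item |unidimensional loading - general loading| / general loading."""
    return [abs(u - g) / abs(g) for u, g in zip(unidimensional, general)]


if __name__ == "__main__":
    # Hypothetical four-item bifactor solution with two specific factors.
    general = [0.6, 0.6, 0.6, 0.6]
    specific = [0.4, 0.4, 0.3, 0.3]
    groups = ["A", "A", "B", "B"]
    print(explained_common_variance(general, specific))
    print(omega_hierarchical(general, specific, groups))
    print(percent_uncontaminated_correlations(groups))
```

The PUC depends only on how the items are distributed across the group factors. For three group factors with 7, 7, and 8 items, 161 of the 231 item pairs come from different factors, so the PUC is 161/231, about 0.697.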
Construct validity was investigated by computing Spearman correlation coefficients among the observed scores of the BPQ-22 and the other measures administered to the first sample of participants.
The association of the BPQ-22 scores with background variables was tested by specifying a general linear model that included as predictors sex, age, years of education, relationship status, and occupational status. Dunn's post hoc comparisons with adjustment for false discovery rate were used to test differences between groups in significant categorical predictors.
Finally, we tested the test-retest reliability of the scales on the second sample of participants. The retest coefficient was computed as the Spearman correlation of observed scores at time 1 and time 2, while the stability of scores was evaluated through a Wilcoxon signed-rank test. In order to find evidence for adequate stability of scores, we expected retest coefficients larger than 0.70 (i.e., about 50% of shared variance) and negligible or low effect sizes (i.e., r < 0.30) for the Wilcoxon signed-rank test.
Wherever possible, we computed and reported 95% confidence intervals and measures of effect size. For the data analyses we used IBM® SPSS® 27 software.
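The retest coefficient described above can be sketched in a few lines. The example below covers only the Spearman coefficient (Pearson correlation of average ranks) and the 0.70 criterion, not the Wilcoxon signed-rank test; the function names and the scores are hypothetical and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def meets_retest_criterion(rho, threshold=0.70):
    """True when the retest coefficient exceeds the threshold used in the text."""
    return rho > threshold


if __name__ == "__main__":
    # Hypothetical scale scores for six participants at time 1 and time 2.
    time1 = [12, 15, 9, 20, 14, 11]
    time2 = [13, 14, 10, 19, 15, 11]
    rho = spearman(time1, time2)
    print(rho, rho ** 2, meets_retest_criterion(rho))
```

Squaring the coefficient gives the shared variance, which is why a coefficient of 0.70 corresponds to roughly half of the variance (0.49).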

Results
The percentage of missing answers never exceeded 1% (see Table S1 in the Supplementary Materials (SM)), but for some items, the distribution was highly positively skewed (Figure 1), as the 4 and 5 answer options were never or very rarely endorsed. Since we planned to analyze these data as ordinal, the resulting sparseness of the contingency tables of item scores would have affected the estimates of the correlation coefficients. We could have removed these items from the item pool, but this would have affected the content validity of the questionnaire (i.e., "the degree to which elements of an assessment instrument are relevant to and representative of the targeted construct for a particular assessment purpose"; [68], p. 238). Hence, we decided to address the issue of rarely endorsed response categories by collapsing categories 3, 4, and 5. This practice has traditionally received mixed consideration in the literature, as it has had both supporters (e.g., [69]) and opponents (e.g., [70]). However, a recent simulation study showed that collapsing rarely used response options had negligible effects on establishing valid psychiatric symptom structures, and thus it is a feasible option as long as item scores are specified as ordinal and the sample size is adequate [71], as is the case in this study. Reducing the response scale from 5 to 3 options ensured that the more severe symptom states were represented by at least 1% of the sample ([71]; see Table S1 in the SM).
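The recoding step can be illustrated with a short sketch. It assumes that the original responses are coded 1 to 5 with missing answers stored as None; the function names and the response vector are hypothetical, and only the mapping (categories 3, 4, and 5 merged) and the 1% criterion come from the text.

```python
from collections import Counter

# Categories 3, 4, and 5 are merged into a single top category.
COLLAPSE_MAP = {1: 1, 2: 2, 3: 3, 4: 3, 5: 3}


def collapse(responses):
    """Recode 5-point responses to 3 points, leaving missing answers as None."""
    return [COLLAPSE_MAP[r] if r is not None else None for r in responses]


def category_shares(responses):
    """Proportion of valid answers falling in each observed category."""
    valid = [r for r in responses if r is not None]
    counts = Counter(valid)
    return {category: counts[category] / len(valid) for category in sorted(counts)}


def sparse_categories(responses, categories, min_share=0.01):
    """Categories endorsed by less than min_share of the valid answers."""
    shares = category_shares(responses)
    return [c for c in categories if shares.get(c, 0.0) < min_share]


if __name__ == "__main__":
    # Hypothetical, strongly right-skewed item with 200 valid answers.
    raw = [1] * 150 + [2] * 40 + [3] * 8 + [4] * 1 + [5] * 1
    print(sparse_categories(raw, range(1, 6)))
    print(sparse_categories(collapse(raw), range(1, 4)))
```

In this hypothetical item, categories 4 and 5 each hold 0.5% of the answers before recoding, whereas the merged top category holds 5% afterwards.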
The results of the CFA performed on the total sample to test the adequacy of the original three-factor measurement model revealed a poor fit of this model (χ²(985) = 6918.283, p < 0.001; comparative fit index (CFI) = 0.789, Tucker-Lewis index (TLI) = 0.778, root mean square error of approximation (RMSEA) = 0.067 (90% confidence interval: 0.065-0.068)). The inspection of modification indices suggested the specification of both cross-loadings (i.e., loadings of some items on the non-expected factor) and correlated uniquenesses (i.e., a correlation between the model-estimated residual variances of a pair of items). We chose to avoid post-hoc modifications of the model, since this practice is known to lead to the specification of models that capitalize on chance characteristics of the data and thus are likely to have poor replicability (see, e.g., [72]).
Before using the cross-validation approach described earlier to find the most suitable measurement model for the BPQ items, we searched for redundancies and items with low squared multiple correlations (SMC) using the total dataset. The only pair of items that exceeded the threshold for redundancy was the one that comprised items 26 ("feeling constipated") and 43 ("I am constipated"). Given that these two items tap into the same symptom, we kept only item 43 in the item pool. We then computed the SMC for all the remaining items. The only item with an SMC smaller than the threshold was item 11 ("muscle tension in my face", SMC = 0.058), which was removed from the item pool.
The total sample was then randomly split, and dimensionality and EFA were carried out on the first random subsample (n = 668) to determine the optimal number of factors and the most suitable measurement model for the BPQ-SF items. We then tested the fit of this and alternative measurement models using CFA on the other random split (n = 693).
The line of the scree plot seemed to flatten out at the fourth or fifth factor, suggesting the extraction of three or four factors, respectively (Figure 2). However, the parallel analysis revealed that six observed eigenvalues were larger than the 95th percentile of the corresponding random eigenvalues, while the MAP statistic reached its minimum at the fourth factor (0.0175, 0.0144, 0.0111, 0.0110, 0.0112, 0.0114, 0.0121, 0.0127). Hence, it emerged that the optimal number of factors could be three, four, or six, which would account for 40.98%, 44.78%, and 51.36% of the variance, respectively.
We then used ESEM to test the fit of these models. The results are reported in Tables 1-3. The three-factor solution had an acceptable fit (χ²(817) = 1496.322, p < 0.001; CFI = 0.942, TLI = 0.933, RMSEA = 0.035 [0.032; 0.038]) and emerged as the most suitable measurement model, since we could find at least seven items per factor that had a single loading with a confidence interval entirely over 0.32.
The 4- and 6-factor ESEM models showed a slightly better fit (χ²(776) = 1279.736, p < 0.001; CFI = 0.957, TLI = 0.948, RMSEA = 0.031 [0.028; 0.034] and χ²(697) = 966.611, p < 0.001; CFI = 0.977, TLI = 0.969, RMSEA = 0.024 [0.020; 0.028], respectively). However, in the former, one factor was defined by only two items that referred to cough issues (2 and 34), while in the latter, no item showed a single loading with a confidence interval entirely over 0.32 on the second factor. Although none of the dimensionality analyses suggested doing so, we also tested a 5-factor model. This model had an acceptable fit (χ²(736) = 1094.002, p < 0.001; CFI = 0.969, TLI = 0.961, RMSEA = 0.027 [0.024; 0.030]), but it did not show evidence of an approximate simple structure, as there were three items with loadings whose confidence interval was entirely over 0.32 on two factors, and there were two factors on which only two items loaded (see Table S2 in the SM). In the final model, the first factor (BPQ-BOA) was loaded by body awareness items (15, 18, 16, 17, 12, 19, 5) and the third factor (BPQ-SUP) by supradiaphragmatic reactivity items (32, 27, 33, 28, 39, 37, 38, 34). The second factor (BPQ-BOA/SUB) comprised some items from the body awareness scale and some from the subdiaphragmatic reactivity scale (45, 42, 14, 44, 13, 46, 7), all tapping into bloating and digestive issues. Together, these items make up the BPQ-22.
Once a measurement model had been determined through ESEM, we used the data from the other random subsample to test the fit of this model and of three alternative models (single-factor model, three-independent-factor model, and bifactor model) using CFA. The results are reported in Tables 4 and 5 and show that both the three-correlated-factor and the bifactor model have an adequate fit.
Although the single factor score showed a Cronbach's alpha of 0.91, the omega hierarchical coefficient was 0.76 (hence lower than the recommended threshold of 0.80), and the explained common variance (ECV) was 0.54, suggesting a weak general factor. Moreover, the proportion of items with a relative bias larger than the recommended 15% was 0.46, and the percentage of uncontaminated correlations was 0.69. As none of these indices met the criteria described in the Data Analysis section, we concluded that these results did not support the use of a total score.
Table 4. Goodness-of-fit indices for the confirmatory factor analysis models on the second random subsample (n = 693). Note: all chi-squared tests were significant at p < 0.001; df = degrees of freedom; CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root-mean-square error of approximation; CI = confidence interval.
Table 6 shows the correlations of the scores on the BPQ-22 scales with the other scales in this study. Given the very large sample size, we refrained from reporting significance levels, as a correlation as low as 0.089 would have been significant at p < 0.001, and we comment only on effect sizes. The three BPQ-22 scales showed very similar, moderate (i.e., 0.30 < ρ < 0.50) correlations with the SSAS, suggesting that higher scores on any scale are associated with a higher tendency to experience somatic and visceral sensations as intense and disturbing. The body awareness scale scores tended to be less correlated with the other measures than the other two scales of the BPQ-22, as coefficients were always in the weak range (i.e., 0.10 < ρ < 0.30). Although only weakly correlated with each other, the BPQ-SUP and the BPQ-BOA/SUB showed a similar pattern of moderate, positive correlations with the DASS scales, suggesting that higher scores tend to be associated with higher levels of anxiety, depression, and stress symptoms.
The same two BPQ-22 scales had a strong correlation (i.e., ρ > 0.50) with the MSPQ, which is consistent with the expectation that individuals with higher levels of supra- and subdiaphragmatic reactivity tend to report higher levels of somatic complaints, supporting the convergent construct validity of the scales. Finally, the BPQ-SUP scale had low-to-moderate correlations with the OBQ scales, while the other two BPQ-22 scales showed only weak correlations with them. This result indicates that individuals with a higher supradiaphragmatic reactivity tend to generally report more intense misappraisals of intrusive thoughts. Taken together, these results seem to support the convergent and discriminant validity of the BPQ-22 scales.
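The effect-size bands used to interpret these correlations can be expressed as a simple lookup. This is a minimal sketch: the text gives the bands as open intervals, so the assignment of the boundary values 0.10, 0.30, and 0.50 and the label "negligible" for coefficients below 0.10 are assumptions made here, as is the function name.

```python
def correlation_band(rho):
    """Label a correlation coefficient by its absolute size."""
    size = abs(rho)
    if size > 0.50:
        return "strong"
    if size >= 0.30:
        return "moderate"  # boundary handling at 0.30 and 0.50 is an assumption
    if size >= 0.10:
        return "weak"
    return "negligible"  # label assumed; the text does not name this range


if __name__ == "__main__":
    # Illustrative coefficients only.
    for value in (0.08, 0.22, 0.41, 0.63, -0.35):
        print(value, correlation_band(value))
```

Because the label depends on the absolute value, negative coefficients are classified by their magnitude, and the sign still has to be read separately.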

Discussion
The aim of the present study was to validate the Italian version of the BPQ-SF [36] by evaluating the possibility of collapsing response categories and reducing the number of items, as well as its factor structure, reliability, and convergent and discriminant validity. Our results supported the collapsing of response categories 3, 4, and 5, the item reduction from 46 to 22, a three-factor structure (consisting of a body awareness factor, a supradiaphragmatic factor, and a subdiaphragmatic/body awareness factor), the test-retest reliability, and the convergent and discriminant validity of the scale.
Though collapsing response categories has been both supported (e.g., [69]) and opposed (e.g., [70]), we collapsed response categories 3, 4, and 5 to preserve the content validity of the questionnaire. In support of this choice, a recent simulation study showed that collapsing infrequently used response options had minor effects on establishing valid psychiatric symptom structures. Therefore, it represents a viable option as long as item scores are specified as ordinal and the sample size is adequate [71]. Both conditions were met in this study.
The three-factor structure is in line with the original findings by Cabrera et al. [36] and with the findings by Wang et. al. [73] for the Chinese validation of BPQ-SF. The BOA factor, consisting of items related to the upper parts of the body (e.g., "watering or tearing of my eyes") or to the whole body (e.g., "goose-bumps") may reflect the convergence of cranial and spinal pathways integration in the brainstem, while the SUP factor may reflect the function exerted by VVC whose fibers originate in the NA in the brainstem. Interestingly, we found that the third factor, BOA/SUB, included three items from the original BOA subscale as well as four items that are related to subdiaphragmatic issues, all tapping into bloating and digestive issues. Though it may seem surprising or unexpected, there is evidence that may account for this finding. Recently, Kaelberer et. al. [74], using optogenetics and whole-cell patch-clamp recordings, found remarkable evidence for a gut-brain neural circuit involved in nutrient sensory transduction through enteroendocrine cells. Nevertheless, unlike other sensory epithelial cells, no synaptic connection has been established between enteroendocrine cells and a cranial nerve [75]. It is believed that these cells can act on nerves only through an indirect effect mediated by the slow endocrine action of hormones, like cholecystokinin. However, circulating concentrations of cholecystokinin typically reach their highest levels just several minutes after food ingestion. Hence, this evidence suggests that the central nervous system may sense gut sensory information through faster synaptic transmission. A monosynaptic tracing, using rabies virus, allowed us to discover that enteroendocrine cells project to vagal nodose neurons in one synapse. Furthermore, optogenetic stimulation of enteroendocrine cells generated excitatory postsynaptic potentials in synaptically-associated nodose neurons within milliseconds. 
Finally, cholecystokinin and glutamate pharmacological inactivation experiments showed that enteroendocrine cells (termed neuropod cells; [74]) use glutamate as a neurotransmitter to relay sensory gut information to the brainstem. Remarkably, vagal nodose neurons are pseudounipolar afferent neurons, so the information is relayed in just one synapse to the brainstem and, in particular, to the nucleus of the tractus solitarius (NTS). The NTS is a brainstem nucleus that represents an integrative hub for olfactory and gustatory information [76] and that is upstream of both the NA and the DMNX. In addition to receiving sensory information from the gut, the NTS also receives information from the insula [77], which is typically believed to play a critical role in human interoceptive awareness [78]. Thus, the NTS seems to integrate subdiaphragmatic reactivity information originating directly from the gut (e.g., the information outflow originating from neuropod cells) and bodily awareness information originating from the insula. Accordingly, the SUB component of our BOA/SUB factor may be represented by gut sensory information projected by cells like neuropod cells and relayed by the NTS to cardioinhibitory fibers stemming from the DMNX, while the BOA component may be represented by bodily awareness information projected by the insula and relayed by the NTS to cardioinhibitory fibers stemming from the NA. Together, this may represent the implementation of the immobilization without fear state, through a neuroception of safety [29], which, in the polyvagal theory, is believed to require a co-activation of the NA and the DMNX fibers. The co-activation of myelinated NA fibers would ensure a sense of safety given by the awareness of one's own bodily state, which could be, or promote, a portal to self-compassion [79][80][81][82][83][84][85][86][87][88][89][90].
Regarding convergent validity, the three BPQ-22 scales showed moderate correlations with the SSAS, suggesting that higher scores on any scale are associated with a higher tendency to experience somatic and visceral sensations as intense and disturbing. The BPQ-SUP and the BPQ-BOA/SUB showed a similar pattern of moderate, positive correlations with the DASS-21 subscales, suggesting that higher scores tend to be associated with higher levels of anxiety, depression, and distress symptoms. Accordingly, appraisals of anxiety, depression, and distress symptoms tend to be highly rooted in bodily perception. Furthermore, the BPQ-SUP and BPQ-BOA/SUB scales strongly correlated with the MSPQ, which suggests that individuals with higher levels of supra- and subdiaphragmatic reactivity tend to experience higher levels of somatic complaints, supporting the convergent construct validity of the scales. Finally, the BPQ-SUP scale had low-to-moderate correlations with the OBQ scales, while the other two BPQ-22 scales showed only weak correlations with them. This result indicates that individuals with higher supradiaphragmatic reactivity generally tend to report more intense misappraisals of intrusive thoughts. Overall, these results seem to support the convergent and discriminant validity of the BPQ-22 scales.
Regarding background demographic variables, we found that single participants tended to report higher scores on the SUP scale than participants in a relationship. These results may be explained considering that, according to the polyvagal theory, intimacy and romantic relationships require immobilization without fear, which is implemented by a co-activation of the VVC and the DVC, thus involving the whole parasympathetic nervous system. VVC activation, requiring the activity of the myelinated fibers of the NA, may contribute to the homeostatic regulation of sympathetic and supradiaphragmatic activation. Single and lonely people may present a state, or trait, tendency to rely on the sympathetic nervous system [91,92], associated with a tendency not to trust parasympathetic nervous system states, which may be sensed as unsafe. For the BOA scale, we found that scores tended to decrease with age, a finding that is in line with previous research demonstrating that aging tends to increase sensory thresholds for a variety of exteroceptive and proprioceptive stimuli [93]. Unsurprisingly, age-related declines have been demonstrated in VVC cardiac autonomic regulation through the assessment of respiratory sinus arrhythmia [94][95][96]. These results suggest that the co-occurring changes in bodily sensations may reflect dampened sensory transmission between body and brain over time.
Regarding the test-retest reliability of the scales, we evaluated it on another sample of participants. The scores were consistent after a 3-week period, as all test-retest correlations were well above 0.70, showing that the BPQ-22 has good test-retest reliability.
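For readers who wish to run a comparable check on their own data, a test-retest correlation can be sketched as follows. This is a minimal sketch under stated assumptions: the text does not specify which coefficient was computed, so the Pearson correlation used here is an assumption, and the function name, the scale scores, and the way the 0.70 benchmark is applied are hypothetical illustrations rather than the study's procedure or data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary at both time points")
    return cov / math.sqrt(ss_x * ss_y)


# Made-up scale scores for the same participants at test and at retest.
scores_test = [12, 15, 9, 20, 17, 11]
scores_retest = [13, 14, 10, 19, 18, 10]

r = pearson_r(scores_test, scores_retest)
# The 0.70 benchmark is the one mentioned in the text above.
print(f"test-retest r = {r:.2f}; above 0.70: {r > 0.70}")
```

An intraclass correlation or a rank-based coefficient could be substituted if the scores are treated as ordinal or if absolute agreement is of interest.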
This study has some limitations that should be outlined. First, the questionnaire's psychometric properties were evaluated only in a large non-clinical sample recruited from the general Italian population; further studies should confirm its three-factor structure and adequate reliability and validity in clinical samples, although this would require large clinical sample sizes. Second, though the scale had a sound pattern of convergent and discriminant validity, other tools for a neurophysiologically informed assessment of body perception were not available when the present study was planned. Thus, future studies using other neurophysiologically informed convergent measures are needed to confirm our conclusions.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/ijerph18073835/s1, Table S1: Item scores distribution in the 5- and 3-point scoring conditions.
Institutional Review Board Statement: The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of the University of Pisa (protocol code 0036344/2020 and date of approval 3rd of April 2020).
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.