Psychometric properties of the Social Behavior Questionnaire (SBQ) in a longitudinal population-based sample

We assessed the psychometric properties of the Social Behavior Questionnaire (SBQ), a 30-item questionnaire evaluating social (e.g., disruptive behaviors, bullying) and emotional problems (e.g., anxiety, depression) among children aged 3.5–12 years. Children (n = 1,950, 50.21% boys) were drawn from the Quebec Longitudinal Study of Child Development. Mothers reported the frequency with which children presented social and emotional behaviors from 3.5 to 8 years of age, and teachers from 6 to 12 years. We assessed internal structure using Confirmatory Factor Analysis, reliability using Cronbach’s alpha, and convergent and discriminant validity using a multitrait-multimethod (MTMM) approach. The six-factor (emotional distress, withdrawal, impulsive/hyperactive/inattentive, disruptive behaviors, prosocial behaviors, and peer relationships difficulties) structure of the SBQ showed good fit from ages 3.5 to 12 years. Reliability estimates were good to excellent (alphas > .7), and MTMM showed good convergent and discriminant validity. Overall, the SBQ presented good psychometric properties with a large population-based sample aged 3.5–12 years. Further studies should assess its screening potential by investigating its convergent validity with diagnostic information.

The aim of this study was to evaluate the psychometric properties of parent and teacher SBQ assessments in a cohort of Canadian children aged 3.5-12 years. Following the guidelines proposed by Boateng et al. (2018), we evaluated the SBQ's internal structure using Confirmatory Factor Analysis (CFA), reliability using Cronbach's alpha, and convergent and discriminant validity using an MTMM approach.

Method
The purpose of the SBQ is to assess children's social and emotional behaviors within the home and the school environment. Thus, the SBQ ranks children according to the severity of the assessed behavior (measured as a latent variable with multiple indicators) as evaluated by teachers and parents, separately. The following section describes the item selection and the analyses conducted to evaluate the psychometric properties of the SBQ within raters.

Data and Sample
Data were drawn from the Québec Longitudinal Study of Child Development (QLSCD), a representative cohort of 2,120 infants born in Québec, Canada, between October 1997 and July 1998. Participants were selected from the Québec Birth Registry based on living area and birth rate. Inclusion criteria were a singleton pregnancy, a pregnancy lasting 24-42 weeks of amenorrhea, and maternal literacy in French or English. The QLSCD protocol was approved by the Institut de la Statistique du Québec (ISQ) and the Sainte Justine Hospital Research Center ethics committees on 10 March 1998 and 10 January 2004, respectively. Ethics approval and written informed consent were obtained from the person most knowledgeable about the child at each data collection wave. Data were collected, in either English or French, by trained interviewers from the person most knowledgeable about the child (the mother in >98% of cases) at child ages 3.5, 4, 5, 6, 7, 8, 10, and 12 years. Teachers also rated the children's social and emotional behavior at 6, 7, 8, 10, and 12 years of age. At 3.5 years, the QLSCD sample consisted of 1,950 children, of whom 50.21% were boys (Table 1).

Measurements
From the initial pool of items used between 3.5 and 12 years of age, we selected 30 items based on the following criteria: (1) best construct validity according to expert opinion and (2) highest loadings on the original factor (Table 2). For the former, we held several meetings with experts in child development, developmental psychopathology, and psychometrics, including some of the co-authors of this study (CO, MO, MB, and SMC), to select the most informative items. Items were reported by the person most knowledgeable about the child and by the teacher. Items were answered on a three-point Likert-type scale ("never or not true," "sometimes or somewhat true," "often or very true") referring to the past 12 months. The positively phrased items were reverse-coded. At each data point, the subscale scores were obtained by calculating the mean of the items. The SBQ is currently available in both French and English (Supplemental Appendices 1 and 2 for the French and English versions, respectively).
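The scoring rule described above (reverse-code the positively phrased items, then average the items of each subscale) can be sketched as follows. This is a minimal illustration, not the study's scoring script: the numeric codes 0/1/2 for the three response options, the item names, and the choice to average only the answered items are all assumptions made for the example.

```python
from statistics import mean

# Assumed coding (not stated in the text): 0 = "never or not true",
# 1 = "sometimes or somewhat true", 2 = "often or very true".
MAX_CODE = 2


def reverse_code(value, max_code=MAX_CODE):
    """Flip a positively phrased item so that higher values mean more difficulty."""
    return max_code - value


def subscale_score(responses, items, reversed_items=frozenset()):
    """Mean of the answered items of one subscale; None if every item is missing.

    Averaging over answered items only is an assumption of this sketch.
    """
    values = []
    for item in items:
        value = responses.get(item)
        if value is None:
            continue
        values.append(reverse_code(value) if item in reversed_items else value)
    return mean(values) if values else None


# Hypothetical item names, used only for illustration.
child = {"pro1": 2, "pro2": 1, "pro3": None, "dis1": 0, "dis2": 1}
prosocial_items = ["pro1", "pro2", "pro3"]
prosocial = subscale_score(child, prosocial_items, reversed_items=set(prosocial_items))
disruptive = subscale_score(child, ["dis1", "dis2"])
print(prosocial, disruptive)
```

With reverse-coding applied, a higher score on the prosocial subscale reflects a lack of prosocial behavior, which matches how that subscale is reported in the sex-difference results.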

Data Screening
We checked for the presence of outliers in item scores. Furthermore, for each data collection wave, we calculated the total number of missing item responses and excluded participants with missing data on all items (i.e., 30 missing item responses, except for the maternal-reported SBQ at 8 years: 26 missing item responses).
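A minimal sketch of this screening step is given below, assuming each wave is stored as a mapping from a participant identifier to that participant's item responses, with `None` marking a missing response. The data layout, the function names, and the valid codes 0/1/2 are hypothetical choices for the example.

```python
VALID_CODES = frozenset({0, 1, 2})  # assumed numeric codes of the 3-point scale


def is_fully_missing(responses, items):
    """True when the participant answered none of the wave's items."""
    return all(responses.get(item) is None for item in items)


def screen_wave(wave, items):
    """Keep participants who answered at least one item in this wave."""
    return {pid: r for pid, r in wave.items() if not is_fully_missing(r, items)}


def out_of_range(responses, valid=VALID_CODES):
    """List the items whose recorded value is not a valid response code."""
    return [item for item, v in responses.items() if v is not None and v not in valid]


# Hypothetical two-item wave with three participants.
items = ["i1", "i2"]
wave = {
    "c1": {"i1": 0, "i2": None},     # partially missing: kept
    "c2": {"i1": None, "i2": None},  # all items missing: excluded
    "c3": {"i1": 3, "i2": 1},        # kept, but "i1" is out of range
}
kept = screen_wave(wave, items)
flagged = out_of_range(wave["c3"])
print(sorted(kept), flagged)
```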

Statistical Analysis
Internal Structure. To examine the internal structure of the SBQ, we carried out first-order CFA using the robust weighted least squares mean- and variance-adjusted (WLSMV) estimator, which accounts for the ordinal nature of the items' response scale. The full information maximum likelihood (FIML) method was used to account for missing data. We applied CFA with the following a priori defined factors: (1) emotional distress, (2) withdrawal, (3) impulsive/hyperactive/inattentive, (4) disruptive behaviors, (5) prosocial behaviors, and (6) peer relationship difficulties. Model fit was evaluated using the comparative fit index (CFI) and the root mean square error of approximation (RMSEA; Tabachnick & Fidell, 2012). The RMSEA has been recognized as the best index when evaluating models using the WLSMV estimator (Yu & Muthén, 2002). We also reported the relative chi-square (i.e., the ratio of chi-square to degrees of freedom, acceptable if ratio ⩽5; Schumacker & Lomax, 2004).
Reliability. Internal consistency was estimated for each subscale using a Cronbach's alpha adapted for Likert-type item responses (Gadermann et al., 2019; Zumbo et al., 2007). Internal consistency values below 0.70 are considered "unsatisfactory," between 0.71 and 0.80 "good," between 0.81 and 0.90 "very good," and above 0.91 "excellent" (Cicchetti, 1994; Cohen, 1977). Internal consistency was also investigated using Omega coefficients. We further estimated inter-rater reliability between maternal and teacher-reported assessments at ages 6 and 8 years using intraclass correlation coefficients (ICC; Shrout & Fleiss, 1979). As the SBQ is not a diagnostic questionnaire but is instead intended for use in cohorts following children, we estimated consistency ICCs. ICC values below 0.5 are considered "poor," between 0.51 and 0.75 "moderate," between 0.76 and 0.90 "good," and above 0.91 "excellent" (Koo & Li, 2016).
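To make the internal consistency step concrete, the sketch below computes the classical (covariance-based) Cronbach's alpha for one subscale and maps it to the interpretation bands quoted above. It is a simplified stand-in: the alpha adapted for Likert-type responses used in the study relies on polychoric correlations, which are not implemented here. The toy data and the handling of values that fall exactly on a band boundary are assumptions.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Classical Cronbach's alpha.

    rows: one list of item responses per child (complete cases only).
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_label(alpha):
    """Interpretation bands from the text; boundary handling is an assumption."""
    if alpha < 0.70:
        return "unsatisfactory"
    if alpha <= 0.80:
        return "good"
    if alpha <= 0.90:
        return "very good"
    return "excellent"


# Hypothetical responses of five children to a three-item subscale.
rows = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 2, 2],
    [0, 0, 1],
    [2, 1, 2],
]
alpha = cronbach_alpha(rows)
print(round(alpha, 3), alpha_label(alpha))
```

Because alpha depends on the number of items, short subscales tend to produce lower values even when their items are reasonably correlated.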
Sex Differences and Intercorrelations. Subscale means were normalized and rescaled to be expressed on a scale from 0 to 10, with higher scores indicating a higher frequency of these behaviors, using the following transformation:

$$v_{\text{new}} = \frac{v - \min_{\text{old}}}{\max_{\text{old}} - \min_{\text{old}}} \times (\max_{\text{new}} - \min_{\text{new}}) + \min_{\text{new}}$$

with
max_old = maximum of the non-transformed subscale score of the sample
max_new = 10
min_old = minimum of the non-transformed subscale score of the sample
min_new = 0
v = non-transformed subscale score of the participant

We described the distribution of each subscale's scores using the mean and standard deviation, stratified by sex. To estimate sex differences, we calculated Hedges' g effect size (interpreted as: very small, <0.20; medium, 0.21-0.50; large, 0.51-0.80; and very large, 0.81-1.20; Cohen, 1988) and used Student's t-tests. Correlations between SBQ subscales for each sex were estimated using Spearman's rank correlation coefficient, accounting for their non-normal distribution.
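The rescaling and the effect size can be illustrated with a short sketch. The rescaling below is the usual min-max transformation implied by the definitions of min_old, max_old, min_new, and max_new. The Hedges' g function uses the common small-sample correction factor 1 - 3 / (4(n1 + n2) - 9); whether the study used this exact correction is an assumption, and the scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def rescale(v, min_old, max_old, min_new=0.0, max_new=10.0):
    """Min-max rescaling of a subscale score onto [min_new, max_new]."""
    return (v - min_old) / (max_old - min_old) * (max_new - min_new) + min_new


def hedges_g(group1, group2):
    """Standardized mean difference with the small-sample correction."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    d = (mean(group1) - mean(group2)) / sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction


# Hypothetical raw subscale means on the assumed 0-2 response scale.
raw = [0.0, 0.5, 1.0, 2.0]
scaled = [rescale(v, min(raw), max(raw)) for v in raw]

# Hypothetical rescaled scores for two small groups.
boys = [2, 4, 6]
girls = [1, 3, 5]
g = hedges_g(boys, girls)
print(scaled, round(g, 2))
```

A positive g in this sketch means the first group (here, boys) has the higher mean score.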

Convergent and Discriminant Validity.
To assess internal convergent and discriminant validity, we used an MTMM approach (Campbell & Fiske, 1959). We estimated item-total score correlations for each subscale, that is, the mean correlation between the items of a given subscale and the total score of the same subscale (e.g., the correlation between the items of the "withdrawal" subscale and the total score of the "withdrawal" subscale). We also estimated inter-item correlations for each subscale, that is, the mean correlation among the items of a given subscale (e.g., the correlation among the items of the "withdrawal" subscale), along with the mean correlation between the items of a subscale and the items of another subscale (e.g., the correlation between the items of the "withdrawal" subscale and the items of the "disruptive behaviors" subscale). We expected items belonging to the same subscale to have higher correlations (i.e., convergent correlations) than items belonging to different subscales (i.e., discriminant correlations).
Furthermore, for the assessments at ages 6 and 8, for which both mother- and teacher-reports were available, we also conducted an MTMM analysis in a CFA framework.
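The item-level comparison of convergent and discriminant correlations can be sketched as follows. The example computes the mean correlation between the items of one subscale and the total score of either the same subscale (convergent) or another subscale (discriminant). It is an illustration under stated assumptions: Pearson correlations are used, the item-total correlation is not corrected for item overlap, and the item names and responses are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def total_score(data, items):
    """Per-child sum of the listed items."""
    return [sum(values) for values in zip(*(data[item] for item in items))]


def mean_item_total(data, items, target_items):
    """Mean correlation between each item in `items` and the total of `target_items`."""
    total = total_score(data, target_items)
    correlations = [pearson(data[item], total) for item in items]
    return sum(correlations) / len(correlations)


# Hypothetical responses of six children (one list per item).
data = {
    "w1": [0, 1, 2, 0, 1, 2],
    "w2": [0, 1, 2, 1, 1, 2],
    "d1": [2, 0, 1, 0, 2, 1],
    "d2": [1, 0, 1, 0, 2, 2],
}
withdrawal = ["w1", "w2"]
disruptive = ["d1", "d2"]

convergent = mean_item_total(data, withdrawal, withdrawal)
discriminant = mean_item_total(data, withdrawal, disruptive)
print(round(convergent, 2), round(discriminant, 2))
```

Under the expectation stated above, the convergent value should exceed the discriminant value for every pair of subscales.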

Results
Data Screening
No outliers were found. Excluding participants with missing data on all items at a given wave led to the inclusion of n = 1,950 participants at age 3.5 years, n = 1,942 at 4 years, n = 1,759 at 5 years, n = 1,492 at 6 years, and n = 1,466 at 8 years for the maternal-reported SBQ. For the teacher-reported SBQ, it led to the inclusion of n = 966 participants at age 6 years, n = 1,311 at 7 years, n = 1,288 at 8 years, n = 991 at 10 years, and n = 1,008 at 12 years.

Internal Structure
The proposed six-factor structure was supported by CFA for both the maternal- and the teacher-reported versions. Models investigating the maternal-reported version of the SBQ from 3.5 to 8 years presented a good fit, as shown by the RMSEA values (0.062-0.065, with 90% confidence intervals [CI] between 0.060 and 0.067) and the CFI values (0.856-0.908). Standardized factor loadings ranged from 0.414 ("Sought the company of other children?" at 6 years) to 0.931 ("Helped other children [friends, brother or sister] who were feeling sick?" at 5 years; Supplemental Table 2). Similarly, models investigating the teacher-reported version of the SBQ presented a good fit: RMSEA values were between 0.075 and 0.081 (with 90% CI between 0.072 and 0.084) and CFI values were between 0.916 and 0.924. Standardized factor loadings ranged from 0.393 ("Sought the company of other children?" at 10 years) to 0.946 ("Hit, bit, or kicked other children" at 12 years; Supplemental Table 3).

Sex Differences and Intercorrelations
For both the maternal-reported and the teacher-reported SBQ, boys had higher scores than girls on the impulsive/hyperactive/inattentive, disruptive behaviors, lack of prosocial behaviors, and peer relationship difficulties scales (effect sizes between 0.11 and 0.76; Table 4). We found positive correlations between the emotional distress, withdrawal, and peer relationship difficulties subscales, for boys and girls and for both the maternal- and teacher-reported SBQ (Supplemental Table 1). Similarly, positive correlations were found between impulsive/hyperactive/inattentive and disruptive behaviors, for both sexes and both reporters.

Convergent and Discriminant Validity
Convergent and discriminant validity results are presented in Table 5 and Supplemental Figure 1 and showed good differentiation between subscales. Overall, for the maternal-reported SBQ, correlations between items from each subscale and their total score (i.e., item-total score correlations) ranged from 0.26 (for the "peer relationship difficulties" subscale at 3.5 years) to 0.66 ("prosocial behaviors" at 5 years). Items belonging to the "impulsive/hyperactive/inattentive" subscale correlated positively with the score of the "impulsive/hyperactive/inattentive" (0.52-0.59) and with the score of the "disruptive behaviors" (0.31-0.36) subscales. For the teacher-reported SBQ, item-total score correlations ranged from 0.38 ("withdrawal" at 8 years) to 0.72 ("impulsive/hyperactive/inattentive" at 7 years and "prosocial behaviors" at 7 years). Items belonging to the "peer relationship difficulties" subscale correlated with the score of the "peer relationship difficulties" (0.44-0.65) and with the score of the "disruptive behaviors" (0.30-0.34) subscales. Finally, items belonging to the "withdrawal" subscale correlated with the score of the "withdrawal" (0.38-0.51) and with the score of the "emotional distress" (0.26-0.39) subscales.
Results for the MTMM analyses conducted using a CFA approach showed a good model fit at both age 6 (RMSEA = 0.050, with 90% CI between 0.048 and 0.051) and age 8 (RMSEA = 0.052, with 90% CI between 0.051 and 0.054). More details about the factor correlations indicating convergent and discriminant validity are provided in Supplemental Table 5.

Discussion
This study investigated the psychometric properties of a brief version of the Social Behavior Questionnaire in a population-based sample of children aged 3.5-12 years. The brief version contains 30 items for children aged 3.5-12 years assessing six dimensions: emotional distress, withdrawal, impulsive/hyperactive/inattentive, disruptive behaviors, prosocial behaviors, and peer relationship difficulties. The included items are inspired by well-known previous scales and have been used in several Canadian studies focusing on child behavioral development (Rouquette et al., 2014; Tremblay et al., 1994). This is important, as it avoids inconsistency in measurement and improves comparability between these studies. The study provided support for the reliability, internal structure, and convergent and discriminant validity of both the parent- and teacher-reported questionnaires. The SBQ can be completed within 6 min by parents or teachers, making it appropriate for large epidemiological studies. At each age, the six-factor internal structure (i.e., emotional distress, withdrawal, impulsive/hyperactive/inattentive, disruptive behaviors, prosocial behaviors, and peer relationship difficulties) was supported by CFA models, with acceptable CFI, RMSEA, and factor loading values. Compared with the eight-syndrome structure of the CBCL (Achenbach, 1991) and the five factors of the SDQ (Goodman, 1997), the six-factor structure of the SBQ provides reliable assessments with fewer items. The internal consistency and reliability of the subscales were satisfactory.
Cronbach's alphas were overall satisfactory for all subscales. This is important considering that Cronbach's alpha is influenced by the number of items (Nunnally & Bernstein, 1993; Streiner, 2003). For the SBQ, the lowest alphas were obtained for the peer relationship difficulties subscale, which is the subscale with the smallest number of items (i.e., 3). The Cronbach's alpha values of the SBQ subscales were slightly higher than those of the SDQ, especially for the hyperactive/inattentive and the disruptive behaviors subscales (both teacher reported; Stone et al., 2010). As for the SBQ, low Cronbach's alpha values for the SDQ's peer relationship difficulties subscale have been reported in several studies (Stone et al., 2010; Theunissen et al., 2013). Similarly, the Cronbach's alpha values of the SBQ subscales were slightly higher than those of the CBCL for the emotional distress, hyperactive/inattentive, and prosocial behaviors subscales (Achenbach & Rescorla, 2001; Kristensen et al., 2010; Nakamura et al., 2009). Furthermore, inter-rater agreement at ages 6 and 8 years was poor, although similar to that reported for the SDQ (Fält et al., 2018; Kersten et al., 2018; Mieloo et al., 2012).
In addition to the CFA, we used the MTMM approach to investigate the convergent and discriminant validity of the SBQ. By estimating the correlations between items and the total scores of all the other subscales, the MTMM approach allows correlations between items to be investigated in a context of comorbidity. The MTMM analyses revealed good convergent and discriminant validity, that is, item-total correlations and inter-item correlations were higher for items from the same subscale than for items from another subscale. Nevertheless, the "impulsive/hyperactive/inattentive" subscale also correlated strongly with the "disruptive behaviors" subscale.
Our supplemental analyses showed the expected sex differences in emotional and social behaviors, with boys scoring higher than girls on the social behavior subscales. These sex differences have been observed when using other questionnaires (e.g., SDQ and CBCL; Hoffmann et al., 2020; Shojaei et al., 2009; Woerner et al., 2004).

Strengths and Weaknesses
This study has several strengths. First, it used data from a large, population-based longitudinal study, which permitted the assessment of the psychometric properties of the SBQ at multiple time points across childhood. Second, we used two complementary approaches, CFA and MTMM matrix methods, to evaluate the questionnaire's internal structure and validity.
Some limitations should be noted when interpreting the findings. First, the SBQ is a brief questionnaire designed for conducting research in community or clinical samples, but it does not provide clinical or diagnostic assessments. Second, we did not validate the SBQ by comparing it with another tool (e.g., SDQ, CBCL) in the same sample. The QLSCD collected intensive information about participants. Including additional questionnaires similar to the existing one, for the purpose of comparing their results, would have significantly increased the time and cognitive burden on participants and reduced response rates (Edwards et al., 2002; Galesic & Bosnjak, 2009). Similarly, we did not have information on clinical diagnoses; therefore, we were unable to investigate whether the SBQ is a good screening instrument for common social and emotional problems diagnosed in children. Additional research is needed to explore the usefulness of the SBQ in clinical settings. Third, culture may play a role in the expression and distribution of behaviors (Office of the Surgeon General et al., 2001), and the SBQ has been validated only within a representative sample of children from the Canadian province of Quebec. Thus, future cross-cultural comparisons and validation will be needed to assess the validity of the SBQ in multiple contexts. Fourth, rater bias might be present in the study. However, it was beyond the scope of this study to consider raters' mental and cognitive characteristics. Finally, the SBQ was designed and evaluated by the same research group.
Like most other scales, the SBQ needs further independent psychometric assessments to accumulate validation evidence (Gridley et al., 2019; Pontoppidan et al., 2017).

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The Québec Longitudinal Study of Child Development was supported by funding from the ministère de la Santé et des Services sociaux, le ministère de la Famille, le ministère de l'Éducation et de l'Enseignement supérieur, the Lucie and André Chagnon Foundation, the Institut de recherche Robert-Sauvé en santé et en sécurité du travail, the Research Centre of the Sainte-Justine University Hospital, the ministère du Travail, de l'Emploi et de la Solidarité sociale and the Institut de la statistique du Québec.

Supplemental Material
Supplemental material for this article is available online.