Internalizing Behavior of Sociometrically Neglected Students in Inclusive Primary Classrooms – A Methodological Issue?

Internalizing problems in children belong to the category of special educational needs called emotional and behavioral difficulties. Recent decades have witnessed a critical discussion about whether children and adolescents experiencing internalizing problems are at risk of being sociometrically neglected (neither liked nor disliked by their peers). Previous studies have shown evidence both for and against the association between internalizing problems and neglected sociometric status. These contradictory results may be due to the following methodological aspects: (1) shortcomings of sociometric status classification methods (arbitrariness of the sociometric classification rules) and (2) different operationalizations of internalizing problems (broadband and narrowband dimensions of behavior). The aim of the present study is to investigate empirically whether and to what extent these methodological aspects lead to contradictory results on the internalizing behavior of neglected students. This question is investigated using a sample of students (N = 2334) in German inclusive primary schools. The systematic investigation presented here provides initial indications that the various methodological approaches can lead to conflicting results. The contradictory results are not only due to the application of different sociometric classification methods, but also to different operationalizations of internalizing behavior (narrowband and broadband scales). Earlier contradictory evidence on the internalizing behavior of neglected students must therefore be seen in a different light: the reasons for previously conflicting results may actually be methodological. Based on the results, conclusions are drawn as to how methodological aspects can be given more consideration in sociometric research on internalizing behavior.

Keywords: special educational needs, inclusive education, social inclusion, sociometric status, sociometric neglect, internalizing behavior, broadband and narrowband dimensions of behavior

INTRODUCTION: SOCIAL INCLUSION OF STUDENTS WITH SPECIAL EDUCATIONAL NEEDS (SEN)
In view of the educational placement of students with SEN in mainstream schools, a major challenge and goal for education policy and educators is the successful social inclusion of these children. The importance of the topic has been emphasized in previous studies, where it is assumed that students with SEN are an at-risk group for social exclusion in mainstream schools (Frostad et al., 2011;Krull et al., 2014;Garrote et al., 2017;Henke et al., 2017). However, proper conceptualization and measurement of social inclusion of students with SEN is the subject of an ongoing debate (Chambers and Kay, 1992;Frederickson et al., 2007;Koster et al., 2009;Gerullis and Huber, 2018).

Poor Conceptualization and Measurement of Social Inclusion
Despite the current evidence, previous reviews (Chambers and Kay, 1992;Nakken and Pijl, 2002;Frederickson et al., 2007) have concluded that research results give a contradictory picture of whether students with and without SEN are socially included to equal extents in mainstream classrooms. A reason for the conflicting evidence could be the unidimensional assessment of social inclusion (Chambers and Kay, 1992). Depending on which dimension of social inclusion is addressed and which measurement technique is used, the answer to the question of whether students with SEN are equally socially integrated can vary. Accordingly, the study by Avramidis et al. (2018) considered several dimensions of social inclusion. A differentiated picture emerges here insofar as students with SEN (compared to students without SEN) received fewer nominations of peer acceptance and had fewer friends and social interactions with classmates (sociometric measures), whereas they showed no difference in social self-perception and perception of friendship quality (self-report questionnaires). Furthermore, a comparison of the measurement of social inclusion with two different sociometric tools, i.e., the peer nomination method and social cognitive mapping (Avramidis et al., 2017), indicated that research using sociometric nominations has mainly revealed an unfavorable social inclusion of children with SEN while social cognitive mapping has yielded mixed and positive results. The same is true if sociometric tools are contrasted with observational methods. The use of sociometric tools implies mostly negative effects of social inclusion for students with SEN and the application of observational methods indicates mainly positive effects (Chambers and Kay, 1992). These contradictory findings may be due to the different operationalizations of social inclusion. Koster et al. (2009), who point out the multidimensional nature of the social inclusion of students with SEN (e.g., interactions, friendships, and self-perception of peer acceptance), criticize its poor conceptualization and measurement (see also Gerullis and Huber, 2018): previous studies invoke the concept of social inclusion without further specifying it, but use different operationalizations and therefore refer to different dimensions of social inclusion. Hence, inconsistent results could be an effect of different operationalizations. Another reason for the contradictions, which is examined in more detail below, could be that students with SEN are regarded as a homogeneous group.

Students With SEN Are Not a Homogeneous Group
Another reason for the conflicting research findings could be that students with SEN are considered as a homogenous group. Many studies on social inclusion of students with SEN (e.g., Frostad et al., 2011) do not distinguish between different types of disabilities or difficulties (e.g., mental disability, physical disability, behavior difficulties, and learning difficulties). The educational needs of students with different types of disabilities or difficulties vary considerably, and therefore inconsistent research findings may reflect the fact that students with SEN are not a homogeneous group (O'Mara et al., 2012). Thus, the study by Krull et al. (2014) showed that the social inclusion of students with SEN depends on the type of disability or difficulty. The results indicate that students with both behavior problems and learning difficulties are more likely to be rejected by their peers than students without SEN, although the rejection rate for students with behavior problems is twice as high as for those with learning difficulties. The highest risk of rejection is for students with combined behavior and learning difficulties. Accordingly, it becomes clear that the evidence found on the social inclusion of students with specific SEN cannot simply be transferred to students with other types of disabilities or difficulties. This issue is described in more detail below with reference to students with emotional and behavioral difficulties.

Students With Emotional and Behavioral Difficulties
Even when research on social inclusion focuses on a particular group of students with SEN, namely students with emotional and behavioral difficulties, generalizations may be flawed. As the category of emotional and behavioral difficulties includes students with both internalizing and externalizing problems, a homogeneous sample cannot be assumed even within this group (Kershaw and Sonuga-Barke, 1998). Accordingly, the educational needs of the students vary depending on the behavior at hand (Kershaw and Sonuga-Barke, 1998). This difference also appears as a function of the type of behavior (internalizing vs. externalizing) for social inclusion (Rytioja et al., 2019). The results of the study by Rytioja et al. (2019) imply that externalizing problems are most strongly associated with the controversial sociometric status (students both liked and disliked by many peers) followed by the rejected sociometric status (students liked by few and disliked by many peers), while for students with internalizing problems, the opposite is the case. Thus, internalizing problems are most strongly related to the rejected sociometric status, followed by the controversial status. As far as the popular sociometric status (students liked by many and disliked by few peers) is concerned, externalizing behavior for this group tends to be higher than internalizing behavior, whereas externalizing and internalizing problems are equally pronounced for sociometrically neglected students (neither liked nor disliked by their peers). Overall, the study by Rytioja et al. (2019) shows that internalizing and externalizing problems are related to different qualities of social inclusion (different sociometric groups). However, as research results on the social inclusion of children with internalizing problems are far from clear, especially when the sociometric neglect is used as an indicator of social inclusion, the present study focuses on the group of students with internalizing problems.

Internalizing Behavior and Sociometric Neglect
A positive association between internalizing behavior and sociometric neglect is often assumed (Wilmshurst, 2017, p. 307): "Neglected children were shy and withdrawn (Coie and Kupersmidt, 1983), had higher levels of social anxiety and were at greater risk for internalizing problems then their non-neglected peers." However, a closer examination of the evidence shows that the association is not as clear as suggested. While Coie and Kupersmidt (1983) found that neglected children exhibit more shyness, Cantrell and Prinz (1985) report that there is no association between sociometric neglect and shyness, withdrawal, and anxiety. On the contrary, other study results support the association of sociometric neglect with anxiety (La Greca et al., 1988;Strauss et al., 1988). Regarding depression, neglected children are more likely to report depressive feelings (Kupersmidt and Patterson, 1991) and show severe anhedonic symptoms (Hecht et al., 1998). Again, contrary findings can be identified. Burton and Krantz (1990) conclude that neglected children are not prone to depression or anxiety. Meta-analytical results (Newcomb et al., 1993) show that neglected children evidenced more withdrawal and less depression than average children (but rejected children are most at risk). Mega-analytic results (Crews et al., 2007) show that the controversial status is the strongest sociometric risk factor for internalizing behavior, followed by the neglected and rejected status. More recent studies show an increased level of anxiety, depression, loneliness, and withdrawal in neglected children (DeRosier and Thomas, 2003;Friedman, 2004;Kaya, 2007;Woodhouse et al., 2012) while other studies show no increased association between sociometric neglect and anxiety or withdrawal (Delgado et al., 2016;García Bacete and Cillessen, 2017). The latest study (Rytioja et al., 2019) shows that rejected children exhibit the strongest internalizing behavior.
Consequently, there is evidence for and against the association of internalizing behavior with sociometric neglect. Therefore, in recent decades a critical discussion has taken place about whether children experiencing internalizing problems are at risk for being neglected (Hymel and Rubin, 1985;Bierman, 1987;Rubin et al., 1989;Howe, 2010;Kingery et al., 2010;Epkins and Heckler, 2011;Kulawiak and Wilbert, 2019). The findings of about half of the studies presented suggest that internalizing problems are not related to sociometric neglect or that other sociometric groups are more strongly associated with internalizing problems, which seems to be the case especially for the rejected and controversial groups. The evidence on the internalizing behavior of neglected children is therefore best described as contradictory and must be treated with caution (Howe, 2010).

METHODOLOGICAL REASONS FOR THE CONFLICTING FINDINGS ON INTERNALIZING BEHAVIOR IN NEGLECTED CHILDREN
The aforementioned conflicting findings raise the question of the reasons for this phenomenon, which make it possible to better understand the differences in the internalizing behavior of neglected children. Two methodological aspects are discussed in more detail in the following: (1) sociometric status classification issues and (2) different operationalizations of internalizing behavior.

Sociometric Status Classification Issues
A variety of different sociometric status classification methods is still widely used in research on the social inclusion of children with disabilities and behavioral difficulties (Krull et al., 2014;Avramidis et al., 2017;Garrote, 2017;Rytioja et al., 2019). According to Rubin et al. (1989), the diversity of applied sociometric classification rules can be seen as a reason for the equivocal evidence on the behavior of neglected children. Sociometric classification criteria are arbitrarily defined across sociometric classification methods (Terry and Coie, 1991;Mayeux et al., 2007;Kulawiak and Wilbert, 2019), which is particularly apparent for the group of neglected students (Rubin et al., 1989, p. 96): "[. . .] the standardized score approach popularized by Coie and Dodge and their colleagues [. . .] has been most often employed in recent sociometric research, yet the specific criteria used to identify sociometric subgroups differs from one research report to the next. This is particularly true in the case of neglected children where criteria appear to vary, in part, as a result of efforts to increase sample size. [. . .] Minor variations of this sort produce very different frequencies of identified neglected children." The arbitrary variation of classification rules (as described by Rubin et al., 1989) is a common but doubtful practice and a methodological problem (probably linked to equivocal research results). As a result, there is low classification agreement across methods (McMullen et al., 2014). Different methods then classify a child into different groups. In view of the arbitrary classification rules, it can be stated that there is no consensus method (Rubin et al., 1989) and no consensus about the state of sociometric neglect (Kulawiak and Wilbert, 2019). McMullen et al.'s (2014) findings support Rubin et al.'s (1989) assumption that the diversity of sociometric classification rules could be a reason for contradictory research results.
They compared eight different sociometric classification methods by analyzing the associations between the different sociometric statuses and social withdrawal (assessed via peer nomination). In four methods, the neglected sociometric status correlated most strongly with social withdrawal. In the other four methods, the rejected sociometric status correlated most strongly with social withdrawal. Hence, different sociometric classification methods lead to different results regarding the association of sociometric status with social withdrawal. These results suggest that the equivocal evidence on the sociometric status of students with internalizing problems should also be discussed against the background of the sociometric classification methods used.

Different Operationalizations of Internalizing Behavior
The second point that should be considered as a reason for the conflicting evidence on internalizing behavior in neglected children has to do with different operationalizations of internalizing behavior. The behavior of students with internalizing problems is characterized by depressive, anxious, and somatic symptoms, as well as social withdrawal, shyness, sadness, and low self-esteem (Whitcomb, 2018). In line with the current discussion on whether children's behavior can best be described by broadband (internalizing and externalizing behavior) or narrowband categories (e.g., depression, anxiety, and somatization as underlying dimensions of internalizing behavior) (Tandon et al., 2009;Achenbach et al., 2016), it seems equally important to discuss the results of sociometric research against this background. In many cases, broadband scales of behavior are a subsumption of narrowband scales of behavior (Goodman et al., 2010). In earlier sociometric studies, both broadband (e.g., Rytioja et al., 2019) and narrowband scales of behavior (e.g., Tani and Schneider, 1997) were used to assess internalizing behavior. Studies that examine narrowband dimensions of internalizing problems provide an indication that only certain dimensions of internalizing behavior are related to sociometric neglect (differentiated effects). For example, the study by Tani and Schneider (1997) shows that neglected children have a higher degree of somatization than rejected and average children, while depressive symptoms are weaker and feelings of anxiety comparable. Such differentiated effects are also reported in a study by Kulawiak et al. (under review) on the association of narrowband and broadband dimensions of behavior with school-relevant outcomes. Hence, the authors conclude that narrowband scales are more informative than broadband scales of behavior when describing the association between behavior and students' social and academic outcomes.
This is also reflected in the fact that some narrowband scales show stronger effect sizes than broadband scales. Accordingly, the question arises as to whether the examination of internalizing behavior in neglected children by a broadband scale does not systematically underestimate the association, whereas the application of narrowband scales would show differentiated associations and stronger effect sizes. Hence, the equivocal evidence on the internalizing behavior of neglected students should also be discussed against the background of the measured dimensions of internalizing behavior (broadband and narrowband). However, it must also be taken into account that a wide range of different measuring approaches is used to assess internalizing behavior (peer ratings, teacher ratings, clinical assessments, etc.). This variety can also impact research results on internalizing behavior in neglected children.

AIMS
The considerations presented above about the methodological factors that can determine research results on the internalizing behavior of sociometrically neglected students emphasize the importance of systematically investigating these methodological aspects in order to gain a clear understanding of previously conflicting findings. Accordingly, the present data on internalizing behavior in neglected students are considered from a methodological perspective (Figure 1): (1) the application of different sociometric status classification methods and (2) the use of different operationalizations of internalizing behavior (broadband and narrowband scales).
With regard to the first methodological aspect, it is evaluated whether different sociometric classification methods give contradictory results in the prediction of internalizing behavior. The rank order of sociometric groups in terms of level of internalizing behavior is crucial for judging whether the results of the different sociometric classification methods are contradictory. If all sociometric methods show the same rank order of sociometric groups (e.g., popular, average, controversial, rejected, and neglected), the results are consistent. If the rank orders differ (e.g., popular, average, controversial, rejected, and neglected vs. popular, average, controversial, neglected, and rejected), the results are contradictory. If the rank of the neglected group differs across rank orders (i.e., across methods), the results are contradictory with regard to the neglected children.
With regard to the second methodological aspect, internalizing problems are assessed using both narrowband and broadband scales of behavior. It is assumed that the use of narrowband scales shows differentiated effect sizes and that the use of the broadband scale underestimates the association between sociometric neglect and internalizing behavior (i.e., the narrowband scales show stronger effect sizes).
The aim of the present study is therefore to investigate empirically whether and to what extent these methodological aspects contribute to contradictory research results and, if so, to draw conclusions about how to overcome these issues in future research on the internalizing behavior of neglected children.

Procedure and Participants
The present study is part of the German research project "Schools On Their Journey Towards Inclusion: Mettmann 2.0" (Hennemann et al., 2018). Data were collected in 2017 and 2018 at ten inclusive primary schools in Mettmann County, Federal State of North Rhine-Westphalia, Germany. As the study was approved by the local school authority of Mettmann County (approval criteria: compliance with data protection regulations and educational relevance of research), additional ethics approval was not required in accordance with the national legislation and the institutional requirements. Written and informed consent was obtained from the children's parents/legal guardians. The sample consists of 112 classes across grades one to four, i.e., 12 (11%) classes of grade one, 19 (17%) classes of grade two, 25 (22%) classes of grade three, 27 (24%) classes of grade four, and 29 (26%) multigrade classes of grades one to four (different grade compositions). The median class size is 24 (Min = 16, Max = 32). The sample comprises 2,699 students in total, but complete data are available for 2,334 students (M_age = 8.81 years, SD_age = 1.20 years, N_boys = 1,205 [52%]). Missing data are due to lack of parental consent or the absence of students at data collection, e.g., due to illness. For more than half of the classes, complete data are available on more than 90% of the children in a class (proportion of complete data per class: Min = 43%, Mdn = 91%, Max = 100%).

Sociometry
To collect sociometric data, a sociometric nomination questionnaire was used (Bukowski et al., 2012). Second- to fourth-grade students filled out the questionnaire on their own in the classroom. Children in the first grade who could not sufficiently write and read as well as children who needed special support were questioned in a one-to-one interview in a separate room. All children were instructed to write down the names of their classmates (no self-nomination) whom they liked the most (social acceptance) and whom they liked the least (social rejection). The number of nominations was unlimited. Nominations received for each question (indegrees) were counted for each student, resulting in two scores (LM: like-most score, LL: like-least score). An acceptable reliability of sociometric data (α > 0.80) can be shown even with low participation rates (participation rate per class > 10%) (Marks et al., 2013). Four different methods were used to determine students' sociometric status (sociometric groups: average, popular [footnote 1], rejected, neglected, and controversial).
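The indegree counting described above can be sketched in a few lines of Python. This is an illustration only: the study reports using R, and the function name `nomination_scores`, the dictionary layout, and the four-student class are hypothetical.

```python
from collections import Counter


def nomination_scores(students, like_most, like_least):
    """Return {student: (LM, LL)}, the received nominations (indegrees)."""
    lm, ll = Counter(), Counter()
    for nominator, named in like_most.items():
        # set() drops duplicate names; self-nominations are not allowed
        lm.update(n for n in set(named) if n != nominator)
    for nominator, named in like_least.items():
        ll.update(n for n in set(named) if n != nominator)
    return {s: (lm[s], ll[s]) for s in students}


# Hypothetical class; the number of nominations per child is unlimited
students = ["Anna", "Ben", "Cem", "Dana"]
like_most = {"Anna": ["Ben"], "Ben": ["Anna", "Dana"],
             "Cem": ["Anna"], "Dana": ["Anna", "Ben"]}
like_least = {"Anna": ["Cem"], "Ben": [], "Cem": ["Ben"], "Dana": []}

scores = nomination_scores(students, like_most, like_least)
```

In this toy class, Cem receives no like-most nomination and one like-least nomination, so both of his raw scores are low before any standardization is applied.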

Method 1 (CD1)
Using the classification procedure by Coie and Dodge (1983) (hereinafter referred to as CD1), students are classified into the different sociometric status groups on the basis of within-class standardized scores (LM_z and LL_z; M = 0 and SD = 1 for each class). Additionally, a social preference score (SP_z: within-class standardized difference between LM_z and LL_z) and a social impact score (SI_z: within-class standardized sum of LM_z and LL_z) are required for sociometric group definition.
Footnote 1: Please note the debate about the distinction between sociometric and peer-perceived popularity (Parkhurst and Hopmeyer, 1998).
Classification rules are subsequently applied to assign children to the groups:
• Popular: LM_z > 0 and LL_z < 0 and SP_z > +1
• Rejected: LM_z < 0 and LL_z > 0 and SP_z < −1
• Neglected: LM_z < 0 and LL_z < 0 and SI_z < −1
• Controversial: LM_z > 0 and LL_z > 0 and SI_z > +1
Coie and Dodge (1983) advocate including all remaining students (who could not be assigned to a particular group by the above-mentioned classification rules) in the average group.
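The CD1 rules can be written as a short Python sketch operating on the raw like-most and like-least counts of a single class. The function names, the use of the population standard deviation for standardization, and the five-student example are assumptions made for illustration; they are not taken from the study's own R code.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize scores within one class (M = 0, SD = 1)."""
    m, sd = mean(values), pstdev(values)  # assumption: population SD
    return [(v - m) / sd if sd > 0 else 0.0 for v in values]


def classify_cd1(lm, ll, cutoff=1.0):
    """Assign CD1 status groups from raw LM/LL counts of one class."""
    lm_z, ll_z = zscores(lm), zscores(ll)
    sp_z = zscores([a - b for a, b in zip(lm_z, ll_z)])  # social preference
    si_z = zscores([a + b for a, b in zip(lm_z, ll_z)])  # social impact
    groups = []
    for a, b, sp, si in zip(lm_z, ll_z, sp_z, si_z):
        if a > 0 and b < 0 and sp > cutoff:
            groups.append("popular")
        elif a < 0 and b > 0 and sp < -cutoff:
            groups.append("rejected")
        elif a < 0 and b < 0 and si < -cutoff:
            groups.append("neglected")
        elif a > 0 and b > 0 and si > cutoff:
            groups.append("controversial")
        else:
            groups.append("average")  # all remaining students
    return groups


# Hypothetical class of five students (raw nomination counts)
lm = [4, 0, 0, 4, 2]
ll = [0, 4, 0, 4, 2]
groups = classify_cd1(lm, ll)
```

In this constructed example, the third student (no like-most and no like-least nominations) is the only one classified as neglected.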

Method 2 (CD2)
Other authors (Boivin et al., 1994;Hubbard, 2001) have adjusted the classification rules of the CD1 method and use ±0.75 as cutoff values for SI_z and SP_z:
• Popular: LM_z > 0 and LL_z < 0 and SP_z > +0.75
• Rejected: LM_z < 0 and LL_z > 0 and SP_z < −0.75
• Neglected: LM_z < 0 and LL_z < 0 and SI_z < −0.75
• Controversial: LM_z > 0 and LL_z > 0 and SI_z > +0.75
• Average: all remaining students
This classification system (hereinafter referred to as CD2) is more liberal than the CD1 method, in such a way that more children are classified as popular, rejected, neglected, and controversial, but fewer children are classified as average.
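The effect of the more liberal cutoff can be illustrated with a minimal Python sketch that applies the same rules to already standardized scores and varies only the cutoff. The function name `classify` and the score values of the example child are hypothetical.

```python
def classify(lm_z, ll_z, sp_z, si_z, cutoff):
    """Apply the CD-type rules with a given cutoff for SP_z and SI_z."""
    if lm_z > 0 and ll_z < 0 and sp_z > cutoff:
        return "popular"
    if lm_z < 0 and ll_z > 0 and sp_z < -cutoff:
        return "rejected"
    if lm_z < 0 and ll_z < 0 and si_z < -cutoff:
        return "neglected"
    if lm_z > 0 and ll_z > 0 and si_z > cutoff:
        return "controversial"
    return "average"  # all remaining students


# Hypothetical child with few nominations of either kind
child = {"lm_z": -0.6, "ll_z": -0.5, "sp_z": -0.1, "si_z": -0.9}

cd1_status = classify(**child, cutoff=1.0)   # stricter CD1 cutoff
cd2_status = classify(**child, cutoff=0.75)  # more liberal CD2 cutoff
```

With a social impact score of −0.9, this child falls between the two cutoffs and is therefore average under CD1 but neglected under CD2, which is the kind of shift that makes the neglected group larger under the more liberal rule.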

Method 3 (FW)
The classification procedure by French and Waas (1985) (hereinafter referred to as FW) is comparable to the CD1 method, but does not require the SI_z and SP_z scores and uses ±0.5 as cutoff values for LL_z and LM_z:
• Popular: LM_z > 0.5 and LL_z < −0.5
• Rejected: LM_z < −0.5 and LL_z > 0.5
• Neglected: LM_z < −0.5 and LL_z < −0.5
• Controversial: LM_z > 0.5 and LL_z > 0.5
• Average: all remaining students

Method 4 (SVLLS)
Schaughency et al. (1992) recommend assigning the children whose score is above the class median on LM and below the class median on LL to the popular group. Those scoring below the class median on LM and above the class median on LL form the rejected group. Children scoring below the class median on both LM and LL form the neglected group, while those scoring above the class median on both LM and LL form the controversial group. All remaining children form the average group. This classification procedure is hereinafter referred to as SVLLS.
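The FW and SVLLS rules can be sketched in Python as follows. The function names and the example class are hypothetical, and treating scores exactly at the median as "remaining" (and hence average) is an assumption based on the strict wording "above" and "below".

```python
from statistics import median


def classify_fw(lm_z, ll_z, cutoff=0.5):
    """FW rule on within-class standardized scores of one student."""
    if lm_z > cutoff and ll_z < -cutoff:
        return "popular"
    if lm_z < -cutoff and ll_z > cutoff:
        return "rejected"
    if lm_z < -cutoff and ll_z < -cutoff:
        return "neglected"
    if lm_z > cutoff and ll_z > cutoff:
        return "controversial"
    return "average"


def classify_svlls(lm, ll):
    """SVLLS median split on the raw LM/LL counts of one class."""
    lm_med, ll_med = median(lm), median(ll)
    groups = []
    for a, b in zip(lm, ll):
        if a > lm_med and b < ll_med:
            groups.append("popular")
        elif a < lm_med and b > ll_med:
            groups.append("rejected")
        elif a < lm_med and b < ll_med:
            groups.append("neglected")
        elif a > lm_med and b > ll_med:
            groups.append("controversial")
        else:
            groups.append("average")  # assumption: ties at the median
    return groups


# Hypothetical class of five students (raw nomination counts)
svlls_groups = classify_svlls([4, 0, 0, 4, 2], [0, 4, 0, 4, 2])
```

Because SVLLS uses raw counts and the class median while FW uses standardized scores with fixed cutoffs, the same child can end up in different groups under the two procedures.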

Integrated Teacher Report Form (for Internalizing Behavior) (ITRF-I)
The German version of the ITRF-I was used to assess internalizing behavior in class (Volpe et al., 2020). The screening questionnaire (18 items) was constructed from the subscales of anxious-depressive behavior (α = 0.87) and social withdrawal (α = 0.88). For each student, class teachers were asked to indicate their level of concern using a four-point scale ranging from "no concern" to "strong concern." The anxious-depressive behavior scale consists of 11 items (example: "Acts fearful.") and the social withdrawal scale of 7 items (example: "Does not respond to others' attempts to socialize."). Both narrowband scales can be subsumed into the broadband scale of internalizing behavior.
Four students were excluded due to missing values on more than four items. A total of 253 students had one missing value and 23 students had two to three missing values. Hence, 2047 students (88%) had no missing values. Missing data were imputed by means of predictive mean matching (Eekhout et al., 2014).

Statistical Analysis
To describe the association between internalizing behavior and sociometric status, internalizing behavior is regressed on sociometric status. Four different sociometric status classification methods are applied, and both broadband and narrowband internalizing behavior scales are used as criteria (this analysis strategy is schematically illustrated in Figure 1). All regression models are formulated as linear mixed effects regression models (restricted maximum likelihood estimation; nested data structure: students within classes), i.e., random intercept and slope models (Bates et al., 2015).
Sociometric status is dummy coded, with the average sociometric status as the reference category. Therefore, the intercept of each regression model is interpretable as the expected average internalizing behavior for sociometrically average children. Since all the internalizing behavior scales are standardized (M = 0, SD = 1), the intercept represents the average internalizing behavior for the average sociometric group as a deviation from the overall sample mean (M = 0) in units of standard deviation. The regression parameters (B) for all the other sociometric status groups are interpretable as the difference in internalizing behavior (in units of standard deviation) between the average sociometric group and another sociometric group (popular, rejected, neglected, or controversial). Sex and grade (effect coding: boys vs. girls and grades 1 and 2 vs. grades 3 and 4) changed the regression coefficients for the sociometric status only marginally. Therefore, neither variable is included as a covariate (parsimonious model).
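The interpretation of the dummy-coded coefficients can be illustrated with a simplified Python sketch. It deliberately ignores the random effects of the mixed models: in an ordinary regression with only a dummy-coded group predictor, the intercept equals the mean of the reference group and each B equals the difference between a group mean and the reference mean. The function name, the standardization with the population standard deviation, and the six data points are hypothetical.

```python
from statistics import mean, pstdev


def dummy_coefficients(scores, statuses, reference="average"):
    """Intercept and B per group for a dummy-coded model without covariates."""
    m, sd = mean(scores), pstdev(scores)
    z = [(s - m) / sd for s in scores]  # standardized outcome (M = 0, SD = 1)
    by_group = {}
    for value, status in zip(z, statuses):
        by_group.setdefault(status, []).append(value)
    intercept = mean(by_group[reference])  # mean of the reference group
    coefs = {g: mean(v) - intercept
             for g, v in by_group.items() if g != reference}
    return intercept, coefs


# Hypothetical teacher ratings for six students
scores = [1, 1, 3, 3, 5, 5]
statuses = ["popular", "popular", "average", "average", "rejected", "rejected"]
intercept, coefs = dummy_coefficients(scores, statuses)
```

Here the average group lies exactly at the sample mean (intercept of 0), and the rejected group lies about 1.22 standard deviations above it; the estimates of the mixed models in the study will differ from such raw mean differences because they account for the nesting of students within classes.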
In terms of the New Statistics Approach, effect sizes and confidence intervals (instead of p-values) are reported (Cumming, 2014). All statistical analyses were conducted in R 3.6.0 and with additional R packages (Bates et al., 2015;Barton, 2019;Lüdecke, 2019).

Sociometric Status
The number of students classified into the sociometric status groups by the four different sociometric classification methods (CD1, CD2, FW, and SVLLS) is displayed in Table 1. It is apparent that the proportion of children assigned to a sociometric group varies across methods [e.g., neglected children: 9% (CD1), 13% (CD2), 4% (FW), and 8% (SVLLS)]. With regard to sociometric neglect, Fleiss' Kappa of 0.67 indicates a substantial inter-method agreement, albeit 213 (9%) children are simultaneously classified as neglected, popular, rejected, and average, while only 90 (4%) children are congruently classified as neglected. These results underscore the fact that there is no consensus (across sociometric classification methods) about the state of sociometric neglect (Kulawiak and Wilbert, 2019).
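For readers unfamiliar with Fleiss' Kappa, the following Python sketch computes it for children who are each classified by the same number of methods. How the categories were coded for the reported value of 0.67 is not stated here; the binary coding (neglected vs. other), the function name, and the three example children are assumptions for illustration.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings holds one list of labels per child,
    with one label per classification method."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})
    counts = [[row.count(c) for c in categories] for row in ratings]
    # Observed agreement per child, then averaged over children
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall category proportions
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical classifications of three children by four methods
ratings = [
    ["neglected", "neglected", "other", "other"],
    ["other", "other", "other", "other"],
    ["neglected", "neglected", "neglected", "neglected"],
]
kappa = fleiss_kappa(ratings)
```

The single child on whom the methods split two against two is enough to pull the coefficient well below 1 in this toy example, which shows that a "substantial" Kappa can coexist with many children whose classification depends on the method.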

Internalizing Behavior
Descriptive parameters of the internalizing behavior scales are displayed in Table 2. The correlation between the narrowband scales of internalizing behavior (anxious-depressive behavior and social withdrawal) is moderate (r = 0.53), while the broadband scale of internalizing behavior is highly correlated with the narrowband scales (anxious-depressive behavior: r = 0.91; social withdrawal: r = 0.84). There is some dependence between the variances in the scales of internalizing behavior and the class membership (e.g., social withdrawal: ICC = 0.12).

Main Results
Sociometric status is determined by means of four different sociometric classification methods (CD1, CD2, FW, and SVLLS). The broadband and narrowband scales of internalizing behavior are regressed on the different sociometric statuses. The results of the different regression models are displayed in Table 3.

Do Different Sociometric Status Classification Methods Give Contradictory Results on Internalizing Behavior in Neglected Students?
In the following, the question is considered whether the application of the different sociometric classification methods leads to contradictory results regarding internalizing behavior in neglected children. The rank order of sociometric groups in terms of the level of internalizing behavior is crucial for judging whether the results of the different sociometric classification methods are contradictory. If all sociometric methods show the same rank order of sociometric groups, the results are consistent. If the rank orders differ, the results are contradictory. If the rank of the neglected group differs across rank orders (i.e., across methods), the results are contradictory with regard to the neglected children. Differences between rank orders are described in more detail (see also Figure 2).
Note: CD, Coie and Dodge (1983); FW, French and Waas (1985); SVLLS, Schaughency et al. (1992).
[...] there are contradictory results regarding neglected children (whether neglected or controversial children are more affected by anxious-depressive behavior), although the differences between sociometric groups (neglected vs. controversial) that indicate this contradiction are very small. A different rank order is also the case with the SVLLS method (ascending): popular, controversial, average, neglected, and rejected (SVLLS: B popular = −0.21, B controversial = −0.02, B average = −0.01, B neglected = 0.06, B rejected = 0.29). As the neglected and rejected children are at the top of this rank order [as with the first rank order (CD1 and CD2)], this rank order also conflicts with the second rank order (FW), where the controversial and rejected children are at the top. Additionally, there is a contradiction regarding the question of whether average or controversial children are more affected by anxious-depressive behavior. The first rank order (CD1 and CD2) indicates a higher level of anxious-depressive behavior for the controversial group (e.g., CD1: B controversial = 0.11), while the third rank order (SVLLS) indicates a higher level of anxious-depressive behavior for the average group (SVLLS: B controversial = −0.02), although the differences between sociometric groups (average vs. controversial) that indicate this contradiction are very small.
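The rank-order criterion for contradictory results can be made explicit with a short Python sketch. The SVLLS values are the anxious-depressive estimates reported in the text; the second set of values is purely hypothetical and only serves to show a differing rank order. The function names `rank_order` and `is_contradictory` are likewise assumptions for illustration.

```python
def rank_order(estimates):
    """Return the sociometric groups in ascending order of their estimates."""
    return sorted(estimates, key=estimates.get)


def is_contradictory(orders, group):
    """True if the given group holds different ranks across the rank orders."""
    ranks = {order.index(group) for order in orders}
    return len(ranks) > 1


# Anxious-depressive behavior, SVLLS method (values as reported in the text).
svlls = {"popular": -0.21, "controversial": -0.02, "average": -0.01,
         "neglected": 0.06, "rejected": 0.29}

# HYPOTHETICAL values for a second method, NOT results from the study.
hypothetical_method = {"popular": -0.20, "average": 0.00, "neglected": 0.05,
                       "controversial": 0.10, "rejected": 0.30}

orders = [rank_order(svlls), rank_order(hypothetical_method)]
for order in orders:
    print(" < ".join(order))
print("contradictory for neglected children:", is_contradictory(orders, "neglected"))
```

Because the criterion only compares ranks, it flags a contradiction even when the underlying differences between groups are very small, which is why the size of the differences is reported alongside the rank orders.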

Narrowband scale of social withdrawal
Three methods (CD1, CD2, and FW) show the following rank order of sociometric groups in terms of the level of social withdrawal behavior [...]
Note: Linear mixed effects regression models (restricted maximum likelihood estimation; random intercept and slope models; nested data structure: students within classes).
[...] controversial children differ in that controversial children show a lower level of internalizing behavior (SVLLS: B popular = −0.21, B controversial = −0.22), although this difference is very small. Hence, there are contradictory results regarding the question of whether popular or controversial children are more affected by withdrawal behavior, although the differences between sociometric groups (popular vs. controversial) that indicate this contradiction are very small.

Do Narrowband Behavior Scales Show Differentiated Effect Sizes and Does the Broadband Behavior Scale Underestimate the Association Between Internalizing Behavior and Sociometric Neglect?
The question is considered here whether the application of narrowband scales shows differentiated effect sizes and whether the use of the broadband scale underestimates the association (i.e., the narrowband scales show stronger effect sizes). Differences in the prediction between the narrowband scales (differentiated effects) are apparent, as in all analyses the narrowband scale of social withdrawal is more strongly associated with sociometric neglect (e.g., CD2: B neglected = 0.38) than the narrowband scale of anxious-depressive behavior (e.g., CD2: B neglected = 0.11) (the effect sizes differ many times over). An underestimation of the association by the broadband scale is evident in that, in all analyses, the narrowband scale of social withdrawal is more strongly associated with sociometric neglect (e.g., FW: B neglected = 0.38) than the broadband scale of internalizing behavior (e.g., FW: B neglected = 0.18).
Differences in the prediction between the narrowband scales are also noticeable with regard to the controversial sociometric status, as the narrowband scale of social withdrawal is negatively associated with sociometric controversiality (e.g., CD1: B controversial = −0.13) while the narrowband scale of anxious-depressive behavior is positively associated (e.g., CD1: B controversial = 0.11). This differentiated effect (positive and negative association) is not visible when using the broadband scale of internalizing behavior, i.e., the effect size is close to zero (e.g., CD1: B controversial = 0.02).
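The comparisons in this section can be retraced with simple arithmetic on the reported coefficients. The Python sketch below uses only the example values quoted in the text; the variable and function names are hypothetical, and the "masking" check is merely one possible way to express that opposite-signed narrowband effects coincide with a near-zero broadband effect.

```python
# Coefficients (B, in SD units) as quoted in the text.
CD2_NEGLECTED = {"social_withdrawal": 0.38, "anxious_depressive": 0.11}
FW_NEGLECTED = {"social_withdrawal": 0.38, "broadband": 0.18}
CD1_CONTROVERSIAL = {"social_withdrawal": -0.13, "anxious_depressive": 0.11,
                     "broadband": 0.02}


def effect_ratio(stronger, weaker):
    """How many times larger the stronger effect is than the weaker one."""
    return stronger / weaker


def opposite_signs(a, b):
    """True if two effects point in opposite directions."""
    return a * b < 0


# Differentiated effects: social withdrawal vs. anxious-depressive behavior (CD2).
narrow_vs_narrow = effect_ratio(CD2_NEGLECTED["social_withdrawal"],
                                CD2_NEGLECTED["anxious_depressive"])

# Underestimation: social withdrawal vs. the broadband scale (FW).
narrow_vs_broad = effect_ratio(FW_NEGLECTED["social_withdrawal"],
                               FW_NEGLECTED["broadband"])

# Masking: narrowband effects of opposite sign, broadband effect smaller than both.
masked = (
    opposite_signs(CD1_CONTROVERSIAL["social_withdrawal"],
                   CD1_CONTROVERSIAL["anxious_depressive"])
    and abs(CD1_CONTROVERSIAL["broadband"])
    < min(abs(CD1_CONTROVERSIAL["social_withdrawal"]),
          abs(CD1_CONTROVERSIAL["anxious_depressive"]))
)

print(round(narrow_vs_narrow, 2), round(narrow_vs_broad, 2), masked)
```

For these example values, the social withdrawal effect is roughly 3.5 times the anxious-depressive effect and roughly twice the broadband effect, while the opposite-signed narrowband effects for controversial children are not visible on the broadband scale.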

DISCUSSION
In this study, we examined the question of whether and to what extent the conceptualization and measurement of the social inclusion of students with SEN is relevant for the generation of consistent research results. This issue remains important because poor measurement of social inclusion has been linked to equivocal research results on the social inclusion of students with SEN in mainstream schools (Chambers and Kay, 1992; Avramidis et al., 2017). Therefore, in the present paper, issues of a specific assessment approach of social inclusion were examined, namely the sociometric status classification, which is widely used in research on social inclusion of students with SEN. At the same time, we outlined the equivocal evidence on the internalizing behavior of sociometrically neglected children. Internalizing problems in children belong to the category of SEN called emotional and behavioral difficulties. An undifferentiated (broadband) perspective on children with SEN has also been linked to contradictory research results (O'Mara et al., 2012). Therefore, in the present paper, the operationalization of internalizing behavior was likewise examined, namely the operationalization of broadband and narrowband dimensions of behavior.
FIGURE 2 | Broadband scale of internalizing behavior and narrowband scales of internalizing behavior (anxious-depressive behavior and social withdrawal) by sociometric groups (boxplots in ascending rank order).
The analyses and results presented here provide initial indications that methodological approaches can lead to conflicting results on the internalizing behavior in sociometrically neglected children (but also with regard to the other sociometric groups), although the differences in behavior between sociometric groups that indicate these contradictions are small. The contradictory results are not only due to the application of different sociometric classification methods, but also to different operationalizations of internalizing behavior (narrowband and broadband scales).
Contradictions regarding the question of whether neglected children are more affected by anxious-depressive behavior than average and controversial children are due to the application of different sociometric classification methods. These contradictory results reflect the current equivocal state of research. For example, one study implies that neglected children are the most affected and controversial children are the least affected by anxiety (La Greca et al., 1988), while another study indicates higher anxiety among controversial children (Crick and Ladd, 1993). The results are conflicting and both studies use different sociometric classification methods.
The two narrowband scales show differentiated effects, as the narrowband scale of social withdrawal is more strongly associated with sociometric neglect than the narrowband scale of anxious-depressive behavior (the effect sizes differ many times over). At the same time, an underestimation of the association by the broadband scale is evident in that the narrowband scale of social withdrawal is more strongly associated with sociometric neglect than the broadband scale of internalizing behavior. Hence, the differentiation between narrow facets of internalizing behavior and the application of narrowband scales provide a deeper understanding of internalizing behavior in neglected children. These results support the assumption that there is no benefit (but rather harm) in summarizing distinctive behavior problems (operationalized by narrowband scales) into one broad category of behavior (operationalized by a broadband scale) (Tandon et al., 2009; Kulawiak et al., under review).

Limitations
Critically, it should be noted that the narrowband scale of anxious-depressive behavior used in this study summarizes two distinctive behavioral problems, namely anxiety and depression. Strictly speaking, a more differentiated view of anxiety and depression (in the sense of two different narrowband scales) could provide an even more detailed insight into neglected children's internalizing behavior. Beyond that, this study covers just some dimensions and not the entire spectrum of dimensions of internalizing behavior. This, in turn, may lead to the premature conclusion that internalizing problems are not related to the neglected sociometric status, as other relevant narrowband dimensions of internalizing behavior may not have been evaluated.

Conclusion
The systematic investigation presented here provides initial indications that methodological approaches can lead to conflicting results. Earlier contradictory evidence on the internalizing behavior of neglected students must therefore be seen in a different light: the reasons for previously conflicting results may actually be methodological. The main methodological issue is that even after decades of sociometric research there is still no consensus method for identifying neglected children. This reflects the lack of a sufficient conceptualization of sociometric neglect. Early on, a clearer definition of sociometric neglect was demanded (Gottman, 1977). However, even decades later, the conceptualization of sociometric neglect is still in progress (Kulawiak and Wilbert, 2019) and new sociometric classification methods appear at regular intervals (García Bacete and Cillessen, 2017), some of which emphasize the need for a more accurate representation in the sense of a continuous metric of the sociometric status (DeRosier and Thomas, 2003; Kulawiak and Wilbert, 2019). Which sociometric classification method provides the most reliable research results? Given the arbitrariness of the sociometric classification rules, this question is hard to answer. In order to improve the operationalization of sociometric neglect, it is necessary to clarify how neglect affects internalizing behavior (or vice versa). For example, there is a difference between passive and active neglect (others ignore vs. avoid the individual) (Leary, 1990). This kind of nuanced information is not captured by current sociometric classification methods. Both types of neglect can have different effects on a child's internalizing behavior. In the context of sociometric research on internalizing behavior, therefore, a more detailed conceptualization of sociometric neglect is needed.
Measurement approaches other than sociometry (e.g., observation of peer interactions or self-reports of feelings) could enrich the assessment of neglect, as these procedures can capture more nuanced information (passive and active neglect or subjective feelings of being neglected). In addition, many of the sociometric studies on internalizing behavior are cross-sectional (including the present study) and therefore do not investigate the causal relationship or causal direction between internalizing behavior and sociometric neglect. Hence, priority must be given to a theoretical model that postulates a causal link between sociometric neglect and internalizing behavior (or vice versa). For example, the experience of being neglected (or ignored) by peers may lead to social-evaluative concerns that may in turn prompt the inhibition of sociable behavior (Asendorpf, 1990), i.e., children would act socially withdrawn. However, the experience of being neglected may also trigger other internalizing reactions (Leary, 1990), e.g., the experience of being neglected may reinforce fears of future social interactions (social anxiety). In order to derive a plausible causal model, the relevant narrowband dimensions of internalizing behavior must be theory-driven and clearly identified (instead of summarizing all internalizing problems into one broadband category).

DATA AVAILABILITY STATEMENT
The datasets generated for this study will not be made publicly available due to data protection regulation.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.