Development of the Group Dynamics Scale (GDS): A Validity and Reliability Study

We developed a scale to assess teachers’ perceptions of group dynamics in schools. The sample consisted of 995 teachers from five public schools affiliated with the Ministry of National Education, Turkey. Construct validity was determined using an exploratory factor analysis (EFA) and a confirmatory factor analysis (CFA). The EFA results revealed a one-factor structure that accounted for 44% of the total variance. The CFA results indicated acceptable goodness of fit indices for the one-factor group dynamics scale (GDS) model. Criterion validity was determined using the scale of organizational silence (SOS) and the person-organization fit scale (POFS). The results showed that the GDS was positively correlated with POFS and negatively correlated with SOS. Reliability was measured on three different samples. The GDS had a Cronbach’s alpha (α: internal consistency coefficient) of .88 to .89. Reliability was also analyzed using the test-retest method. The results showed that the GDS had an acceptable reliability coefficient. These results indicated that the GDS was a reliable measure. The “upper and lower 27 percent rule” and corrected item-total correlation coefficients were used for item analysis. The former revealed significant t-test results for all items, while the latter revealed acceptable results for all three samples. All these results indicate that the GDS is a valid and reliable measure.


Introduction
Group dynamics is defined as the force arising from the interaction between an individual and the social group to which he/she feels a sense of belonging. We should make use of that force in education to achieve teaching/learning outcomes. School stakeholders who collaborate are more likely to achieve organizational goals. Where there are people, there is interaction, and this study focused on group dynamics originating from that interaction. Collective success takes precedence over individual success due to cut-throat global competition and social, political, and cultural changes. Therefore, we should explore and analyze group structure and dynamics (Dereli & Cengiz, 2011). The Turkish Language Association defines the term "group" as "a set of beings or things with shared characteristics" (2019). A body of people is transformed into a group if they have a strong bond, a sense of belonging, and shared norms and goals, and if they take up interdependent role systems, find the group rewarding, and cooperate. People form groups to forge a shared culture, solve problems collectively, guide one another towards the right path, facilitate interaction, and promote independence or apply group or peer pressure (Aksu, 1996). Most groups consist of and recruit like-minded members who cooperate, although they have different takes on different issues (Çöklü, 1994). Aksu (1996) defines group dynamics as "the group force acting upon an individual or vice versa." Yavaşça (2010) defines it as dynamic actions that shape how group members act and interact and encourage them to seek the common good and abide by group rules when communicating with each other, other groups, and even institutions. Dereli and Cengiz (2011) define group dynamics as the influence of group force on its members. Group members internalize each other's feelings, thoughts, and behaviors, which is known as conformity. They do so either to be appreciated by others or to be on the right side.
Sherif (autokinetic effect experiment) and Asch (line judgment tasks) have shown that even when the right answers are obvious, a group member is more likely to give wrong answers if the other members do so. This suggests that people deny what they see and submit to group pressure for the sake of conformity (Gerrig & Zimbardo, 2012). Group dynamics makes members' lives easier because it allows them to do things collectively that they cannot or dare not do alone. However, it does not always work out as intended and can instead alienate members from themselves and make them forget their own values (Çöklü, 1994). Group dynamics encourages members to improve themselves and their groups, overcome formal, organizational, and systemic problems, and work efficiently to achieve shared goals and tasks (Külebi, 1990; Yılmazer & Eroğlu, 2008; Dereli & Cengiz, 2011).
Interaction between group members yields positive results. According to Acar (2014), members find solutions from within the group because the higher the number of members, the higher the number of solutions. Members are expected to learn to accept each other for who they are, just as they accept themselves. Teacher collaboration is of paramount importance as schools have workgroups, commissions, and boards (Çetin & Yaman, 2004). Schools should address teacher collaboration to adapt to group dynamics and transform themselves into effective educational institutions (Şekerci & Aypay, 2009). Teachers who can form a group are more likely to achieve common goals and develop a sense of belonging through the solutions arising from group dynamics.
Group dynamics encompasses teacher collaboration against problems faced by schools. Teacher collaboration is critical for achieving educational goals as well. However, there is no scale for measuring group dynamics in schools in Turkey. The synergistic climate scale and the group unity scale fall short of assessing perceived group dynamics. Therefore, this study aimed to develop a valid and reliable measure of group dynamics that fully represents Turkish culture.

Method
This section addresses the sample, data collection tools, scale development steps, and data collection and analysis. Each stage of the research was conducted according to the ethical principles outlined by the Declaration of Helsinki.

Sample
The sample consisted of 995 teachers from public schools in Elazig, Turkey. The research was conducted in five stages in the fall semester of the 2019-2020 academic year. First, a pilot test was conducted (n=50). Second, an exploratory factor analysis (EFA) was performed (n=240). Third, a confirmatory factor analysis (CFA) was performed (n=275). Fourth, criterion validity was tested (n=310). Fifth, test-retest was used to check for reliability (n=102). Table 1 shows the demographic characteristics of the participants.

Data Collection Tools
Data were collected using a demographic characteristics questionnaire, the group dynamics scale (GDS), the scale of organizational silence (SOS), and the Person-Organization Fit Scale (POFS).
The group dynamics scale (GDS) was a measure developed and tested by this study. The "Results" section addressed its psychometric properties.
The scale of organizational silence (SOS) was used by Dyne, Ang, and Botero (2003) and Briensfield (2009) and was adapted to Turkish (SOS-TR) by Alparslan (2010). It consists of three subscales and 29 items. We took the principles for short-form development and content validity into account and turned the scale into a short form (SOS-SF), which was a combination of SOS-TR adapted by Alparslan (2010) and the three-item SOS (α=.83) used by Zincirli (2017).
The person-organization fit scale (POFS) is a valid and reliable four-item measure developed by Aumann (2007) to assess perceived individual-organization fit. The scale was also used by Vilela, Gonzales, and Ferrin (2008) and Piasentin (2007). Ulutaş et al. (2015) adapted the scale to Turkish (α=.72). The items are scored on a five-point Likert-type scale.

Pilot Study
The first stage of scale development is to conduct a literature review to determine the main points of interest (Şeker & Gençdoğan, 2014). After we selected the topic, we conducted a literature review and determined the main points of interest. Afterward, we chose items based on the main points and developed a pool of 39 relevant, easy-to-understand, and culturally sensitive items. We consulted three experts (educational management, guidance and psychological counseling, and assessment and evaluation) for relevance and comprehensibility. We removed sixteen items based on their feedback. A linguist checked the remaining 23 items for grammar and semantics. Afterward, we conducted a pilot test on 50 participants representing the target population. We told them that it was of utmost importance that they tell us about the items they had difficulty understanding or any problems they encountered. We evaluated the results together with the three academics and removed four items as some participants did not understand them. We then moved on to the main study.

Data Collection
We informed all teachers about the research purpose, procedure, and confidentiality and obtained informed consent from those who volunteered. We handed them the data collection forms and briefed them about the purpose of the research and the concepts of interest. We asked them to indicate (on a scale of 1 to 5) to what extent the items represented their situation. We picked up some of the forms the same day and others a couple of days later and thanked the teachers for their participation.

Statistical Analysis
Before selecting the analyses, we examined the skewness and kurtosis values to check whether the data were normally distributed. The skewness coefficients ranged from -.963 to -.487, and the kurtosis coefficients ranged from .396 to .891, suggesting that the data were approximately normally distributed (Can, 2013; Özdemir, 2018). Exploratory (EFA) and confirmatory factor analyses (CFA) were used for construct validity. The Kaiser-Meyer-Olkin (KMO) measure was used for sampling adequacy, and Bartlett's test of sphericity was used to determine whether the correlations between the items were adequate for factor analysis. The KMO was .914, and Bartlett's test of sphericity was significant (χ² = 1016.518, p < .001), indicating sampling adequacy for principal components analysis and an adequate correlation between the items for factor analysis. A confirmatory factor analysis was used to verify the factor structure. The model fit was assessed using the most common goodness of fit indices. Internal consistency coefficients (Cronbach's alpha) were calculated for all samples, and a test-retest was performed to determine reliability. The "upper and lower 27 percent rule" and corrected item-total correlation coefficients were used for item analysis. Table 2 shows the goodness of fit indices and their cut-off points.
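For readers who wish to see how the Bartlett statistic is obtained, the following is a minimal sketch rather than the procedure used in this study, which presumably relied on a statistics package. It assumes the standard formula χ² = -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom, where R is the item correlation matrix, n the sample size, and p the number of items. The function names and the small example matrix are hypothetical.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest absolute value in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square, degrees of freedom) for a correlation matrix."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix must be positive definite")
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return chi_square, df


# Hypothetical two-item correlation matrix with r = .50 and n = 100.
chi_square, df = bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], n_obs=100)
print(round(chi_square, 3), df)
```

Under this formula, a 19-item scale has 19 × 18 / 2 = 171 degrees of freedom. The p-value would additionally require the chi-square distribution function, which is omitted here to keep the sketch free of external dependencies.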

Exploratory Factor Analysis
Construct validity was determined using an EFA (n=240). This study followed the three stages of EFA proposed by Pohlmann (2004): (1) selecting and measuring variables, (2) determining the number of factors, and (3) interpreting them. First, the Kaiser-Meyer-Olkin (KMO) measure was used for sampling adequacy, and Bartlett's test of sphericity was used to check suitability for factor analysis. The KMO was .914, and Bartlett's test of sphericity was significant (χ² = 1016.518, p < .001), suggesting sampling adequacy and adequate correlation for factor analysis (Tabachnick & Fidell, 1996; Kalaycı, 2006; Field, 2009; Çokluk, Şekercioğlu & Büyüköztürk, 2010). The EFA was performed on the 19-item GDS using principal component analysis. An exploratory factor analysis should be based on determining the smallest number of factors that best represent the correlation between items.
Therefore, items should be loaded on factors with an eigenvalue of 1 or greater (Hutcheson & Sofroniou, 1999). Moreover, an item should have a loading of greater than .40, and its loading on its primary factor should exceed its loading on any other factor by more than 0.10 (Büyüköztürk, 2007).
Four items (8, 9, 10, and 13) were removed because they had factor loadings of smaller than 0.40.
Three items (5, 6, and 18) had acceptable factor loadings but were removed from the scale because either the difference between their loadings on two factors was smaller than 0.10 or they loaded on more than one factor.
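The two screening rules described above can be sketched as a small function. This is an illustration under stated assumptions, not the procedure actually run for the GDS: the function name, the default thresholds as parameters, and the example loadings are hypothetical, and the behavior exactly at the .40 and 0.10 boundaries is a choice made here because the text does not specify it.

```python
def retain_item(loadings, min_loading=0.40, min_gap=0.10):
    """Return True if an item passes both screening rules.

    loadings: the item's loadings on each extracted factor.
    Rule 1: the largest absolute loading must not fall below min_loading.
    Rule 2: the gap between the two largest absolute loadings must not
            fall below min_gap (guards against cross-loading).
    """
    ranked = sorted((abs(value) for value in loadings), reverse=True)
    if ranked[0] < min_loading:
        return False
    if len(ranked) > 1 and ranked[0] - ranked[1] < min_gap:
        return False
    return True


# Hypothetical loadings for three items on two factors.
examples = {
    "clean item": [0.65, 0.20],
    "weak item": [0.35, 0.10],
    "cross-loading item": [0.55, 0.50],
}
for label, loadings in examples.items():
    print(label, retain_item(loadings))
```

Applying the rules iteratively, that is, rerunning the factor analysis after each removal, is common practice, but whether that was done here is not stated.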
Factors should explain 30%-60% of the total variance (Çokluk et al., 2010; Tavşancıl, 2010; Savcı, Ercengiz & Aysan, 2018). The analysis showed that the GDS items were loaded on one factor (an eigenvalue of 5.283) that accounted for 44.023% of the total variance.
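The link between the eigenvalue and the explained variance can be checked with a short calculation. In a principal component analysis of a correlation matrix, the eigenvalues sum to the number of items, so each factor's share of variance is its eigenvalue divided by that sum. The sketch below assumes the final 12-item solution. Only the first eigenvalue (5.283) comes from the analysis reported here; the other eleven are hypothetical placeholders chosen so that the eigenvalues sum to 12, and the function names are illustrative.

```python
def variance_explained(eigenvalues):
    """Percentage of total variance attributable to each factor."""
    total = sum(eigenvalues)
    return [100 * value / total for value in eigenvalues]


def kaiser_retained(eigenvalues, threshold=1.0):
    """Number of factors with an eigenvalue at or above the threshold."""
    return sum(1 for value in eigenvalues if value >= threshold)


# First value is the reported eigenvalue; the rest are hypothetical.
eigenvalues = [5.283, 0.95, 0.88, 0.80, 0.74, 0.68,
               0.62, 0.55, 0.50, 0.42, 0.33, 0.247]

print(kaiser_retained(eigenvalues))
print(round(variance_explained(eigenvalues)[0], 2))
```

With the rounded eigenvalue of 5.283, the first factor's share comes to about 44.0%, which agrees with the reported 44.023% up to rounding of the eigenvalue.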
According to the scree plot (Figure 1), there was a sharp break after the first factor, indicating that the GDS had one factor with 12 items (model). The scale items had factor loadings of 0.59 to 0.72. Figure 1 and Table 3 show the scree plot and the EFA results, respectively.

Confirmatory Factor Analysis
The one-factor structure of the GDS (12 items) was examined using a CFA (n=275). Figure 2 shows the path diagram for the GDS. [...] to .78. Figure 3 shows the path diagram for the criterion validity analysis.

Criterion Validity
Criterion validity was tested using SOS and POFS (n=310). The Pearson correlation coefficient was used to determine the correlation between the GDS and the SOS and POFS scores.
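As an illustration of the criterion validity computation, the Pearson correlation coefficient can be sketched as follows. The function name and all scores below are hypothetical and serve only to show the expected direction of the relationships (positive with POFS, negative with SOS); they are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical total scores for six respondents.
gds_scores = [30, 42, 51, 38, 55, 47]
pofs_scores = [10, 14, 17, 12, 19, 15]
sos_scores = [14, 10, 7, 12, 5, 9]

print(round(pearson_r(gds_scores, pofs_scores), 2))
print(round(pearson_r(gds_scores, sos_scores), 2))
```

The same function applies to a test-retest design, where the two lists would hold the first and second administration scores of the same respondents.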

Reliability
Reliability was determined using the test-retest method and Cronbach's alpha internal consistency coefficient (α). The GDS had a Cronbach's alpha of .88, .89, and .89 for the EFA, CFA, and criterion validity samples, respectively. The results indicated that the scale had high reliability.
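Cronbach's alpha can be sketched from its standard formula, α = k/(k - 1) · (1 - Σ item variances / variance of total scores), where k is the number of items. The code below is a minimal illustration under that assumption. The function name and the response matrix (five respondents, three items) are hypothetical and unrelated to the GDS data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses on a five-point scale.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 3],
    [4, 4, 5],
]
print(round(cronbach_alpha(responses), 3))
```

Values above the conventional .70 threshold are usually read as adequate internal consistency.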
Test-retest was used to determine whether the GDS yielded consistent results when repeated over time. A sample of 102 teachers was drawn from the CFA sample and tested again four weeks after the initial test. The results showed a test-retest reliability of .81.

Item Analysis
Item analysis is vital for determining item validity. According to Tezbaşaran (1997), corrected item-total correlation coefficients and the difference between the upper and lower 27 percent groups (t scores) should be calculated for item analysis. Şencan (2005) and Büyüköztürk (2007) argue that each item should have an item-total correlation of greater than .30. The item analysis was performed on three different samples (EFA, CFA, and criterion validity). The "upper and lower 27 percent rule" was used to determine the discriminatory power of the GDS items. For the EFA sample, the corrected item-total correlation coefficients ranged from 0.50 to 0.63, while the t scores for the difference between the upper and lower 27 percent groups ranged from 6.85 to 11.69 (p < 0.001). For the CFA sample, the corrected item-total correlation coefficients ranged from 0.55 to 0.64, while the t scores for the difference between the upper and lower 27 percent groups ranged from 8.27 to 11.75 (p < 0.001). For the criterion validity sample, the corrected item-total correlation coefficients ranged from 0.57 to 0.64, while the t scores for the difference between the upper and lower 27 percent groups ranged from 8.58 to 11.45 (p < 0.001). Table 5 shows the results.
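Both item statistics can be sketched in a few lines. The sketch assumes that the corrected item-total correlation is the Pearson correlation between an item and the total of the remaining items, and that the 27 percent comparison is an independent-samples t statistic with pooled variance computed on the item scores of the top and bottom groups ranked by total score. Function names and the ten-respondent data set are hypothetical.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))


def corrected_item_total(responses, item):
    """Correlate an item with the total score of all other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson_r(item_scores, rest_scores)


def upper_lower_t(responses, item, fraction=0.27):
    """t statistic comparing an item across the top and bottom groups."""
    ranked = sorted(responses, key=sum)  # ascending by total score
    k = max(2, round(len(ranked) * fraction))
    if 2 * k > len(ranked):
        raise ValueError("sample too small for non-overlapping groups")
    lower = [row[item] for row in ranked[:k]]
    upper = [row[item] for row in ranked[-k:]]
    # Equal group sizes, so the pooled variance is the simple average.
    pooled = (variance(upper) + variance(lower)) / 2
    return (mean(upper) - mean(lower)) / math.sqrt(pooled * 2 / k)


# Hypothetical responses: ten respondents, three items.
responses = [
    [1, 2, 1], [2, 1, 2], [2, 2, 3], [3, 3, 2], [3, 2, 3],
    [3, 4, 3], [4, 3, 4], [4, 5, 4], [5, 4, 5], [5, 5, 4],
]
print(round(corrected_item_total(responses, 0), 2))
print(round(upper_lower_t(responses, 0), 2))
```

An item with a corrected item-total correlation above .30 and a significant t statistic would be considered discriminating under the criteria cited above.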

Discussion, Conclusion and Recommendations
We developed a scale to assess group dynamics in Turkish society. First, we conducted a literature review and developed a pool of 39 items. Three experts (educational management, guidance and psychological counseling, and assessment and evaluation) checked the items for relevance and comprehensibility. We removed 16 items based on their feedback. We then conducted a pilot study and removed four more items based on its results. Lastly, we checked the construct validity of the 19-item group dynamics scale (GDS) on different samples. We performed EFA to determine the construct validity of the GDS. The EFA factor structure was verified using a CFA. We also looked into the correlation between the GDS and the SOS and POFS scores to check for criterion validity.
We calculated Cronbach's alpha (α) internal consistency coefficients on each sample and then employed the test-retest method to determine the reliability of the GDS. We calculated the corrected item-total correlation coefficients for each item and the difference between the upper and lower 27 percent (t scores).
First, we used an EFA to determine the construct validity of the GDS. The EFA results revealed a one-factor structure with an eigenvalue greater than 1 (model). Çokluk et al. (2010) state that a one-factor structure should explain 30%-60% of the total variance. The EFA results showed that the one-factor structure explained about 44% of the total variance. Each item should have a factor loading of greater than .30 (Şencan, 2005; Büyüköztürk, 2007). The results showed that the GDS items had adequate factor loadings. We performed the CFA on two different samples to test the model. The results showed that the model had acceptable goodness of fit indices on both samples and that the items had acceptable factor loadings (Büyüköztürk, 2007).
The participants' GDS scores were negatively correlated with their SOS scores, suggesting that the higher the group dynamics, the lower the perceived organizational silence. Their GDS scores were positively correlated with their POFS scores, suggesting that the higher the group dynamics, the higher the person-organization fit. The reliability of the GDS was determined using Cronbach's alpha and the test-retest method. Psychometric studies suggest that a Cronbach's alpha should be greater than 0.70 (Büyüköztürk, 2007; Tavşancıl, 2010). The results showed that the GDS had adequate Cronbach's alpha values on the EFA, CFA, and criterion validity samples, which was also confirmed by the test-retest results. Item analysis was performed on the three different samples. The results suggested that the GDS had acceptable corrected item-total correlation coefficients. There was a significant difference between the upper and lower 27 percent groups on all samples. These results indicated that all GDS items were reliable. The validity analysis also showed that all items measured what they were intended to measure (Çokluk et al., 2010). All in all, the results indicate that the GDS is a valid and reliable measure of group dynamics among teachers.
The group dynamics scale (GDS) consists of statements on how teachers perceive group dynamics in schools. Future studies should adapt the GDS to different cultures. We can recruit people from different backgrounds for further research to see how perceived group behavior varies across situations and microcultures.