Self‐efficacy matters: Influence of students’ perceived self‐efficacy on statistics anxiety

Statistical knowledge is a key competency for psychologists, who need it to interpret assessment outcomes correctly. Importantly, when learning statistics (and its mathematical foundations), self‐efficacy (defined as an individual's belief in their ability to accomplish specific performance attainments) is a central predictor of students’ motivation to learn, learning engagement, and actual achievement. Therefore, it is crucial to gain a better understanding of students’ self‐efficacy for statistics and its interrelations with statistics anxiety and students’ belief in the relevance of statistics. Here, we report the development and validation of a self‐assessment questionnaire for examining self‐efficacy for statistics in psychology students (Self‐Efficacy for Learning Statistics for Psychologists, SES‐Psy). Using different methodological approaches, we demonstrate that the SES‐Psy questionnaire has (1) sound psychometric properties and, within our sample of university students, (2) a robust latent structure disclosing three clearly distinct profiles that are characterized by a complex and nonlinear interplay between perceived self‐efficacy (for basic and advanced statistics), statistics anxiety, and students’ belief in the relevance of statistics. Implications for educational settings and future research are discussed.


INTRODUCTION
Many graduate students, particularly psychology students, report high levels of statistics anxiety, which seriously hampers their attainment in statistics and research methodology courses. [1][2][3] Statistics anxiety is a specific form of test and performance anxiety, and up to 80% of graduate students in psychology, business, behavioral, and social sciences report unmanageable levels of statistics anxiety. 1 By definition, statistics anxiety denotes feelings of apprehension upon being confronted with statistical tasks, be it in instructional situations

The relation between self-efficacy and academic performance
According to Efklides, 21 students' self-regulated learning is crucially modulated by a reciprocal interplay of motivation, affect, and metacognitive experiences. Furthermore, beyond metacognitive knowledge and motivation to learn, self-efficacy has also been reported to determine students' academic performance. 22 The term self-efficacy denotes a person's belief in their ability to succeed at a specific task. Importantly, self-efficacy can be distinguished from related constructs, such as locus of control, which, in academic settings, refers to individual styles of attributing academic failure or success. 23,24 For instance, students with an internal locus of control attribute their academic outcomes to personal investments, such as effort, while students with an external locus of control believe that their academic outcomes are beyond their own control and are instead attributable to external factors, such as luck or fate. By contrast, academic self-efficacy refers to students' beliefs that they can successfully complete course-specific academic tasks. Following social cognitive theories, self-efficacy is highly context-specific and dynamic in nature. 24 As stated by Bandura, 23,25 experiences of personal mastery as well as vicarious experiences of someone else (who is regarded as similar to oneself) foster the development of self-efficacy. Notably, self-efficacy is an important personality characteristic because high levels of self-efficacy promote individual accomplishments. For instance, the likelihood of actually being successful increases considerably if an individual is certain that an action can be carried out successfully. This is true even when optimistic beliefs in success do not match actual ability.
Importantly, positive attitudes, such as a strong sense of self-efficacy, promote the motivation to work on new and difficult tasks and facilitate engagement in and completion of task assignments. 26,27 Negative attitudes like (statistics) anxiety, on the other hand, leave people without initiative or cause them to give up prematurely. Notably, the influence of self-efficacy on performance depends on the perceived relevance of an area. 28 For instance, psychology students who aim to become scientists are likely to consider statistics extremely important, and thus their perceived self-efficacy for statistics will have a large impact on their statistics performance. By contrast, the influence of perceived self-efficacy for statistics on statistics performance might be low in psychology students with a major interest in psychotherapy who do not consider statistics relevant for their future work as psychotherapists.
With respect to math learning, previous findings revealed that self-efficacy, beyond being significantly related to math performance (see Refs. 22 and 29 for similar findings in pandemic-related contexts of online learning) and students' motivation for mathematics, 30,31 is a much more significant predictor of academic performance than math anxiety. 26,[32][33][34] Indeed, recent findings from a large-scale study comprising 158,161 eighth-graders disclosed that students' self-efficacy for math had a direct and moderate effect on their math performance (β = 0.260) that was somewhat larger in magnitude than the observed (direct) effect of math anxiety on math performance (β = −0.212). 33 Likewise, there is accumulating empirical evidence that students' self-efficacy for statistics exerts direct and significant effects on statistics performance 5,27,35,36 (reported effects being of medium strength, with β = 0.45 and an adjusted R² = 0.21; 35 for similar findings, see Ref. 36). However, and contrary to the above-reported findings reflecting a direct effect of math anxiety on math performance, 33 preliminary evidence suggests that the effect of statistics anxiety on statistics performance is an indirect one (via self-efficacy for statistics). 35 Because the latter study might have been underpowered due to a small sample comprising only 63 students, its findings need to be interpreted with caution until they are replicated with larger samples.

Rationale and aims of the present study
Scales for measuring students' self-efficacy in statistics are available in English-speaking countries. 37,38 In terms of content, these scales concentrate on statistical performance in specific areas, such as basic statistics 37 and the practical application of statistics using specific software. 38 Generally, self-efficacy was found to be positively related to students' statistics performance, 36,37,39 but negatively related to statistics anxiety. [39][40][41] Moreover, not surprisingly, students' statistics-related self-efficacy increased during their statistics education. 37 However, to the best of our knowledge, existing scales for measuring students' self-efficacy for statistics neither consider students' familiarity with statistics (i.e., whether they attend basic or advanced statistics courses) nor do they control for potential influences of self-efficacy, perceived relevance of statistics, or emotional responses (such as statistics anxiety) to statistics learning. 37 Thus, the present study aimed to fill these gaps. We also sought to conduct a validity check on our new assessment tool by comparing the SES-Psy to a popular questionnaire measuring competency and control beliefs (German abbreviation "FKK"). 42 In particular, the FKK 42 has been conceptualized to measure locus of control, which is related, but not identical, to the construct of self-efficacy addressed by the SES-Psy.
The main aims of the present study were threefold. First, we sought to develop an economical assessment tool suited to evaluating psychologists' self-efficacy for statistical competencies and to determine its psychometric properties. To this end, we calculated internal consistency and test-retest reliability coefficients and, moreover, examined convergent and divergent validity by conducting correlation analyses between the subscales of the newly developed SES-Psy as well as between the SES-Psy subscales and a questionnaire measuring locus of control (FKK). 42 As regards validity testing, we expected correlations between the SES-Psy and the questionnaire tapping locus of control (i.e., FKK) to be lower than the correlations observed among the subscales of the SES-Psy measuring self-efficacy for learning statistics.
Second, considering the current literature stressing the close interrelation between self-efficacy, motivational factors, emotional factors, and actual achievement, we chose to validate our new assessment instrument in psychology students attending basic and advanced statistics courses (i.e., second- and fourth-term students, respectively), including scales that, beyond testing self-efficacy for basic and advanced statistical knowledge, also tap students' beliefs in the relevance of statistics and their perceived statistics anxiety.
Third, we extended our approach with a person-centered analysis using latent profile analysis (LPA) to shed further light on the relation between motivational and emotional factors as well as self-efficacy.
It has been suggested that such a person-centered perspective, compared with a pure variable-centered approach, is more adequate for studying the complex structure of individuals' self-regulatory processes. 43 This allows for modeling the heterogeneity between students that may arise from the complex interaction between self-efficacy, motivation, and emotional factors. 44
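The internal consistency coefficient named in the first aim is conventionally reported as Cronbach's α. The following is a minimal sketch of the standard formula, not the computation actually used in this study; the function name `cronbach_alpha` and the example ratings are hypothetical and only illustrate the 6-point response format.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for one subscale.

    scores: list of respondents, each a list of k item ratings.
    Uses alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the respondents' summed scores
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings (5 respondents x 4 items), NOT data from the study
ratings = [
    [5, 6, 5, 6],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [6, 5, 6, 6],
]
alpha = cronbach_alpha(ratings)
```

Test-retest reliability, by contrast, would typically be obtained by correlating the same subscale scores across two measurement occasions.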

Participants
Overall, 290 German-speaking psychology students were tested as part of the present study (

Study procedure
Students were recruited during their statistics courses. Study participation was voluntary. All students were tested in class at the midpoint of the term. For pragmatic reasons, we abstained from conducting the present study at the beginning or toward the end of the semester, when students are busy either with setting up their course work or with exams. Testing was performed in groups at the end of a 90-min teaching unit and took between 45 and 60 min to complete. Students willing to participate remained in class, while the others could leave the room since teaching had already ended. Questionnaire booklets were distributed and collected at the students' desks with the help of student assistants. Questionnaire booklets were identified by an alphanumeric code and contained no personal information about individual participants other than age and sex. Participants were instructed that they could discontinue their participation at any point in time without any prejudice. The study was performed in accordance with the Declaration of Helsinki.

Test and item development
Self-efficacy is highly context-specific. 23 Items tapping self-efficacy should therefore directly address the specific situations of interest. 44,45 For the purpose of the present study, 73 items were found to be eligible, some of which were adapted and translated from an existing questionnaire, while others were new. More specifically, 14 items from the Current Statistics Self-Efficacy (CSSE) scale 37 were found to be eligible and were translated into German. In terms of content, these items mainly covered elementary statistical knowledge, such as the distinction between population and sample parameters, when the mean, mode, or median should be used as a measure of central tendency, and so on. Because the usual training in statistics for psychologists incorporates more advanced levels of statistical knowledge than are covered by the original items of the CSSE, 37 the remaining items of the pool were newly developed (Table S1). For each item, the question "The following statement applies to me . . . " was answered on a 6-point scale (ranging from "not at all = 1" to "completely true = 6"). Subsequently, the above-described item pool was further tested to identify the best items regarding psychometric properties.

Item selection for the SES-Psy
Three different methods were used for item selection: Mokken scaling, 46,47,48 cluster analysis, 48 and principal component analysis. 49 Item groups concordant in at least two out of the three methods were considered consistent.
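The two-out-of-three rule can be sketched as a simple vote over item groupings. This is an illustrative assumption about how concordance might be operationalized (exact agreement on group membership), not the procedure documented by the authors; the function name `concordant_groups` and the item identifiers are hypothetical.

```python
from collections import Counter


def concordant_groups(*method_groupings, min_methods=2):
    """Return item groups proposed identically by at least `min_methods` methods.

    Each grouping is a list of sets of item identifiers, one list per method.
    Assumption: two methods "agree" only if they propose exactly the same set.
    """
    counts = Counter()
    for grouping in method_groupings:
        # De-duplicate within a method so each group counts once per method
        for group in {frozenset(g) for g in grouping}:
            counts[group] += 1
    return [set(group) for group, n in counts.items() if n >= min_methods]


# Hypothetical item groupings, NOT results from the study
mokken = [{"a1", "a2", "a3"}, {"b1", "b2"}]
cluster = [{"a1", "a2", "a3"}, {"b1", "b2", "c1"}]
pca = [{"a1", "a2"}, {"b1", "b2"}, {"c1"}]

consistent = concordant_groups(mokken, cluster, pca)
```

A softer criterion, such as counting how often each pair of items is grouped together, would tolerate methods that differ by a single item; which variant is appropriate depends on how strictly concordance is meant.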

Mokken scaling
Mokken scaling is related to nonparametric item response theory models and finds an optimal combination of items maximizing parameters of one-dimensionality, local independence, latent monotonicity, and nonintersection. 46
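Mokken scaling procedures commonly judge how well a set of items forms a scale by Loevinger's scalability coefficient H, which is not named in the text above and is introduced here only as an illustration. The sketch below assumes the usual definition: the sum of inter-item covariances divided by the sum of the maximum covariances attainable given each item's marginal distribution (obtained by sorting both score columns). The function names are hypothetical.

```python
def covariance(x, y):
    """Sample covariance of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def scale_h(items):
    """Loevinger's H for a candidate scale.

    items: list of item score columns (one list of respondent scores per item).
    The maximum covariance given the marginals is reached when both columns
    are ordered identically, hence the sorted() calls in the denominator.
    """
    observed = maximal = 0.0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            observed += covariance(items[i], items[j])
            maximal += covariance(sorted(items[i]), sorted(items[j]))
    if maximal == 0:
        raise ValueError("items without variance cannot be scaled")
    return observed / maximal


# Hypothetical dichotomous items forming a perfect Guttman pattern
guttman_items = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
h_perfect = scale_h(guttman_items)

# Two hypothetical, unrelated items
h_unrelated = scale_h([[0, 0, 1, 1], [0, 1, 0, 1]])
```

A perfectly ordered item set yields H = 1 and unrelated items yield H near 0; dedicated software additionally checks monotonicity and nonintersection, which this sketch does not attempt.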

Questionnaire of competency and control beliefs (FKK)
For a subsample of students recruited from the University of Salzburg (n = 149), we were additionally able to assess competency and control beliefs by using the FKK. 42 The FKK is a 32-item questionnaire containing four equally long subscales (internal consistency of single scales ranging from 0.70 to 0.76). All subscales of the FKK measure control beliefs but with different emphasis on perceived ability to exert control (thus tapping either internal or external locus of control attributions). In particular, internal locus of control is measured by the two subscales FKK_c (self-concept and one's own competencies) and FKK_i (internality in the attribution of control), while external locus of control is tested by the two subscales FKK_s (social externality) and FKK_f (fatalistic externality). Importantly, the constructs of self-efficacy and locus of control (reflecting control beliefs) are related but can be kept apart. 23,24 Accordingly, the FKK was used to assess the divergent validity of the SES-Psy. In order to determine the divergent validity of the construct of self-efficacy in statistics, the correlation coefficients between the subscales of the SES-Psy and the FKK were calculated.
Hence, under the assumption that self-efficacy and locus of control tap distinct constructs, we expected the correlations observed among subscales of the SES-Psy to be higher than the correlations observed between the SES-Psy and the FKK.
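The expected pattern can be checked by comparing the average absolute correlation within the SES-Psy with the average absolute correlation between SES-Psy and FKK subscales. The sketch below is one simple way to express that comparison, assuming Pearson correlations; the helper names and all scores are hypothetical and do not reproduce the study's data or exact procedure.

```python
from itertools import combinations, product
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_abs_r(pairs):
    """Average absolute correlation over an iterable of (x, y) pairs."""
    values = [abs(pearson(x, y)) for x, y in pairs]
    return sum(values) / len(values)


# Hypothetical subscale scores for six students, NOT data from the study
ses_psy = {
    "SE_BASIC": [4, 5, 3, 6, 2, 5],
    "SE_ADV": [3, 5, 2, 6, 2, 4],
}
fkk = {"FKK_s": [4, 3, 3, 4, 4, 3]}

within = mean_abs_r(combinations(ses_psy.values(), 2))
between = mean_abs_r(product(ses_psy.values(), fkk.values()))
divergent_pattern_holds = within > between
```

Comparing averages is a descriptive shortcut; a formal test of the difference between dependent correlations would be needed for inferential claims.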

Profile analysis
Finally, we conducted an LPA 44 to examine individual differences and the heterogeneity of students using the scores of the four final SES-Psy scales (see the Results section below). Accordingly, this analysis allows for identifying potential subgroups of students with different profiles of statistics anxiety, perceived self-efficacy, and relevance of statistics.
LPA was run using R 4.0.5 50 and the R package mclust, 51 which is based on finite Gaussian mixture modeling. Results were visualized using the R package ggplot2. 52 We chose LPA as it offers statistical tests (BIC; BLRT) 53 for determining the adequate number of clusters, which is not always the case for traditional clustering approaches.
In particular, to identify potential subgroups or latent profiles of students, we used the four subscales of the newly developed questionnaire SES-Psy as LPA indicators (see the Results section below).
Prior to running the LPA, all indicators were z-standardized. To identify the best model and profile solution, we utilized the Bayesian information criterion (BIC). In particular, different numbers of clusters k and different classes of models (characteristics of distribution, volume, orientation, and form) were considered. With the mclust R package, 51 the least negative BIC value defined the best fitting model, but parsimony and interpretability of the model were also considered. 44 Finally, we used BLRT with the standard parameters (999 bootstrap replications; nonparametric bootstrapping) of the mclust R package. 51 This procedure compares the model fit between models with k-1 and k clusters.
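The selection rule described above can be sketched in a few lines. The analysis itself was run in R with mclust; the Python below only illustrates the z-standardization and the mclust sign convention for BIC (2 × log-likelihood minus number of parameters × log n, so that the largest, i.e., least negative, value wins). Function names, log-likelihoods, and parameter counts are hypothetical placeholders, not values from the study.

```python
from math import log
from statistics import mean, stdev


def z_standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def bic_mclust_convention(loglik, n_params, n_obs):
    """BIC as 2*logLik - n_params*log(n): larger (less negative) is better."""
    return 2 * loglik - n_params * log(n_obs)


def best_model(candidates, n_obs):
    """candidates maps (model_name, k) to (loglik, n_params)."""
    scored = {
        key: bic_mclust_convention(loglik, n_params, n_obs)
        for key, (loglik, n_params) in candidates.items()
    }
    return max(scored, key=scored.get), scored


# Hypothetical fit results for n = 290 students (illustrative numbers only)
candidates = {
    ("EEE", 2): (-1500.0, 19),
    ("EEE", 3): (-1450.0, 24),
    ("EEE", 4): (-1445.0, 29),
}
best, bic_values = best_model(candidates, n_obs=290)

# Standardizing one hypothetical indicator
z_scores = z_standardize([2.0, 3.5, 4.0, 5.5, 6.0])
```

Note that many other libraries report BIC with the opposite sign, in which case the smallest value indicates the best model; the comparison direction should always be checked against the software in use.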

Item selection yielding the final assessment instrument
From the 73 original items described above (and depicted in Table S1)

Reliability
In order to examine the scale's reliability, we performed tests of internal consistency and test-retest reliability on different subsamples of students. As depicted in

Construct validity
To determine the construct validity of the scale, we examined the correlations between the different subscales of the SES-Psy in the whole sample comprising 290 students and, separately, the correlations between the subscales of the SES-Psy and the subscales of the FKK 42 in a subsample of 149 students. Notably, as depicted in Table 3A, the correlations between the subscales of the SES-Psy were satisfactory, the highest correlation coefficients being observed for the subscale SE_ADV (i.e., varying from 0.324 to 0.605). Upon recalculating these correlations for second- and fourth-term students separately, the strengths and the pattern of the correlation coefficients were found to be quite similar in the two samples. However, it needs to be noted that correlations typically stabilize only with sample sizes close to and above n = 250. 54 Moreover, and as depicted in Table 3B, the intercorrelations between the subscales of the SES-Psy were generally higher than those observed between subscales of the SES-Psy and the FKK, thus indicating satisfactory construct validity.
Perceived self-efficacy for basic statistics (SE_BASIC)
We conducted another 2 × 2 ANOVA with the factors sex (female vs.

Perceived self-efficacy for advanced statistics (SE_ADV)
We conducted another 2 × 2 ANOVA with the factors sex and term to also investigate potential differences in self-efficacy for advanced statistics. No main or interaction effects reached significance.

Perceived relevance of statistics (SE_REL)
Another 2 × 2 ANOVA with the factors sex and term was run to investigate potential differences in perceived relevance of statistics. No main or interaction effects reached significance.

Latent profile analysis
Finally, we ran LPA to identify potential subgroups or profiles of students with respect to constructs assessed with the SES-Psy. Table 4 provides an overview of the three best models according to BIC values.
Abbreviations: ANX, statistics anxiety; FKK, German-language questionnaire measuring competency and control beliefs (Krampen, 1991); FKK_c, self-concept and one's own competencies; FKK_f, fatalistic externality in the attribution of control; FKK_i, internality in the attribution of control; FKK_s, social externality; RELEV, perceived relevance for statistics; SE_ADV, self-efficacy for advanced statistics; SE_BASIC, self-efficacy for basic statistics.
Abbreviations: EEE, ellipsoidal, equal volume, shape, and orientation; EVE, ellipsoidal, equal volume, and orientation; VVE, ellipsoidal and equal orientation.

Best BIC solutions
We

DISCUSSION
The main aims of this study were (1)

Psychometric aspects of the SES-Psy
The final version of the new assessment tool SES-Psy comprises 42 items grouped into the four subscales statistics anxiety (ANX), self-efficacy for basic statistics (SE_BASIC), self-efficacy for advanced statistics (SE_ADV), and perceived relevance of statistics (RELEV) (Table S1). Overall, the psychometric properties of the SES-Psy were found to be satisfactory as indicated by internal consistency (indexed by Cronbach's α, ranging from 0.77 to 0.90) and test-retest reliability (ranging from 0.68 to 0.87; see Table 2). Likewise, validity testing yielded promising results. For instance, construct and discriminant validity were found to be satisfactory, as reflected by higher intercorrelations between the subscales of the SES-Psy compared with lower correlations between subscales of the SES-Psy and a questionnaire assessing general competency and control beliefs (i.e., FKK; 42 see Table 3). In particular, correlation coefficients between the SES-Psy subscales ranged from 0.219 to 0.647, thus indicating nearly moderate to strong effect sizes. 55 Interestingly, the only exception (from the generally lower intercorrelations observed across the subscales of the SES-Psy and the FKK) were the rather high correlations between the SES-Psy subscales assessing self-efficacy and the FKK subscale FKK_c (denoting self-concept and one's own competencies), reaching r = 0.315 (SE_BASIC and FKK_c) and r = 0.311 (SE_ADV and FKK_c). However, given that the subscale FKK_c and the subscales SE_BASIC and SE_ADV tap related constructs (namely, self-concept and self-efficacy), the latter finding is not very surprising.

Content-related aspects of the SES-Psy
Novel features of the new assessment tool SES-Psy are (1) the consideration of potential emotional and motivational moderating factors for students' self-efficacy for statistical knowledge, and (2) the differentiation into self-efficacy for basic and advanced statistical knowledge.
Notably, existing assessment tools for students' self-efficacy do not distinguish between statistics competency levels. 37,38 However, this distinction is relevant given that self-efficacy is highly context-specific in nature. 23,56 Furthermore, and as mentioned above, in addition to assessing students' extent of self-efficacy for basic and advanced statistics, the newly developed scale SES-Psy also incorporates potential moderating effects on self-efficacy stemming from emotional and motivational factors (indexed by the two SES-Psy subscales ANX and RELEV, respectively). Our rationale for including the latter two subscales is derived from previous findings reporting that (learning) motivation critically affects, and even predicts, self-efficacy 8,28 (for similar findings in the realm of math anxiety, see Refs. 15 and 20), and that statistics anxiety exerts negative effects on self-efficacy. [39][40][41] Hence, in developing and constructing the SES-Psy, which is intended to measure students' self-efficacy for statistics, we also included items tapping potential motivational and emotional moderating variables (that were grouped into the subscales ANX and RELEV; see Table S1).
Overall, the SES-Psy allows a more comprehensive evaluation of students' self-efficacy for statistics and potentially influencing variables than already existing tools measuring self-efficacy for statistics. 37,38 In the following paragraph, we provide a more elaborate discussion of the complex interrelations between self-efficacy on the one hand and emotional (i.e., statistics anxiety) and motivational factors (i.e., perceived relevance of statistics) on the other hand. For that purpose, we employed a person-centered approach (by conducting an LPA) that should enable us to identify subgroups of students with distinct parameter values on the subscales of the SES-Psy.

Subgroups of students with different profiles of self-efficacy, statistics anxiety, and perceived relevance of statistics
Results of the LPA disclosed three subgroups of students that are characterized by rather distinct profiles of perceived self-efficacy in relation to statistics anxiety and perceived relevance of statistics (Figure 1). As mentioned above, a unique feature of the SES-Psy is the differentiation between students' perceived self-efficacy for basic and advanced statistics knowledge. Thus, we were able to investigate whether students' self-efficacy is influenced by the difficulty level of their statistics education.
In the following, we describe the three profile types in more detail on a descriptive level (see Figure 1, which also depicts the verbal labels of these subtypes). Students in profile 1 might be considered confident, as they report the lowest statistics anxiety and the highest self-efficacy for advanced statistics knowledge (so-called confident students). In contrast, students in profile 2 report rather high statistics anxiety, do not consider statistics to be particularly relevant, and have roughly average self-efficacy for basic as well as advanced statistics (anxious indifferent students). Finally, profile 3 is characterized by students with high statistics anxiety who consider statistics to be barely relevant, at least compared with the rest of the current sample (anxious deniers). Further, these students have the lowest self-efficacy for advanced statistics.
Upon having a closer look at these three subgroups of students (Table 5 and Figure 1), it becomes apparent that, compared with the other two subgroups, the confident believers pertaining to profile 1 have the highest proportion of fourth-term students (i.e., students attending advanced statistics courses). Accordingly, these students were found to score highest on the SES-Psy subscale SE_ADV. Moreover, the highest scores on the subscale ANX were found in two subgroups, namely, anxious indifferent students and anxious deniers (comprising profiles 2 and 3, respectively). However, while anxious indifferent students are characterized by an extremely skewed sex distribution favoring females (who comprise 75% of this subgroup), who, compared with men, generally tend to report higher levels of both trait and state anxiety, 57-59 the high anxiety scores of anxious deniers (profile 3 students) might be partially explained by their extremely low beliefs in the relevance of statistics (Figure 1). Notably, the distinctive profile of the second subgroup (comprising a high proportion of females reporting high levels of statistics anxiety) seems to be partially caused by sample characteristics. It might reflect a cumulative effect of higher general anxiety levels found in women, onto which higher context- (and subject-) specific anxieties (i.e., in our case, statistics anxiety) are superimposed.
Despite the more comprehensive understanding we gained from the LPA, we want to stress that the results of the LPA should only be interpreted with regard to the current sample of students. Nevertheless, such person-centered analyses are not restricted to linear patterns and provide a more differentiated understanding of how self-efficacy, anxiety, and perceived relevance of statistics might interact.
This would not have been feasible with regular variable-centered analysis approaches alone. 43

Implications for educational settings and future research endeavors
The present study further corroborates the importance of acknowledging moderating effects of self-efficacy, motivational and emotional factors on students' academic performance by focusing on statistics competencies. Notably, despite the fact that the SES-Psy was validated on a sample of psychology students, its use is not restricted to psychology students. Rather, we suggest that for any student taking statistics courses, the SES-Psy might be a useful measure of students' self-efficacy for statistics. An important finding that could potentially inform educational settings is the identification of distinct subgroups of students regarding the complex and nonlinear interplay between self-efficacy, statistics anxiety, and perceived relevance of statistics. For instance, knowledge of distinct subgroups of students could inform the development of more comprehensive, adaptive, and individually tailored teaching tutorials (and exercises) that encompass the fostering of students' self-efficacy and learning motivation on the one hand and the reduction (or avoidance) of statistics anxiety on the other hand.
To conclude, future research endeavors are urgently needed to develop and test efficient measures to increase students' perceived self-efficacy (which might be targeted at the facilitation of positive experiences, the imitation of a role model, the persuasion by a familiar model, or the regulation of one's own psychophysiological reactions). For instance, it has been suggested that game elements, autonomy support, and scaffolding constitute potential instructional mechanisms to foster self-efficacy in students. 30,[60][61][62][63] Likewise, future research should be targeted at implementing and investigating methods to alleviate statistics anxiety and to improve attitudes in students suffering from statistics anxiety and/or low learning motivation. Such methods could include programs employing traditional counseling techniques, stress inoculation training, or systematic desensitization.
Finally, future research should seek to further corroborate the validity (and persistence) of the three rather distinctive profiles characterized by a complex and nonlinear interplay between perceived self-efficacy, statistics anxiety, and students' belief in the relevance of statistics.

Limitations of the study
Although the construct validity of the new scale SES-Psy is satisfactory (as reflected by a good model fit), it still awaits further testing (i.e., by assessing correlations between the SES-Psy and other tests of statistics anxiety). However, as mentioned above, existing tools measuring self-efficacy for statistics are restricted to basic statistics knowledge and, furthermore, do not incorporate potential moderating variables. 37,38 A further potential limitation is the rather small sample size (n = 290), which should be considered when interpreting the LPA results. The skewed sex distribution favoring females might be considered another limitation.
Even so, all existing studies focusing on psychology students have a similar sample with a skewed sex distribution, 4,8,12 reflecting the fact that many more women than men study psychology. Clearly, future research endeavors are needed that systematically examine how sex differences might impact upon the interplay of self-efficacy, statistics anxiety, and related motivational factors. Moreover, future studies should collect, above and beyond age and sex, broader demographic variables related to participants' personal characteristics, as well as previous experiences with statistics classes.
Nonetheless, despite these potential limitations, we believe that the present study significantly adds to the literature. The newly developed scale SES-Psy is novel as it is targeted at the assessment of self-efficacy for statistics in psychology students while taking into account moderating emotional and motivational variables (i.e., statistics anxiety and perceived relevance of statistics). Another unique feature of the SES-Psy is its potential to differentiate subgroups of psychology students who are characterized by clearly distinguishable profiles regarding the interplay between perceived self-efficacy, statistics anxiety, and perceived relevance of statistics. Overall, our findings underscore the necessity to regard self-efficacy as a highly context-specific construct that clearly impacts upon students' emotional and motivational factors when learning statistics and thus needs to be measured (and fostered) accordingly.