Psychometric Evaluation of the Capstone Core Competency Scale on Nursing Students in Taiwan

ABSTRACT
Background: Previous studies have evaluated the competency of newly graduated nurses and experienced nurses. However, most of the instruments used include a large number of items, which makes completing them time-consuming. A brief instrument may be more acceptable and feasible for use in these evaluations.
Purpose: This study was designed to develop the brief capstone core competency (CCC or 3C) scale and validate its effectiveness in evaluating the academic and practical performance of nursing students enrolled in a bachelor's degree program.
Methods: A cross-sectional study was conducted. The 3C scale was developed in two phases. In Phase I, the items were generated from literature reviews and an expert panel and were then evaluated using known-groups validity, test–retest reliability, internal consistency reliability, and exploratory factor analysis. In Phase II, the structure of the instrument was confirmed using confirmatory factor analysis. A total of 596 students participated in the first phase, and 520 participated in the second phase. The study period was 2016–2017.
Results: The 3C scale includes 24 items distinguished into a three-component structure that accounts for 67.85% of the total variance. The three components are nursing intelligence, nursing humanity, and nursing career. The 3C scale was found to have high levels of internal consistency reliability (.97) and test–retest stability reliability (r = .97). A statistically significant difference in performance level was found between senior and junior nursing students. The hypothesized three-factor model fit index showed χ²/df = 1338.25/249, p < .001, goodness-of-fit index = .82, comparative fit index = .90, root mean square error of approximation = .09, and standardized root mean square residual = .06. The participants were found to have excellent nursing humanity competency.
Conclusions: The developed 3C scale exhibited satisfactory reliability and validity for use with nursing students. The 3C scale may be used to evaluate the performance of nursing students during their learning process, and the results may be used to evaluate changes in educational outcomes.


Introduction
Competency-based training has been adopted in prelicensure nursing programs to induce progress toward advanced nursing practice (Jagera et al., 2020). Delivery of care by nurses is challenged by changing social and political contexts. Education and the future needs of nursing are cornerstones of the development of the nursing profession and of the development of individuals and their professional careers (Holland & Lauder, 2012). The cultivation of professional nursing manpower must start with nursing education and on-the-job education. The goals of nursing education are to ensure an educated, competent, and motivated nursing workforce within effective and responsive health systems at all levels and in different settings (Jagera et al., 2020). Capstone learning experiences may be described as the pinnacle of a nursing student's educational journey. The objectives of the capstone core competency evaluation are to create an opportunity for students to synthesize and integrate concepts and skills across the nursing curriculum into the clinical practicum, where nursing students must show mastery of academic content and the realization of the nursing role in clinical settings (Smallheer et al., 2018). Nursing competencies are highly valued by government, professional development, and consumer awareness (Holland & Lauder, 2012; Taiwan Nurses Association, 2019). Nursing curricula have been transformed using teaching strategies such as problem-based learning, objective structured clinical examination, simulation practice, and flip teaching, which have been identified as effective methods for improving psychomotor and problem-solving skills to strengthen patient-centered nursing care competencies in nursing students (Fan et al., 2015; Şentürk Erenel et al., 2021). The capstone core competencies are required for all majors and fulfill the experience requirement for registered nurses to receive license accreditation (Mackenzie et al., 2019).
In particular, bachelor of nursing students in Taiwan must complete at least 87 credits of compulsory professional courses, 128 graduation credits, and more than 1,200 hours of clinical practicum to qualify for the national eligibility examination for registered nurses. Construction of capstone core competency assessment indicators for nursing students that may be used in curriculum development, planning, and learning outcome evaluation has been a key focus of nursing research in recent years (Hsu & Hsieh, 2013).
Previous studies have highlighted limitations of existing instruments related to long narratives, comparisons for subjects, evaluation forms containing 36-119 items, and valuation methods (Berkow et al., 2009; Halcomb et al., 2016; Lee et al., 2019). Thus, the purpose of this study was to construct a capstone core competency scale (3C scale) for appraising the performance of students studying in bachelor-degree nursing programs.

Literature Review
Previous studies have focused on evaluating the competency of newly graduated nurses, experienced nurses (Halcomb et al., 2016; Park & Park, 2020; Yang et al., 2013), and nursing students (Hsu & Hsieh, 2013; Lee et al., 2019; Tommasini et al., 2017). However, these scales include various factors and a large number of items that take considerable time to complete. The core competency assessment may be regarded as both an indicator of academic and practicum performance in nursing educational programs and a reference for revising the teaching and learning process (Ličen & Plazar, 2015). The competencies cover nursing student confidence in clinical practicums, the quality of patient care, professional career advancement, role performance, and job success (Ličen & Plazar, 2015). The objectives of the capstone core competencies for nursing have been identified as essential in the national competencies of nursing educational programs by the Taiwan Nursing Accreditation Council (Hsu & Hsieh, 2009).
An international case study on the clinical competence evaluation scales was conducted in the programs of bachelor nursing science courses. This tool included 196 items classified into 12 core competence categories using content analysis in a final evaluation at the end of the clinical placement (Tommasini et al., 2017). A Korean study used a core competency scale containing 70 items classified into five subscales to evaluate Korean nursing students (Lee et al., 2019), whereas another study described the psychometric properties of two questionnaires with 34 items used to assess the concepts of "good nurse" and "better nursing" (Park & Park, 2020). In addition, the Competency Outcomes and Performance Assessment (COPA) model addresses three factors (affective, cognitive, and psychomotor skills) for bachelor nursing programs with eight core practice competencies (Lenburg et al., 2011). However, the COPA model lacks evidence-based data to verify its reliability and validity (Lenburg et al., 2011). Another study showed that the 36 tested competencies were grouped into six broader skill categories used to evaluate newly graduated nurses (Berkow et al., 2009). Hsu and Hsieh (2009) developed an eight-item Self-Evaluated Core Competencies Scale covering the two components of humanity/responsibility and cognitive/competency. In addition, the Competency Inventory of Nursing Students covers six domains with 47 items (Hsu & Hsieh, 2013). In brief, these scales generalize the competencies of clinical skills, therapeutic communication, critical thinking, and the professional attitude competencies related to ethics, accountability, lifelong learning, and humanity. However, these core competencies poorly address leadership and management, professional enhancement, quality improvement, and continuing education.
Core competency assessments are designed to review, from a holistic perspective, the attainment of expected learning outcomes (Ličen & Plazar, 2015). Previous studies have proposed that a variety of factors and items be included in nursing core competency instruments for nursing students and newly graduated nurses (Berkow et al., 2009; Lee et al., 2019; Ličen & Plazar, 2015). Thus, the aim of this study was to develop a simple and clear self-reported instrument, the 3C scale, to provide summative feedback on students and the curriculum.

Research Design and Participants
A cross-sectional survey design was employed using the convenience sampling method at different time points. Of the 596 participants in Phase I, 446 were senior-year students from three private universities and 150 were junior-year students from one private medical university. In Phase II, 520 senior-year students were recruited as participants from one private medical university during 2016-2017. The inclusion criteria for the senior-year participants were as follows: (a) completed the prerequisite senior courses, (b) had clinical practicum experience, and (c) were full-time nursing students. The "prerequisite" senior courses included junior courses and advanced required nursing courses, including the following courses in advanced nursing intelligence: certification of basic nursing skills, advanced clinical nursing skills, medical and surgical nursing (including practice), maternity nursing (including practice), pediatric nursing (including practice), community health nursing (including practice), psychiatric nursing (including practice), introduction to nursing administration (including practice), comprehensive clinical nursing practice I and II, introduction to nursing research, nursing career (II), nursing ethics, and seminar on professional nursing issues. The inclusion criteria for the junior-year participants were as follows: (a) completed the prerequisite junior courses and (b) were full-time nursing students. The "prerequisite" junior courses covered all required junior courses, including the following courses in basic biomedical science and basic nursing intelligence: introduction of nursing, nursing career (I), physiology (including laboratory), anatomy (including laboratory), human development (including practice), microbiology and immunology (including laboratory), pharmacology, biomedical statistics, pathology, physical assessment (including practice), and fundamentals of nursing (including practice).
The exclusion criteria in this study were as follows: (a) being a restudying student and (b) holding a registered nurse license. The minimum sample size was calculated using a target ratio of 10 participants for each principal component and factor analysis item (Meyers et al., 2017).
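The participants-per-item rule described above reduces to a single multiplication. The following is a minimal sketch, assuming the ratio is applied to the number of scale items; the function name and the choice to check both the 35-item pool and the 24-item final scale are illustrative and are not taken from the study.

```python
def minimum_sample_size(n_items: int, participants_per_item: int = 10) -> int:
    """Return the minimum sample size under an N-participants-per-item rule."""
    if n_items <= 0 or participants_per_item <= 0:
        raise ValueError("both arguments must be positive")
    return n_items * participants_per_item


# Item counts reported in the article: 35 drafted items, 24 retained items.
# Sample sizes reported in the article: 596 (Phase I) and 520 (Phase II).
for label, n_items in (("initial item pool", 35), ("final 3C scale", 24)):
    required = minimum_sample_size(n_items)
    print(f"{label}: at least {required} participants; "
          f"Phase I meets it: {596 >= required}, Phase II meets it: {520 >= required}")
```

Under this reading, both phases exceed the requirement whichever item count is used.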

Phase I: Scale Development and Item Generation
In the item generation stage, the competencies used were drawn primarily from the eight-core-competency framework for national nursing educational programs published by the Taiwan Nursing Accreditation Council (Hsu & Hsieh, 2013). Thirty-five items were constructed from the conceptual definition and affiliated with other items elucidated from the literature review and a panel discussion with experts in nursing education and administration (DeVellis, 2017; Mishel, 1998). For the review of items pooled by another expert panel, 20 experts assessed content validity by reflecting on the adequacy and importance of each item within the eight-core-competency framework. Eight factors were retained and 11 items were deleted based on the expert panel discussion. Finally, 24 items were selected, refined, and included in the 3C scale. The content validity index was assessed by 10 nursing educators and administrators, who evaluated appropriateness by confirming relevance, clarity, and simplicity (DeVellis, 2017). Next, a group of senior-year nursing students (n = 30) reviewed the items and evaluated the transparency and relevance of the test from the perspective of respondents (Gravetter & Forzano, 2012).
Known-groups validity was measured by comparing the mean scores for scale variables between the two groups. Another group was recruited from the junior-year nursing students. One criterion may be that test scores should discriminate between groups that, theoretically, should have different traits. A procedure was outlined to assess the expected differences between junior- and senior-year student groups that computed an effect size measurement showing the strength of the relationship or magnitude of the difference between the two groups (Hattie & Cooksey, 1984; Morgan et al., 2013).
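One common effect size for a two-group comparison of this kind is Cohen's d with a pooled standard deviation. The sketch below assumes that this is the statistic intended; the article does not state the formula used, and the scores are hypothetical values chosen only to make the example runnable.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical mean scale scores for illustration only (not study data).
seniors = [4.0, 4.5, 5.0, 4.5, 4.0]
juniors = [3.5, 4.0, 4.5, 4.0, 3.5]

print(f"d = {cohens_d(seniors, juniors):.2f}")
```

A positive d here means that the first group scored higher than the second; swapping the arguments only changes the sign.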
The alpha coefficient was used to assess internal consistency reliability, whereas test-retest reliability was measured to show the stability of the scale (Mishel, 1998). All of the tests were two tailed, and significance was defined by a p value of < .05. Data were collected from 446 senior-year students and 150 junior-year students from private universities in southern Taiwan.

Phase II: Confirmatory Factor Analysis
Confirmatory factor analysis (CFA) was used to establish the 3C scale structure. Five hundred twenty nursing students from one medical university participated in this phase during the period of 2016-2017.

Ethical Considerations
This research was approved by the institutional review board (KMUHIRB-SV [II]-20160038). Data were anonymized, and no names or information were revealed to ensure that participation had no effect on participants' course scores.

Study Settings
The subjects participated in this study based on the paper announcement. The researcher then went to the class to explain the purpose of the study.

Data Collection Procedure
Participants were informed that participation was voluntary and that participation status would not affect their course scores. The purposes, risks, and benefits of this study were explained to the students, and the study questionnaire was given to those who provided informed consent to participate.

Data Analysis
Data were analyzed using SPSS Statistical Software Version 22.0 (IBM Inc., Armonk, NY, USA). The mean and standard deviation for each item were examined to provide information about items for judgment (Meyers et al., 2017). Construct validity was assessed using exploratory factor analysis (EFA), which used principal component analysis with varimax rotation to interpret the maximum amount of variance in the participants with compounded interpretability of the components (Meyers et al., 2017). Items with loading values over .5 were retained.
CFA was run on AMOS 18 statistical software to examine the 3C scale structure. CFA is used to reveal the degree to which each factor item corresponds with the observed covariance. The following parameters were determined for all of the tests: chi-square/degree of freedom ratio (χ²/df value ≤ 5.00 is considered acceptable; Bollen, 1989), standardized root mean square residual (SRMR value should be < .10), comparative fit index (CFI value of .90-.95 indicates an acceptable level of fit), and root mean square error of approximation (RMSEA value of .07-.08 indicates a moderate fit, and .08-.10 indicates a marginal fit). A goodness-of-fit index (GFI) value of .90-.95 indicates an acceptable level of fit (Meyers et al., 2017). Furthermore, the reliability coefficients of average variance extracted, Cronbach's alpha, and composite reliability were calculated.
Evaluation of the Capstone Core Competency Scale VOL. 30, NO. 5, October 2022

Content Validity
The overall content validity of the 3C scale was .98, and that for individual items ranged from .85 to 1.0, which were all higher than the recommended minimum content validity index value of .78 (Mishel, 1998).
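The item-level and scale-level content validity indices can be computed as simple proportions. This is a minimal sketch, assuming a 4-point relevance scale on which ratings of 3 or 4 count as relevant and assuming that the overall value is the average of the item-level indices; the article does not describe the rating format, and the ratings below are hypothetical.

```python
def item_cvi(ratings, relevant_threshold=3):
    """I-CVI: proportion of experts rating the item at or above the threshold."""
    return sum(r >= relevant_threshold for r in ratings) / len(ratings)


def scale_cvi_average(ratings_by_item):
    """Scale-level CVI computed as the mean of the item-level indices."""
    i_cvis = [item_cvi(ratings) for ratings in ratings_by_item]
    return sum(i_cvis) / len(i_cvis)


# Hypothetical ratings from 10 experts for three items (not study data).
ratings_by_item = [
    [4, 4, 4, 3, 4, 4, 3, 4, 4, 4],  # all experts rate the item relevant
    [4, 4, 3, 3, 4, 4, 3, 4, 2, 4],  # one expert rates it below 3
    [4, 2, 3, 3, 4, 4, 3, 4, 2, 4],  # two experts rate it below 3
]

MIN_CVI = 0.78  # recommended minimum cited in the text
for number, ratings in enumerate(ratings_by_item, start=1):
    value = item_cvi(ratings)
    print(f"Item {number}: I-CVI = {value:.2f}, meets minimum: {value >= MIN_CVI}")
print(f"Scale CVI (average) = {scale_cvi_average(ratings_by_item):.2f}")
```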

Construct Validity
Means across items ranged from 3.75 to 4.47 (Table 1). Correlation coefficients between the items and the total scale ranged from .36 to .77. Construct validity was examined in the EFA using principal axis factor analysis with varimax rotation. The Kaiser-Meyer-Olkin value and the Bartlett sphericity test value (KMO = .96, χ² = 4269.84, p < .001) indicate that adequate items were predicted by each factor and that the variables were sufficiently correlated to provide reasonable data for factor analysis. The item-total correlation was used to examine item discrimination, which ranged from .63 to .78 (Table 1; Morgan et al., 2013).
The 3C scale was grouped into three competency factors, including nursing intelligence, nursing humanity, and nursing career, which respectively accounted for 56.29%, 7.00%, and 4.56% of the variance. The first factor, named nursing intelligence competency, consists of 13 items (Items 1-11, 19, and 20) and addresses nursing student competencies related to clinical nursing skills, biomedical science, logical reasoning, and critical thinking as well as teamwork communication and cooperation in the context of the clinical practicum. The eigenvalue for this factor was 13.06, and the mean was 3.91 (SD = 0.67). The correlations between the various items ranged from .67 to .78. Only four items in this factor attained a performance level score > 80%. The participants scored relatively well on clinical skills, professional communication, and teamwork skills but poorer on critical thinking skills (Shirazi & Heidari, 2019). The second factor, named nursing humanity competency, consists of seven items (Items 12-18) and addresses nursing student competencies related to the esthetics of nursing, caring attitude (e.g., listening, shared decision making, empathy, and respect), respect for culture, and ethical thinking in judgment that shows an ability to apply empirical knowledge and empathy in healthcare practice. The eigenvalue for this factor was 1.87, and the mean was 4.22 (SD = 0.59). The correlations between the various items ranged from .63 to .70. Most items in this factor attained a performance level score > 80%, indicating excellent performance. The third factor, named nursing career competency, consists of four items (Items 21-24) and addresses the need for nursing students to reflect responsible attitudes and behaviors, proactively pursue self-directed learning, and use various resources to plan their professional nursing career development. The eigenvalue for this factor was 1.07, and the mean was 3.85 (SD = 0.69). 
The correlations among the various items ranged from .73 to .77.

Reliability
The Cronbach's alpha coefficient was .97, and coefficients of individual factors ranged from .90 to .95. Thirty junior nursing students participated in a 2-week test-retest for testing the stability of this scale, and the reliability coefficients were obtained from the total score (intraclass correlation coefficient = .85). The corrected item-total correlation ranged from .63 to .78 (Table 1).
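Cronbach's alpha compares the sum of the item variances with the variance of the total score. The sketch below is a plain implementation of that standard formula; the response matrix is hypothetical and far smaller than the study sample, so it is meant only to show the calculation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents, three items (not study data).
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

print(f"alpha = {cronbach_alpha(responses):.2f}")
```

The same function can be applied to the items of a single factor to obtain a subscale coefficient.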

Confirmatory Factor Analysis
CFA was conducted to confirm the three-factor structure of the 3C scale based on the results of the EFA. The chi-square test was statistically significant (χ² = 1338.25, df = 249, p < .001), and the other goodness-of-fit indices (χ²/df = 5.37, GFI = .82, CFI = .90, RMSEA = .09 [a value between .08 and .10 indicates a marginal fit], and SRMR = .06) all indicated poor model fit (Table 2). The Pearson's correlation coefficients ranged from .71 to .96 for the three factors. The internal consistency reliability of the total scale was .92, and the composite reliability for each factor varied between .90 and .94, which is above the minimum value (> .70; Meyers et al., 2017) required for reliability. Average variance extracted indicator values between .57 and .72 (greater than or equal to .50) were recommended
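Composite reliability and average variance extracted are usually derived from the standardized factor loadings. The sketch below uses the widely cited Fornell-Larcker formulas and assumes uncorrelated error terms; the article does not state which formulas were applied, and the loadings are hypothetical.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    # For standardized loadings, each item's error variance is 1 - loading^2.
    error_variance = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error_variance)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Hypothetical standardized loadings for a four-item factor (not study data).
loadings = [0.8, 0.8, 0.8, 0.8]

cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR = {cr:.2f} (minimum .70 met: {cr > 0.70})")
print(f"AVE = {ave:.2f} (minimum .50 met: {ave >= 0.50})")
```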

The Journal of Nursing Research
Hui-Chen TSENG et al.

Discussion
The purpose of this study was to develop and test the psychometric properties of the 3C scale. No simple capstone assessment has been used to analyze the achievement of learning in bachelor-degree-program nursing students with regard to professionalism in Taiwan. The brief 3C scale was shown to exhibit good validity and reliability, suggesting that it may be widely used as a tool for evaluating appropriateness of the junior to senior curriculum designs. The development of the 3C scale applied a panel discussion, content validity, known-groups validity, construct validity, and test-retest reliability to obtain the Cronbach's α coefficients and used EFA and CFA to evaluate reliability and validity.

Methodological Strength
Over 500 participants were recruited in both phases in the survey, adhering to the general rule that a sample size of 500 is very good in most cases of ordinary factor analysis (DeVellis, 2017). According to a recommendation in Meyers et al. (2017), data from at least 300 participants should be collected to analyze a questionnaire with 25 items based on the recommendations of the adequacy of various sample sizes for factor analysis. The sample in this study was adequate for the validity test, with the results indicating satisfactory psychometric properties. Notably, 11 items were deleted from the expert panel discussion because of inadequate relevance, clarity, or simplicity in item descriptions in the item-generation phase (DeVellis, 2017). The 3C scale provided good estimates of construct validity by evaluating the EFA with the varimax rotation and CFA with maximum likelihood estimation. The EFA, conducted for all 24 items, presented different aspects of core competencies in nursing evaluation. A three-component model was extracted using the scree plot and eigenvalues.
The results of the EFA indicate that the scale has good construct validity, explaining 67.85% of the total variance of the retained factors and retaining the appropriate operationalization of the concepts. The results further showed that most variance was loaded on Factor 1 (56.29%), with the other two factors accounting for 7% and 4.56%, respectively. The eigenvalues varied between 1.07 and 13.06, and the loadings ranged from .50 to .83 (Table 1). The authors made the decisions to keep all three factors in this scale according to "the extent that the ratio of the variance of the variance explained by the canonical function to the estimated error variance exceeds 1, the signal is stronger than the noise making the signal more apparent." Each eigenvalue is associated with a theta value, that is, an index of the strength of the relationship (the amount of shared variance) between the dependent variable and the predictor variable for canonical function (Meyers et al., 2017).
On the basis of the known-groups validity analysis results, all of the items in the developed scale were sufficiently homogeneous and had good discrimination ability (Morgan et al., 2013). Item discrimination was supported by the statistically significant differences between the senior- and junior-year groups. An effect size expresses the magnitude of a difference relative to the variability in the population, as estimated from the sample (Schäfer & Schwarz, 2019). The effect sizes (d) of the total score and the three factors were approximately .87, .76, .99, and .68, respectively, which were larger than the typical (medium) effect size of .50 in the behavioral sciences and education (Table 3; Morgan et al., 2013).
The 3C scale was found to exhibit high internal consistency reliability. The internal consistency of the total scores and all factors was found to be high (Cronbach's α = .97 and .90-.95, respectively), showing the degree to which test subjects responded in a consistent manner to the items (Meyers et al., 2017). Moreover, test-retest reliability over a 2-week period indicated acceptable stability across time (intraclass correlation coefficient = .97; DeVellis, 2017). The results of the EFA indicated that all items met the criterion of a factor loading of .5 (Meyers et al., 2017). It is noteworthy that Item 19 ("Evaluate the effectiveness of nursing care interventions") cross-loaded, with a loading value over .50 on Factor 1 (nursing intelligence) and a loading value over .53 on Factor 2 (nursing humanity). On the basis of the meaning of the item and the concept framework in the nursing practicum field, Item 19 was included under the nursing intelligence competency (Osborne, 2014).
According to standards for evaluating goodness of fit, the cutoff points for the model fit indices were as follows: χ²/df < 5.0, GFI > .9, CFI > .9, RMSEA < .10, and SRMR ≤ .10 (Bollen, 1989; Meyers et al., 2017). Comparing the results of the χ²/df, CFI, EFI, NFI, RMSEA, and SRMR data in the three-factor model in Phase II, the marginal criteria were met. If Items 1, 9, and 21 were deleted from the scale, the results of the same analyses would be χ²/df = 4.33, GFI = .90, CFI = .92, RMSEA = .08, and SRMR = .06, which approximate the recommended criteria. Therefore, the scale for Item 1 ("Integrate knowledge of basic biomedical science") belongs in the appraisal of knowledge domain, whereas Items 9 ("Cultivate a judging goal-oriented decision-making process in nursing care process") and 21 ("Cultivate self-management ability in nursing care activities") belong in the appraisal of affection domain.

Note. SRW = standardized regression weights; SE = standard error; AVE = average variance extracted; CR = composite reliability; t value = CR in the regression weight.
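The comparison between the full 24-item model and the reduced model can be made explicit by checking each reported index against the cutoffs listed in this section. This is a minimal sketch that treats the cutoffs as strict rules exactly as written; the dictionary keys and function name are illustrative, and the index values are the rounded figures reported in the text.

```python
# Cutoffs as stated in the text: chi2/df < 5.0, GFI > .9, CFI > .9,
# RMSEA < .10, SRMR <= .10.
CUTOFFS = {
    "chi2_df": lambda v: v < 5.0,
    "GFI": lambda v: v > 0.90,
    "CFI": lambda v: v > 0.90,
    "RMSEA": lambda v: v < 0.10,
    "SRMR": lambda v: v <= 0.10,
}

# Values reported for the full model and for the model without Items 1, 9, and 21.
full_model = {"chi2_df": 1338.25 / 249, "GFI": 0.82, "CFI": 0.90,
              "RMSEA": 0.09, "SRMR": 0.06}
reduced_model = {"chi2_df": 4.33, "GFI": 0.90, "CFI": 0.92,
                 "RMSEA": 0.08, "SRMR": 0.06}


def check_fit(indices):
    """Return, for each index, whether the reported value meets its cutoff."""
    return {name: CUTOFFS[name](value) for name, value in indices.items()}


for label, model in (("full model", full_model), ("reduced model", reduced_model)):
    results = check_fit(model)
    met = [name for name, ok in results.items() if ok]
    print(f"{label}: chi2/df = {model['chi2_df']:.2f}, cutoffs met: {met}")
```

Because the cutoffs for GFI and CFI are strict inequalities, a value reported as exactly .90 does not pass in this sketch; with rounded published values, such borderline cases are better read as approximate than as clear failures.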

Practical Strengths
This study has several practical strengths. The 3C scale with fewer items offers the potential to save the time and energy necessary to evaluate subjects. Shorter instruments may be considered more acceptable and feasible by participants (Berkow et al., 2009; Halcomb et al., 2016; Lee et al., 2019). In addition, the construct of the 3C scale is similar to the COPA model according to the framework of quality care and patient safety competency (Lenburg et al., 2011).

Limitations and Scope for Further Research
The 3C scale was developed from data sampled from bachelor nursing students from three private universities in Phase I and from one private university in Phase II. Thus, the findings may not be generalizable to other populations (e.g., registered nurses or master's students). In addition, the 3C scale reflects the perspectives of nursing experts, nursing educators, and nursing students in Taiwan, which may not adequately capture the perspectives of healthcare professionals in other cultural settings. However, items in the nursing humanity factor reflected cultural-related concepts. Therefore, further research is recommended to broaden the cultural relevance of the instrument.

Conclusion
The 3C scale comprises three factors and 24 items and is scored on a 6-point Likert scale. The 3C scale exhibited good psychometric properties for two reliability metrics and the construct validity metrics. This 3C scale may be used to evaluate the academic and practical performance of quality care and patient safety competency and provides a guideline for the construction of both online and offline educational programs, with a conciseness and advanced methodology that makes it feasible and reliable for implementation in real-world settings.