Detailed data about a forty-year systematic review and meta-analysis on nursing student academic outcomes

Data were extracted from observational studies describing undergraduate nursing students’ academic outcomes that were included in a systematic review and meta-analysis conducted in 2019 and updated in 2020 [1]. Data were extracted by two researchers independently using a previously tested electronic spreadsheet; any disagreement about data extraction was discussed with a third author. The extracted data comprised each study's general information, characteristics (i.e., country, study design, involved centers, number of student cohorts involved, duration (years) and denomination of the program attended, sample (N), sociodemographic characteristics of the sample, and methods used for data collection), and data related to the research question(s) of the review, i.e., the occurrence of nursing students’ academic outcomes and the associated factors. Raw data for each included study are reported, along with meta-analyses that were performed with the free software ProMeta, using the odds ratio (OR) and Cohen's d as principal effect sizes. The random-effects model was used for all meta-analyses, while heterogeneity was explored and quantified through Cochran's Q-test and I², respectively. Substantial or considerable heterogeneity (i.e., I² ≥ 50%) was explored through a subgroup analysis based on the study design, when feasible [2]. A sensitivity analysis was also performed to detect the possible influence of single studies on the meta-analysis results [2]. Publication bias was assessed through funnel plots and tests for their asymmetry, i.e., Begg and Mazumdar's rank correlation and Egger's linear regression method [2]. These data provide an updated state of the art on nursing students’ academic outcomes and associated factors. They could therefore ease future literature summaries on the topic and allow future research results to be compared with the available literature.

Specifications Table

Subject: Nursing and Health Professions
Specific subject area: Associated factors of nursing students' academic outcomes
Type of data: Tables; Figures
How data were acquired: Data were acquired by consulting the studies included in the systematic review. Included studies were retrieved by launching search strings on the PubMed, Scopus, Education Resources Information Centre (ERIC), and Open Grey databases. To maximize the retrieval of potentially relevant manuscripts, the reference lists of the included studies, as well as the references that had cited the included studies on Scopus, were assessed for eligibility.
Data format: Raw; Analysed
Parameters for data collection: Studies were included in the systematic review and data were extracted if studies were available in full text, published in Italian or English, and quantitative non-randomized in design. Moreover, studies had to include academic nursing students who attended a program lasting at least three years, consider academic outcomes measured at the end of the regular duration of the program according to its regulation, and report an analysis of associated or predictive factors of students' academic outcomes.
Description of data collection: After a two-step screening of abstracts and full texts according to the relevant inclusion criteria, 18 studies (n = 10 retrospective cohort, n = 7 prospective cohort, and n = 1 case-control) were included in the review. Nine studies were included in the meta-analysis. The 'Downs and Black instrument' was used to assess the risk of bias of the included studies [3]. Data were extracted and collected by two researchers independently after printing the included full texts. Any disagreement between the two researchers about data extraction was resolved by discussion with a third author. Data were entered in an Excel for Windows spreadsheet, where the following were collected: first author, publication year, country, study design, whether the study was multicentric, number of cohorts of students, program duration (years), program denomination, sample (N), number of females, age [Mean (SD)], methods for data collection, study aim, adopted definition of the outcome, independent variables assessed for association with or prediction of the outcome and their occurrence or mean values, statistical analyses performed by the authors, and a summary of the results for all the outcome definitions adopted by the authors (if available).

Value of the Data
• These data contribute to understanding the occurrence of nursing students' academic outcomes and their associated factors.
• Researchers and educators of academic nursing students can benefit from these data to inform their future research and educational practice.
• These data might ease future literature summaries on the topic; they could be used to drive future research and to compare its results with the literature available so far.

Data Description
Table 1 reports all details about the search strategy in the electronic databases, while Fig. 1 describes the output of the integrative electronic search conducted in 2020 and the selection strategy for the retrieved references. Supplementary file 1 lists all the references retrieved through the three search strategies performed (i.e., the initial search described in Table 1, the scanning of reference lists and citations of the included studies, and the additional search launched in January 2020). The output of the initial search and of the scanning of references and citations has already been published [1].
Raw data extracted from each study are reported in the Supplementary file 2 (Excel for Windows spreadsheets). These data were utilized to report collected information in tables and figures.
Table 2 (Supplementary file 3) reports all extended data extracted from the included studies, especially data related to the research question of the review (i.e., study aim, outcome definition, independent variables, statistical analyses, and results and conclusions). Table 3 reports the evaluation of risk of bias in the included studies according to the Downs and Black instrument [3], which was customized as needed. Table 4 reports aggregated results about the frequency of academic success and lack of success according to the outcome definitions reported in the studies.
Figures 2 to 5 refer to the meta-analyses that considered nursing students' 'graduation within the regular duration of the program' as the outcome. In particular, Fig. 2 reports subgroup analyses based on the study design for the meta-analyses comparing language, gender, age, and secondary school grades; Fig. 3 reports sensitivity analysis for the meta-analyses comparing language, age, gender, and secondary school grades; Fig. 4 reports sensitivity analysis for the meta-analyses comparing type of secondary school attended, working experience in the nursing field before attending the nursing program, time to reach the university, and working while attending the nursing program; Fig. 5 reports funnel plots to assess the publication bias for the meta-analyses comparing language, age, gender, type of secondary school, secondary school grades, working experiences in the nursing field, time spent to reach the university, and working while attending the nursing program. Finally, Table 4 reports a summary of the results of meta-analyses, assessment of heterogeneity considering the study design for subgroup analysis, and sensitivity analysis as regards the outcome 'graduation within the regular duration of the program'.

Table 1 Search strategy in electronic databases.
PubMed: [...] AND "students"[All Fields])) AND ("retention (psychology)"[MeSH Terms] OR ("retention"[All Fields] AND "(psychology)"[All Fields]) OR "retention (psychology)"[All Fields] OR "retention"[All Fields])
Scopus, ERIC, and Open Grey (the same strings in each database): nursing students AND attrition; nursing students AND academic failure; nursing students AND student dropouts; nursing students AND wastage; nursing students AND withdrawal; nursing students AND academic success; nursing students AND achievement; nursing students AND retention

Experimental Design, Materials, and Methods
An extensive systematic review and meta-analysis were performed in accordance with the relevant criteria described in the 'Cochrane Handbook for Systematic Reviews of Interventions', Version 5.1.0 [2], and their reporting was checked against the relevant items of the 'Preferred Reporting Items for Systematic Reviews and Meta-Analyses' (PRISMA) checklist [4].
To identify suitable keywords for the search strategy, a pilot search was performed in Scopus using the following search strings, along with the filter 'Review': (a) nursing students and attrition; (b) nursing students and retention; (c) nursing students and dropout. Considering the keywords used in the retrieved reviews [5–11] and the aims of the present study, the following keywords were used in the search strategy: 'students, nursing', 'achievement', 'academic success', 'retention', 'attrition', 'wastage', 'academic failure', 'student dropouts', and 'withdrawal'. The electronic databases consulted were PubMed, Scopus, Education Resources Information Center (ERIC), and Open Grey. The identified keywords were combined to search for both academic success and lack of success. For academic success, the following combinations were used: (a) students, nursing and achievement; (b) students, nursing and academic success; (c) students, nursing and retention. For academic lack of success, the following strings were used: (a) students, nursing and attrition; (b) students, nursing and wastage; (c) students, nursing and academic failure; (d) students, nursing and student dropouts; (e) students, nursing and withdrawal. The search was performed on January 21st, 2019 (limited to December 31st, 2018) and updated on January 24th, 2020 (limited from January 1st, 2019 to December 31st, 2019); no further limits or filters were applied, in order to ensure a high sensitivity of the search strategy and adopt a 'broad approach' [2]. All the retrieved references were collected and managed with EndNote X7.8 for Windows (Thomson Reuters, New York). Moreover, the reference lists of the included studies and the references that had cited the included studies were retrieved through Scopus and assessed for eligibility.
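The keyword combinations listed above can be sketched in a few lines of Python. This is only an illustration of how the eight strings arise from one population term and eight outcome terms; the variable and function names are hypothetical, and the uppercase Boolean operator `AND` is an assumption, since each database has its own query syntax.

```python
# Population and outcome keywords as listed in the text.
POPULATION = "students, nursing"
SUCCESS_TERMS = ["achievement", "academic success", "retention"]
LACK_OF_SUCCESS_TERMS = [
    "attrition",
    "wastage",
    "academic failure",
    "student dropouts",
    "withdrawal",
]


def build_search_strings(population, outcome_terms):
    """Combine the population keyword with each outcome keyword.

    The 'AND' operator is an assumption made for illustration only.
    """
    return [f"{population} AND {term}" for term in outcome_terms]


success_strings = build_search_strings(POPULATION, SUCCESS_TERMS)
lack_of_success_strings = build_search_strings(POPULATION, LACK_OF_SUCCESS_TERMS)
all_strings = success_strings + lack_of_success_strings

for search_string in all_strings:
    print(search_string)
```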
To be included in the review, studies had to be observational in nature; the population of interest was undergraduate nursing students attending an academic program lasting at least three years; and all measures of academic success and lack of success assessed at least at the end of the regular duration of the nursing program were considered as outcomes.
Studies were screened for eligibility and inclusion by analysing titles/abstracts and full texts, respectively. Two raters independently screened the titles and abstracts of the retrieved references. To avoid the exclusion of potentially relevant articles, in this phase titles and abstracts had to fulfil the following broad criteria: (a) include any kind of nursing students or medical faculties; (b) describe the assessment of academic outcomes, even if generically defined as 'achievement'.

Full texts were analysed by two raters for inclusion in the review according to the following inclusion criteria: (a) the full text was available through the library resources of the University of L'Aquila; (b) the full text was published in Italian or English; (c) the full text described a quantitative non-randomized study, i.e., the study design, assessed through the 'List of study design' [2], was prospective cohort, retrospective cohort, or case-control; (d) the sample of the study included academic nursing students who attended a program that lasted at least three years; (e) the authors had considered as outcome one of the definitions of academic success or lack of success described in the literature, measured at least at the end of the regular duration of the program; (f) the authors described the assessment of factors predictive of or associated with the outcome. In this phase, studies were excluded if they involved mixed samples (e.g., nursing and midwifery students) and separate data were neither available nor obtained after contacting the authors.

Table 4 Summary of the results of meta-analysis, assessment of heterogeneity considering the study design for subgroup analysis, and sensitivity analysis as regards the outcome 'graduation within the regular duration of the program'.
Finally, if the same data were duplicated in different journals or included in larger and mixed samples, the paper presenting the highest methodological quality and the most data regarding nursing students was included. Studies that reported data on the same sample but about different variables regarding the academic outcome were included and treated as a single study for descriptive statistics of the students and their characteristics, while they were considered as separate studies when assessing the results on associated or predictive factors. The whole selection and computing processes for the PRISMA flow-chart were performed using SPSS version 25.0 (IBM Corp., Armonk, NY, USA). In both the eligibility and inclusion stages, the agreement among the judgements of the raters (interrater reliability) was estimated with Krippendorff's alpha coefficient (α), interpreted on a scale from 0 (absence of agreement beyond chance) to 1 (perfect agreement) [12]. Any disagreement between the raters was resolved by discussion with a third author until consensus was reached.

The assessment of risk of bias was performed through the 'Downs and Black instrument' [3] after having modified it as needed. The following data were extracted: general information, study characteristics, and data related to the research question of the review. Authors were contacted if needed. Both risk of bias assessment and data extraction were performed by two raters independently, and any disagreement was discussed with a third author. Descriptive data reported in the studies were synthesized to provide an overview of the included studies and samples. Moreover, to detect the possible influence of micro-, meso-, and macro-level variables on the academic outcomes, data about each possible influencing variable were synthesized by pooling studies reporting the same definition of the outcome.
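As an illustration of the interrater reliability estimate mentioned above, the following is a minimal sketch of Krippendorff's alpha for the simplest case: two raters, nominal judgements (e.g., include/exclude), and no missing ratings. The authors computed their estimate within their own workflow; the function name and the example ratings below are hypothetical and are not data from the review.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(ratings_a, ratings_b):
    """Krippendorff's alpha for two raters, nominal data, no missing values."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("Both raters must judge the same units.")

    # Coincidence matrix: with two raters, each unit contributes the
    # ordered pairs (a, b) and (b, a).
    coincidences = Counter()
    for a, b in zip(ratings_a, ratings_b):
        coincidences[(a, b)] += 1
        coincidences[(b, a)] += 1

    # Marginal totals per category and total number of pairable values.
    totals = Counter()
    for (category, _other), count in coincidences.items():
        totals[category] += count
    n = sum(totals.values())

    # Observed and expected disagreement (off-diagonal cells only).
    observed = sum(
        count for (c, k), count in coincidences.items() if c != k
    )
    expected = sum(totals[c] * totals[k] for c, k in permutations(totals, 2))
    if expected == 0:
        raise ValueError("Alpha is undefined when only one category is used.")

    return 1.0 - (n - 1) * observed / expected


# Hypothetical screening decisions (1 = eligible, 0 = not eligible).
rater_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
rater_2 = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
alpha = krippendorff_alpha_nominal(rater_1, rater_2)
print(f"Krippendorff's alpha = {alpha:.3f}")
```

Note that, computed this way, alpha can also take negative values when the raters disagree more than would be expected by chance.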
When studies reported definitions of academic success and lack of success, investigating the associated/predictive variables of both definitions, results were first summarized referring to academic success as the outcome. Afterwards, when the definition of academic lack of success provided in these studies was not complementary to the definition of success (i.e., they referred to one aspect of lack of success such as failure), results were also summarized referring to academic lack of success as the outcome.
When more than two studies reporting the same definition of the outcome and the same influencing variable were retrieved, meta-analyses were performed using the odds ratio (OR) or Cohen's d as effect sizes for categorical and continuous variables, respectively. The Random Effects Model (REM) was used to compute the meta-analyses with the free software ProMeta. Cochran's Q (χ²) and I² were calculated for each meta-analysis to assess heterogeneity. A subgroup analysis was conducted for the meta-analyses in which 'substantial' or 'considerable' heterogeneity (i.e., I² ≥ 50%) was detected; a sensitivity analysis was performed for each meta-analysis that included three or more studies. For the meta-analyses that included three or more studies, publication bias was assessed through funnel plots and tests for the asymmetry of the funnel plots (Begg and Mazumdar's rank correlation and Egger's linear regression method) [2]. Data that could not be included in the quantitative synthesis were narratively synthesized.
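The pooling and heterogeneity statistics described above can be sketched as follows. The text does not state which between-study variance estimator ProMeta applies, so the DerSimonian-Laird estimator used here is an assumption, as are the function names, the 0.5 continuity correction, and the 2×2 counts, which are hypothetical and not taken from the included studies.

```python
import math


def log_odds_ratio(events_1, total_1, events_2, total_2):
    """Log odds ratio and its variance from two groups' event counts."""
    a, b = events_1, total_1 - events_1
    c, d = events_2, total_2 - events_2
    if 0 in (a, b, c, d):
        # Assumed continuity correction for empty cells.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def random_effects_meta(effects, variances):
    """Random-effects pooling (DerSimonian-Laird, an assumption here)."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Between-study variance (tau^2) and I^2 (as a percentage).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Re-weight each study by the inverse of its total variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return {"pooled": pooled, "se": se, "q": q, "df": df, "tau2": tau2, "i2": i2}


# Hypothetical counts: (graduates, total) in two groups for three studies.
studies = [(40, 60, 30, 60), (55, 80, 45, 90), (20, 50, 25, 50)]
pairs = [log_odds_ratio(*s) for s in studies]
result = random_effects_meta([p[0] for p in pairs], [p[1] for p in pairs])

low = result["pooled"] - 1.96 * result["se"]
high = result["pooled"] + 1.96 * result["se"]
print(f"Pooled OR = {math.exp(result['pooled']):.2f} "
      f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")
print(f"Q = {result['q']:.2f} (df = {result['df']}), I2 = {result['i2']:.1f}%")
```

Pooling is done on the log odds ratio scale and back-transformed for reporting; an I² of 50% or more would, following the text, prompt a subgroup analysis by study design.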

Ethics statement
Not applicable.

Declaration of Competing Interests
The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.