Research misconduct in health and life sciences research: A systematic review of retracted literature from Brazilian institutions

Background: Measures to ensure research integrity have been widely discussed due to the social, economic and scientific impact of research integrity. In the past few years, financial support for health research in emerging countries has steadily increased, resulting in a growing number of scientific publications. These achievements, however, have been accompanied by a rise in retracted publications followed by concerns about the quality and reliability of such publications.

Objective: This systematic review aimed to investigate the profile of medical and life sciences research retractions from authors affiliated with Brazilian academic institutions. The chronological trend between publication and retraction date, reasons for the retraction, citation of the article after the retraction, study design, and the number of retracted publications by author and affiliation were assessed. Additionally, the quality, availability and accessibility of data regarding retracted papers from the publishers are described.

Methods: Two independent reviewers searched for articles that had been retracted since 2004 via PubMed, Web of Science, Biblioteca Virtual em Saúde (BVS) and Google Scholar databases. Indexed keywords from Medical Subject Headings (MeSH) and Descritores em Ciências da Saúde (DeCS) in Portuguese, English or Spanish were used. Data were also collected from the Retraction Watch website (www.retractionwatch.com). This study was registered with the PROSPERO systematic review database (CRD42017071647).

Results: A final sample of 65 articles was retrieved from 55 different journals with reported impact factors ranging from 0 to 32.86, with a median value of 4.40 and a mean of 4.69. The types of documents found were erratum (1), retracted articles (3), retracted articles with a retraction notice (5), retraction notices with erratum (3), and retraction notices (45). The assessment of the Retraction Watch website added 8 articles that were not identified by the search strategy using the bibliographic databases. The retracted publications covered a wide range of study designs. Experimental studies (40) and literature reviews (15) accounted for 84.6% of the retracted articles. Within the field of health and life sciences, medical science was the field with the largest number of retractions (34), followed by biological sciences (17). Some articles were retracted for at least two distinct reasons (13). Among the retrieved articles, plagiarism was the main reason for retraction (60%). Missing data were found in 57% of the retraction notices, which was a limitation to this review. In addition, 63% of the articles were cited after their retraction.

Conclusion: Publications are not retracted solely for research misconduct but also for honest error. Nevertheless, considering authors affiliated with Brazilian institutions, this review concluded that most of the retracted health and life sciences publications were retracted due to research misconduct. Because the number of publications is the most valued indicator of scientific productivity for funding and career progression purposes, a systematic effort from the national research councils, funding agencies, universities and scientific journals is needed to avoid an escalating trend of research misconduct. More investigations are needed to comprehend the underlying factors of research misconduct and its increasing manifestation.


Association between impact factor and number of citations before retraction
Pearson's correlation coefficient can be calculated for these two variables. However, it can be used for population inference only when the joint distribution of the variables is normal. Before conducting the test, the distribution and normality of each variable were assessed. If the variables were not normally distributed, a nonparametric test (Spearman's correlation test) would be needed to evaluate the correlation.

Distribution of impact factor and number of citations before retraction
First, the data distribution was evaluated graphically. The plot suggests that the two variables have a positive linear correlation.

Assessment of normality for number of citations before retraction
To evaluate whether this variable is normally distributed, a histogram was plotted and the Shapiro-Wilk test was performed. With a p-value < 0.05, the null hypothesis of normality is rejected, so the variable is not normally distributed.

Assessment of normality for impact factor
To evaluate whether this variable is normally distributed, a histogram was plotted and the Shapiro-Wilk test was performed. With a p-value < 0.05, the null hypothesis of normality is rejected, so the variable is not normally distributed.

Assessment of multivariate normality
Multivariate normality (for impact factor and number of citations before retraction) was assessed using the Henze-Zirkler test. With a p-value < 0.05, the null hypothesis of normality is rejected, so the two variables are not jointly normal.

Use of Pearson's correlation test
Because the variables are not normally distributed, Pearson's correlation test cannot be used for population inference. However, it can still describe the relationship between the variables in this sample. The test gave a correlation coefficient of 0.21, indicating a weak positive correlation that is not statistically significant (p = 0.10). The test result is as follows:

Pearson's product-moment correlation
data: as.numeric(Dados$`Citations BEFORE retraction`) and as.numeric(Dados$`Impact Factor (last 5 years)`)
t = 1.666, df = 63, p-value = 0.1007
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval: -0.04050932 0.42788096
sample estimates: cor 0.2054194
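As a cross-check, the t statistic and confidence interval in the output above follow from the correlation coefficient and the degrees of freedom alone. The sketch below is an illustration in Python, not the analysis code used for this review. It assumes the standard formulas: t = r * sqrt(df / (1 - r^2)) with df = n - 2, and a confidence interval from the Fisher z-transformation. The function name `pearson_t_and_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def pearson_t_and_ci(r, n, level=0.95):
    """Return (t, lower, upper) for a sample Pearson correlation r from n pairs.

    Assumes the textbook formulas: a t statistic with n - 2 degrees of
    freedom and a Fisher z-transformation confidence interval.
    """
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))

    # Fisher z-transformation: atanh(r) is approximately normal
    # with standard error 1 / sqrt(n - 3).
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)

    lower = math.tanh(z - crit * se)
    upper = math.tanh(z + crit * se)
    return t, lower, upper


# Values taken from the reported output: cor = 0.2054194, df = 63 (so n = 65).
t, lower, upper = pearson_t_and_ci(r=0.2054194, n=65)
print(f"t = {t:.3f}, 95% CI = [{lower:.4f}, {upper:.4f}]")
```

The values this produces agree with the reported t statistic and confidence interval to the precision shown. The interval includes zero, which is consistent with the p-value above 0.05.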

Use of Spearman's correlation test
Spearman's correlation test does not require the variables to be normally distributed. It is a nonparametric test that is appropriate given the distribution of the variables and the study sample size. The test gave a correlation coefficient of 0.43, indicating a moderate correlation between impact factor and number of citations before retraction (p-value < 0.05).
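To illustrate why Spearman's coefficient does not depend on normality, the sketch below computes it as Pearson's correlation applied to ranks, with tied values receiving the average of their rank positions. This is a minimal standard-library illustration, not the code used for this review. The function names and the toy data are hypothetical and are not taken from the study dataset.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data with one extreme observation (not study data).
impact_factor = [1.2, 2.5, 3.1, 4.4, 30.0]
citations = [0, 3, 2, 10, 400]

print("Spearman rho:", spearman(impact_factor, citations))
print("Pearson r:   ", pearson(impact_factor, citations))
```

Because only the ordering of the observations enters the calculation, replacing the extreme value 400 with any other value larger than 10 leaves Spearman's rho unchanged, whereas Pearson's r would change.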

Association between impact factor and number of citations after retraction
Pearson's correlation coefficient can be calculated for these two variables. However, it can be used for population inference only when the joint distribution of the variables is normal. Before conducting the test, the distribution and normality of each variable were assessed. If the variables were not normally distributed, a nonparametric test (Spearman's correlation test) would be needed to evaluate the correlation.

Distribution of impact factor and number of citations after retraction
First, the data distribution was evaluated graphically. The plot suggests that the two variables have a positive linear correlation.
Plot: Citations AFTER retraction × Impact Factor (last 5 years)

Assessment of normality for number of citations after retraction
To evaluate whether this variable is normally distributed, a histogram was plotted and the Shapiro-Wilk test was performed. With a p-value < 0.05, the null hypothesis of normality is rejected, so the variable is not normally distributed.

Assessment of multivariate normality
Multivariate normality (for impact factor and number of citations after retraction) was assessed using Mardia's multivariate normality test. With a p-value < 0.05, the null hypothesis of normality is rejected, so the two variables are not jointly normal.

Use of Spearman's correlation test
Spearman's correlation test does not require the variables to be normally distributed. It is a nonparametric test that is appropriate given the distribution of the variables and the study sample size. Its results can therefore be used for population inference.
The null hypothesis was that there is no correlation between the two variables. The result is as follows:

Spearman's rank correlation rho
data: as.numeric(Dados$`N Citções pós retratação`) and as.numeric(Dados$`Fator de Impacto (ultimos 5 anos)`)
S = 6777.8, p-value = 3.587e-09
alternative hypothesis: true rho is not equal to 0
sample estimates: rho 0.7106711

The test gave a correlation coefficient of 0.71, indicating a strong correlation between impact factor and number of citations after retraction (p-value < 0.05).