A multi-perspective analysis of retractions in life sciences

Scientometrics

Abstract

This study explores trends in retracted publications in the life and biomedical sciences along several axes: time, countries, journals and impact factors, and topics. The dataset comprises nearly seven thousand publications, the entirety of retractions visible through PubMed as of August 2019. Data from PubMed, Wikipedia, and WikiData were collected, combined, and analyzed with respect to these axes. Importantly, I employ state-of-the-art analysis and visualization techniques from natural language processing (NLP) to understand the topics in the retracted literature. To highlight a few results, the analyses demonstrate an increasing rate of retraction over time and noticeable differences in publication quality (as measured by journal impact factor) among the top countries. Moreover, while molecular biology and cancer dominate retractions, a number of retractions are not related to biology. The methods and results of this study can be applied to continuously monitor the nature and evolution of retractions in life sciences, thus contributing to the health of this research ecosystem.
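As a rough illustration of two of the axes named in the abstract (time and impact factor by country), the following minimal sketch computes a per-year retraction rate and a per-country mean impact factor. It is not the author's pipeline. The toy records, the yearly publication totals, the normalization per 10,000 publications, and the function names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical toy records: (retraction_year, country, journal_impact_factor).
# These values are invented for illustration; they are not from the study's dataset.
records = [
    (2010, "A", 3.0),
    (2010, "B", 1.5),
    (2015, "A", 5.0),
    (2015, "A", 4.0),
    (2015, "B", 2.5),
]

# Hypothetical total publication counts per year, used to normalize raw counts.
total_published = {2010: 20_000, 2015: 25_000}


def retraction_rate_per_10k(records, totals):
    """Return {year: retractions per 10,000 publications in that year}."""
    counts = Counter(year for year, _, _ in records)
    return {year: 10_000 * counts[year] / totals[year] for year in sorted(counts)}


def mean_impact_factor_by_country(records):
    """Return {country: mean impact factor over that country's retracted papers}."""
    by_country = defaultdict(list)
    for _, country, impact_factor in records:
        by_country[country].append(impact_factor)
    return {country: mean(values) for country, values in by_country.items()}


rates = retraction_rate_per_10k(records, total_published)
impact = mean_impact_factor_by_country(records)
print(rates)
print(impact)
```

Normalizing by total output separates a genuine rise in the retraction rate from growth in the overall volume of publications. Whether the study normalizes in this way is not stated in the abstract.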

Data availability

The dataset is available in the GitHub repository (https://github.com/bbhatt001/Retracted_Life_Sciences_Literature).

Author information

Contributions

Bhatt designed the study, collected and analyzed the data and wrote the paper.

Corresponding author

Correspondence to Bhumika Bhatt PhD.

Ethics declarations

Conflict of interest

This research did not receive any funding. The author declares no conflict of interest.

About this article

Cite this article

Bhatt, B. A multi-perspective analysis of retractions in life sciences. Scientometrics 126, 4039–4054 (2021). https://doi.org/10.1007/s11192-021-03907-0

  • DOI: https://doi.org/10.1007/s11192-021-03907-0
