Abstract
Scientists routinely collaborate with one another and sometimes change their affiliations, which gives rise to scientific mobility. This paper proposes a recursive reinforced name disambiguation method that integrates coauthorship and affiliation information and is designed for cases of scientific collaboration and mobility. The method is evaluated on a dataset from the Thomson Reuters Scientific “Web of Science”, and the recall and precision of the algorithm are then analyzed. To show the effect of name ambiguity on the h-index and g-index, the distributions of both indices before and after name disambiguation are also presented. Evaluation experiments show that disambiguation using only affiliation information performs better than disambiguation using only coauthorship information; however, the proposed method, which integrates both kinds of information, controls the bias caused by name ambiguity to a greater extent.
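To illustrate why name ambiguity biases the h-index and g-index, the following minimal sketch computes both indices from their standard definitions (Hirsch 2005; Egghe 2006) for two authors who share a name, first separately and then with their records wrongly merged. The citation counts and the variable names `author_a`, `author_b` and `merged` are hypothetical and are not taken from the Web of Science dataset used in the paper; the sketch does not implement the proposed disambiguation method.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least g^2 citations."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for two distinct authors with the same name.
author_a = [10, 8, 5, 4, 3]
author_b = [6, 6, 2, 1]

# Without disambiguation, both publication lists are attributed to one person.
merged = author_a + author_b

for label, record in [("author A", author_a), ("author B", author_b), ("merged", merged)]:
    print(label, "h =", h_index(record), "g =", g_index(record))
```

With these hypothetical counts, the merged record has h = 5 and g = 6, higher than either author alone (h = 4, g = 5 and h = 2, g = 3), which is the kind of inflation that disambiguation is meant to remove.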
Acknowledgments
This work was supported in part by the ISTIC-THOMSON Joint Scientometrics Lab Fund (Grant No. IT2012004) and in part by the China National Natural Science Fund (Grant No. 71101059).
Cite this article
Wu, J., Ding, XH. Author name disambiguation in scientific collaboration and mobility cases. Scientometrics 96, 683–697 (2013). https://doi.org/10.1007/s11192-013-0978-8
DOI: https://doi.org/10.1007/s11192-013-0978-8