ABSTRACT
A growing volume of scientific research is now available on the Web. As the gap between the amount of academic information on the Web and human processing capacity widens, several problems have arisen: (1) lost opportunities to present research, (2) lost opportunities to gather research information, (3) an increasing burden of peer review, and (4) difficulty in selecting papers to read. Solving these problems requires a quantitative evaluation index for papers that can serve as a selection criterion.
This paper proposes quantitative evaluation methods for scientific papers based on text analysis. The similarity of a target journal to an authoritative journal is defined using distributed representations of papers. When the similarity of a target journal is high, its quality in terms of writing and organization is expected to be high. This paper also proposes an evaluation method using ROUGE (Recall-Oriented Understudy for Gisting Evaluation).
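The journal-similarity idea can be illustrated with a short sketch. The abstract says only "distributed representations of papers," so the choice of embedding model is an assumption here; the sketch represents each journal as the mean of its papers' vectors (also an assumption about aggregation) and compares journals by cosine similarity, using random vectors in place of real embeddings.

```python
# Sketch of the journal-similarity idea (not the paper's exact method).
# Assumptions: papers are already embedded as fixed-length vectors
# (e.g. by a paragraph-vector model), a journal is represented by the
# mean of its papers' vectors, and similarity is cosine similarity.
import numpy as np

def journal_vector(paper_vectors):
    """Represent a journal as the mean of its papers' embedding vectors."""
    return np.mean(paper_vectors, axis=0)

def journal_similarity(target_papers, authority_papers):
    """Cosine similarity between a target journal and an authoritative one."""
    t = journal_vector(target_papers)
    a = journal_vector(authority_papers)
    return float(np.dot(t, a) / (np.linalg.norm(t) * np.linalg.norm(a)))

# Toy data: random vectors standing in for real paper embeddings.
rng = np.random.default_rng(0)
target = rng.normal(size=(5, 50))      # 5 papers, 50-dim embeddings
authority = rng.normal(size=(8, 50))   # 8 papers, 50-dim embeddings
print(journal_similarity(target, authority))
```

A higher score would be read as the target journal's papers being textually closer to those of the authoritative journal.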
The proposed evaluation methods are assessed experimentally. The results show that the journal similarity corresponds roughly to the Scimago Journal Rank (SJR). They also suggest that the proposed methods may be able to evaluate journals that have not yet been indexed in authoritative journal indices. The evaluation method using ROUGE is shown to have potential for evaluating the consistency of papers.
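The ROUGE-based evaluation rests on n-gram recall as defined by Lin (2004). The following is a minimal ROUGE-N recall sketch for illustration only; the paper does not specify which ROUGE variant or tokenization it uses, and real evaluations would typically use a maintained ROUGE implementation.

```python
# Minimal ROUGE-N recall sketch (after Lin, 2004): the fraction of the
# reference's n-grams that also appear in the candidate text.
# Tokenization by whitespace split is a simplifying assumption.
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the reference n-gram count."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    if not ref:
        return 0.0
    overlap = sum(min(count, ref[g]) for g, count in cand.items() if g in ref)
    return overlap / sum(ref.values())

print(rouge_n_recall("the paper proposes a method",
                     "the paper proposes an evaluation method"))
```

Applied to paper evaluation, a recall-oriented overlap of this kind could, for example, measure how well one section of a paper covers the content of another, which is one way to operationalize "consistency."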
- P. Wouters. 2011. Journal ranking biased against interdisciplinary research. Retrieved August 23, 2018 from https://citationculture.wordpress.com/2011/11/15/journal-ranking-biased-against-interdisciplinary-research/.
- M. Kovanis, R. Porcher, P. Ravaud, and L. Trinquart. 2016. The global burden of journal peer review in the biomedical literature: Strong imbalance in the collective enterprise. PLoS ONE, 11 (11).
- V. P. Guerrero-Bote and F. Moya-Anegón. 2012. A further step forward in measuring journals' scientific prestige: The SJR2 indicator. Journal of Informetrics, 6 (4), 674--688.
- E. Callaway. 2016. Publishing elite turns against impact factor. Nature, 535 (14), 210--211.
- E. Garfield. 2006. The history and meaning of the journal impact factor. Journal of the American Medical Association, 295, 90--93.
- C. Y. Lin and E. Hovy. 2003. Automatic evaluation of summaries using N-gram co-occurrence statistics. Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology (NAACL '03), 71--78.
- C. Y. Lin. 2004. ROUGE: A package for automatic evaluation of summaries. Proceedings of the Workshop on Text Summarization Branches Out (WAS 2004), 25--26.
- J. Mingers and L. Leydesdorff. 2015. A review of theory and practice in scientometrics. European Journal of Operational Research, 246 (1), 1--19.
- R. M. Alguliyev and R. M. Aliguliyev. 2016. Modified impact factors. Journal of Scientometric Research, 3, 197--208.
- J. D. West, T. C. Bergstrom, and C. T. Bergstrom. 2010. The Eigenfactor Metrics: A network approach to assessing scholarly journals. College & Research Libraries, 71 (3), 236--244.
- H. F. Moed. 2010. Measuring contextual citation impact of scientific journals. Journal of Informetrics, 3, 265--277.
- J. E. Hirsch. 2005. An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 102 (46), (Nov. 2005).
- T. Mikolov, K. Chen, G. Corrado, and J. Dean. 2013. Efficient estimation of word representations in vector space. In Proceedings of the Workshop of the First International Conference on Learning Representations (ICLR 2013).
- T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean. 2013. Distributed representations of words and phrases and their compositionality. Proceedings of the 26th International Conference on Neural Information Processing Systems, 2, 3111--3119.
- Q. Le and T. Mikolov. 2014. Distributed representations of sentences and documents. Proceedings of the 31st International Conference on Machine Learning, 32 (2), 1188--1196.
- A. Joulin, E. Grave, P. Bojanowski, and T. Mikolov. 2017. Bag of tricks for efficient text classification. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2, 427--431.
- P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135--146.
- Z. S. Harris. 1954. Distributional structure. WORD, 10 (2--3), 146--162.
- A. M. Dai, C. Olah, and Q. V. Le. 2014. Document embedding with paragraph vectors. Neural Information Processing Systems (NIPS) Deep Learning Workshop.
- D. M. Blei, A. Y. Ng, and M. I. Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993--1022.