Trend Detection Using NLP as a Mechanism of Decision Support

Lobanova, P. A.; Kuzminov, I. F.; Karatetskaia, E. Yu.; Sabidaeva, E. A.; Anpilogov, V. V.

doi:10.3103/S0147688223050106

Published: 05 March 2024

Scientific and Technical Information Processing, Volume 50, pages 440–448 (2023)

Abstract

This article presents the principles of an algorithm that identifies trends by analyzing big text data and presents the results in formats convenient for decision makers, developed for implementation in the iFORA Big Data Mining System. The paper reviews existing text analytics algorithms; outlines the mathematical basis for identifying terms that denote trends, which has been proposed and tested in dozens of implemented projects; describes approaches to clustering terms based on their vectors in the Word2vec space; and gives examples of two key visualizations (semantic maps and trend maps) that outline the range of topics and trends characterizing a particular area of study, as a way to adapt the results of the analysis to the tasks of decision makers. The limitations and advantages of the proposed approach for decision support are discussed, and directions for future research are suggested.
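
The idea of clustering terms by their embedding vectors can be illustrated with a toy sketch. It rests on several assumptions that are not taken from the article: plain k-means with Euclidean distance is used as the clustering method, the 2-D vectors are hand-made stand-ins for real Word2vec embeddings, and the terms themselves are invented. The abstract does not state which algorithm, distance measure, or parameters the authors use.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Toy k-means: returns one cluster label per input vector."""
    # Assumption: deterministic start, the first k vectors act as centroids.
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical terms with invented 2-D "embeddings" (real Word2vec
# vectors usually have hundreds of dimensions).
terms = ["solar panel", "neural network", "wind turbine", "deep learning"]
vectors = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]

labels = kmeans(vectors, k=2)
for term, label in zip(terms, labels):
    print(f"cluster {label}: {term}")
```

In this toy data the two energy terms end up in one cluster and the two machine learning terms in the other. With real embeddings, cosine similarity is a common alternative to Euclidean distance.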

Notes

https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket-significantterms-aggregation.html#_jlh_score
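
The anchor in this URL points to the JLH score used by the Elasticsearch significant terms aggregation. A minimal sketch of that score follows, assuming the formula as described in the Elasticsearch documentation: the absolute change in a term's share of documents multiplied by the relative change. The function name, its parameters, and the handling of zero or negative changes are illustrative assumptions, and this page does not show how the article itself applies the score.

```python
def jlh_score(fg_count, fg_total, bg_count, bg_total):
    """JLH score of a term: absolute change times relative change.

    fg_* describe the foreground set (e.g. recent documents),
    bg_* describe the background set (e.g. the whole corpus).
    """
    if fg_total <= 0 or bg_total <= 0:
        raise ValueError("document totals must be positive")
    fg_pct = fg_count / fg_total  # share of foreground docs with the term
    bg_pct = bg_count / bg_total  # share of background docs with the term
    # Assumption: terms that are absent from the background, or that are
    # not more frequent in the foreground, get a score of zero.
    if bg_pct == 0 or fg_pct <= bg_pct:
        return 0.0
    return (fg_pct - bg_pct) * (fg_pct / bg_pct)


# Hypothetical counts: the term occurs in 50 of 1,000 recent documents
# and in 100 of 100,000 background documents.
print(jlh_score(50, 1000, 100, 100000))
```

With these invented counts the term's share rises from 0.1% to 5%, so the score is roughly 0.049 × 50 ≈ 2.45. A term whose share does not grow scores zero.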

Funding

The study was supported by the Analytical Center at the Government of the Russian Federation, agreement no. 000000D730321P5Q0002, and by HSE University, agreement dated November 2, 2021, no. 70-2021-00139, within a grant for the support of research centers in the area of artificial intelligence, including strong artificial intelligence, systems of authorized artificial intelligence, and the ethical aspects of the application of artificial intelligence.

Author information

Authors and Affiliations

HSE University, Institute for Statistical Studies and Economics of Knowledge, Moscow, Russia
P. A. Lobanova, I. F. Kuzminov, E. Yu. Karatetskaia & E. A. Sabidaeva
PJSC Sberbank, Russian Federation, Center for Validation of Corporate Investment Business Models, Moscow, Russia
V. V. Anpilogov

Corresponding authors

Correspondence to P. A. Lobanova, I. F. Kuzminov, E. Yu. Karatetskaia, E. A. Sabidaeva or V. V. Anpilogov.

Ethics declarations

The authors declare that they have no conflicts of interest.

Additional information

Publisher’s Note.

Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lobanova, P.A., Kuzminov, I.F., Karatetskaia, E.Y. et al. Trend Detection Using NLP as a Mechanism of Decision Support. Sci. Tech. Inf. Proc. 50, 440–448 (2023). https://doi.org/10.3103/S0147688223050106

Received: 06 June 2022
Published: 05 March 2024
Issue Date: December 2023
DOI: https://doi.org/10.3103/S0147688223050106
