Trend Detection Using NLP as a Mechanism of Decision Support

Scientific and Technical Information Processing

Abstract

This article presents the principles of an algorithm that identifies trends by analyzing big text data and presents the results in formats convenient for decision makers; the algorithm is designed for implementation in the iFORA Big Data Mining System. The paper reviews existing text analytics algorithms; outlines the mathematical basis for identifying terms that signify trends, which was proposed and tested in dozens of completed projects; describes approaches to clustering terms based on their vectors in the Word2vec space; and gives examples of two key visualizations (semantic maps and trend maps) that outline the range of topics and trends characterizing a particular area of study, as a way to adapt the results of the analysis to the tasks of decision makers. The limitations and advantages of using the proposed approach for decision support are discussed, and directions for future research are suggested.

Notes

  1. https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket-significantterms-aggregation.html#_jlh_score

Funding

The study was supported by the Analytical Center at the Government of the Russian Federation (agreement no. 000000D730321P5Q0002) and by the HSE University (agreement no. 70-2021-00139, dated November 2, 2021) as part of a grant supporting research centers in the area of artificial intelligence, including strong artificial intelligence, systems of authorized artificial intelligence, and the ethical aspects of applying artificial intelligence.

Author information

Corresponding authors

Correspondence to P. A. Lobanova, I. F. Kuzminov, E. Yu. Karatetskaia, E. A. Sabidaeva or V. V. Anpilogov.

Ethics declarations

The authors declare that they have no conflicts of interest.

Additional information

Publisher’s Note.

Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lobanova, P.A., Kuzminov, I.F., Karatetskaia, E.Y. et al. Trend Detection Using NLP as a Mechanism of Decision Support. Sci. Tech. Inf. Proc. 50, 440–448 (2023). https://doi.org/10.3103/S0147688223050106

  • DOI: https://doi.org/10.3103/S0147688223050106
