An Iterative Approach to the Terminology Extraction from Ukrainian-Language Scientific Text Corpora

Cybernetics and Systems Analysis

Abstract

This article describes a combined method for acquiring valuable terms and relations from raw text, built around an iterative algorithm for automated terminology extraction from Ukrainian-language scientific texts. Special attention is paid to analyzing the lexicographical features of characteristic text fragments in documents, and the specific features of Ukrainian-language documents are taken into account. The emphasis is on the applied problem of acquiring terminology from input texts in the widely used PDF format and producing the resulting term relations in RDF format.
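The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what an iterative extraction loop with RDF output might look like. It is not the authors' method. It assumes the text has already been extracted from PDF, grows a term set from hypothetical seed terms by sentence-level co-occurrence, and writes the relations as N-Triples. The `example.org` namespace, the choice of `skos:related`, the `min_count` threshold, and the token-length filter (a crude stand-in for stop-word removal) are all assumptions. A realistic pipeline for Ukrainian would also need lemmatization, which is omitted here.

```python
import re
from collections import Counter
from urllib.parse import quote

BASE = "http://example.org/term/"  # hypothetical namespace (assumption)
RELATED = "http://www.w3.org/2004/02/skos/core#related"  # assumed predicate


def sentences(text):
    """Split text into lowercase token lists, one list per sentence."""
    parts = re.split(r"[.!?]+", text)
    return [re.findall(r"\w+", p.lower()) for p in parts if p.strip()]


def extract_terms(text, seeds, min_count=2, max_iter=10):
    """Grow the term set until an iteration adds nothing new.

    A token becomes a term when it shares a sentence with an already
    known term at least `min_count` times (illustrative rule only).
    """
    sents = sentences(text)
    terms = set(seeds)
    relations = set()
    for _ in range(max_iter):
        counts = Counter()
        for tokens in sents:
            known = terms.intersection(tokens)
            # Tokens of length <= 2 are skipped as a crude stop-word filter.
            candidates = {t for t in tokens if t not in terms and len(t) > 2}
            for k in known:
                for c in candidates:
                    counts[(k, c)] += 1
        new = {pair for pair, n in counts.items() if n >= min_count}
        if not new:
            break  # fixed point reached: no new terms this iteration
        relations |= new
        terms |= {c for _, c in new}
    return terms, relations


def to_ntriples(relations):
    """Serialize (term, related term) pairs as sorted N-Triples lines."""
    return sorted(
        f"<{BASE}{quote(a)}> <{RELATED}> <{BASE}{quote(b)}> ."
        for a, b in relations
    )


if __name__ == "__main__":
    sample = ("A thesaurus stores terms. A thesaurus links terms. "
              "Terms form relations. Terms define relations.")
    found_terms, found_relations = extract_terms(sample, {"thesaurus"})
    for line in to_ntriples(found_relations):
        print(line)
```

The loop stops either at a fixed point or after `max_iter` passes. Each pass uses the terms found so far as context for discovering further candidates, which is the sense in which the sketch is iterative.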

Author information

Correspondence to A. M. Glybovets.

Additional information

Translated from Kibernetika i Sistemnyi Analiz, No. 6, pp. 53–62, November–December, 2014.

About this article

Cite this article

Glybovets, A.M., Reshetnov, I.V. An Iterative Approach to the Terminology Extraction from Ukrainian-Language Scientific Text Corpora. Cybern Syst Anal 50, 866–873 (2014). https://doi.org/10.1007/s10559-014-9677-6

  • DOI: https://doi.org/10.1007/s10559-014-9677-6
