Automatic Identification of Relations in Quebec Heritage Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11196)

Abstract

Heritage data is often available only in unstructured form, especially as text. In this paper, our objective is to extract instances of predefined relations between persons and real estate properties from historical notices written in French. Using several vector-based representations and supervised learning algorithms, we build classifiers that achieve an F-measure between 75% and 85% for relation detection. Our results show that performance depends heavily on the type of relation and on the specific evaluation metric. Our best results are obtained using a TF-IDF vector representation with a support vector machine classifier, or Word2Vec vectors combined with a multilayer perceptron classifier.
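To make the reported metric concrete, the following is a minimal sketch of how precision, recall, and the F-measure (F1) can be computed for a binary relation-detection task. The labels are invented for illustration and do not come from the paper; the function name `precision_recall_f1` is likewise an assumption, not the authors' code.

```python
from typing import Sequence, Tuple


def precision_recall_f1(gold: Sequence[int], predicted: Sequence[int]) -> Tuple[float, float, float]:
    """Return precision, recall and F1 for the positive class (label 1)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical data: 1 = the sentence expresses the relation, 0 = it does not.
gold = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a classifier that trades one for the other can score quite differently depending on which metric is reported, which is one reason results may vary with the evaluation metric chosen.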

Notes

  1. http://www.patrimoine-culturel.gouv.qc.ca/rpcq/detail.do?methode=consulter&id=100101&type=bien

  2. https://wiki.dbpedia.org/

  3. https://wordnet.princeton.edu/

  4. http://www.cidoc-crm.org/

  5. http://www.patrimoine-culturel.gouv.qc.ca/

  6. http://scikit-learn.org/

  7. radimrehurek.com/gensim/models/word2vec.html

Acknowledgements

This work has been funded by the Quebec Ministry of Culture and Communication.

Author information

Corresponding author

Correspondence to Amal Zouaq.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Ferry, F., Zouaq, A., Gagnon, M. (2018). Automatic Identification of Relations in Quebec Heritage Data. In: Ioannides, M., et al. Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. EuroMed 2018. Lecture Notes in Computer Science, vol 11196. Springer, Cham. https://doi.org/10.1007/978-3-030-01762-0_16

  • DOI: https://doi.org/10.1007/978-3-030-01762-0_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01761-3

  • Online ISBN: 978-3-030-01762-0

  • eBook Packages: Computer Science, Computer Science (R0)
