Abstract
Heritage data is often available only in unstructured form, particularly as free text. In this paper, we aim to extract instances of predefined relations between persons and real estate properties from historical notices written in French. Using several vector-based representations and supervised learning algorithms, we build classifiers that achieve an F-measure between 75% and 85% for relation detection. Our results show that performance depends strongly on the type of relation and on the evaluation metric used. Our best results are obtained with a TF-IDF vector representation combined with a support vector machine classifier, or with Word2Vec vectors combined with a multilayer perceptron classifier.
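Since the reported scores are F-measures, it may help to recall how such a score is computed for a single relation. The following is a minimal sketch, not the authors' evaluation code: it assumes each candidate person-property pair carries a binary label (1 if the relation holds, 0 otherwise), and the function name `precision_recall_f1` and the label lists are hypothetical illustrations rather than data from the paper.

```python
def precision_recall_f1(gold, predicted, positive=1):
    """Return (precision, recall, F1) for one relation treated as the positive class."""
    pairs = list(zip(gold, predicted))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    # Guard against empty denominators (no predicted or no gold positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical labels for ten candidate pairs (1 = relation present).
gold = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Because precision and recall can diverge for rare relations, the same classifier may look stronger or weaker depending on which of the three figures is reported, which is consistent with the abstract's remark that results depend on the evaluation metric.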
Acknowledgements
This work has been funded by the Quebec Ministry of Culture and Communication.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Ferry, F., Zouaq, A., Gagnon, M. (2018). Automatic Identification of Relations in Quebec Heritage Data. In: Ioannides, M., et al. Digital Heritage. Progress in Cultural Heritage: Documentation, Preservation, and Protection. EuroMed 2018. Lecture Notes in Computer Science, vol 11196. Springer, Cham. https://doi.org/10.1007/978-3-030-01762-0_16
DOI: https://doi.org/10.1007/978-3-030-01762-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01761-3
Online ISBN: 978-3-030-01762-0
eBook Packages: Computer Science, Computer Science (R0)