Abstract
Data sparsity has long been a problem in 3D geological modeling. The geometric, topological, and attribute information about geological bodies recorded in geological reports provides important constraints for 3D geological modeling. However, manually extracting complex and diverse constraint knowledge from large volumes of text is challenging and time-consuming. Advances in information extraction and text mining have made it possible to extract such textual constraint information automatically. To this end, this study first summarized the textual description characteristics of geological body constraint information in geological reports and used a span-based tagging scheme for data annotation. Second, a span-based joint entity and relation extraction framework was introduced to extract constraint information for 3D geological modeling. The framework improves the extraction of this constraint information by using the BERT model to obtain deep semantic information about the characters, and it jointly performs entity classification and relation classification on candidate entities. Finally, in the experiments, a Chinese geological survey report was used as the training data for evaluation, and we validated the method's effectiveness by comparing its results with those of different models. We further compared and analyzed the impact of different parameters and span representations on the model's performance.
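The span-based idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_width` parameter, the example sentence, and the hand-picked entity spans are all assumptions made for illustration, and the BERT encoder and the trained classifiers are omitted entirely. The sketch only shows how candidate spans are enumerated over characters and how candidate entity pairs are formed for relation classification.

```python
from itertools import permutations
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def enumerate_spans(tokens: List[str], max_width: int) -> List[Span]:
    """Return every contiguous span of at most max_width tokens."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_width, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def candidate_pairs(entity_spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return ordered (head, tail) pairs of non-overlapping entity spans."""
    pairs = []
    for head, tail in permutations(entity_spans, 2):
        # Keep the pair only if the two spans do not share any character.
        if head[1] <= tail[0] or tail[1] <= head[0]:
            pairs.append((head, tail))
    return pairs


# Hypothetical example sentence ("sandstone is 5 m thick"), split into characters.
tokens = list("砂岩厚5米")
spans = enumerate_spans(tokens, max_width=2)

# Stand-in for a trained span classifier: entity spans chosen by hand.
entities = [(0, 2), (3, 5)]
pairs = candidate_pairs(entities)
```

In a full model, each span in `spans` would be scored by an entity classifier, and each pair in `pairs` would be scored by a relation classifier; here both steps are replaced by fixed inputs so that the enumeration logic can be read in isolation.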
Data availability
The data in this manuscript have not been published elsewhere.
Acknowledgements
We would like to thank the anonymous reviewers for their careful reading of this paper and their very useful comments. We thank the Shandong Institute of Geological Survey for providing data support.
Funding
This research was funded by the Perspective on Shandong: Geological Information Integration and Comprehensive Utilization Project, grant number LuKanZi (2022) No. 16, and the Shandong Province Science and Technology Small and Medium-sized Enterprises Innovation Ability Enhancement Project, grant number 2023TSGC0094.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation and data collection: C.Z., C.L., H.Z., B.L.; methodology: C.Z., C.L., H.Z., Y.M.; performed the experiments: C.Z., H.Z., G.S., Z.L.; analyzed the data: C.Z., H.Z., Z.L., B.L.; writing (original draft preparation): C.Z., C.L., H.Z.; writing (review and editing): Y.M., G.S., Z.L., B.L. All authors reviewed the final manuscript.
Ethics declarations
Ethical approval and consent to participate
Not applicable.
Consent for publication
Written informed consent for publication was obtained from all participants.
Competing interests
The authors declare no competing interests.
Additional information
Communicated by H. Babaie
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhuang, C., Liu, C., Zhu, H. et al. Constraint information extraction for 3D geological modelling using a span-based joint entity and relation extraction model. Earth Sci Inform 17, 985–998 (2024). https://doi.org/10.1007/s12145-024-01245-2
DOI: https://doi.org/10.1007/s12145-024-01245-2