Abstract
Address matching, which aims to match unstructured addresses with standard addresses in an address database, is a key part of geocoding. The core problem of address matching corresponds to text matching in natural language processing. Existing rule-based methods require human-designed templates and thus have limited applicability. Methods based on machine learning and deep learning ignore the hierarchical relations between address elements and therefore easily misclassify semantically similar but geographically different locations. We note that the hierarchy of address elements can fill this semantic gap in address matching. Inspired by how humans discriminate between addresses, we propose a multi-task learning approach that jointly recognises address elements and matches addresses, thereby incorporating the hierarchical relations between address elements into the neural network. In addition, we introduce a priori information on the hierarchical relationship of address elements through a conditional random field model. Experimental results on two benchmark datasets, the Shenzhen Address Database and the Jiangsu-Hunan Address Dataset, demonstrate the effectiveness of our approach: it achieves state-of-the-art F1 scores (i.e. the harmonic mean of precision and recall) of 99.0 and 94.2 on the two datasets, respectively.
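To make the idea of a hierarchical prior in a conditional random field more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical label set ordered from coarse to fine and a hand-written transition matrix that forbids moving from a finer address element back to a coarser one; in a trained CRF the transition scores would be learned rather than fixed. The emission scores are made-up numbers standing in for the output of a neural encoder.

```python
import math

# Hypothetical address-element labels, ordered from coarse to fine.
LABELS = ["PROVINCE", "CITY", "DISTRICT", "ROAD", "NUMBER"]


def hierarchy_transitions(n_labels, penalty=-math.inf):
    """Score 0 when the next label is at the same or a finer level, else a penalty."""
    return [[0.0 if j >= i else penalty for j in range(n_labels)]
            for i in range(n_labels)]


def viterbi(emissions, transitions):
    """Return the highest-scoring sequence of label indices.

    emissions[t][j] is the score of label j at token t;
    transitions[i][j] is the score of moving from label i to label j.
    """
    n = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n):
            # Best previous label for landing on label j at this token.
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        score = new_score
        backptrs.append(ptrs)
    # Trace the best path backwards from the best final label.
    path = [max(range(n), key=lambda j: score[j])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path


# Made-up emission scores for four tokens. Taken alone, the third token
# slightly prefers CITY, which would violate the coarse-to-fine order.
EMISSIONS = [
    [2.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.2, 1.5, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.0, 2.0],
]

if __name__ == "__main__":
    free = [[0.0] * len(LABELS) for _ in LABELS]  # no prior at all
    print([LABELS[k] for k in viterbi(EMISSIONS, free)])
    print([LABELS[k] for k in viterbi(EMISSIONS, hierarchy_transitions(len(LABELS)))])
```

Without the prior, the third token is labelled CITY even though it follows a DISTRICT; with the hierarchical transitions, decoding chooses ROAD instead, because a sequence that steps back to a coarser level is ruled out. A softer variant would use a finite negative penalty so that the prior discourages such transitions rather than forbidding them.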
Availability of data and materials
The 498,294 records of the corpus derived from the Shenzhen Address Database are available on Zenodo at https://doi.org/10.5281/zenodo.3477007. The complete corpus of the Jiangsu–Hunan Address Dataset cannot be made publicly available, in order to protect personal information and to comply with the national policy on data security.
Code availability
The code that supports the findings of this study is available at the private link: https://figshare.com/s/a815fddc2429d4bd6cb2.
Funding
This research is supported by the National Natural Science Foundation of China [62172449, 71790615, 62006251, 62172441], the Hunan Provincial Natural Science Foundation of China [2021JJ30870, 2021JJ40783, 2020JJ4746], the Changsha Municipal Natural Science Foundation [kq2014134] and the National Key Research and Development Program of China [2020YFC0832700]. This work was supported in part by the High Performance Computing Center of Central South University.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by FL, YL, XM, JD and XL. The first draft of the manuscript was written by YL, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the authors.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Ethics approval
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Li, F., Lu, Y., Mao, X. et al. Multi-task deep learning model based on hierarchical relations of address elements for semantic address matching. Neural Comput & Applic 34, 8919–8931 (2022). https://doi.org/10.1007/s00521-022-06914-1
DOI: https://doi.org/10.1007/s00521-022-06914-1