Multi-task deep learning model based on hierarchical relations of address elements for semantic address matching

  • Original Article
  • Neural Computing and Applications

Abstract

Address matching, which aims to match unstructured addresses with standard addresses in an address database, is a key part of geocoding. Its core problem corresponds to text matching in natural language processing. Existing rule-based methods require human-designed templates and thus have limited applicability. Methods based on machine learning and deep learning ignore the hierarchical relations between address elements, and therefore easily misclassify locations that are semantically similar but geographically different. We note that the hierarchy of address elements can fill this semantic gap in address matching. Inspired by how humans discriminate between addresses, we propose a multi-task learning approach that jointly recognises the address elements and matches the addresses, thereby incorporating the hierarchical relations between address elements into the neural network. In addition, we introduce a priori information on the hierarchical relations of address elements through a conditional random field model. Experimental results on two benchmark datasets, the Shenzhen Address Database and the Jiangsu–Hunan Address Dataset, demonstrate the effectiveness of our approach: we achieve state-of-the-art F1 scores (i.e. the harmonic mean of precision and recall) of 99.0 and 94.2 on the two datasets, respectively.
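As a minimal illustration of the metric quoted in the abstract, the sketch below computes precision, recall and their harmonic mean (F1) for binary match/non-match labels. The function name `precision_recall_f1` and the toy labels are assumptions made for this example; they do not come from the authors' code or datasets.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, where 1 = matched pair."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted matches
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted match, actually not
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed matches

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # harmonic mean
    return precision, recall, f1


# Hypothetical labels for six address pairs (illustrative only).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

The scores in the abstract appear to be reported on a 0–100 scale, so a value from this function would be multiplied by 100 for comparison.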

Availability of data and materials

The 498,294 records of the corpus derived from the Shenzhen Address Database are available in Zenodo with the identifier https://doi.org/10.5281/zenodo.3477007. The complete corpus from the Jiangsu–Hunan Address Dataset cannot be made publicly available, in order to protect personal information and to comply with the national policy on data security.

Code availability

The code that supports the findings of this study is available at the private link: https://figshare.com/s/a815fddc2429d4bd6cb2.

Funding

This research was supported by the National Natural Science Foundation of China [62172449, 71790615, 62006251, 62172441], the Hunan Provincial Natural Science Foundation of China [2021JJ30870, 2021JJ40783, 2020JJ4746], the Changsha Municipal Natural Science Foundation [kq2014134] and the National Key Research and Development Program of China [2020YFC0832700]. This work was also supported in part by the High Performance Computing Center of Central South University.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by FL, YL, XM, JD and XL. The first draft of the manuscript was written by YL, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xingliang Mao.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the authors.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Ethics approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, F., Lu, Y., Mao, X. et al. Multi-task deep learning model based on hierarchical relations of address elements for semantic address matching. Neural Comput & Applic 34, 8919–8931 (2022). https://doi.org/10.1007/s00521-022-06914-1

  • DOI: https://doi.org/10.1007/s00521-022-06914-1
