A cross-view geo-localization method guided by relation-aware global attention

Sun, Jing; Yan, Rui; Zhang, Bing; Zhu, Bing; Sun, Fuming

doi:10.1007/s00530-023-01101-1

Regular Paper
Published: 09 May 2023

Volume 29, pages 2205–2216, (2023)
Multimedia Systems

Jing Sun¹, Rui Yan¹, Bing Zhang¹, Bing Zhu² & Fuming Sun¹

Abstract

Cross-view geo-localization uses a query image to retrieve images of the same geographical location captured from a different platform. Most existing methods do not adequately consider the effect of image structural information on cross-view geo-localization, so the extracted features cannot fully characterize the image, which reduces localization accuracy. To address this, this paper proposes a cross-view geo-localization method guided by relation-aware global attention, which captures rich global structural information by integrating the attention mechanism with the feature extraction network, thus improving the representation ability of the features. In addition, considering the important role of semantic and context information in geo-localization, a joint training structure with parallel global and local branches is designed to mine multi-scale context features for image matching, which further improves the accuracy of cross-view geo-localization. Quantitative and qualitative experimental results on the University-1652, CVUSA, and CVACT datasets show that the proposed algorithm outperforms other advanced methods in recall accuracy (Recall) and image retrieval average precision (AP).
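
The two reported metrics, Recall and AP, are retrieval measures computed per query over a ranked gallery. The following is a minimal sketch of how they can be computed, assuming cosine similarity between feature vectors and the standard non-interpolated definition of average precision; the function names are hypothetical, and the evaluation protocol used in the paper may differ in detail.

```python
import numpy as np

def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted by descending cosine similarity (assumed metric)."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q
    return np.argsort(-sims, kind="stable")

def recall_at_k(ranking, positives, k):
    """1.0 if any true match appears among the top-k results, else 0.0."""
    return float(np.isin(ranking[:k], positives).any())

def average_precision(ranking, positives):
    """Mean of the precision values at each rank where a true match is retrieved."""
    hits = np.isin(ranking, positives)
    if not hits.any():
        return 0.0
    cum_hits = np.cumsum(hits)                # number of matches found so far
    ranks = np.arange(1, len(ranking) + 1)    # 1-based rank positions
    return float((cum_hits[hits] / ranks[hits]).mean())

# Toy example: four gallery features, one query, gallery items 1 and 2 are true matches.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
query = np.array([1.0, 0.1])
ranking = rank_gallery(query, gallery)
print(recall_at_k(ranking, [1, 2], k=1), average_precision(ranking, [1, 2]))
```

Averaging these per-query values over all queries gives dataset-level Recall@K and AP figures of the kind the abstract refers to.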

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements

This work was partly supported by the National Natural Science Foundation of China under Grants 61976042 and 61972068, the Innovative Talents Program for Liaoning Universities under Grant LR2019020, the Liaoning Revitalization Talents Program under Grant XLYC2007023, and the Applied Basic Research Project of Liaoning Province under Grant 2022JH2/101300279.

Author information

Authors and Affiliations

School of Information and Communication Engineering, Dalian Minzu University, Liaohe West Road, Dalian, 116600, Liaoning, China
Jing Sun, Rui Yan, Bing Zhang & Fuming Sun
Department of Information Engineering, Harbin Institute of Technology, Xidazhi Street, Harbin, 150006, Heilongjiang, China
Bing Zhu

Corresponding author

Correspondence to Fuming Sun.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, J., Yan, R., Zhang, B. et al. A cross-view geo-localization method guided by relation-aware global attention. Multimedia Systems 29, 2205–2216 (2023). https://doi.org/10.1007/s00530-023-01101-1

Received: 20 February 2023
Accepted: 26 April 2023
Published: 09 May 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s00530-023-01101-1
