PANDORA: A Panoramic Detection Dataset for Object with Orientation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13668))

Abstract

Panoramic images have become increasingly popular as omnidirectional panoramic technology has advanced. Many datasets and methods rely on object detection to better understand the content of panoramic images, and they use a Bounding Field of View (BFoV) as the bounding box. However, we observe that object instances in panoramic images often appear with arbitrary orientations. This indicates that the BFoV is an inappropriate bounding box and limits the performance of detectors. This paper proposes a new bounding box representation, the Rotated Bounding Field of View (RBFoV), for the panoramic image object detection task. Then, based on the RBFoV, we present a PANoramic Detection dataset for Object with oRientAtion (PANDORA). Finally, based on PANDORA, we evaluate current state-of-the-art panoramic image object detection methods and design an anchor-free object detector for panoramic images called R-CenterNet. Compared with these baselines, R-CenterNet shows advantages in detection performance. Our PANDORA dataset and source code are available at https://github.com/tdsuper/SphericalObjectDetection.
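
The abstract does not spell out how an RBFoV is parameterized. As a rough illustration only, the sketch below assumes that a BFoV is stored as a center (longitude, latitude) plus horizontal and vertical angular extents, and that an RBFoV adds a single rotation angle about the viewing direction. The class names, field names, and helper functions are hypothetical and are not taken from the PANDORA code.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BFoV:
    """Axis-aligned bounding field of view on the unit sphere (radians)."""
    lon: float    # longitude of the box center
    lat: float    # latitude of the box center
    fov_h: float  # horizontal angular extent
    fov_v: float  # vertical angular extent


@dataclass(frozen=True)
class RBFoV(BFoV):
    """BFoV plus a rotation about the viewing direction (assumed layout)."""
    angle: float = 0.0


def center_direction(box: BFoV) -> Tuple[float, float, float]:
    """Unit vector pointing from the sphere center to the box center."""
    x = math.cos(box.lat) * math.cos(box.lon)
    y = math.cos(box.lat) * math.sin(box.lon)
    z = math.sin(box.lat)
    return (x, y, z)


def to_bfov(box: RBFoV) -> BFoV:
    """Drop the rotation, keeping the same center and angular extents."""
    return BFoV(box.lon, box.lat, box.fov_h, box.fov_v)
```

Under these assumptions, an RBFoV with `angle = 0` describes the same region as its BFoV, which is one way to read the claim that the rotated representation generalizes the existing one.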

H. Xu and Q. Zhao—This work was done when Hang Xu and Qiang Zhao were at ICT.

Acknowledgements

This work is supported by the National Key Research and Development Program of China (2020YFB1406604) and the National Natural Science Foundation of China (62072438, U1936110, 61931008, U21B2024).

Author information

Corresponding authors

Correspondence to Qiang Zhao or Feng Dai.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 906 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Xu, H. et al. (2022). PANDORA: A Panoramic Detection Dataset for Object with Orientation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13668. Springer, Cham. https://doi.org/10.1007/978-3-031-20074-8_14

  • DOI: https://doi.org/10.1007/978-3-031-20074-8_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20073-1

  • Online ISBN: 978-3-031-20074-8

  • eBook Packages: Computer Science, Computer Science (R0)
