Representation Learning for Point Clouds with Variational Autoencoders

Conference paper in: Computer Vision – ECCV 2022 Workshops (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13806)

Abstract

Deep generative networks provide a way to generalize complex multi-dimensional data such as 3D point clouds. In this work, we present a novel method that operates on depth images and, by using geometric images, learns a representation of discrete 3D points with variational autoencoders (VAEs). Traditional VAE solutions failed to capture sharply compressed 3D data; with a constrained variational framework that introduces additional hyperparameters, however, we learned the representation of 3D data successfully. To tune these hyperparameters, we applied Bayesian optimization over the hyperparameter space of the VAE. The results were validated on large-scale public data, and the code and demos are available on the authors' website: https://github.com/molnarszilard/GIPC_rele.
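
As a rough illustration of what a constrained variational objective with an extra hyperparameter can look like, the sketch below implements a beta-weighted VAE loss in plain Python. The abstract does not spell out the authors' loss, so everything here is an assumption: the beta-VAE form (squared reconstruction error plus a KL term scaled by `beta`), the diagonal-Gaussian posterior, and the function names `kl_diag_gaussian`, `squared_error`, and `beta_vae_loss`, which are hypothetical and not taken from the authors' repository.

```python
import math


def kl_diag_gaussian(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I).

    Closed form for diagonal Gaussians, summed over latent dimensions.
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def squared_error(x, x_hat):
    """Sum of squared differences between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def beta_vae_loss(x, x_hat, mu, log_var, beta=4.0):
    """Reconstruction error plus beta-weighted KL term.

    beta = 1 gives the standard VAE objective; the default of 4.0 is an
    arbitrary illustrative value, not one reported in the paper.
    """
    return squared_error(x, x_hat) + beta * kl_diag_gaussian(mu, log_var)


if __name__ == "__main__":
    # Toy 1-D example: reconstruction error 1.0, KL 0.5, beta 4.0.
    print(beta_vae_loss([1.0], [0.0], [1.0], [0.0], beta=4.0))
```

In a setup like this, `beta` is the kind of scalar a Bayesian optimizer could search over, with each trial training a model and reporting a validation score. Which hyperparameters the authors actually optimized is not stated in the abstract.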

Acknowledgments

The authors are thankful to Analog Devices GMBH Romania for the equipment list and to Nvidia for the graphics cards offered as support to this work. This work was financially supported by the Romanian National Authority for Scientific Research, project number PN-III-P2-2.1-PED-2021-3120. The authors are also thankful to KMTA (Kárpát-medencei Tehetségkutató Alapítvány) and Domus Foundation for their support.

Corresponding author

Correspondence to Levente Tamás.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Molnár, S., Tamás, L. (2023). Representation Learning for Point Clouds with Variational Autoencoders. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_49

  • DOI: https://doi.org/10.1007/978-3-031-25075-0_49

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25074-3

  • Online ISBN: 978-3-031-25075-0

  • eBook Packages: Computer Science, Computer Science (R0)
