
FIRE: Fine Implicit Reconstruction Enhancement with Detailed Body Part Labels and Geometric Features

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14426)

Abstract

While considerable progress has been made in 3D human reconstruction, existing methodologies exhibit performance limitations, particularly when dealing with complex clothing variations and challenging poses. In this paper, we propose an innovative approach that integrates fine-grained body part labels and geometric features into the reconstruction process, addressing these challenges and boosting overall performance. Our method, Fine Implicit Reconstruction Enhancement (FIRE), leverages blend weight-based soft human parsing labels and geometric features obtained through Ray-based Sampling to enrich the representation and predictive power of the reconstruction pipeline. We argue that incorporating these features provides valuable body part-specific information, which improves the accuracy of the reconstructed model. Our work presents a subtle yet significant enhancement to current techniques, pushing the boundaries of performance and paving the way for future research in this field. Extensive experiments on various benchmarks demonstrate the effectiveness of FIRE in terms of reconstruction accuracy, particularly in handling challenging poses and clothing variations. The method's versatility and improved performance make it a promising solution for diverse applications in 3D human body reconstruction.
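The blend weight-based soft parsing labels mentioned in the abstract can be illustrated with a minimal sketch (our own simplification for illustration, not the authors' implementation): a fitted parametric body mesh such as SMPL carries per-vertex skinning (blend) weights over its body parts, and a sampled 3D point can inherit the weight vector of its nearest vertex as a soft, per-part label. The nearest-vertex lookup and the toy mesh below are assumptions made for this sketch.

```python
import numpy as np

def soft_part_labels(query_pts, verts, blend_weights):
    """Soft body-part label for each query point.

    Each label is the skinning (blend) weight vector of the nearest
    body-mesh vertex, so it sums to 1 over the J body parts.
    """
    # Pairwise squared distances between query points and vertices: (N, V)
    d2 = ((query_pts[:, None, :] - verts[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)      # index of the closest vertex per point
    return blend_weights[nearest]    # (N, J); each row sums to 1

# Toy stand-in for a fitted body mesh: 3 vertices, 3 "parts",
# each vertex fully assigned to one part (identity blend weights).
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
weights = np.eye(3)
queries = np.array([[0.1, 0.0, 0.0], [0.9, 0.1, 0.0]])
labels = soft_part_labels(queries, verts, weights)
```

In a real pipeline the query points would be the samples fed to the implicit function, and the resulting soft labels would be concatenated with the pixel-aligned and geometric features before occupancy prediction.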



Author information

Correspondence to Liang Lin.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Zhang, J., Chen, X., Wang, K., Wei, P., Lin, L. (2024). FIRE: Fine Implicit Reconstruction Enhancement with Detailed Body Part Labels and Geometric Features. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14426. Springer, Singapore. https://doi.org/10.1007/978-981-99-8432-9_5

  • DOI: https://doi.org/10.1007/978-981-99-8432-9_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8431-2

  • Online ISBN: 978-981-99-8432-9

  • eBook Packages: Computer Science, Computer Science (R0)
