Abstract
While considerable progress has been made in 3D human reconstruction, existing methods still exhibit performance limitations, particularly when dealing with complex clothing variations and challenging poses. In this paper, we propose an approach that integrates fine-grained body part labels and geometric features into the reconstruction process, addressing these challenges and boosting overall performance. Our method, Fine Implicit Reconstruction Enhancement (FIRE), leverages blend weight-based soft human parsing labels and geometric features obtained through ray-based sampling to enrich the representation and predictive power of the reconstruction pipeline. We argue that incorporating these features provides valuable body part-specific information, which improves the accuracy of the reconstructed model. Our work presents a subtle yet significant enhancement to current techniques, pushing the boundaries of performance and opening avenues for future research. Extensive experiments on multiple benchmarks demonstrate the effectiveness of FIRE in terms of reconstruction accuracy, particularly in handling challenging poses and clothing variations. Our method's versatility and improved performance make it a promising solution for diverse applications in 3D human body reconstruction.
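The blend weight-based soft parsing labels mentioned above can be illustrated with a minimal sketch: a query point in space inherits the skinning weights of its nearest body-model vertex (e.g. from SMPL), yielding a soft distribution over body parts rather than a hard label. This is an assumption-laden illustration of the general idea, not the authors' actual implementation; the function name, array shapes, and nearest-vertex lookup are hypothetical.

```python
import numpy as np

def soft_parsing_labels(query_pts, verts, blend_weights):
    """Assign each query point a soft body-part label.

    query_pts:     (P, 3) 3D sample points.
    verts:         (V, 3) posed body-model vertices (e.g. SMPL).
    blend_weights: (V, J) per-vertex skinning weights over J joints.
    Returns a (P, J) array of soft part labels that sum to 1 per point.
    """
    # Pairwise squared distances between query points and vertices.
    d2 = ((query_pts[:, None, :] - verts[None, :, :]) ** 2).sum(-1)  # (P, V)
    nearest = d2.argmin(axis=1)                                      # (P,)
    # Each point inherits the skinning weights of its nearest vertex.
    return blend_weights[nearest]                                    # (P, J)

# Toy example with random data standing in for a real body model.
rng = np.random.default_rng(0)
verts = rng.standard_normal((100, 3))
w = rng.random((100, 24))
w /= w.sum(axis=1, keepdims=True)  # normalize weights to sum to 1 per vertex
labels = soft_parsing_labels(rng.standard_normal((5, 3)), verts, w)
print(labels.shape)  # (5, 24)
```

In practice a k-d tree or barycentric interpolation over the nearest face would replace the brute-force nearest-vertex lookup, but the output is the same kind of soft, body part-specific feature that the abstract describes feeding into the implicit reconstruction pipeline.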
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Zhang, J., Chen, X., Wang, K., Wei, P., Lin, L. (2024). FIRE: Fine Implicit Reconstruction Enhancement with Detailed Body Part Labels and Geometric Features. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14426. Springer, Singapore. https://doi.org/10.1007/978-981-99-8432-9_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8431-2
Online ISBN: 978-981-99-8432-9