ABSTRACT
We demonstrate an automatic 3D creation system, which can create realistic 3D assets solely from a text or image prompt without requiring any specialized 3D modeling skills. Users can either describe the object they envision in natural language or upload a reference image that records what they have seen with the phone. Our system will generate a high-quality 3D mesh that faithfully matches the users' input. We propose a coarse-to-fine framework to achieve this goal. Specifically, we first obtain a low-resolution mesh instantly by utilizing a pre-trained text/image conditional 3D generative model. Using such coarse mesh as the initialization, we further optimize a high-resolution textured 3D mesh with fine-grained appearance guidance from large-scale 2D diffusion models. Our system can create visually-pleasing results in minutes, which is significantly faster than existing methods. Meanwhile, the system ensures that the resulting 3D assets are precisely aligned with the input text or image prompt. With these advanced capabilities, our demonstration provides a streamlined and intuitive platform for users to incorporate 3D creation into their daily lives.
- Yang Chen, Yingwei Pan, Ting Yao, Xinmei Tian, and Tao Mei. 2019a. Animating Your Life: Real-Time Video-to-Animation Translation. In ACM MM Demo.Google Scholar
- Yang Chen, Yingwei Pan, Ting Yao, Xinmei Tian, and Tao Mei. 2019b. Mocycle-gan: Unpaired video-to-video translation. In ACM MM.Google ScholarDigital Library
- Heewoo Jun and Alex Nichol. 2023. Shap-e: Generating conditional 3d implicit functions. arXiv preprint arXiv:2305.02463 (2023).Google Scholar
- Yehao Li, Ting Yao, Yingwei Pan, and Tao Mei. 2022. Contextual transformer networks for visual recognition. IEEE TPAMI (2022).Google ScholarCross Ref
- Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, and Tsung-Yi Lin. 2023. Magic3D: High-Resolution Text-to-3D Content Creation. In CVPR.Google Scholar
- Ruoshi Liu, Rundi Wu, Basile Van Hoorick, Pavel Tokmakov, Sergey Zakharov, and Carl Vondrick. 2023. Zero-1-to-3: Zero-shot One Image to 3D Object. ArXiv, Vol. abs/2303.11328 (2023).Google Scholar
- Alex Nichol, Heewoo Jun, Prafulla Dhariwal, Pamela Mishkin, and Mark Chen. 2022. Point-E: A System for Generating 3D Point Clouds from Complex Prompts. arXiv preprint arXiv:2212.08751 (2022).Google Scholar
- Yingwei Pan, Zhaofan Qiu, Ting Yao, Houqiang Li, and Tao Mei. 2017. To create what you tell: Generating videos from captions. In ACM MM.Google Scholar
- Ben Poole, Ajay Jain, Jonathan T Barron, and Ben Mildenhall. 2023. Dreamfusion: Text-to-3d using 2d diffusion. In ICLR.Google Scholar
- Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, and Sanja Fidler. 2021. Deep marching tetrahedra: a hybrid representation for high-resolution 3d shape synthesis. Advances in Neural Information Processing Systems, Vol. 34 (2021), 6087--6101.Google Scholar
- Ting Yao, Yehao Li, Yingwei Pan, Yu Wang, Xiao-Ping Zhang, and Tao Mei. 2023. Dual vision transformer. IEEE TPAMI (2023).Google ScholarDigital Library
- Ting Yao, Yingwei Pan, Yehao Li, Chong-Wah Ngo, and Tao Mei. 2022. Wave-vit: Unifying wavelet and transformers for visual representation learning. In ECCV.Google Scholar
Index Terms
- 3D Creation at Your Fingertips: From Text or Image to 3D Assets
Recommendations
3D Character Model Creation from Cel Animation
CW '04: Proceedings of the 2004 International Conference on CyberworldsWhen creating a cel animation, the animators often use 3D character models to add some effects on the character or to generate intermediate images between the key frames. However, it is a troublesome and time-consuming task to create a 3D model. In this ...
3D puppetry: a kinect-based interface for 3D animation
UIST '12: Proceedings of the 25th annual ACM symposium on User interface software and technologyWe present a system for producing 3D animations using physical objects (i.e., puppets) as input. Puppeteers can load 3D models of familiar rigid objects, including toys, into our system and use them as puppets for an animation. During a performance, the ...
Comments