Hi3D for Image to 3D
So essentially,
Hi3D generates high-resolution, multi-view consistent "3D videos" from a single image!
Paper: Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models (11 Pages)
Github: https://github.com/yanghb22-fdu/Hi3D-Official
Researchers from HiDream.ai Inc., Fudan University and SMU are interested in better image-to-3D generation methods.
Hmm... what's the background?
Traditionally, image-to-3D generation methods, which aim to reconstruct a textured 3D mesh from a single-view image, have struggled to produce multi-view consistent images with high-resolution textures, especially when using 2D diffusion models that lack inherent 3D awareness.
Current methods are often limited to generating low-resolution multi-view images (e.g., 256 x 256 pixels). Reconstructing high-quality 3D meshes from a limited number of generated views poses a significant challenge, as SDF-based reconstruction methods, commonly employed for this task, are typically designed for dense image sequences.
Ok, so what is proposed in the research paper?
Video Diffusion Models for 3D-Aware Image Generation: The authors propose using pre-trained video diffusion models for image-to-3D generation, arguing that the temporal consistency these models learn from video data translates well into geometric consistency across multiple views of a 3D object. This addresses the limitation of 2D diffusion models, which lack 3D awareness.
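To make that framing concrete, here is a minimal Python sketch of the general idea: treat the orbit around the object as the "time" axis of a video, so each frame corresponds to a different camera view. The `video_diffusion_sample` callable and the camera parameterization are placeholders of my own, not Hi3D's actual interface.

```python
import numpy as np

# Hypothetical sketch: recasting image-to-3D as orbital "video" generation.
# The single input image is the reference; the generated frames correspond to
# cameras orbiting the object, so the video model's temporal consistency
# doubles as multi-view (geometric) consistency.

def orbit_cameras(num_views=16, elevation_deg=10.0, radius=1.5):
    """Camera poses evenly spaced in azimuth around the object
    (assumption: a simple circular orbit, common in multi-view generation)."""
    azimuths = np.linspace(0.0, 360.0, num_views, endpoint=False)
    return [{"azimuth": float(a), "elevation": elevation_deg, "radius": radius}
            for a in azimuths]

def generate_orbital_views(input_image, video_diffusion_sample, num_views=16):
    """`video_diffusion_sample` stands in for a pretrained video diffusion
    model fine-tuned on orbital renders; it is a placeholder, not Hi3D's API."""
    cameras = orbit_cameras(num_views)
    # Condition on the reference image and the camera track, and ask for
    # one frame per camera ("frames" == views).
    frames = video_diffusion_sample(image=input_image, cameras=cameras)
    return list(zip(cameras, frames))

if __name__ == "__main__":
    # Stub model: returns blank frames so the sketch runs end to end.
    stub = lambda image, cameras: [np.zeros((512, 512, 3)) for _ in cameras]
    views = generate_orbital_views(np.zeros((512, 512, 3)), stub)
    print(f"generated {len(views)} views, first azimuth = {views[0][0]['azimuth']}")
```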
Additionally, Hi3D employs a two-stage approach to generate high-resolution (1024 x 1024) multi-view consistent images.
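Here is a hedged sketch of how such a two-stage pipeline could be wired together, assuming a coarse multi-view stage followed by a depth-conditioned, 3D-aware refiner. The names `base_stage`, `estimate_depth`, and `refiner_stage` are hypothetical stand-ins, not the paper's code.

```python
import numpy as np

# Hedged sketch of the two-stage idea: a base stage produces lower-resolution
# orbital views, and a 3D-aware refiner upscales them to 1024 x 1024,
# conditioned on depth (or another 3D cue) estimated from the first-stage views.
# All stage functions below are placeholders, not the paper's actual interfaces.

def two_stage_hi_res_views(input_image, base_stage, estimate_depth, refiner_stage,
                           num_views=16):
    # Stage 1: multi-view images at a coarser resolution (e.g. 512 x 512).
    coarse_views = base_stage(input_image, num_views=num_views)

    # Per-view depth gives the refiner geometric grounding.
    depths = [estimate_depth(v) for v in coarse_views]

    # Stage 2: 3D-aware refinement up to high-resolution (1024 x 1024) views.
    fine_views = refiner_stage(coarse_views, depths)
    return fine_views

if __name__ == "__main__":
    # Stubs so the sketch runs; real stages would be diffusion models.
    base = lambda img, num_views: [np.zeros((512, 512, 3)) for _ in range(num_views)]
    depth = lambda v: np.zeros(v.shape[:2])
    refine = lambda vs, ds: [np.zeros((1024, 1024, 3)) for _ in vs]
    out = two_stage_hi_res_views(np.zeros((512, 512, 3)), base, depth, refine)
    print(len(out), out[0].shape)
```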
The authors address the challenge of reconstructing high-quality meshes from a limited number of views by augmenting the generated views with additional interpolation views rendered using 3D Gaussian Splatting (3DGS).
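Below is a rough, hypothetical sketch of that densify-then-reconstruct step: fit 3DGS to the sparse generated views, render extra interpolated cameras, then hand the enlarged view set to an SDF-based reconstructor. The functions `fit_gaussians`, `render`, and `sdf_reconstruct` are placeholders, not a specific library's API.

```python
import numpy as np

# Hedged sketch: densify the handful of generated views with 3D Gaussian
# Splatting (3DGS), then run SDF-based reconstruction on the enlarged set.

def interpolate_azimuths(cameras, extra_per_gap=2):
    """Insert extra camera azimuths between consecutive generated views."""
    dense = []
    n = len(cameras)
    for i, cam in enumerate(cameras):
        dense.append(cam)
        nxt = cameras[(i + 1) % n]
        step = ((nxt["azimuth"] - cam["azimuth"]) % 360.0) / (extra_per_gap + 1)
        for k in range(1, extra_per_gap + 1):
            dense.append({**cam, "azimuth": (cam["azimuth"] + k * step) % 360.0})
    return dense

def reconstruct_mesh(views, cameras, fit_gaussians, render, sdf_reconstruct):
    gaussians = fit_gaussians(views, cameras)                  # fit 3DGS to sparse views
    dense_cams = interpolate_azimuths(cameras)                 # add in-between cameras
    dense_views = [render(gaussians, c) for c in dense_cams]   # render interpolated views
    return sdf_reconstruct(dense_views, dense_cams)            # SDF-based meshing

if __name__ == "__main__":
    cams = [{"azimuth": float(a), "elevation": 10.0}
            for a in np.linspace(0, 360, 8, endpoint=False)]
    imgs = [np.zeros((1024, 1024, 3)) for _ in cams]
    mesh = reconstruct_mesh(imgs, cams,
                            lambda v, c: None,
                            lambda g, c: np.zeros((1024, 1024, 3)),
                            lambda v, c: {"vertices": [], "faces": []})
    print("dense views:", len(interpolate_azimuths(cams)), "mesh keys:", list(mesh))
```

The extra rendered views give the SDF-based reconstruction the denser image sequence it is designed for, which is exactly the gap noted in the background above.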
What’s next?
While Hi3D significantly improves multi-view consistency and detail generation, there's always room for advancement. Future work could focus on:
Refining the 3D-Aware Refiner: Exploring more sophisticated methods for incorporating depth information or other 3D cues into the refinement stage could further enhance geometric consistency and detail preservation.
Investigating Alternative 3D Representations: The paper primarily relies on SDF-based reconstruction. Investigating other 3D representations, such as point clouds, meshes, or neural radiance fields, might provide alternative avenues for capturing and reconstructing 3D geometry with even greater fidelity.
So essentially,
Hi3D generates high-resolution, multi-view consistent "3D videos" from a single image!
Learned something new? Consider sharing with your friends!