3DPhysVideo Generates Physically Realistic Videos from Single Images

ai-technology · 2026-05-20

Researchers have introduced 3DPhysVideo, a training-free pipeline that generates physically realistic videos from a single image. It repurposes an off-the-shelf image-to-video (I2V) flow model in two stages. In the first stage, the model acts as a novel view synthesizer: rendered point clouds guide it to reconstruct complete 360-degree 3D scene geometry. In the second stage, physics solvers are applied to that geometry, and the resulting physically simulated point cloud guides the same I2V flow model to produce the final high-quality video. A key component is the Consistency-Guided Flow SDE, which breaks the prediction down into parts to keep the output physically accurate. The work targets limitations of previous methods such as PhysGen3D, particularly in fluid dynamics, multi-object interactions, and photorealism. The paper is available on arXiv as 2605.16795.
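
To make the idea of "rendered point clouds" concrete, here is a minimal sketch of projecting a point cloud into a ring of camera views that covers 360 degrees. It is an illustration only, not the paper's renderer: the function names (orbit_camera, project, render_orbit), the pinhole camera model, and the radius and focal parameters are all assumptions introduced for this example.

```python
import math

def orbit_camera(angle_deg, radius=3.0):
    """Hypothetical camera position on a horizontal circle around the origin."""
    a = math.radians(angle_deg)
    return (radius * math.sin(a), 0.0, radius * math.cos(a))

def project(point, angle_deg, radius=3.0, focal=1.0):
    """Pinhole projection of a world-space point for the camera at angle_deg.

    The camera looks at the origin with +y as up. Returns (u, v) on the
    image plane, or None if the point lies behind the camera.
    """
    a = math.radians(angle_deg)
    cx, cy, cz = orbit_camera(angle_deg, radius)
    dx, dy, dz = point[0] - cx, point[1] - cy, point[2] - cz
    # Camera basis vectors: forward points from the camera to the origin.
    forward = (-math.sin(a), 0.0, -math.cos(a))
    right = (math.cos(a), 0.0, -math.sin(a))
    depth = dx * forward[0] + dz * forward[2]
    if depth <= 1e-9:
        return None
    u = focal * (dx * right[0] + dz * right[2]) / depth
    v = focal * dy / depth
    return (u, v)

def render_orbit(points, n_views=8, radius=3.0, focal=1.0):
    """Project every point into n_views cameras spaced evenly over 360 degrees."""
    frames = []
    for i in range(n_views):
        angle = 360.0 * i / n_views
        frames.append([project(p, angle, radius, focal) for p in points])
    return frames

# Example: two points seen from eight viewpoints around the scene.
frames = render_orbit([(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)], n_views=8)
```

In the pipeline as described, views like these would serve as guidance for the I2V model, which fills in appearance and unseen geometry. This sketch covers only the geometric projection step.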

Key facts

3DPhysVideo generates physically realistic videos from a single image
It is a training-free pipeline
Repurposes an off-the-shelf video model
First stage: reconstructs 360-degree 3D scene geometry using the I2V flow model guided by rendered point clouds
Second stage: applies physics solvers to the geometry, then uses the simulated point cloud to guide the same I2V model for the final video
Uses Consistency-Guided Flow SDE
Addresses limitations of PhysGen3D in fluid dynamics, multi-object interactions, photorealism
Paper available on arXiv: 2605.16795
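
The second stage can be pictured with a toy example. The sketch below moves a point cloud under gravity with a simple ground bounce and records one point cloud per frame. It is an assumption-laden stand-in: the summary does not say which physics solvers 3DPhysVideo uses, and the names step and simulate, the semi-implicit Euler integrator, and the restitution parameter are all hypothetical.

```python
def step(points, velocities, dt=0.02, gravity=-9.81, restitution=0.5):
    """Advance all points by one semi-implicit Euler step (toy solver)."""
    new_points, new_velocities = [], []
    for (x, y, z), (vx, vy, vz) in zip(points, velocities):
        vy += gravity * dt           # update velocity first
        x += vx * dt                 # then position, using the new velocity
        y += vy * dt
        z += vz * dt
        if y < 0.0:                  # ground plane at y = 0
            y = 0.0
            vy = -vy * restitution   # bounce and lose some energy
        new_points.append((x, y, z))
        new_velocities.append((vx, vy, vz))
    return new_points, new_velocities

def simulate(points, n_frames, dt=0.02):
    """Return one point cloud per frame, starting from rest."""
    velocities = [(0.0, 0.0, 0.0)] * len(points)
    trajectory = [list(points)]
    for _ in range(n_frames - 1):
        points, velocities = step(points, velocities, dt)
        trajectory.append(points)
    return trajectory

# Example: a single point dropped from a height of 1 unit, 50 frames.
trajectory = simulate([(0.0, 1.0, 0.0)], n_frames=50)
```

Each entry of trajectory is the point cloud for one frame. In the described pipeline, a per-frame sequence of this kind would be rendered and used to guide the I2V model when it produces the final video.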
