ARTFEED — Contemporary Art Intelligence

PhyCo: Physics-Controlled Video Generation Framework

ai-technology · 2026-05-01

Researchers have introduced PhyCo, a framework that integrates physical priors into video generation to address the lack of physical consistency in current video diffusion models. PhyCo is built on a dataset of over 100K photorealistic simulation videos in which friction, restitution, deformation, and force are varied systematically. It fine-tunes a pretrained diffusion model through a ControlNet conditioned on pixel-aligned physical property maps, and applies reward optimization guided by a fine-tuned vision-language model (VLM) that supplies differentiable feedback. Together, these steps enable controllable generation of physically plausible motion, such as realistic collisions and material responses.

Key facts

  • PhyCo introduces continuous, interpretable, physically grounded control into video generation.
  • Dataset includes over 100K photorealistic simulation videos with systematic variation of friction, restitution, deformation, and force.
  • Physics-supervised fine-tuning uses a ControlNet conditioned on pixel-aligned physical property maps.
  • VLM-guided reward optimization provides differentiable feedback via a fine-tuned vision-language model.
  • Addresses issues like object drift, unrealistic collisions, and mismatched material responses in video diffusion models.
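
To make the idea of a pixel-aligned physical property map more concrete, here is a minimal Python sketch of how per-object physical values might be rasterized into an image-shaped conditioning tensor. It is an illustration only, not PhyCo's implementation: the function build_property_map, the channel order, the mask format, and the example values are all assumptions introduced here.

```python
import numpy as np

# Channel names follow the properties listed in the article; their order
# and value ranges are assumptions made for this illustration.
CHANNELS = ("friction", "restitution", "deformation", "force")


def build_property_map(masks, properties, height, width):
    """Rasterize per-object scalars into a (C, H, W) float32 map.

    masks:      dict mapping object id -> boolean array of shape (H, W)
    properties: dict mapping object id -> dict of channel name -> float
    Pixels not covered by any object mask stay at 0.
    """
    prop_map = np.zeros((len(CHANNELS), height, width), dtype=np.float32)
    for obj_id, mask in masks.items():
        if mask.shape != (height, width):
            raise ValueError(f"mask for {obj_id!r} has shape {mask.shape}")
        for c, name in enumerate(CHANNELS):
            # Write the object's value into every pixel it covers, so the
            # map stays aligned with the video frame pixel by pixel.
            prop_map[c][mask] = properties[obj_id].get(name, 0.0)
    return prop_map


# Hypothetical toy scene: a bouncy ball above a rough, inelastic floor.
H, W = 4, 6
ball = np.zeros((H, W), dtype=bool)
ball[1:3, 1:3] = True
floor = np.zeros((H, W), dtype=bool)
floor[3, :] = True

masks = {"ball": ball, "floor": floor}
properties = {
    "ball": {"friction": 0.2, "restitution": 0.9},
    "floor": {"friction": 0.8, "restitution": 0.1},
}

pmap = build_property_map(masks, properties, H, W)
print(pmap.shape)  # (4, 4, 6)
```

A map of this kind has the same spatial layout as a video frame, which is what would let a ControlNet-style branch consume it as a conditioning image. Because the values are continuous, changing a single number (for example the ball's restitution) changes the control signal smoothly.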

Entities

Institutions

  • arXiv

Sources