DyMoS: Training-Free Method to Improve Motion in Image-to-Video Models

ai-technology · 2026-05-20

Researchers have identified reference-frame dominance as a key mechanism behind motion suppression in image-to-video (I2V) models: non-reference frames allocate excessive self-attention to reference-frame key tokens. To address it, they introduce DyMoS (Dynamic Motion Slider), a training-free, model-agnostic method. During the early stage of denoising, DyMoS rebalances the attention pathways that run from generated frames to the reference frame, leaving the input image and model weights unchanged. A single scalar parameter gives continuous control over motion intensity. The result is less of the overly static video that I2V models often produce, without sacrificing fidelity to the reference image.

Key facts

arXiv:2605.19398v1
Reference-frame dominance is identified as a key mechanism behind motion suppression in I2V models.
Non-reference frames allocate excessive self-attention to reference-frame key tokens.
DyMoS rebalances attention pathways from generated frames to the reference frame.
DyMoS is training-free and model-agnostic.
DyMoS leaves the input image and model weights unchanged.
A single scalar parameter allows continuous control over motion.
The method improves motion without additional training and without sacrificing fidelity to the reference image.
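
To make the idea of rebalancing one attention pathway with a single scalar concrete, here is a minimal NumPy sketch. It is not the paper's formulation, which this summary does not give. It assumes, purely for illustration, that the scalar acts as a penalty subtracted from the attention logits of generated-frame queries toward reference-frame keys, and that it is applied only during a fixed number of early denoising steps. The names `attention_weights`, `reference_mass`, `strength_for_step`, `strength`, and `early_steps` are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_weights(q, k, query_is_ref, key_is_ref, strength=0.0):
    """Self-attention weights with one pathway down-weighted.

    q: (Nq, d) queries, k: (Nk, d) keys.
    query_is_ref / key_is_ref: boolean masks marking reference-frame tokens.
    strength: hypothetical scalar; 0.0 reproduces standard attention.
    """
    logits = q @ k.T / np.sqrt(q.shape[-1])
    # Assumed rule: penalize only generated-frame query -> reference-frame key.
    pathway = (~query_is_ref)[:, None] & key_is_ref[None, :]
    logits = logits - strength * pathway
    return softmax(logits, axis=-1)


def reference_mass(weights, key_is_ref):
    """Share of each query's attention that lands on reference-frame keys."""
    return weights[:, key_is_ref].sum(axis=1)


def strength_for_step(step, early_steps, strength):
    """Assumed schedule: apply the penalty only in the first `early_steps` steps."""
    return strength if step < early_steps else 0.0


# Toy setup: 3 frames of 4 tokens each; frame 0 is the reference frame.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(12, 8))
is_ref = np.arange(12) < 4

baseline = attention_weights(tokens, tokens, is_ref, is_ref, strength=0.0)
adjusted = attention_weights(tokens, tokens, is_ref, is_ref, strength=2.0)

print("mean reference mass, generated frames (baseline):",
      reference_mass(baseline, is_ref)[~is_ref].mean())
print("mean reference mass, generated frames (adjusted):",
      reference_mass(adjusted, is_ref)[~is_ref].mean())
```

In this toy setup, raising `strength` lowers the share of attention that generated-frame tokens give to the reference frame, while rows belonging to reference-frame queries are left untouched. No weights or inputs are modified, which mirrors the training-free property described above. Whether DyMoS uses an additive logit penalty, a multiplicative reweighting, or something else is not stated here.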
