Emotion-Conditioned Short-Horizon Human Pose Forecasting with a Lightweight Predictive World Model
A research paper on arXiv (2604.23532) proposes a lightweight autoregressive predictive world model for short-horizon human pose forecasting that conditions on emotion embeddings derived from facial expressions. A learnable gating mechanism fuses pose keypoints with the emotion signal, and a two-layer LSTM produces 15-step rolling predictions, with each predicted pose fed back as input for the next step. Experiments were conducted on small-scale pose-emotion video datasets. The work targets a gap in current trajectory prediction models, which rely solely on geometric cues and ignore the emotional signals that influence human motion dynamics.
Key facts
- Paper published on arXiv with ID 2604.23532
- Proposes emotion-conditioned short-horizon human pose forecasting
- Uses facial expression-derived emotion embeddings as conditional signals
- Lightweight autoregressive predictive world model with 15-step rolling prediction
- Combines pose keypoints and emotion embeddings via learnable gating mechanism
- Based on two-layer LSTM architecture
- Experiments on small-scale pose-emotion video datasets
- Addresses a limitation of current trajectory prediction models, which rely solely on geometric cues and overlook emotional signals
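The gating and rolling-prediction steps listed above can be illustrated with a minimal sketch. It is not the paper's implementation: the summary does not give the exact form of the gate, so the formula used here (a sigmoid gate over the concatenated pose and emotion vectors, scaling a projected emotion term that is added to the pose) is an assumption. The two-layer LSTM is replaced by a hypothetical placeholder, toy_step, so that the example runs with only the Python standard library. All function names, weight matrices, and dimensions are illustrative.

```python
import math
import random

HORIZON = 15  # rolling prediction length stated in the summary


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gated_fusion(pose, emotion, w_gate, w_emotion):
    """Fuse pose and emotion vectors (both lists of floats).

    ASSUMED gate form, not taken from the paper:
        g = sigmoid(W_g [pose; emotion]),  fused = pose + g * (W_e emotion)
    """
    gate = [sigmoid(z) for z in matvec(w_gate, pose + emotion)]
    projected = matvec(w_emotion, emotion)
    return [p + g * e for p, g, e in zip(pose, gate, projected)]


def rollout(last_pose, emotion, step_fn, state, w_gate, w_emotion, horizon=HORIZON):
    """Autoregressive rollout: each predicted pose becomes the next input."""
    predictions = []
    pose = list(last_pose)
    for _ in range(horizon):
        fused = gated_fusion(pose, emotion, w_gate, w_emotion)
        pose, state = step_fn(fused, state)
        predictions.append(pose)
    return predictions


def toy_step(fused, state):
    """HYPOTHETICAL stand-in for the two-layer LSTM: exponential smoothing."""
    new_state = [0.5 * s + 0.5 * f for s, f in zip(state, fused)]
    return list(new_state), new_state


# Usage with made-up dimensions: 4 pose values, 3 emotion values.
rng = random.Random(0)
pose_dim, emo_dim = 4, 3
w_gate = [[rng.uniform(-0.1, 0.1) for _ in range(pose_dim + emo_dim)]
          for _ in range(pose_dim)]
w_emotion = [[rng.uniform(-0.1, 0.1) for _ in range(emo_dim)]
             for _ in range(pose_dim)]
preds = rollout([0.1, 0.2, 0.3, 0.4], [1.0, 0.0, 0.0],
                toy_step, [0.0] * pose_dim, w_gate, w_emotion)
print(len(preds), len(preds[0]))  # 15 4
```

In a trained model the weights would be learned jointly with the recurrent network. Because each step consumes the previous prediction, errors can compound over the 15 steps, which is one reason the horizon in such a setup is typically kept short.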
Entities
Institutions
- arXiv