ARTFEED — Contemporary Art Intelligence

Game-Time Benchmark Tests Temporal Skills of Spoken Language Models

ai-technology · 2026-05-04

A new benchmark called Game-Time evaluates how conversational Spoken Language Models (SLMs) handle temporal dynamics, including timing, tempo, and simultaneous speaking. Inspired by the way humans learn language through activities, the benchmark combines basic instruction-following tasks with advanced tasks that impose temporal constraints, such as tempo adherence and synchronized responses. An evaluation of diverse SLM architectures shows a clear performance disparity: state-of-the-art models handle the basic tasks well, while many contemporary systems struggle even with fundamental instruction-following. Nearly all models degrade substantially under temporal constraints, which points to a critical gap in conversational fluency. The research is available on arXiv under ID 2509.26388.

Key facts

  • Game-Time Benchmark assesses temporal dynamics in SLMs
  • Tasks include basic instruction-following and advanced temporal constraints
  • State-of-the-art models perform well on basic tasks
  • Many contemporary systems struggle with fundamental instruction-following
  • Nearly all models degrade under temporal constraints
  • Research available on arXiv under ID 2509.26388
  • Inspired by human language learning through activities

Entities

Institutions

  • arXiv

Sources