ARTFEED — Contemporary Art Intelligence

When2Speak Dataset Improves LLM Turn-Taking in Multi-Party Conversations

ai-technology · 2026-05-09

When2Speak is a new synthetic dataset designed to help large language models decide when to speak during multi-party conversations. It contains more than 215,000 examples drawn from 16,000 dialogues with 2 to 6 participants, covering a range of conversational styles, tones, and speaker dynamics. The dataset frames the problem as a SPEAK versus SILENT decision at every conversational turn. A four-stage generation pipeline combines real-world grounding, structured augmentation, controlled transcript creation, and supervision formatted for fine-tuning. Both the dataset and the pipeline are fully open-sourced, supporting reproducibility and adaptation to specific conversational domains. The work targets a persistent weakness of current LLMs in group settings, where models tend to interject at inappropriate moments and disrupt the flow of the conversation.
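The article does not publish the dataset's exact schema, but a per-turn SPEAK/SILENT training instance might look roughly like the sketch below. The field names (dialogue_id, context, candidate_speaker, label) are assumptions for illustration, not the released format.

```python
# Hypothetical illustration of a per-turn SPEAK vs. SILENT training example.
# Field names and structure are assumed for clarity, not the published schema.
from dataclasses import dataclass


@dataclass
class TurnDecisionExample:
    """One supervision instance: should candidate_speaker speak at this point?"""
    dialogue_id: str
    speakers: list[str]              # 2 to 6 participants in the conversation
    context: list[tuple[str, str]]   # (speaker, utterance) pairs so far
    candidate_speaker: str           # the speaker (e.g. an LLM agent) being queried
    label: str                       # "SPEAK" or "SILENT"


example = TurnDecisionExample(
    dialogue_id="dlg-000042",
    speakers=["Ana", "Ben", "Chris", "Assistant"],
    context=[
        ("Ana", "Can we move the launch to Friday?"),
        ("Ben", "Chris, does that work for your team?"),
    ],
    candidate_speaker="Assistant",
    label="SILENT",  # Ben addressed Chris directly, so the assistant should wait
)
```

Framed this way, turn-taking becomes a binary classification problem at each point in the transcript, which lets a fine-tuned model learn to hold back when another participant is clearly being addressed.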

Key facts

  • Dataset named When2Speak
  • Over 215,000 examples
  • Derived from 16,000 conversations
  • Involves 2 to 6 speakers
  • Models SPEAK vs. SILENT decisions
  • Four-stage generation pipeline (see the sketch after this list)
  • Fully open-sourced
  • Addresses interruption problem in multi-party conversations
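
The four pipeline stages are only named in the summary above. As a rough outline of how such a pipeline could be wired together, the following sketch is a hypothetical adaptation; the function names, signatures, and record format are assumptions, not the released code.

```python
# Hypothetical outline of a four-stage generation pipeline in the spirit of the
# one described above. Stage names follow the article; everything else is assumed.
import random


def ground_in_real_dialogues(seed_corpus: list[dict]) -> list[dict]:
    """Stage 1: select real-world conversations to anchor topics and turn structure."""
    return [d for d in seed_corpus if 2 <= len(d["speakers"]) <= 6]


def augment_structure(grounded: list[dict]) -> list[dict]:
    """Stage 2: vary speaker counts, tones, and interaction styles."""
    styles = ["formal", "casual", "heated", "collaborative"]
    return [{**d, "style": random.choice(styles)} for d in grounded]


def generate_transcripts(specs: list[dict]) -> list[dict]:
    """Stage 3: produce controlled synthetic transcripts."""
    # Placeholder: a real pipeline would call a generator model here.
    return [{**spec, "turns": spec.get("turns", [])} for spec in specs]


def build_supervision(transcripts: list[dict]) -> list[dict]:
    """Stage 4: emit per-turn SPEAK/SILENT records suitable for fine-tuning."""
    records = []
    for t in transcripts:
        for i, (speaker, _utterance) in enumerate(t["turns"]):
            for candidate in t["speakers"]:
                records.append({
                    "context": t["turns"][:i],
                    "candidate_speaker": candidate,
                    # The speaker who actually took the turn yields a SPEAK example,
                    # everyone else a SILENT example (an assumed labeling scheme).
                    "label": "SPEAK" if candidate == speaker else "SILENT",
                })
    return records


def run_pipeline(seed_corpus: list[dict]) -> list[dict]:
    grounded = ground_in_real_dialogues(seed_corpus)
    augmented = augment_structure(grounded)
    transcripts = generate_transcripts(augmented)
    return build_supervision(transcripts)
```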
