SpeechParaling-Bench: New Benchmark for Paralinguistic Speech Generation

other · 2026-04-24

Researchers have released SpeechParaling-Bench, a benchmark for evaluating paralinguistic-aware speech generation in Large Audio-Language Models (LALMs). It expands feature coverage from fewer than 50 to more than 100 fine-grained paralinguistic features and is backed by more than 1,000 English-Chinese parallel speech queries. The benchmark comprises three tasks of increasing difficulty: fine-grained control, intra-utterance variation, and context-aware adaptation. For evaluation, a pairwise comparison pipeline has an LALM-based judge assess each candidate response against a fixed baseline, relying on relative preference rather than absolute scoring to reduce subjectivity. The work was published on arXiv under ID 2604.20842.

Key facts

SpeechParaling-Bench is a benchmark for paralinguistic-aware speech generation.
It covers over 100 fine-grained features, up from fewer than 50.
Includes more than 1,000 English-Chinese parallel speech queries.
Organized into three tasks: fine-grained control, intra-utterance variation, context-aware adaptation.
Uses a pairwise comparison pipeline with an LALM-based judge.
Evaluation is based on relative preference rather than absolute scoring.
The relative-preference design aims to reduce subjectivity in assessment.
Published on arXiv under ID 2604.20842.
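
To illustrate how relative-preference evaluation against a fixed baseline can be aggregated, here is a minimal Python sketch. It is not the benchmark's actual pipeline: the verdict labels, the `win_rate` function, and the choice to count a tie as half a win are all assumptions made for illustration, and the judge verdicts are supplied by hand rather than produced by an LALM.

```python
from collections import Counter

# Hypothetical verdict labels; the judge's real output format is not described in the summary.
CANDIDATE, BASELINE, TIE = "candidate", "baseline", "tie"


def win_rate(verdicts):
    """Share of pairwise comparisons the candidate wins against the fixed baseline.

    Ties count as half a win (an assumption, not taken from the paper).
    """
    if not verdicts:
        raise ValueError("at least one verdict is required")
    counts = Counter(verdicts)
    unknown = set(counts) - {CANDIDATE, BASELINE, TIE}
    if unknown:
        raise ValueError(f"unexpected verdict labels: {sorted(unknown)}")
    return (counts[CANDIDATE] + 0.5 * counts[TIE]) / len(verdicts)


# One verdict per query: which response the judge preferred.
verdicts = ["candidate", "baseline", "candidate", "tie"]
print(win_rate(verdicts))  # 0.625
```

Because every model is compared with the same baseline, the resulting scores can be placed on a common scale without asking the judge for absolute ratings.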
