GPT-4o Paraphrase Augmentation Improves Sign Language Translation
Researchers at the University of Surrey and collaborators have introduced target-side paraphrase augmentation for sign language translation (SLT), using GPT-4o to generate controlled variants of reference sentences while keeping the sign videos unchanged. A Signformer-style pose-based Transformer is trained in two stages: pre-training on the augmented data, then fine-tuning on the original references. The method was evaluated on three datasets: PHOENIX14T (German Sign Language), GSL (Greek Sign Language), and LSA-T (Argentinian Sign Language). On PHOENIX14T it improved BLEU-4 from 9.56 to 10.33, a gain of 0.77 points. The other two datasets revealed limits of the approach: the GSL baseline was already near-saturated, and LSA-T suffers from severe long-tail sparsity. This is the first study to apply LLM-based paraphrase augmentation to SLT, and it targets two bottlenecks: limited paired sign-video/text corpora and heavy-tailed target vocabularies.
Key facts
- Target-side paraphrase augmentation uses GPT-4o to generate controlled variants of reference sentences.
- Sign input remains unchanged during augmentation.
- A Signformer-style pose-based Transformer is trained with a two-stage schedule: pre-training on the augmented corpus, then fine-tuning on the original references.
- Evaluated on three datasets: PHOENIX14T (German Sign Language), GSL (Greek Sign Language), LSA-T (Argentinian Sign Language).
- On PHOENIX14T, BLEU-4 improved from 9.56 to 10.33.
- GSL baseline was near-saturated; LSA-T had severe long-tail sparsity.
- First study to apply LLM-based paraphrase augmentation to sign language translation.
- Addresses limited paired sign-video/text corpora and heavy-tailed target vocabularies.
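The data flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Sample` record, the `augment_targets` and `two_stage_plan` helpers, and the `toy_paraphraser` stand-in for the GPT-4o call are all hypothetical names introduced here. The sketch only shows that each sign input is reused unchanged with several target sentences, and that the second stage sees the original references alone.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Sample:
    video_id: str               # stands in for the sign video / pose sequence
    text: str                   # target-side sentence
    is_paraphrase: bool = False


# Hypothetical signature: (reference sentence, max variants) -> paraphrases.
Paraphraser = Callable[[str, int], List[str]]


def augment_targets(corpus: List[Sample], paraphrase: Paraphraser,
                    n_variants: int = 2) -> List[Sample]:
    """Keep every original pair and add paraphrased targets for the same video."""
    augmented: List[Sample] = []
    for sample in corpus:
        augmented.append(sample)
        seen = {sample.text}
        for variant in paraphrase(sample.text, n_variants)[:n_variants]:
            variant = variant.strip()
            if variant and variant not in seen:  # drop empty or duplicate outputs
                seen.add(variant)
                augmented.append(Sample(sample.video_id, variant, True))
    return augmented


def two_stage_plan(corpus: List[Sample], paraphrase: Paraphraser,
                   n_variants: int = 2) -> List[Tuple[str, List[Sample]]]:
    """Stage 1 sees the augmented corpus; stage 2 sees only original references."""
    return [
        ("pretrain", augment_targets(corpus, paraphrase, n_variants)),
        ("finetune", list(corpus)),
    ]


def toy_paraphraser(sentence: str, n: int) -> List[str]:
    """Placeholder for the LLM call; returns trivially reworded variants."""
    templates = ["{} today", "currently {}"]
    return [t.format(sentence) for t in templates[:n]]


corpus = [Sample("vid_001", "rain in the north"),
          Sample("vid_002", "sunny in the south")]
plan = two_stage_plan(corpus, toy_paraphraser)
for stage, data in plan:
    print(stage, len(data))
```

With the toy paraphraser, the pre-training stage holds six pairs (each of the two videos with its reference and two variants) and the fine-tuning stage holds the two original pairs. In a real setup the placeholder would be replaced by an LLM call and each stage's list would feed the translation model's training loop.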
Entities
Institutions
- University of Surrey
- arXiv