ARTFEED — Contemporary Art Intelligence

Josh Talks Develops Full-Duplex Hindi Conversational AI

ai-technology · 2026-04-29

Researchers at Josh Talks have introduced Human-1, the first open, reproducible full-duplex spoken dialogue system for Hindi. The team adapted the existing Moshi architecture by replacing its English tokeniser with one built for Hindi and reinitialising the text-vocabulary parameters, while retaining the pre-trained audio components. Training drew on 26,000 hours of real, spontaneous conversations from 14,695 speakers across various channels, from which the system learned turn-taking and overlapping speech. Training proceeded in two stages: extensive pre-training, followed by fine-tuning on 1,000 hours of conversational data. The resulting system models natural conversational behaviours such as interruptions, overlaps, and backchannels, which have been little explored for Indian languages. It was evaluated using prompted dialogue continuations.
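The adaptation step described above (new text vocabulary, retained audio weights) can be illustrated with a minimal sketch. This is not the Human-1 or Moshi code: the parameter names, the prefix convention, the vocabulary sizes, and the 0.02 initialisation scale are all hypothetical, and plain Python lists stand in for real tensors.

```python
import random

def reinit_text_vocab(params, new_vocab_size,
                      text_prefixes=("text_emb", "text_out"), seed=0):
    """Return a new parameter dict in which text-vocabulary matrices are
    resized to new_vocab_size rows and randomly re-drawn, while every other
    matrix (e.g. audio components) is kept unchanged.

    The prefixes are hypothetical; a real checkpoint would use its own names.
    """
    rng = random.Random(seed)
    adapted = {}
    for name, matrix in params.items():
        if name.startswith(text_prefixes):
            dim = len(matrix[0])  # keep the hidden size of the old matrix
            adapted[name] = [
                [rng.gauss(0.0, 0.02) for _ in range(dim)]
                for _ in range(new_vocab_size)
            ]
        else:
            adapted[name] = matrix  # retained pre-trained weights
    return adapted

# Toy "checkpoint": an 8-token text vocabulary and two non-text matrices.
pretrained = {
    "text_emb.weight": [[0.1] * 4 for _ in range(8)],
    "text_out.weight": [[0.2] * 4 for _ in range(8)],
    "audio_emb.weight": [[0.5] * 4 for _ in range(3)],
    "backbone.weight": [[0.7] * 4 for _ in range(4)],
}

# Swap in a (hypothetical) 12-token vocabulary for the new language.
adapted = reinit_text_vocab(pretrained, new_vocab_size=12)
```

In this sketch only the matrices whose shape depends on the text vocabulary are replaced, so everything learned by the audio side carries over into the subsequent pre-training and fine-tuning stages.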

Key facts

  • Human-1 is a full-duplex spoken dialogue system for Hindi.
  • It adapts the Moshi architecture with a custom Hindi tokeniser.
  • Training data: 26,000 hours from 14,695 speakers.
  • Two-stage training: pre-training then fine-tuning on 1,000 hours.
  • Models interruptions, overlaps, and backchannels.
  • First open, reproducible system of its kind for Hindi.
  • Text-vocabulary parameters reinitialised; audio components retained.
  • Evaluation via prompted dialogue continuations.

Entities

Institutions

  • Josh Talks

Sources