ARTFEED — Contemporary Art Intelligence

Voice Cloning Is Actually Style Transfer, Study Finds

ai-technology · 2026-05-20

A recent study posted on arXiv (2605.16578v1) challenges the notion of "voice cloning," arguing that widely used models do not faithfully reproduce a person's voice but instead apply a systematic style transfer to it. Human annotators rated the cloned voices as more authoritative, warmer, more customer-service-like, and more human-like than the source voices. They also reported greater trust in the cloned voices and a greater willingness to disclose sensitive personal information to them. The study further indicates that voice cloning homogenizes speaker characteristics, as evidenced by decreased variance in those characteristics. These findings bear directly on applications such as completing recordings, dubbing, and preserving the voices of people who have lost the ability to speak.
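To make the "decreased variance" finding concrete, here is a minimal Python sketch of one way such homogenization could be checked. It assumes variance is measured across speakers on a single rated trait; the summary does not specify the study's actual metric. The function name `variance_ratio` and the ratings are hypothetical and are not taken from the paper.

```python
from statistics import pvariance


def variance_ratio(source_scores, cloned_scores):
    """Return the cloned-voice variance divided by the source-voice variance.

    A ratio below 1.0 means the cloned voices are more alike on this trait
    than the source voices were, which is one simple sign of homogenization.
    """
    return pvariance(cloned_scores) / pvariance(source_scores)


# Hypothetical warmth ratings (1-7 scale) for five speakers.
# These numbers are made up for illustration and are NOT from the study.
source_warmth = [2.0, 3.5, 5.0, 6.5, 4.0]
cloned_warmth = [4.5, 5.0, 5.5, 6.0, 5.0]

ratio = variance_ratio(source_warmth, cloned_warmth)
print(f"cloned/source variance ratio: {ratio:.2f}")
```

In this toy data the cloned ratings sit closer together and skew higher than the source ratings, mirroring the pattern the study describes: a shift toward a common style along with a narrower spread.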

Key facts

  • Voice cloning does not faithfully clone an individual's voice.
  • Widely used voice cloning models apply systematic style transfer to source voices.
  • Cloned voices are perceived as more authoritative, warm, customer-service-like, and human-like.
  • Human annotators report greater trust in cloned voices than source voices.
  • Annotators are also more willing to disclose sensitive personal information to cloned voices.
  • Voice cloning leads to homogenization of speaker characteristics.
  • Study published on arXiv with identifier 2605.16578v1.
  • Applications include completing recordings, dubbing, and preserving voices of individuals with speech loss.

Entities

Institutions

  • arXiv

Sources