Open-Source Workflow Creates Talking Slide Avatars for Teaching
A new open-source workflow lets instructors generate talking slide avatars by combining OpenVoice (text-to-speech and voice cloning) with Ditto-TalkingHead (audio-driven talking-image synthesis). The approach addresses the loss of instructor presence in online, hybrid, and asynchronous slide-based teaching by turning a written script and a static portrait into a short narrated video. The study frames this as a pedagogical and production solution rather than a purely technical one.
Key facts
- Workflow integrates OpenVoice and Ditto-TalkingHead
- OpenVoice handles text-to-speech and voice cloning
- Ditto-TalkingHead performs audio-driven talking-image synthesis
- Output is a short narrated video embedded in slide decks or HTML materials
- Addresses loss of instructor presence in online teaching
- Aims to restore narrative continuity and expressive framing
- Practice-based analysis of an open-source workflow
- Targets higher education slide-based teaching
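Since the output is a short narrated video embedded in slide decks or HTML materials, the final step can be illustrated with a small helper that builds an HTML5 `<video>` snippet. The file name, caption, and attributes are illustrative assumptions, not prescribed by the source.

```python
from html import escape

def video_embed(src: str, caption: str, width: int = 320) -> str:
    """Return an HTML5 <video> snippet suitable for a slide or course page."""
    return (
        "<figure>"
        f'<video src="{escape(src, quote=True)}" width="{width}" controls playsinline></video>'
        f"<figcaption>{escape(caption)}</figcaption>"
        "</figure>"
    )

# Example: embed the rendered avatar video with a caption.
print(video_embed("avatar.mp4", "Instructor narration for Lecture 3"))
```

Exporting the snippet into existing HTML materials keeps the avatar alongside the slides rather than as a separate download.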
Entities
- OpenVoice
- Ditto-TalkingHead