Interpretable AI Tutoring System for Presentation Skills

ai-technology · 2026-05-25

A closed-loop Intelligent Tutoring System (ITS) has been developed to improve learners' on-camera presentation skills through multimodal affective feedback. The system operationalizes a seven-dimensional Behaviorally Anchored Rating Scale (BARS) and uses a three-layer feedback architecture: rubric-aligned multimodal scoring, audience-perception diagnostics, and retrieval-augmented conversational coaching. Built on an XGBoost backbone, it analyzes facial, vocal, textual, and oculomotor features and ties its feedback to observable performance indicators. Trained on 10,360 MOOC video segments, the ITS produced scores that track expert evaluations (R² = 0.48–0.61, Spearman's ρ = 0.69–0.78, MAE = 0.43–0.57). Because the feedback is actionable without a human instructor, the system supports deliberate practice. The paper is available on arXiv (ID: 2605.17468) and is relevant to AI in education and human-computer interaction.

Key facts

The ITS uses multimodal inputs: facial, vocal, textual, and oculomotor features.
It operationalizes a seven-dimensional Behaviorally Anchored Rating Scale (BARS).
The feedback architecture has three layers: scoring, diagnostics, and coaching.
The system is built on an XGBoost backbone.
It was trained on 10,360 MOOC video segments.
Performance metrics: R² = 0.48–0.61, Spearman's ρ = 0.69–0.78, MAE = 0.43–0.57.
The system supports deliberate practice without human instructors.
The paper is available on arXiv with ID 2605.17468.
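
The three agreement metrics reported above (R², Spearman's ρ, MAE) can be illustrated with a minimal sketch in plain Python. The rating values below are invented for illustration and are not taken from the paper, and the function names are hypothetical. The paper's own evaluation code is not described in this summary.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def mae(y_true, y_pred):
    # Mean absolute error: average size of the scoring error.
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])

def r_squared(y_true, y_pred):
    # Coefficient of determination: 1 - residual SS / total SS.
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def average_ranks(xs):
    # 1-based ranks; tied values share the average of their positions.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def spearman(xs, ys):
    # Spearman's rho is the Pearson correlation of the rank vectors.
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical expert ratings and model scores for one BARS dimension.
expert = [2.0, 3.5, 4.0, 1.5, 3.0]
model = [2.4, 3.1, 4.2, 2.0, 2.9]

print("R2:", r_squared(expert, model))
print("Spearman rho:", spearman(expert, model))
print("MAE:", mae(expert, model))
```

R² and MAE measure how close the predicted scores are to the expert scores in absolute terms, while Spearman's ρ only checks whether the model orders the presentations the same way the experts do. This is why a rank correlation can be higher than R² on the same data.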
