Tadabur Dataset Expands Quranic Audio Research with 1400+ Hours from 600 Reciters

ai-technology · 2026-04-22

The Tadabur audio dataset has been released to advance research on Quranic speech. It comprises over 1,400 hours of recitation audio from more than 600 distinct reciters and spans a wide range of recitation styles, vocal characteristics, and recording conditions, addressing the limited scale and variety of earlier Quran datasets. Available on arXiv under identifier 2604.18932, Tadabur is intended to support future studies and the development of standardized benchmarks for Quranic speech analysis. The collection also lends itself to interdisciplinary work where technology meets religious practice, with potential benefits for both technical research and cultural preservation.

Key facts

Tadabur is a large-scale Quran audio dataset
Contains over 1,400 hours of recitation audio
Features more than 600 distinct reciters
Provides variation in recitation styles and vocal characteristics
Includes diverse recording conditions
Addresses limitations in existing Quran datasets
Aims to support future Quranic speech research
Facilitates development of standardized Quranic speech benchmarks

