Generative RIR Augmentation Boosts Speaker Distance Estimation Accuracy
ICASSP 2025 hosts the Room Acoustics and Speaker Distance Estimation (SDE) Challenge, part of the GenDARA initiative. The challenge aims to improve SDE models using augmented room impulse response (RIR) data. Researchers used the open-source FastRIR generator to augment the limited datasets, focusing on speaker and listener placements. They applied a quality filter to keep the generated RIRs aligned with the challenge data, and used hyperparameter optimization when fine-tuning the model. As a result, the mean absolute error (MAE) fell from 1.66 m to 0.6 m for GWA rooms and from 2.18 m to 0.69 m for Treble rooms, with the improvements most notable at medium to long distances.
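To make the reported metric concrete, the following is a minimal sketch of how MAE can be computed overall and per distance range. It is not the challenge's evaluation code: the function names, the bin edges (2 m and 5 m), and the toy distances are all assumptions chosen for illustration.

```python
from statistics import mean


def mean_absolute_error(true_m, pred_m):
    """Average absolute difference between true and predicted distances (meters)."""
    if len(true_m) != len(pred_m) or not true_m:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(t - p) for t, p in zip(true_m, pred_m))


def mae_by_distance_bin(true_m, pred_m, edges=(0.0, 2.0, 5.0, float("inf"))):
    """Group absolute errors by true distance.

    The bin edges are hypothetical; the source does not define what counts
    as a short, medium, or long distance.
    """
    labels = ["short", "medium", "long"]
    errors = {label: [] for label in labels}
    for t, p in zip(true_m, pred_m):
        for label, lo, hi in zip(labels, edges[:-1], edges[1:]):
            if lo <= t < hi:
                errors[label].append(abs(t - p))
                break
    # A bin with no samples reports None instead of a misleading 0.0.
    return {label: (mean(v) if v else None) for label, v in errors.items()}


# Toy values invented for this example, not challenge data.
true_m = [1.0, 1.5, 3.0, 4.0, 6.0, 8.0]
pred_m = [1.2, 1.4, 3.5, 3.0, 7.5, 6.0]

overall = mean_absolute_error(true_m, pred_m)
per_bin = mae_by_distance_bin(true_m, pred_m)
print(f"overall MAE: {overall:.2f} m")
for label, value in per_bin.items():
    print(f"{label}: {value:.2f} m")
```

Splitting the error by distance range is what lets a claim such as "improvements most notable at medium to long distances" be checked, since a single overall MAE can hide where the gains come from.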
Key facts
- ICASSP 2025 hosts the Room Acoustics and Speaker Distance Estimation Challenge
- Challenge is part of GenDARA initiative
- FastRIR generator used for RIR augmentation
- Quality filter aligns generated RIRs with challenge data
- Hyperparameter optimization applied for model fine-tuning
- MAE reduced from 1.66 m to 0.6 m for GWA rooms
- MAE reduced from 2.18 m to 0.69 m for Treble rooms
- Improvements most notable at medium to long distances
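As a quick arithmetic check on the figures above, this short sketch converts the reported before and after MAE values into absolute and relative reductions. Only the four MAE numbers come from the source; the helper name `relative_reduction` is an assumption made for illustration.

```python
def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


# Reported MAE in meters: (before augmentation, after augmentation).
reported = {"GWA": (1.66, 0.6), "Treble": (2.18, 0.69)}

for name, (before, after) in reported.items():
    print(
        f"{name}: {before - after:.2f} m absolute, "
        f"{relative_reduction(before, after):.1f}% relative"
    )
# GWA: 1.06 m absolute, 63.9% relative
# Treble: 1.49 m absolute, 68.3% relative
```

In both room sets the error drops by roughly two thirds.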
Entities
Institutions
- ICASSP
- GenDARA