EIGENET: Geometry-Informed AI for Room Impulse Response Prediction
EIGENET is a geometry-informed multi-modal learning framework that predicts spatially varying Room Impulse Responses (RIRs) from a limited number of observations. A Cross-view Alternate-attention Transformer progressively refines local intra-view acoustic structure together with global cross-view spatial relationships. A geometry-informed modulation block, inspired by acoustic ray tracing, links geometric attributes to the RIR power spectrum. An auxiliary loss turns single-target waveform prediction into a multi-task learning problem. The paper is available on arXiv under the identifier 2605.28101.
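To make the idea of alternating attention concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes the input is a tensor of shape (views, tokens, dim) and alternates between attention among the tokens of each view (intra-view) and attention among the views at each token position (cross-view). The function names `self_attention` and `alternate_attention`, the absence of learned projections, the residual connections, and the number of rounds are all illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens):
    """Single-head dot-product self-attention over the second-to-last axis.

    Hypothetical simplification: no learned query/key/value projections.
    """
    d = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ tokens


def alternate_attention(x, num_rounds=2):
    """Alternate intra-view and cross-view attention on x of shape (views, tokens, dim)."""
    for _ in range(num_rounds):
        # Intra-view step: tokens attend to other tokens of the same view.
        x = x + self_attention(x)
        # Cross-view step: at each token position, views attend to each other.
        xt = np.swapaxes(x, 0, 1)          # (tokens, views, dim)
        xt = xt + self_attention(xt)
        x = np.swapaxes(xt, 0, 1)          # back to (views, tokens, dim)
    return x


rng = np.random.default_rng(0)
features = rng.standard_normal((3, 5, 8))  # 3 views, 5 tokens each, 8 channels
out = alternate_attention(features)
```

The output keeps the input shape, so blocks of this kind can be stacked; a real model would add learned projections, multiple heads, and normalization.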
Key facts
- EIGENET is a geometry-informed multi-modal framework for few-shot, novel-view RIR prediction.
- It uses a Cross-view Alternate-attention Transformer for spatial-temporal reasoning.
- A geometry-informed modulation block is inspired by acoustic ray tracing.
- An auxiliary loss enables multi-task learning for waveform prediction.
- The paper is available on arXiv with ID 2605.28101.
- The framework addresses the inverse problem of predicting RIR from sparse observations.
- It is designed for immersive spatial audio rendering.
- The architecture makes full use of multi-view multi-modal context.
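The multi-task idea behind the auxiliary loss can be sketched as a weighted sum of a waveform loss and a second loss on a quantity derived from the waveform. The summary does not say which auxiliary target EIGENET uses, so the log power spectrum below, the L1 distances, and the weight `aux_weight` are assumptions chosen only for illustration.

```python
import numpy as np


def log_power_spectrum(rir, eps=1e-8):
    """Log power spectrum of a 1-D impulse response (assumed auxiliary target)."""
    spectrum = np.fft.rfft(rir)
    return np.log(np.abs(spectrum) ** 2 + eps)


def multitask_loss(pred, target, aux_weight=0.1):
    """Return (total, waveform_loss, aux_loss) for predicted and reference RIRs.

    aux_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    waveform_loss = np.mean(np.abs(pred - target))
    aux_loss = np.mean(
        np.abs(log_power_spectrum(pred) - log_power_spectrum(target))
    )
    return waveform_loss + aux_weight * aux_loss, waveform_loss, aux_loss


# Synthetic example: exponentially decaying noise as a stand-in for an RIR.
rng = np.random.default_rng(0)
t = np.arange(256)
target = rng.standard_normal(256) * np.exp(-t / 40.0)
pred = target + 0.01 * rng.standard_normal(256)

total, wave, aux = multitask_loss(pred, target)
```

With a perfect prediction both terms vanish; otherwise the auxiliary term penalizes spectral mismatch that a purely sample-wise waveform loss may weight weakly.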
Entities
Institutions
- arXiv