AnchorSteer: Self-Discovered Concept Injection for Music Editing
AnchorSteer is a system designed for manageable music editing that alters overarching characteristics while maintaining rhythmic and melodic frameworks. It tackles the issue of semantic-structural intertwining by integrating structural anchoring with autonomously identified semantic steering. This method investigates internal representations to derive interpretable, label-free concept vectors through a self-supervised reconstruction goal, allowing for the separation of attributes without the need for curated datasets. During the editing process, these versatile concept vectors are introduced into diffusion hidden manifolds, with a structural adaptor ensuring consistency. Variants for both unconditioned and conditioned injections strike a balance between robustness and semantic strength. The experiments utilized the ZoM dataset.
Key facts
- AnchorSteer is a framework for controllable music editing.
- It modifies high-level attributes while preserving rhythmic and melodic structures.
- It disentangles semantic-structural entanglement.
- It uses self-supervised reconstruction to extract concept vectors.
- Concept vectors are injected into diffusion hidden manifolds.
- A structural adaptor enforces consistency during editing.
- Variants for unconditioned and conditioned injections are provided.
- Experiments were performed on the ZoM dataset.
Entities
—