MARI: Multi-Adapter Representation Interventions via Energy Calibration

ai-technology · 2026-05-28

arXiv:2605.28722 introduces MARI (Multi-Adapter Representation Interventions via Energy Calibration), a method for aligning large language models (LLMs) by intervening on their internal representations rather than modifying their weights. Existing representation interventions apply one fixed correction uniformly across inputs, but the authors find that the optimal intervention direction and strength vary per sample. To address this, MARI employs a competitive multi-adapter mechanism in which specialized experts capture non-linear correction patterns and adaptively determine the intervention parameters for each input. An energy-based gating module uses the model's internal propagation dynamics to identify which inputs are suitable for intervention, so that benign inputs are left alone and general capabilities do not degrade. The stated goal is alignment that is more precise and less harmful to general performance.
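The mechanism described above can be illustrated with a minimal sketch. This is not the paper's implementation: the summary does not give the energy definition, the adapter architecture, or the routing rule, so everything here is an assumption chosen for illustration. Specifically, `energy` is a stand-in proxy (mean squared change between consecutive layer states), each `Adapter` is a small tanh bottleneck, competition is modeled as a softmax over per-adapter scores, and the gate is assumed to fire when the energy is at or above a hypothetical `threshold`.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def energy(layer_states):
    # Assumed proxy for "internal propagation dynamics": the mean squared
    # change between consecutive layers' hidden states. Needs >= 2 layers.
    steps = list(zip(layer_states, layer_states[1:]))
    per_step = [
        sum((b - a) ** 2 for a, b in zip(prev, nxt)) / len(prev)
        for prev, nxt in steps
    ]
    return sum(per_step) / len(per_step)

class Adapter:
    """Hypothetical expert: a non-linear bottleneck plus a routing vector."""

    def __init__(self, down, up, router):
        self.down = down      # shape (r, d): projects into the bottleneck
        self.up = up          # shape (d, r): projects back to model width
        self.router = router  # shape (d,): scores this adapter for an input

    def correction(self, h):
        # Non-linear in h, so the correction direction varies per sample.
        hidden = [math.tanh(z) for z in matvec(self.down, h)]
        return matvec(self.up, hidden)

    def score(self, h):
        return dot(self.router, h)

def intervene(h, layer_states, adapters, threshold):
    # Gate: below the (assumed) energy threshold, leave the input untouched.
    if energy(layer_states) < threshold:
        return list(h)
    # Competition: adapters are weighted by a softmax over their scores,
    # so the effective strength also depends on the input.
    weights = softmax([a.score(h) for a in adapters])
    out = list(h)
    for weight, adapter in zip(weights, adapters):
        for i, c in enumerate(adapter.correction(h)):
            out[i] += weight * c
    return out
```

Only the hidden state `h` is changed; the base model's weights are never touched, which matches the weight-free framing in the summary. In a real model, a function like `intervene` would be applied to a chosen layer's activations during the forward pass, with the adapter parameters and threshold learned rather than hand-set.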

Key facts

arXiv:2605.28722 proposes MARI for LLM alignment.
Existing methods apply a fixed intervention uniformly across inputs.
MARI uses a competitive multi-adapter mechanism.
Specialized experts capture non-linear correction patterns.
Intervention direction and strength are adaptively determined per sample.
An energy-based gating module distinguishes inputs suitable for intervention.
The gating is meant to prevent degradation of general capabilities on benign inputs.
The method does not modify model weights.
