FineSteer Framework Enables Precise Inference-Time Control in Large Language Models
A new framework called FineSteer offers fine-grained control over large language model behavior at inference time, without any parameter updates. Developed to curb undesirable outputs such as safety violations and hallucinations, the approach decomposes steering into two complementary stages: deciding when to intervene and determining how. The first stage, Subspace-guided Conditional Steering, preserves model utility by skipping unnecessary interventions; the second, a Mixture-of-Steering-Experts mechanism, synthesizes steering vectors that capture multimodal aspects of the desired behavior. This design addresses a limitation of existing methods, which typically trade off effectiveness, utility preservation, and training efficiency against one another. By steering internal representations directly during generation, FineSteer offers a cost-effective alternative to full model retraining for behavior adjustment. The work is documented in arXiv preprint 2604.15488v1.
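The two-stage idea can be illustrated with a minimal sketch of conditional activation steering. This is not the paper's implementation: the function names, the projection-energy gate, and the softmax mixture below are all hypothetical simplifications, assuming a hidden state represented as a plain vector, a behavior subspace given by an orthonormal basis, and a small set of learned expert steering vectors.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def should_steer(h, subspace_basis, threshold=0.5):
    # Stage 1 (hypothetical gate): measure how much of the hidden state h
    # lies in the behavior subspace; intervene only when that fraction is
    # large, so unrelated inputs pass through untouched (utility preserved).
    proj_energy = sum(dot(h, b) ** 2 for b in subspace_basis)
    return proj_energy / (dot(h, h) + 1e-9) > threshold

def synthesize_steering_vector(h, experts, gate_logits):
    # Stage 2 (hypothetical mixture): combine expert steering vectors with
    # softmax gate weights, capturing multimodal aspects of the behavior.
    exps = [math.exp(g) for g in gate_logits]
    z = sum(exps)
    gates = [e / z for e in exps]
    return [sum(g * e[i] for g, e in zip(gates, experts))
            for i in range(len(h))]

def steer(h, subspace_basis, experts, gate_logits, alpha=1.0):
    # Apply the steering offset only when the gate fires.
    if not should_steer(h, subspace_basis):
        return h  # representation left unchanged
    delta = synthesize_steering_vector(h, experts, gate_logits)
    return [hi + alpha * di for hi, di in zip(h, delta)]
```

For example, with `subspace_basis = [[1.0, 0.0, 0.0]]` and two experts `[[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]` with equal gate logits, a hidden state aligned with the subspace (e.g. `[2.0, 0.0, 0.0]`) receives an offset of `[0.0, 0.5, 0.5]`, while an orthogonal state such as `[0.0, 2.0, 0.0]` is returned unmodified.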
Key facts
- FineSteer is a unified framework for fine-grained inference-time steering in LLMs
- Addresses undesirable behaviors like safety violations and hallucinations
- Decomposes steering into conditional steering and fine-grained vector synthesis
- First stage uses Subspace-guided Conditional Steering to preserve utility
- Second stage employs Mixture-of-Steering-Experts mechanism
- Allows control over when and how to steer internal representations
- arXiv preprint identifier: 2604.15488v1
- Announce type: cross