ARTFEED — Contemporary Art Intelligence

Research Demonstrates Local Linearity in LLMs Enables Activation Steering via Linear Optimal Control

ai-technology · 2026-04-22

A recent study shows that large language models (LLMs) exhibit locally linear layer-wise dynamics, a property that enables more effective activation steering at inference time. The paper, available on arXiv under identifier 2604.19018v1, demonstrates that despite the nonlinearity of transformer blocks, the dynamics of several LLM architectures are well approximated by locally linear models. LLM inference can therefore be modeled as a linear time-varying dynamical system, allowing classical linear quadratic regulator (LQR) techniques to be adapted for computing feedback controllers. Using layer-wise Jacobians, the method steers activations toward specified semantic targets with minimal computational cost and no offline training. Unlike existing interventions, which often ignore how perturbations propagate through later layers and offer no real-time error feedback, the approach is backed by both theoretical bounds and empirical evidence for local linearity, marking a notable advance in inference-time alignment strategies for LLMs.
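The mechanism summarized above, linearizing each layer with its Jacobian and running a finite-horizon LQR-style backward pass to compute feedback gains, can be sketched in a few lines. The following is a minimal illustration on a toy stand-in for a transformer; the residual tanh layers, dimensions, cost weights, and target are all assumptions for demonstration, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 6  # activation width and layer count (toy sizes)

# Toy stand-ins for transformer blocks: smooth residual maps (assumption).
Ws = [rng.normal(scale=0.5, size=(d, d)) for _ in range(T)]

def layer(t, x):
    return x + np.tanh(Ws[t] @ x)

def jacobian(t, x):
    # Layer-wise Jacobian A_t = d layer(t, x) / d x, taken analytically here.
    s = 1.0 - np.tanh(Ws[t] @ x) ** 2
    return np.eye(d) + s[:, None] * Ws[t]

# Nominal (unsteered) forward pass: the trajectory we linearize around.
nominal = [rng.normal(size=d)]
for t in range(T):
    nominal.append(layer(t, nominal[-1]))

# Hypothetical "semantic target" in final-layer activation space.
target = nominal[-1] + 0.5 * rng.normal(size=d)
d_star = target - nominal[-1]

# Backward Riccati pass for the LTV system delta_{t+1} = A_t delta_t + u_t,
# minimizing sum_t u_t' R u_t + (delta_T - d_star)' Qf (delta_T - d_star).
R, Qf = 0.1 * np.eye(d), 100.0 * np.eye(d)
V, v = Qf, -Qf @ d_star
gains = []
for t in reversed(range(T)):
    A = jacobian(t, nominal[t])
    Quu_inv = np.linalg.inv(R + V)
    gains.append((Quu_inv @ V @ A, Quu_inv @ v))  # feedback gain K_t, offset k_t
    V, v = A.T @ (V - V @ Quu_inv @ V) @ A, A.T @ (v - V @ Quu_inv @ v)
gains = gains[::-1]  # gains[t] now matches layer t

# Controlled forward pass on the true nonlinear layers, with state feedback.
x = nominal[0]
for t in range(T):
    K, k = gains[t]
    u = -K @ (x - nominal[t]) - k  # closed-loop correction at layer t
    x = layer(t, x) + u

err_steered = np.linalg.norm(x - target)
err_nominal = np.linalg.norm(nominal[-1] - target)
print(err_steered, err_nominal)  # steered endpoint should land much closer
```

Because the feedback term is recomputed from the actual activation at every layer, linearization error introduced at one layer is corrected at the next, which is the advantage over open-loop steering vectors applied without error feedback.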

Key facts

  • Research paper published on arXiv under identifier 2604.19018v1
  • Demonstrates local linearity in layer-wise dynamics of large language models
  • Enables activation steering via linear optimal control methods
  • Models LLM inference as a linear time-varying dynamical system
  • Uses layer-wise Jacobians to compute feedback controllers
  • Requires no offline training and minimal computational overhead
  • Addresses limitations of existing non-anticipative intervention methods
  • Provides theoretical bounds supporting local linearity observation
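The local-linearity observation itself is easy to probe numerically: if a layer map is smooth, the gap between f(x + εv) and the Jacobian-based prediction f(x) + εJ(x)v should shrink roughly quadratically as ε shrinks. A small sketch with a toy residual block; the block and all sizes are illustrative assumptions, not the paper's models or bounds:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
W = rng.normal(scale=0.5, size=(d, d))

def block(x):
    # Toy residual block standing in for a transformer layer (assumption).
    return x + np.tanh(W @ x)

def jac(x):
    return np.eye(d) + (1.0 - np.tanh(W @ x) ** 2)[:, None] * W

x = rng.normal(size=d)
v = rng.normal(size=d)
v /= np.linalg.norm(v)

# Error of the linear model at shrinking perturbation radii.
errs = [np.linalg.norm(block(x + eps * v) - (block(x) + eps * (jac(x) @ v)))
        for eps in (1e-1, 1e-2, 1e-3)]
ratios = [errs[i] / errs[i + 1] for i in range(2)]
print(errs, ratios)  # ratios near 100 suggest a quadratic (second-order) remainder
```

A tenfold reduction in ε cutting the error roughly a hundredfold is the signature of a second-order Taylor remainder, i.e. local linearity in the sense the paper's bounds formalize.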

Entities

Institutions

  • arXiv

Sources