ARTFEED — Contemporary Art Intelligence

Transformer Representations for Time Series Forecasting: Mechanistic Interpretability Analysis

publication · 2026-05-07

A recent arXiv preprint (2605.05151) asks whether the mechanisms behind transformer capabilities in NLP carry over to time series data. Using sparse autoencoders (SAEs) as a mechanistic-interpretability tool, the authors probe the internal representations of PatchTST. They find that a single-layer, narrow transformer matches deeper, wider configurations on standard forecasting benchmarks. Training SAEs on post-GELU intermediate FFN activations, with dictionary sizes from 0.5x to 4.0x the native dimensionality, changed downstream performance by only 0.214% on average, and a large fraction of each dictionary went unused. The authors conclude that superposition, a defining feature of transformer representations in NLP, is not required for time series forecasting, which offers a mechanistic explanation for the competitiveness of simple linear models such as DLinear.
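For readers unfamiliar with the SAE setup described above, the basic recipe can be sketched as follows: activations are encoded into a (usually wider) nonnegative dictionary code and then linearly reconstructed, with an L1 penalty encouraging sparsity. This is a minimal NumPy sketch, not the paper's implementation; the widths, initialization, and penalty weight are illustrative assumptions.

```python
import numpy as np

# Hypothetical native FFN width and a 4.0x expansion factor (the paper
# sweeps dictionary sizes from 0.5x to 4.0x the native dimensionality).
rng = np.random.default_rng(0)
d_model, expansion = 16, 4.0
d_dict = int(d_model * expansion)  # dictionary size

# SAE parameters, randomly initialized for illustration only.
W_enc = rng.normal(scale=0.1, size=(d_model, d_dict))
b_enc = np.zeros(d_dict)
W_dec = rng.normal(scale=0.1, size=(d_dict, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode activations into a nonnegative code, then reconstruct them."""
    z = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU gives a sparse-able code
    x_hat = z @ W_dec + b_dec
    return z, x_hat

def sae_loss(x, l1=1e-3):
    """Reconstruction MSE plus an L1 sparsity penalty on the code."""
    z, x_hat = sae_forward(x)
    return np.mean((x - x_hat) ** 2) + l1 * np.mean(np.abs(z))

# Stand-in batch for post-GELU FFN activations (the real ones would be
# collected from a trained PatchTST model).
acts = rng.normal(size=(32, d_model))
z, x_hat = sae_forward(acts)
print(z.shape, x_hat.shape)  # code is d_dict wide, reconstruction d_model wide
```

In practice the loss would be minimized by gradient descent over the four parameter arrays; the paper's "unused dictionary" observation corresponds to code dimensions whose activations stay at zero across the dataset.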

Key facts

  • arXiv:2605.05151
  • Transformer architectures used for time series forecasting
  • Sparse autoencoders (SAEs) applied to PatchTST
  • Single-layer narrow transformer matches deeper configurations
  • Dictionary sizes from 0.5x to 4.0x native dimensionality
  • Average performance change of 0.214%
  • Superposition not necessary for time series
  • Explains competitiveness of DLinear
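To make the DLinear comparison concrete: DLinear-style models decompose the input window into a moving-average trend and a seasonal remainder, then map each part to the forecast horizon with its own linear layer. The sketch below assumes illustrative window, horizon, and kernel sizes, with randomly initialized weights standing in for trained ones.

```python
import numpy as np

rng = np.random.default_rng(1)
lookback, horizon, kernel = 24, 8, 5  # illustrative sizes, not from the paper

def decompose(x, k=kernel):
    """Split a series into a centered moving-average trend (edge-padded)
    and the seasonal remainder, so trend + season == x exactly."""
    pad = k // 2
    xp = np.pad(x, (pad, pad), mode="edge")
    trend = np.convolve(xp, np.ones(k) / k, mode="valid")
    return trend, x - trend

# One independent linear map per component, randomly initialized here.
W_trend = rng.normal(scale=0.1, size=(lookback, horizon))
W_season = rng.normal(scale=0.1, size=(lookback, horizon))

def dlinear_forecast(x):
    """Forecast the horizon as the sum of the two linear projections."""
    trend, season = decompose(x)
    return trend @ W_trend + season @ W_season

x = np.sin(np.linspace(0, 4 * np.pi, lookback))  # toy input window
y_hat = dlinear_forecast(x)
print(y_hat.shape)
```

The point of the preprint's analysis is that if PatchTST's representations do not rely on superposition, then a model of roughly this capacity (two linear maps over a decomposed window) has little structural disadvantage, which is consistent with DLinear's strong benchmark results.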

Entities

Institutions

  • arXiv
