Language Pretraining Enables Cross-Modal Transfer to Time Series Forecasting
A preprint on arXiv (2605.20449) reports that transformers pretrained on language can be adapted to time series forecasting without paired supervision. Linear probes applied to frozen LLM states decode realistic time-series trajectories, and retrieval in the projected space yields competitive forecasts. Compared with random initialization, pretrained initialization produces coherent gradients and a distinctly anisotropic loss landscape. Fine-tuning then acts as low-dimensional alignment: it reuses directions already present in the model rather than building temporal primitives from scratch. The evidence cited is low-rank updates, subspace alignment, and shared features for periodicity, trend, and repetition.
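To make "linear probe on frozen states" and "retrieval in the projected space" concrete, here is a minimal sketch in Python with NumPy. It is not the paper's procedure: the "states" are synthetic stand-ins for frozen LLM activations, the probe is fit by ordinary ridge regression on matched pairs (a simplification of the unpaired setting the paper describes), and the names `fit_linear_probe` and `retrieve` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): real states would come from a pretrained
# LLM run with its weights frozen; here they are generated for illustration.
n, d_model, window = 512, 64, 16
t = np.arange(window)

# Target windows: sinusoids with random frequency and phase plus a linear trend.
freq = rng.uniform(0.1, 0.5, size=n)
phase = rng.uniform(0.0, 2.0 * np.pi, size=n)
slope = rng.uniform(-0.05, 0.05, size=n)
series = np.sin(freq[:, None] * t + phase[:, None]) + slope[:, None] * t

# Assumption: the states encode each window linearly, with additive noise.
mixing = rng.normal(size=(window, d_model))
states = series @ mixing + 0.1 * rng.normal(size=(n, d_model))


def fit_linear_probe(h, y, ridge=1e-2):
    """Closed-form ridge regression from states h (n, d) to targets y (n, w)."""
    gram = h.T @ h + ridge * np.eye(h.shape[1])
    return np.linalg.solve(gram, h.T @ y)


def retrieve(query_proj, bank, k=1):
    """Indices of the k rows of bank closest to query_proj (Euclidean distance)."""
    dists = np.linalg.norm(bank - query_proj, axis=1)
    return np.argsort(dists)[:k]


train, test = slice(0, 400), slice(400, n)

# Only the probe matrix W is fit; the states themselves are never updated.
W = fit_linear_probe(states[train], series[train])
decoded = states[test] @ W

probe_mse = float(np.mean((decoded - series[test]) ** 2))
baseline_mse = float(np.mean((series[train].mean(axis=0) - series[test]) ** 2))

# Retrieval in the projected space: look up the stored window nearest to the
# decoded query.
nearest = retrieve(decoded[0], series[train], k=1)
```

In a forecasting setup, the values that followed the retrieved window in its source series would serve as the prediction. Comparing `probe_mse` against `baseline_mse` (predicting the mean training window) is one simple check that the probe decodes more than the average shape.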
Key facts
- arXiv paper 2605.20449 shows cross-modal transfer from LLMs to time series.
- Linear probes on frozen LLM states decode time-series trajectories without paired supervision.
- Retrieval in projected space yields competitive forecasts.
- Pretrained initialization produces coherent gradients and an anisotropic loss landscape, unlike random initialization.
- Fine-tuning acts as low-dimensional alignment, reusing existing directions.
- Evidence includes low-rank updates, subspace alignment, and shared features for periodicity, trend, and repetition.
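The low-rank and subspace-alignment claims above can be checked with standard singular value decomposition diagnostics. The sketch below is an assumption about how such a check might look, not the paper's metrics: the weight matrices are random stand-ins, the fine-tuning update is constructed to have rank 4, and `effective_rank` and `subspace_overlap` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4

# Stand-in weights (assumption): a random "pretrained" matrix and a
# "fine-tuned" copy that differs from it by a rank-r update.
W_pre = rng.normal(size=(d, d))
A = rng.normal(size=(d, r))
B = rng.normal(size=(r, d))
W_ft = W_pre + 0.1 * (A @ B)


def effective_rank(m, energy=0.99):
    """Number of singular values needed to capture `energy` of the squared spectrum."""
    s = np.linalg.svd(m, compute_uv=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(cum, energy) + 1)


def subspace_overlap(m1, m2, k):
    """Mean squared cosine of the principal angles between top-k left singular subspaces."""
    u1 = np.linalg.svd(m1)[0][:, :k]
    u2 = np.linalg.svd(m2)[0][:, :k]
    cosines = np.linalg.svd(u1.T @ u2, compute_uv=False)
    return float(np.mean(cosines ** 2))


delta = W_ft - W_pre
rank_delta = effective_rank(delta)    # small if the update is low-rank
rank_pre = effective_rank(W_pre)      # large for a generic dense matrix
overlap = subspace_overlap(W_pre, W_ft, k=8)  # 1.0 means identical subspaces
```

An update whose effective rank is far below that of the weights it modifies, together with a high overlap between the dominant subspaces before and after fine-tuning, is the kind of pattern that would be consistent with fine-tuning reusing existing directions.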
Entities
Institutions
- arXiv