Random Embeddings Boost LLM Reasoning Without Training
A new arXiv preprint (2605.11936) reports that inserting random embedding vectors into large language model inputs can improve reasoning performance as effectively as trained soft prompts. The study introduces Random Soft Prompts (RSPs), which replace learned vectors with sequences sampled from an isotropic Gaussian fitted to the statistics of the pretrained embedding table. Despite carrying no learned content, RSPs achieve accuracy comparable to optimized soft prompts on math reasoning benchmarks. The proposed mechanism has two stages: first, attention to the novel random positions flattens the next-token distribution and diversifies reasoning trajectories; then, as generation progresses, the influence of the random vectors naturally dilutes, allowing the model to commit to a response. This finding suggests that the act of injection itself, rather than any learned content, may drive the performance gains.
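The sampling procedure described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's code: it assumes "isotropic Gaussian fitted to the embedding table's statistics" means a single scalar mean and standard deviation computed over all table entries, and it uses a synthetic stand-in for the pretrained embedding table.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained embedding table (vocab_size x d_model).
# In a real model this would come from the input embedding layer.
vocab_size, d_model, n_prompt = 1000, 64, 8
embedding_table = rng.normal(loc=0.1, scale=0.02, size=(vocab_size, d_model))

def sample_random_soft_prompt(table, n_vectors, rng):
    """Sample RSP vectors from an isotropic Gaussian fitted to the table.

    Fitting a scalar mean/std over all entries is an assumption about
    the paper's procedure; per-dimension statistics are also plausible.
    """
    mu = table.mean()      # scalar mean over all table entries
    sigma = table.std()    # scalar standard deviation over all entries
    return rng.normal(mu, sigma, size=(n_vectors, table.shape[1]))

# Draw untrained random "soft prompt" vectors.
rsp = sample_random_soft_prompt(embedding_table, n_prompt, rng)

# Prepend them to the embeddings of an (arbitrary) input token sequence,
# exactly where trained soft-prompt vectors would normally go.
token_ids = rng.integers(0, vocab_size, size=16)
inputs_embeds = np.concatenate([rsp, embedding_table[token_ids]], axis=0)
print(inputs_embeds.shape)  # (24, 64)
```

The key point the sketch makes concrete is that no optimization step appears anywhere: the prompt vectors are drawn once from the fitted Gaussian and passed to the model as-is.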
Key facts
- arXiv:2605.11936
- Random Soft Prompts (RSPs) use no training
- RSP vectors sampled from isotropic Gaussian fitted to embedding table
- Accuracy comparable to optimized soft prompts on math reasoning
- Two-stage mechanism: initial flattening then dilution
- Attention to random position flattens token distribution
- Reasoning trajectories branch before committing
- Published on arXiv
Entities
Institutions
- arXiv