ARTFEED — Contemporary Art Intelligence

Transformers' Symbolic Reasoning Limited by Representational Collapse

ai-technology · 2026-04-25

A new study on arXiv investigates why decoder-only transformer models struggle with abstract symbolic reasoning, specifically propositional logic problems given in-context. Previous work found that models fail to generalize to problems with unseen variable names, partly due to difficulty in copying unseen tokens. The new research reveals a key additional factor: representational collapse, where the unembeddings (last-layer weights) of unseen tokens converge to nearly identical vectors during training. This collapse makes it hard for the model to distinguish multiple unseen variables, especially when embedding and unembedding parameters are shared. The finding provides a mechanistic explanation for the effectiveness of heuristic interventions like 'active forgetting', which periodically resets token embeddings. The study combines theoretical analysis with empirical evidence, offering insights into the limitations of current transformer architectures in symbolic reasoning tasks.
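The collapse the study describes can be made concrete with a toy diagnostic. The sketch below (not from the paper; all sizes and data are hypothetical) measures the mean pairwise cosine similarity among a set of unembedding rows: for tokens whose unembeddings have collapsed, the rows point in nearly the same direction and the score approaches 1.0, which is why the model cannot tell multiple unseen variables apart.

```python
import numpy as np

def mean_pairwise_cosine(W: np.ndarray, token_ids: list[int]) -> float:
    """Mean pairwise cosine similarity among the unembedding rows for
    the given token ids; values near 1.0 indicate representational collapse."""
    V = W[token_ids]
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    sims = V @ V.T
    n = len(token_ids)
    # average only the off-diagonal entries
    return float((sims.sum() - n) / (n * (n - 1)))

rng = np.random.default_rng(0)
d = 64
# "seen" tokens: independent random directions -> low mutual similarity
seen = rng.standard_normal((8, d))
# "unseen" tokens: tiny perturbations of one shared vector -> collapsed
base = rng.standard_normal(d)
unseen = base + 0.01 * rng.standard_normal((8, d))
W = np.vstack([seen, unseen])

print(mean_pairwise_cosine(W, list(range(0, 8))))   # low for the seen set
print(mean_pairwise_cosine(W, list(range(8, 16))))  # close to 1.0 for the collapsed set
```

In a real model the same function could be applied to the rows of the output projection matrix for tokens held out of training.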

Key facts

  • Study investigates decoder-only transformer models' ability to perform abstract symbolic reasoning
  • Focuses on solving propositional logic reasoning problems given in-context
  • Previous work showed models fail to generalize to problems with unseen variable names
  • One reason was difficulty in copying unseen tokens
  • New finding: representational collapse of unseen tokens' unembeddings
  • Unembeddings of unseen tokens collapse to nearly the same vector during training
  • Collapse makes distinguishing multiple unseen variables difficult
  • Provides mechanistic explanation for 'active forgetting' heuristic
  • Combines theoretical and empirical evidence
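The 'active forgetting' heuristic in the facts above amounts to a training schedule that periodically re-initializes the token embeddings while leaving the rest of the model's weights untouched. This toy sketch (not the paper's code; sizes are hypothetical and gradient updates are elided) shows only the reset schedule itself:

```python
import numpy as np

def train_with_active_forgetting(steps: int, reset_every: int, seed: int = 0):
    """Toy loop illustrating the 'active forgetting' schedule: every
    `reset_every` steps the token embeddings are re-initialized while
    the rest of the model's weights are left untouched."""
    rng = np.random.default_rng(seed)
    vocab, d = 100, 32                               # hypothetical sizes
    emb = rng.standard_normal((vocab, d)) * 0.02     # token embeddings
    body = rng.standard_normal((d, d)) * 0.02        # stand-in for all other weights
    resets = 0
    for step in range(1, steps + 1):
        # (a real loop would apply gradient updates to emb and body here)
        if step % reset_every == 0:
            # forget: fresh random embeddings, other weights untouched
            emb = rng.standard_normal((vocab, d)) * 0.02
            resets += 1
    return emb, body, resets

emb, body, resets = train_with_active_forgetting(steps=200, reset_every=50)
print(resets)  # number of embedding resets performed
```

The study's contribution is a mechanistic account of why this helps: resets prevent the unseen-token unembeddings from settling into a single collapsed direction.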

Entities

Institutions

  • arXiv

Sources