ARTFEED — Contemporary Art Intelligence

Transformer Models Show Dual Geometry in Concept Representation

ai-technology · 2026-05-06

A recent study posted to arXiv investigates whether the causal inner product, derived from the covariance of the unembedding matrix, supports transporting concepts across languages in transformer models. The analysis covers 17 models across four language pairs and finds that Whitened Causal Alignment is statistically indistinguishable from spectral regularization (p = 0.95, i.e., no detectable difference). Across five model families, residual-stream difference-of-means vectors show significant anti-concentration (p < 10^{-33}), corroborated by SAE feature analysis (p = 4.5 × 10^{-19}). The study also reports a dual geometry, with concept vectors anti-concentrating in activation space while the corresponding unembedding rows concentrate, and split-injection causal interventions on Gemma and Llama models yield effect sizes up to Cohen's d = 1.8.
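
To make the central object concrete, below is a minimal sketch of the causal inner product as defined in Park et al. (2024): the inner product induced by the inverse covariance of the unembedding rows. The unembedding matrix here is random stand-in data; the shapes are illustrative, not taken from any model in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.normal(size=(32_000, 512))   # stand-in unembedding matrix (vocab, d_model)

# Covariance of the unembedding rows, treating tokens as samples.
C = np.cov(U, rowvar=False)          # (d_model, d_model)
C_inv = np.linalg.inv(C)

def causal_inner_product(x, y):
    """<x, y>_C = x^T Cov(U)^{-1} y  (Park et al., 2024)."""
    return x @ C_inv @ y

# Equivalent view: whiten with Cov(U)^{-1/2}; the ordinary dot product of
# whitened vectors equals the causal inner product.
evals, evecs = np.linalg.eigh(C)
W = evecs @ np.diag(evals ** -0.5) @ evecs.T   # Cov(U)^{-1/2}

x, y = rng.normal(size=512), rng.normal(size=512)
assert np.isclose(causal_inner_product(x, y), (W @ x) @ (W @ y))
```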

Key facts

  • Study tests cross-lingual concept transport using the causal inner product of Park et al. (2024)
  • 17 models and 4 language pairs tested
  • Whitened Causal Alignment indistinguishable from spectral regularization (p = 0.95; a candidate alignment step is sketched after this list)
  • Anti-concentration observed in residual-stream difference-of-means vectors (p < 10^{-33}; vector construction sketched after this list)
  • SAE features support anti-concentration (p = 4.5 × 10^{-19})
  • Linear probes on Gemma and Llama confirm findings (probe setup sketched after this list)
  • Dual geometry: concept directions anti-concentrate in activation space but their unembedding rows concentrate (p < 10^{-4})
  • Split-injection causal interventions support a functional basis (Cohen's d up to 1.8; effect-size computation sketched after this list)
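
This summary does not spell out what Whitened Causal Alignment computes, so the sketch below is one plausible reading rather than the paper's method: whiten paired concept vectors with the Cov(U)^{-1/2} map from the causal-inner-product sketch above, then fit an orthogonal map between the two languages by orthogonal Procrustes. All shapes, names, and data here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 128, 60   # hidden width / paired concepts (illustrative)

# Assume Xw and Yw are already-whitened concept vectors for a language pair:
# row i of each is the same concept in the source / target language. Here we
# synthesize them as rotations of each other plus noise.
R_true = np.linalg.qr(rng.normal(size=(d, d)))[0]
Xw = rng.normal(size=(k, d))
Yw = Xw @ R_true + 0.05 * rng.normal(size=(k, d))

# Orthogonal Procrustes: minimize ||Xw @ R - Yw||_F over orthogonal R,
# solved in closed form from the SVD of the cross-covariance.
u, _, vt = np.linalg.svd(Xw.T @ Yw)
R = u @ vt

rel_err = np.linalg.norm(Xw @ R - Yw) / np.linalg.norm(Yw)
print(f"relative alignment error: {rel_err:.3f}")   # small if a rotation fits
```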
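
The anti-concentration result concerns difference-of-means concept vectors in the residual stream. Below is a minimal sketch of the construction on synthetic activations, with pairwise cosine similarity as a stand-in concentration proxy; the paper's actual test statistic is not given in this summary.

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, n = 256, 40, 500   # stream width / concepts / samples (illustrative)

dirs = []
for _ in range(k):
    shift = rng.normal(size=d)                # synthetic concept effect
    pos = rng.normal(size=(n, d)) + shift     # activations, concept present
    neg = rng.normal(size=(n, d))             # activations, concept absent
    v = pos.mean(axis=0) - neg.mean(axis=0)   # difference-of-means vector
    dirs.append(v / np.linalg.norm(v))
V = np.stack(dirs)                            # (k, d), unit rows

# Anti-concentration proxy: pairwise |cosine| stays near the ~1/sqrt(d)
# scale of random directions instead of piling up near a shared direction.
G = np.abs(V @ V.T)
np.fill_diagonal(G, 0.0)
print(f"max pairwise |cos|: {G.max():.3f}  (random-direction scale ~ {d**-0.5:.3f})")
```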
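
The Gemma and Llama confirmation uses linear probes. A generic probe setup on synthetic data is sketched below, assuming scikit-learn; in the study the features would be model activations and the labels concept presence.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
d, n = 256, 2_000   # illustrative sizes

# Synthetic stand-in for residual-stream activations labeled by concept:
# examples with the concept (y = 1) are shifted along a fixed direction.
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d)) + np.outer(y, rng.normal(size=d))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
probe = LogisticRegression(max_iter=1_000).fit(X_tr, y_tr)
print(f"probe accuracy: {probe.score(X_te, y_te):.3f}")  # high if linearly decodable
```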
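
The reported effect sizes are Cohen's d values. For reference, the standard pooled-standard-deviation computation, applied here to synthetic intervention scores (the study's actual outcome metric is not stated in this summary):

```python
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference with a pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled)

rng = np.random.default_rng(4)
# Hypothetical per-prompt scores under a split-injection run vs. a control run.
injected = rng.normal(loc=1.8, scale=1.0, size=200)
control = rng.normal(loc=0.0, scale=1.0, size=200)
print(f"Cohen's d ≈ {cohens_d(injected, control):.2f}")  # ~1.8 by construction
```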

Entities

Institutions

  • arXiv

Sources