ARTFEED — Contemporary Art Intelligence

Hierarchical Language Models Show Predictable Scaling and Reasoning Benefits

publication · 2026-05-14

A new arXiv paper (2605.13687) introduces synthetic languages with hierarchical structure, generated by a broadcast process on trees, to enable precise analysis of how context length and reasoning affect autoregressive generation. The authors propose an exact k-gram ansatz as a tractable substitute for a transformer with context length k, and validate it empirically. For the Ising broadcast process, they prove that the variance of generated sums scales log-linearly with context depth and that the kurtosis converges to its Gaussian value, so bounded-context models deviate from the true language when the context length is sublinear. For the coloring broadcast process in the freezing regime, bounded-context models likewise show predictable deviations.
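To make the construction concrete, here is a minimal sketch of an Ising-style broadcast process on a binary tree: a ±1 spin at the root is copied down the tree, flipping with some probability at each edge, and the leaves form the generated sequence. The branching factor, flip probability, and function names here are illustrative assumptions, not the paper's exact construction.

```python
import random

def ising_broadcast(depth, flip_p, rng=None):
    """Broadcast +/-1 spins down a binary tree of the given depth.
    Each child copies its parent's spin, flipping with probability
    flip_p. Returns the 2**depth leaf spins (the 'sentence').
    Illustrative sketch; parameters are assumptions."""
    rng = rng or random.Random(0)
    level = [rng.choice([-1, 1])]  # root spin
    for _ in range(depth):
        nxt = []
        for spin in level:
            for _ in range(2):  # two children per node
                nxt.append(-spin if rng.random() < flip_p else spin)
        level = nxt
    return level

def sum_variance(depth, flip_p, trials=2000, seed=1):
    """Empirical variance of the leaf sum over many sampled trees,
    the statistic whose scaling the paper analyzes."""
    rng = random.Random(seed)
    sums = [sum(ising_broadcast(depth, flip_p, rng)) for _ in range(trials)]
    mean = sum(sums) / trials
    return sum((s - mean) ** 2 for s in sums) / trials
```

Comparing `sum_variance` across depths for a bounded-context model versus the full tree process is the kind of distributional test the paper formalizes.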

Key facts

  • Paper introduces synthetic languages with hierarchical structure via broadcast process on trees
  • Exact k-gram ansatz substitutes for transformers with context length k
  • Ising broadcast process: variance of generated sums scales log-linearly with context depth; kurtosis converges to its Gaussian value
  • Coloring broadcast process analyzed in freezing regime
  • Predictable scaling laws for distributional statistics
  • Empirical validation of the ansatz
  • Provable benefits of reasoning in autoregressive generation
  • arXiv preprint 2605.13687
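The k-gram ansatz treats a context-k transformer as if it predicted each token from the exact conditional distribution given only the previous k tokens. A counting-based order-k Markov estimator captures the idea; this sketch is an illustration of that substitution, not the paper's implementation.

```python
from collections import Counter, defaultdict

def fit_kgram(sequences, k):
    """Estimate P(next token | last k tokens) by counting, i.e. an
    order-k Markov model standing in for a transformer with context
    length k. Illustrative sketch of the k-gram ansatz."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(k, len(seq)):
            counts[tuple(seq[i - k:i])][seq[i]] += 1
    return {ctx: {tok: c / sum(ctr.values()) for tok, c in ctr.items()}
            for ctx, ctr in counts.items()}

# Usage: a strictly alternating sequence is fully captured at k = 1.
model = fit_kgram([[0, 1, 0, 1, 0, 1]], k=1)
# model[(0,)] == {1: 1.0} and model[(1,)] == {0: 1.0}
```

Sampling autoregressively from such a model and comparing statistics such as the variance of token sums against the true broadcast process is how the bounded-context deviations above would be measured.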

Entities

Institutions

  • arXiv

Sources