ARTFEED — Contemporary Art Intelligence

Probabilistic Circuits vs LLMs: Expressivity Gap Analysis

other · 2026-05-14

A recent study published on arXiv compares Probabilistic Circuits (PCs) with Transformer-based large language models (LLMs) under a unified autoregressive framework and identifies two limitations that explain why PCs lag behind in language modeling. The first is an output bottleneck: PCs form next-token predictions as convex combinations in probability space, a representation that struggles to express the sharp distributions typical of natural language; reparameterizing the output in logit space substantially narrows this gap. The second is a context-encoding bottleneck: structured-decomposable PCs can match the separation rank of Transformers, but only on partitions aligned with their vtree, so their capacity is tied to a fixed routing structure and degrades severely when dependencies are heterogeneous. The study frames these as both theoretical and empirical constraints on PCs as language models.
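
To make the output bottleneck concrete, the following is a minimal numerical sketch, not taken from the paper: the three-token vocabulary, the logits, and the equal mixture weights are illustrative assumptions. It shows that a convex combination of probability vectors, as computed by a PC sum node, can never be more peaked than its sharpest component, whereas combining the same components in logit space can concentrate additional mass on the shared mode.

    # Illustrative sketch (assumed toy numbers, not from the preprint):
    # probability-space mixtures cap the peak probability at the sharpest
    # component, while logit-space mixing can exceed that cap.
    import numpy as np

    def softmax(z):
        z = z - z.max()          # subtract max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    # Two components over a 3-token vocabulary; both favour token 0,
    # but each wastes residual mass on a different distractor token.
    logits_a = np.array([5.0, 0.0, 3.0])
    logits_b = np.array([5.0, 3.0, 0.0])
    p_a, p_b = softmax(logits_a), softmax(logits_b)

    # Probability-space mixture (what a sum node computes):
    # p_mix(x) <= max(p_a(x), p_b(x)) for every token x, so the mixture
    # is never sharper than its sharpest component.
    p_prob_mix = 0.5 * p_a + 0.5 * p_b

    # Logit-space combination: average logits, then renormalise.
    # The distractor mass cancels and the peak rises above both components.
    p_logit_mix = softmax(0.5 * (logits_a + logits_b))

    print("component peaks:        ", p_a.max(), p_b.max())  # ~0.88 each
    print("probability-space peak: ", p_prob_mix.max())      # ~0.88
    print("logit-space peak:       ", p_logit_mix.max())     # ~0.94

In this toy case both components place about 0.88 of their mass on the same token, the probability-space mixture stays at 0.88, and the logit-space combination reaches roughly 0.94, illustrating why a logit-space parameterization can escape the ceiling imposed by probability-space mixtures.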

Key facts

  • Probabilistic Circuits (PCs) are deep generative models supporting exact probabilistic inference.
  • PCs lag behind Transformer-based LLMs in autoregressive language modeling.
  • Output bottleneck: PCs use convex combinations in probability space, struggling with sharp distributions.
  • Logit-space parameterization substantially narrows the output bottleneck.
  • Context-encoding bottleneck: structured-decomposable PCs match Transformer separation rank only on vtree-aligned partitions (separation rank is sketched after this list).
  • PC capacity is limited to partitions aligned with the fixed routing structure (the vtree).
  • Heterogeneous dependencies cause severe degradation in PC performance.
  • The study appears as arXiv preprint 2605.12940v1.
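
For context, separation rank measures how much interaction a function encodes across a split of its inputs; the definition below follows the standard formulation used in the expressivity literature and is included for orientation, not quoted from the preprint.

    \[
    \mathrm{sep}_{A,B}(f) \;=\; \min\Big\{\, R \;:\; f(x_A, x_B) = \sum_{r=1}^{R} g_r(x_A)\, h_r(x_B) \,\Big\}
    \]

A function that factorizes across the partition (A, B) has separation rank 1, while high separation rank indicates strong dependence between the two sides. The claim summarized above is that structured-decomposable PCs attain Transformer-level separation rank only for partitions that agree with their vtree; for other partitions the fixed routing structure caps it, which is the context-encoding bottleneck.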

Entities

Institutions

  • arXiv

Sources