Cross-Entropy Removal Tested in K-Way Energy Probe on Predictive Coding Networks
A new pre-registered study examines how the K-way energy probe behaves in predictive coding networks when cross-entropy loss is removed. Cacioli (2026) showed that, under five assumptions that include cross-entropy at the output and feedforward inference, the probe reduces to a monotone function of the log-softmax margin. The study tests how sensitive that reduction is to removing cross-entropy in two conditions: standard predictive coding (PC) trained with mean squared error instead of cross-entropy, and bidirectional predictive coding (bPC) as described by Oliviers, Tang, and Bogacz (2025). Across 10 seeds on CIFAR-10 with a 2.1-million-parameter backbone, the probe falls below softmax in standard PC (Delta = -0.082, p < 10^-6) and exceeds softmax in bPC on all 10 seeds (Delta = +0.008, p = 0.000027). At this scale, however, bPC does not produce significantly greater latent movement than standard PC. The full study is available as arXiv:2604.21286v1.
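To make the quantity behind the reduction concrete, here is a minimal Python sketch of a log-softmax margin. It assumes the margin is the gap between the largest and second-largest log-probabilities; the paper may define it differently (for example, relative to the true class). The function names and the example logits are hypothetical, and the sketch does not implement the energy probe itself.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of K logits."""
    m = max(logits)
    # Subtract the max before exponentiating to avoid overflow.
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def log_softmax_margin(logits):
    """Gap between the top two log-probabilities.

    Assumed definition for illustration only; not taken from the paper.
    """
    top, runner_up = sorted(log_softmax(logits), reverse=True)[:2]
    return top - runner_up

# Hypothetical logits for a 3-way example.
example_logits = [2.0, 0.5, -1.0]
print(log_softmax_margin(example_logits))
```

Under this assumed definition the normalizer cancels, so the margin equals the gap between the two largest logits (about 1.5 for the example above).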
Key facts
- Cacioli (2026) showed K-way energy probe reduces to monotone function of log-softmax margin.
- Reduction rests on five assumptions including cross-entropy and feedforward inference.
- Pre-registered study tests sensitivity to CE removal.
- Two conditions: standard PC with MSE, and bidirectional PC (bPC).
- 10 seeds on CIFAR-10 with 2.1M-parameter backbone.
- Standard PC: probe below softmax (Delta = -0.082, p < 10^-6).
- bPC: probe exceeds softmax across all 10 seeds (Delta = +0.008, p = 0.000027).
- bPC does not produce greater latent movement than standard PC (ratio 1.6).
Entities
Institutions
- arXiv