Forward-Forward Networks Suffer Layer Free-Riding, Remedies Found
A new paper on arXiv (2605.06240) identifies and addresses a flaw in cumulative-goodness variants of Forward-Forward (FF) neural networks. The authors formalize 'layer free-riding': later layers inherit partially separated tasks from earlier layers, so the class-discrimination gradient they receive decays exponentially with the positive margin already accumulated. They propose three local remedies (per-block, hardness-gated, and depth-scaled) that recover separation without backpropagated gradients. On CIFAR-10 and CIFAR-100, these fixes improve layer-separation statistics by 4× to 45× in deeper layers, while accuracy changes by less than one percentage point for non-degenerate training. Tiny ImageNet provides a cross-dataset check. The work suggests layer free-riding is real and repairable, but that fixing it mainly improves internal separation rather than final accuracy.
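To make the decay mechanism concrete, here is a minimal illustration assuming a softplus-style cumulative-goodness objective; the notation (per-block goodness g_ℓ, threshold θ, pass label y) is chosen for exposition and may not match the paper's exact formulation.

```latex
% Assumed cumulative-goodness loss for block k on one sample:
%   y = +1 for a positive pass, y = -1 for a negative pass.
\[
  \mathcal{L}_k
  = \log\!\Bigl(1 + \exp\bigl(-y\,(\textstyle\sum_{\ell \le k} g_\ell - \theta)\bigr)\Bigr),
  \qquad
  \frac{\partial \mathcal{L}_k}{\partial g_k}
  = -\,y\,\sigma\!\bigl(-y\,(\textstyle\sum_{\ell \le k} g_\ell - \theta)\bigr).
\]
% With accumulated margin m = y(sum_{l<=k} g_l - theta), the gradient magnitude
% is sigma(-m), which behaves like exp(-m) once m is large and positive:
\[
  \Bigl|\frac{\partial \mathcal{L}_k}{\partial g_k}\Bigr| = \sigma(-m) \approx e^{-m}
  \quad \text{for } m = y\Bigl(\sum_{\ell \le k} g_\ell - \theta\Bigr) \gg 0 .
\]
```

In words: once earlier blocks have pushed the accumulated margin well past the threshold, later blocks see a nearly saturated loss and receive almost no pressure to separate the classes themselves.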
Key facts
- Forward-Forward (FF) training uses a local goodness criterion per layer.
- Cumulative-goodness variants cause layer free-riding.
- The class-discrimination gradient decays exponentially with the positive margin accumulated by preceding blocks.
- Three local remedies: per-block, hardness-gated, depth-scaled (see the sketch after this list).
- On CIFAR-10 and CIFAR-100, the fixes improve layer-separation statistics by 4× to 45× in deeper layers.
- Accuracy changes by less than one percentage point.
- Tiny ImageNet used as cross-dataset check.
- Paper available at arXiv:2605.06240.
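The list above names the three remedies without spelling out their form. Below is a hypothetical PyTorch-style sketch of how such per-block goodness losses might look; the function name, the softplus objective, the threshold theta, and the exact gating and depth-scaling rules are assumptions for illustration, not the paper's definitions.

```python
# Hypothetical sketch of the three local remedies as per-block Forward-Forward
# goodness losses. All names and exact forms below are assumptions.
import torch
import torch.nn.functional as F

def ff_block_losses(goodness, sign, theta=2.0, mode="per_block"):
    """Return one local loss per block (no backpropagation across blocks).

    goodness: list of per-block goodness tensors, each shaped (batch,),
              e.g. the mean squared activation of that block.
    sign:     (batch,) tensor, +1 for positive-pass and -1 for negative-pass
              samples, as in standard Forward-Forward training.
    """
    losses, cumulative = [], torch.zeros_like(goodness[0])
    for depth, g in enumerate(goodness, start=1):
        if mode == "per_block":
            # Each block is scored on its own margin only, so an earlier
            # block's accumulated margin cannot saturate this block's loss.
            loss = F.softplus(-sign * (g - theta))
        elif mode == "hardness_gated":
            # Keep the cumulative objective but re-weight it: the gate is
            # large only for samples that earlier blocks have not yet
            # separated, restoring gradient where it is still needed.
            gate = torch.sigmoid(-sign * (cumulative - theta)).detach()
            loss = gate * F.softplus(-sign * (cumulative + g - theta))
        elif mode == "depth_scaled":
            # Raise the threshold with depth so each block must add fresh
            # margin instead of free-riding on accumulated goodness.
            loss = F.softplus(-sign * (cumulative + g - depth * theta))
        else:
            raise ValueError(f"unknown mode: {mode}")
        losses.append(loss.mean())
        cumulative = cumulative + g.detach()  # no gradient across blocks
    return losses
```

In use, each block's loss would be minimized by its own optimizer with no gradient flowing between blocks, preserving the local-training character of Forward-Forward.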