Auxiliary Particle Power Sampling Boosts LLM Reasoning at Inference Time
A new technique called Auxiliary Particle Power Sampling (APPS) improves the multi-step reasoning of large language models (LLMs) at inference time, without any additional training. It addresses a known limitation: base LLMs often assign meaningful probability to correct answers yet struggle to surface them efficiently during decoding. APPS uses a blockwise particle algorithm to approximately sample from the sequence-level power distribution p_theta(x)^alpha with alpha > 1, which sharpens the model's own distribution toward its high-probability sequences. The method propagates multiple hypotheses in parallel using proposal-corrected power reweighting, and biases which hypotheses survive through future-value-guided selection at resampling points. In effect, it reallocates a fixed compute budget across competing prefixes rather than committing to a single decoding trajectory. The work, detailed in the paper on arXiv (2605.02427v1), presents APPS as a systematic way to steer decoding toward favorable outcomes.
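For concreteness, the target and the weight update can be written out explicitly. The following is a standard sequential Monte Carlo reading of the description above; the block notation and the special case q = p_theta are textbook identities assumed here, not formulas quoted from the paper:

$$\pi_\alpha(x) \;\propto\; p_\theta(x)^{\alpha}, \qquad \alpha > 1,$$

$$w_k^{(i)} \;\propto\; w_{k-1}^{(i)}\,\frac{p_\theta\big(b_k^{(i)} \mid x_{<k}^{(i)}\big)^{\alpha}}{q\big(b_k^{(i)} \mid x_{<k}^{(i)}\big)},$$

where b_k^{(i)} is the k-th block generated by particle i, x_{<k}^{(i)} is its prefix, and q is the proposal distribution. When the proposal is the base model itself (q = p_theta), the correction reduces to p_theta(b_k^{(i)} | x_{<k}^{(i)})^{alpha - 1}, so setting alpha above 1 upweights the blocks the model itself considers most likely.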
Key facts
- APPS stands for Auxiliary Particle Power Sampling
- Targets p_theta(x)^alpha with alpha > 1
- Uses a blockwise particle algorithm (see the code sketch after this list)
- Propagates hypotheses in parallel
- Employs proposal-corrected power reweighting
- Refines survival via future-value-guided selection
- Redistributes compute across competing prefixes
- Published on arXiv with ID 2605.02427v1
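A minimal sketch of the blockwise particle loop, assuming a generic autoregressive model interface; the names `propose_block`, `block_logprob`, and `future_value`, the systematic resampler, and the auxiliary-particle residual correction are illustrative assumptions, not the paper's implementation:

```python
import math
import random

def systematic_resample(weights):
    """Low-variance systematic resampling: returns n indices drawn
    in proportion to the (unnormalized) weights."""
    n = len(weights)
    total = sum(weights)
    u = random.random() / n
    indices, cum, j = [], weights[0] / total, 0
    for k in range(n):
        pos = u + k / n
        while pos > cum and j < n - 1:
            j += 1
            cum += weights[j] / total
        indices.append(j)
    return indices

def apps_sample(model, prompt, alpha=2.0, n_particles=8,
                block_len=16, n_blocks=8):
    """Blockwise particle approximation to sampling from
    p_theta(x)^alpha, alpha > 1 (hypothetical model interface)."""
    particles = [list(prompt) for _ in range(n_particles)]
    log_w = [0.0] * n_particles
    for _ in range(n_blocks):
        for i in range(n_particles):
            # Propose the next block of tokens from a proposal q;
            # propose_block returns the block and log q(block | prefix).
            block, log_q = model.propose_block(particles[i], block_len)
            # Proposal-corrected power reweighting:
            #   w *= p_theta(block | prefix)^alpha / q(block | prefix)
            log_p = model.block_logprob(particles[i], block)
            log_w[i] += alpha * log_p - log_q
            particles[i] = particles[i] + block
        # Future-value-guided selection: resample with weights tilted
        # by a look-ahead value estimate so that promising prefixes
        # survive, as in an auxiliary particle filter.
        values = [model.future_value(p) for p in particles]
        guided = [lw + v for lw, v in zip(log_w, values)]
        m = max(guided)
        idx = systematic_resample([math.exp(g - m) for g in guided])
        particles = [particles[i] for i in idx]
        # Residual weight w / (w * e^v) = e^{-v}: dividing the guidance
        # back out keeps the ensemble targeted at p_theta^alpha.
        log_w = [-values[i] for i in idx]
    # Return one completed sequence, drawn from the final weights.
    m = max(log_w)
    probs = [math.exp(lw - m) for lw in log_w]
    return random.choices(particles, weights=probs, k=1)[0]
```

The residual correction after guided selection is what keeps the weighted ensemble targeted at p_theta^alpha rather than at the value guide, which is presumably the sense in which the method's name echoes the auxiliary particle filter.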