Empirical Audit of k-NAF Budget Accounting in Anchored Decoding

other · 2026-05-28

This empirical audit examines the k-NAF budget-accounting mechanism in Anchored Decoding. It uses a fixed, stratified workload of roughly 8,500 randomized executions across six prompt classes, together with an adaptive prompt-search strategy that targets high proxy spend ratios. On the fixed workload, the mean cumulative KL spend is well below the sequence-level budgets K in {600, 1000}, and an empirical Bernstein-style proxy stays below K for every class. The surface-overlap diagnostics (ROUGE-L and 5-gram Jaccard) are likewise small. Adaptive search raises the proxy spend ratio but does not produce evident budget exhaustion. In a separate copyright-domain workload at k = 3, some prompts show proxy ratios above 1 under early-stopped evaluation with small sample sizes. Re-evaluating these prompts with a larger allocation brings the proxy ratio down to the range [0.26, 0.40] at similar mean spend, which is consistent with the elevated ratios being proxy artifacts.
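To illustrate why a Bernstein-style proxy can exceed the budget at small sample sizes and fall back under it with more samples, here is a minimal Python sketch. The summary does not give the study's exact proxy, so this uses the standard Maurer-Pontil empirical Bernstein upper bound as a stand-in. The function names, the confidence parameter `delta`, and the assumption that per-run spend lies in [0, K] are all hypothetical choices made for illustration.

```python
import math
import statistics


def bernstein_proxy(spends, spend_range, delta=0.05):
    """Empirical Bernstein upper bound on the mean spend (Maurer-Pontil form).

    Assumption: each spend lies in an interval of width `spend_range`.
    This is a stand-in for the study's proxy, not its actual definition.
    """
    n = len(spends)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(spends)
    var = statistics.variance(spends)  # unbiased sample variance
    log_term = math.log(2.0 / delta)
    # Variance-dependent term shrinks like 1/sqrt(n).
    variance_term = math.sqrt(2.0 * var * log_term / n)
    # Range-dependent term shrinks like 1/n and dominates when n is small.
    range_term = 7.0 * spend_range * log_term / (3.0 * (n - 1))
    return mean + variance_term + range_term


def proxy_ratio(spends, budget, delta=0.05):
    """Ratio of the proxy to the budget K; values above 1 flag apparent overspend."""
    # Assumption: spend is bounded by the budget, so the range equals K.
    return bernstein_proxy(spends, spend_range=budget, delta=delta) / budget


if __name__ == "__main__":
    # Synthetic spends, not data from the study.
    few = [200.0] * 5
    many = [200.0] * 1000
    print(proxy_ratio(few, budget=600.0))
    print(proxy_ratio(many, budget=600.0))
```

With the same mean spend, the small sample yields a ratio above 1 purely because of the range term, while the large sample does not. This is the kind of small-sample effect the re-evaluation result is consistent with, under the assumptions stated above.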

Key facts

Study audits k-NAF budget accounting in Anchored Decoding
Fixed workload: ~8,500 randomized executions across six prompt classes
Adaptive prompt-search targets high proxy spend ratios
Mean cumulative KL spend below budgets K in {600, 1000}
Bernstein-style proxy stays below K for every class
Surface-overlap diagnostics (ROUGE-L, 5-gram Jaccard) are small
Adaptive search increases proxy ratio but shows no evident budget exhaustion
Copyright-domain workload at k=3 shows proxy ratios above 1 under early stopping
Re-evaluation with larger allocation reduces proxy ratio to [0.26, 0.40]
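
For reference, the 5-gram Jaccard diagnostic named above can be sketched in a few lines of Python. This is a minimal version that assumes whitespace tokenization; the summary does not say how the study tokenizes text, and the function names are hypothetical.

```python
def ngrams(tokens, n=5):
    """Return the set of contiguous n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard_ngram(text_a, text_b, n=5):
    """Jaccard similarity between the n-gram sets of two texts.

    Assumption: tokens are whitespace-separated words.
    Returns 0.0 when neither text contains any n-gram.
    """
    grams_a = ngrams(text_a.split(), n)
    grams_b = ngrams(text_b.split(), n)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


if __name__ == "__main__":
    print(jaccard_ngram("a b c d e f", "a b c d e g"))
```

A value near 0 means the generated text shares almost no 5-word spans with the reference, which is the sense in which the diagnostics are described as small.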

Entities

—

Sources

arXiv cs.AI — 2026-05-28