Empirical Audit of k-NAF Budget Accounting in Anchored Decoding
This empirical audit examines the k-NAF budget-accounting mechanism in Anchored Decoding. It uses a fixed, stratified workload of roughly 8,500 randomized executions across six prompt classes, together with an adaptive prompt-search strategy that seeks prompts with high proxy spend ratios. On the fixed workload, mean cumulative KL spend is well below the sequence-level budgets K in {600, 1000}, and an empirical Bernstein-style proxy stays below K for every class. The surface-overlap diagnostics (ROUGE-L and 5-gram Jaccard) are likewise small. Adaptive search raises the proxy spend ratio but does not produce evident budget exhaustion. In a separate copyright-domain workload at k = 3, some prompts show proxy ratios above 1 in early-stopped evaluations with small sample sizes. Re-evaluating these prompts with a larger allocation lowers the proxy ratio to the range [0.26, 0.40] at similar mean spend, which is consistent with the earlier values being small-sample artifacts of the proxy.
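To illustrate why a Bernstein-style proxy can exceed the budget at small sample sizes and then fall well below it at larger ones, here is a minimal sketch. It assumes the proxy is an upper confidence bound of the Maurer-Pontil empirical Bernstein form and that the proxy spend ratio is that bound divided by K; the summary above does not state the audit's exact formula, so the function names, the `value_range` parameter, and the default `delta` are hypothetical.

```python
import math
from statistics import mean, variance


def bernstein_upper_proxy(spends, value_range, delta=0.05):
    """Upper confidence bound on mean spend (empirical Bernstein form).

    Assumption: per-run cumulative KL spends lie in an interval of width
    `value_range`. This is an illustrative stand-in, not the audit's proxy.
    """
    n = len(spends)
    if n < 2:
        raise ValueError("need at least two samples to estimate variance")
    log_term = math.log(2.0 / delta)
    sample_mean = mean(spends)
    sample_var = variance(spends)  # unbiased sample variance
    # Variance-dependent term shrinks like 1/sqrt(n).
    variance_term = math.sqrt(2.0 * sample_var * log_term / n)
    # Range-dependent term shrinks like 1/(n - 1) and dominates for small n.
    range_term = 7.0 * value_range * log_term / (3.0 * (n - 1))
    return sample_mean + variance_term + range_term


def proxy_spend_ratio(spends, budget, value_range, delta=0.05):
    """Ratio of the upper-bound proxy to the budget K (hypothetical definition)."""
    return bernstein_upper_proxy(spends, value_range, delta) / budget


# Synthetic spends with the same mean; only the sample size differs.
few_samples = [100.0, 120.0, 80.0, 110.0, 90.0]
many_samples = few_samples * 40
small_n_ratio = proxy_spend_ratio(few_samples, budget=600.0, value_range=600.0)
large_n_ratio = proxy_spend_ratio(many_samples, budget=600.0, value_range=600.0)
```

With these synthetic values, the small-sample ratio is above 1 and the large-sample ratio is well below 1 even though the mean spend is unchanged, because the range-dependent term decays as the sample count grows. The numbers are made up for illustration and are not the audit's data.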
Key facts
- Study audits k-NAF budget accounting in Anchored Decoding
- Fixed workload: ~8,500 randomized executions across six prompt classes
- Adaptive prompt-search targets high proxy spend ratios
- Mean cumulative KL spend below budgets K in {600, 1000}
- Bernstein-style proxy stays below K for every class
- Surface-overlap diagnostics (ROUGE-L, 5-gram Jaccard) are small
- Adaptive search increases the proxy ratio but does not produce evident budget exhaustion
- Copyright-domain workload at k=3 shows proxy ratios above 1 under early stopping
- Re-evaluation with larger allocation reduces proxy ratio to [0.26, 0.40]
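The surface-overlap diagnostics named above can be sketched as follows. This is a minimal illustration assuming whitespace tokenization, set-based 5-gram Jaccard similarity, and the F1 form of ROUGE-L computed from the longest common subsequence; the audit's actual tokenizer and ROUGE-L variant are not given in this summary, and all function names are hypothetical.

```python
def ngram_set(tokens, n=5):
    """Set of contiguous n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard_ngram(text_a, text_b, n=5):
    """Jaccard similarity between the n-gram sets of two texts."""
    grams_a = ngram_set(text_a.split(), n)
    grams_b = ngram_set(text_b.split(), n)
    union = grams_a | grams_b
    if not union:
        return 0.0  # neither text has enough tokens to form an n-gram
    return len(grams_a & grams_b) / len(union)


def lcs_length(seq_x, seq_y):
    """Length of the longest common subsequence (row-by-row dynamic program)."""
    previous = [0] * (len(seq_y) + 1)
    for x in seq_x:
        current = [0]
        for j, y in enumerate(seq_y, start=1):
            if x == y:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L as the harmonic mean of LCS precision and LCS recall."""
    cand_tokens, ref_tokens = candidate.split(), reference.split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    lcs = lcs_length(cand_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Both scores lie in [0, 1]: identical texts score 1.0 and texts with no shared tokens score 0.0, so small values indicate little verbatim overlap between a generation and a reference.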