Stale Repository Context Actively Harms Code Completion in Retrieval-Augmented Generation
A diagnostic study (arXiv:2605.14478) asks whether outdated snippets in retrieval-augmented code generation act as benign noise or actively push models toward code that is incompatible with the current project state. The analysis uses a curated set of 17 production-helper signature changes drawn from five Python repositories. With prompts neutralized to hide commit freshness and the expected current signatures, stale-only retrieval induced stale helper references in 15 of 17 samples with Qwen2.5-Coder-7B-Instruct and in 13 of 17 with gpt-4.1-mini, increases of 88.2 and 76.5 percentage points over current-only retrieval. The no-retrieval condition produced zero stale references but only 1 of 17 passing completions. The results indicate that stale context is harmful rather than benign and that retrieval should be freshness-aware.
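To make "freshness-aware retrieval" concrete, here is a minimal sketch of one possible filter. It is not the study's method: the `Snippet` structure, the `filter_fresh` function, and the `load_config` helper are all hypothetical, and the sketch assumes each indexed snippet records the helper signature it was written against.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    """A retrieved code snippet (hypothetical structure, not from the paper)."""
    text: str
    helper_name: str       # helper the snippet refers to
    helper_signature: str  # signature of that helper when the snippet was indexed


def filter_fresh(snippets, current_signatures):
    """Keep only snippets whose recorded helper signature matches the current one.

    current_signatures maps helper name -> signature at the current project state.
    Snippets for helpers missing from the mapping are dropped as unverifiable.
    """
    return [
        s for s in snippets
        if current_signatures.get(s.helper_name) == s.helper_signature
    ]


# Hypothetical example: load_config gained a keyword-only `strict` parameter.
current = {"load_config": "load_config(path, *, strict=False)"}
retrieved = [
    Snippet("cfg = load_config(p)", "load_config", "load_config(path)"),
    Snippet("cfg = load_config(p, strict=True)", "load_config",
            "load_config(path, *, strict=False)"),
]
fresh = filter_fresh(retrieved, current)
```

Exact string matching on signatures is the simplest possible check; a real system would more plausibly compare parsed signatures or commit metadata.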
Key facts
- Study conducted on 17 samples from five Python repositories.
- Stale-only retrieval induced stale references in 15/17 Qwen2.5-Coder-7B-Instruct samples.
- Stale-only retrieval induced stale references in 13/17 gpt-4.1-mini samples.
- Increase over current-only retrieval: 88.2 percentage points for Qwen2.5-Coder-7B-Instruct, 76.5 percentage points for gpt-4.1-mini.
- The no-retrieval condition produced zero stale references but only 1/17 passing completions.
- Study published on arXiv with ID 2605.14478.
- Controlled diagnostic study design with four retrieval conditions.
- Prompts were neutralized to hide commit freshness and expected signatures.
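The reported figures can be checked with a short calculation. The sketch below assumes the current-only condition produced 0 of 17 stale references; that baseline is not stated directly but is inferred from the reported increases being equal to the stale-only rates (15/17 and 13/17).

```python
N_SAMPLES = 17

# Stale-reference counts under stale-only retrieval, as reported.
stale_only_counts = {
    "Qwen2.5-Coder-7B-Instruct": 15,
    "gpt-4.1-mini": 13,
}

# Assumption (inferred, not stated in the source): no stale references
# under current-only retrieval.
CURRENT_ONLY_STALE = 0


def rate_pct(count, total=N_SAMPLES):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Difference between two rates, expressed in percentage points.
increase_pp = {
    model: round(rate_pct(count) - rate_pct(CURRENT_ONLY_STALE), 1)
    for model, count in stale_only_counts.items()
}

for model, pp in increase_pp.items():
    print(f"{model}: +{pp} percentage points")
```

The differences are percentage points, not relative percentage increases: a relative increase over a zero baseline would be undefined.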
Entities
Institutions
- arXiv