LLM Context Windows Fail Far Before Advertised Maximums
A new study on arXiv defines the 'Maximum Effective Context Window' (MECW) to measure real-world LLM performance. Most models degrade severely by 1,000 tokens of context, and some fail at just 100, despite advertised context windows of 128K tokens or more. Drawing on hundreds of thousands of data points across multiple models and problem types, the research finds that the MECW varies by task and is drastically smaller than the advertised Maximum Context Window (MCW).
Key facts
- Study defines Maximum Effective Context Window (MECW) concept
- Hundreds of thousands of data points collected across multiple models
- Significant gaps found between the advertised Maximum Context Window (MCW) and the measured MECW
- MECW shifts based on problem type
- Some top models failed with as little as 100 tokens in context
- Most models had severe degradation by 1000 tokens in context
- Published on arXiv with ID 2509.21361
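The gap between an advertised window and an effective one could be estimated along these lines: test a model at increasing context lengths and take the largest length before accuracy first drops below a threshold. This is a minimal sketch, not the paper's methodology; the threshold and the measurement numbers below are illustrative assumptions.

```python
def max_effective_context_window(accuracy_by_length, threshold=0.9):
    """Estimate an effective context window from benchmark results.

    accuracy_by_length: dict mapping context length (tokens) -> accuracy in [0, 1]
    Returns the largest tested length at or below which accuracy
    never falls under the threshold (0 if even the shortest fails).
    """
    mecw = 0
    for length in sorted(accuracy_by_length):
        if accuracy_by_length[length] >= threshold:
            mecw = length
        else:
            break  # first failure caps the effective window
    return mecw

# Illustrative (made-up) measurements: accuracy collapses far below the
# advertised 128K window, echoing the study's finding.
measurements = {100: 0.98, 500: 0.95, 1000: 0.70, 4000: 0.40, 128000: 0.10}
print(max_effective_context_window(measurements, threshold=0.9))  # 500
```

Under these assumed numbers the advertised window would be 128K tokens while the effective one is only 500, the kind of MCW/MECW gap the study reports.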