FutureSim: AI Agents Tested on Real-World Event Forecasting

ai-technology · 2026-05-16

Researchers have developed FutureSim, a simulation framework that replays real-world events in chronological order to evaluate how AI agents adapt to new information. The system presents agents with news articles and with questions that resolve over a three-month period, January to March 2026, testing their ability to forecast world events that occur after their knowledge cutoff. In evaluations, the best-performing agent reached only 25% accuracy, and many agents had a Brier skill score worse than that of making no prediction at all. The study reports a clear separation in adaptive capabilities among frontier AI agents and presents FutureSim as a realistic setting for studying adaptive AI. The work is detailed in a paper on arXiv (ID: 2605.15188).
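The chronological replay described above can be illustrated with a minimal sketch. It is not FutureSim's actual implementation: the `Article` and `Question` types, the `replay` function, and the one-day step size are all hypothetical names and choices made for illustration. The only idea taken from the summary is that an agent at a given simulated date sees articles published up to that date and answers questions that have not yet resolved.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Article:
    # Hypothetical record: publication date plus article text.
    published: date
    text: str


@dataclass(frozen=True)
class Question:
    # Hypothetical record: the date on which the answer becomes known.
    resolves: date
    prompt: str


def replay(
    articles: List[Article],
    questions: List[Question],
    start: date,
    end: date,
) -> Iterator[Tuple[date, List[Article], List[Question]]]:
    """Step through the period one day at a time (assumed step size).

    On each simulated day, yield the articles published on or before that
    day and the questions that have not yet resolved, so nothing from the
    simulated future leaks to the agent.
    """
    day = start
    while day <= end:
        seen = [a for a in articles if a.published <= day]
        open_questions = [q for q in questions if q.resolves > day]
        yield day, seen, open_questions
        day += timedelta(days=1)
```

An agent loop would consume each `(day, seen, open_questions)` tuple and update its forecasts. Filtering by date at every step is what keeps later articles hidden from earlier predictions.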

Key facts

FutureSim replays real-world events in chronological order to test AI agents.
Agents forecast events beyond their knowledge cutoff using real news articles.
Evaluation period: January to March 2026.
Best agent accuracy: 25%.
Many agents had a worse Brier skill score than making no prediction at all.
Study reveals clear separation in adaptive capabilities.
Paper available on arXiv with ID 2605.15188.
FutureSim offers a realistic setting for studying adaptive AI.
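
To make the Brier skill score comparison concrete, here is a minimal sketch of the standard definition, BSS = 1 - BS / BS_ref, where BS is the mean squared error between forecast probabilities and outcomes. This summary does not state the paper's question format or its "no prediction" baseline, so the binary outcomes and the constant 0.5 reference forecast below are assumptions. A negative score means the forecaster did worse than the reference.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(
    probs: Sequence[float],
    outcomes: Sequence[int],
    reference_prob: float = 0.5,  # assumed "no prediction" baseline
) -> float:
    """BSS = 1 - BS / BS_ref; positive beats the reference, negative is worse."""
    bs = brier_score(probs, outcomes)
    bs_ref = brier_score([reference_prob] * len(outcomes), outcomes)
    if bs_ref == 0:
        raise ValueError("reference forecast is perfect; skill score undefined")
    return 1.0 - bs / bs_ref


outcomes = [1, 0, 1, 0]
calibrated = brier_skill_score([0.9, 0.1, 0.8, 0.2], outcomes)
overconfident_wrong = brier_skill_score([0.1, 0.9, 0.2, 0.8], outcomes)
```

In this toy example, the forecaster that leans toward the right answers scores above zero, while the confidently wrong one scores below zero. The second case corresponds to an agent doing worse than making no prediction.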
