ARTFEED — Contemporary Art Intelligence

Live Tracker Visualizes AI Model Performance Decay Over Time

ai-technology · 2026-05-14

A developer has built a live tracker that visualizes the lifecycle and performance changes of flagship AI models using historical Elo ratings from Arena AI. The dashboard plots one continuous curve per major AI lab, following that lab's highest-rated flagship model over time, so that both sudden generational jumps and gradual performance decay are visible. The tool addresses the common perception that flagship models feel impressive at launch but degrade in the weeks that follow. The developer notes a data blind spot: Arena AI largely relies on testing API endpoints, while consumer chat UIs often layer heavy system prompts that may affect performance, so the ratings may not reflect what users experience in chat apps. The chart is optimized for mobile and includes an optional dark mode.
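The article does not describe how the tracker is implemented, so the following is only a minimal Python sketch of the idea of "one curve per lab, following its top-rated model." The record layout, the lab and model names, the rating values, and the function names flagship_curves and label_changes are all assumptions made for illustration; none of them come from the tracker or from Arena AI.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rating snapshots: (snapshot date, lab, model, rating).
# All names and numbers are invented for illustration only.
SNAPSHOTS = [
    (date(2026, 1, 1), "LabA", "a-1", 1400),
    (date(2026, 1, 1), "LabB", "b-1", 1390),
    (date(2026, 2, 1), "LabA", "a-1", 1392),
    (date(2026, 2, 1), "LabB", "b-1", 1385),
    (date(2026, 3, 1), "LabA", "a-1", 1385),
    (date(2026, 3, 1), "LabA", "a-2", 1450),
    (date(2026, 3, 1), "LabB", "b-1", 1380),
]


def flagship_curves(snapshots):
    """Return {lab: [(date, model, rating), ...]}, sorted by date,
    keeping only the highest-rated model per lab on each date."""
    best = defaultdict(dict)  # lab -> {date: (model, rating)}
    for day, lab, model, rating in snapshots:
        current = best[lab].get(day)
        if current is None or rating > current[1]:
            best[lab][day] = (model, rating)
    return {
        lab: [(day, *by_day[day]) for day in sorted(by_day)]
        for lab, by_day in best.items()
    }


def label_changes(curve):
    """Label each step of a curve as a 'jump' (the top model changed)
    or a 'drift' (same model, rating moved), with the rating delta."""
    steps = []
    for (_, m0, r0), (d1, m1, r1) in zip(curve, curve[1:]):
        kind = "jump" if m1 != m0 else "drift"
        steps.append((d1, kind, r1 - r0))
    return steps


if __name__ == "__main__":
    for lab, curve in flagship_curves(SNAPSHOTS).items():
        print(lab, label_changes(curve))
```

In this toy data, the hypothetical LabA curve drifts down while a-1 is its top model and then jumps when a-2 overtakes it, which is the pattern of slow decay and generational jumps the article describes. A real tracker would also have to decide which models count as "flagship," which this sketch does not attempt.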

Key facts

  • The tracker visualizes lifecycle and performance changes of flagship AI models.
  • It uses historical Elo ratings from Arena AI.
  • One continuous curve per major AI lab is plotted.
  • The curve tracks the highest-rated flagship model over time.
  • It highlights sudden generational jumps and slow performance decays.
  • The chart is optimized for mobile and includes dark mode.
  • Arena AI largely relies on testing API endpoints.
  • Consumer chat UIs often layer heavy system prompts, creating a data blind spot.

Entities

Institutions

  • Arena AI
