SWE-chat: First Large-Scale Dataset of Real Coding Agent Sessions

ai-technology · 2026-04-24

Researchers have introduced SWE-chat, the first large-scale dataset of real coding agent sessions collected from open-source developers in the wild. The dataset contains 6,000 sessions with more than 63,000 user prompts and 355,000 agent tool calls. It is a living dataset: an automated pipeline continuously discovers and processes sessions from public repositories. Early findings show two contrasting usage patterns: in 41% of sessions, agents write nearly all of the committed code ("vibe coding"), while in 23%, humans write all of the code themselves. Despite recent advances, coding agents remain inefficient in natural settings, with only 44% of their output judged useful. The study aims to give an empirical picture of how AI coding agents are actually used and how useful their output is.

Key facts

SWE-chat is the first large-scale dataset of real coding agent sessions.
Dataset contains 6,000 sessions, 63,000+ user prompts, and 355,000 agent tool calls.
Data collected from open-source developers in the wild.
SWE-chat is a living dataset with automatic discovery and processing.
41% of sessions are 'vibe coding', where agents write nearly all of the committed code.
23% of sessions have humans writing all code themselves.
Only 44% of agent outputs are useful in natural settings.
Study provides empirical characterization of coding agent usage and failure modes.
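
To put the headline figures in proportion, the short sketch below derives a few per-session averages from the rounded totals listed above. These derived numbers are not reported in the source. They are back-of-the-envelope arithmetic on rounded counts, and the function and constant names are illustrative only.

```python
# Headline figures as given in the key facts (rounded totals).
SESSIONS = 6_000
USER_PROMPTS = 63_000      # reported as "63,000+", so treat as approximate
TOOL_CALLS = 355_000

VIBE_CODING_SHARE = 0.41   # agents write nearly all committed code
HUMAN_ONLY_SHARE = 0.23    # humans write all code themselves


def per_session(total: int, sessions: int = SESSIONS) -> float:
    """Average count per session, derived from the rounded totals."""
    return total / sessions


def remaining_share(*shares: float) -> float:
    """Share of sessions not covered by the listed categories."""
    return round(1.0 - sum(shares), 2)


if __name__ == "__main__":
    print(f"approx. prompts per session:    {per_session(USER_PROMPTS):.1f}")
    print(f"approx. tool calls per session: {per_session(TOOL_CALLS):.1f}")
    print(f"approx. tool calls per prompt:  {TOOL_CALLS / USER_PROMPTS:.1f}")
    # Derived, not reported: sessions outside the two named categories.
    other = remaining_share(VIBE_CODING_SHARE, HUMAN_ONLY_SHARE)
    print(f"sessions in neither category:   {other:.0%}")
```

On these rounded inputs, a session averages roughly ten user prompts and several dozen tool calls, and about a third of sessions fall outside the two named categories. The summary does not describe that remaining group, so no interpretation of it is assumed here.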

Sources

arXiv cs.AI — 2026-04-23