ARTFEED — Contemporary Art Intelligence

LLM Agents Reproduce Social Science Results from Paper Methods Alone

ai-technology · 2026-04-27

A recent investigation published on arXiv (2604.21965) examines whether LLM agents can replicate social science findings using only the methodology described in research papers and the original datasets, with no access to the authors' code or results. The system extracts structured method descriptions from the papers, has agents reimplement the analyses under strict information isolation (the agents never see the original paper, code, or results), and then performs deterministic cell-level comparisons between reproduced outputs and the published figures. An error attribution phase identifies the underlying causes of any discrepancies. The study, which evaluates four agent scaffolds and four LLMs across 48 papers with human-verified reproducibility, finds that agents can generally reproduce published findings, but their effectiveness varies substantially across models, scaffolds, and individual papers. Root cause analysis indicates that failures arise from both agent mistakes and other factors.
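The deterministic cell-level comparison described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual implementation: the table representation, the `compare_cells` helper, and the tolerance value are all assumptions.

```python
from math import isclose

def compare_cells(original, reproduced, rel_tol=1e-6):
    """Deterministically compare two results tables cell by cell.

    Tables are dicts mapping (row, column) labels to numeric values.
    Returns per-cell match flags and the overall match rate.
    (Hypothetical sketch; the paper's comparison logic may differ.)
    """
    flags = {}
    for key, orig_val in original.items():
        rep_val = reproduced.get(key)
        # A cell matches only if the agent produced it and it agrees
        # within tolerance with the published value.
        flags[key] = rep_val is not None and isclose(orig_val, rep_val, rel_tol=rel_tol)
    matched = sum(flags.values())
    rate = matched / len(original) if original else 0.0
    return flags, rate

# Example: one coefficient matches, one is missing, one is slightly off.
orig = {("income", "beta"): 0.42, ("income", "se"): 0.05, ("age", "beta"): -0.10}
rep = {("income", "beta"): 0.42, ("age", "beta"): -0.09}
flags, rate = compare_cells(orig, rep)
```

Because the comparison is a pure function of the two tables, re-running it always yields the same verdict, which is what makes the evaluation deterministic.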

Key facts

  • arXiv paper 2604.21965 tests LLM agents reproducing social science results from method descriptions and data only
  • Agents never see original code, results, or paper
  • System enables deterministic cell-level comparison of reproduced outputs to original results
  • Error attribution step traces discrepancies through the pipeline to their underlying cause
  • Evaluated four agent scaffolds and four LLMs on 48 papers
  • All 48 papers have human-verified reproducibility
  • Performance varies substantially between models, scaffolds, and papers
  • Failures stem from both agent errors and other causes

Entities

Institutions

  • arXiv

Sources