CausaLab: Interactive Causal Discovery Platform for AI Scientists

ai-technology · 2026-05-26

Researchers have developed CausaLab, a scalable environment for evaluating interactive causal discovery by LLM agents. Unlike previous evaluations, it assesses both whether an agent uses causal evidence to solve the problem and whether its hypothesis about the causal mechanism is correct. In each episode, an agent works in a synthetic lab: it analyzes prior measurement data, manipulates a crystal, and predicts the resonance frequency of a held-out reactor crystal governed by the same mechanism. The hidden data-generating process is a randomly sampled structural causal model (SCM), so agents must recover both the causal graph and the structural equations rather than rely on prior knowledge. CausaLab includes a domain-specific language that records the agent's evolving SCM hypothesis, so trajectories can be analyzed and compared against the ground truth. Experiments show a persistent gap between current LLM agents and optimal performance, underscoring how difficult causal reasoning remains for these agents.
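To make the setup concrete, here is a minimal sketch of what "a randomly sampled SCM" with manipulations can look like. It is not CausaLab's generator: the linear equations, Gaussian noise, and the names `sample_linear_scm` and `simulate` are assumptions chosen for illustration, and the manipulation is modeled as a standard do-style intervention.

```python
import random

def sample_linear_scm(n_vars, edge_prob, rng):
    """Sample a random linear SCM over variables 0..n_vars-1 (hypothetical helper).

    Edges only run from a lower index to a higher one, so the graph is acyclic.
    Returns {child: {parent: weight}}.
    """
    scm = {}
    for child in range(n_vars):
        scm[child] = {
            parent: rng.uniform(-2.0, 2.0)
            for parent in range(child)
            if rng.random() < edge_prob
        }
    return scm

def simulate(scm, rng, interventions=None, noise_std=0.1):
    """Draw one sample; `interventions` maps a variable to a forced value."""
    interventions = interventions or {}
    values = {}
    for var in sorted(scm):  # index order is a valid topological order here
        if var in interventions:
            # do-intervention: ignore the variable's parents and noise
            values[var] = interventions[var]
        else:
            parent_term = sum(w * values[p] for p, w in scm[var].items())
            values[var] = parent_term + rng.gauss(0.0, noise_std)
    return values

rng = random.Random(0)
scm = sample_linear_scm(n_vars=4, edge_prob=0.5, rng=rng)
observational = simulate(scm, rng)                            # passive measurement
interventional = simulate(scm, rng, interventions={0: 1.0})   # manipulate variable 0
```

Because the graph and weights are drawn at random for each episode, an agent cannot look the answer up; it has to compare observational and interventional samples to work out which variables drive the target.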

Key facts

CausaLab is a scalable environment for evaluating interactive causal discovery by LLM agents.
It evaluates both how agents use causal evidence to solve the problem and whether their causal mechanism hypotheses are correct.
Each episode involves predicting the resonance frequency of a held-out reactor crystal.
The hidden data-generating process is a randomly sampled structural causal model (SCM).
CausaLab includes a domain-specific language for recording SCM hypotheses.
Experiments show a persistent gap between LLM agents and optimal performance.
The platform is designed for AI scientists to test causal reasoning capabilities.
The paper is available on arXiv with ID 2605.26029.
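
One common way to compare a recorded graph hypothesis against the ground truth is the structural Hamming distance (SHD). The sketch below is illustrative only: the source does not say which metric CausaLab uses, and the variable names (`temperature`, `pressure`, `mass`, `frequency`) are hypothetical.

```python
def structural_hamming_distance(true_edges, hypothesized_edges):
    """Count edge additions, deletions, and reversals separating two DAGs.

    Each edge is a (cause, effect) pair. A reversed edge counts as one error.
    """
    true_edges = set(true_edges)
    hypothesized_edges = set(hypothesized_edges)
    missing = true_edges - hypothesized_edges   # in truth, absent from hypothesis
    extra = hypothesized_edges - true_edges     # in hypothesis, absent from truth
    # A reversed edge appears once in `missing` and once (flipped) in `extra`.
    reversed_edges = {(a, b) for (a, b) in missing if (b, a) in extra}
    return len(missing) + len(extra) - len(reversed_edges)

# Hypothetical ground truth and agent hypothesis.
true_edges = {("temperature", "frequency"), ("pressure", "frequency")}
hypothesis = {
    ("frequency", "temperature"),  # reversed edge
    ("pressure", "frequency"),     # correct edge
    ("mass", "frequency"),         # spurious edge
}

shd = structural_hamming_distance(true_edges, hypothesis)
print(shd)  # prints 2: one reversal plus one spurious edge
```

Scoring each hypothesis an agent records during an episode in this way would yield a trajectory showing whether its causal model moves toward or away from the true graph.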
