LLMs Achieve Retrosynthesis via Atom-Anchored Reasoning

ai-technology · 2026-05-23

A new framework enables general-purpose large language models (LLMs) to perform single-step retrosynthesis without task-specific training. The method anchors chain-of-thought reasoning to the molecular structure using unique atomic identifiers. In a first, zero-shot step, the LLM identifies the relevant fragments and their chemical labels; an optional few-shot step then uses class examples to predict the transformation. The approach targets a task where LLMs have previously underperformed, and it was evaluated on academic benchmarks and on expert-validated drug discovery molecules. By relying on the reasoning capabilities of LLMs rather than on task-specific training, the work addresses the scarcity of labeled chemical data.
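To make the idea of anchoring concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical citation format (`Symbol:ID`, such as `N:4`), hypothetical function names (`tag_atoms`, `invalid_citations`, `anchored_retrosynthesis`), and a stand-in `stub_llm` in place of a real model call. The prompts and the example molecule are illustrative assumptions as well.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical citation format: element symbol, colon, atom identifier ("N:4").
CITATION = re.compile(r"\b([A-Z][a-z]?):(\d+)\b")


def tag_atoms(symbols: List[str]) -> Dict[int, str]:
    """Give every atom a unique 1-based identifier."""
    return {idx: sym for idx, sym in enumerate(symbols, start=1)}


def render_atoms(atoms: Dict[int, str]) -> str:
    """Render the tagged atoms as text that can be placed in a prompt."""
    return " ".join(f"{sym}:{idx}" for idx, sym in atoms.items())


def invalid_citations(text: str, atoms: Dict[int, str]) -> List[Tuple[str, int]]:
    """Return cited atoms whose identifier or element is not in the molecule."""
    found = [(sym, int(idx)) for sym, idx in CITATION.findall(text)]
    return [(sym, idx) for sym, idx in found if atoms.get(idx) != sym]


def anchored_retrosynthesis(
    symbols: List[str],
    llm: Callable[[str], str],
    examples: Optional[List[str]] = None,
) -> Tuple[str, Optional[str]]:
    """Zero-shot site identification, then an optional few-shot prediction."""
    atoms = tag_atoms(symbols)
    site = llm(
        "Step 1. Atoms: " + render_atoms(atoms)
        + "\nName the fragment to disconnect and its chemical label, "
        "citing atoms as Symbol:ID."
    )
    bad = invalid_citations(site, atoms)
    if bad:
        # The reasoning refers to atoms that do not exist: reject it.
        raise ValueError(f"response cites atoms not in the molecule: {bad}")
    if not examples:
        return site, None
    transformation = llm(
        "Step 2. Examples:\n" + "\n".join(examples)
        + "\nSite: " + site + "\nPredict the transformation."
    )
    return site, transformation


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns canned answers."""
    if prompt.startswith("Step 2"):
        return "acetic acid + methylamine"
    return "Amide bond between C:2 and N:4; label: amide coupling"


# N-methylacetamide, CC(=O)NC, heavy atoms listed in SMILES order.
site, transformation = anchored_retrosynthesis(
    ["C", "C", "O", "N", "C"],
    stub_llm,
    examples=["amide => carboxylic acid + amine"],
)
```

The point of the check in `invalid_citations` is that every atom the model mentions can be verified against the molecule, so a response that refers to a nonexistent or mislabeled atom is rejected before the optional second step runs.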

Key facts

Framework uses unique atomic identifiers to anchor reasoning.
Operates without task-specific model training.
Two-step process: zero-shot fragment identification followed by optional few-shot prediction.
Applied to single-step retrosynthesis, a task where LLMs previously underperformed.
Tested on academic benchmarks and expert-validated drug discovery molecules.
Addresses scarcity of labeled data in chemistry.
Published on arXiv with ID 2510.16590v2.
Announcement type: replace-cross.
