ARTFEED — Contemporary Art Intelligence

InteractBind: Large-Scale Dataset for Protein-Ligand Binding Site Localization

publication · 2026-05-26

Researchers introduced InteractBind, a large-scale dataset of approximately 100,000 protein-ligand pairs, designed to evaluate whether models can localize binding sites and identify non-covalent interactions. Existing benchmarks focus on binary binding prediction and affinity regression, but do not assess spatial understanding of molecular recognition. InteractBind adds fine-grained binding-site localization tasks built on protein-residue and ligand-atom interaction maps that cover six major non-covalent interaction types. The dataset also provides binding affinity data. By testing where and how a ligand binds, rather than only whether it binds, the work enables more rigorous evaluation of protein-ligand models for computational drug discovery and molecular design.
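To make the idea of a residue-by-atom interaction map concrete, the sketch below shows one way such a map could be represented and how a predicted binding site could be scored against it. Everything here is an assumption for illustration: the summary does not describe InteractBind's file format, name its six interaction types, or state its evaluation metrics, so the `InteractionMap` class, the integer type indices, and the precision/recall/F1 scoring are hypothetical stand-ins.

```python
from dataclasses import dataclass, field

# Assumption: the source states there are six interaction types but does not
# name them, so they are referred to only by index 0..5 here.
NUM_INTERACTION_TYPES = 6


@dataclass
class InteractionMap:
    """Hypothetical sparse interaction map for one protein-ligand pair."""
    num_residues: int
    num_ligand_atoms: int
    # Each contact is (residue_index, ligand_atom_index, interaction_type_index).
    contacts: set = field(default_factory=set)

    def add(self, residue: int, atom: int, itype: int) -> None:
        """Record one non-covalent contact after validating its indices."""
        if not 0 <= residue < self.num_residues:
            raise ValueError("residue index out of range")
        if not 0 <= atom < self.num_ligand_atoms:
            raise ValueError("ligand atom index out of range")
        if not 0 <= itype < NUM_INTERACTION_TYPES:
            raise ValueError("interaction type index out of range")
        self.contacts.add((residue, atom, itype))

    def binding_site_residues(self) -> set:
        """Residues that take part in at least one contact."""
        return {residue for residue, _, _ in self.contacts}


def localization_scores(predicted: set, reference: set) -> tuple:
    """Precision, recall, and F1 of predicted binding-site residues."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy reference map: 10 residues, 4 ligand atoms, 4 annotated contacts.
ref = InteractionMap(num_residues=10, num_ligand_atoms=4)
ref.add(2, 0, 0)
ref.add(2, 1, 3)
ref.add(5, 3, 1)
ref.add(7, 2, 5)

# A model that predicts residues 2, 5, and 8 as the binding site.
predicted_site = {2, 5, 8}
precision, recall, f1 = localization_scores(
    predicted_site, ref.binding_site_residues()
)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case the model recovers two of the three annotated residues and adds one false positive. A binary binding label alone would not reveal either error, which is the distinction the dataset is meant to expose.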

Key facts

  • InteractBind comprises approximately 100,000 protein-ligand pairs.
  • The dataset includes binding-site localization as a core fine-grained task.
  • Interaction maps cover six major types of non-covalent interactions.
  • Existing benchmarks focus on binary binding prediction and affinity regression, without assessing spatial understanding.
  • The work aims to assess whether models learn binding sites or just binding likelihood.
  • InteractBind is introduced as a benchmark for fine-grained evaluation of protein-ligand models.
  • The dataset supports computational drug discovery and molecular design.
  • The paper is published on arXiv with ID 2605.24045.

Entities

Institutions

  • arXiv
