ARTFEED — Contemporary Art Intelligence

UrbanDataMiner Portal Launches with 60,000 Datasets Extracted from Nature-Affiliated Publications

digital · 2026-04-22

A new open portal, UrbanDataMiner, offers dataset-level search and filtering across more than 60,000 urban datasets, addressing the lack of a unified global platform for urban data discovery. Urban data supports applications across many disciplines, yet researchers previously had to sift through individual websites and papers to find it. The portal is powered by Paper2Data, a large-scale LLM-driven pipeline that automatically extracts and structures dataset mentions from scientific literature. Paper2Data processed more than 15,000 Nature-affiliated publications; in a human-annotated evaluation it identified datasets with approximately 90% recall and achieved field-level precision above 80%. The pipeline organizes the extracted information under a unified urban data metadata schema, so researchers can locate relevant datasets without manual searches. The arXiv preprint 2604.16317v1 describes this cross-disciplinary initiative, which aims to make global urban data more accessible for academic and practical use.
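For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how dataset-level recall and field-level precision could be computed against human annotations. The preprint's exact evaluation protocol is not described here, so the function names, the toy records, the field names (`region`, `format`), and the choice to score fields only on matched datasets are all assumptions made for illustration.

```python
def dataset_recall(predicted: set[str], gold: set[str]) -> float:
    """Share of human-annotated datasets that the pipeline also identified."""
    if not gold:
        return 0.0
    return len(predicted & gold) / len(gold)


def field_precision(
    predicted: dict[str, dict[str, str]],
    gold: dict[str, dict[str, str]],
) -> float:
    """Share of extracted field values that agree with the annotation.

    Assumption: only datasets present in both sets are scored.
    """
    correct = 0
    total = 0
    for name, fields in predicted.items():
        if name not in gold:
            continue
        for field, value in fields.items():
            total += 1
            if gold[name].get(field) == value:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical toy records; not taken from UrbanDataMiner.
gold = {
    "city-traffic-counts": {"region": "Europe", "format": "CSV"},
    "urban-heat-map": {"region": "Asia", "format": "GeoTIFF"},
    "transit-ridership": {"region": "North America", "format": "CSV"},
}
predicted = {
    "city-traffic-counts": {"region": "Europe", "format": "CSV"},
    "urban-heat-map": {"region": "Asia", "format": "CSV"},  # wrong format
}

recall = dataset_recall(set(predicted), set(gold))  # 2 of 3 datasets found
precision = field_precision(predicted, gold)        # 3 of 4 fields correct
print(f"recall={recall:.2f} field_precision={precision:.2f}")
```

In this toy case the pipeline finds two of three annotated datasets and gets three of four extracted fields right, which illustrates why the two figures are reported separately: one measures coverage of datasets, the other the correctness of the structured metadata.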

Key facts

  • UrbanDataMiner is an open urban data discovery portal
  • It provides access to over 60,000 urban datasets
  • Datasets are extracted from more than 15,000 Nature-affiliated publications
  • Paper2Data is an LLM-driven pipeline for dataset identification and structuring
  • Paper2Data achieves approximately 90% recall in dataset identification
  • Field-level precision of Paper2Data is above 80%
  • A unified urban data metadata schema is used for structuring
  • The initiative addresses the lack of a global unified platform for urban data discovery

Entities

Institutions

  • Nature
