LLM-Metrics: Measuring Research Impact Through LLM Memory

ai-technology · 2026-05-23

A team of researchers has introduced LLM-Metrics, a metric for evaluating research impact through the parametric memory of large language models (LLMs). Their hypothesis is that more influential papers gain more visibility in the academic sphere, appear more often in LLM training data, and therefore leave a stronger trace in parametric memory. To test this, the researchers designed four types of multiple-choice probes covering title, author, method, and venue recognition. They evaluated 549 computer science papers from 2023-2024 using 17 LLMs from six vendors, ranging from 0.5B to 72B parameters. Of the 17 models, 15 yielded positive predictions, 9 of them significant at p < 0.05, with the relationship assessed by Spearman correlation. The method is intended to address shortcomings of citation counts, including temporal lag, disciplinary bias, and Matthew effects.
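
As a rough illustration of the probing idea, the sketch below builds one multiple-choice probe per recognition type and scores a paper by the fraction of probes answered correctly. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `Probe`, `build_probe`, and `memory_score`, the shuffling of options, and the equal weighting of the four probe types are all hypothetical. The `choose` callback stands in for whatever code would query an LLM and return the index of the option it picked.

```python
import random
from dataclasses import dataclass

# The four probe types named in the summary.
PROBE_TYPES = ("title", "author", "method", "venue")


@dataclass
class Probe:
    probe_type: str      # one of PROBE_TYPES
    question: str        # prompt text shown to the model
    options: list        # answer options, already shuffled
    answer_index: int    # position of the correct option


def build_probe(probe_type, question, correct, distractors, rng):
    """Shuffle the correct answer in among the distractors (hypothetical helper)."""
    if probe_type not in PROBE_TYPES:
        raise ValueError(f"unknown probe type: {probe_type}")
    options = [correct] + list(distractors)
    rng.shuffle(options)
    return Probe(probe_type, question, options, options.index(correct))


def memory_score(probes, choose):
    """Fraction of probes for which choose(probe) returns the correct index.

    `choose` is a stand-in for an LLM call; equal weighting of the probes
    is an assumption made for this sketch.
    """
    if not probes:
        raise ValueError("at least one probe is required")
    hits = sum(1 for p in probes if choose(p) == p.answer_index)
    return hits / len(probes)
```

With a seeded `random.Random`, the option order is reproducible, which makes it possible to present the same probe to every model being compared.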

Key facts

LLM-Metrics is a research-impact assessment metric derived from LLM parametric memory.
Four types of multiple-choice probes were designed: title, author, method, and venue recognition.
549 computer science papers from 2023-2024 were evaluated.
17 LLMs from six vendors were tested, ranging from 0.5B to 72B parameters.
15 of 17 models produced positive predictions.
9 models showed significance at p < 0.05.
The approach addresses limitations of citation counts: temporal lag, disciplinary bias, Matthew effects.
The study is available on arXiv under ID 2605.22176.
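
The per-model results above rest on a Spearman rank correlation. The sketch below shows how such a correlation could be computed between per-paper probe scores and a reference impact signal, using only the standard library. The summary does not say which impact signal was used, so the citation counts in the usage note are purely a placeholder assumption, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

For example, `spearman(scores, citations)` with made-up inputs such as `scores = [0.25, 0.5, 0.75, 1.0]` and `citations = [3, 10, 40, 200]` yields a perfect positive rank correlation, since both lists are in the same order. A significance test (the p < 0.05 threshold mentioned above) would need an additional step that this sketch does not cover.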
