Probabilistic Token Attribution for LLMs

ai-technology · 2026-05-23

A new research paper introduces a model-agnostic probabilistic token attribution measure for Large Language Models (LLMs). The method applies Bayes' rule to invert next-token log-probabilities, recovering the model's internal representation of token sequence distributions without reference to its computational structure. The attribution score is defined as the log ratio of two conditional probabilities: the probability of the response given the prompt, and the same probability with a token marginalized out. This approach situates LLMs within the theory of stochastic processes, offering a framework for understanding how models generate responses from learned probabilities. The paper is available on arXiv under identifier 2605.21726.
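
One way to write the score described above is sketched here. The notation is an assumption made for illustration, not taken from the paper: x = (x_1, ..., x_n) is the prompt, y the response, V the vocabulary, and x_{\setminus i} the prompt with the token at position i marginalized out. The sketch also assumes the marginalized token is a prompt token; the paper's exact definition may differ.

```latex
% Joint probability of any token sequence from next-token probabilities (chain rule)
P(w_1, \dots, w_m) = \prod_{t=1}^{m} P(w_t \mid w_1, \dots, w_{t-1})

% Conditional with position i summed out, obtained from joint probabilities
P(y \mid x_{\setminus i})
  = \frac{\sum_{v \in V} P(x_1, \dots, x_{i-1}, v, x_{i+1}, \dots, x_n, y)}
         {\sum_{v \in V} P(x_1, \dots, x_{i-1}, v, x_{i+1}, \dots, x_n)}

% Attribution score of prompt position i: log ratio of the two conditionals
A_i = \log \frac{P(y \mid x_1, \dots, x_n)}{P(y \mid x_{\setminus i})}
```

Under this reading, a positive A_i means the observed token at position i makes the response more likely than it would be on average over alternatives, and a score of zero means the response does not depend on that token given the rest of the prompt.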

Key facts

The paper is titled 'Probabilistic Attribution For Large Language Models'.
It proposes a model-agnostic probabilistic token attribution measure.
The method uses Bayes' rule to invert next-token log-probabilities.
The attribution score is the log ratio of two conditional probabilities.
The approach is independent of the model's computational structure.
It situates LLMs within the mathematical theory of stochastic processes.
The paper is available on arXiv with ID 2605.21726.
The method captures the model's internal representation of token sequence distributions.
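
The facts above can be illustrated with a minimal sketch, assuming the marginalized token is a prompt token and that the marginal is taken by summing joint sequence probabilities over the vocabulary. The toy model, the vocabulary, and the names `next_token_probs`, `sequence_prob`, and `attribution` are all hypothetical; this is not the paper's implementation. Only next-token probabilities are queried, which is what makes the approach model-agnostic.

```python
import math

# Hypothetical three-token vocabulary for the toy example.
VOCAB = ("a", "b", "c")

# Hypothetical transition table: a first-order Markov chain stands in for an LLM.
_TRANSITIONS = {
    "a": {"a": 0.1, "b": 0.6, "c": 0.3},
    "b": {"a": 0.3, "b": 0.1, "c": 0.6},
    "c": {"a": 0.6, "b": 0.3, "c": 0.1},
}
_INITIAL = {"a": 0.5, "b": 0.3, "c": 0.2}


def next_token_probs(context):
    """Return the next-token distribution given a tuple of previous tokens."""
    if not context:
        return _INITIAL
    return _TRANSITIONS[context[-1]]


def sequence_prob(tokens):
    """Joint probability of a sequence via the chain rule over next-token probabilities."""
    p = 1.0
    for t, tok in enumerate(tokens):
        p *= next_token_probs(tokens[:t])[tok]
    return p


def attribution(prompt, response, i):
    """Log ratio of P(response | prompt) to the same probability with prompt[i] summed out."""
    # Numerator: P(response | prompt) = P(prompt, response) / P(prompt).
    full = sequence_prob(prompt + response) / sequence_prob(prompt)
    # Denominator: replace position i by every vocabulary token and sum the joints.
    joint = 0.0
    marginal = 0.0
    for v in VOCAB:
        alt = prompt[:i] + (v,) + prompt[i + 1:]
        joint += sequence_prob(alt + response)
        marginal += sequence_prob(alt)
    return math.log(full) - math.log(joint / marginal)


if __name__ == "__main__":
    prompt, response = ("a", "b"), ("c",)
    for i, tok in enumerate(prompt):
        print(i, tok, round(attribution(prompt, response, i), 4))
```

In this toy chain the response depends only on the last prompt token, so the score for position 0 is zero (up to floating-point error), while position 1 scores log(0.6 / 0.42), about 0.357. For a real LLM the sum over the vocabulary is far larger, so a practical version would presumably need batching or approximation.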
