ARTFEED — Contemporary Art Intelligence

Hyper-Parallel Decoding Boosts LLM Efficiency for Attribute Value Extraction

ai-technology · 2026-04-30

Researchers have introduced Hyper-Parallel Decoding (HPD), a novel algorithm that accelerates offline decoding in large language models (LLMs) for tasks such as Attribute Value Extraction (AVE). By exploiting the conditional independence of attribute-value pairs, HPD generates multiple value sequences in parallel from the same document context. Manipulating position IDs lets tokens be generated out of order, so that up to 96 tokens can be decoded per prompt when multiple documents are stacked. HPD is compatible with any LLM and reduces inference time and cost by up to 13.8×. The paper is available on arXiv under ID 2604.26209.

Key facts

  • Hyper-Parallel Decoding (HPD) is a new decoding algorithm for LLMs.
  • HPD targets Attribute Value Extraction (AVE) tasks.
  • It leverages conditional independence of attribute-value pairs.
  • Parallelizes value generation within each prompt.
  • Enables out-of-order token generation via position ID manipulation.
  • Can decode up to 96 tokens per prompt by stacking multiple documents.
  • Works with all LLMs.
  • Reduces inference costs and time by up to 13.8X.
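The facts above describe one shared document prefix with several conditionally independent value continuations decoded side by side, steered by position IDs. The paper's exact scheme is not reproduced here; the function below (`build_parallel_layout`, a hypothetical name) is a minimal NumPy sketch assuming each parallel branch attends causally to the full prefix and to its own tokens, but never to sibling branches.

```python
import numpy as np

def build_parallel_layout(prefix_len, num_branches, branch_len):
    """Sketch of parallel decoding over a shared prefix.

    Returns (position_ids, attention_mask) for a flat token layout:
    [prefix | branch 0 | branch 1 | ...]. Each branch restarts its
    position IDs right after the prefix, so tokens are generated
    "out of order" relative to the flat layout, and the mask encodes
    the conditional-independence assumption between branches.
    """
    total = prefix_len + num_branches * branch_len
    position_ids = np.zeros(total, dtype=int)
    position_ids[:prefix_len] = np.arange(prefix_len)
    for b in range(num_branches):
        start = prefix_len + b * branch_len
        # All branches share the same positions after the prefix.
        position_ids[start:start + branch_len] = prefix_len + np.arange(branch_len)

    mask = np.zeros((total, total), dtype=bool)
    # Standard causal attention within the shared prefix.
    mask[:prefix_len, :prefix_len] = np.tril(np.ones((prefix_len, prefix_len), bool))
    for b in range(num_branches):
        start = prefix_len + b * branch_len
        end = start + branch_len
        mask[start:end, :prefix_len] = True   # branch sees the full prefix
        # Causal attention within the branch only, not across branches.
        mask[start:end, start:end] = np.tril(np.ones((branch_len, branch_len), bool))
    return position_ids, mask
```

With a 4-token prefix and three 2-token branches, all branches report positions 4 and 5, and each branch's rows in the mask allow attention to the prefix and its own earlier tokens only, which is what lets one forward pass advance every branch at once.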

Entities

Institutions

  • arXiv

Sources