TABALIGN: Enhancing LLM Table Reasoning with Cell-Grounding Attention
TABALIGN is a framework for improving multi-step reasoning over structured tables in large language models (LLMs). Existing methods fall short because they lack an explicit cell-grounding contract between planning and execution, and they constrain planners to left-to-right factorization, which is at odds with the permutation invariance of tables. A preliminary study found that diffusion language models (DLMs) produce cell attention that is better aligned with human reasoning and more stable under permutations, achieving a 40.2% median reduction in attention-AUROC variability when rows are reordered. TABALIGN combines a masked DLM planner, which emits plan steps as binary cell masks via bidirectional denoising, with TABATTN, a lightweight verifier trained on 1,600 human-verified attention standards that scores each step by its attention overlap with the plan. Together, these components establish a cell-grounding contract that aligns planning with table structure.
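To make the "attention overlap with the plan" idea concrete, here is a minimal sketch of how a verifier might score one reasoning step against a binary cell mask. All names and the choice of Jaccard overlap are assumptions for illustration; the paper's actual TABATTN scoring function is not specified here.

```python
# Hypothetical sketch of TABATTN-style step scoring (not the paper's
# exact method): compare the cells a model attends to against the
# planner's binary cell mask using Jaccard overlap.

def score_step(attention, plan_mask, threshold=0.5):
    """Score one step by overlap between attended cells and the plan.

    attention : dict mapping (row, col) -> attention weight in [0, 1]
    plan_mask : set of (row, col) cells the plan marks as relevant
    threshold : assumed cutoff for treating a cell as "attended"
    Returns the Jaccard overlap in [0, 1]; higher means the step's
    attention is better grounded in the planned cells.
    """
    attended = {cell for cell, w in attention.items() if w >= threshold}
    if not attended and not plan_mask:
        return 1.0  # vacuously aligned: nothing attended, nothing planned
    return len(attended & plan_mask) / len(attended | plan_mask)
```

For example, a step whose strong attention falls exactly on the two planned cells scores 1.0:

```python
attn = {(0, 1): 0.9, (0, 2): 0.7, (3, 1): 0.1}
plan = {(0, 1), (0, 2)}
score_step(attn, plan)  # -> 1.0: attended cells match the plan exactly
```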
Key facts
- TABALIGN addresses multi-step LLM reasoning over structured tables.
- Current methods fail for lack of an explicit cell-grounding contract between planning and execution.
- Existing planners use left-to-right factorization at odds with table permutation invariance.
- Diffusion language models (DLMs) produce more human-aligned cell attention than autoregressive models.
- DLMs show 40.2% median reduction in attention-AUROC variability under row reordering.
- TABALIGN uses a masked DLM planner emitting plan steps as binary cell masks.
- TABATTN is a lightweight verifier trained on 1,600 human-verified attention standards.
- TABATTN scores each step by attention overlap with the plan.
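The 40.2% figure above concerns how much attention-AUROC fluctuates when the same table's rows are reordered. A minimal sketch of how such variability could be measured follows; the function names, the pairwise AUROC formulation, and the use of standard deviation as the variability statistic are all assumptions, not the paper's stated protocol.

```python
# Hypothetical sketch: measure how stable attention-AUROC is across
# row permutations of a table. Lower variability = more stable
# cell grounding under reordering.
import statistics


def auroc(scores, labels):
    """Pairwise AUROC: probability that a relevant cell (label 1)
    receives higher attention than an irrelevant one (label 0),
    counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def attention_auroc_variability(per_perm_scores, labels):
    """Std. dev. of attention-AUROC across row permutations of one
    table; per_perm_scores holds one attention-score list per
    permutation, aligned to the same cell labels."""
    aurocs = [auroc(scores, labels) for scores in per_perm_scores]
    return statistics.pstdev(aurocs)
```

A model whose per-cell attention is identical under every permutation would score a variability of 0.0; the claimed DLM advantage is a lower median of this quantity than autoregressive baselines, not zero.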
Entities
Institutions
- arXiv