ARTFEED — Contemporary Art Intelligence

LLM-ReSum: Self-Reflective Summarization Framework

ai-technology · 2026-04-30

Researchers have introduced LLM-ReSum, a framework for self-reflective summarization. The approach couples LLM-based generation and evaluation in a closed feedback loop, with no model finetuning required. It was developed after a meta-evaluation of 14 automatic summarization metrics and LLM-based evaluators across seven datasets spanning five domains, from short news pieces to long scientific, governmental, and legal documents of 2,000 to 27,000 words, with over 1,500 human-annotated summaries. The study found that traditional lexical-overlap metrics such as ROUGE and BLEU correlate weakly, or even negatively, with human judgments, while task-specific neural metrics and LLM-based evaluators align far better, especially on linguistic quality. The research is available on arXiv under ID 2604.25665.
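The closed loop described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual implementation: the `generate` and `evaluate` stubs, the score threshold, and the round limit are all assumptions standing in for real LLM calls and the authors' undisclosed prompts.

```python
from typing import Optional, Tuple

def generate(document: str, feedback: Optional[str] = None) -> str:
    """Stand-in for an LLM summarization call; here, a trivial truncation.
    A real system would prompt an LLM with the document plus any feedback."""
    summary = document[:60]
    if feedback:  # pretend the model acts on the critique
        summary = summary.strip() + "."
    return summary

def evaluate(document: str, summary: str) -> Tuple[float, str]:
    """Stand-in for an LLM evaluator returning (score, textual critique)."""
    score = 1.0 if summary.endswith(".") else 0.5
    feedback = "" if score == 1.0 else "End the summary with a full stop."
    return score, feedback

def resum(document: str, max_rounds: int = 3, threshold: float = 0.9) -> str:
    """Closed loop: generate, evaluate, feed the critique back, repeat.
    No model weights are updated; only the prompt context changes."""
    summary = generate(document)
    for _ in range(max_rounds):
        score, feedback = evaluate(document, summary)
        if score >= threshold:
            break
        summary = generate(document, feedback)
    return summary
```

The key property, matching the article, is that improvement comes purely from feeding the evaluator's critique back into the generator's context rather than from finetuning.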

Key facts

  • LLM-ReSum integrates LLM-based evaluation and generation in a closed feedback loop without model finetuning.
  • Meta-evaluation covered 14 automatic summarization metrics and LLM-based evaluators.
  • Seven datasets across five domains were used, including short news and long scientific, governmental, and legal texts.
  • Document lengths ranged from 2K to 27K words.
  • Over 1,500 human-annotated summaries were analyzed.
  • Traditional lexical overlap metrics (ROUGE, BLEU) showed weak or negative correlation with human judgments.
  • Task-specific neural metrics and LLM-based evaluators achieved higher alignment with human judgments.
  • The study is published on arXiv with ID 2604.25665.
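The correlation findings in the list above come from a meta-evaluation, which in essence ranks metric scores against human ratings. A minimal sketch of that computation, using Spearman rank correlation, is below; the ratings and scores are invented for illustration and are not data from the paper.

```python
def rank(values):
    """1-based average ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the tied positions
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = rank(xs), rank(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

human = [4, 2, 5, 3, 1]                         # hypothetical human ratings
metric = [0.41, 0.22, 0.55, 0.30, 0.10]         # hypothetical metric scores
print(round(spearman(human, metric), 2))        # monotone agreement -> 1.0
```

A metric that ranks summaries in the same order as humans scores near 1.0; the weak or negative values reported for ROUGE and BLEU mean their rankings diverge from, or invert, the human ordering.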

Entities

Institutions

  • arXiv

Sources